1
00:00:00,000 --> 00:00:19,090
36c3 preroll music
2
00:00:19,090 --> 00:00:24,929
Herald: Our next talk will be "The
ultimate Acorn Archimedes Talk", in which
3
00:00:24,929 --> 00:00:28,819
there will be spoken about everything
about the Archimedes computer. There's a
4
00:00:28,819 --> 00:00:33,360
promise in advance that there will be no
Eureka jokes in there. Give a warm
5
00:00:33,360 --> 00:00:35,483
welcome to Matt Evans.
6
00:00:35,483 --> 00:00:40,790
applause
7
00:00:40,790 --> 00:00:48,060
Matt Evans: Thank you. Okay. Little bit of
retro computing first thing in the
8
00:00:48,060 --> 00:00:54,949
morning, sort of. Welcome. My name is Matt
Evans. The Acorn Archimedes was my
9
00:00:54,949 --> 00:00:59,379
favorite computer when I was a small
hacker and I'm privileged to be able to
10
00:00:59,379 --> 00:01:04,780
talk a little bit about it with you
today. Let's start with: What is an Acorn
11
00:01:04,780 --> 00:01:08,720
Archimedes? So I'd like an interactive
session, I'm afraid. Please indulge me,
12
00:01:08,720 --> 00:01:15,130
like a show of hands. Who's heard of the
Acorn Archimedes before? Ah, OK, maybe 50,
13
00:01:15,130 --> 00:01:23,090
60%. Who has used one? Maybe 10%,
maybe. Okay. Who has programmed -
14
00:01:23,090 --> 00:01:30,139
who has coded on an Archimedes? Maybe
half? Two, three people. Great. Okay.
15
00:01:30,139 --> 00:01:34,180
Three. laughs Okay, so a small
percentage. I don't see these machines as
16
00:01:34,180 --> 00:01:39,650
being as famous as say the Apple Macintosh
or IBM PC. And certainly outside of Europe
17
00:01:39,650 --> 00:01:44,030
they were not that common. So this is kind
of interesting just how many people here
18
00:01:44,030 --> 00:01:49,840
have seen this. So it was the first ARM-
based computer. This is an astonishingly
19
00:01:49,840 --> 00:01:55,530
1980s - I think one of them is drawing,
actually. But they're not just the first
20
00:01:55,530 --> 00:02:01,439
ARM-based machine, but the machine that
the ARM was originally designed to drive.
21
00:02:01,439 --> 00:02:07,230
It's a... Is that a comment for me?
Mic?
22
00:02:07,230 --> 00:02:13,750
I'm being heckled already. It's only slide
two. Let's see how this goes. So it's a
23
00:02:13,750 --> 00:02:18,849
two box computer. It looks a bit like a
Mega ST ... to me. It's a main unit with
24
00:02:18,849 --> 00:02:26,480
the processor and disks and expansion
cards and so on. Now this is an A3000.
25
00:02:26,480 --> 00:02:30,519
This is mine, in fact, and I didn't bother
to clean it before taking the photo. And
26
00:02:30,519 --> 00:02:33,335
now it's on this huge screen. That was a
really bad idea. You can see all the
27
00:02:33,335 --> 00:02:37,429
disgusting muck in the keyboard. It has a
bit of ink on it, I don't know why. But
28
00:02:37,429 --> 00:02:41,660
this machine is 30 years old. And
this was luckily my machine, as I said, as
29
00:02:41,660 --> 00:02:45,069
a small hacker. And this is why I'm doing
the talk today. This had a big influence
30
00:02:45,069 --> 00:02:52,540
on me. I'd like to say as a person, but
more as an engineer. In terms of what my
31
00:02:52,540 --> 00:02:57,170
programming experience when I was learning
to program and so on. So I live and work
32
00:02:57,170 --> 00:03:02,040
in Cambridge in the U.K., where this
machine was designed. And through the
33
00:03:02,040 --> 00:03:05,470
funny sort of turn of events, I ended up
there and actually work in the building
34
00:03:05,470 --> 00:03:09,310
next to the building where this was
designed. And a bunch of the people that
35
00:03:09,310 --> 00:03:13,720
were on that original team that designed
this system are still around and
36
00:03:13,720 --> 00:03:18,280
relatively contactable. And I thought this
is a good opportunity to get on the phone
37
00:03:18,280 --> 00:03:21,760
and call them up or go for a beer with a
couple of them and ask them: Why are
38
00:03:21,760 --> 00:03:25,280
things the way they are? There's all sorts
of weird quirks to this machine. I was
39
00:03:25,280 --> 00:03:28,901
always wondering this, for 20 years. Can
you please tell me - why did you do it
40
00:03:28,901 --> 00:03:33,330
this way? And they were a really good bunch
of people. So I talked to Steve Furber,
41
00:03:33,330 --> 00:03:37,790
who led the hardware design, Sophie
Wilson, who was the same with software.
42
00:03:37,790 --> 00:03:43,350
Tudor Brown, who did the video system.
Mike Miller, the IO system. John Biggs and
43
00:03:43,350 --> 00:03:46,489
Jamie Urquhart, who did the silicon
design, I spoiled one of the
44
00:03:46,489 --> 00:03:50,140
surprises here. There's been some silicon
design that's gone on in building this
45
00:03:50,140 --> 00:03:55,060
Acorn. And they were all wonderful people
that gave me their time and told me a
46
00:03:55,060 --> 00:03:59,550
bunch of anecdotes that I will pass on to
you. So I'm going to talk about the
47
00:03:59,550 --> 00:04:04,520
classic Arc. There's a bunch of different
machines that Acorn built into the 1990s.
48
00:04:04,520 --> 00:04:08,960
But the ones I'm talking about started in
1987. There were 2 models, effectively a
49
00:04:08,960 --> 00:04:14,970
low end and a high end. One had an option
for a hard disk, 20 megabytes, 2300
50
00:04:14,970 --> 00:04:20,700
pounds, up to 4MB of RAM. They all share
the same basic architecture, they're all
51
00:04:20,700 --> 00:04:25,820
basically the same. So the A3000 that I
just showed you came out in 1989. That was
52
00:04:25,820 --> 00:04:29,600
the machine I had. Those again, the same.
It had the memory controller slightly
53
00:04:29,600 --> 00:04:35,970
updated, was slightly faster. They all had
an ARM 2. This was the released version of
54
00:04:35,970 --> 00:04:40,910
the ARM processor designed for this
machine, at 8 MHz. And then finally in
55
00:04:40,910 --> 00:04:46,250
1990, what I call the last of the classic
Arc, Archimedes, is the A540. This was the
56
00:04:46,250 --> 00:04:50,720
top end machine - could have up to
16 MB of memory, which is a fair bit
57
00:04:50,720 --> 00:04:57,600
even in 1990. It had a 30 MHz ARM 3. The
ARM 3 was the evolution of the ARM 2, but
58
00:04:57,600 --> 00:05:02,130
with a cache and a lot faster. So this
talk will be centered around how these
59
00:05:02,130 --> 00:05:08,820
machines work, not the more modern
machines. So around 1987, what else
60
00:05:08,820 --> 00:05:13,760
was available? This is a random selection
of machines. Apologies if your favorite
61
00:05:13,760 --> 00:05:18,490
machine is not on this list. It wouldn't
fit on the slide otherwise. So at the
62
00:05:18,490 --> 00:05:22,110
start of the 80s, we had the exotic things
like the Apple Lisa and the Apple Mac.
63
00:05:22,110 --> 00:05:28,720
Very expensive machines. The Amiga - I had
to put in here. Started off relatively
64
00:05:28,720 --> 00:05:32,530
expensive, but the Amiga 500 was, you
know, very good value for money, very
65
00:05:32,530 --> 00:05:37,160
capable machine. But I'm comparing this
more to PCs and Macs, because that was the
66
00:05:37,160 --> 00:05:41,950
sort of, you know, market it was going
for. And although it was an expensive
67
00:05:41,950 --> 00:05:46,790
machine, compared to the Macintosh it was
pretty cheap. Even put NeXT Cube on there,
68
00:05:46,790 --> 00:05:49,890
I figured that... I'd heard that they were
incredibly expensive. And actually
69
00:05:49,890 --> 00:05:53,640
compared to the Macintosh, they're not
that expensive at all. Well I don't know
70
00:05:53,640 --> 00:05:57,930
which one I would have preferred. So the
first question I asked them - the first
71
00:05:57,930 --> 00:06:02,970
thing they told me: Why was it built? I've
used them in school and as I said, had one
72
00:06:02,970 --> 00:06:08,560
at home. But I was never really quite sure
what it was for. And I think a lot of the
73
00:06:08,560 --> 00:06:11,850
Acorn marketing wasn't quite sure what it
was for either. They told me it was the
74
00:06:11,850 --> 00:06:15,940
successor to the BBC Micro, this 8 bit
machine. Lovely 6502 machine, incredibly
75
00:06:15,940 --> 00:06:20,100
popular, especially in the UK. And the
goal was to make a machine that was 10
76
00:06:20,100 --> 00:06:23,770
times the performance of this. The
successor would be 10 times faster at the
77
00:06:23,770 --> 00:06:29,680
same price. And the thing I didn't know is
they had been inspired. The team Acorn had
78
00:06:29,680 --> 00:06:35,620
seen the Apple Lisa and the Xerox Star,
which comes from the famous Xerox Alto,
79
00:06:35,620 --> 00:06:41,140
Xerox PARC, first GUI workstation in the
70s, monumental machine. They'd been
80
00:06:41,140 --> 00:06:44,690
inspired by these machines and they wanted
to make something very similar. So this is
81
00:06:44,690 --> 00:06:49,190
the same story as the Macintosh. They
wanted to make something that was desktop
82
00:06:49,190 --> 00:06:52,310
machine for business, for office
automation, desktop publishing and that
83
00:06:52,310 --> 00:06:56,270
kind of thing. But I never really
understood this before. So this was this
84
00:06:56,270 --> 00:07:01,650
inspiration came from the Xerox machines.
It was supposed to be obviously a lot more
85
00:07:01,650 --> 00:07:06,680
affordable and a lot faster. So this is
what happens when Acorn marketing gets
86
00:07:06,680 --> 00:07:12,020
hold of this vision. So Xerox Star on the
left is this nice, sensible business
87
00:07:12,020 --> 00:07:15,212
machine. Someone's wearing nice, crisp
suit bumps microphone banging their
88
00:07:15,212 --> 00:07:20,470
microphone - and it gets turned into the
very Cambridge Tweed version on the right.
89
00:07:20,470 --> 00:07:24,410
It's apparently illegal to program one of
these if you're not wearing a top hat. But
90
00:07:24,410 --> 00:07:28,850
no one told me that when I was a kid. And
my court case comes up next week. So
91
00:07:28,850 --> 00:07:32,240
Cambridge is a bit of a funny place. And
for those that have been there, this picture on
92
00:07:32,240 --> 00:07:38,680
the right sums it all up. So they began
Project A, which was build this new
93
00:07:38,680 --> 00:07:43,240
machine. And they looked at the
alternatives. They looked at the
94
00:07:43,240 --> 00:07:49,560
processors that were available at that
time, the 286, the 68K, the NatSemi
95
00:07:49,560 --> 00:07:55,056
32016, which was an early 32 bit
machine, a bit of a weird processor. And
96
00:07:55,056 --> 00:07:58,030
they all had something in common that
they're ridiculously expensive and in
97
00:07:58,030 --> 00:08:02,760
Tudor's words, a bit crap. They weren't a
lot faster than the BBC Micro. They're a
98
00:08:02,760 --> 00:08:06,620
lot more expensive. They're much more
complicated in terms of the processor
99
00:08:06,620 --> 00:08:10,490
itself. But also the system around them
was very complicated. They need lots of
100
00:08:10,490 --> 00:08:15,400
weird support chips. This just drove the
price up of the system and it wasn't going
101
00:08:15,400 --> 00:08:20,400
to hit that 10 times performance, let
alone at the same price point. They'd
102
00:08:20,400 --> 00:08:24,100
visited a couple of other companies
designing their own custom silicon. They
103
00:08:24,100 --> 00:08:28,090
got this idea in about 1983. They were
looking at some of the RISC papers coming
104
00:08:28,090 --> 00:08:31,330
out of Berkeley and they were quite
impressed by what a bunch of grad students
105
00:08:31,330 --> 00:08:38,070
were doing. They managed to get a working
RISC processor and they went to Western
106
00:08:38,070 --> 00:08:42,140
Design Center and looked at 6502
successors being designed there. They had a
107
00:08:42,140 --> 00:08:45,210
positive experience. They saw a bunch of
high school kids with Apple 2s doing
108
00:08:45,210 --> 00:08:48,930
silicon layout. And they thought "OK,
well". They'd never designed a CPU before
109
00:08:48,930 --> 00:08:53,310
at Acorn. Acorn hadn't done any custom
silicon to this degree, but they were
110
00:08:53,310 --> 00:08:57,160
buoyed by this and they thought, okay,
well, maybe RISC is the secret and we can
111
00:08:57,160 --> 00:09:02,250
do this. And this was not really the done
thing in this timeframe and not for a
112
00:09:02,250 --> 00:09:05,890
company the size of Acorn, but they
designed their computer from scratch. They
113
00:09:05,890 --> 00:09:09,200
designed all of the major pieces of
silicon in this machine. And it wasn't
114
00:09:09,200 --> 00:09:12,380
about designing the ARM chip. Hey, we've
got a processor core. What should we do
115
00:09:12,380 --> 00:09:16,000
with it? But it was about designing the
machine that ARM and the history of that
116
00:09:16,000 --> 00:09:20,310
company has kind of benefited from. But
this is all about designing the machine as
117
00:09:20,310 --> 00:09:26,710
a whole. They're a tiny team. They're a
handful of people - about a dozen...ish
118
00:09:26,710 --> 00:09:30,780
that did the hardware design, a similar
sort of order for software and operating
119
00:09:30,780 --> 00:09:36,210
systems on top, which is orders of
magnitude different from IBM and Motorola
120
00:09:36,210 --> 00:09:40,950
and so forth that were designing computers
at this time. RISC was the key. They
121
00:09:40,950 --> 00:09:44,323
needed to be incredibly simple. One of the
other experiences they had was they went
122
00:09:44,323 --> 00:09:48,820
to a CISC processor design center. They
had a team of a couple of hundred people
123
00:09:48,820 --> 00:09:52,650
and they were on revision H and it still
had bugs and it was just this unwieldy,
124
00:09:52,650 --> 00:09:58,160
complex machine. So RISC was the secret.
Steve Furber has an interview somewhere.
125
00:09:58,160 --> 00:10:03,470
He jokes about Acorn management giving him
two things. Special sauce was two things
126
00:10:03,470 --> 00:10:07,810
that no one else had: He'd no people and
no money. So it had to be incredibly
127
00:10:07,810 --> 00:10:14,710
simple. It had to be built on a
shoestring, as Jamie said to me. So there
128
00:10:14,710 --> 00:10:18,460
are lots of corners cut, but in the right
way. I would say "corners cut", that
129
00:10:18,460 --> 00:10:23,220
sounds ungenerous. There's some very
shrewd design decisions, always weighing
130
00:10:23,220 --> 00:10:30,210
up cost versus benefit. And I think they
erred on the correct side for all of them.
131
00:10:30,210 --> 00:10:34,480
So Steve sent me this picture. He's
got a cameo here. That's the outline of
132
00:10:34,480 --> 00:10:39,180
him in the reflection on the glass there.
He's got this up in his office. So he
133
00:10:39,180 --> 00:10:43,630
led the hardware design of all of these
chips at Acorn. Across the top, we've got
134
00:10:43,630 --> 00:10:49,450
the original ARM, the ARM 1, ARM 2 and the
ARM 3 - guess the naming scheme - and the
135
00:10:49,450 --> 00:10:53,090
video controller, memory controller and IO
controller. You can sort of see their
136
00:10:53,090 --> 00:10:57,320
relative sizes and it's kind of pretty.
This was also a processor where you
137
00:10:57,320 --> 00:11:00,930
could really point at that and say, "oh,
that's the register file and you can see
138
00:11:00,930 --> 00:11:07,210
the cache over there". You can't really do
that nowadays with modern processors. So
139
00:11:07,210 --> 00:11:11,080
the bit about the specification, what it
could do, the end product. So I mentioned
140
00:11:11,080 --> 00:11:16,850
they all had this ARM 2 8MHz, up to four
MB of RAM, 26-bit addresses, remember
141
00:11:16,850 --> 00:11:21,670
that. That's weird. So a lot of 32-bit
machines, had 32-bit addresses or the ones
142
00:11:21,670 --> 00:11:25,550
that we know today do. That wasn't the
case here. And I'll explain why in a
143
00:11:25,550 --> 00:11:32,610
minute. The A540 had an updated CPU. The
memory controller had an MMU, which was
144
00:11:32,610 --> 00:11:39,350
unusual for machines of the mid 80s. So it
could support, the hardware would support
145
00:11:39,350 --> 00:11:45,620
virtual memory, page faults and so on. It
had decent sound, it had 8-channel sound,
146
00:11:45,620 --> 00:11:49,460
hardware mixed and stereo. It was 8 bit,
but it was logarithmic - so it was a bit
147
00:11:49,460 --> 00:11:53,240
like µ-law, if anyone knows that - instead
of PCM, so you got more precision at the
148
00:11:53,240 --> 00:11:58,300
low end and it sounded to me a little bit
like 12 bit PCM sound. So this is quite
149
00:11:58,300 --> 00:12:04,840
good. Storage wise, it's the same floppy
controller as the Atari ST. It's fairly
150
00:12:04,840 --> 00:12:09,690
boring. Hard disk controller was a
horrible standard called ST506, MFM
151
00:12:09,690 --> 00:12:16,420
drives, which were very, very crude
compared to disks we have today. Keyboard
152
00:12:16,420 --> 00:12:19,980
and mouse, nothing to write home about. I
mean, it was a normal keyboard. It was
153
00:12:19,980 --> 00:12:23,430
nothing special going on there. And
printer port, serial port and some
154
00:12:23,430 --> 00:12:29,380
expansion slots which, I'll
outline later on. The thing I really liked
155
00:12:29,380 --> 00:12:32,650
about the Arc was the graphics
capabilities. It's fairly capable,
156
00:12:32,650 --> 00:12:37,800
especially for a machine of that era and
of the price. It just had a flat frame
157
00:12:37,800 --> 00:12:42,170
buffer so it didn't have sprites, which is
unfortunate. It didn't have a blitter and
158
00:12:42,170 --> 00:12:47,270
bitplanes and so forth. But the upshot
of that is it's dead simple to program. It had
159
00:12:47,270 --> 00:12:52,320
a 256 color mode, 8 bits per pixel, so
it's a byte, and it's all just laid out as
160
00:12:52,320 --> 00:12:55,890
a linear string of bytes. So it was dead
easy to just write some really nice
161
00:12:55,890 --> 00:12:59,910
optimized code to just blit stuff to the
screen. Part of the reason why there isn't
162
00:12:59,910 --> 00:13:05,090
a blitter is actually the CPU was so good
at doing this. Colorwise, it's got
163
00:13:05,090 --> 00:13:10,620
paletted modes out of a 4096 color
palette, same as the Amiga. It has this
164
00:13:10,620 --> 00:13:16,350
256 color mode, which is different. The
big high end machines, the top end
165
00:13:16,350 --> 00:13:21,290
machines, the A540 and the A400 series
could also do this very high res 1152 by
166
00:13:21,290 --> 00:13:24,235
900, which was more of a workstation
resolution. If you bought a Sun
167
00:13:24,235 --> 00:13:28,140
workstation a Sun 3 in those days, could
do this and some higher resolutions. But
168
00:13:28,140 --> 00:13:32,890
this is really not seen on computers that
you might have in the office or school or
169
00:13:32,890 --> 00:13:36,370
education end of the market. And
it's quite clever the way they did that.
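An illustrative sketch of what a flat frame buffer laid out as "a linear string of bytes" means in practice. It is written in Python rather than the hand-optimized ARM code the speaker has in mind, and the 320 by 256 size, the function names and the lack of clipping are assumptions made for the example, not details from the talk.

```python
# Assumed screen size for illustration only; one byte per pixel (256 colours).
WIDTH, HEIGHT = 320, 256


def make_framebuffer(width: int, height: int) -> bytearray:
    """One byte per pixel, rows stored one after another."""
    return bytearray(width * height)


def put_pixel(fb: bytearray, width: int, x: int, y: int, colour: int) -> None:
    """The address of a pixel is simply y * width + x."""
    fb[y * width + x] = colour


def blit(fb: bytearray, fb_width: int, sprite: bytes, sprite_width: int,
         dst_x: int, dst_y: int) -> None:
    """Copy a rectangular block of pixels row by row (no clipping)."""
    rows = len(sprite) // sprite_width
    for row in range(rows):
        src = row * sprite_width
        dst = (dst_y + row) * fb_width + dst_x
        fb[dst:dst + sprite_width] = sprite[src:src + sprite_width]


screen = make_framebuffer(WIDTH, HEIGHT)
put_pixel(screen, WIDTH, 10, 20, 255)
blit(screen, WIDTH, bytes([1, 2, 3, 4]), 2, 100, 50)
```

With no bitplanes to interleave, a block copy is just a run of byte moves per row, which is the kind of loop a fast CPU can do without a dedicated blitter.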
170
00:13:36,370 --> 00:13:40,450
I'll come back to that in a sec. But for
me, the thing about the Arc: For the
171
00:13:40,450 --> 00:13:45,920
money, it was the fastest machine around.
It was definitely faster than 386s and all
172
00:13:45,920 --> 00:13:49,548
the stuff that Motorola was doing at the
time by quite a long way. It is almost
173
00:13:49,548 --> 00:13:53,580
eight times faster than a 68k at about the
same clock speed. And it's to do with its
174
00:13:53,580 --> 00:13:57,020
pipelining and to do with it having a 32
bit word and a couple of other tricks
175
00:13:57,020 --> 00:14:01,070
again. I'll show you later on what the
secret to that performance was. About
176
00:14:01,070 --> 00:14:04,850
minicomputer speed and compared to some of
the other RISC machines at the time, it
177
00:14:04,850 --> 00:14:09,450
wasn't the first RISC in the world, it was
the first cheap RISC and the first RISC
178
00:14:09,450 --> 00:14:14,020
machine that people could feasibly buy and
have on their desks at work or in
179
00:14:14,020 --> 00:14:19,222
education. And if you compare it to
something like the MIPS or the SPARC, it
180
00:14:19,222 --> 00:14:25,300
was not as fast as a MIPS or SPARC chip.
It was also a lot smaller, a lot cheaper.
181
00:14:25,300 --> 00:14:29,240
Both of those other processors had very
big dies. They needed other support chips.
182
00:14:29,240 --> 00:14:33,040
They had huge packages, lots of pins, lots
of cooling requirements. So all this
183
00:14:33,040 --> 00:14:36,180
really added up. So I priced up
a Sun 4 workstation at the time and
184
00:14:36,180 --> 00:14:40,050
it was well over four times the price of
one of these machines. And that was before
185
00:14:40,050 --> 00:14:44,400
you add on extras such as disks and
network interfaces and things like that.
186
00:14:44,400 --> 00:14:47,480
So it's very good, very competitive for
the money. And if you think about building
187
00:14:47,480 --> 00:14:50,140
a cluster, then you could get a lot more
throughput, you could network them
188
00:14:50,140 --> 00:14:56,980
together. So this is about as far as I got
when I was a youngster, I wasn't brave
189
00:14:56,980 --> 00:15:03,230
enough to really take the machine apart
and poke around. Fortunately, now it's 30
190
00:15:03,230 --> 00:15:07,180
years old and I'm fine. I'm qualified and
doing this. I'm going to take it apart.
191
00:15:07,180 --> 00:15:12,089
Here's the motherboard. Quite a nice clean
design. This was built in Wales for anyone
192
00:15:12,089 --> 00:15:17,510
that's been to the UK. Very unusual these
days, anything to be built in the UK. It's
193
00:15:17,510 --> 00:15:23,420
got several main sections around these
four chips. Remember the Steve photo
194
00:15:23,420 --> 00:15:29,470
earlier on? This is the chipset: the ARM,
MEMC, VIDC, IOC. So the IOC side of things
195
00:15:29,470 --> 00:15:34,090
happens over on the left, video and sound
in the top right. And the memory and the
196
00:15:34,090 --> 00:15:38,399
processor in the middle. It's got a
megabyte onboard and you can plug in an
197
00:15:38,399 --> 00:15:43,640
expansion for 4 MB. So memory map
from the software view. I mentioned this
198
00:15:43,640 --> 00:15:46,930
26-bit addressing and I think this is one
of the key characteristics of one of these
199
00:15:46,930 --> 00:15:52,210
machines. So you have a 64MB address
space, it's quite packed. That's quite a
200
00:15:52,210 --> 00:15:56,980
lot of stuff shoehorned into here. So
there's the memory. The bottom half of the
201
00:15:56,980 --> 00:16:02,040
address space, 32MB of that is the
processor. It's got user space and
202
00:16:02,040 --> 00:16:08,100
privilege mode. It's got a concept of
privilege within the processor execution.
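A small worked example of the arithmetic behind the 26-bit addressing mentioned above: 26 address bits give a 64 MB space, and its bottom half is 32 MB. The constant and function names are made up for this sketch.

```python
ADDRESS_BITS = 26                  # width of an address on these machines
ADDRESS_SPACE = 1 << ADDRESS_BITS  # 2**26 bytes
HALF = ADDRESS_SPACE // 2          # boundary between bottom and top half
MIB = 1 << 20


def in_bottom_half(addr: int) -> bool:
    """True if addr lies in the lower 32 MB of the 64 MB space."""
    if not 0 <= addr < ADDRESS_SPACE:
        raise ValueError("address does not fit in 26 bits")
    return addr < HALF


print(ADDRESS_SPACE // MIB)        # size of the whole space in MB
print(HALF // MIB)                 # size of the bottom half in MB
print(hex(HALF))                   # first address of the top half
```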
203
00:16:08,100 --> 00:16:11,851
So when you're in user mode, you only get
to see the bottom half and that's the
204
00:16:11,851 --> 00:16:16,250
virtual maps. There's the MMU, that will
map pages into that space and then when
205
00:16:16,250 --> 00:16:18,980
you're in supervisor mode, you get to see
the whole of the rest of the memory,
206
00:16:18,980 --> 00:16:23,380
including the physical memory and various
registers up the top. The thing to notice
207
00:16:23,380 --> 00:16:27,460
here is: there's stuff hidden behind the
ROM, this address space is very packed
208
00:16:27,460 --> 00:16:31,390
together. So there's a requirement
for control registers, for the memory
209
00:16:31,390 --> 00:16:34,770
controller, for the video controller and
so on, and they're write-only registers in
210
00:16:34,770 --> 00:16:39,700
ROM basically. So you write to the ROM and
you get to hit these registers. Kind of
211
00:16:39,700 --> 00:16:43,730
weird when you first see it, but it was
quite a clever way to fit this stuff into
212
00:16:43,730 --> 00:16:50,810
the address space. So we'll start with
the ARM1. So Sophie Wilson designed the
213
00:16:50,810 --> 00:16:59,070
instruction set late 1983, Steve took the
instruction set and designed the top
214
00:16:59,070 --> 00:17:02,880
level, the block, the micro architecture
of this processor. So this is the data
215
00:17:02,880 --> 00:17:08,140
path and how the control logic works. And
then the VLSI team, then implemented this,
216
00:17:08,140 --> 00:17:12,420
did their own custom cells. There's a
custom data path and custom logic
217
00:17:12,420 --> 00:17:18,179
throughout this. It took them about a
year, all in. Well, 1984, that sort of...
218
00:17:18,179 --> 00:17:23,832
This project A really kicked off early
1984. And this taped out first thing
219
00:17:23,832 --> 00:17:34,690
early 1985. The design process the guys
gave me a little bit of... So Jamie
220
00:17:34,690 --> 00:17:40,800
Urquhart and John Biggs gave me a bit of
an insight into how they worked on the
221
00:17:40,800 --> 00:17:46,870
VLSI side of things. So they had an Apollo
workstation, just one Apollo workstation,
222
00:17:46,870 --> 00:17:51,760
the DN600. This is a 68K based washing
machine, as Jamie described it. It's this
223
00:17:51,760 --> 00:17:56,180
huge thing. It cost about £50,000.
It's incredibly expensive. And they
224
00:17:56,180 --> 00:18:00,220
designed all of this with just one of
these workstations. Jamie got in at 5:00
225
00:18:00,220 --> 00:18:04,060
a.m., worked until the afternoon and then
let someone else on the machine. So they
226
00:18:04,060 --> 00:18:06,760
shared the workstation, they worked
shifts so that they could design this
227
00:18:06,760 --> 00:18:10,020
whole thing on one workstation. So this
comes back to that. It was designed on a
228
00:18:10,020 --> 00:18:13,660
bit of a shoestring budget. When they got
a couple of other workstations later on in
229
00:18:13,660 --> 00:18:17,760
the projects, there was an allegation that
the software might not have been licensed
230
00:18:17,760 --> 00:18:21,950
initially on the other workstations and
the CAD software might have been. I can
231
00:18:21,950 --> 00:18:28,450
neither confirm nor deny whether that's
true. So Steve wrote a BBC Basic
232
00:18:28,450 --> 00:18:33,300
simulator for this when he was designing
this block-level micro architecture. It ran on
233
00:18:33,300 --> 00:18:38,750
his BBC Micro. So this could then run real
software. There could be a certain amount
234
00:18:38,750 --> 00:18:41,570
of software development, but then they
could also validate that the design was
235
00:18:41,570 --> 00:18:46,820
correct. There's no cache on this. This is
a quite a large chip. 50 square
236
00:18:46,820 --> 00:18:52,290
millimeters was the economic limit of
those days for this part of the market.
237
00:18:52,290 --> 00:18:56,100
There's no cache. That also would have
been far too complicated. So this was
238
00:18:56,100 --> 00:19:03,120
also, I think, quite a big risk, no pun
intended. The aim of doing this
239
00:19:03,120 --> 00:19:07,620
with such a small team that they're all
very clever people. But they hadn't all
240
00:19:07,620 --> 00:19:11,490
got experience in building chips before.
And I think they knew what they were up
241
00:19:11,490 --> 00:19:15,100
against. And so not having a cache or
complicated things like that was the right
242
00:19:15,100 --> 00:19:20,910
choice to make. I'll show you later that
that didn't actually affect things. So
243
00:19:20,910 --> 00:19:24,810
this was a RISC machine. If anyone has not
programmed ARM in this room then get out
244
00:19:24,810 --> 00:19:29,400
at once. But if you have programmed ARM
this is quite familiar with some
245
00:19:29,400 --> 00:19:36,210
differences. It's a classical three
operand RISC, it's got a free shift on one of
246
00:19:36,210 --> 00:19:38,790
the operands for most of the instructions.
So you can do things like static
247
00:19:38,790 --> 00:19:43,820
multiplies quite easily. It's not purist
RISC though. It does have load or store
248
00:19:43,820 --> 00:19:47,980
multiple instructions. So these will, as
the name implies, load or store multiple
249
00:19:47,980 --> 00:19:51,460
number of registers in one go. So one
register per cycle, but it's all done
250
00:19:51,460 --> 00:19:54,970
through one instruction. This is not RISC.
Again, there's a good reason for doing
251
00:19:54,970 --> 00:19:59,300
that. So ARM1 comes back and it gets
plugged into a board that looks a bit like
252
00:19:59,300 --> 00:20:07,400
this. This is called the A2P, the ARM second
processor. It plugs into a BBC Micro. It's
253
00:20:07,400 --> 00:20:11,280
basically there's a thing called the Tube,
which is sort of a FIFO like arrangement.
254
00:20:11,280 --> 00:20:15,230
The BBC Micro can send messages one way
and this can send messages back. And the
255
00:20:15,230 --> 00:20:20,250
BBC Micro has the discs, it has the I/O,
keyboard and so on. And that's used as the
256
00:20:20,250 --> 00:20:23,960
host to then download code into one
megabyte of RAM up here and then you
257
00:20:23,960 --> 00:20:29,010
can run the code on the ARM. So this was
the initial system, 6 MHz. The
258
00:20:29,010 --> 00:20:32,350
thing I found quite interesting about
this, I mentioned that Steve had built
259
00:20:32,350 --> 00:20:37,200
this BBC Basic simulation, one of the
early bits of software that could run on
260
00:20:37,200 --> 00:20:41,870
this. So he'd ported BBC Basic to ARM and
written an ARM version of it. The Basic
261
00:20:41,870 --> 00:20:47,780
interpreter was very fast, very lean, and
it was running on this board early on.
262
00:20:47,780 --> 00:20:51,750
They then built a simulator called ASIM,
which was an event based simulator for
263
00:20:51,750 --> 00:20:55,240
doing logic design and all of the other
chips in the chipset
264
00:20:55,240 --> 00:20:59,020
were simulated using ASIM on ARM1 which is
quite nice. So this was the fastest
265
00:20:59,020 --> 00:21:02,480
machine that they had around. They didn't
have, you know, the thousands of machines
266
00:21:02,480 --> 00:21:07,730
in the cluster like you'd have in a
modern company doing EDA. They had
267
00:21:07,730 --> 00:21:11,370
a very small number of machines and these
were the fastest ones they had about. So
268
00:21:11,370 --> 00:21:17,450
ARM2 was simulated on ARM1 and all the
other chipset. So then ARM2 comes along.
269
00:21:17,450 --> 00:21:21,590
So it's a year later, this is a shrink of
the design. It's based on the same basic
270
00:21:21,590 --> 00:21:26,000
micro architecture but has a multiplier
now. It's a Booth multiplier, so it is at
271
00:21:26,000 --> 00:21:32,090
worst case a 16-cycle multiply, just two
bits per clock. Again, no cache. But one
272
00:21:32,090 --> 00:21:36,950
thing they did add in ARM2 is banked
registers. Some of the processor modes I
273
00:21:36,950 --> 00:21:42,940
mentioned there's an interrupt mode. Next
slide, some of the processor modes will
274
00:21:42,940 --> 00:21:47,960
basically give you different view on
registers, which is very useful. These
275
00:21:47,960 --> 00:21:51,090
were all validated at 8 MHz. So
the product was designed for 8 MHz.
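A rough illustration of why retiring two multiplier bits per clock gives a 16-cycle worst case for a 32-bit multiply, as mentioned a little earlier. This is a plain radix-4 shift-and-add loop in Python, not Booth recoding and not a model of the actual ARM2 circuit; the function name and the early exit once the remaining multiplier bits are zero are assumptions made for the sketch.

```python
MASK32 = 0xFFFFFFFF


def mul_two_bits_per_step(a: int, b: int) -> tuple[int, int]:
    """Multiply modulo 2**32, consuming two bits of b per step.

    Returns (product, steps). Each step stands in for one clock.
    """
    a &= MASK32
    b &= MASK32
    acc = 0
    steps = 0
    while b:                        # stop early when no multiplier bits remain
        digit = b & 0b11            # next two bits of the multiplier: 0..3
        acc = (acc + digit * a) & MASK32
        a = (a << 2) & MASK32       # weight of the next digit is 4x larger
        b >>= 2
        steps += 1
    return acc, steps


print(mul_two_bits_per_step(7, 6))
print(mul_two_bits_per_step(3, 0xFFFFFFFF))
```

A 32-bit multiplier holds at most 32 / 2 = 16 two-bit digits, so the loop can never run more than 16 times, while small multipliers finish in far fewer steps.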
276
00:21:51,090 --> 00:21:54,020
The company that built them
said, okay, put the stamp on the outside
277
00:21:54,020 --> 00:21:57,681
saying 8 MHz. There's two
versions of this chip and I think they're
278
00:21:57,681 --> 00:22:01,390
actually the same silicon. I've got a
suspicion that they're the same. They just
279
00:22:01,390 --> 00:22:05,420
tested this batch saying that works at 10
or 12. So on my project list is
280
00:22:05,420 --> 00:22:12,270
overclocking my A3000 to see how fast
it'll go and see if I can get it to 12 MHz.
281
00:22:12,270 --> 00:22:18,559
Okay. So the banking of the registers.
ARM has got this, even modern 32 bit
282
00:22:18,559 --> 00:22:25,060
ARM: two types of interrupts, an IRQ
pronounced "erk" in English and FIQ
283
00:22:25,060 --> 00:22:28,559
pronounced "fic" in English. I appreciate it
doesn't mean quite the same thing in
284
00:22:28,559 --> 00:22:34,290
German. So I'll call it FIQ from here on in
and FIQ mode has this property where
285
00:22:34,290 --> 00:22:37,830
the top half of the registers are effectively
different registers when you get into
286
00:22:37,830 --> 00:22:42,670
this mode. So, first of all,
you don't have to back up those registers
287
00:22:42,670 --> 00:22:47,950
in your FIQ handler. And
secondly if you can write an FIQ handler
288
00:22:47,950 --> 00:22:51,970
using just those registers and there's
enough for doing most basic tasks, you
289
00:22:51,970 --> 00:22:55,940
don't have to save and restore anything
when you get an interrupt. So this is
290
00:22:55,940 --> 00:23:02,510
designed specifically to be a very, very
low-overhead interrupt mode. So I'm coming to
291
00:23:02,510 --> 00:23:07,890
why there's a 26 bit address space. And so
I found this link very unintuitive. So
292
00:23:07,890 --> 00:23:13,520
unlike 32 bit ARM, the more modern
1990s onwards ARMs, the program counter
293
00:23:13,520 --> 00:23:17,020
register 15 doesn't just contain the
program counter, but also contains the
294
00:23:17,020 --> 00:23:20,420
status flags and processor mode and
effectively all of the machine state is
295
00:23:20,420 --> 00:23:24,200
packed in there as well. So I asked the
question, well why, why 64 megabytes of
296
00:23:24,200 --> 00:23:27,700
address space? What's special about 64?
And Mike told me, well, you're asking the
297
00:23:27,700 --> 00:23:31,980
wrong question. It's the other way round.
What we wanted was this property that all
298
00:23:31,980 --> 00:23:35,990
of the machine state is in one register.
So this means you just have to save one
299
00:23:35,990 --> 00:23:40,000
register. Well, you know, what's the harm
in saving two registers? And he reminded
300
00:23:40,000 --> 00:23:43,490
me of this FIQ mode. Well, if you're
already in a state where you've really
301
00:23:43,490 --> 00:23:47,890
optimized your interrupt handler so that
you don't need any other registers to deal
302
00:23:47,890 --> 00:23:51,390
with, and you're not saving or restoring anything
apart from your PC, then saving another
303
00:23:51,390 --> 00:23:56,000
register is 50 percent overhead on that
operation. So the prime motivator
304
00:23:56,000 --> 00:24:00,500
was to keep all of the state in one word.
And then once you take all of the flags
305
00:24:00,500 --> 00:24:04,600
away, you're left with 24 bits for a word
aligned program counter, which leads to
306
00:24:04,600 --> 00:24:09,799
26 bit addressing. And that was then seen
as: well, 64 MB is enough. There were
307
00:24:09,799 --> 00:24:14,690
machines in 1985 that, you know, could
conceivably have more memory than that.
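The packing described above can be sketched in a few lines. This is only an illustration, not something shown in the talk: it uses the usual 26-bit ARM layout of R15 (condition flags in bits 31 to 28, the IRQ and FIQ disable bits in bits 27 and 26, the word-aligned program counter in bits 25 to 2, the processor mode in bits 1 and 0), and the helper names pack_r15 and unpack_r15 are made up for the example.

```python
# Layout of R15 on the 26-bit ARMs: flags, interrupt masks, PC and mode
# all share one 32-bit word.
FLAG_SHIFT = 28            # N, Z, C, V live in bits 31..28
IRQ_DISABLE = 1 << 27
FIQ_DISABLE = 1 << 26
PC_MASK = 0x03FFFFFC       # bits 25..2: 24 bits of word-aligned PC
MODE_MASK = 0x3            # bits 1..0: 0=USR, 1=FIQ, 2=IRQ, 3=SVC


def pack_r15(pc, mode, nzcv=0, irq_off=False, fiq_off=False):
    """Pack the whole machine state into one word (hypothetical helper)."""
    if pc & ~PC_MASK:
        raise ValueError("PC must be word aligned and below 64 MB")
    word = ((nzcv & 0xF) << FLAG_SHIFT) | pc | (mode & MODE_MASK)
    if irq_off:
        word |= IRQ_DISABLE
    if fiq_off:
        word |= FIQ_DISABLE
    return word


def unpack_r15(word):
    """Split a saved R15 back into its fields (hypothetical helper)."""
    return {
        "pc": word & PC_MASK,
        "mode": word & MODE_MASK,
        "nzcv": (word >> FLAG_SHIFT) & 0xF,
        "irq_off": bool(word & IRQ_DISABLE),
        "fiq_off": bool(word & FIQ_DISABLE),
    }
```

Saving that single word on an interrupt is enough to get back to exactly where the program was, flags and mode included, which is the property the 26-bit address space falls out of.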
308
00:24:14,690 --> 00:24:18,260
But for a desktop that was still seen as a
very large, very expensive amount of
309
00:24:18,260 --> 00:24:24,450
memory. The other thing is, you don't need to
invent another instruction to do
310
00:24:24,450 --> 00:24:28,170
return from exception so you can return
using one of your existing instructions.
311
00:24:28,170 --> 00:24:32,740
In this case, it's the subtract into PC
which looks a bit strange, but trust me,
312
00:24:32,740 --> 00:24:39,030
that does the right thing. So the memory
controller. This is - I mentioned the
313
00:24:39,030 --> 00:24:43,040
address translation, so this has an MMU in
it. In fact, the thing directly on the
314
00:24:43,040 --> 00:24:46,080
left hand side. I was
worried that these slides actually might
315
00:24:46,080 --> 00:24:49,520
not be the right resolution and they might
be sort of too small for people to see
316
00:24:49,520 --> 00:24:53,570
this. And in fact, its being the size of a
house is really useful here. So the left
317
00:24:53,570 --> 00:24:58,500
hand side of this chip is the MMU. This
chip is the same size as ARM2. Yeah,
318
00:24:58,500 --> 00:25:02,380
pretty much. So that's part of the reason
why the MMU is on another chip: ARM2 was
319
00:25:02,380 --> 00:25:06,610
as big as they could make it to fit the
price. I don't know if anyone here has done
320
00:25:06,610 --> 00:25:10,810
silicon design, but as the area goes
up, effectively your yield goes down and
321
00:25:10,810 --> 00:25:14,690
the price goes up; it's a non-linear effect on
price. So the MMU had to be on a separate
322
00:25:14,690 --> 00:25:19,910
chip and it's half the size of that as
well. MEMC does mostly mundane things
323
00:25:19,910 --> 00:25:23,920
like it drives DRAM, it does refresh for
DRAM and it converts from linear addresses
324
00:25:23,920 --> 00:25:33,799
into row and column addresses which DRAM
takes. So the key thing about this
325
00:25:33,799 --> 00:25:39,090
ARM and MEMC binding is that the key factor in
performance is making use of memory
326
00:25:39,090 --> 00:25:43,740
bandwidth. When the team had looked at all
the other processors in Project A before
327
00:25:43,740 --> 00:25:49,380
designing their own, one of the things
they looked at was how well they utilized
328
00:25:49,380 --> 00:25:56,320
DRAM, and the 68K and the Nat Semi chips made very,
very poor use of DRAM bandwidth.
329
00:25:56,320 --> 00:25:59,940
Steve said, well, okay. The DRAM is the
most expensive component of any of these
330
00:25:59,940 --> 00:26:04,280
machines and they're making poor use of
it. And I think a key insight here is if
331
00:26:04,280 --> 00:26:07,740
you maximize that use of the DRAM, then
you're going to be able to get much higher
332
00:26:07,740 --> 00:26:13,490
performance in those machines. And so it's
32 bits wide. The ARM is pipelined, so it can
333
00:26:13,490 --> 00:26:18,730
do a 32 bit word every cycle. And it also
indicates whether it's sequential or non-
334
00:26:18,730 --> 00:26:25,250
sequential addressing. This
then lets your MEMC
335
00:26:25,250 --> 00:26:31,200
decide whether to do an N cycle or an S
cycle. So there's a fast one and a slow
336
00:26:31,200 --> 00:26:35,220
one basically. So when you access a new
random address in DRAM, you have to open
337
00:26:35,220 --> 00:26:40,710
that row and that takes twice the time.
It's a 4 MHz cycle. But then once
338
00:26:40,710 --> 00:26:45,150
you've accessed that address, and once
you're accessing linearly ahead of that
339
00:26:45,150 --> 00:26:49,599
address, you can do fast page mode
accesses, which are 8 MHz cycles.
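As a back-of-envelope check of the N-cycle and S-cycle timing just described, here is a small sketch. It assumes a 250 ns N cycle (4 MHz) and a 125 ns S cycle (8 MHz), and it deliberately ignores instruction fetches, refresh and video DMA, so a real machine will come in somewhat lower. The function names are invented for the example.

```python
N_CYCLE_NS = 250    # non-sequential access: open a new row (4 MHz)
S_CYCLE_NS = 125    # sequential, fast page mode access (8 MHz)
WORD_BYTES = 4      # the bus is 32 bits wide


def burst_time_ns(words):
    """One N cycle for the first word, then S cycles for the rest."""
    return N_CYCLE_NS + (words - 1) * S_CYCLE_NS


def bandwidth_mb_s(words):
    """Bytes per microsecond, which is the same number as decimal MB/s."""
    return words * WORD_BYTES * 1000 / burst_time_ns(words)


if __name__ == "__main__":
    # Longer sequential runs get closer to the 32 MB/s ceiling.
    for words in (1, 4, 16):
        print(words, "words:", round(bandwidth_mb_s(words), 1), "MB/s")
```

The point of the model is only the shape of the curve: isolated accesses pay the slow cycle every time, while long linear runs amortize it.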
340
00:26:49,599 --> 00:26:54,030
So ultimately, that's the reason
why these load store multiples exist. The
341
00:26:54,030 --> 00:26:57,820
non-RISC instructions, they're there so
that you can stream out registers and back
342
00:26:57,820 --> 00:27:03,100
in and make use of this DRAM bandwidth. So
store multiple. This is just a simple
343
00:27:03,100 --> 00:27:07,860
calculation for 14 registers, you're
hitting about 25 megabytes a second out of
344
00:27:07,860 --> 00:27:13,083
30. So it's not 100%, but it's way
more than the 10th or 8th
345
00:27:13,083 --> 00:27:16,880
that a lot of the other processors
were using. So this was really good. This
346
00:27:16,880 --> 00:27:21,170
is the prime factor of why this machine
was so fast. It's effectively the load store
347
00:27:21,170 --> 00:27:28,069
multiple instructions and being able to
access the stuff linearly. So the MMU is
348
00:27:28,069 --> 00:27:36,980
weird. It's not a TLB in the traditional
sense. With TLBs today, if you take your
349
00:27:36,980 --> 00:27:43,040
MIPS chip or something where the TLB is
visible to software, it will map a virtual
350
00:27:43,040 --> 00:27:47,760
address into a chosen physical address and
you'll have some number of entries and you
351
00:27:47,760 --> 00:27:53,880
more or less arbitrarily, you know, poke
an entry in with the mapping in it.
352
00:27:53,880 --> 00:27:57,789
The MEMC does it upside down. So it's
got a fixed set of entries, one for every
353
00:27:57,789 --> 00:28:02,380
page in DRAM. And then for each of those
entries, it checks an incoming address to
354
00:28:02,380 --> 00:28:08,600
see whether it matches. So it has all of
those entries that we saw on the
355
00:28:08,600 --> 00:28:13,500
chip diagram a couple of slides ago. That
big left hand side had that big array. All
356
00:28:13,500 --> 00:28:16,831
of those are effectively just storing a
virtual address and then matching it; they each
357
00:28:16,831 --> 00:28:20,030
have a comparator. And then one of them
lights up and says yes, it's mine. So
358
00:28:20,030 --> 00:28:24,551
effectively, the physical page says that
virtual address is mine instead of the
359
00:28:24,551 --> 00:28:30,030
other way round. So this also limits your
memory. If you're saying I have to have
360
00:28:30,030 --> 00:28:34,480
one of these entries on chip per page of
physical memory and you don't want pages
361
00:28:34,480 --> 00:28:40,720
to be enormous. If you do the
maths, 4 MB over 128 pages is a
362
00:28:40,720 --> 00:28:44,460
32K page. If you don't want the page to
get much bigger than that and trust me you
363
00:28:44,460 --> 00:28:47,890
don't, then you need to add more of these
entries and it's already half the size of
364
00:28:47,890 --> 00:28:52,540
the chip. So effectively, this is one of
the limits of why you can only have 4 MB
365
00:28:52,540 --> 00:28:58,360
on one of these memory
controller chips. OK. So VIDC is the core
366
00:28:58,360 --> 00:29:05,230
of the video and sound system. It's a set
of FIFOs, shifters and a set of digital-to-analog
367
00:29:05,230 --> 00:29:09,970
converters for doing video and sound. You
stream stuff into the FIFOs and it does
368
00:29:09,970 --> 00:29:14,850
the display timing and palette lookup and
so forth. It has an 8 bit mode I
369
00:29:14,850 --> 00:29:21,210
mentioned. It's slightly strange. It also
has an output for a transparency bit. So in
370
00:29:21,210 --> 00:29:23,830
your palette you can set 12 bits of
color, but you can set a bit of
371
00:29:23,830 --> 00:29:31,580
transparency as well, so you can do video
genlocking quite easily with this. So
372
00:29:31,580 --> 00:29:36,701
there was a revision later on. Tudor
explained that the very first one had a bit
373
00:29:36,701 --> 00:29:41,230
of crosstalk between the video and the
sound, so you'd get sound with noise on
374
00:29:41,230 --> 00:29:45,480
it. That was basically video noise and
it's quite hard to get rid of. And so they
375
00:29:45,480 --> 00:29:50,000
did this revision and the way he fixed it
was quite cool. They shuffled the power
376
00:29:50,000 --> 00:29:53,690
supply around and did all the sensible
engineering things. But he also filtered
377
00:29:53,690 --> 00:29:58,050
out a bit of the noise that is being
output on the sound. He
378
00:29:58,050 --> 00:30:02,630
inverted it and then fed that back in as
the reference current for the DACs. So that
379
00:30:02,630 --> 00:30:06,090
was sort of self-compensating and took the
noise out, a bit like noise-cancelling
380
00:30:06,090 --> 00:30:13,239
headphones. It was kind of a nice hack.
And that was VIDC1. OK, the final
381
00:30:13,239 --> 00:30:17,700
one, I'm going to stop showing you chip
plots after this, unfortunately, but just
382
00:30:17,700 --> 00:30:20,980
get your fill while we're here. And again,
I'm really glad this is enormous for the
383
00:30:20,980 --> 00:30:25,590
people in the room and maybe those zooming
in online. There's a cool little
384
00:30:25,590 --> 00:30:29,510
Illuminati eye logo in the bottom left
corner. So I feared that you weren't gonna
385
00:30:29,510 --> 00:30:34,010
be able to see and I didn't have time to
do a zoomed-in version, but... Okay. So IOC
386
00:30:34,010 --> 00:30:37,720
is the center of the IO system. As much of
the IO system as possible, all the random
387
00:30:37,720 --> 00:30:41,030
bits of glue logic to do things like
timing (some peripherals are slower than
388
00:30:41,030 --> 00:30:47,309
others), lives in IOC. It contains a UART
for the keyboard, so the keyboard is
389
00:30:47,309 --> 00:30:52,020
looked after by an 8051 microcontroller. Just
nice and easy, you don't have to do scanning
390
00:30:52,020 --> 00:30:57,429
in software. This microcontroller just sends
stuff up a serial port to this chip. So
391
00:30:57,429 --> 00:31:02,039
KART: keyboard asynchronous receiver and
transmitter. It was at one point called
392
00:31:02,039 --> 00:31:06,080
the fast asynchronous receiver and
transmitter. Mike got forced to change the
393
00:31:06,080 --> 00:31:11,900
name. Not everyone has a 12 year old sense
of humor, but I admire his spirit. So the
394
00:31:11,900 --> 00:31:15,630
other thing it does is interrupts. All the
interrupts go into IOC, and it's got masks
395
00:31:15,630 --> 00:31:20,341
and consolidates them effectively for
sending an interrupt up to the ARM.
396
00:31:20,341 --> 00:31:24,690
The ARM can then check the status and do
fast response to it. So the eye of providence
397
00:31:24,690 --> 00:31:27,540
there, the little logo I pointed out, Mike
said he put that in for future
398
00:31:27,540 --> 00:31:35,799
archaeologists to wonder about. Okay.
That was it. I was hoping there'd be
399
00:31:35,799 --> 00:31:39,440
this big back story about, you know, he
was in the Illuminati or something. Maybe
400
00:31:39,440 --> 00:31:44,690
he is, but not allowed to say. Anyway, so just
like the other dev board I showed you,
401
00:31:44,690 --> 00:31:49,930
this one's the A500 2P; it's still a second
processor that plugs into a BBC Micro.
402
00:31:49,930 --> 00:31:54,460
It's still got this host having disk
drives and so forth attached to it and
403
00:31:54,460 --> 00:32:00,289
pushing stuff down the tube into the
memory here. But now, finally
404
00:32:00,289 --> 00:32:04,730
all of this, the chipset, is now
assembled in one place. So this is
405
00:32:04,730 --> 00:32:08,100
starting to look like an Archimedes. It's
got video out. It's got a keyboard
406
00:32:08,100 --> 00:32:11,620
interface. It's got some expansion stuff.
So this is for bring-up and an early software
407
00:32:11,620 --> 00:32:17,720
head start. But very shortly afterwards, we
got the A500, internal to Acorn. And
408
00:32:17,720 --> 00:32:21,460
this is really the first Archimedes. This
is the prototype Archimedes. It's actually got
409
00:32:21,460 --> 00:32:27,300
a gorgeous gray brick sort of look to it,
kind of concrete. It weighs like concrete,
410
00:32:27,300 --> 00:32:31,480
too, but it has all the hallmarks. It's
got the IO interfaces, it's got the
411
00:32:31,480 --> 00:32:36,810
expansion slots. You can see at the back.
It runs the same operating
412
00:32:36,810 --> 00:32:39,550
system. Now, this was used for the OS
development. There's only a couple of
413
00:32:39,550 --> 00:32:44,540
hundred of these made. Well, this is
serial 222. So this is one of the last,
414
00:32:44,540 --> 00:32:50,730
I think. But yeah, only internal to
Acorn. There are lots of nice tweaks to this
415
00:32:50,730 --> 00:32:55,700
machine. So the hardware team had designed
this, Tudor designed this as well as the
416
00:32:55,700 --> 00:33:01,390
video system. And he said, well, his A500
was the special one: he had a video
417
00:33:01,390 --> 00:33:05,409
controller where he'd hand-picked one
of the VIDCs so that, instead of running
418
00:33:05,409 --> 00:33:10,855
at 24 MHz, it would run at 56, given some silicon
variation in manufacture. So he found a
419
00:33:10,855 --> 00:33:16,169
56 MHz part so he could do, I
think it was, 1024 x 768, which is way out
420
00:33:16,169 --> 00:33:22,400
of spec for the rest of the Archimedes.
So he had the really, really cool machine.
421
00:33:22,400 --> 00:33:26,050
They also ran some of them at 12 MHz
as well instead of 8. This is a massive
422
00:33:26,050 --> 00:33:30,500
performance improvement. I think it used
expensive memory, which is kind of out of
423
00:33:30,500 --> 00:33:37,180
reach for the product. Right. So
believe me, this is the simplified
424
00:33:37,180 --> 00:33:41,240
circuit diagram. The technical reference
manuals are available online if anyone wants
425
00:33:41,240 --> 00:33:47,969
the complicated one. The main parts of the
display are ARM, MEMC, VIDC and some RAM
426
00:33:47,969 --> 00:33:52,049
and we'll have a little walk through them. So
the clocks are generated actually by the
427
00:33:52,049 --> 00:33:56,815
memory controller. Memory controller gives
the clocks to the ARM. The main reason for
428
00:33:56,815 --> 00:34:00,327
this is that the memory controller has to
do some slow things now and then. It has
429
00:34:00,327 --> 00:34:05,860
to open pages of DRAMs, refresh cycles and
things. So it generates
430
00:34:05,860 --> 00:34:11,559
the clock and it pauses the CPU by
stopping that clock from time to time.
431
00:34:11,559 --> 00:34:15,929
When you do a DRAM access, on the address
bus along the top, the ARM outputs an
432
00:34:15,929 --> 00:34:19,720
address that goes into the MEMC. The
MEMC then converts that, it does an address
433
00:34:19,720 --> 00:34:23,339
translation and then it converts that into
row and column addresses suitable for
434
00:34:23,339 --> 00:34:27,139
DRAM. And then, if you're doing a read, the
DRAM outputs the data
435
00:34:27,139 --> 00:34:33,419
onto the data bus, which ARM then sees.
MEMC is on the critical path for
436
00:34:33,419 --> 00:34:37,109
this, but the address flows through MEMC
effectively. Notice that MEMC is not on
437
00:34:37,109 --> 00:34:41,329
the data bus. It just gets addresses
flowing through it, this is important later
438
00:34:41,329 --> 00:34:44,892
on. ROM is another slow thing.
439
00:34:44,892 --> 00:34:49,204
Another reason why MEMC might slow down
the access from the CPU, it works in a
440
00:34:49,204 --> 00:34:54,099
similar sort of way. There is also a
permission check done when you're doing
441
00:34:54,099 --> 00:35:00,259
the address translation: user
permission versus OS, or supervisor.
442
00:35:00,259 --> 00:35:05,356
And so this information is output as part
of the cycle when the ARM does that access.
443
00:35:05,356 --> 00:35:09,730
If you miss in that translation, you get
a page fault or permission fault, then an
444
00:35:09,730 --> 00:35:13,391
abort signal comes back and you
take an exception.
445
00:35:13,391 --> 00:35:17,410
And the ARM deals with that in software.
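The lookup, permission check and abort just described can be modelled roughly as follows. This is a toy sketch, not the real MEMC logic: it assumes the 4 MB configuration with 128 physical pages of 32 KB mentioned earlier, reduces the permission levels to a single user-accessible flag, and uses a loop where the hardware compares all entries in parallel. ToyTranslator, map_page and Abort are names invented for the example.

```python
PAGE_SIZE = 32 * 1024      # 32 KB pages, as in the 4 MB configuration
NUM_PHYS_PAGES = 128       # 4 MB / 32 KB: one entry per physical page


class Abort(Exception):
    """Stands in for the abort signal sent back to the ARM."""


class ToyTranslator:
    def __init__(self):
        # Index = physical page number; value = (logical page, user_ok) or None.
        self.entries = [None] * NUM_PHYS_PAGES

    def map_page(self, phys_page, logical_page, user_ok=True):
        """Tell one physical page which logical page it answers to."""
        self.entries[phys_page] = (logical_page, user_ok)

    def translate(self, logical_addr, user_mode):
        logical_page, offset = divmod(logical_addr, PAGE_SIZE)
        # The hardware checks every entry at once; a loop models that here.
        for phys_page, entry in enumerate(self.entries):
            if entry is not None and entry[0] == logical_page:
                if user_mode and not entry[1]:
                    raise Abort("permission fault")
                return phys_page * PAGE_SIZE + offset
        raise Abort("page fault")
```

Note that the table is indexed by physical page, so its size is fixed by the amount of RAM rather than by the size of the virtual address space.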
446
00:35:17,410 --> 00:35:22,289
The data bus is a critical path, and so
the IO stuff is buffered, it is kept away
447
00:35:22,289 --> 00:35:27,599
from that. So the IO bus is 16 bits and
not a lot of 32-bit peripherals were around
448
00:35:27,599 --> 00:35:32,599
in those days. All the peripherals were 8 or
16 bits. So that's the right thing to do.
449
00:35:32,599 --> 00:35:36,150
The IOC decodes that and there's a
handshake with MEMC. If it needs more
450
00:35:36,150 --> 00:35:39,809
time, if it's accessing one of the
expansion cards and the expansion card
451
00:35:39,809 --> 00:35:47,691
has something slow on it then that's dealt
with in the IOC. So I mentioned the
452
00:35:47,691 --> 00:35:53,680
interrupt status that gets funneled into
IOC and then back out again. There's a
453
00:35:53,680 --> 00:35:57,599
VSync interrupt, but not an HSync
interrupt. You have to use timers for that,
454
00:35:57,599 --> 00:36:01,500
really annoyingly. There's one timer and
there's a 2 MHz timer available. I
455
00:36:01,500 --> 00:36:05,199
think I had that in a previous slide,
forgot to mention it. So if you want to
456
00:36:05,199 --> 00:36:09,730
do funny palette switching stuff or copper
bars or something - that's possible with the
457
00:36:09,730 --> 00:36:13,400
timers; it's also a simple hardware mod to
make a real HSync interrupt as well.
458
00:36:13,400 --> 00:36:18,529
There's some spare interrupt inputs on the
IOC, as an exercise for you. So the bit I
459
00:36:18,529 --> 00:36:23,440
really like about this system, I mentioned
that MEMC is not on the data bus. The VIDC
460
00:36:23,440 --> 00:36:28,079
is only on the data bus and it doesn't
have an address bus either. The VIDC is the
461
00:36:28,079 --> 00:36:31,200
thing responsible for turning the frame
buffer into video, reading that frame
462
00:36:31,200 --> 00:36:35,509
buffer out of RAM, so on. So how does it
actually do that RAM read without the
463
00:36:35,509 --> 00:36:40,780
address? Well, the MEMC contains all of
the registers for doing this DMA: the
464
00:36:40,780 --> 00:36:44,970
start of the frame buffer, the current
position and size, and so on. They all
465
00:36:44,970 --> 00:36:51,410
live in the MEMC. So there's a handshake
where VIDC sends a request up to the MEMC
466
00:36:51,410 --> 00:36:55,239
when its FIFO gets low. The MEMC then
actually generates the address into the
467
00:36:55,239 --> 00:37:01,102
DRAM, DRAM outputs that data and
then the MEMC gives an acknowledge
468
00:37:01,102 --> 00:37:05,509
to the ARM... excuse me, too many
chips. The MEMC gives an acknowledge to
469
00:37:05,509 --> 00:37:11,210
VIDC, which then latches that data
into the FIFO. So this partitioning is
470
00:37:11,210 --> 00:37:16,710
quite neat. A lot of the video...
the video DMA stuff all lives in MEMC, and
471
00:37:16,710 --> 00:37:20,799
there's this kind of split across the two
chips. For sound, I've just
472
00:37:20,799 --> 00:37:24,839
highlighted one interrupt that comes from
MEMC. Sound works exactly the same way,
473
00:37:24,839 --> 00:37:27,730
except there's a double buffering scheme
that goes on. And when one half of it
474
00:37:27,730 --> 00:37:32,359
becomes empty, you get an interrupt so you
can refill that so you don't glitch your
475
00:37:32,359 --> 00:37:39,700
sound. So this all works really very
smoothly. So finally, the high-res mono
476
00:37:39,700 --> 00:37:44,509
thing that I mentioned before: it's quite a
novel way they did that. Tudor had realized
477
00:37:44,509 --> 00:37:49,931
that with one external component, a
shift register running very fast, he
478
00:37:49,931 --> 00:37:53,400
could implement this very high resolution
mode without really affecting the rest of
479
00:37:53,400 --> 00:37:59,276
the chip. So VIDC still runs at
24 MHz, at sort of VGA resolution. It
480
00:37:59,276 --> 00:38:05,290
outputs on a digital bus that was a test
port, originally. It outputs 4 bits. So 4
481
00:38:05,290 --> 00:38:09,420
pixels in one chunk at 24 MHz and
this external component then shifts
482
00:38:09,420 --> 00:38:13,880
through that at 4 times the speed. There's
one component. I mean, this is a
483
00:38:13,880 --> 00:38:17,569
very cheap way of doing this. And as I
said, this high-res mode is very
484
00:38:17,569 --> 00:38:23,009
unusual for machines of this era.
I've got a feeling an A500 the top end
485
00:38:23,009 --> 00:38:26,979
machine, if anyone's got one of these and
wants to try this trick, please get in
486
00:38:26,979 --> 00:38:31,080
touch. I've got a feeling an
A500 will do 1280 x 1024 by
487
00:38:31,080 --> 00:38:35,750
overclocking this. I think all of the
parts survive it. But for some reason,
488
00:38:35,750 --> 00:38:40,369
Acorn didn't support that on the board.
And finally, clock selection: VIDC on
489
00:38:40,369 --> 00:38:44,839
some of the machines has quite a flexible set
of clocks for different resolutions,
490
00:38:44,839 --> 00:38:51,170
basically. So MEMC is not on the data bus.
How do we program it? It's got registers
491
00:38:51,170 --> 00:38:55,259
for DMA and it's got all this address
translation. So the memory map I showed
492
00:38:55,259 --> 00:39:00,909
before has an 8 MB space reserved for
the address translation registers. It
493
00:39:00,909 --> 00:39:04,690
doesn't have 8 MB of it. I mean,
it doesn't have two million 32-bit registers
494
00:39:04,690 --> 00:39:09,819
behind there, which is a hint of what's
going on here. So what you do is you write
495
00:39:09,819 --> 00:39:14,410
any value to this space and you encode the
information that you want to put into one
496
00:39:14,410 --> 00:39:19,539
of these registers in the address. So in this
address, the top three bits are 1 - it's
497
00:39:19,539 --> 00:39:25,230
in the top 8 MB of the 64 MB
address space and you format your
498
00:39:25,230 --> 00:39:28,999
logical and physical page information in this
address and then you write any byte
499
00:39:28,999 --> 00:39:35,479
effectively. This sort of feels
really dirty, but also really a very nice
500
00:39:35,479 --> 00:39:39,779
way of doing it because there's no other
space in the address map. And this leads
501
00:39:39,779 --> 00:39:45,069
to the price balance. So it's not
worth having a data bus going into
502
00:39:45,069 --> 00:39:49,809
MEMC costing 32 more pins just to write
these registers as opposed to playing this
503
00:39:49,809 --> 00:39:55,849
sort of trick. If you have that address
bus... that data bus, just for
504
00:39:55,849 --> 00:39:59,990
that, then you have to go to a more
expensive package. And this was
505
00:39:59,990 --> 00:40:05,140
really in their minds: a 68 pin chip
versus an 84 pin chip. It was a big deal.
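The register-programming trick described above, where the value travels in the address because MEMC never sees the data bus, can be sketched like this. The base address follows from the talk (top three bits of a 26-bit address set, so the top 8 MB of the 64 MB space), but the field widths and their positions are assumptions made purely for illustration and are not the real MEMC encoding. ToyMemc and encode_write_address are invented names.

```python
TRANSLATOR_BASE = 0b111 << 23       # top three bits of a 26-bit address set
ADDRESS_MASK = (1 << 26) - 1

# Hypothetical field layout, for illustration only; the real MEMC packs
# its fields differently.
PHYS_BITS = 7                       # 128 physical pages
LOG_BITS = 10                       # assumed logical page number width


def encode_write_address(phys_page, logical_page):
    """Build the address whose bits carry the mapping (toy encoding)."""
    assert 0 <= phys_page < (1 << PHYS_BITS)
    assert 0 <= logical_page < (1 << LOG_BITS)
    return TRANSLATOR_BASE | (logical_page << PHYS_BITS) | phys_page


class ToyMemc:
    def __init__(self):
        self.mapping = [None] * (1 << PHYS_BITS)

    def write(self, address, data):
        """Decode a write; 'data' is ignored because MEMC is not on the data bus."""
        address &= ADDRESS_MASK
        if address < TRANSLATOR_BASE:
            return False                      # an ordinary memory or IO write
        phys_page = address & ((1 << PHYS_BITS) - 1)
        logical_page = (address >> PHYS_BITS) & ((1 << LOG_BITS) - 1)
        self.mapping[phys_page] = logical_page
        return True
```

Whatever byte the CPU stores, the outcome is the same; only the address matters, which is what lets the chip do without the extra data pins.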
506
00:40:05,140 --> 00:40:08,719
So they really strove
to make sure everything was in the very smallest
507
00:40:08,719 --> 00:40:13,250
package possible. And this system
partitioning effort led to these sorts of
508
00:40:13,250 --> 00:40:22,890
tricks to then program it. So on the
A540, we get multiple MEMCs. Each one is
509
00:40:22,890 --> 00:40:27,329
assigned a colored stripe here of the
physical address space. So you have a
510
00:40:27,329 --> 00:40:31,049
16 MB space, each one looks after
4 MB of it. But then when you do a
511
00:40:31,049 --> 00:40:36,039
virtual access in the bottom half of the
user space, regular program access, all of
512
00:40:36,039 --> 00:40:39,362
them light up and all of them will
translate that address in parallel. And
513
00:40:39,362 --> 00:40:43,663
one of them hopefully will translate and
then energize the RAM to do the read, for
514
00:40:43,663 --> 00:40:49,930
example. When you put an ARM 3 in this
system, the ARM 3 has its cache and then
515
00:40:49,930 --> 00:40:54,420
the address leads into the MEMC. So then
that means that the address is being
516
00:40:54,420 --> 00:40:58,240
translated outside of the cache or after
the cache. So you're caching virtual
517
00:40:58,240 --> 00:41:02,900
addresses and as we all know, this is kind
of bad for performance because whenever
518
00:41:02,900 --> 00:41:07,459
you change that virtual address space, you
have to invalidate your cache. Or tag it,
519
00:41:07,459 --> 00:41:11,459
but they didn't do that. There's other ways
of solving this problem. Basically on this
520
00:41:11,459 --> 00:41:14,950
machine, what you need to do is invalidate
the whole cache. It's quite a quick
521
00:41:14,950 --> 00:41:23,540
operation, but it's still not good for
performance to have an empty cache. The
522
00:41:23,540 --> 00:41:28,393
only DMA present in the system is for the
video, for the video and sound. I/O
523
00:41:28,393 --> 00:41:32,569
doesn't have any DMA at all. And this is
another area where, as a younger engineer, I thought:
524
00:41:32,569 --> 00:41:35,969
"crap, why didn't they have DMA? That
would be way better." DMA is the solution
525
00:41:35,969 --> 00:41:40,989
to everyone's problems, as we all know.
And I think the quote on the right
526
00:41:40,989 --> 00:41:47,390
ties in with the ACORN team's discovery
that all of these other processors needed
527
00:41:47,390 --> 00:41:51,969
quite complex chipsets, quite expensive
support chips. So the quote on the right
528
00:41:51,969 --> 00:41:56,539
says that, for some chips,
vendors will be charging more for their
529
00:41:56,539 --> 00:42:03,259
DMA devices even than the CPU. So not
having a dedicated DMA engine on board is a
530
00:42:03,259 --> 00:42:08,930
massive cost saving. The comment I made on
the previous 2 slides about the system
531
00:42:08,930 --> 00:42:14,440
partitioning, putting a lot of attention
into how many pins were on one chip versus
532
00:42:14,440 --> 00:42:19,380
another, how many buses were going around
the place. Not having IOC having to access
533
00:42:19,380 --> 00:42:25,019
memory was a massive saving in cost for
the number of pins and the system as a
534
00:42:25,019 --> 00:42:33,539
whole. The other thing is the FIQ mode
was effectively the means for doing IO.
535
00:42:33,539 --> 00:42:37,999
Therefore, FIQ Mode was designed to be an
incredibly low overhead way of doing
536
00:42:37,999 --> 00:42:44,010
programmed IO, having the CPU do the
IO. So this was saying that the CPU is
537
00:42:44,010 --> 00:42:48,850
going to be doing all of the IO stuff, but
let's just optimize it, let's make
538
00:42:48,850 --> 00:42:53,930
it as good as it could be and that's
what led to the programmed IO. Also,
539
00:42:53,930 --> 00:42:57,849
remember ARM 2 didn't have a cache. If you
don't have a cache on your CPU then
540
00:42:57,849 --> 00:43:03,099
DMA is going to hold up the CPU anyway,
so no cycles. DMA is not any
541
00:43:03,099 --> 00:43:06,960
performance gain. You may as well get
the CPU to do it and then get the CPU to
542
00:43:06,960 --> 00:43:13,029
do it in the lowest-overhead way possible.
I think this can be summarized as bringing
543
00:43:13,029 --> 00:43:17,410
the "RISC principles" to the system. So
the RISC principles say, for your CPU, you
544
00:43:17,410 --> 00:43:21,420
don't put anything in the CPU that you can
do in software and this is saying, okay,
545
00:43:21,420 --> 00:43:26,789
well, actually software can do the IO just
as well without a cache as the DMA
546
00:43:26,789 --> 00:43:29,799
system. So let's get software to do that.
And I think this is a kind of a nice way
547
00:43:29,799 --> 00:43:34,339
of seeing it. This is part of the cost
optimization for really very little
548
00:43:34,339 --> 00:43:39,910
degradation in performance compared to
doing it in hardware. So this is an IO card.
549
00:43:39,910 --> 00:43:43,380
They're Eurocards, nice and easy. The
only thing I wanted to say here was this
550
00:43:43,380 --> 00:43:48,839
is my SCSI card and it has a ROM on the
left hand side. And so. This is the
551
00:43:48,839 --> 00:43:53,731
expansion ROM basically many, many years
before PCI made this popular. Your drivers
552
00:43:53,731 --> 00:43:58,950
are on this ROM. This is a SCSI disc
plugging into this and you can plug this
553
00:43:58,950 --> 00:44:02,990
card in and then boot off the disk. You
don't need any other software to make it
554
00:44:02,990 --> 00:44:07,670
work. So this is just a very nice user
experience. There is no messing around
555
00:44:07,670 --> 00:44:11,690
with configuring IO windows or interrupts
or any of the ISA sort of stuff that was
556
00:44:11,690 --> 00:44:17,869
going on at the time. So to summarize some
of the hardware stuff that we've seen,
557
00:44:17,869 --> 00:44:21,950
the ARM is pipelined and it has the load-
store-multiple instructions, which make
558
00:44:21,950 --> 00:44:27,950
for a very high bandwidth utilization.
That's what gives it its high performance.
559
00:44:27,950 --> 00:44:32,670
The machine was really simple. So
attention to detail about separating,
560
00:44:32,670 --> 00:44:37,239
partitioning the work between the chips
and reducing the chip cost as much as
561
00:44:37,239 --> 00:44:44,569
possible. Keeping that balanced was really
a good idea. The machine was designed when
562
00:44:44,569 --> 00:44:49,400
memory and CPUs were about the same speed.
So this is before that kind of flipped
563
00:44:49,400 --> 00:44:52,910
over. An 8 MHz ARM 2 was
designed to use 8 MHz memory.
564
00:44:52,910 --> 00:44:56,509
There's no need to have a cache at all on
there. These days it sounds really crazy
565
00:44:56,509 --> 00:45:01,410
not to have a cache on the CPU, but if your
memory is not that much slower, then this
566
00:45:01,410 --> 00:45:07,809
is a huge cost saving, and it is also a risk
saving. This was the first real proper CPU.
567
00:45:07,809 --> 00:45:11,670
If we don't count ARM 1, so say ARM 1 was a
test, then ARM 2 is, you know, the
568
00:45:11,670 --> 00:45:16,490
first product CPU. And having a cache on
that would have been a huge risk for a
569
00:45:16,490 --> 00:45:20,640
design team that hadn't dealt with
structures that complicated at that
570
00:45:20,640 --> 00:45:22,599
point. So that was the right
thing to do, I think.
571
00:45:22,599 --> 00:45:25,569
And I talked about DMA. I'm actually a
572
00:45:25,569 --> 00:45:28,636
convert on this. I thought this was crap.
And actually, I think this was a really
573
00:45:28,636 --> 00:45:33,319
good example of balanced design. What's
the right tool for the job? Software is
574
00:45:33,319 --> 00:45:37,757
going to do the IO, so let's make sure
that FIQ mode makes sure that
575
00:45:37,757 --> 00:45:44,640
there's as low overhead as possible. We
talked about system partitioning. The MMU.
576
00:45:44,640 --> 00:45:49,299
I still think it's weird and
backward. I think there is a
577
00:45:49,299 --> 00:45:56,029
strong argument though that a more
familiar TLB is massively complicated
578
00:45:56,029 --> 00:45:59,339
compared to what they did here. And I
think the main drive here was not just
579
00:45:59,339 --> 00:46:06,120
area on the chip, but also to make it much
simpler to implement. So it worked. And I
580
00:46:06,120 --> 00:46:09,450
think they really didn't have
that many shots at doing this. This wasn't
581
00:46:09,450 --> 00:46:14,779
a company or a team that could afford to
have many goes at this product. And I
582
00:46:14,779 --> 00:46:20,660
think that says it all. I think they did a
great job. Okay. So the OS story is a
583
00:46:20,660 --> 00:46:24,599
little bit more complicated. Remember,
it's gonna be this office automation
584
00:46:24,599 --> 00:46:28,920
machine, a bit like a Xerox Star. It was going
to have this wonderful high-res mono mode
585
00:46:28,920 --> 00:46:33,729
and people were gonna be laser printing from
it. So just like Xerox PARC, Acorn started
586
00:46:33,729 --> 00:46:37,911
a Palo Alto based research center.
Californians and beanbags writing an
587
00:46:37,911 --> 00:46:43,319
operating system using a microkernel in
Modula-2: all of the trendy boxes ticked
588
00:46:43,319 --> 00:46:49,400
here for the mid 80s. It was, by the sounds
of it, a very advanced operating system and it
589
00:46:49,400 --> 00:46:54,029
did virtual memory and so on. It was very
resource hungry, though. And it was never
590
00:46:54,029 --> 00:47:00,130
really very performant. Ultimately, the
hardware got done quicker than the
591
00:47:00,130 --> 00:47:05,403
software. And after a year or two,
management got the jitters. Hardware was
592
00:47:05,403 --> 00:47:09,320
looming, and they said, well, next year we're
going to have the computer ready. Where's
593
00:47:09,320 --> 00:47:13,170
the operating system? And the project got
canned. And this is a real shame. I'd love
594
00:47:13,170 --> 00:47:16,599
to know more about this operating system.
Virtually nothing is documented outside of
595
00:47:16,599 --> 00:47:21,569
Acorn. Even the people I spoke to didn't
work on this. A bunch of people in
596
00:47:21,569 --> 00:47:25,250
California that kind of disappeared with
it. So if anyone has this software
597
00:47:25,250 --> 00:47:29,259
archived anywhere, then get in touch. The
computer museum around the corner from me
598
00:47:29,259 --> 00:47:35,369
is raring to go on that. That'll be a really
cool thing to archive. So anyway, they
599
00:47:35,369 --> 00:47:39,979
had now a desperate situation. They had to
go to Plan B, which was to write, in under a year,
600
00:47:39,979 --> 00:47:43,239
an operating system for the machine
that was on its way to being delivered.
601
00:47:43,239 --> 00:47:48,260
And it kind of shows. Arthur was, I mean, I
think the team did a really good job in
602
00:47:48,260 --> 00:47:53,160
getting something out of the door in half
a year, but it was a little bit flaky.
603
00:47:53,160 --> 00:47:57,160
RISC OS then, a year later, developed
from Arthur. I don't know if anyone's
604
00:47:57,160 --> 00:48:01,609
heard of RISC OS, but Arthur is
very, very niche and basically got
605
00:48:01,609 --> 00:48:07,170
completely replaced by RISC OS because
it was a bit less usable than RISC OS.
606
00:48:07,170 --> 00:48:12,059
Another really strong point that this
had: it's quite a big ROM. So 2 MB going
607
00:48:12,059 --> 00:48:17,400
up... sorry, 0.5 MB in the 80s going
up to 2 MB in the early 90s.
608
00:48:17,400 --> 00:48:21,739
There's a lot of stuff in ROM. One of
those things is BBC Basic 5. I know
609
00:48:21,739 --> 00:48:29,289
it's 2019, and I know Basic is basic, but
BBC Basic is actually quite good. It has
610
00:48:29,289 --> 00:48:32,859
procedures and it's got support for all
the graphics and sound. You could write GUI
611
00:48:32,859 --> 00:48:36,660
applications in Basic and a lot of people
did. It's also very fast. So Sophie Wilson
612
00:48:36,660 --> 00:48:42,920
wrote this very, very optimized Basic
interpreter. I talked about the modules
613
00:48:42,920 --> 00:48:45,589
and podules. These are the expansion
ROM things. And a really great user
614
00:48:45,589 --> 00:48:50,589
experience there. But speaking of user
experience, this was Arthur. I never used
615
00:48:50,589 --> 00:48:57,969
Arthur. I just dug out a ROM and had a
play with it. It's bloody horrible. So that
616
00:48:57,969 --> 00:49:03,819
went away quickly. At the time also. So
part of this emergency plan B was to take
617
00:49:03,819 --> 00:49:08,210
the Acornsoft team who were supposed to
be writing applications for this and get
618
00:49:08,210 --> 00:49:12,079
them to quickly knock out an operating
system. So at launch, basically, this is
619
00:49:12,079 --> 00:49:15,750
one of the only things that you could do
with the machine. It had a great demo called
620
00:49:15,750 --> 00:49:20,569
Lander, of a great game called Zarch,
which is 3D space. You could fly around.
621
00:50:20,569 --> 00:50:27,029
But it didn't have serious business
applications. And, you know, it was very -
622
00:49:27,029 --> 00:49:31,079
there was not much you could do with this
really expensive machine at launch and
623
00:49:31,079 --> 00:49:35,450
that really hurt it, I think. So then we
get RISC OS 2 in 1988 and this is now
624
00:49:35,450 --> 00:49:42,219
looking less like a vomit sort of thing,
much nicer machine. And then eventually
625
00:49:42,219 --> 00:49:46,749
RISC OS 3. It has drag and drop between
applications. It's all multitasking,
626
00:49:46,749 --> 00:49:52,849
does outline font anti-aliasing
and so on. So just lastly, I want to
627
00:49:52,849 --> 00:49:55,769
quickly touch on a really interesting
operating system: Acorn had a Unix
628
00:49:55,769 --> 00:49:59,079
operating system. So as well as being a
geek, I'm also a Unix geek and I've always
629
00:49:59,079 --> 00:50:04,609
been fascinated by RISC iX. These machines
are astonishingly expensive. They were
630
00:50:04,609 --> 00:50:08,191
the existing Archimedes machines with a
different sticker on. So that's A540 with
631
00:50:08,191 --> 00:50:14,850
a sticker on the front. And this OS
was developed after the Archimedes was
632
00:50:14,850 --> 00:50:18,359
already designed at that point when this
OS was being developed. So
633
00:50:18,359 --> 00:50:20,950
there's a lot of stuff about the hardware
that wasn't quite right for a Unix
634
00:50:20,950 --> 00:50:26,230
operating system. 32K page size on a 4
megabyte machine really, really killed you
635
00:50:26,230 --> 00:50:29,900
in terms of your page cache and that
kind of thing. They turned this into a bit
636
00:50:29,900 --> 00:50:35,089
of an opportunity. At least they made good
on some of this. There was quite a novel
637
00:50:35,089 --> 00:50:42,380
online decompression scheme for you to
demand-page text from a binary
638
00:50:42,380 --> 00:50:46,170
and it would decompress into your 32K
page, but it was stored in a
639
00:50:46,170 --> 00:50:53,659
sparse way on disk. So actually on-disk
use was a lot less than you'd expect. The
640
00:50:53,659 --> 00:50:56,638
only way it fit on some of the
smaller machines.
641
00:50:56,638 --> 00:51:02,160
Also, Acorn Tech, the department that
designed the Cybertruck, it turns out.
642
00:51:02,160 --> 00:51:06,228
This was their view of the A680,
which is an unreleased workstation.
643
00:51:06,228 --> 00:51:08,940
I love this picture.
I like that piece of cheese or
644
00:51:08,940 --> 00:51:13,379
cake as the mouse. That's my favorite
part. But this is the real machine. So
645
00:51:13,379 --> 00:51:18,730
this is an unreleased prototype I found at
the computer museum. It's notable. And
646
00:51:18,730 --> 00:51:22,130
it's got 2 MEMCs. It's got 8 MB of
RAM. It's only designed to run RISC iX,
647
00:51:22,130 --> 00:51:26,099
the Unix operating system, and has a high-res
monitor only; it doesn't have color. It was
648
00:51:26,099 --> 00:51:30,279
designed to run FrameMaker and drive
laser printers and be a kind of desktop
649
00:51:30,279 --> 00:51:35,249
publishing workstation. I've always been
fascinated by RISC iX, as I said. A while
650
00:51:35,249 --> 00:51:41,450
ago I hacked around on ArcEm for a while.
I got it booting in ArcEm. I'd never seen
651
00:51:41,450 --> 00:51:46,640
this before. I never used a RISC iX
machine. So there we go, it boots, it is
652
00:51:46,640 --> 00:51:51,130
multi-user. But wait, there's more. It has
a really cool X server, a very fast one. I
653
00:51:51,130 --> 00:51:54,730
think Sophie Wilson again worked on
the X server here. So it's very well
654
00:51:54,730 --> 00:51:58,019
optimized and very fast for a machine of
its era. And it makes quite a nice little
655
00:51:58,019 --> 00:52:02,900
Unix workstation. It's quite a cool little
system. By the way, Tudor, the guy that
656
00:52:02,900 --> 00:52:07,099
designed the VIDC and the IO system, called
me a sado for getting this working in
657
00:52:07,099 --> 00:52:14,150
there. That's my claim to fame. Finally,
and I want to leave some time for
658
00:52:14,150 --> 00:52:19,510
questions. There's a lot of useful stuff
in ROM. One of them is BBC Basic. Basic
659
00:52:19,510 --> 00:52:23,009
has an assembler so you can walk up to
this machine with a floppy disk and write
660
00:52:23,009 --> 00:52:29,529
assembler; it has a special bit of syntax
there and then you can just call it. And
661
00:52:29,529 --> 00:52:32,460
so this is really powerful. So at school
or something with the floppy disk, you can
662
00:52:32,460 --> 00:52:37,199
do something that's a bit more than basic
programming. Bizarrely, I mostly wrote that
663
00:52:37,199 --> 00:52:41,420
with only two or three tiny syntax errors
after about 20 years away from this. It's
664
00:52:41,420 --> 00:52:46,059
in there somewhere. Legacy wise, the
machine didn't sell very many: under a
665
00:52:46,059 --> 00:52:50,930
hundred thousand, easily. I don't think it
really made a massive impact. PCs had
666
00:52:50,930 --> 00:52:54,640
already taken off by then. The ARM
processor: I'm not going to go on about the
667
00:52:54,640 --> 00:52:58,920
company. It's clear that that
obviously has changed the world in many
668
00:52:58,920 --> 00:53:04,140
ways. The thing I really took away from
this exercise was that a handful of smart
669
00:53:04,140 --> 00:53:10,089
people - not that many, on the order of a dozen -
designed multiple chips, designed a custom
670
00:53:10,089 --> 00:53:14,599
computer from scratch, got it working. And
it was quite good. And I think that this
671
00:53:14,599 --> 00:53:17,380
really turned people's heads. It made
people think differently: that people
672
00:53:17,380 --> 00:53:21,160
that were not Motorola and IBM, really,
really big companies with enormous
673
00:53:21,160 --> 00:53:27,479
resources, could do this and could make it
work. I think actually that led to the
674
00:53:27,479 --> 00:53:30,809
thinking that people could design their
systems-on-chip in the 90s, and that
675
00:53:30,809 --> 00:53:35,309
market taking off. So I think this is
really key in getting people thinking that
676
00:53:35,309 --> 00:53:40,420
way. It was possible to design your own
silicon. And finally, I just want to thank
677
00:53:40,420 --> 00:53:45,279
the people I spoke to, and Adrian and
Jason at the Centre for Computing History in
678
00:53:45,279 --> 00:53:49,049
Cambridge. If you're in Cambridge, then
please visit there. It's a really cool
679
00:53:49,049 --> 00:53:56,270
museum. And with that, I'll wrap up. If
there's any time for questions, then I'm
680
00:53:56,270 --> 00:53:58,356
getting a blank look. No time for
questions?
681
00:53:58,356 --> 00:54:01,890
Herald: There's about 5 minutes left for
questions.
682
00:54:01,890 --> 00:54:07,880
Matt: Fantastic! Or come up to me afterwards.
I'm happy to chat more about this.
683
00:54:07,880 --> 00:54:18,940
applause
Herald: The first question is for the
684
00:54:18,940 --> 00:54:29,799
Internet. Signal angel, will you?
Well, grab your microphones and get the
685
00:54:29,799 --> 00:54:36,700
first from the audience in the room here. There,
that microphone, please ask a question.
686
00:54:36,700 --> 00:54:44,130
Mic1: You mentioned that the system is
making good use of the memory, but how is
687
00:54:44,130 --> 00:54:50,459
that actually not completely being
stalled on memory? Having no cache and
688
00:54:50,459 --> 00:54:55,450
same cycle time for the cache - for the
memory as for the CPU.
689
00:54:55,450 --> 00:55:01,000
M: Good question. So how is it not always
stalled on memory? I mean, well, it's
690
00:55:01,000 --> 00:55:04,390
sometimes stalled on memory when you do
something that's non-sequential. You have
691
00:55:04,390 --> 00:55:08,869
to take one of the slow cycles. This was
the N cycle. The key is you try and
692
00:55:08,869 --> 00:55:11,469
maximize the amount of time that you're
doing sequential stuff.
693
00:55:11,469 --> 00:55:16,220
So on the ARM 2 you wanted to unroll loops
as much as possible. So you're fetching
694
00:55:16,220 --> 00:55:19,799
your instructions sequentially, right? You
wanted to make as much use of load-store
695
00:55:19,799 --> 00:55:24,290
multiples as possible. You could load single registers
with an individual register load, but it
696
00:55:24,290 --> 00:55:28,710
was much more efficient to pay that cost
just once at the start of the instruction and
697
00:55:28,710 --> 00:55:33,619
then stream stuff sequentially. So you're
right that it is still stalled sometimes,
698
00:55:33,619 --> 00:55:37,141
but that was still a good
tradeoff, I think, for a system that
699
00:55:37,141 --> 00:55:40,549
didn't have a cache for other reasons.
Mic1: Thanks.
700
00:55:40,549 --> 00:55:45,140
Herald: Next question is for the Internet.
Signal Angel: Are there any Acorns on
701
00:55:45,140 --> 00:55:49,839
sale right now or if you want to get into
this kind of hardware where do you get it?
702
00:55:49,839 --> 00:55:52,810
Herald: Can you repeat the first sentence,
please? Sorry, the first part.
703
00:55:52,810 --> 00:55:56,259
S: If you want to get into this kind of
hardware right now, if you want to buy it
704
00:55:56,259 --> 00:55:58,839
right now.
M: Yeah, good question. How do you
705
00:55:58,839 --> 00:56:06,359
get hold of one? Drive prices up on eBay, I
guess I hate to say it. Might be fun to
706
00:56:06,359 --> 00:56:09,170
play around in emulators. Always
prefer, though, to hack around on the
707
00:56:09,170 --> 00:56:12,309
real thing. Emulators always feel a bit
strange. There are a bunch of really good
708
00:56:12,309 --> 00:56:19,180
emulators out there. Quite complete. Yeah,
I think I would just go on
709
00:56:19,180 --> 00:56:23,260
auction sites and try and find one.
Unfortunately, they're not completely
710
00:56:23,260 --> 00:56:27,829
rare. I mean that's the thing, they
did sell. Not quite sure of the exact figure,
711
00:56:27,829 --> 00:56:31,500
but you know, there were tens and tens of
thousands of these things made. So I would
712
00:56:31,500 --> 00:56:35,130
look also in Britain more than elsewhere.
Although I do understand that Germany had
713
00:56:35,130 --> 00:56:40,170
quite a few. If you can get a hold of one,
though, I do suggest doing so. I think
714
00:56:40,170 --> 00:56:46,259
they're really fun to play with.
Herald: OK, next question.
715
00:56:46,259 --> 00:56:51,860
M2: So I found myself looking at the
documentation for the LDM/STM instructions
716
00:56:51,860 --> 00:56:58,049
while debugging something on ARM just last
week. And I just wonder, what's your
717
00:56:58,049 --> 00:57:04,029
thought? Are there any quirks of the
Archimedes that have crept into the modern
718
00:57:04,029 --> 00:57:06,900
ARM design and instruction set that you
are aware of?
719
00:57:06,900 --> 00:57:13,449
M: Most of them got purged. So there is
the 26-bit addressing. There was a
720
00:57:13,449 --> 00:57:19,409
couple of strange uses of, there is an XOR
instruction into PC for changing flags. So
721
00:57:19,409 --> 00:57:25,160
there was a great purge when the ARM 6 was
designed, and the ARM 6 - I should know
722
00:57:25,160 --> 00:57:31,559
this - is ARM v3. That's got 32-bit addressing
and lost this. These weirdnesses
723
00:57:31,559 --> 00:57:35,690
got moved out.
I can't think of any, aside from just the
724
00:57:35,690 --> 00:57:40,619
resulting ARM 32 instruction set being
quite quirky and having a lot of good
725
00:57:40,619 --> 00:57:46,789
quirks. The shifted register as sort of a
free thing you can do. For example, you
726
00:57:46,789 --> 00:57:52,059
can add one register to a shifted register
in one cycle. I think that's a good quirk.
727
00:57:52,059 --> 00:57:55,119
So in terms of inheriting that
instruction set and not changing those
728
00:57:55,119 --> 00:58:05,959
things. Maybe that counts?
Herald: Any further questions? Internet,
729
00:58:05,959 --> 00:58:11,439
any new questions? No? Okay, so in that
case one round of applause for Matt Evans.
730
00:58:11,439 --> 00:58:13,579
M: Thank you.
731
00:58:13,579 --> 00:58:21,142
applause
732
00:58:21,142 --> 00:58:27,679
postroll music
733
00:58:27,679 --> 00:58:43,658
Subtitles created by c3subtitles.de
in the year 2021. Join, and help us!