WEBVTT
00:00:00.000 --> 00:00:10.770
36c3 preroll music
00:00:10.770 --> 00:00:24.929
Herald: Our next talk will be "The
ultimate Acorn Archimedes Talk", in which
00:00:24.929 --> 00:00:29.599
we will hear everything
about the Archimedes computer. There's a
00:00:29.599 --> 00:00:33.590
promise in advance that there will be no
eureka jokes in there. Give a warm
00:00:33.590 --> 00:00:40.790
welcome to Matt Evans.
00:00:40.790 --> 00:00:48.320
Matt (M): Thank you. Okay. Little bit of
retro computing first thing in the
00:00:48.320 --> 00:00:54.949
morning, sort of. Welcome. My name is Matt
Evans. The Acorn Archimedes was my
00:00:54.949 --> 00:00:59.379
favorite computer when I was a small
hacker and I'm privileged to be able to
00:00:59.379 --> 00:01:04.780
talk a little bit about it with you
today. Let's start with: What is an Acorn
00:01:04.780 --> 00:01:08.720
Archimedes? So I'd like an interactive
session, I'm afraid. Please indulge me,
00:01:08.720 --> 00:01:15.130
like a show of hands. Who's heard of the
Acorn Archimedes before? Ah, OK, maybe 50,
00:01:15.130 --> 00:01:23.090
60 percent. Who has used one? Maybe 10
percent, maybe. Okay. Who has programs -
00:01:23.090 --> 00:01:31.100
who has coded on an Archimedes? Maybe
half? Two to three people. Great. Okay.
00:01:31.100 --> 00:01:34.180
Three. laughs Okay, so a small
percentage. I don't see these machines as
00:01:34.180 --> 00:01:39.650
being as famous as - say the Macintosh or
IBM PC. And certainly outside of Europe,
00:01:39.650 --> 00:01:44.030
they were not that common. So this is kind
of interesting just how many people here
00:01:44.030 --> 00:01:49.840
have seen this. So it was the first ARM-
based computer. This is an astonishingly
00:01:49.840 --> 00:01:55.530
1980s - I think one of them is drawing,
actually. But they're not just the first
00:01:55.530 --> 00:02:01.439
ARM-based machine, but the machine that
the ARM was originally designed to drive.
00:02:01.439 --> 00:02:07.230
It's a... Is that a comment for me?
(Mike?)?
00:02:07.230 --> 00:02:14.300
I'm being heckled already. It's only slide
two. Let's see how this goes. So it's a
00:02:14.300 --> 00:02:18.849
two box computer. It looks a bit like a
Mega ST to me. It has a main unit with
00:02:18.849 --> 00:02:26.480
the processor and disks and expansion
cards and so on. Now this is an A3000.
00:02:26.480 --> 00:02:31.459
This is mine, in fact, and I didn't bother
to clean it before taking the photo. And
00:02:31.459 --> 00:02:34.459
now it's on this huge screen. That was a
really bad idea. You can see all the
00:02:34.459 --> 00:02:38.209
disgusting muck in the keyboard. It has a
bit of ink on it, I don't know why. But
00:02:38.209 --> 00:02:41.660
this machine is 30 years old. And
this was luckily my machine, as I said, as
00:02:41.660 --> 00:02:45.069
a small hacker. And this is why I'm doing
the talk today. This had a big influence
00:02:45.069 --> 00:02:52.540
on me. I'd like to say as a person, but
more as an engineer. In terms of my
00:02:52.540 --> 00:02:57.550
programming experience when I was learning
to program and so on. So I live and work
00:02:57.550 --> 00:03:02.040
in Cambridge in the U.K., where this
machine was designed. And through the
00:03:02.040 --> 00:03:05.470
funny sort of turn of events, I ended up
there and actually work in the building
00:03:05.470 --> 00:03:09.310
next to the building where this was
designed. And a bunch of the people that
00:03:09.310 --> 00:03:13.720
were on that original team that designed
this system are still around and
00:03:13.720 --> 00:03:18.280
relatively contactable. And I thought this
is a good opportunity to get on the phone
00:03:18.280 --> 00:03:21.760
and call them up or go for a beer with a
couple of them and ask them: Why are
00:03:21.760 --> 00:03:25.670
things the way they are? There's all sorts
of weird quirks to this machine. I was
00:03:25.670 --> 00:03:28.901
always wondering this, for 20 years. Can
you please tell me - why did you do it
00:03:28.901 --> 00:03:33.330
this way? And they were a really good bunch
of people. So I talked to Steve Furber,
00:03:33.330 --> 00:03:37.790
who led the hardware design, Sophie
Wilson, who was the same with software.
00:03:37.790 --> 00:03:43.530
Tudor Brown, who did the video system.
Mike Miller, the IO system. John Biggs and
00:03:43.530 --> 00:03:46.489
Jamie Urquhart, who did the silicon
design. Oh, silicon: I spoiled one of the
00:03:46.489 --> 00:03:49.120
surprises here. There's been some silicon
design that's gone on in building this
00:03:49.120 --> 00:03:55.060
Acorn. And they were all wonderful people
that gave me their time and told me a
00:03:55.060 --> 00:03:59.550
bunch of anecdotes that I will pass on to
you. So I'm going to talk about the
00:03:59.550 --> 00:04:04.520
classic Arc. There's a bunch of different
machines that Acorn built into the 1990s.
00:04:04.520 --> 00:04:08.960
But the ones I'm talking about started in
1987. There were 2 models, effectively a
00:04:08.960 --> 00:04:16.170
low end and a high end. One had an option
for a hard disk, 20 megabytes, 2300
00:04:16.170 --> 00:04:20.700
pounds, up to 4MB of RAM. They all share
the same basic architecture, they're all
00:04:20.700 --> 00:04:27.540
basically the same. So the A3000 that I
just showed you came out in 1989. That was
00:04:27.540 --> 00:04:30.540
the machine I had. Those again, the same.
It had the memory controller slightly
00:04:30.540 --> 00:04:35.970
updated, was slightly faster. They all had
an ARM 2. This was the released version of
00:04:35.970 --> 00:04:41.550
the ARM processor designed for this
machine, at 8 MHz. And then finally in
00:04:41.550 --> 00:04:47.000
1990, what I call the last of the classic
Arc, Archimedes, is the A540. This was the
00:04:47.000 --> 00:04:51.670
top end machine - could have up to 16
megabytes of memory, which is a fair bit,
00:04:51.670 --> 00:04:57.600
even in 1990. It had a 30 MHz ARM 3. The
ARM 3 was the evolution of the ARM 2, but
00:04:57.600 --> 00:05:02.130
with the cache and a lot faster. So this
talk will be centered around how these
00:05:02.130 --> 00:05:08.820
machines work, not the more modern
machines. So around 1987, what else was
00:05:08.820 --> 00:05:13.760
available? This is a random selection of
machines. Apologies if your favorite
00:05:13.760 --> 00:05:19.280
machine is not on this list. It wouldn't
fit on the slide otherwise. So at the
00:05:19.280 --> 00:05:22.110
start of the 80s, we had the exotic things
like the Apple Lisa and the Apple Mac.
00:05:22.110 --> 00:05:28.720
Very expensive machines. The Amiga - I had
to put in here. Sort of relatively
00:05:28.720 --> 00:05:32.530
expensive, of course. The Amiga 500 was, you
know, very good value for money, very
00:05:32.530 --> 00:05:37.160
capable machine. But I'm comparing this
more to PCs and Macs, because that was the
00:05:37.160 --> 00:05:41.950
sort of, you know, market it was going
for. And although it was an expensive
00:05:41.950 --> 00:05:46.790
machine, compared to the Macintosh it was
pretty cheap. NeXT Cube on there, I
00:05:46.790 --> 00:05:50.260
figured that... I'd heard that they were
incredibly expensive. And actually
00:05:50.260 --> 00:05:53.330
compared to the Macintosh, they're not
expensive at all. Oh well, I (don't?) know
00:05:53.330 --> 00:05:57.930
which one I would have preferred. So the
first question I asked them - the first
00:05:57.930 --> 00:06:04.210
thing they told me: Why was it built? I've
used them in school and as I said, had one
00:06:04.210 --> 00:06:08.560
at home. But I was never really quite sure
what it was for. And I think a lot of the
00:06:08.560 --> 00:06:11.850
Acorn marketing wasn't quite sure what it
was for either. They told me it was the
00:06:11.850 --> 00:06:15.940
successor to the BBC Micro, this 8 bit
machine. Lovely 6502 machine, incredibly
00:06:15.940 --> 00:06:20.100
popular, especially in the UK. And the
goal was to make a machine that was 10
00:06:20.100 --> 00:06:23.770
times the performance of this. The
successor would be 10 times faster at the
00:06:23.770 --> 00:06:29.680
same price. And the thing I didn't know is
they had been inspired. The team at Acorn had
00:06:29.680 --> 00:06:35.620
seen the Apple Lisa and the Xerox Star,
which comes from the famous Xerox Alto,
00:06:35.620 --> 00:06:41.700
Xerox PARC, first GUI workstation in the
70s, monumental machine. They'd been
00:06:41.700 --> 00:06:45.290
inspired by these machines and they wanted
to make something very similar. So this is
00:06:45.290 --> 00:06:49.480
the same story as the Macintosh. They
wanted to make something that was a desktop
00:06:49.480 --> 00:06:52.310
machine for business, for office
automation and desktop publishing and that
00:06:52.310 --> 00:06:56.270
kind of thing. But I never really
understood this before. So this
00:06:56.270 --> 00:07:01.650
inspiration came from the Xerox machines.
It was supposed to be obviously a lot more
00:07:01.650 --> 00:07:06.680
affordable and a lot faster. So this is
what happens when Acorn marketing gets
00:07:06.680 --> 00:07:12.290
hold of this vision. So Xerox Star on the
left is this nice, sensible business
00:07:12.290 --> 00:07:15.380
machine. Someone's wearing a nice, crisp
suit (bumps microphone) - banging their
00:07:15.380 --> 00:07:20.940
microphone - and it gets turned into the
very Cambridge Tweed version on the right.
00:07:20.940 --> 00:07:24.410
It's apparently illegal to program one of
these if you're not wearing a top hat. But
00:07:24.410 --> 00:07:29.630
no one told me that when I was a kid. And
my court case comes up next week. So
00:07:29.630 --> 00:07:32.240
Cambridge is a bit of a funny place. And
for those that have been there, this picture on
00:07:32.240 --> 00:07:38.680
the right sums it all up. So they began
Project A, which was to build this new
00:07:38.680 --> 00:07:43.240
machine. And they looked at the
alternatives. They looked at the
00:07:43.240 --> 00:07:49.560
processors that were available at that
time, the 286, the 68K, the National
00:07:49.560 --> 00:07:55.720
Semiconductor 32016, which was an early 32 bit
machine, a bit of a weird processor. And
00:07:55.720 --> 00:07:58.030
they all had something in common:
they were ridiculously expensive and, in
00:07:58.030 --> 00:08:03.410
Tudor's words, a bit crap. They weren't a
lot faster than the BBC Micro. They're a
00:08:03.410 --> 00:08:06.620
lot more expensive. They're much more
complicated in terms of the processor
00:08:06.620 --> 00:08:10.490
itself. But also the system around them
was very complicated. They need lots of
00:08:10.490 --> 00:08:15.400
weird support chips. This just drove the
price up of the system and it wasn't going
00:08:15.400 --> 00:08:21.390
to hit that 10 times performance, let
alone at the same price point. They'd
00:08:21.390 --> 00:08:24.690
visited a couple of other companies
designing their own custom silicon. They
00:08:24.690 --> 00:08:28.090
got this idea in about 1983. They were
looking at some of the RISC papers coming
00:08:28.090 --> 00:08:31.180
out of Berkeley and they were quite
impressed by what a bunch of grad students
00:08:31.180 --> 00:08:38.070
were doing. They managed to get a working
RISC processor and they went to Western
00:08:38.070 --> 00:08:42.570
Design Center and looked at 6502
successors being designed there. They had a
00:08:42.570 --> 00:08:45.210
positive experience. They saw a bunch of
high school kids with Apple IIs doing
00:08:45.210 --> 00:08:48.930
silicon layout. And they thought "OK,
well". They'd never designed a CPU before
00:08:48.930 --> 00:08:53.310
at Acorn. Acorn hadn't done any custom
silicon to this degree, but they were
00:08:53.310 --> 00:08:57.160
buoyed by this and they thought, okay,
well, maybe RISC is the secret and we can
00:08:57.160 --> 00:09:02.250
do this. And this was not really the done
thing in this timeframe and not for a
00:09:02.250 --> 00:09:06.450
company the size of Acorn, but they
designed their computer from scratch. They
00:09:06.450 --> 00:09:09.320
designed all of the major pieces of
silicon in this machine. And it wasn't
00:09:09.320 --> 00:09:12.830
about designing the ARM chip. Hey, we've
got a processor core. What should we do
00:09:12.830 --> 00:09:16.000
with it? But it was about designing the
machine that ARM and the history of that
00:09:16.000 --> 00:09:20.310
company has kind of benefited from. But
this is all about designing the machine as
00:09:20.310 --> 00:09:26.710
a whole. They're a tiny team. They're a
handful of people - about a dozen or so -
00:09:26.710 --> 00:09:30.990
that did the hardware design, a similar
sort of order for software and operating
00:09:30.990 --> 00:09:36.210
systems on top, which is orders of
magnitude different from IBM and Motorola
00:09:36.210 --> 00:09:41.880
and so forth that were designing computers
at this time. RISC was the key. They
00:09:41.880 --> 00:09:43.893
needed to be incredibly simple. One of the
other experiences they had was they went
00:09:43.893 --> 00:09:48.820
to a CISC processor design center. They
had a team of a couple of hundred people
00:09:48.820 --> 00:09:52.650
and they were on revision H and it still
had bugs and it was just this unwieldy,
00:09:52.650 --> 00:09:58.160
complex machine. So RISC was the secret.
Steve Furber has an interview somewhere.
00:09:58.160 --> 00:10:03.470
He jokes about Acorn management giving him
two things. Special sauce was two things
00:10:03.470 --> 00:10:07.810
that no one else had: he had no people and
no money. So it had to be incredibly
00:10:07.810 --> 00:10:14.890
simple. It had to be built on a
shoestring, as Jamie said to me. So there
00:10:14.890 --> 00:10:19.760
are lots of corners cut, but in the right
way. I would say "corners cut", that
00:10:19.760 --> 00:10:23.220
sounds ungenerous. There's some very
shrewd design decisions, always weighing
00:10:23.220 --> 00:10:30.210
up cost versus benefit. And I think they
erred on the correct side for all of them.
00:10:30.210 --> 00:10:34.960
So Steve sent me this picture. He's
got a cameo here. That's the outline of
00:10:34.960 --> 00:10:39.750
him in the reflection on the glass there.
He's got this stuff in his office. So he
00:10:39.750 --> 00:10:43.630
led the hardware design of all of these
chips at Acorn. Across the top, we've got
00:10:43.630 --> 00:10:50.080
the original ARM, the ARM 1, ARM 2 and the
ARM 3 - guess the naming scheme - and the
00:10:50.080 --> 00:10:53.090
video controller, memory controller and IO
controller. You can sort of see their
00:10:53.090 --> 00:10:57.320
relative sizes and it's kind of pretty.
This was also a processor where you
00:10:57.320 --> 00:11:00.930
could really point at that and say, "oh,
that's the register five and you can see
00:11:00.930 --> 00:11:06.410
the cache over there". You can't really do
that nowadays with modern processors. So
00:11:06.410 --> 00:11:11.670
a bit about the specification, what it
could do, the end product. So I mentioned
00:11:11.670 --> 00:11:16.850
they all had this ARM 2 at 8MHz, up to
4MB of RAM, 26-bit addresses, remember
00:11:16.850 --> 00:11:21.670
that. That's weird. So a lot of 32-bit
machines had 32-bit addresses, or the ones
00:11:21.670 --> 00:11:25.550
that we know today do. That wasn't the
case here. And I'll explain why in a
00:11:25.550 --> 00:11:32.610
minute. The A540 had an updated CPU. The
memory controller had an MMU, which was
00:11:32.610 --> 00:11:39.350
unusual for machines of the mid 80s. So it
could support, the hardware would support
00:11:39.350 --> 00:11:45.620
virtual memory, page faults and so on. It
had decent sound, it had 8-channel sound,
00:11:45.620 --> 00:11:49.460
hardware mixed and stereo. It was 8 bit,
but it was logarithmic - so it was a bit
00:11:49.460 --> 00:11:53.240
like u-law, if anyone knows that - instead
of PCM, so you got more precision at the
00:11:53.240 --> 00:11:58.620
low end and it sounded to me a little bit
like 12 bit PCM sound. So this is quite
00:11:58.620 --> 00:12:04.840
good. Storage wise, it's the same floppy
controller as the Atari ST. It's fairly
00:12:04.840 --> 00:12:09.690
boring. Hard disk controller was a
horrible standard called ST506, MFM
00:12:09.690 --> 00:12:17.440
drives, which were very, very crude
compared to disks we have today. Keyboard
00:12:17.440 --> 00:12:20.440
and mouse, nothing to write home about. I
mean, it was a normal keyboard. It was
00:12:20.440 --> 00:12:23.430
nothing special going on there. And
printer port, serial port and some
00:12:23.430 --> 00:12:29.380
expansion slots, all of which I'll
outline later on. The thing I really liked
00:12:29.380 --> 00:12:32.650
about the Arc was the graphics
capabilities. It's fairly capable,
00:12:32.650 --> 00:12:37.800
especially for a machine of that era and
of the price. It just had a flat frame
00:12:37.800 --> 00:12:42.170
buffer so it didn't have sprites, which is
unfortunate. It didn't have a blitter and
00:12:42.170 --> 00:12:48.680
bitplanes and so forth. But the upshot
of that is it's dead simple to program. It had
00:12:48.680 --> 00:12:52.320
a 256 color mode, 8 bits per pixel, so
it's a byte, and it's all just laid out as
00:12:52.320 --> 00:12:55.890
a linear string of bytes. So it was dead
easy to just write some really nice
00:12:55.890 --> 00:12:59.910
optimized code to just blit stuff to the
screen. Part of the reason why there isn't
00:12:59.910 --> 00:13:05.090
a blitter is actually the CPU was so good
at doing this. Colorwise, it's got
00:13:05.090 --> 00:13:10.620
paletted modes out of a 4096 color
palette, same as the Amiga. It has this
00:13:10.620 --> 00:13:16.350
256 color mode, which is different. The
big high end machines, the top end
00:13:16.350 --> 00:13:21.290
machines, the A540 and the A400 series
could also do this very high res 1152 by
00:13:21.290 --> 00:13:24.060
900, which was more of a workstation
resolution. If you bought a Sun
00:13:24.060 --> 00:13:28.560
workstation, a Sun 3, in those days, it could
do this and some higher resolutions. But
00:13:28.560 --> 00:13:32.890
this is really not seen on computers that
might have been in the office or school, at the
00:13:32.890 --> 00:13:36.370
education end of the market. And
it's quite clever the way they did that.
00:13:36.370 --> 00:13:40.450
I'll come back to that in a sec. But for
me, the thing about the Arc: for the
00:13:40.450 --> 00:13:45.920
money, it was the fastest machine around.
It was definitely faster than 386s and all
00:13:45.920 --> 00:13:49.460
the stuff that Motorola was doing at the
time by quite a long way. It is almost
00:13:49.460 --> 00:13:55.250
eight times faster than a 68k at about the
same clock speed. And it's to do with its
00:13:55.250 --> 00:13:57.020
pipelining and to do with it having a 32
bit word and a couple of other tricks.
00:13:57.020 --> 00:14:01.790
Again, I'll tell you later on what the
secret to that performance was. About
00:14:01.790 --> 00:14:04.850
minicomputer speed and compared to some of
the other RISC machines at the time, it
00:14:04.850 --> 00:14:09.450
wasn't the first RISC in the world, it was
the first cheap RISC and the first RISC
00:14:09.450 --> 00:14:14.020
machine that people could feasibly buy and
have on their desks at work or in
00:14:14.020 --> 00:14:19.222
education. And if you compare it to
something like the MIPS or the SPARC, it
00:14:19.222 --> 00:14:25.300
was not as fast as a MIPS or SPARC chip.
It was also a lot smaller, a lot cheaper.
00:14:25.300 --> 00:14:29.240
Both of those other processors had very
big die. They needed other support chips.
00:14:29.240 --> 00:14:33.350
They had huge packages, lots of pins, lots
of cooling requirements. So all this
00:14:33.350 --> 00:14:36.180
really added up. So I looked up the price
of the Sun 4 workstation at the time and
00:14:36.180 --> 00:14:40.050
it was well over four times the price of
one of these machines. And that was before
00:14:40.050 --> 00:14:44.400
you add on extras such as disks and
network interfaces and things like that.
00:14:44.400 --> 00:14:47.480
So it's very good, very competitive for
the money. And if you think about building
00:14:47.480 --> 00:14:51.070
a cluster, then you could get a lot more
throughput, you could network them
00:14:51.070 --> 00:14:56.980
together. So this is about as far as I got
when I was a youngster, I wasn't brave
00:14:56.980 --> 00:15:03.230
enough to really take the machine apart
and poke around. Fortunately, now it's 30
00:15:03.230 --> 00:15:07.180
years old and I'm fine. I'm qualified and
doing this. I'm going to take it apart.
00:15:07.180 --> 00:15:12.089
Here's the motherboard. Quite a nice clean
design. This is built in Wales for anyone
00:15:12.089 --> 00:15:18.190
that's been to the UK. It's very unusual these
days for anything to be built in the UK. It's
00:15:18.190 --> 00:15:23.420
got several main sections around these
four chips. Remember the Steve photo
00:15:23.420 --> 00:15:29.470
earlier on? This is the chip set: the ARM,
MEMC, VIDC, IOC. So the IOC side of things
00:15:29.470 --> 00:15:34.510
happens over on the left, video and sound
in the top right. And the memory and the
00:15:34.510 --> 00:15:38.399
processor in the middle. It's got a
megabyte onboard and you can plug in an
00:15:38.399 --> 00:15:44.210
expansion for four megabytes. So memory
maps and software view. I mentioned this
00:15:44.210 --> 00:15:46.930
26-bit addressing and I think this is one
of the key characteristics of one of these
00:15:46.930 --> 00:15:52.690
machines. So you have a 64MB address
space, it's quite packed. That's quite a
00:15:52.690 --> 00:15:56.980
lot of stuff shoehorned into here. So
there's the memory. The bottom half of the
00:15:56.980 --> 00:16:02.040
address space, 32MB of that is the
processor. It's got user space and
00:16:02.040 --> 00:16:08.100
privilege mode. It's got a concept of
privilege within the processor execution.
00:16:08.100 --> 00:16:11.851
So when you're in user mode, you only get
to see the bottom half and that's the
00:16:11.851 --> 00:16:16.250
virtual maps. There's the MMU, that will
map pages into that space and then when
00:16:16.250 --> 00:16:18.980
you're in supervisor mode, you get to see
the whole of the rest of the memory,
00:16:18.980 --> 00:16:23.610
including the physical memory and various
registers up the top. The thing to notice
00:16:23.610 --> 00:16:27.460
here is: there's stuff hidden behind the
ROM, this address space is very packed
00:16:27.460 --> 00:16:31.390
together. So there's a requirement
for control registers, for the memory
00:16:31.390 --> 00:16:34.770
controller, for the video controller and
so on, and they're write-only registers in
00:16:34.770 --> 00:16:39.700
ROM basically. So you write to the ROM and
you get to hit these registers. Kind of
00:16:39.700 --> 00:16:43.730
weird when you first see it, but it was
quite a clever way to fit this stuff into
00:16:43.730 --> 00:16:50.810
the address space. So we'll start with
the ARM 1. So Sophie Wilson designed the
00:16:50.810 --> 00:16:59.150
instruction set in late 1983. Steve took the
instruction set and designed the top
00:16:59.150 --> 00:17:03.100
level, the block, the micro architecture
of this processor. So this is the data
00:17:03.100 --> 00:17:08.140
path and how the control logic works. And
then the VLSI team implemented this
00:17:08.140 --> 00:17:12.420
to their own custom cells. There's a
custom data path and custom logic
00:17:12.420 --> 00:17:18.179
throughout this. It took them about a
year, all in. Well, 1984, that sort of...
00:17:18.179 --> 00:17:22.760
This project A really kicked off early
1984. And this taped out first thing
00:17:22.760 --> 00:17:34.690
early 1985. The design process the guys
gave me a little bit of... So Jamie
00:17:34.690 --> 00:17:40.800
Urquhart and John Biggs gave me a bit of
an insight into how they worked on the
00:17:40.800 --> 00:17:46.870
VLSI side of things. So they had an Apollo
workstation, just one Apollo workstation,
00:17:46.870 --> 00:17:51.990
the DN600. This is a 68K based washing
machine, as Jamie described it. It's this
00:17:51.990 --> 00:17:56.970
huge thing. It cost about 50000 pounds.
It's incredibly expensive. And they
00:17:56.970 --> 00:18:00.580
designed all of this with just one of
these workstations. Jamie got in at 5:00
00:18:00.580 --> 00:18:04.710
a.m., worked until the afternoon and then
let someone else on the machine. So they
00:18:04.710 --> 00:18:06.760
shared the workstation and they worked
shifts so that they could design this
00:18:06.760 --> 00:18:10.390
whole thing on one workstation. So this
comes back to that. It was designed on a
00:18:10.390 --> 00:18:13.660
bit of a shoestring budget. When they got
a couple of other workstations later on in
00:18:13.660 --> 00:18:17.760
the projects, there was an allegation that
the software might not have been licensed
00:18:17.760 --> 00:18:21.950
initially on the other workstations and
the CAD software might have been. I can
00:18:21.950 --> 00:18:28.450
neither confirm nor deny whether that's
true. So Steve wrote a BBC BASIC
00:18:28.450 --> 00:18:33.300
simulator for this when he was designing
this block level micro architecture, running on
00:18:33.300 --> 00:18:38.750
his BBC Micro. So this could then run real
software. There could be a certain amount
00:18:38.750 --> 00:18:42.890
of software development, but then they
could also validate that the design was
00:18:42.890 --> 00:18:47.480
correct. There's no cache on this. This is
quite a large chip. 50 square
00:18:47.480 --> 00:18:52.820
millimeters was the economic limit of
those days for this part of the market.
00:18:52.820 --> 00:18:56.420
There's no cache. That also would have
been far too complicated. So this was
00:18:56.420 --> 00:19:03.120
also, I think, quite a big risk, no pun
intended. The aim of doing this
00:19:03.120 --> 00:19:07.620
with such a small team that they're all
very clever people. But they haven't all
00:19:07.620 --> 00:19:11.490
got experience in building chips before.
And I think they knew what they were up
00:19:11.490 --> 00:19:15.100
against. And so not having a cache or
complicated things like that was the right
00:19:15.100 --> 00:19:21.740
choice to make. I'll show you later that
that didn't actually affect things. So
00:19:21.740 --> 00:19:25.030
this was a RISC machine. If anyone has not
programmed ARM in this room, then get out at
00:19:25.030 --> 00:19:29.680
once. But if you have programmed ARM, this is
quite familiar, with some
00:19:29.680 --> 00:19:36.210
differences. It's a classical three
operand RISC. It's got a free shift on one of
00:19:36.210 --> 00:19:38.790
the operands for most of the instructions.
So you can do things like static
00:19:38.790 --> 00:19:43.820
multiplies quite easily. It's not purist
RISC though. It does have load and store
00:19:43.820 --> 00:19:47.980
multiple instructions. So these will, as
the name implies, load or store multiple
00:19:47.980 --> 00:19:51.460
number of registers in one go. So one
register per cycle, but it's all done
00:19:51.460 --> 00:19:54.970
through one instruction. This is not RISC.
Again, there's a good reason for doing
00:19:54.970 --> 00:19:59.300
that. So ARM 1 comes back and it gets
plugged into a board that looks a bit like
00:19:59.300 --> 00:20:07.400
this. This is called the ATP, the second
processor. It plugs into a BBC Micro. It's
00:20:07.400 --> 00:20:11.280
basically there's a thing called the Tube,
which is sort of a FIFO like arrangement.
00:20:11.280 --> 00:20:15.780
The BBC Micro can send messages one way
and this can send messages back. And the
00:20:15.780 --> 00:20:20.250
BBC Micro has the discs, it has the IO
keyboard and so on. And that's used as the
00:20:20.250 --> 00:20:23.960
host to then download code into one
megabyte of RAM up here and then you
00:20:23.960 --> 00:20:30.030
can run the code on the ARM. So this was
the initial system, six megahertz. The
00:20:30.030 --> 00:20:32.350
thing I found quite interesting about
this, I mentioned that Steve had built
00:20:32.350 --> 00:20:37.200
this BBC BASIC simulation, one of the
early bits of software that could run on
00:20:37.200 --> 00:20:41.870
this. So he'd ported BBC BASIC to ARM and
written an ARM version of it. The BASIC
00:20:41.870 --> 00:20:47.780
interpreter was very fast, very lean, and
it was running on this board early on.
00:20:47.780 --> 00:20:51.750
They then built a simulator called ACM,
which was an event based simulator for
00:20:51.750 --> 00:20:55.240
doing logic design and all of the other
chips in the chipset
00:20:55.240 --> 00:20:59.020
were simulated using ACM on ARM 1, which is
quite nice. So this was the fastest
00:20:59.020 --> 00:21:02.480
machine that they had around. They didn't
have, you know, the thousands of machines
00:21:02.480 --> 00:21:08.330
in the cluster like you'd have in a
modern company doing EDA. They had
00:21:08.330 --> 00:21:11.370
a very small number of machines and these
were the fastest ones they had about. So
00:21:11.370 --> 00:21:17.910
ARM 1 simulated ARM 2 and all the other
chipset. So then ARM 2 comes on. So
00:21:17.910 --> 00:21:21.590
there's a year later, this is a shrink of
the design. It's based on the same basic
00:21:21.590 --> 00:21:26.000
micro architecture but has a multiplier
now. It's a Booth multiplier, so it is at
00:21:26.000 --> 00:21:32.090
worst case a 16 cycle multiply, just two
bits per clock. Again, no cache. But one
00:21:32.090 --> 00:21:36.950
thing they did add in ARM 2 is banked
registers. Some of the processor modes I
00:21:36.950 --> 00:21:42.130
mentioned there's an interrupt mode. Next
slide, some of the processor modes will
00:21:42.130 --> 00:21:48.950
basically give you a different view on
registers, which is very useful. These
00:21:48.950 --> 00:21:51.090
were all validated at eight megahertz. So
the product was designed for eight
00:21:51.090 --> 00:21:54.020
megahertz. The company that built them
said, okay, put the stamp on the outside
00:21:54.020 --> 00:21:57.681
saying eight megahertz. There's two
versions of this chip and I think they're
00:21:57.681 --> 00:22:01.390
actually the same silicon. I've got a
suspicion that they're the same. They just
00:22:01.390 --> 00:22:05.420
tested this batch saying that works at 10
or 12. So on my project list is
00:22:05.420 --> 00:22:12.020
overclocking my A3000 to see how fast
it'll go and see if I can get it to 12
00:22:12.020 --> 00:22:18.559
megahertz. Okay. So the banking of the
registers. ARMs, even modern 32
00:22:18.559 --> 00:22:25.280
bit ARMs, have got two types of interrupt: IRQ,
pronounced "irk" in English, and FIQ,
00:22:25.280 --> 00:22:28.559
pronounced "fick" in English. I appreciate it
doesn't mean quite the same thing in
00:22:28.559 --> 00:22:34.290
German. So I'll call it FIQ from here on in,
and FIQ mode has this property where
00:22:34.290 --> 00:22:38.260
the top half of the registers are effectively
different registers. When you get into
00:22:38.260 --> 00:22:42.670
this mode. So first of all,
you don't have to back up those registers.
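
NOTE
A minimal sketch of the register banking described here, assuming the
documented ARM 2 arrangement in which FIQ mode has private copies of R8 to R14
and IRQ and supervisor modes have private copies of R13 and R14. The
RegisterFile class and its method names are hypothetical, for illustration only.
```python
# Which register numbers each mode sees as its own private copies (assumed layout).
BANKED = {
    "USR": (),
    "FIQ": tuple(range(8, 15)),  # R8..R14
    "IRQ": (13, 14),
    "SVC": (13, 14),
}
class RegisterFile:
    """Toy model: 16 shared registers plus per-mode banked copies."""
    def __init__(self):
        self.shared = [0] * 16
        self.banks = {mode: {r: 0 for r in regs} for mode, regs in BANKED.items()}
        self.mode = "USR"
    def read(self, r):
        bank = self.banks[self.mode]
        return bank[r] if r in bank else self.shared[r]
    def write(self, r, value):
        bank = self.banks[self.mode]
        if r in bank:
            bank[r] = value & 0xFFFFFFFF   # lands in the mode's private copy
        else:
            self.shared[r] = value & 0xFFFFFFFF
```
Switching the mode to FIQ and writing R8 leaves the user mode R8 untouched,
which is why a short FIQ handler that stays within R8 to R14 has nothing to save.
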
00:22:42.670 --> 00:22:47.950
I mean, if you're in an FIQ handler. And
secondly if you can write an FIQ handler
00:22:47.950 --> 00:22:51.970
using just those registers and there's
enough for doing most basic tasks, you
00:22:51.970 --> 00:22:55.940
don't have to save and restore anything
when you get an interrupt. So this is
00:22:55.940 --> 00:23:02.510
designed specifically to be a very, very low
overhead interrupt mode. So I'm coming to
00:23:02.510 --> 00:23:08.580
why there's a 26 bit address space. And so I
found this link very, very unintuitive. So
00:23:08.580 --> 00:23:13.520
unlike the more modern 32 bit ARMs from the
1990s onwards, the program counter,
00:23:13.520 --> 00:23:17.020
register 15, doesn't just contain the
program counter, but also contains the
00:23:17.020 --> 00:23:20.420
status flags and processor mode, and
effectively all of the machine's state is
00:23:20.420 --> 00:23:24.200
packed in there as well. So I asked the
question, well why, why 64 megabytes of
00:23:24.200 --> 00:23:27.700
address space? What's special about 64?
And Mike told me, well, you're asking the
00:23:27.700 --> 00:23:31.980
wrong question. It's the other way round.
What we wanted was this property that all
00:23:31.980 --> 00:23:35.990
of the machine state is in one register.
So this means you just have to save one
00:23:35.990 --> 00:23:40.360
register. Well, you know, what's the harm
in saving two registers? And he reminded
00:23:40.360 --> 00:23:43.490
me of this FIQ mode. Well, if you're
already in a state where you've really
00:23:43.490 --> 00:23:47.890
optimized your interrupt handler so that
you don't need any other registers to deal
00:23:47.890 --> 00:23:51.390
with, and you're not saving or restoring anything
apart from your PC, then saving another
00:23:51.390 --> 00:23:56.000
register is 50 percent overhead on that
operation. So the prime motivator
00:23:56.000 --> 00:24:00.500
was to keep all of the state in one word.
And then once you take all of the flags
00:24:00.500 --> 00:24:04.600
away, you're left with 24 bits for a
word-aligned program counter, which leads to
00:24:04.600 --> 00:24:09.799
26-bit addressing. And that was then seen as,
well, 64 megs is enough. There were
00:24:09.799 --> 00:24:14.690
machines in 1985 that, you know, could
conceivably have more memory than that.
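
NOTE
Editor's sketch, not part of the talk: the single-register packing described above can be modelled in a few lines of Python. The bit positions follow the published 26-bit ARM layout of R15; the helper names pack_r15 and unpack_r15 are made up for illustration.
```python
# Field layout of R15 on 26-bit ARMs such as the ARM2.
NZCV_SHIFT = 28            # N, Z, C, V condition flags live in bits 31..28
IRQ_DISABLE = 1 << 27      # I bit
FIQ_DISABLE = 1 << 26      # F bit
PC_MASK = 0x03FFFFFC       # bits 25..2: the 24-bit word-aligned program counter
MODE_MASK = 0x3            # bits 1..0: processor mode
MODES = ("USR", "FIQ", "IRQ", "SVC")
ADDRESS_SPACE_BYTES = 1 << 26   # 24 PC bits plus 2 implied zero bits
#
def pack_r15(pc, nzcv=0, irq_off=False, fiq_off=False, mode="USR"):
    """Combine PC, flags and mode into the one saved word (hypothetical helper)."""
    if pc & ~PC_MASK:
        raise ValueError("PC must be word-aligned and below 64 MB")
    word = ((nzcv & 0xF) << NZCV_SHIFT) | pc | MODES.index(mode)
    if irq_off:
        word |= IRQ_DISABLE
    if fiq_off:
        word |= FIQ_DISABLE
    return word
#
def unpack_r15(word):
    """Split a saved R15 word back into its fields (hypothetical helper)."""
    return {
        "pc": word & PC_MASK,
        "nzcv": (word >> NZCV_SHIFT) & 0xF,
        "irq_off": bool(word & IRQ_DISABLE),
        "fiq_off": bool(word & FIQ_DISABLE),
        "mode": MODES[word & MODE_MASK],
    }
```
With 24 program counter bits and two implied zero bits, ADDRESS_SPACE_BYTES comes out at 64 MB, which is the limit discussed above. Saving this one word is all an FIQ handler needs in order to return.
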
00:24:14.690 --> 00:24:19.290
But for a desktop that was still seen as a
very large, very expensive amount of
00:24:19.290 --> 00:24:24.450
memory. The other thing is, you don't need to
invent another instruction to do a
00:24:24.450 --> 00:24:28.170
return from exception, so you can return
using one of your existing instructions.
00:24:28.170 --> 00:24:32.740
In this case, it's this subtract into PC,
which looks a bit strange, but trust me,
00:24:32.740 --> 00:24:39.030
that does the right thing. MEMC is the memory
controller. I mentioned the
00:24:39.030 --> 00:24:43.040
address translation, so this has an MMU in
it. In fact, it's the thing directly on the
00:24:43.040 --> 00:24:46.080
left-hand side. I was
worried that these slides actually might
00:24:46.080 --> 00:24:49.520
not be the right resolution and they might
be sort of too small for people to see
00:24:49.520 --> 00:24:53.750
this. And in fact, a screen the size of a
house is really useful here. So the left
00:24:53.750 --> 00:24:59.110
hand side of this chip is the MMU. This
chip's the same size as the ARM2, yeah,
00:24:59.110 --> 00:25:02.380
pretty much. So that's part of the reason
why the MMU is on another chip: ARM2 was
00:25:02.380 --> 00:25:06.610
as big as they could make it to fit the
price. I don't know if anyone here has done
00:25:06.610 --> 00:25:10.810
silicon design, but as the area goes
up, effectively your yield goes down, and
00:25:10.810 --> 00:25:14.690
the price... it's a non-linear effect on
price. So the MMU had to be on a separate
00:25:14.690 --> 00:25:19.910
chip, and it's half the size of that as
well. MEMC also does more mundane things
00:25:19.910 --> 00:25:23.920
like it drives DRAM, it does refresh for
DRAM and it converts from linear addresses
00:25:23.920 --> 00:25:33.799
into row and column addresses which DRAM
takes. So the key thing about this
00:25:33.799 --> 00:25:39.090
ARM and MEMC binding, the key factor for
performance, is making use of memory
00:25:39.090 --> 00:25:43.740
bandwidth. When the team had looked at all
the other processors in Project A before
00:25:43.740 --> 00:25:49.380
designing their own, one of the things
they looked at was how well they utilized
00:25:49.380 --> 00:25:56.320
DRAM, and the 68K and the National Semi chips made very,
very poor use of DRAM bandwidth.
00:25:56.320 --> 00:25:59.940
Steve said, well, okay. The DRAM is the
most expensive component of any of these
00:25:59.940 --> 00:26:04.280
machines and they're making poor use of
it. And I think a key insight here is if
00:26:04.280 --> 00:26:07.740
you maximize that use of the DRAM, then
you're going to be able to get much higher
00:26:07.740 --> 00:26:13.490
performance in those machines. And so it's
32 bits wide. The ARM is pipelined, so it can
00:26:13.490 --> 00:26:19.010
do a 32-bit word every cycle. And it also
indicates whether it's sequential or
00:26:19.010 --> 00:26:25.960
non-sequential addressing. This then lets
your... yes, okay. This then lets your MEMC
00:26:25.960 --> 00:26:31.200
decide whether to do an N cycle or an S
cycle. So there's a fast one and a slow
00:26:31.200 --> 00:26:35.220
one, basically. So when you access a new
random address in DRAM, you have to open
00:26:35.220 --> 00:26:40.710
that row and that takes twice the time.
It's a four megahertz cycle. But then once
00:26:40.710 --> 00:26:45.150
you've accessed that address, and then once
you're accessing linearly ahead of that
00:26:45.150 --> 00:26:48.220
address, you can do fast page mode
accesses, which are eight megahertz
00:26:48.220 --> 00:26:54.720
cycles. So ultimately, that's the reason
why these load/store multiples exist, the
00:26:54.720 --> 00:26:57.820
non-RISC instructions. They're there so
that you can stream out registers and back
00:26:57.820 --> 00:27:03.100
in and make use of this DRAM bandwidth. So,
store multiple: this is just a simple
00:27:03.100 --> 00:27:07.860
calculation. For 14 registers, you're
hitting about 25 megabytes a second out of
00:27:07.860 --> 00:27:12.809
30. So it's not 100%, but it's way
more than, you know, a tenth or an eighth,
00:27:12.809 --> 00:27:17.130
which is what a lot of the other processors
were using. So this was really good. This
00:27:17.130 --> 00:27:21.170
is the prime factor of why this machine
was so fast: it's effectively the load/store
00:27:21.170 --> 00:27:30.169
multiple instructions and being able to
access the stuff linearly. So the MMU is
00:27:30.169 --> 00:27:36.980
weird. It's not a TLB in the traditional
sense. With TLBs today, if you take your
00:27:36.980 --> 00:27:43.040
MIPS chip or something where the TLB is
visible to software, it will map a virtual
00:27:43.040 --> 00:27:47.760
address into a chosen physical address, and
you'll have some number of entries and you
00:27:47.760 --> 00:27:54.220
more or less arbitrarily, you know, poke
an entry in with the chosen mapping in it.
00:27:54.220 --> 00:27:57.789
MEMC does it upside down. So it's
got a fixed number of entries, one for every
00:27:57.789 --> 00:28:02.380
page in DRAM. And then for each of those
entries, it checks an incoming address to
00:28:02.380 --> 00:28:08.600
see whether it matches. So it has all of
those entries that were shown on the
00:28:08.600 --> 00:28:13.500
chip diagram a couple of slides ago. That
big left-hand side had that big array. All
00:28:13.500 --> 00:28:16.831
of those are effectively just storing a
virtual address and then matching it; they
00:28:16.831 --> 00:28:21.840
have a comparator. And then one of them
lights up and says, yes, it's mine. So
00:28:21.840 --> 00:28:24.551
effectively, the physical page says that
virtual address is mine instead of the
00:28:24.551 --> 00:28:30.030
other way round. So this also limits your
memory. If you're saying I have to have
00:28:30.030 --> 00:28:34.480
one of these entries on chip per page of
physical memory, and you don't want pages
00:28:34.480 --> 00:28:40.960
to be enormous... They're 32K: if you do the
math, four megabytes over 128 pages is a
00:28:40.960 --> 00:28:44.690
32K page. If you don't want the page to
get much bigger than that and trust me you
00:28:44.690 --> 00:28:47.890
don't, then you need to add more of these
entries and it's already half the size of
00:28:47.890 --> 00:28:52.110
the chip. So effectively, this is one of
the limits of why you can only have four
00:28:52.110 --> 00:28:58.360
megabytes on one of these memory
controller chips. OK. So VIDC is the core
00:28:58.360 --> 00:29:05.230
of the video and sound system. It's a set of
FIFOs and a set of digital-to-analog
00:29:05.230 --> 00:29:09.970
converters for doing video and sound. You
stream stuff into the FIFOs and it does
00:29:09.970 --> 00:29:14.850
the display timing and palette lookup and
so forth. It has an 8-bit mode I
00:29:14.850 --> 00:29:21.840
mentioned. It's slightly strange. It also
has an output for a transparency bit. So in
00:29:21.840 --> 00:29:23.830
your palette you can set 12 bits of
color, but you can set a bit of
00:29:23.830 --> 00:29:31.910
transparency as well, so you can do video
genlocking quite easily with this. So
00:29:31.910 --> 00:29:36.701
there was a revision later on. Tudor
explains that the very first one had a bit
00:29:36.701 --> 00:29:41.230
of crosstalk between the video and the
sound, so you'd get sound with noise on
00:29:41.230 --> 00:29:45.980
it. That was basically video noise and
it's quite hard to get rid of. And so they
00:29:45.980 --> 00:29:50.000
did this revision and the way he fixed it
was quite cool. They shuffled the power
00:29:50.000 --> 00:29:54.000
supply around and did all the sensible
engineering things. But he also filtered
00:29:54.000 --> 00:29:58.610
out a bit of the noise that was being
output on the sound. He
00:29:58.610 --> 00:30:02.630
inverted it and then fed that back in as
the reference current for the DAC. So that
00:30:02.630 --> 00:30:06.090
sort of self-compensated and took the
noise out, a bit like noise-canceling
00:30:06.090 --> 00:30:10.809
headphones. So it was kind of a nice hack.
And that was VIDC1. OK, the final
00:30:10.809 --> 00:30:17.700
one, I'm going to stop showing you chip
plots after this, unfortunately, but just
00:30:17.700 --> 00:30:20.980
get your fill while we're here. And again,
I'm really glad this is enormous for the
00:30:20.980 --> 00:30:25.590
people in the room and maybe those zooming
in online. There's a cool little
00:30:25.590 --> 00:30:29.510
Illuminati eye logo in the bottom left
corner. So I feared that you weren't gonna
00:30:29.510 --> 00:30:34.630
be able to see and I didn't have time to
do a zoomed-in version, but... Okay. So IOC
00:30:34.630 --> 00:30:38.030
is the center of the IO system, or as much of
the IO system as possible. All the random
00:30:38.030 --> 00:30:41.030
bits of glue logic to do things like
timing (some peripherals are slower than
00:30:41.030 --> 00:30:47.309
others) live in IOC. It contains a UART
for the keyboard, so the keyboard is
00:30:47.309 --> 00:30:52.320
looked after by an 8051 microcontroller.
It's nice and easy to do the scanning in
00:30:52.320 --> 00:30:57.429
software. This microcontroller just sends
stuff up a serial port to this chip. So it's the
00:30:57.429 --> 00:31:02.039
KART: keyboard asynchronous receiver and
transmitter. It was at one point called
00:31:02.039 --> 00:31:06.080
the fast asynchronous receiver and
transmitter. Mike got forced to change the
00:31:06.080 --> 00:31:12.730
name. Not everyone has a 12 year old sense
of humor, but I admire his spirit. So the
00:31:12.730 --> 00:31:15.630
other thing it does is interrupts. All the
interrupts go into IOC, and it's got masks
00:31:15.630 --> 00:31:20.341
and consolidates them, effectively, for
sending an interrupt up to the ARM. The ARM
00:31:20.341 --> 00:31:24.690
can then check the status to do a fast
response to it. So the Eye of Providence
00:31:24.690 --> 00:31:27.540
there, the little logo I pointed out: Mike
said he put that in for future
00:31:27.540 --> 00:31:35.799
archaeologists to wonder about. Okay. That
was that was it. I was hoping there'd be
00:31:35.799 --> 00:31:40.500
this big back story about, you know, he
was in the Illuminati or something. Maybe
00:31:40.500 --> 00:31:44.690
he's not allowed to say. Anyway, so just
like the other dev board I showed you,
00:31:44.690 --> 00:31:49.930
this one's the A500 2P. It's still a second
processor that plugs into a BBC Micro.
00:31:49.930 --> 00:31:54.460
It's still got this host having disk
drives and so forth attached to it and
00:31:54.460 --> 00:32:00.289
pushing stuff down the tube into the
memory here. But now, finally, all of
00:32:00.289 --> 00:32:05.370
the chips are now
assembled in one place. So this is
00:32:05.370 --> 00:32:08.370
starting to look like an Archimedes. It's
got video out. It's got a keyboard
00:32:08.370 --> 00:32:11.620
interface. It's got some expansion stuff.
So this is for bring-up and an early software
00:32:11.620 --> 00:32:18.460
head start. But very shortly afterwards, we
got the A500, internal to Acorn. And
00:32:18.460 --> 00:32:21.460
this is really the first Archimedes. This
is the prototype Archimedes. It's actually got
00:32:21.460 --> 00:32:27.660
a gorgeous gray brick sort of look to it,
kind of concrete. It weighs like concrete,
00:32:27.660 --> 00:32:31.480
too, but it has all the hallmarks. It's
got the IO interfaces, it's got the
00:32:31.480 --> 00:32:36.950
expansion slots you can see at the back.
It runs the same operating
00:32:36.950 --> 00:32:39.950
system. Now, this was used for the OS
development. There were only a couple of
00:32:39.950 --> 00:32:44.820
hundred of these made. This is
serial 222, so this is one of the last,
00:32:44.820 --> 00:32:50.730
I think. But yeah, only internal to
Acorn. There are some nice tweaks to this
00:32:50.730 --> 00:32:55.700
machine. So the hardware team had designed
this. Tudor designed this as well as the
00:32:55.700 --> 00:33:01.710
video system. And he said, well, his A500
was the special one, in that he had a video
00:33:01.710 --> 00:33:05.409
controller... he'd hand-picked one
of the VIDCs so that, instead of running
00:33:05.409 --> 00:33:11.190
at 24 megahertz, it ran at 56. There's some silicon
variation in manufacture, so he found a
00:33:11.190 --> 00:33:16.169
56 megahertz part. And so he could do, I
think it was, 1024 x 768, which is way out
00:33:16.169 --> 00:33:22.400
of spec for the rest of the Archimedes.
So he had the really, really cool machine.
00:33:22.400 --> 00:33:26.220
They also ran some of them at 12 megahertz
as well instead of 8. This is a massive
00:33:26.220 --> 00:33:30.500
performance improvement. I think it used
expensive memory, which is kind of out of
00:33:30.500 --> 00:33:37.180
reach for the product. Right. So believe
me, this is the simplified
00:33:37.180 --> 00:33:41.240
circuit diagram. The technical reference
manuals are available online if anyone
00:33:41.240 --> 00:33:46.159
wants the complicated one. The main parts
of the display are ARM, MEMC and some RAM,
00:33:46.159 --> 00:33:52.049
and we'll have a little walk through them. So
the clocks are generated actually by the
00:33:52.049 --> 00:33:57.200
memory controller. The memory controller gives
the clocks to the ARM. The main reason for
00:33:57.200 --> 00:34:01.030
this is that the memory controller has to
do some slow things now and then. It has
00:34:01.030 --> 00:34:05.860
to open pages of DRAM, do refresh cycles and
things. So it stops the CPU: it generates
00:34:05.860 --> 00:34:11.559
the clock and it pauses the CPU by
stopping that clock from time to time.
00:34:11.559 --> 00:34:16.079
When you do a DRAM access, there's the address
bus along the top: the ARM outputs an
00:34:16.079 --> 00:34:19.720
address that goes into the MEMC. The
MEMC then converts that: it does an address
00:34:19.720 --> 00:34:23.599
translation and then it converts that into
row and column addresses suitable for the
00:34:23.599 --> 00:34:27.139
DRAM. And then, if you're doing a read, the DRAM
outputs the address, erm, outputs the
00:34:27.139 --> 00:34:33.419
data onto the data bus, which the ARM then sees.
So MEMC is kind of the critical path on
00:34:33.419 --> 00:34:37.279
this, but the address flows through MEMC,
effectively. Notice that MEMC is not on
00:34:37.279 --> 00:34:41.329
the data bus. It just gets addresses
flowing through it. That'll become important later
00:34:41.329 --> 00:34:47.260
on. ROM is another slow thing, another
reason why MEMC might slow down the
00:34:47.260 --> 00:34:54.099
access and in a similar sort of way. There
is also a permission check done when
00:34:54.099 --> 00:35:00.259
you're doing the address translation, for
user permission versus OS or supervisor,
00:35:00.259 --> 00:35:06.640
and so this information is output as part
of the cycle when the ARM does an access. If
00:35:06.640 --> 00:35:09.730
you miss in that translation, so you get a
page fault or permission fault, then an
00:35:09.730 --> 00:35:17.410
abort signal comes back and you take an
exception, and the ARM deals with that in software.
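
NOTE
Editor's sketch, not part of the talk: a toy Python model of the "upside down" lookup described earlier, where each physical page owns one entry and claims a matching logical page, with the permission check and abort folded in. The page size and entry count are the figures quoted in the talk; the names PageCam and Abort, and the single user_ok flag standing in for the real protection levels, are simplifying assumptions.
```python
PAGE_SIZE = 32 * 1024       # 4 MB / 128 entries, the figures quoted in the talk
NUM_PHYS_PAGES = 128        # one on-chip entry per physical page
#
class Abort(Exception):
    """Stands in for the abort signal sent back to the ARM."""
#
class PageCam:
    """Hypothetical model: entry i holds the logical page mapped to physical page i."""
    def __init__(self):
        self.entries = [None] * NUM_PHYS_PAGES
    def map(self, phys_page, logical_page, user_ok):
        self.entries[phys_page] = (logical_page, user_ok)
    def translate(self, logical_addr, supervisor):
        page, offset = divmod(logical_addr, PAGE_SIZE)
        # The hardware compares every entry at once; a loop stands in for that.
        for phys_page, entry in enumerate(self.entries):
            if entry is not None and entry[0] == page:
                if not (supervisor or entry[1]):
                    raise Abort("permission fault")
                return phys_page * PAGE_SIZE + offset
        raise Abort("page fault")
```
Because the index of the matching entry is the physical page number, the table needs exactly one entry per physical page, which is why more memory per controller would mean either more entries on chip or bigger pages.
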
00:35:17.410 --> 00:35:22.289
The data bus is a critical path, and so
the IO stuff is buffered; it is kept away
00:35:22.289 --> 00:35:27.599
from that. So the IO bus is 16 bits, and there were
not a lot of 32-bit peripherals around in
00:35:27.599 --> 00:35:32.599
those days; all the peripherals were 8 or
16 bits. So that's the right thing to do.
00:35:32.599 --> 00:35:36.150
The IOC decodes that, and there's a
handshake with MEMC if it needs more
00:35:36.150 --> 00:35:39.809
time. If it's accessing one of the
expansion cards and the expansion card has
00:35:39.809 --> 00:35:47.691
something slow on it, then that's dealt
with in the IOC. So I mentioned the
00:35:47.691 --> 00:35:53.680
interrupt status that gets funneled into
IOC and then back out again. There's a V
00:35:53.680 --> 00:35:57.599
Sync interrupt, but not an H Sync
interrupt. You have to use timers for that
00:35:57.599 --> 00:36:02.010
really annoyingly. There's
a 2 megahertz timer available, which I
00:36:02.010 --> 00:36:05.539
don't think I had
previously mentioned. So if you want to
00:36:05.539 --> 00:36:09.730
do funny palette-switching stuff or copper
bars or something, that's possible with the
00:36:09.730 --> 00:36:13.400
timers. It's also a simple hardware mod to
make a real HSync interrupt as well.
00:36:13.400 --> 00:36:18.529
There's some spare interrupt inputs on the
IOC, as an exercise for you. So the bit I
00:36:18.529 --> 00:36:23.440
really like about this system, I mentioned
that MEMC is not on the data bus. The VIDC
00:36:23.440 --> 00:36:28.079
is only on the data bus and it doesn't
have an address bus. But the VIDC is the
00:36:28.079 --> 00:36:31.200
thing responsible for turning the frame
buffer into video, reading that frame
00:36:31.200 --> 00:36:35.509
buffer out of RAM, and so on. So how does it
actually do that DRAM read without the
00:36:35.509 --> 00:36:40.960
address? Well, the MEMC contains all of
the registers for doing this DMA: the
00:36:40.960 --> 00:36:45.140
start of the frame buffer, the current
position and size and so on. They all
00:36:45.140 --> 00:36:51.410
live in the MEMC. So there's a handshake
where VIDC sends a request up to the MEMC.
00:36:51.410 --> 00:36:55.239
When its FIFO gets low, the MEMC then
actually generates the address into the
00:36:55.239 --> 00:37:00.349
DRAM, the DRAM outputs that data, and
then the MEMC gives an
00:37:00.349 --> 00:37:05.509
acknowledge to the... I mean... too many
chips. The MEMC gives an acknowledge to
00:37:05.509 --> 00:37:11.210
VIDC, which then latches that data into
the FIFO. So this partitioning is
00:37:11.210 --> 00:37:16.710
quite neat. A lot of the video DMA... well,
the video DMA all lives in MEMC, and
00:37:16.710 --> 00:37:20.799
there's this kind of split across the two
chips. The sound one: I've just
00:37:20.799 --> 00:37:24.839
highlighted one interrupt that comes from
MEMC. Sound works exactly the same way,
00:37:24.839 --> 00:37:27.730
except there's a double buffering scheme
that goes on. And when one half of it
00:37:27.730 --> 00:37:32.359
becomes empty, you get an interrupt so you
can refill that, so you don't get a gap in your
00:37:32.359 --> 00:37:39.700
sound. So this all works really very
smoothly. So finally the high res mono
00:37:39.700 --> 00:37:44.509
thing that I mentioned before: it's quite a
novel way they did that. Tudor had realized
00:37:44.509 --> 00:37:49.931
that with one external component, a
shift register, running very fast, he
00:37:49.931 --> 00:37:53.400
could implement this very high resolution
mode without really affecting the rest of
00:37:53.400 --> 00:38:00.099
the chip. So VIDC still runs at 24
megahertz, at sort of VGA resolution. It
00:38:00.099 --> 00:38:05.450
outputs on a digital bus that was a test
port originally. It outputs 4 bits, so 4
00:38:05.450 --> 00:38:09.420
pixels in one chunk at 24 megahertz, and
this external component then shifts
00:38:09.420 --> 00:38:13.880
through that at 4 times the speed. That's
one component. I mean, this is this is a
00:38:13.880 --> 00:38:17.569
very cheap way of doing this. And as I
said, this high-res mode is very
00:38:17.569 --> 00:38:23.009
unusual for machines of this era.
I've got a feeling an A540, the top-end
00:38:23.009 --> 00:38:26.979
machine... if anyone's got one of these and
wants to try this trick, then please get in
00:38:26.979 --> 00:38:31.080
touch. I've got a feeling an
A540 will do 1280 x 1024 by
00:38:31.080 --> 00:38:35.750
overclocking this. I think all of the
parts survive it. But for some reason,
00:38:35.750 --> 00:38:40.369
Acorn didn't support that on the board.
And finally, clock selection: on
00:38:40.369 --> 00:38:44.839
some of the machines, there's quite a flexible set
of clocks for different resolutions,
00:38:44.839 --> 00:38:51.589
basically. So MEMC is not on the data bus.
How do we program it? It's got registers
00:38:51.589 --> 00:38:55.259
for DMA and it's got all this address
translation. So the memory map I showed
00:38:55.259 --> 00:39:01.089
before has an 8 megabyte space reserved for
the address translation registers. It doesn't
00:39:01.089 --> 00:39:04.690
have eight megabytes of them; I mean, it
doesn't have two million 32-bit registers
00:39:04.690 --> 00:39:09.819
behind there, which is a hint of what's
going on here. So what you do is you write
00:39:09.819 --> 00:39:14.410
any value to this space and you encode the
information that you want to put into one
00:39:14.410 --> 00:39:19.539
of these registers in the address. So in this
address, the top three bits are one, so it's
00:39:19.539 --> 00:39:25.230
in the top eight megabytes of the 64
megabyte address space, and you format your
00:39:25.230 --> 00:39:28.999
logical and physical page information in this
address and then you write any byte,
00:39:28.999 --> 00:39:35.479
effectively. This sort of feels
really dirty, but it's also really a very nice
00:39:35.479 --> 00:39:39.779
way of doing it because there's no other
space in the address map. And this leads
00:39:39.779 --> 00:39:45.069
to the price balance. So it's not
worth having a data bus going into
00:39:45.069 --> 00:39:49.809
MEMC, costing 32 more pins, just to write
these registers, as opposed to playing this
00:39:49.809 --> 00:39:55.849
sort of trick. If you had that
data bus just for
00:39:55.849 --> 00:39:59.990
that, then, you know, you'd have to go to a
more expensive package. And this
00:39:59.990 --> 00:40:05.140
was really in their minds: a 68-pin chip
versus an 84-pin chip. It was a big deal,
00:40:05.140 --> 00:40:08.719
right? So they really strived
to make sure everything was in the very smallest
00:40:08.719 --> 00:40:13.250
package possible. And this system
partitioning effort led to these sorts of
00:40:13.250 --> 00:40:22.890
tricks to then program it. So on the
A540, we get multiple MEMCs. Each one is
00:40:22.890 --> 00:40:27.329
assigned a colored stripe here of the
physical address space. So you have a 16
00:40:27.329 --> 00:40:31.049
megabyte space, each one looks after four
megabytes of it. But then when you do a
00:40:31.049 --> 00:40:36.039
virtual access in the bottom half of the
user space, regular program access, all of
00:40:36.039 --> 00:40:40.080
them light up and all of them will
translate that address in parallel. And
00:40:40.080 --> 00:40:44.290
one of them hopefully will translate and
then energize the RAM to do the read, for
00:40:44.290 --> 00:40:49.930
example. When you put an ARM3 in this
system, ARM3 has its cache, and then
00:40:49.930 --> 00:40:54.420
the address leads into the MEMC. So then
that means that the address is being
00:40:54.420 --> 00:40:58.240
translated outside of the cache, or after
the cache. So you're caching virtual
00:40:58.240 --> 00:41:02.900
addresses and, as we all know, this is kind
of bad for performance, because whenever
00:41:02.900 --> 00:41:06.749
you change that virtual address space, you
have to invalidate your cache, or tag it. But
00:41:06.749 --> 00:41:11.799
they didn't do that. There's other ways of
solving this problem. Basically on this
00:41:11.799 --> 00:41:14.950
machine, what you need to do is invalidate
the whole cache. It's quite a quick
00:41:14.950 --> 00:41:24.150
operation, but it's still not good for
performance to have an empty cache. The
00:41:24.150 --> 00:41:28.730
only DMA present in the system is for the
video, for the video and sound. I/O
00:41:28.730 --> 00:41:32.569
doesn't have any DMA at all. And this is
another area where as younger engineers
00:41:32.569 --> 00:41:35.969
see crap, why didn't they have DMA? That
would be way better. DMA is the solution
00:41:35.969 --> 00:41:40.989
to everyone's problems, as we all know.
And I think the quote on the right hand
00:41:40.989 --> 00:41:47.390
ties in with the ACORN team's discovery
that all of these other processes needed
00:41:47.390 --> 00:41:51.969
quite complex chipsets, quite expensive
support chips. So the quote on the right
00:41:51.969 --> 00:41:56.539
says that if you've got some chips, that
vendors will be charging more for their
00:41:56.539 --> 00:42:03.259
DMA devices even than the CPU. So not
having dedicated DMA engine on board is a
00:42:03.259 --> 00:42:08.930
massive cost saving. The comment I made on
the previous to slide about the system
00:42:08.930 --> 00:42:14.440
partitioning, putting a lot of attention
into how many pins were on one chip versus
00:42:14.440 --> 00:42:19.380
another, how many buses were going around
the place. Not having IOC having to access
00:42:19.380 --> 00:42:25.019
memory was a massive saving in cost for
the number of pins and the system as a
00:42:25.019 --> 00:42:33.539
whole. The other thing is the FIQ mode
was effectively the means for doing IO.
00:42:33.539 --> 00:42:37.999
Therefore, FIQ Mode was designed to be an
incredibly low overhead way of doing
00:42:37.999 --> 00:42:44.010
programmed IO, by having the CPU do the
IO. So this was saying that the CPU is
00:42:44.010 --> 00:42:48.850
going to be doing all of the IO stuff, but
let's just optimize it, let's make it
00:42:48.850 --> 00:42:53.930
it as good as it could be and that's what
led to the threatened IO (?). Also,
00:42:53.930 --> 00:42:57.849
remember ARM2 didn't have a cache. If you
don't have a cache on your CPU,
00:42:57.849 --> 00:43:03.099
DMA is going to hold up the CPU anyway,
stealing cycles. DMA is not any
00:43:03.099 --> 00:43:06.960
performance gain. You may as well get
the CPU to do it and then get the CPU to
00:43:06.960 --> 00:43:13.029
do it in the lowest overhead way possible.
I think this can be summarized as bringing
00:43:13.029 --> 00:43:17.410
the "RISC principles" to the system. So
the RISC principles say, for your CPU, you
00:43:17.410 --> 00:43:21.420
don't put anything in the CPU that you can
do in software, and this is saying, okay,
00:43:21.420 --> 00:43:26.789
well, actually software can do the IO just
as well without the cache as the DMA
00:43:26.789 --> 00:43:29.799
system. So let's get software to do that.
And I think this is a kind of a nice way
00:43:29.799 --> 00:43:34.339
of seeing it. This is part of the cost
optimization for really very little
00:43:34.339 --> 00:43:39.910
degradation in performance compared to
doing it in hardware. So this is an IO card.
00:43:39.910 --> 00:43:43.380
They're Eurocards, nice and easy. The
only thing I wanted to say here was this
00:43:43.380 --> 00:43:49.339
is my SCSI card and it has a ROM on the
left-hand side. And so this is an
00:43:49.339 --> 00:43:54.339
expansion ROM, basically, many, many years
before PCI made this popular. Your drivers
00:43:54.339 --> 00:43:58.950
are on this ROM. There's a SCSI disc
plugging into this, and you can plug this
00:43:58.950 --> 00:44:02.990
card in and then boot off the disc. You
don't need any other software to make it
00:44:02.990 --> 00:44:07.670
work. So this is just a very nice user
experience. There is no messing around
00:44:07.670 --> 00:44:11.690
with configuring IO windows or interrupts
or any of the ISA sort of stuff that was
00:44:11.690 --> 00:44:17.869
going on at the time. So to summarize some
of the hardware stuff that we've seen,
00:44:17.869 --> 00:44:21.950
the ARM is pipelined, and it has the load-
store-multiple instructions, which make
00:44:21.950 --> 00:44:27.950
for a very high bandwidth utilization.
That's what gives it its high performance.
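
NOTE
Editor's sketch, not part of the talk: the store-multiple bandwidth estimate mentioned earlier can be reproduced roughly in Python. The clock rate, bus width and the two-to-one cost of N versus S cycles come from the talk; the assumed cost of a 14-register store multiple (2 N-cycles plus 13 S-cycles) is an assumption, and the slide may have counted overheads differently.
```python
CLOCK_HZ = 8_000_000        # 8 MHz memory clock, as in the talk
BYTES_PER_WORD = 4          # 32-bit data bus
N_CYCLE_TICKS = 2           # non-sequential access: a "4 MHz cycle"
S_CYCLE_TICKS = 1           # sequential (fast page mode) access: an "8 MHz cycle"
#
def burst_bandwidth_mb_s(words, n_cycles, s_cycles):
    """Bytes moved per second, in units of 10**6, for one burst."""
    ticks = n_cycles * N_CYCLE_TICKS + s_cycles * S_CYCLE_TICKS
    return words * BYTES_PER_WORD * CLOCK_HZ / ticks / 1e6
#
PEAK_MB_S = burst_bandwidth_mb_s(1, 0, 1)      # every access sequential
# Assumed cost of a 14-register store multiple: 2 N-cycles plus 13 S-cycles.
STM14_MB_S = burst_bandwidth_mb_s(14, 2, 13)
```
Under these assumptions the burst figure lands in the same region as the one quoted in the talk, and well above what a processor using a tenth or an eighth of the available bandwidth would reach.
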
00:44:27.950 --> 00:44:32.670
The machine was really simple. So
attention to detail about separating,
00:44:32.670 --> 00:44:37.239
partitioning the work between the chips
and reducing the chip cost as much as
00:44:37.239 --> 00:44:44.569
possible. Keeping that balanced was really
a good idea. The machine was designed when
00:44:44.569 --> 00:44:49.400
memory and CPUs were about the same speed.
So this is before that kind of flipped
00:44:49.400 --> 00:44:52.910
over. An eight megahertz ARM2 is
designed to use 8 megahertz memory.
00:44:52.910 --> 00:44:56.509
There's no need to have a cache at all on
there. These days, it sounds really crazy
00:44:56.509 --> 00:45:01.410
not to have a cache on your CPU, but if your
memory is not that much slower, then this
00:45:01.410 --> 00:45:07.809
is a huge cost saving, but it is also a risk
saving. This was the first real proper CPU,
00:45:07.809 --> 00:45:11.670
if we don't count ARM1, which, say, was a
test; ARM2 is, you know, the
00:45:11.670 --> 00:45:16.490
first production CPU. And having a cache on
that would have been a huge risk for a
00:45:16.490 --> 00:45:20.640
design team that hadn't dealt with
structures that complicated at that
00:45:20.640 --> 00:45:26.299
point. So that was the right thing to do,
I think. And as for DMA, I'm actually
00:45:26.299 --> 00:45:29.299
a convert on this. I thought this was crap,
and actually, I think this was a really
00:45:29.299 --> 00:45:33.319
good example of balanced design. What's
the right tool for the job? Software is
00:45:33.319 --> 00:45:38.009
going to do the IO, so let's make sure,
with the FIQ mode, that
00:45:38.009 --> 00:45:44.640
it's as low overhead as possible. I've
talked about system partitioning. The MMU:
00:45:44.640 --> 00:45:49.569
I've gone on about it. I still think it's
weird and backward. I think there is a
00:45:49.569 --> 00:45:56.029
strong argument, though, that a more
familiar TLB is massively complicated
00:45:56.029 --> 00:45:59.339
compared to what they did here. And I
think the main drive here was not just
00:45:59.339 --> 00:46:06.770
area on the chip, but also to make it much
simpler to implement. So it worked. And I
00:46:06.770 --> 00:46:09.450
think they really didn't have
that many shots at doing this. This wasn't
00:46:09.450 --> 00:46:14.779
a company or a team that could afford to
have many goes at this product. And I
00:46:14.779 --> 00:46:20.660
think that says it all. I think they did a
great job. Okay. So the ARX story is a
00:46:20.660 --> 00:46:24.599
little bit more complicated. Remember,
it's gonna be this office automation
00:46:24.599 --> 00:46:28.920
machine, a bit like a Xerox Star. It was going
to have this wonderful high-res mono mode
00:46:28.920 --> 00:46:33.729
and people were gonna be laser printing from
it. So, just like Xerox PARC, Acorn started a
00:46:33.729 --> 00:46:37.911
Palo Alto-based research center:
Californians and beanbags, writing an
00:46:37.911 --> 00:46:43.319
operating system using a microkernel in
Modula-2. All of the trendy boxes ticked
00:46:43.319 --> 00:46:49.400
here for the mid-80s. It sounds like
a very advanced operating system, and it
00:46:49.400 --> 00:46:54.349
did virtual memory and so on. It was very
resource hungry, though, and it was never
00:46:54.349 --> 00:47:00.130
really very performant. Ultimately, the
hardware got done quicker than the
00:47:00.130 --> 00:47:05.930
software. And after a year or two.
Management got the jitters. Hardware was
00:47:05.930 --> 00:47:09.460
looming and said, well, next year we're
going to have the computer ready. Where's
00:47:09.460 --> 00:47:13.650
the operating system? And the project got
canned. And this is a real shame. I'd love
00:47:13.650 --> 00:47:16.599
to know more about this operating system.
Virtually nothing is documented outside of
00:47:16.599 --> 00:47:21.569
Acorn. Even the people I spoke to didn't
work on this. A bunch of people in
00:47:21.569 --> 00:47:25.250
California kind of disappeared with
it. So if anyone has this software
00:47:25.250 --> 00:47:29.259
archived anywhere, then get in touch. The
computer museum around the corner from me
00:47:29.259 --> 00:47:35.699
is raring to go on that. That'd be a really
cool thing to archive. So anyway, they
00:47:35.699 --> 00:47:39.979
now had a desperate situation. They had to
go to Plan B, which was: in under a year,
00:47:39.979 --> 00:47:42.719
write an operating system for the machine
that was on its way to being delivered.
00:47:42.719 --> 00:47:48.260
And it kind of shows. Arthur was... I mean, I
think the team did a really good job in
00:47:48.260 --> 00:47:53.160
getting something out of the door in half
a year, but it was a little bit flaky.
00:47:53.160 --> 00:47:57.160
RISC OS then, a year later, developed
from Arthur. I don't know if anyone's
00:47:57.160 --> 00:48:01.609
heard of RISC OS, but Arthur is
very, very niche and basically got
00:48:01.609 --> 00:48:07.170
completely replaced by RISC OS because
it was a bit less usable than RISC OS.
00:48:07.170 --> 00:48:12.059
Another really strong point is that this is
quite a big ROM. So two megabytes, going
00:48:12.059 --> 00:48:17.400
up... so half a megabyte in the 80s, going
up to two megabytes in the early 90s.
00:48:17.400 --> 00:48:22.019
There's a lot of stuff in ROM. One of
those things is BBC BASIC V. I know
00:48:22.019 --> 00:48:29.289
it's 2019, and I know BASIC is BASIC, but
BBC BASIC is actually quite good. It has
00:48:29.289 --> 00:48:32.859
procedures and it's got nice support for all
the graphics and sound. You could write GUI
00:48:32.859 --> 00:48:36.660
applications in BASIC and a lot of people
did. It's also very fast. So Sophie Wilson
00:48:36.660 --> 00:48:42.920
wrote this very, very optimized BASIC
interpreter. I talked about the modules
00:48:42.920 --> 00:48:45.589
and podules. This is the expansion
ROM thing: a really great user
00:48:45.589 --> 00:48:50.589
experience there. But speaking of user
experience, this was Arthur. I never used
00:48:50.589 --> 00:48:58.559
Arthur. I just dug out a ROM of it to play
with it. It is bloody horrible. So that
00:48:58.559 --> 00:49:03.819
went away quickly. At the time also... So
part of this emergency Plan B was to take
00:49:03.819 --> 00:49:08.210
the Acornsoft team, who were supposed to
be writing applications for this, and get
00:49:08.210 --> 00:49:12.079
them to quickly knock out an operating
system. So at launch, basically, this is
00:49:12.079 --> 00:49:15.750
one of the only things that you could do
with the machine. It had a great demo called
00:49:15.750 --> 00:49:20.569
Lander, of a great game called Zarch, which is
3D space you could fly around. It didn't
00:49:20.569 --> 00:49:27.029
have business... proper serious business
applications. And, you know, it was very...
00:49:27.029 --> 00:49:31.079
there was not much you could do with this
really expensive machine at launch and
00:49:31.079 --> 00:49:35.450
that really hurt it, I think. So then we
get RISC OS 2 in 1988, and this is now
00:49:35.450 --> 00:49:42.219
looking less like a vomit sort of thing,
a much nicer machine. And then eventually
00:49:42.219 --> 00:49:46.749
RISC OS 3. It has drag and drop
between applications. It's all
00:49:46.749 --> 00:49:52.849
multitasking, does outline font anti-aliasing
and so on. So just lastly, I want to
00:49:52.849 --> 00:49:55.769
quickly touch on the really interesting
operating system that Acorn had: a Unix
00:49:55.769 --> 00:49:59.079
operating system. So as well as being a
geek, I'm also a Unix geek and I've always
00:49:59.079 --> 00:50:04.609
been fascinated by RISC iX. These machines
are astonishingly expensive. They were
00:50:04.609 --> 00:50:08.191
the existing Archimedes machines with a
different sticker on. So that's a 540 with
00:50:08.191 --> 00:50:13.890
a sticker on the front. And this system
was developed after the Archimedes, which was
00:50:13.890 --> 00:50:18.529
already designed at that point when this
operating system was being developed. So
00:50:18.529 --> 00:50:20.950
there's a lot of stuff about the hardware
that wasn't quite right for a Unix
00:50:20.950 --> 00:50:26.230
operating system. A 32K page size on a 4
megabyte machine really, really killed you
00:50:26.230 --> 00:50:29.900
in terms of your page cache and that
kind of thing. They turned this into a bit
00:50:29.900 --> 00:50:35.089
of an opportunity. At least they made good
on some of this. There was quite a novel
00:50:35.089 --> 00:50:42.380
online decompression scheme: you would
demand-page in all text from the binary
00:50:42.380 --> 00:50:46.170
and it would be decompressed as you went
to get a page, but it was stored in a
00:50:46.170 --> 00:50:54.309
sparse way on disk. So actually on-disk
use was a lot less than you'd expect. It was the
00:50:54.309 --> 00:50:59.609
only way it fit on some of the smaller
machines. Also tackles the department does
00:50:59.609 --> 00:51:05.049
on the cyber track. It turns out this is
their view of the 680, which is an
00:51:05.049 --> 00:51:08.940
unreleased workstation. I love this
picture. I like that piece of cheese or
00:51:08.940 --> 00:51:13.959
cake is the mouse. That's my favorite
part. But this is the real machine. So
00:51:13.959 --> 00:51:20.650
this is an unreleased prototype I found at
the computer museum. It's notable in
00:51:20.650 --> 00:51:24.650
that it's got two MEMCs. It's got 8 MB of
RAM. It's only designed to run RISC iX,
00:51:24.650 --> 00:51:26.099
the Unix operating system, and has a high-res
monitor only; it doesn't have color. It was
00:51:26.099 --> 00:51:30.279
designed to run FrameMaker and drive
laser printers and be a kind of desktop
00:51:30.279 --> 00:51:35.249
publishing workstation. I've always been
fascinated by RISC iX, as I said. A while
00:51:35.249 --> 00:51:41.450
ago, I hacked around on this for a while.
I got it booting. I'd never seen
00:51:41.450 --> 00:51:46.640
this before. I'd never used a RISC iX
machine. So there we go: it boots, it is
00:51:46.640 --> 00:51:51.730
multi-user. But wait, there's more. It has
a really cool X server, a very fast one. I
00:51:51.730 --> 00:51:54.730
think Sophie Wilson again worked on
the server here. So it's very, very well
00:51:54.730 --> 00:51:58.019
optimized and very fast for a machine of
its era. And it makes quite a nice little
00:51:58.019 --> 00:52:02.900
Unix workstation. It's quite a cool little
system. By the way, Tudor, the guy that
00:52:02.900 --> 00:52:07.099
designed the VIDC and the IO system, called
me a sado for getting this working in
00:52:07.099 --> 00:52:14.150
there. That's my claim to fame. Finally,
and I want to leave some time for
00:52:14.150 --> 00:52:19.510
questions. There's a lot of useful stuff
in ROM. One of them is BBC BASIC. BASIC
00:52:19.510 --> 00:52:23.009
has an assembler, so you can walk up to
this machine with a floppy disk and write
00:52:23.009 --> 00:52:30.239
assembler. It has a special bit of syntax
there, and then you can just call it. And
00:52:30.239 --> 00:52:32.460
so this is really powerful. So at school
or something, with a floppy disk, you can
00:52:32.460 --> 00:52:37.199
do something that's a bit more than BASIC
programming. Bizarrely, I mostly wrote that
00:52:37.199 --> 00:52:41.420
with only two or three tiny syntax errors
after about 20 years away from this. It's
00:52:41.420 --> 00:52:46.059
in there somewhere. Legacy-wise: the
machine didn't sell very many, under a
00:52:46.059 --> 00:52:50.930
hundred thousand easily. I don't think it
really made a massive impact. PCs had
00:52:50.930 --> 00:52:54.640
already taken off by then. The ARM
processor... I'm not going to go on about the
00:52:54.640 --> 00:52:58.920
company. That's clear; that
obviously has changed the world in many
00:52:58.920 --> 00:53:04.140
ways. The thing I really took away from
this exercise was that a handful of smart
00:53:04.140 --> 00:53:10.089
people, not that many, on the order of a dozen,
designed multiple chips, designed a custom
00:53:10.089 --> 00:53:14.869
computer from scratch, got it working. And
it was quite good. And I think that this
00:53:14.869 --> 00:53:17.380
really turned people's heads. It made
people think differently: that people
00:53:17.380 --> 00:53:21.160
that were not Motorola and IBM, really,
really big companies with enormous
00:53:21.160 --> 00:53:27.479
resources, could do this and could make it
work. I think actually that led to the
00:53:27.479 --> 00:53:30.809
thinking that people could design their own
systems-on-chip in the 90s, and that
00:53:30.809 --> 00:53:35.309
market taking off. So I think this is
really key in getting people thinking that
00:53:35.309 --> 00:53:40.420
way: it was possible to design your own
silicon. And finally, I just want to thank
00:53:40.420 --> 00:53:45.279
the people I spoke to, and Adrian and
Jason at the Centre for Computing History in
00:53:45.279 --> 00:53:49.049
Cambridge. If you're in Cambridge, then
please visit there. It's a really cool
00:53:49.049 --> 00:53:56.270
museum. And with that, I'll wrap up. If
there's any time for questions, then... I'm
00:53:56.270 --> 00:54:01.890
getting a blank look. No time for
questions? There's about 5 minutes left.
00:54:01.890 --> 00:54:09.680
Say it or come up to me afterwards. I'm
happy to talk more about this.
00:54:09.680 --> 00:54:18.940
Applause
Herald: The first question is for the
00:54:18.940 --> 00:54:29.799
Internet. Internet signal angel, will you...
Well, get to your microphones, and we'll take the
00:54:29.799 --> 00:54:36.700
first from the audience in the room here. At
the microphone, please ask a question.
00:54:36.700 --> 00:54:44.130
Mic1: You mentioned that the system is
making good use of the memory, but how is
00:54:44.130 --> 00:54:50.459
that actually not completely being
stalled on memory, having no cache and the
00:54:50.459 --> 00:54:55.450
same cycle time for the
memory as for the CPU?
00:54:55.450 --> 00:55:01.140
M: Good question. So how is it not always
stalled on memory? I mean... well, it's
00:55:01.140 --> 00:55:04.390
sometimes stalled on memory. When you do
something that's non-sequential, you have
00:55:04.390 --> 00:55:08.869
to take one of the slow cycles. This was
the N cycle. The key is this: you try and
00:55:08.869 --> 00:55:11.469
maximize the amount of time that you're
doing sequential stuff.
00:55:11.469 --> 00:55:16.220
So on the ARM 2 you wanted to unroll loops
as much as possible, so you're fetching
00:55:16.220 --> 00:55:19.799
your instructions sequentially, right? You
wanted to make as much use as you could of load/store
00:55:19.799 --> 00:55:24.290
multiples. You could load single registers
with an individual register load, but it
00:55:24.290 --> 00:55:28.710
was much more efficient to pay that cost
just once at the start of the instruction and
00:55:28.710 --> 00:55:33.619
then stream stuff sequentially. So you're
right that it is still stalled sometimes,
00:55:33.619 --> 00:55:37.141
but that was still a good
tradeoff, I think, for a system that
00:55:37.141 --> 00:55:40.549
didn't have a cache for other reasons.
Mic1: Thanks.
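NOTE The trade-off described in the answer above (pay the slow non-sequential cycle once, then stream sequentially) can be illustrated with a rough cost model. This is a sketch and not measured data: the tick counts for the N, S and I cycles and the per-instruction cycle mix are simplifying assumptions, and the function names are made up for illustration.

```python
# Rough cost model for ARM2-style memory access; all tick counts are assumptions.
N_TICKS = 2  # assumed cost of a non-sequential (N) memory cycle
S_TICKS = 1  # assumed cost of a sequential (S) memory cycle
I_TICKS = 1  # assumed cost of an internal (I) cycle


def single_loads(n_regs: int) -> int:
    """Ticks for n_regs separate single-register loads.

    Each load is modelled as 1 S + 1 N + 1 I, so the slow N cycle
    is paid once per register.
    """
    return n_regs * (S_TICKS + N_TICKS + I_TICKS)


def load_multiple(n_regs: int) -> int:
    """Ticks for one load-multiple of n_regs registers.

    Modelled as n_regs S + 1 N + 1 I, so the slow N cycle is paid
    once for the whole instruction.
    """
    return n_regs * S_TICKS + N_TICKS + I_TICKS


if __name__ == "__main__":
    for n in (1, 4, 8):
        print(n, single_loads(n), load_multiple(n))
```

Under these assumptions, eight single-register loads cost 32 ticks while one eight-register load-multiple costs 11, which is the kind of gap that made unrolled, sequential code worthwhile on a cacheless machine.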
00:55:40.549 --> 00:55:45.530
Herald: Next question is for the Internet.
Signal Angel (S): Are there any other Acorns
00:55:45.530 --> 00:55:49.839
here right now or if you want to get into
this kind of party together?
00:55:49.839 --> 00:55:51.980
Herald: Can you repeat the first sentence,
please?
00:55:51.980 --> 00:55:55.839
S: Sorry. The first part if you want to
get into this kind of popular vibe right
00:55:55.839 --> 00:55:58.839
now.
M: Yeah, good question. So how do you
00:55:58.839 --> 00:56:06.359
get hold of one? Drive prices up on eBay, I
guess. I hate to say it, but it might be fun to
00:56:06.359 --> 00:56:09.170
play around in emulators. I always
prefer to hack around on the
00:56:09.170 --> 00:56:12.309
real thing; emulators always feel a bit
strange. There are a bunch of really good
00:56:12.309 --> 00:56:19.180
emulators out there, quite complete. Yeah,
I think I would just go on
00:56:19.180 --> 00:56:23.260
auction sites and try and find one.
Fortunately, they're not completely
00:56:23.260 --> 00:56:27.829
rare. I mean, that's the thing: they
did sell. I'm not quite sure of the exact figure,
00:56:27.829 --> 00:56:31.500
but you know, there were tens and tens of
thousands of these things made. So I would
00:56:31.500 --> 00:56:35.130
look also in Britain more than elsewhere.
Although I do understand that Germany had
00:56:35.130 --> 00:56:40.170
quite a few. If you can get a hold of one,
though, I do suggest doing so. I think
00:56:40.170 --> 00:56:46.259
they're really fun to play with.
Herald: OK, next question.
00:56:46.259 --> 00:56:51.860
Mic2: So I found myself looking at the
documentation for the LDM/STM instructions
00:56:51.860 --> 00:56:58.049
while debugging something just last
week. And it just made me wonder, what's your
00:56:58.049 --> 00:57:04.029
thought? Are there any quirks of the
Archimedes that have crept into the modern
00:57:04.029 --> 00:57:06.900
ARM design and instruction set that you
are aware of?
00:57:06.900 --> 00:57:13.449
M: Most of them got purged. So there was
the 26-bit addressing. There were a
00:57:13.449 --> 00:57:20.039
couple of strange uses of an XOR
instruction into the PC for changing flags. So
00:57:20.039 --> 00:57:25.160
there was a great purge when the ARM 6 was
designed, and the ARM 6... I should know
00:57:25.160 --> 00:57:31.559
this, that's ARM v3. That's what first did 32-bit
addressing and lost this. These weirdnesses
00:57:31.559 --> 00:57:35.690
got moved out.
I can't think of any, aside from just the
00:57:35.690 --> 00:57:40.619
resulting ARM32 instruction set being
quite quirky and having a lot of good
00:57:40.619 --> 00:57:47.099
quirks: the shifted register as sort of a
free thing you can do. For example, you
00:57:47.099 --> 00:57:52.059
can add one register to a shifted register
in one cycle. I think that's a good quirk.
00:57:52.059 --> 00:57:55.119
So in terms of inheriting that
instruction set and not changing those
00:57:55.119 --> 00:58:05.959
things, maybe that counts.
Herald: Any further questions? Internet?
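NOTE The "add one register to a shifted register" quirk mentioned in the answer above can be modelled in a few lines. This is an illustrative sketch in Python, not ARM code: `add_lsl` and `times_five` are hypothetical helper names, and the assembly mnemonic in the docstring is given for orientation only.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def add_lsl(rn: int, rm: int, shift: int) -> int:
    """Model of `ADD rd, rn, rm, LSL #shift`.

    Computes rn + (rm << shift), wrapped to 32 bits, as one operation.
    """
    shifted = (rm << shift) & MASK32
    return (rn + shifted) & MASK32


def times_five(x: int) -> int:
    """Multiply by 5 with a single add-with-shift: x + (x << 2)."""
    return add_lsl(x, x, 2)


if __name__ == "__main__":
    print(times_five(7))
```

Because the shift comes along with the add, small constant multiplications and scaled index calculations need no separate shift instruction.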
00:58:05.959 --> 00:58:10.959
And if you have questions. No. Okay. No.
In that case, one round of applause.
00:58:10.959 --> 00:58:12.959
M: Thank you.
00:58:12.959 --> 00:58:13.959
Applause
00:58:13.959 --> 00:58:14.959
postroll music
00:58:14.959 --> 00:58:28.130
Subtitles created by c3subtitles.de
in the year 2020. Join, and help us!