WEBVTT
00:00:00.000 --> 00:00:13.321
33C3 preroll music
00:00:13.321 --> 00:00:16.840
Herald: You have been
here on stage before.
00:00:16.840 --> 00:00:20.160
You successfully tampered with the Wii,
00:00:20.160 --> 00:00:23.110
you successfully tampered
with the PS3 and got
00:00:23.110 --> 00:00:26.840
some legal challenges over there?
00:00:26.840 --> 00:00:28.939
marcan: Some unfounded
legal challenges, yes.
00:00:28.939 --> 00:00:31.640
Herald: And then you fucked,
and excuse my French over here
00:00:31.640 --> 00:00:35.149
– by the way, that is number 8021 to get
00:00:35.149 --> 00:00:39.840
the translation on your DECT phone.
00:00:39.840 --> 00:00:44.600
So you fucked with the Wii U as well.
00:00:44.600 --> 00:00:47.999
“Console Hacking 2016”,
here we go!
00:00:47.999 --> 00:00:51.629
marcan: I’m a lazy guy, so I haven’t
turned on my computer yet for the slides.
00:00:51.629 --> 00:00:57.180
So let me do that,
hopefully this will work.
00:00:57.180 --> 00:01:00.559
My computer is a little bit special.
It runs a lot of Open Source software.
00:01:00.559 --> 00:01:05.620
It runs FreeBSD.
00:01:05.620 --> 00:01:09.909
applause
00:01:09.909 --> 00:01:14.370
It even has things like OpenSSL
in there, and nginx.
00:01:14.370 --> 00:01:21.160
And Cairo I think, and WebKit. It runs a
lot of interesting Open Source software.
00:01:21.160 --> 00:01:24.980
But we all know that BSD is dying, so
we can make it run something a little bit
00:01:24.980 --> 00:01:29.730
more interesting. And hopefully
give a presentation about it.
00:01:29.730 --> 00:01:32.530
Let’s see if this works.
00:01:36.149 --> 00:01:38.380
It’s a good start, black screen, you know.
00:01:38.380 --> 00:01:43.330
It’s syncing to disk
and file system shutting down.
00:01:43.330 --> 00:01:48.710
There we go!
applause
00:01:48.710 --> 00:01:55.310
continued applause
00:01:55.310 --> 00:01:58.610
And yes, I run Gentoo Linux.
00:01:58.610 --> 00:02:01.390
applause
00:02:01.390 --> 00:02:05.400
This is the “Does Wi-Fi work?” moment.
Hopefully.
00:02:07.490 --> 00:02:12.570
NTP, yeah, no… “NTP failed”. Well,
that’s a bit annoying, but it still works.
00:02:15.630 --> 00:02:21.250
Hello? Yeah, it takes a bit to boot.
It doesn’t run systemd, you know.
00:02:21.250 --> 00:02:25.250
It’s sane, it’s a tiny bit slower,
but it’s sane.
00:02:25.250 --> 00:02:30.390
There we go.
applause
00:02:30.390 --> 00:02:35.260
This is the “Does my controller
work?” moment.
00:02:35.260 --> 00:02:39.517
Bluetooth in Saal 1.
Okay, it does.
00:02:39.517 --> 00:02:41.708
Alright, so let’s get started.
00:02:49.700 --> 00:02:53.730
So this is “Console Hacking 2016 –
PS4: PC Master Race”.
00:02:53.730 --> 00:02:58.350
I apologize for the horrible Nazi joke in
the subtitle, but it’s a Reddit thing.
00:02:58.350 --> 00:03:03.069
“PC Master Race”, why? Well.
PS4, is it a PC? Is it not a PC?
00:03:03.069 --> 00:03:06.070
But before we get started,
I would like to dedicate this talk
00:03:06.070 --> 00:03:09.430
to my good friend Ben Byer
who we all know as “bushing”.
00:03:09.430 --> 00:03:11.790
Unfortunately, he passed away
in February of this year and he was
00:03:11.790 --> 00:03:15.240
a great hacker, he came to multiple
congresses, one of the nicest people
00:03:15.240 --> 00:03:19.040
I’ve ever met. I’m sure that some of you
who have met him would agree with that.
00:03:19.040 --> 00:03:23.960
If it weren’t for him, I wouldn’t be here.
So, thank you.
00:03:23.960 --> 00:03:30.480
applause
00:03:30.480 --> 00:03:34.840
Alright. So, the PS4.
Is it a PC? Is it not a PC?
00:03:34.840 --> 00:03:37.220
Well, it’s a little bit different
from previous consoles.
00:03:37.220 --> 00:03:42.490
It has x86, it’s an x86 CPU.
It runs FreeBSD, it runs WebKit.
00:03:42.490 --> 00:03:45.490
It doesn’t have a hypervisor,
unfortunately.
00:03:45.490 --> 00:03:49.849
Then again, the PS3 had a hypervisor
and it was useless, so there you go.
00:03:49.849 --> 00:03:52.380
So this is different from the PS3,
but it’s not completely different.
00:03:52.380 --> 00:03:54.959
It does have a security processor
that you can just ignore because
00:03:54.959 --> 00:03:59.779
it doesn’t secure anything.
So that’s good.
00:03:59.779 --> 00:04:02.520
So how to own a PS4? Well, you write
a WebKit exploit and you write
00:04:02.520 --> 00:04:07.800
a FreeBSD exploit, duh. Right?
Everything runs WebKit,
00:04:07.800 --> 00:04:10.739
and FreeBSD is not exactly the
most secure OS in the world,
00:04:10.739 --> 00:04:14.800
especially not with Sony customizations.
So this is completely boring stuff.
00:04:14.800 --> 00:04:18.548
Like, what’s the point of talking about
WebKit and FreeBSD exploits?
00:04:18.548 --> 00:04:22.089
Instead, this talk is going to be about
something a little bit different.
00:04:22.089 --> 00:04:26.040
First of all, after you run an exploit,
well, you know, step 3 “something”,
00:04:26.040 --> 00:04:29.770
step 4 “PROFIT”. What is this about?
And not only that, though.
00:04:29.770 --> 00:04:32.740
Before you write an exploit, you usually
want to have the code you’re trying
00:04:32.740 --> 00:04:38.100
to exploit. And with WebKit and FreeBSD
you kinda do, but not the build they use,
00:04:38.100 --> 00:04:41.440
and it’s customized. And it’s annoying to
write an exploit if you don’t have access
00:04:41.440 --> 00:04:43.770
to the binary. So how do you get
the binary in the first place?
00:04:43.770 --> 00:04:47.690
Well, you dump the code,
that’s an interesting step.
00:04:47.690 --> 00:04:51.580
So let’s get started with step zero:
black-box code extraction, the fun way.
00:04:51.580 --> 00:04:54.450
A long time ago
in a hackerspace far, far away
00:04:54.450 --> 00:04:59.280
fail0verflow got together
after 31c3.
00:04:59.280 --> 00:05:02.530
And we looked at the PS4 motherboard
and this is what we saw. So there’s
00:05:02.530 --> 00:05:06.000
an Aeolia southbridge, that’s a codename,
by the way. Then there’s the Liverpool APU
00:05:06.000 --> 00:05:10.450
which is the main processor.
It’s a GPU and a CPU
00:05:10.450 --> 00:05:13.870
which is done by AMD, and
it has some RAM. And then
00:05:13.870 --> 00:05:16.250
the southbridge connects to a bunch
of random crap like the USB ports,
00:05:16.250 --> 00:05:19.280
a hard disk, which is USB. For some
inexplicable reason the internal disk
00:05:19.280 --> 00:05:24.840
on the PS4 is USB. Like it’s SATA to USB,
and then to USB on the southbridge.
00:05:24.840 --> 00:05:28.040
Even though it has SATA,
like, what? laughs
00:05:28.040 --> 00:05:31.630
The Blu-ray drive is SATA. The Wi-Fi and
Bluetooth are SDIO, and Ethernet is GMII.
00:05:31.630 --> 00:05:34.090
Okay, how do we attack this?
Well, GDDR5…
00:05:34.090 --> 00:05:38.720
What just…?
Oh. I have a screensaver, apparently!
00:05:38.720 --> 00:05:40.960
That’s great.
laughter
00:05:40.960 --> 00:05:44.350
I thought I killed that,
let me kill that screensaver real quick.
00:05:44.350 --> 00:05:50.960
applause
Something had to fail, it always does.
00:05:52.490 --> 00:05:55.310
I mean, of course I can
SSH into my PS4, right?
00:05:55.310 --> 00:05:59.500
So there we go, okay.
Could have sworn I’d fixed that. Anyway…
00:05:59.500 --> 00:06:02.760
Which one of these interfaces
do you attack? Well, you know,
00:06:02.760 --> 00:06:06.820
USB, SATA, SDIO, GMII – that’s
the raw ethernet interface, by the way –
00:06:06.820 --> 00:06:11.520
all these are CPU-controlled. The CPU
issues commands and the devices reply.
00:06:11.520 --> 00:06:16.389
The devices can’t really do anything. They
can’t write to memory or anything like that.
00:06:16.389 --> 00:06:19.050
You can exploit USB if you
find a bug in the USB driver,
00:06:19.050 --> 00:06:21.370
but we’re back to the no-code issue.
00:06:21.370 --> 00:06:24.870
GDDR5, that would be great,
we could just write to our memory
00:06:24.870 --> 00:06:27.930
and basically own the entire thing.
But it’s a very high-speed bus.
00:06:27.930 --> 00:06:30.160
It’s definitely exploitable.
If you were making a secure system
00:06:30.160 --> 00:06:33.840
don’t assume we can’t own GDDR5,
because we will.
00:06:33.840 --> 00:06:37.020
But it’s not the path of least resistance,
so we’re not gonna do that.
00:06:37.020 --> 00:06:40.150
However, there’s a thing called
PCI Express in the middle there.
00:06:40.150 --> 00:06:42.100
Hmm, that’s interesting!
00:06:42.100 --> 00:06:45.430
PCIe is very fun for hacking –
even though it might seem intimidating –
00:06:45.430 --> 00:06:48.870
because it’s bus mastering,
that means you can DMA to memory.
00:06:48.870 --> 00:06:52.759
It’s complicated, and complicated things
are hard to implement properly.
00:06:52.759 --> 00:06:58.330
It’s robust. People think that PCIe is this
voodoo-highspeed… No it’s not!
00:06:58.330 --> 00:07:00.610
It’s high-speed, but you don’t need
matched traces to make it work.
00:07:00.610 --> 00:07:05.440
It will run over wet string. You can hotwire
PCIe with pieces of wire and it will work.
00:07:05.440 --> 00:07:09.330
At least at short distances anyway.
Believe me, it’s not as bad as you think.
00:07:09.330 --> 00:07:13.310
It’s delay-tolerant, so you
can take your time to reply.
00:07:13.310 --> 00:07:16.550
And the drivers are full of fail because
nobody writes a PCIe driver assuming
00:07:16.550 --> 00:07:19.520
the device is evil even though of course
everybody should because devices can
00:07:19.520 --> 00:07:22.620
and will be evil.
But nobody does that.
00:07:22.620 --> 00:07:25.680
So, what can we do?
Well, we have a PCIe link,
00:07:25.680 --> 00:07:30.740
let’s cut the lines and plug in the
southbridge to a PC motherboard
00:07:30.740 --> 00:07:34.460
that we stick on the side. Now
the southbridge is a PCIe card for us.
00:07:34.460 --> 00:07:38.479
And we connect the APU to an FPGA
board which then can pretend to be
00:07:38.479 --> 00:07:43.130
a PCIe device. So we can man-in-the-middle
this PCIe bus and it’s now x1 width
00:07:43.130 --> 00:07:47.110
instead of x4 because it’s easier that
way, but it will negotiate, that’s fine.
00:07:47.110 --> 00:07:50.520
So how do we connect that
motherboard and the FPGA?
00:07:50.520 --> 00:07:53.669
There’s of course many ways of doing this.
How many of you have done
00:07:53.669 --> 00:07:57.550
any hardware hacking, even Arduino or
anything like that? Raise your hand!
00:07:57.550 --> 00:08:02.310
I think that’s about a third to a half
or something like that, at least.
00:08:02.310 --> 00:08:04.750
When you hack some hardware,
you build some hardware,
00:08:04.750 --> 00:08:10.100
after you blink an LED, what is the first
interface you use to talk to your hardware?
00:08:10.100 --> 00:08:14.880
Serial port! So we run
PCIe over RS232 at 115 kBaud
00:08:14.880 --> 00:08:16.490
which makes this PCIe…
laughter and applause
00:08:21.500 --> 00:08:27.710
I said it was delay-tolerant!
So it makes this PCIe 0.00002x.
00:08:27.710 --> 00:08:30.199
And eventually there was a
Gigabit ethernet port on the FPGA
00:08:30.199 --> 00:08:35.000
so I upgraded to that, but I only got
around to doing it in one direction.
00:08:35.000 --> 00:08:39.019
So now it’s PCIe 0.00002x in one direction
and 0.5x in the other direction
00:08:39.019 --> 00:08:42.099
which has to make this one of the most
asymmetric buses in the world.
00:08:43.489 --> 00:08:45.870
But it works, believe me.
This is hilarious.
00:08:45.870 --> 00:08:50.920
We can run PCIe over serial out. Also, we
were ASCII encoding, so half the bandwidth.
00:08:50.920 --> 00:08:52.940
It works fine. It’s fine.
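To put rough numbers on those multipliers, here is a minimal sketch of the arithmetic. It assumes a Gen1 x1 link with 2 Gbit/s of payload bandwidth, 8N1 serial framing, and two ASCII characters per payload byte; none of those constants is stated in the talk, so treat them as guesses that happen to land near the quoted figures.

```python
# Back-of-the-envelope link speeds. Every constant here is an assumption
# made for illustration; none of these figures is quoted in the talk.
PCIE_X1_PAYLOAD_BPS = 2.0e9      # Gen1 x1: 2.5 GT/s, 8b/10b encoded
GIGABIT_ETHERNET_BPS = 1.0e9
SERIAL_BAUD = 115_200
SERIAL_FRAMING = 8 / 10          # 8N1: 8 data bits per 10 bits on the wire
ASCII_ENCODING = 1 / 2           # two ASCII characters per payload byte


def serial_fraction() -> float:
    """Fraction of PCIe x1 payload bandwidth that the serial tunnel provides."""
    payload_bps = SERIAL_BAUD * SERIAL_FRAMING * ASCII_ENCODING
    return payload_bps / PCIE_X1_PAYLOAD_BPS


def ethernet_fraction() -> float:
    """Fraction of PCIe x1 payload bandwidth that Gigabit Ethernet provides."""
    return GIGABIT_ETHERNET_BPS / PCIE_X1_PAYLOAD_BPS


if __name__ == "__main__":
    print(f"serial:   {serial_fraction():.5f}x")
    print(f"ethernet: {ethernet_fraction():.1f}x")
```

Under these assumptions the serial direction comes out at roughly two hundred-thousandths of a lane and the Ethernet direction at half a lane.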
00:08:52.940 --> 00:08:56.550
So, PCIe 101.
It’s a reliable packet-switched network.
00:08:56.550 --> 00:08:59.270
It uses a thing called
“Transaction Layer Packets”
00:08:59.270 --> 00:09:03.440
which are basically just packets you send.
It can be… Memory Read, Memory Write,
00:09:03.440 --> 00:09:06.140
IO Read, IO Write,
Configuration Read, Configuration Write.
00:09:06.140 --> 00:09:09.600
There can be a message-signaled interrupt
which is a way of saying: “Hey,
00:09:09.600 --> 00:09:13.470
listen to me!” by writing
to an address in memory.
00:09:13.470 --> 00:09:16.010
Because we can write things anyway,
so why not write for interrupts?
00:09:16.010 --> 00:09:20.320
It has legacy interrupts
which are basically emulating the old
00:09:20.320 --> 00:09:24.430
wire-low-for-interrupt-and-
high-for-no-interrupt thing,
00:09:24.430 --> 00:09:25.750
you can tunnel that over PCIe.
00:09:25.750 --> 00:09:29.380
And it has completions, which are
basically the replies. So if you read
00:09:29.380 --> 00:09:31.930
a value from memory the completion
is what you get back with the value
00:09:31.930 --> 00:09:36.040
you tried to read. So that’s PCIe,
we can just go wild with DMA.
00:09:36.040 --> 00:09:39.250
We can just read all memory, dump
the kernel. Hey, it’s awesome, right?
00:09:39.250 --> 00:09:41.470
Except there’s an IOMMU in the APU.
00:09:41.470 --> 00:09:46.180
But... first, the IOMMU will protect
against the devices. It will only let you access
00:09:46.180 --> 00:09:50.430
whatever memory is mapped to your device.
So the host has to allow you
00:09:50.430 --> 00:09:53.070
to read and write to memory.
But just because there’s an IOMMU
00:09:53.070 --> 00:09:58.190
doesn’t mean that Sony uses it properly.
Here’s some pseudo-code,
00:09:58.190 --> 00:10:01.390
it has a buffer on the stack, it says:
“please read from flash to this buffer”
00:10:01.390 --> 00:10:04.810
with the correct length. Can anyone
see the problem with this code?
00:10:04.810 --> 00:10:09.290
Well, it maps the buffer and it
reads and it unmaps the buffer.
00:10:09.290 --> 00:10:13.100
But IOMMUs don’t just map
byte “foo” to byte “bar”,
00:10:13.100 --> 00:10:16.570
they map pages, and
pages are 64k on the PS4.
00:10:16.570 --> 00:10:19.910
So Sony has just mapped 64k
of its stack to the device so
00:10:19.910 --> 00:10:25.720
it can just DMA straight into the stack,
basically the whole stack, and take over.
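The page-granularity problem can be sketched in a few lines. This is only an illustration of the rounding, using the 64k page size mentioned above; the buffer address and length are hypothetical, and the function name is made up for this example.

```python
from typing import Tuple

PAGE_SIZE = 64 * 1024  # page size quoted in the talk


def exposed_range(buf_addr: int, buf_len: int,
                  page_size: int = PAGE_SIZE) -> Tuple[int, int]:
    """Return the [start, end) range a device can reach once buf is mapped.

    An IOMMU maps whole pages, so the mapping is widened to page boundaries
    on both sides of the buffer.
    """
    mask = ~(page_size - 1)
    start = buf_addr & mask
    end = (buf_addr + buf_len + page_size - 1) & mask
    return start, end


if __name__ == "__main__":
    # Hypothetical 512-byte buffer somewhere on a kernel stack.
    start, end = exposed_range(0x7FFF1234, 512)
    print(f"device can touch {end - start} bytes: {start:#x}..{end:#x}")
```

Even though the driver only asked for 512 bytes, the device ends up with access to the full 64k page around the buffer, which is where the rest of the stack lives.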
00:10:25.720 --> 00:10:29.660
Now we got code execution, FreeBSD
kernel dump, and WebKit and OS libs dump,
00:10:29.660 --> 00:10:32.500
just from mapping the flash.
00:10:32.500 --> 00:10:36.080
Okay, that’s step zero.
We have the code.
00:10:36.080 --> 00:10:39.930
But that’s not the PS4 that we did this
on, it was a giant mess of wires.
00:10:39.930 --> 00:10:43.019
Someone here knows about that,
you know, flying over on Facebook.
00:10:43.019 --> 00:10:46.480
We don’t make a ‘nice’ exploit.
We’ve done that because, as I said,
00:10:46.480 --> 00:10:50.089
WebKit, FreeBSD, whatever.
What comes after that?
00:10:50.089 --> 00:10:55.010
We want to do something.
Of course we want to run Linux, duh!
00:10:55.010 --> 00:10:58.590
How do you go from FreeBSD to Linux?
It’s not a trivial process.
00:10:58.590 --> 00:11:02.660
But you use something
that we call “ps4-kexec”.
00:11:02.660 --> 00:11:06.640
So how does this work? It’s simple,
right? You just want to run Linux?
00:11:06.640 --> 00:11:10.190
Just ‘jmp’ to Linux, right?
Well… kind of.
00:11:10.190 --> 00:11:13.180
You need to load Linux into contiguous
physical RAM, set up boot parameters,
00:11:13.180 --> 00:11:16.700
shut down FreeBSD cleanly, halt secondary
CPUs, make new pagetables etc.
00:11:16.700 --> 00:11:19.540
A lot of random things. I’m not going to
bore you with this crap because you
00:11:19.540 --> 00:11:23.459
can read the code. But there’s a lot
of iteration in getting this to work.
00:11:23.459 --> 00:11:26.930
Let’s assume that you do all this magical
cleanup and you get Linux into
00:11:26.930 --> 00:11:32.850
a nice state and you can ‘jmp’ to Linux.
Now we jmp to Linux, right? It’s cool.
00:11:32.850 --> 00:11:35.440
Yeah, you can technically jmp to Linux,
and it will technically run
00:11:35.440 --> 00:11:41.370
…for a little bit. And it will stop.
00:11:41.370 --> 00:11:45.290
And you will not get any serial or any
video or anything. What’s going on here?
00:11:45.290 --> 00:11:49.430
Let’s talk about hardware.
What is x86?
00:11:49.430 --> 00:11:53.050
x86 is a mediocre instruction set
architecture by Intel.
00:11:53.050 --> 00:11:56.190
It’s okay, I guess.
It’s not great.
00:11:56.190 --> 00:12:00.250
PS4 is definitely x86, it’s x86-64.
00:12:00.250 --> 00:12:03.580
What is a PC? Aah!
PC is a horrible, horrible thing
00:12:03.580 --> 00:12:07.220
built upon piles and piles of legacy crap
dating back to 1981.
00:12:07.220 --> 00:12:10.310
The PS4 is definitely -not- a PC.
00:12:10.310 --> 00:12:15.190
That’s practically Sony-level hardware fail,
so it could be, but it’s not.
00:12:15.190 --> 00:12:19.480
So what’s going on? A legacy PC
00:12:19.480 --> 00:12:22.660
basically has an 8259 Programmable
Interrupt Controller,
00:12:22.660 --> 00:12:27.360
an 8253 Programmable Interval Timer,
a UART at I/O 3f8h,
00:12:27.360 --> 00:12:29.399
which is the standard address
for a serial port.
00:12:29.399 --> 00:12:33.709
It has a PS/2 keyboard controller, 8042.
It has an RTC, a real-time clock
00:12:33.709 --> 00:12:35.510
with a CMOS, and everyone
knows the CMOS, right?
00:12:35.510 --> 00:12:40.240
MC146818 is the chip number for that. An
ISA bus – even if you think you don’t have
00:12:40.240 --> 00:12:43.010
an ISA bus your computer has an ISA bus
inside the southbridge somewhere.
00:12:43.010 --> 00:12:48.019
And it has VGA.
The PS4 doesn’t have -any- of these things.
00:12:48.019 --> 00:12:51.880
So what do we do?
Let’s look a little bit at how a PC works
00:12:51.880 --> 00:12:55.760
and how a PS4 works. This is a general
simple PC system. There’s an APU
00:12:55.760 --> 00:13:00.170
or an Intel Core CPU with a southbridge,
Intel calls it PCH, AMD FCH.
00:13:00.170 --> 00:13:03.750
There’s an interface that is basically
PCIe although Intel calls it DMI and AMD
00:13:03.750 --> 00:13:08.270
calls it UMI. DDR3 RAM and a bunch
of peripherals and SATA, whatever.
00:13:08.270 --> 00:13:12.120
The PS4 kind of looks like that, right?
So you think this can’t be that dif…
00:13:12.120 --> 00:13:15.810
What’s so hard about this?
Because all the crap I mentioned earlier
00:13:15.810 --> 00:13:20.410
is in the southbridge on a PC, right?
The PS4 has a southbridge, right?
00:13:20.410 --> 00:13:23.870
Right? Right? Umm… so
the southbridge, the AMD standard FCH
00:13:23.870 --> 00:13:27.959
implements Intel legacy from 1981.
The Marvell Aeolia
00:13:27.959 --> 00:13:31.030
– Marvell is the maker of the PS4
southbridge – implements Intel legacy
00:13:31.030 --> 00:13:35.550
from 2002. What does that mean?
Ah! That’s no southbridge,
00:13:35.550 --> 00:13:40.300
that’s a Marvell Armada SoC!
So it’s not actually a southbridge,
00:13:40.300 --> 00:13:43.760
it was never a southbridge.
It’s an ARM system-on-a-chip CPU
00:13:43.760 --> 00:13:47.120
with everything. It’s a descendant
of Intel StrongARM or XScale.
00:13:47.120 --> 00:13:49.120
It has a bunch of peripherals.
And what they did is, they stuck
00:13:49.120 --> 00:13:53.240
a PCIe bridge on the side and said: “Hey
x86, you can now use all my ARM shit.”
00:13:53.240 --> 00:13:56.270
So it exposes all of its ARM peripherals
to the x86. They added some stuff
00:13:56.270 --> 00:13:59.100
they really needed for PCs
and it has its own RAM.
00:13:59.100 --> 00:14:03.720
Why do they do this? Well, it also runs
FreeBSD on the ARM in standby mode.
00:14:03.720 --> 00:14:06.019
And that’s how they do the whole
“download updates in the background,
00:14:06.019 --> 00:14:08.760
get content, update, whatever”.
All that crap is because they have
00:14:08.760 --> 00:14:12.851
a separate OS on a separate chip running
in standby mode. Okay, that’s great, but
00:14:12.851 --> 00:14:17.860
it’s also batshit insane.
laughter
00:14:17.860 --> 00:14:21.540
Quick recap: This is what a
PCIe bus number looks like,
00:14:21.540 --> 00:14:24.459
sorry, a device number.
It has a bus number, which is 8 bits,
00:14:24.459 --> 00:14:27.980
a device number, which is 5 bits,
and a function number, which is 3 bits.
00:14:27.980 --> 00:14:31.339
You’ve probably seen this in lspci
if you’ve ever done that.
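As a small sketch of that 8/5/3-bit layout, the following packs and unpacks a bus/device/function triple into 16 bits and prints it in the `bus:device.function` form that lspci uses. The helper names are made up for this example.

```python
from typing import Tuple


def pack_bdf(bus: int, device: int, function: int) -> int:
    """Pack bus (8 bits), device (5 bits), function (3 bits) into 16 bits."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    return (bus << 8) | (device << 3) | function


def unpack_bdf(bdf: int) -> Tuple[int, int, int]:
    """Split a 16-bit identifier back into (bus, device, function)."""
    return (bdf >> 8) & 0xFF, (bdf >> 3) & 0x1F, bdf & 0x7


def format_bdf(bdf: int) -> str:
    """Render in the bus:device.function notation used by lspci."""
    bus, device, function = unpack_bdf(bdf)
    return f"{bus:02x}:{device:02x}.{function:x}"


if __name__ == "__main__":
    print(format_bdf(pack_bdf(0x00, 0x14, 4)))
```

The 3-bit function field is why a single device can expose at most eight functions.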
00:14:31.339 --> 00:14:34.480
This is what a regular southbridge
looks like. It has a USB controller,
00:14:34.480 --> 00:14:38.180
a PCI, ISA bridge, SATA, whatever.
And it has a bunch of devices.
00:14:38.180 --> 00:14:41.110
So one southbridge pretends
to be multiple devices.
00:14:41.110 --> 00:14:43.769
Because you only have three bits
for a function number so you can only have
00:14:43.769 --> 00:14:47.200
up to eight functions in one device.
00:14:47.200 --> 00:14:48.860
An Intel southbridge just says:
“I’m device 14, 16, 1a, 1…,
00:14:48.860 --> 00:14:51.860
I’m just a bunch of devices,
and you can talk to all of them.”
00:14:51.860 --> 00:14:57.670
If you lspci on a roughly unpatched
Linux kernel on the PS4
00:14:57.670 --> 00:15:00.649
you get something like this.
So the Aeolia first of all
00:15:00.649 --> 00:15:03.740
clones itself into every PCIe device
because they were too lazy to do
00:15:03.740 --> 00:15:08.110
“if device equals my number then
reply, otherwise don’t reply”. No,
00:15:08.110 --> 00:15:11.470
they just said: “Oh, just reply to every
single PCIe device that might query”.
00:15:11.470 --> 00:15:16.870
Linux sees the southbridge 31 different
times, which is kind of annoying
00:15:16.870 --> 00:15:20.380
because it gets really confused when it
sees 31 clones of the same southbridge.
00:15:20.380 --> 00:15:24.540
And then it has eight functions:
ACPI, ethernet, SATA, SDMC, PCIe,…
00:15:24.540 --> 00:15:27.839
Eight functions, so all three bits.
00:15:27.839 --> 00:15:29.790
Turns out, eight functions
are not enough for everybody.
00:15:29.790 --> 00:15:34.490
Function no. 4, “PCI Express Glue”, has a
bridge config, MSI interrupt controller,
00:15:34.490 --> 00:15:37.410
ICC – we’ll talk about that later –,
HPET timers, Flash controller,
00:15:37.410 --> 00:15:44.920
RTC, timers, 2 serial ports, I2C… All
this smashed into one single PCIe device.
00:15:44.920 --> 00:15:49.210
Linux has a minimum system requirement
to run on anything.
00:15:49.210 --> 00:15:53.520
You need a timer, you need interrupts,
and you need some kind of console.
00:15:53.520 --> 00:15:57.010
The PS4 has no PIT, no PIC and no standard
serial so none of the standard PC stuff
00:15:57.010 --> 00:16:01.639
is going to work here. The board has
test points for an 8250 standard serial
00:16:01.639 --> 00:16:05.529
in a different place. So we run
dmesg over that, okay, fine.
00:16:05.529 --> 00:16:08.300
Linux has earlycon which we can
point to a serial port and say:
00:16:08.300 --> 00:16:11.221
“Please send all your dmesg here
very early because I really want to see
00:16:11.221 --> 00:16:16.030
what’s going on”. Doesn’t need IRQs,
you set console=uart8250,
00:16:16.030 --> 00:16:20.420
the type, the address, the speed.
And you’ll see it says 3200 instead of
00:16:20.420 --> 00:16:23.420
115 kBaud. That’s because their clock
is different. So you set 3200 but
00:16:23.420 --> 00:16:27.540
it really means 115k.
And that gets you dmesg.
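The odd baud value follows from how a 16550-style UART divides its input clock. This is a sketch of that arithmetic, assuming the kernel computes the divisor for the classic 1.8432 MHz PC UART clock; the implied PS4 clock below is derived from that assumption and is not a figure given in the talk.

```python
STANDARD_UART_CLOCK_HZ = 1_843_200  # classic PC UART input clock (assumed)


def divisor(clock_hz: int, baud: int) -> int:
    """Divisor a 16550-style UART needs for a given baud rate."""
    return round(clock_hz / (16 * baud))


def actual_baud(clock_hz: int, div: int) -> float:
    """Line rate actually produced by a divisor on a given input clock."""
    return clock_hz / (16 * div)


def implied_clock_hz(requested_baud: int, observed_baud: int) -> int:
    """Input clock that turns the 'requested' setting into the observed rate."""
    div = divisor(STANDARD_UART_CLOCK_HZ, requested_baud)
    return observed_baud * 16 * div


if __name__ == "__main__":
    div = divisor(STANDARD_UART_CLOCK_HZ, 3200)
    clock = implied_clock_hz(3200, 115_200)
    print(f"divisor {div}, implied clock {clock} Hz")
    print(f"line rate on that clock: {actual_baud(clock, div):.0f} baud")
```

In other words, asking for 3200 on a faster-than-standard clock programs a divisor that happens to produce 115200 on the wire.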
00:16:27.540 --> 00:16:29.710
That actually gets you “Linux booting,
uncompressing”, whatever.
00:16:29.710 --> 00:16:32.400
That’s pretty good.
00:16:32.400 --> 00:16:36.540
Okay, we need a timer.
Because otherwise everything explodes.
00:16:36.540 --> 00:16:40.360
Linux supports the TSC, a built-in CPU
timer which is super nice and super fun.
00:16:40.360 --> 00:16:44.420
And PS4 has that. But Linux tries to
calibrate it against the legacy timer
00:16:44.420 --> 00:16:47.430
which on the PS4 doesn’t exist
so that’s fail.
00:16:47.430 --> 00:16:52.149
So again, the PS4 -really- is not a PC.
00:16:52.149 --> 00:16:54.270
What we need to do here is
define a new subarchitecture
00:16:54.270 --> 00:16:58.519
because Linux supports this concept.
It says: “this is not a PC, this is a PS4”.
00:16:58.519 --> 00:17:01.290
The bootloader tells Linux:
“Hey! This is a PS4!”
00:17:01.290 --> 00:17:04.010
And then Linux says: “Okay, I’m not gonna
do the old timestamp calibration,
00:17:04.010 --> 00:17:07.829
I’m gonna do it for the PS4” which has
a special code that we wrote
00:17:07.829 --> 00:17:11.339
that calibrates against the PS4 timer.
And it disables the legacy crap.
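Calibrating the TSC against another timer boils down to counting both over the same interval. The sketch below shows only that ratio; the tick counts and the reference frequency are hypothetical numbers, and this is not the actual PS4 calibration code.

```python
def tsc_frequency_hz(tsc_delta: int, ref_delta: int, ref_hz: int) -> int:
    """Estimate the TSC frequency from two counters sampled over one interval.

    tsc_delta: TSC ticks elapsed during the interval
    ref_delta: reference-timer ticks elapsed during the same interval
    ref_hz:    known frequency of the reference timer
    """
    if ref_delta <= 0:
        raise ValueError("reference timer did not advance")
    return tsc_delta * ref_hz // ref_delta


if __name__ == "__main__":
    # Hypothetical sample: 1 ms measured on a 1 MHz reference timer.
    print(tsc_frequency_hz(tsc_delta=1_600_000, ref_delta=1_000, ref_hz=1_000_000))
```

On a PC the reference would be the legacy timer; here it has to be whatever timer the platform actually has.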
00:17:11.339 --> 00:17:13.790
So now this is officially
not a PC anymore.
00:17:13.790 --> 00:17:18.539
Now we can talk about ACPI.
00:17:18.539 --> 00:17:21.479
You might know ACPI for all its
horribleness and all its evilness
00:17:21.479 --> 00:17:25.059
and all its Microsoft-y-ness.
ACPI - most people associate it with
00:17:25.059 --> 00:17:28.069
“Suspend” and “Suspend to Hibernate”.
It’s not just power,
00:17:28.069 --> 00:17:31.940
it does other stuff, too.
So we need ACPI for PCI config,
00:17:31.940 --> 00:17:34.139
for the IOMMU, for the CPU frequency.
00:17:34.139 --> 00:17:38.389
The PS4 of course has broken ACPI tables
because of course it does.
00:17:38.389 --> 00:17:42.190
So we fixed them in ps4-kexec.
00:17:42.190 --> 00:17:44.789
Now interrupts. We have timers,
we have serial, we fixed some stuff.
00:17:44.789 --> 00:17:48.619
The PS4 does message-signaled interrupts
which is, what I said, the non-legacy,
00:17:48.619 --> 00:17:51.490
the nice new thing where you just write
a value, and what you do is you tell
00:17:51.490 --> 00:17:55.129
the device: “when you want to interrupt,
please write this value to this address”.
00:17:55.129 --> 00:17:58.450
The device does that, and the CPU
interrupt controller sees that write
00:17:58.450 --> 00:18:01.049
and says: “Oh, this is an interrupt”
and then just fires off that interrupt
00:18:01.049 --> 00:18:06.490
into the CPU. That’s great.
It’s super fast and very efficient.
00:18:06.490 --> 00:18:08.739
And the value directly tells the CPU:
“That’s the interrupt vector you have
00:18:08.739 --> 00:18:14.460
to go to”. Okay, that’s the standard MSI
way there. Your computer does MSI that way.
00:18:14.460 --> 00:18:19.700
This is how the PS4 does MSI: The Aeolia
ignores the MSI config registers
00:18:19.700 --> 00:18:24.419
in the standard location. Instead it
has its own MSI controller,
00:18:24.419 --> 00:18:28.279
all stuff that’s in Function 4,
which is that “glue” device.
00:18:28.279 --> 00:18:32.460
Each function gets a shared address in
memory to write to and the top 27 bits
00:18:32.460 --> 00:18:36.119
of data. And every subfunction, because
there are a lot of things in one place,
00:18:36.119 --> 00:18:40.309
only gets its own 5 bits.
And all MSIs originate from Function 4,
00:18:40.309 --> 00:18:43.399
so this device has to fire an interrupt,
then it goes to here, and then
00:18:43.399 --> 00:18:48.700
that device fires an interrupt. Like… what…
this is all… what the hell is going on?
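The 27-plus-5-bit split can be sketched like this. Only the bit widths come from the description above; the assumption that the 5 per-subfunction bits are the low bits, and all the names, are for illustration and do not describe the real register layout.

```python
from typing import Tuple

SUBFUNCTION_BITS = 5   # per-subfunction part of the MSI data
SHARED_BITS = 27       # part shared by everything inside one function


def msi_data(shared_top: int, subfunction: int) -> int:
    """Compose the 32-bit MSI data word for one subfunction."""
    if not 0 <= shared_top < (1 << SHARED_BITS):
        raise ValueError("shared part must fit in 27 bits")
    if not 0 <= subfunction < (1 << SUBFUNCTION_BITS):
        raise ValueError("subfunction must fit in 5 bits")
    return (shared_top << SUBFUNCTION_BITS) | subfunction


def split_msi_data(data: int) -> Tuple[int, int]:
    """Split an MSI data word into (shared_top, subfunction)."""
    return data >> SUBFUNCTION_BITS, data & ((1 << SUBFUNCTION_BITS) - 1)


if __name__ == "__main__":
    # Two subfunctions of the same function differ only in the low 5 bits.
    a, b = msi_data(0x1, 3), msi_data(0x1, 4)
    print(f"{a:#x} {b:#x} differ in {a ^ b:#x}")
```

Because only those 5 bits vary, every subfunction in a function is stuck with interrupt data that shares the same upper bits.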
00:18:48.700 --> 00:18:53.769
Seriously, this is really fucked up. And
– the i’s are missing in the front there.
00:18:53.769 --> 00:18:59.299
But yeah. So, driver hell. Now the devices
are interdependent. Then the IRQ vector
00:18:59.299 --> 00:19:02.831
location is not sequential, so that’s not
gonna work. And you need to modify
00:19:02.831 --> 00:19:07.590
all the drivers. This is really painful to
develop for. So what we ended up doing
00:19:07.590 --> 00:19:11.950
is there is a core driver that implements
an interrupt controller for this thing.
00:19:11.950 --> 00:19:15.779
And then we have to make sure that loads
first, before the device driver. So Linux
00:19:15.779 --> 00:19:19.399
has a mechanism for that. And we had to
patch the drivers. Some drivers we patched
00:19:19.399 --> 00:19:22.820
to use these interrupts. And others
we wrapped around to use these interrupts.
00:19:22.820 --> 00:19:26.350
Unfortunately, because of the top bit
thing, everything has to share one interrupt
00:19:26.350 --> 00:19:31.279
within a function. Thankfully, we can fix
that with the IOMMU because it can
00:19:31.279 --> 00:19:34.320
redirect interrupts. So we can say:
“Oh, interrupt no. 0 goes to here,
00:19:34.320 --> 00:19:39.209
1 goes to here, 2 goes to here…”.
That’s great ’cause it’s consecutive, right?
00:19:39.209 --> 00:19:45.490
0 1 2 3 4 5… it’s obviously gonna have
the same top bits. But we have to fix
00:19:45.495 --> 00:19:49.152
the ACPI table for that because it’s
broken. But this does work. So this
00:19:49.152 --> 00:19:54.109
gets us interrupts that function and
they’re individual. So let’s look at
00:19:54.109 --> 00:19:58.220
the check list: we have interrupts, timers,
early serial, late serial with interrupts.
00:19:58.220 --> 00:20:03.169
We can get some user space, we can stash
some user space and binaries into the kernel.
00:20:03.169 --> 00:20:06.060
And it will boot and you can get a console,
but you get a console and you try
00:20:06.060 --> 00:20:12.880
writing commands and sometimes it hangs.
Okay. What’s going on there?
00:20:12.880 --> 00:20:16.700
So it turns out that FreeBSD masks
interrupts with an AMD proprietary
00:20:16.700 --> 00:20:21.149
register set. We had to clean that up,
too. And that fixes serial,
00:20:21.149 --> 00:20:24.729
and all the other interrupts.
This took ages to find. It’s like: “why…
00:20:24.729 --> 00:20:26.909
interrupts on CPU serial
sometimes don’t…, yeah”.
00:20:26.909 --> 00:20:33.789
I ended up dumping register sets,
and I saw this #FFFFF here, not #FFFFF,
00:20:33.789 --> 00:20:39.350
what’s that? But tracking through this
stack to find this was really annoying.
00:20:39.350 --> 00:20:45.780
Alright. So we have the basics. We have
like a core platform we can run Linux on,
00:20:45.780 --> 00:20:49.500
even though it won’t do anything
interesting. Add drivers!
00:20:49.500 --> 00:20:54.450
So we have USB xHCI which has three
controllers in one device. Again, because
00:20:54.450 --> 00:20:59.899
“Let’s make it insane!”. We have SDHCI,
that’s SDIO for the Wi-Fi and the Bluetooth.
00:20:59.899 --> 00:21:03.509
Needs a non-standard config, it needs
quirks. Ethernet needs more hacks.
00:21:03.509 --> 00:21:07.139
It’s still partially broken, it only runs at
Gigabit speed. If you plug in a 100Mbit/s
00:21:07.139 --> 00:21:10.320
switch it just doesn’t send any data.
Not sure why.
00:21:10.320 --> 00:21:13.809
And then all of this worked fine in
Linux 4.4, and then just three days ago
00:21:13.809 --> 00:21:18.190
I think I tried to rebase on 4.9, and so
we have the latest and the greatest.
00:21:18.190 --> 00:21:21.249
And everything failed. And DMA didn’t
work. And all the drivers were just
00:21:21.249 --> 00:21:24.200
throwing their hands up in the air,
“what’s going on here?”.
00:21:24.200 --> 00:21:27.279
exhales
Aeolia strikes back. So.
00:21:27.279 --> 00:21:32.549
That’s what… the Aeolia looks like,
normally. So you have… again,
00:21:32.549 --> 00:21:36.690
it’s an ARM SoC, it’s really not a device.
It’s like its own little system. But
00:21:36.690 --> 00:21:40.750
it maps the low 2 GB of its address space
to memory on the PC. And then the PC
00:21:40.750 --> 00:21:45.080
has a window into its registers that it
can use to control those devices.
00:21:45.080 --> 00:21:48.429
So the PC can kind of play with the
devices, and the DMA is to the same address
00:21:48.429 --> 00:21:53.149
and that works great. Because it’s mapped
in the same place. And then it has its own RAM,
00:21:53.149 --> 00:21:58.580
in its own address space. This works fine.
But now we had an IOMMU. Because
00:21:58.580 --> 00:22:01.869
we needed it for the interrupts. And the
IOMMU inserts its own address space
00:22:01.869 --> 00:22:05.190
in between and says: “Okay, you can map
anything to anything you want, that’s great.“
00:22:05.190 --> 00:22:08.320
It’s a page table, you can say “this
address goes to that address.”
00:22:08.320 --> 00:22:13.099
Linux 4.4 did this: it would find some
addresses at the bottom of the IOMMU
00:22:13.099 --> 00:22:17.659
address space, say: “page 1 goes to this,
page 2 goes to that, page 3 goes to that”.
00:22:17.659 --> 00:22:22.870
And say: “device, you can now write to these
pages”. And they go to this place in the x86.
00:22:22.870 --> 00:22:28.200
That worked fine. It turns out Linux 4.9,
or somewhere between 4.4 and 4.9
00:22:28.200 --> 00:22:32.549
it started doing this: it would map pages
from the top of the IOMMU address space
00:22:32.549 --> 00:22:36.749
and that’s fine for the IOMMU but it’s
not in the window in the Aeolia, so
00:22:36.749 --> 00:22:42.140
you say “ethernet DMA to address
FExxx”, and instead of DMA-ing
00:22:42.140 --> 00:22:49.830
to the RAM on the PC it DMA-s to the RAM
on the Aeolia which is not gonna work.
00:22:49.830 --> 00:22:53.980
Effectively the Aeolia implements 31 bit
DMA, not 32 bit DMA because only
00:22:53.980 --> 00:23:00.009
the bottom half is usable. It’s like why…
this is all really fucked up, guys!
00:23:00.009 --> 00:23:03.799
Seriously. And this is littered all over
the code in Linux, so that needed
00:23:03.799 --> 00:23:07.409
more patches, and it works, but, yeah.
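To make the 31-bit limit concrete, here is a minimal Python sketch. The 2 GB window comes from the talk; the function name and the two sample addresses are made up for illustration, and this is not code from the actual kernel patches.

```python
# Sketch only: models which bus addresses the Aeolia forwards to PC memory.
# The 2 GB window is from the talk; names and sample addresses are hypothetical.
WINDOW_SIZE = 2 * 1024**3          # low 2 GB of the Aeolia address space
DMA_MASK_31 = (1 << 31) - 1        # highest address still inside that window


def lands_in_pc_ram(bus_addr: int) -> bool:
    """True if a DMA to bus_addr goes through the window to PC memory."""
    return 0 <= bus_addr < WINDOW_SIZE


low_mapping = 0x0010_0000   # near the bottom of the IOMMU space (4.4 behaviour)
high_mapping = 0xFE00_0000  # near the top of the IOMMU space (4.9 behaviour)

print(lands_in_pc_ram(low_mapping))
print(lands_in_pc_ram(high_mapping))
```

Under these assumptions a driver would have to restrict its DMA addresses to `DMA_MASK_31` so that every mapping stays inside the window.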
00:23:07.409 --> 00:23:11.029
Painful. Okay. Devices, laying out (?)
devices’ work.
00:23:11.029 --> 00:23:16.259
Now for something completely different.
Who can tell me who this character is?
00:23:16.259 --> 00:23:20.659
That’s Starsha from Space Battleship Yamato.
And apparently that’s the code name
00:23:20.659 --> 00:23:24.840
for the PS4 graphics chip. Or at least that’s
one of the code names. Because
00:23:24.840 --> 00:23:27.940
they don’t seem to be able to agree
on like what the code names are.
00:23:27.940 --> 00:23:31.860
It’s got “Liverpool” in some places, and
“Starsha” in other places. Then “ThebeJ”
00:23:31.860 --> 00:23:36.210
in other places. And we think Sony calls
it “Starsha” and AMD calls it “Liverpool”
00:23:36.210 --> 00:23:39.789
but we’re not sure. We are calling it
“Liverpool” everywhere just to avoid
00:23:39.789 --> 00:23:43.660
confusion. Okay.
What’s this GPU about?
00:23:43.660 --> 00:23:47.230
Well, it’s an AMD Sea
Islands generation GPU,
00:23:47.230 --> 00:23:52.940
which is spelled CI instead of SI because
“S” was taken. It’s similar to other chips
00:23:52.940 --> 00:23:57.969
in the generation. So at least that’s
not a bat shit crazy new thing.
00:23:57.969 --> 00:24:00.950
But it does have quirks and customizations
and oddities and things that don’t work.
00:24:00.950 --> 00:24:03.769
What we did is we took Bonaire which is
another GPU that is already supported
00:24:03.769 --> 00:24:06.919
by Linux in that generation, and just kind
of added a new chip and said, okay,
00:24:06.919 --> 00:24:12.769
do all the Bonaire stuff, and then change
things. And hopefully adapt it to the PS4.
00:24:12.769 --> 00:24:16.440
So hacking AMD drivers, okay, well,
they’re open-source but AMD does not
00:24:16.440 --> 00:24:20.190
publish register docs. They publish 3D
shader and command queue documentation,
00:24:20.190 --> 00:24:24.280
so we get all the user space 3D rendering
commands, that’s documented. But they
00:24:24.280 --> 00:24:27.609
don’t publish all the kernel hardware
register documentation. That’s what
00:24:27.609 --> 00:24:30.740
we really want for hacking on drivers. So
that’s annoying. And you’re thinking
00:24:30.740 --> 00:24:34.389
“the code is the documentation”,
right? “Just read the Linux drivers”.
00:24:34.389 --> 00:24:39.299
That’s great. Yeah, but they’re incomplete,
then they have magic numbers, and
00:24:39.299 --> 00:24:43.229
it’s, you know, you don’t know if you need
to write a new register that’s not there,
00:24:43.229 --> 00:24:47.399
and it really sucks to try to write a GPU
driver by reading other GPU drivers
00:24:47.399 --> 00:24:50.840
with no docs. So what do we do? We’re
hackers, right? We google. Every time
00:24:50.840 --> 00:24:54.480
we need information, hopefully Google will
find it because Google knows everything.
00:24:54.480 --> 00:24:59.109
And any tip that you could find in any
forum or code dumped somewhere is
00:24:59.109 --> 00:25:05.850
great. One of the things we found is we
googled this little string, “R8XXGPU”.
00:25:05.850 --> 00:25:10.730
And we get nine results. And the second
result is this place, it’s “Siliconkit”,
00:25:10.730 --> 00:25:15.629
token, was that okay? It’s an XML file.
And if we look at that it looks like
00:25:15.629 --> 00:25:21.499
it’s an XML file that contains a dump of
the Bonaire GPU register documentation.
00:25:21.499 --> 00:25:26.389
But it’s like broken XML, and it’s
incomplete, it stops at one point.
00:25:26.389 --> 00:25:31.379
But like: “what’s this doing here?”
And where did this come from, right?
00:25:31.379 --> 00:25:35.539
So let’s dig a little deeper. Okay Google,
what do you know about this website?
00:25:35.539 --> 00:25:39.789
Well, there’s some random things like
whatthehellno.txt and whatthehellyes.txt
00:25:39.789 --> 00:25:46.200
and some Excel files. Those are
really Excel-like XML spreadsheets.
00:25:46.200 --> 00:25:50.890
And then there’s a thing in the (?) there
called RAI.GRAMMAR.4.TXT.
00:25:50.890 --> 00:25:56.960
I wonder what that is. And it looks like
it’s a grammar, being a notation description
00:25:56.960 --> 00:26:03.490
for a syntax, of some kind of register
documentation file. This looks like
00:26:03.490 --> 00:26:10.749
an AMD internal format but it’s on this
website. Okay. So we have these two URLs,
00:26:10.749 --> 00:26:14.559
/pragmatic/bonaire.xml
and /RAI/rai.grammar4.txt.
00:26:14.559 --> 00:26:22.199
Let’s try something. How about maybe
/pragmatic/bonaire.rai – nah, it’s a 404.
00:26:22.199 --> 00:26:26.539
Okay, /pragmatic/RAI/bonaire.rai – aah!
Bingo!
00:26:26.539 --> 00:26:34.869
laughter and applause
00:26:34.869 --> 00:26:39.249
So this is a full – almost full Bonaire
register documentation with like
00:26:39.249 --> 00:26:44.350
full register field descriptions, breakdowns,
all the addresses. It’s not 100% but
00:26:44.350 --> 00:26:48.829
like the vast majority. This seems to
be AMD-internal stuff. And I looked
00:26:48.829 --> 00:26:53.469
this guy up, and apparently he worked
at AMD at some point. So…
00:26:53.469 --> 00:26:56.849
But yeah… This is really, really helpful
because now you know what everything
00:26:56.849 --> 00:27:03.249
means, and debug registers, and… yeah.
So I wrote a working parser for this format.
00:27:03.249 --> 00:27:06.559
He was effectively writing an XML parser,
something like convert this thing to XML
00:27:06.559 --> 00:27:10.833
but it was all broken. Oh – he was writing
it in PHP, by the way, so there you go …
00:27:10.833 --> 00:27:14.580
So I wrote a working one in Python and
you can dump it and then you can see
00:27:14.580 --> 00:27:18.309
what each register means, and it’ll tell
you all the options. You can take
00:27:18.309 --> 00:27:22.519
a register dump and map it to the (?)(?)
documented. You can diff dumps,
00:27:22.519 --> 00:27:26.529
you can generate defines, it’s very useful
for AMD GPUs. And this, grossly speaking
00:27:26.529 --> 00:27:31.109
applies to a lot of AMD GPUs, like they
share a lot of registers. So this is useful
00:27:31.109 --> 00:27:36.090
for anyone hacking on AMD GPU stuff. Over
4,000 registers are documented in the …
00:27:36.090 --> 00:27:42.019
just in the main GPU address space alone.
That’s great. Okay. So we have some docs.
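As a rough idea of what "map a register dump to the documentation" and "diff dumps" can look like, here is a minimal Python sketch. The field names and bit positions are invented for illustration; they are not taken from the RAI file or from any AMD documentation, and this is not the parser mentioned in the talk.

```python
# Sketch only: the field layout below is invented for illustration;
# it is not taken from the RAI file or any AMD documentation.
from typing import Dict, Tuple

# field name -> (lowest bit, width in bits)
EXAMPLE_FIELDS: Dict[str, Tuple[int, int]] = {
    "ENABLE": (0, 1),
    "MODE": (1, 3),
    "COUNT": (8, 8),
}


def decode(value: int, fields: Dict[str, Tuple[int, int]]) -> Dict[str, int]:
    """Split a raw 32-bit register value into its named fields."""
    return {
        name: (value >> low) & ((1 << width) - 1)
        for name, (low, width) in fields.items()
    }


def diff(old: int, new: int,
         fields: Dict[str, Tuple[int, int]]) -> Dict[str, Tuple[int, int]]:
    """Return only the fields whose value differs between two dumps."""
    a, b = decode(old, fields), decode(new, fields)
    return {name: (a[name], b[name]) for name in fields if a[name] != b[name]}


print(decode(0x1205, EXAMPLE_FIELDS))
print(diff(0x1205, 0x1305, EXAMPLE_FIELDS))
```

With a table like this per register, a diff of two dumps reports changed fields by name instead of raw hex words.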
00:27:42.019 --> 00:27:49.969
How do we get to a frame buffer? So if you…
Israel (?) is HDMI it’s easy, right? The GPU
00:27:49.969 --> 00:27:52.489
has HDMI, and if you query the GPU
information you actually get that it has
00:27:52.489 --> 00:27:57.860
an HDMI port and a DisplayPort port. Okay,
maybe it’s unconnected, that’s fine, right?
00:27:57.860 --> 00:28:03.509
But if you actually ask the GPU it tells
you: “HDMI is not connected, DP is connected”.
00:28:03.509 --> 00:28:09.919
Okay. Yeah, they have an external HDMI
encoder from DisplayPort to HDMI because
00:28:09.919 --> 00:28:13.029
just putting a wire from A to B is too
difficult, because this is Sony, so:
00:28:13.029 --> 00:28:19.759
“let’s put a chip that converts some
protocol A to protocol B…” sighs
00:28:19.759 --> 00:28:25.700
Yeah, yeah.
applause
00:28:25.700 --> 00:28:33.549
It’s a Panasonic DisplayPort to HDMI
bridge, not documented by the way.
00:28:33.549 --> 00:28:37.429
It needs config to work, that’s why it
doesn’t just work. Even though some bridges do.
00:28:37.429 --> 00:28:41.389
And you’d think, okay, it’s hooked up to the
GPU I2C bus, because GPUs have in the past
00:28:41.389 --> 00:28:45.309
used these bridges, and, not this one
particularly but other AMD cards have had
00:28:45.309 --> 00:28:48.659
various chips that they stuck in front. And
the code has support for talking to them
00:28:48.659 --> 00:28:54.309
through the GPU I2C interface, right?
That’s easy. Yay, you wish – it’s a Sony.
00:28:54.309 --> 00:28:57.909
sighs
Enter ICC! So, remember the ICC thing
00:28:57.909 --> 00:29:02.169
in the Aeolia – it’s an RPC protocol you
use to send commands to an MCU that is
00:29:02.169 --> 00:29:05.549
somewhere else on the motherboard. It’s
a message box system, so you write some
00:29:05.549 --> 00:29:09.519
message to a memory place, and then you
tell: “Hey, read this message!” and then
00:29:09.519 --> 00:29:12.090
it writes some message back, and it tells
you “Hey, it’s the reply!”.
00:29:12.090 --> 00:29:15.019
The Aeolia – not the GPU – uses it for things like
00:29:15.019 --> 00:29:20.989
Power Button, the LEDs, turning the power
on and off, and also the HDMI encoder I2C.
00:29:20.989 --> 00:29:25.460
So now we have the dependency from the
GPU driver to the Aeolia driver, two different
00:29:25.460 --> 00:29:30.200
PCI devices and two different… sighs
Yeah. And okay, again, ICC, but it’s I2C,
00:29:30.200 --> 00:29:34.099
you know, I2C is a simple protocol.
You read a register, you write a register,
00:29:34.099 --> 00:29:38.549
that’s all you need. It’s super simple.
Right? Now let’s make a byte code
00:29:38.549 --> 00:29:41.479
fucking scripting engine to which you send I2C
commands and delays and bit masking
00:29:41.479 --> 00:29:47.029
and everything. And why, Sony, why, like
why would you do this? Well, because
00:29:47.029 --> 00:29:50.769
ICC is so slow that if you actually tried
to do one read and one write at a time
00:29:50.769 --> 00:29:55.500
it takes 2 seconds to bring up HDMI.
exhales
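The batching idea can be sketched in a few lines of Python. Everything here is hypothetical: the opcode names, the tuple layout and the register numbers are invented, since the talk does not describe the real ICC script format. The point is only that one mailbox message can carry several I2C operations.

```python
# Sketch only: opcodes, tuple layout and register numbers are hypothetical;
# the real ICC script format is not described in the talk.
from typing import Dict, List, Tuple

Op = Tuple[str, int, int, int]  # (opcode, register, mask, value)


def run_script(regs: Dict[int, int], script: List[Op]) -> int:
    """Run a whole script on the MCU side; returns the total delay in ms."""
    waited_ms = 0
    for opcode, reg, mask, value in script:
        if opcode == "WRITE":
            regs[reg] = value & 0xFF
        elif opcode == "MASK":          # read-modify-write of selected bits
            regs[reg] = (regs.get(reg, 0) & ~mask & 0xFF) | (value & mask)
        elif opcode == "DELAY":
            waited_ms += value
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return waited_ms


script: List[Op] = [
    ("WRITE", 0x10, 0xFF, 0x80),
    ("DELAY", 0, 0, 5),
    ("MASK", 0x10, 0x0F, 0x03),
]
encoder_regs: Dict[int, int] = {}
delay = run_script(encoder_regs, script)

round_trips_batched = 1   # one mailbox message carries the whole script
round_trips_unbatched = sum(1 for op in script if op[0] != "DELAY")
print(hex(encoder_regs[0x10]), delay, round_trips_batched, round_trips_unbatched)
```

If each mailbox round trip is slow, the cost grows with the number of messages, so packing a long bring-up sequence into one script saves most of that time.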
00:29:55.500 --> 00:29:57.039
Yeah…
00:29:57.039 --> 00:30:01.820
I don’t even know at this point…
applause
00:30:01.820 --> 00:30:04.059
I have no idea.
continued applause
00:30:04.059 --> 00:30:10.499
And by the way this thing has commands
where you can send scripts in a script
00:30:10.499 --> 00:30:13.849
to be run when certain events happen. So
“Yo dawg, I heard you like scripts, I put
00:30:13.849 --> 00:30:16.960
scripts in your scripts so you can I2C
while you I2C”. Like: “let’s just go
00:30:16.960 --> 00:30:23.769
even deeper at this point”, right? Yeah.
exhales
00:30:23.769 --> 00:30:29.009
Okay. We wrote some code for this,
you need more hacks, it needs all
00:30:29.009 --> 00:30:33.599
DisplayPort lanes up, Linux tries to downscale,
doesn’t work. Memory bandwidth calculation
00:30:33.599 --> 00:30:37.289
is broken. Mouse cursor size is from the
previous GPU generation for some reason,
00:30:37.289 --> 00:30:41.750
I guess they forgot to update that. So
with all this crap we get a frame buffer.
00:30:41.750 --> 00:30:47.159
But X won’t start. Ah. Well, it turns out
that PS4 uses a unified memory architecture
00:30:47.159 --> 00:30:52.580
so it has a single memory pool that is
shared between the x86 and the GPU.
00:30:52.580 --> 00:30:56.110
And games just put a texture in memory
and say: “Hey, GPU, render this!” and
00:30:56.110 --> 00:31:00.889
that works great. And this makes a lot of
sense, and their driver uses this to the
00:31:00.889 --> 00:31:06.369
fullest extent. So there’s a VRAM,
you know, the legacy… GPUs had
00:31:06.369 --> 00:31:10.229
a separate VRAM and all these integrated
chip sets can emulate VRAM using a chunk
00:31:10.229 --> 00:31:13.739
of the system memory. And you can usually
configure that in the BIOS if you have
00:31:13.739 --> 00:31:18.729
a PC that does this. And PS4 sets it to
16 MB which is actually the lowest possible
00:31:18.729 --> 00:31:24.659
setting. And 16 Megs is not enough to have
more than one Full HD frame buffer. So,
00:31:24.659 --> 00:31:28.519
obviously, that’s going to explode in
Linux pretty badly. So what we do is
00:31:28.519 --> 00:31:31.749
we actually reconfigure the memory
controller in the system to give 1 GB
00:31:31.749 --> 00:31:36.719
of RAM to the VRAM, and we did it on the
ps4-kexec. So it’s basically doing like
00:31:36.719 --> 00:31:41.519
BIOSy things. We were reconfiguring the
Northbridge at this point to make this work.
00:31:41.519 --> 00:31:46.299
But it works. And with this we can get X
to start because it can allocate its frame buffer.
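A quick back-of-the-envelope check of the VRAM numbers, as a Python sketch. It assumes 4 bytes per pixel and no row padding or alignment, which the talk does not specify; real allocations are usually somewhat larger.

```python
# Sketch only: assumes 4 bytes per pixel and no row padding or alignment.
MIB = 1024 * 1024

frame_bytes = 1920 * 1080 * 4      # one Full HD frame buffer
default_vram = 16 * MIB            # the PS4 setting mentioned in the talk
enlarged_vram = 1024 * MIB         # after reconfiguring to 1 GB

print(frame_bytes)                       # 8294400 bytes, about 7.9 MiB
print(default_vram - 2 * frame_bytes)    # 188416 bytes left after two buffers
print(enlarged_vram // frame_bytes)      # 129 whole buffers fit in 1 GB
```

Under these assumptions, two unpadded buffers would use almost all of the 16 MB and leave well under a megabyte for anything else the driver wants to place in VRAM.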
00:31:46.299 --> 00:31:53.659
But okay, it’s 3D time, right? – Neeaah,
GPU acceleration doesn’t quite work yet.
00:31:53.659 --> 00:31:58.560
So we got at least, you know, X but let’s
talk a bit about the Radeon GPU
00:31:58.560 --> 00:32:03.179
for a second. So when you want to draw
something on the GPU you send it a command
00:32:03.179 --> 00:32:06.289
and you do this by putting it into ‘ring’
which is really just a structure in memory,
00:32:06.289 --> 00:32:11.499
that’s a (?)(?)(?)(?). And it wraps around.
So that way you can queue things to be done
00:32:11.499 --> 00:32:15.600
in the GPU, and then it does it on its own
and you can go and do other things.
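The wrap-around queue described here can be sketched as a generic ring buffer in Python. The class, its size and its pointer names are hypothetical and only illustrate the idea; this is not the real Radeon ring layout.

```python
# Sketch only: a generic ring of command words with wrap-around pointers.
# Sizes and names are hypothetical, not the real Radeon ring layout.
from typing import List


class Ring:
    def __init__(self, size: int) -> None:
        self.buf: List[int] = [0] * size
        self.wptr = 0   # where the CPU writes the next word
        self.rptr = 0   # where the GPU reads the next word

    def free_words(self) -> int:
        # One slot stays empty so that wptr == rptr always means "empty".
        return (self.rptr - self.wptr - 1) % len(self.buf)

    def push(self, words: List[int]) -> None:
        """CPU side: queue command words, wrapping at the end of the buffer."""
        if len(words) > self.free_words():
            raise BufferError("ring full")
        for w in words:
            self.buf[self.wptr] = w
            self.wptr = (self.wptr + 1) % len(self.buf)

    def pop_all(self) -> List[int]:
        """GPU side: consume everything queued so far."""
        out = []
        while self.rptr != self.wptr:
            out.append(self.buf[self.rptr])
            self.rptr = (self.rptr + 1) % len(self.buf)
        return out


ring = Ring(4)
ring.push([1, 2, 3])
first = ring.pop_all()
ring.push([4, 5])        # the second word wraps around to index 0
second = ring.pop_all()
print(first, second, ring.wptr)
```

The producer only advances the write pointer and the consumer only advances the read pointer, which is what lets the CPU queue work and move on.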
00:32:15.600 --> 00:32:20.330
There’s a Graphics Ring for drawing,
a Compute Ring for GPGPU, and a DMA Ring
00:32:20.330 --> 00:32:24.809
for copying things around. The commands
are processed by the GPU Command Processor
00:32:24.809 --> 00:32:32.419
which is really a bunch of different CPUs
inside the GPU. They are called F32.
00:32:32.419 --> 00:32:36.570
And they run a proprietary AMD microcode.
So this is a custom architecture.
00:32:36.570 --> 00:32:40.419
Also the rings can call out to IBs which
are indirect buffers. So you can say
00:32:40.419 --> 00:32:44.999
basically “Call this piece of memory, do
this stuff there, return back to the ring”.
00:32:44.999 --> 00:32:48.629
And that’s actually how the user space
thing does things. So this says:
00:32:48.629 --> 00:32:51.750
“Draw this stuff” and it tells the kernel:
“Hey, draw this stuff”. And the kernel
00:32:51.750 --> 00:32:57.269
tells the GPU: “Jump to that stuff,
read it come back, keep doing stuff”.
00:32:57.269 --> 00:33:01.999
This is basically how most GPUs work but
Radeon specifically works like, you know…
00:33:01.999 --> 00:33:06.649
with this F32 stuff. Okay. The driver
complains: “Ring 0 test failed”.
00:33:06.649 --> 00:33:10.669
Technically (?), you test them, so at least
you know it has nice diagnostic,
00:33:10.669 --> 00:33:13.669
and how does the test work? It’s really
easy. It writes a register with a value,
00:33:13.669 --> 00:33:16.649
and then it tells the GPU with a command
“Please write this other value
00:33:16.649 --> 00:33:21.159
to the register”, runs it and then checks
to see if the register was actually written
00:33:21.159 --> 00:33:29.190
with the new value. So the write doesn’t
happen. Thankfully, thanks to that RAI file
00:33:29.190 --> 00:33:32.459
earlier we found some debug registers that
tell you exactly what’s going on inside
00:33:32.459 --> 00:33:36.809
the GPU. And it shows the Command
Processor is stuck, waiting for data
00:33:36.809 --> 00:33:41.549
in the ring, so it needs more data.
After a NOP command?! Yeah…
00:33:41.549 --> 00:33:46.950
NOP is hard, let’s go stalling. So packet
headers in this GPU thing have a size
00:33:46.950 --> 00:33:51.700
that is SIZE-2. Whoever thought that was
a good idea. So a 2 word packet
00:33:51.700 --> 00:33:58.919
has a size of zero. Then AMD implemented
a 1 word packet with a size of -1.
00:33:58.919 --> 00:34:03.309
And old firmware doesn’t support that and
thinks: “Oh it’s 3FFF so I’m just gonna wait
00:34:03.309 --> 00:34:08.540
for a shitload of code in the buffer”,
right? It turns out that Hawaii,
00:34:08.540 --> 00:34:12.418
which is another GPU in the same gen
has the same problem with old firmware.
00:34:12.418 --> 00:34:14.772
So they use a different NOP packet, so
there was an exception in the driver
00:34:14.772 --> 00:34:18.940
for this. And we had to add ours to that.
But again – getting to this point, many,
00:34:18.940 --> 00:34:23.110
many, many hours of headbanging.
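The size-minus-two encoding can be shown with a short Python sketch. It models only the size field; the 14-bit width is inferred from the "3FFF" mentioned in the talk, and the function names are made up for illustration.

```python
# Sketch only: models just the size field. The 14-bit width is inferred from
# the "3FFF" mentioned in the talk; the rest of the header is left out.
COUNT_MASK = 0x3FFF


def encode_count(total_words: int) -> int:
    """Size field as stored in the header: total packet words minus 2."""
    return (total_words - 2) & COUNT_MASK


def old_firmware_words(count_field: int) -> int:
    """Old firmware treats the field as unsigned and adds the 2 back."""
    return count_field + 2


def new_firmware_words(count_field: int) -> int:
    """Newer firmware treats 0x3FFF as -1, i.e. a one-word packet."""
    return 1 if count_field == COUNT_MASK else count_field + 2


one_word_nop = encode_count(1)
print(hex(encode_count(2)))                  # 0x0
print(hex(one_word_nop))                     # 0x3fff
print(old_firmware_words(one_word_nop))      # 16385
print(new_firmware_words(one_word_nop))      # 1
```

With this model, old firmware that sees a one-word NOP waits for 16385 words that never arrive, which matches the stall described above.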
00:34:23.110 --> 00:34:28.230
Okay. We fixed that. Now it says:
“Ring 3 test failed”.
00:34:28.230 --> 00:34:31.069
That’s the SDMA ring. That’s for copying
things in memory and it works
00:34:31.069 --> 00:34:34.909
in the same way. It puts a value in RAM.
It tells the SDMA engine: “hey, write
00:34:34.909 --> 00:34:40.429
a different value”. And checks. This time
we see the write happens but it writes “0”
00:34:40.429 --> 00:34:44.839
instead of the 0xDEADBEEF or whatever.
Okay. So I tried this.
00:34:44.839 --> 00:34:48.139
I put two Write commands in the ring
saying: “Write to one place, write to
00:34:48.139 --> 00:34:52.518
a different place”. And this time,
if I saw, what it did is it wrote “1”
00:34:52.518 --> 00:34:56.619
to the first destination and “0” to the
second destination. I’m thinking:
00:34:56.619 --> 00:35:00.380
“Okay, it’s supposed to write 0xDEADBEEF…”
which is what you see there, it’s…
00:35:00.380 --> 00:35:04.450
0xDEADBEEF is that word
with the value. It writes “1”.
00:35:04.450 --> 00:35:08.980
Well, there’s a “1” there that
wasn’t there before, it was a “0”,
00:35:08.980 --> 00:35:13.640
because of this padding, right? So it
turns out they have an off-by-four,
00:35:13.640 --> 00:35:17.890
in the SDMA command parser
and it reads from four words later
00:35:17.890 --> 00:35:21.670
than it should.
exhales
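The off-by-four behaviour can be reproduced with a small Python simulation. The five-word write packet (header, address low, address high, word count, data), the header value and the zero padding are assumptions made for illustration, not a confirmed description of the SDMA command format.

```python
# Sketch only: the five-word write packet (header, address low, address high,
# word count, data) and the zero padding are assumptions for illustration.
from typing import Dict, List

WRITE_HDR = 0xAAAA0001   # hypothetical marker for a write command
PAD = 0


def write_packet(addr: int, value: int) -> List[int]:
    return [WRITE_HDR, addr & 0xFFFFFFFF, addr >> 32, 1, value]


def run(ring: List[int], data_skew: int) -> Dict[int, int]:
    """Execute write packets; data_skew=4 mimics reading four words too late."""
    memory: Dict[int, int] = {}
    i = 0
    while i < len(ring):
        if ring[i] != WRITE_HDR:
            i += 1          # skip padding words
            continue
        addr = ring[i + 1] | (ring[i + 2] << 32)
        src = i + 4 + data_skew
        memory[addr] = ring[src] if src < len(ring) else 0
        i += 5
    return memory


ring = (write_packet(0x1000, 0xDEADBEEF)
        + write_packet(0x2000, 0xDEADBEEF)
        + [PAD] * 6)

correct = run(ring, data_skew=0)
buggy = run(ring, data_skew=4)
print(correct)
print(buggy)
```

In this model the first write picks up the word count of the second packet (1) and the second write picks up padding (0), which is the "1, then 0" pattern described above.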
00:35:21.670 --> 00:35:26.910
Again, this took many hours of
headbanging. It was like:
00:35:26.910 --> 00:35:32.390
“Randomly try two commands, oh, one, one?”
– “One”.
00:35:32.390 --> 00:35:37.779
So it reads four words too late but only
in ring buffers. Indirect buffers work fine.
00:35:37.779 --> 00:35:40.940
That’s good because those come from user
space. So we don’t have to muck with those.
00:35:40.940 --> 00:35:43.480
We can work around this, because it’s
only used in two places in the kernel,
00:35:43.480 --> 00:35:47.540
by using a Fill command instead of a Write
command. That works fine. Again,…
00:35:47.540 --> 00:35:52.490
how do they even make these mistakes?!
Okay. But still the GPU doesn’t work.
00:35:52.490 --> 00:35:55.640
The ring tests pass but if you tried
to draw you get a bunch of page faults.
00:35:55.640 --> 00:35:59.369
And it turns out that what happens is that
on the PS4 you can’t write the page table
00:35:59.369 --> 00:36:05.829
registers from actual commands in the GPU
itself. You can write to them from the CPU
00:36:05.829 --> 00:36:09.319
directly. You can say just: “Write memory
– memory register write”, and then
00:36:09.319 --> 00:36:14.519
I’ll write. But you can’t tell the GPU:
“Please write to the page table register this”.
00:36:14.519 --> 00:36:18.520
So the page tables don’t work, the GPU
can’t see any memory, so everything is broken.
00:36:18.520 --> 00:36:22.920
Linux uses this, FreeBSD doesn’t. It uses
direct writes. And we think this is maybe
00:36:22.920 --> 00:36:27.290
a Firewall somewhere in the Liverpool,
some kind of security thing they added.
00:36:27.290 --> 00:36:30.940
We can directly write from the CPU.
But it like breaks the regular…
00:36:30.940 --> 00:36:34.830
like it’s not asynchronous anymore. So
this could break things. And it’s a really
00:36:34.830 --> 00:36:39.000
hacky solution. I would really like to fix
this. And I’m thinking: “Maybe the firewall
00:36:39.000 --> 00:36:42.940
is in the firmware, right?”. But it’s
proprietary and undocumented firmware.
00:36:42.940 --> 00:36:47.630
So let’s look at that firmware. It’s
a thing, it needs microcode, a CP thing.
00:36:47.630 --> 00:36:51.440
It’s undocumented. But we take the blobs
out of FreeBSD. And that’s great because
00:36:51.440 --> 00:36:56.510
we don’t have to ship them. Let’s
dig deeper into those blobs. So how do you
00:36:56.510 --> 00:37:00.599
reverse-engineer an unknown CPU
architecture? That’s really easy,
00:37:00.599 --> 00:37:05.039
run an instruction and see what it did.
And then just keep doing that. Thankfully,
00:37:05.039 --> 00:37:07.710
we can upload custom firmware, so it’s
actually really easy to just have like
00:37:07.710 --> 00:37:10.450
a two-instruction firmware that does
something, and then writes a register
00:37:10.450 --> 00:37:14.220
to a memory location. And that’s actually
really easy to find. If you first like
00:37:14.220 --> 00:37:17.460
write the memory instruction, it’s really
easy to find in the binary because you see
00:37:17.460 --> 00:37:23.559
like GPU register offsets that stand out
a bit in one column. So long story short,
00:37:23.559 --> 00:37:27.799
we wrote F32DIS which is a disassembler
for the proprietary AMD F32 microcode.
00:37:27.799 --> 00:37:31.619
I shamelessly stole the instruction
syntax from ARM. So you may recognize
00:37:31.619 --> 00:37:35.130
that if you’ve ever seen an ARM disassembly.
And this is not complete but it can
00:37:35.130 --> 00:37:38.980
disassemble every single instruction
in all the firmware in Liverpool for PFP,
00:37:38.980 --> 00:37:43.110
ME, CE, MEC and RLC which are five
different blocks in the GPU. As far
00:37:43.110 --> 00:37:46.319
as I know that’s never been done before,
all the firmware was like in a voodoo
00:37:46.319 --> 00:37:50.099
black magic thing that’s been shipped.
Not even the non-AMD kernel developers
00:37:50.099 --> 00:37:54.710
know anything about this. So…
applause
00:37:54.710 --> 00:37:57.290
ongoing applause
00:37:57.290 --> 00:38:01.839
And you can disassemble the desktop
GPU stuff, too. So this could be good for
00:38:01.839 --> 00:38:06.133
debugging strange GPU shenanigans
in non-PS4 stuff.
00:38:06.133 --> 00:38:10.660
Alright. Alas, it’s not in the firmware.
It seems to be blocked in hardware.
00:38:10.660 --> 00:38:14.510
I found a debug register that actually
says: “there was an access violation
00:38:14.510 --> 00:38:17.340
in the bus when you try to write this
thing”. And I tried a bunch of workarounds
00:38:17.340 --> 00:38:22.789
and I even bought an AMD APU system,
desktop. Dumped all the registers,
00:38:22.789 --> 00:38:26.780
diff’ed them against the one I had on Linux
and tried setting every single value
00:38:26.780 --> 00:38:30.880
from the other GPU and hoping I find some
magic bits somewhere, but… no.
00:38:30.880 --> 00:38:35.420
They probably have a setting for this,
somewhere, but it’s a sea of ones and zeros,
00:38:35.420 --> 00:38:40.210
good luck finding it. It does work with
a CPU write workaround, though.
00:38:40.210 --> 00:38:43.769
So, hey, at least we get 3D! And it’s
actually pretty stable, so if there’s
00:38:43.769 --> 00:38:49.210
a race condition I’m not really seeing it.
So – checklist! What works,
00:38:49.210 --> 00:38:52.640
what doesn’t work. We have interrupts,
and timers – the core thing you need
00:38:52.640 --> 00:38:56.490
to run any OS – we have a serial port,
we can shutdown the system and reboot,
00:38:56.490 --> 00:38:59.559
and you’ll think that’s funny but actually
that goes through ICC, so again,
00:38:59.559 --> 00:39:02.420
at least some interesting code there.
I actually just implemented that about
00:39:02.420 --> 00:39:08.700
four hours ago. Because pulling the plug
was getting old. The Power button works.
00:39:08.700 --> 00:39:13.280
USB works. There’s a funny story with USB
as it used not to work. And we said:
00:39:13.280 --> 00:39:17.430
“Fix it later, there seems to be special
code missing.” And then someone
00:39:17.430 --> 00:39:20.499
pulled a repo from the USB-not-working
branch, and tested it, and said:
00:39:20.499 --> 00:39:25.450
“It’s working!” It seems we fixed it by
accident, by changing something else.
00:39:25.450 --> 00:39:29.170
The hard disk works which is via the USB.
Blu-ray works, I wrote a driver for that,
00:39:29.170 --> 00:39:32.170
also four hours ago. – Three hours ago
now? Yeah, something like that.
00:39:32.170 --> 00:39:34.930
And I spent 20 minutes looking for someone
in the Hackcenter that had a DVD I could
00:39:34.930 --> 00:39:40.400
stick in to try. Apparently I’m from
the past if I ask for DVDs.
00:39:40.400 --> 00:39:45.390
But it does work. So that’s good. Wi-Fi
and Bluetooth work.
00:39:45.390 --> 00:39:49.119
Ethernet works, except only at GBit speeds.
Frame buffer works. HDMI works.
00:39:49.119 --> 00:39:54.829
It’s currently hard-coded to 1080p so…
It does work. We can fix that
00:39:54.829 --> 00:40:00.960
by improving the encoder implementation.
3D works with the ugly register write hack.
00:40:00.960 --> 00:40:06.659
And SPDIF audio works. So that’s good.
HDMI audio doesn’t work. Mostly because
00:40:06.659 --> 00:40:10.450
I only got audio grossly working, in
general, recently, and I haven’t had
00:40:10.450 --> 00:40:15.250
a chance to program the encoder to support
the audio stuff yet. Because, again,
00:40:15.250 --> 00:40:18.619
new more annoying hacks there. And the
real-time clock doesn’t work and everything.
00:40:18.619 --> 00:40:23.350
That’s simple, the clock, that device is
simple. But ever since the PS2 the way
00:40:23.350 --> 00:40:27.410
Sony has implemented real-time clocks
is that instead of reading and writing
00:40:27.410 --> 00:40:29.920
the time on the clock, which is what you
would think is the normal thing to do,
00:40:29.920 --> 00:40:33.480
they never write the time on the clock.
Instead, they store an offset from the clock
00:40:33.480 --> 00:40:39.579
to the real time, in some kind of storage
location. And there’s a giant mess of…
00:40:39.579 --> 00:40:44.269
…registry it’s called, in the PS4, and
I don’t even know where it’s stored.
00:40:44.269 --> 00:40:46.970
It might be on the hard drive, it might be
encrypted. So basically, getting
00:40:46.970 --> 00:40:50.259
the real-time clock to actually show the
right time involves a pile of nonsense
00:40:50.259 --> 00:40:53.980
that I haven’t had the chance to look at
yet. But… we have NTP, right?
00:40:53.980 --> 00:40:59.030
So it’s good enough. – Oh, and we have
Blinkenlights! Important! The Power LED
00:40:59.030 --> 00:41:04.329
does some interesting things, if you’re
on Linux. So that’s good.
00:41:04.329 --> 00:41:10.610
So – the code: you can get the ps4-kexec
code on our Github page. That has
00:41:10.610 --> 00:41:14.910
the kexec and the hardware configuration,
and the bootloader Linux stuff.
00:41:14.910 --> 00:41:18.599
You can get the ps4 Linux branch which is
the… our fork of the kernel,
00:41:18.599 --> 00:41:22.769
rebased on 4.9 which is the latest (?)
version, I think.
00:41:22.769 --> 00:41:26.319
You can get our Radeon patches which are
three, I think, really tiny patches for
00:41:26.319 --> 00:41:30.410
user space libraries just to support this
new chip. Really simple stuff, the NOP
00:41:30.410 --> 00:41:35.289
thing, and a couple of commands. And the
RAI and F32DIS thing I mentioned.
00:41:35.289 --> 00:41:40.779
You can get Radeon tools at that Github
repo. I just pushed that right before the talk.
00:41:40.779 --> 00:41:44.089
So if you’re interested – there you go.
And if you’re going after the RAI file, well,
00:41:44.089 --> 00:41:47.569
we wanna put you on a run before the guys
at that website realize they really should
00:41:47.569 --> 00:41:52.589
take that down! But I’m sure the internet
wayback machine has it somewhere.
00:41:52.589 --> 00:42:00.279
Okay! That’s everything for the story of
how we got Linux running on the PS4.
00:42:00.279 --> 00:42:08.710
And you can reach us at that website
or fail0verflow on Twitter.
00:42:08.710 --> 00:42:14.440
applause
Thank you!
00:42:14.440 --> 00:42:18.259
ongoing applause
00:42:18.259 --> 00:42:24.309
I hope that wasn’t too fast, sorry, I had
to rush through my 89 slides a little bit
00:42:24.309 --> 00:42:29.460
because I really wanted to do a demo.
I think this kind of is the demo, right.
00:42:29.460 --> 00:42:33.180
But we can try something else.
So maybe I can shut this –
00:42:33.180 --> 00:42:39.839
so I can aim with my controller.
00:42:39.839 --> 00:42:43.960
This is really not meant as a mouse!
That’s not Right Button.
00:42:43.960 --> 00:42:46.809
Come on! Yeah, I think it is…
00:42:46.809 --> 00:42:48.810
Close? Close! Maybe…
00:42:48.810 --> 00:42:51.099
So we have this little icon here.
I wonder what happens if it works.
00:42:51.099 --> 00:42:55.740
Do we have internet access? Hopefully
Wi-Fi works, let’s then just check real quick.
00:42:55.740 --> 00:42:57.730
keyboard typing sounds
00:42:57.730 --> 00:42:59.849
This could bork really badly if we don’t.
00:42:59.849 --> 00:43:02.039
keyboard typing sounds
00:43:02.039 --> 00:43:03.500
mumbles ping 8.8.8.8
00:43:03.500 --> 00:43:06.009
Yeah, we have internet access.
So, Wi-Fi works!
00:43:06.009 --> 00:43:08.710
Okay. I wonder what happens
if we click that!
00:43:08.710 --> 00:43:15.160
It takes a while to load.
This is not optimized for…
00:43:15.160 --> 00:43:23.859
laughter and applause
marcan laughs
00:43:23.859 --> 00:43:28.410
So the CPUs on this thing are
a little bit slow. But…
00:43:28.410 --> 00:43:31.990
sounds of the machine
Hey, it works!
00:43:31.990 --> 00:43:35.880
And now it’s a real game console!
00:43:35.880 --> 00:43:42.089
laughter and applause
00:43:42.089 --> 00:43:49.069
And this is… there we go, okay.
00:43:49.069 --> 00:43:54.290
So I think we can probably take some Q&A
because this is a little bit slow to load.
00:43:54.290 --> 00:43:56.529
But we can try a game, maybe.
00:43:56.529 --> 00:44:03.020
Herald: If you are for Q&A I think
there will be some questions.
00:44:03.020 --> 00:44:07.089
So shall we start with one
from the internet.
00:44:07.089 --> 00:44:16.029
Signal Angel: Hey! The internet wants to
know if most of your research will be
00:44:16.029 --> 00:44:18.470
published, or if stuff’s
going to stay private.
00:44:18.470 --> 00:44:21.992
marcan: All of this… the publishing is
basically the code which… and you know
00:44:21.992 --> 00:44:26.660
the explanation I just gave… I said that
everything’s on Github. So all the drivers
00:44:26.660 --> 00:44:30.950
we wrote, all the… I mean… and in this
case also the spec is the code.
00:44:30.950 --> 00:44:34.300
If you really want to I could write some
Wiki pages on this. But roughly speaking,
00:44:34.300 --> 00:44:37.890
what’s in the drivers is what we found
out. The really interesting bit,
00:44:37.890 --> 00:44:44.269
I think, is that F32 stuff from the AMD
GPU stuff. And that we have a repo for.
00:44:44.269 --> 00:44:48.369
But if you have any general questions, or
name a particular device, or any details,
00:44:48.369 --> 00:44:54.069
feel free to ask. I don’t know… again, it
would be nice if we wrote a bunch
00:44:54.069 --> 00:44:57.220
of docs and everything. But it’s not really
a matter of not wanting to write them,
00:44:57.220 --> 00:45:01.250
it’s lazy engineers not wanting to write
documentation. But the code is at least…
00:45:01.250 --> 00:45:05.250
the things we have on Github are fairly
clean. So.
00:45:05.250 --> 00:45:08.630
Herald: Okay, so, someone is piling up
on 4. Guys, if you have questions
00:45:08.630 --> 00:45:11.990
you see the microphones over here.
Just pile up over there
00:45:11.990 --> 00:45:14.539
and I’m gonna point… 4 please!
00:45:14.539 --> 00:45:19.210
Question: Just a small question.
How likely is it that you upstream
00:45:19.210 --> 00:45:22.700
some of that stuff. Because… I mean…
00:45:22.700 --> 00:45:27.299
marcan: So there’s two sides to that.
One side is that we need to actually
00:45:27.299 --> 00:45:31.059
get together and upstream it. The code…
some of it has horrible hacks, some of it
00:45:31.059 --> 00:45:36.539
isn’t too bad. So we want to upstream it.
00:45:36.539 --> 00:45:42.099
We have to sit down and actually do it.
I think most of the custom x86 based
00:45:42.099 --> 00:45:45.280
machine stuff in the kernel is doable.
The drivers are probably doable.
00:45:45.280 --> 00:45:49.609
Some people might scream at the interrupt
hacks. But it’s probably not terrible.
00:45:49.609 --> 00:45:53.580
And if they have a better way of doing it
I’m all ears, there are other kernel devs.
00:45:53.580 --> 00:45:59.589
The Radeon stuff is quite fishy because of
the encoder thing that is like (?) non-standard.
00:45:59.589 --> 00:46:03.880
And also understandably
AMD GPU driver developers
00:46:03.880 --> 00:46:07.380
that work for AMD may want to have nothing
to do with this. And in fact I know
00:46:07.380 --> 00:46:11.570
for a fact that at least
one of them doesn’t. But
00:46:11.570 --> 00:46:16.609
they can’t really stop us from upstreaming
things into the Linux kernel, right?
00:46:16.609 --> 00:46:20.210
So I think as long as we get to come
to a state where it’s doable it’s fine.
00:46:20.210 --> 00:46:23.250
But most likely I think…
laughter
00:46:23.250 --> 00:46:27.910
…I think most likely the non-GPU stuff
will go in first if we have a chance
00:46:27.910 --> 00:46:30.940
to do that. And of course, if you wanna
try upstreaming it go ahead!
00:46:30.940 --> 00:46:33.470
It’s open source, right? So.
00:46:33.470 --> 00:46:35.460
Herald: Over to microphone 1, please.
00:46:35.460 --> 00:46:42.079
Question: Hi. First I think I should
employ you to try and find Trammell Hudson.
00:46:42.079 --> 00:46:48.430
And control him into using your FreeBSD
kexec implementation in heads.
00:46:48.430 --> 00:46:55.210
Instead of having to run all of Linux in it,
as a joke. But my real question is:
00:46:55.210 --> 00:46:59.160
if the reason you used Gentoo was
because systemd was yet another hurdle
00:46:59.160 --> 00:47:00.519
in getting this to run?
00:47:00.519 --> 00:47:02.710
laughter
marcan laughs
00:47:02.710 --> 00:47:06.430
marcan: I run Gentoo on my main machine,
I run Gentoo on most of the machines
00:47:06.430 --> 00:47:10.950
I care about. I do run Arch on a few of
the others, and there I live with systemd.
00:47:10.950 --> 00:47:15.661
But the reason why I run Gentoo is, first
it’s what I like and use. And second it’s
00:47:15.661 --> 00:47:19.119
super easy to use patches on Gentoo.
You get those things we put onto Github,
00:47:19.119 --> 00:47:21.549
which are just patch files, it’s not really
a repo. Because they’re so easy
00:47:21.549 --> 00:47:24.869
it’s not worth cloning everything. Just
get those patch files, stick them in
00:47:24.869 --> 00:47:28.480
/etc/portage/patches/, have a little hook to patch,
and that’s all you need. So it’s really
00:47:28.480 --> 00:47:33.070
easy to patch packages in Gentoo,
that’s one of the main reasons.
00:47:33.070 --> 00:47:37.730
laughs about something in audience
00:47:37.730 --> 00:47:39.599
Herald: No. 3 please!
00:47:39.599 --> 00:47:43.550
Question: Will there be new exploits,
new ways to boot Linux
00:47:43.550 --> 00:47:48.400
on PS4 with modern firmwares,
because finding one
00:47:48.400 --> 00:47:51.109
with firmware 1.76 is really rare?
00:47:51.109 --> 00:47:52.460
marcan: That was 4.05!
00:47:52.460 --> 00:47:58.500
Question: Ah, okay.
marcan: But again, our goal is to focus
00:47:58.500 --> 00:48:01.369
on… I just told you the story of the
pre-exploit thing because I think
00:48:01.369 --> 00:48:05.089
that’s like a good hacker story, good
knowledge for trying new platforms.
00:48:05.089 --> 00:48:07.740
And the Linux thing we’re working on.
The reason why we don’t want to publish
00:48:07.740 --> 00:48:11.599
the exploit or really get involved in the
whole exploit scene is that there is
00:48:11.599 --> 00:48:17.099
a lot of drama, and it’s not rocket science;
it’s not like it’s super custom code,
00:48:17.099 --> 00:48:21.400
this is WebKit and FreeBSD. It’s actually not
that hard. And we know for a fact
00:48:21.400 --> 00:48:25.751
that several people have reproduced this
on various firmwares. So there’s no need
00:48:25.751 --> 00:48:29.980
for us to be the exploit provider. And
we don’t want to get into that because
00:48:29.980 --> 00:48:37.420
it’s a giant drama fest as we all know,
anyway. Please DIY it this time!
00:48:37.420 --> 00:48:39.470
Question: Okay. Thanks.
00:48:39.470 --> 00:48:41.329
Herald: And what is the internet saying?
00:48:41.329 --> 00:48:46.440
Signal Angel: The internet wants to know
if you ever had fun with the BSD
00:48:46.440 --> 00:48:47.749
on the second processor.
00:48:47.749 --> 00:48:52.460
marcan: Oh, that’s a very good question.
I myself haven’t. I don’t know if anyone
00:48:52.460 --> 00:48:55.930
else has looked at it briefly. One of the
commands for rebooting will boot
00:48:55.930 --> 00:49:01.339
that CPU into FreeBSD. And there’s
probably fun to be had there.
00:49:01.339 --> 00:49:03.869
But we haven’t really looked into it.
00:49:03.869 --> 00:49:06.819
Herald: And over to 5, please.
00:49:06.819 --> 00:49:13.000
Question: I was wondering if any of that
stuff was applicable to the PS4 VR edition
00:49:13.000 --> 00:49:18.800
or whatever it’s called, the new one?
Did you ever test it?
00:49:18.800 --> 00:49:20.460
marcan: Sorry, say it again!
00:49:20.460 --> 00:49:22.359
Question: Sony brought out a new PS4
I thought.
00:49:22.359 --> 00:49:24.299
marcan: Oh, the Pro you mean,
the PS4 Pro?
00:49:24.299 --> 00:49:26.670
Question: Yes.
marcan: So Linux boots on the Pro,
00:49:26.670 --> 00:49:30.289
we got that far. GPU is broken. So we
would like to get this ported to the Pro
00:49:30.289 --> 00:49:34.140
and also working. It’s basically an
incremental update, so it’s not that hard,
00:49:34.140 --> 00:49:36.999
but the GPU needs a new definition,
new jBullet(?) stuff.
00:49:36.999 --> 00:49:40.940
Yeah, you get a lot of C frames
down-burned (?), yeah…
00:49:40.940 --> 00:49:45.280
So, as you can see, 3D works,
and, there you go!
00:49:45.280 --> 00:49:52.340
synth speech from game
applause
00:49:52.340 --> 00:49:56.119
I only have to look up and down in this game!
00:49:56.119 --> 00:49:58.230
continued synth speech from game
00:49:58.230 --> 00:50:01.019
Herald: Well, then number 3, please.
00:50:01.019 --> 00:50:07.679
Question: I want to ask you if you want to
port these Radeon patches to the new
00:50:07.679 --> 00:50:16.274
amdgpu driver because AMD now supports
the Southern Islands GPUs?
00:50:16.274 --> 00:50:19.354
marcan: Yes, that’s a very good question.
Actually, the first attempt we made
00:50:19.354 --> 00:50:22.609
at writing this driver was with amdgpu.
And at the time it wasn’t working at all.
00:50:22.609 --> 00:50:26.559
And there was a big concern about its
freshness at the time and it was
00:50:26.559 --> 00:50:31.130
experimentally supporting this GPU
generation. I’m told it should work.
00:50:31.130 --> 00:50:35.720
So I would like to port this… move to
amdgpu. Now that we have a working
00:50:35.720 --> 00:50:38.970
implementation, and we got to clean up
the code much better, and we know where all
00:50:38.970 --> 00:50:42.050
the nits are, I want to try again with
amdgpu and see if that works.
00:50:42.050 --> 00:50:47.019
That’s a very good question because the
newer gen might require that driver maybe, so …
00:50:47.019 --> 00:50:49.029
Question: Thank you.
Herald: Well then I’m gonna guess we ask
00:50:49.029 --> 00:50:50.220
the internet again.
00:50:50.220 --> 00:50:56.210
Signal Angel: Okay, the internet states
that about a year ago you argued
00:50:56.210 --> 00:51:02.069
with someone on Twitter that the PS4 wasn’t
a PC and now you’re saying that it kind of
00:51:02.069 --> 00:51:05.330
is one. So what’s that about?
00:51:05.330 --> 00:51:11.249
marcan: So again, the reason for saying
it’s not a PC is that it’s not an IBM
00:51:11.249 --> 00:51:17.369
Personal Computer compatible device.
It’s an x86 device that happens to
00:51:17.369 --> 00:51:20.470
be structured roughly like a current PC
but if you look at the details
00:51:20.470 --> 00:51:24.280
so many things are completely different.
It really isn’t a PC. Like on Linux I had
00:51:24.280 --> 00:51:29.730
to define “sub arch PS4”. It’s an x86
but it’s not a PC. And that’s actually
00:51:29.730 --> 00:51:32.520
a very important distinction because
there’s a lot of things you have
00:51:32.520 --> 00:51:36.210
never heard of that are x86 but not PC.
It’s like e.g. there’s a high chance
00:51:36.210 --> 00:51:40.480
your monitor at home has
an 80186 CPU in it. So, yeah.
00:51:40.480 --> 00:51:45.200
Herald: So nobody’s piling at the
microphones any more.
00:51:45.200 --> 00:51:47.430
Is there one last question
from the internet?
00:51:47.430 --> 00:51:51.299
Signal Angel: Yes, there is.
00:51:51.299 --> 00:51:53.819
The question is…
00:51:53.819 --> 00:51:59.660
…if there was any
decryption needed.
00:51:59.660 --> 00:52:05.509
marcan: No. So this is purely… you
exploit WebKit, you get user mode,
00:52:05.509 --> 00:52:08.769
you exploit the kernel, you get kernel
mode. You jump to Linux…
00:52:08.769 --> 00:52:12.240
there’s no security like… there’s nothing
like stopping you from doing
00:52:12.240 --> 00:52:15.160
all that stuff. There’s a sand box in
FreeBSD but obviously you exploit
00:52:15.160 --> 00:52:20.920
around the sand box. There’s nothing…
there’s no hypervisor, there’s no monitoring,
00:52:20.920 --> 00:52:24.650
there’s nothing like saying: “Oh this code
should not be running.” There’s no
00:52:24.650 --> 00:52:29.089
like integrity checking. They have a security
architecture but as is tradition for Sony
00:52:29.089 --> 00:52:35.230
you can just walk around it.
laughter
00:52:35.230 --> 00:52:37.730
applause
00:52:37.730 --> 00:52:42.660
The PS3 was notable for the fact that
the PS Jailbreak which is a USB…
00:52:42.660 --> 00:52:47.470
it’s effectively a piracy device
that was released by someone
00:52:47.470 --> 00:52:51.510
that basically used a USB exploit
in the kernel and only a USB exploit
00:52:51.510 --> 00:52:54.990
in the kernel to effectively enable piracy.
So when you have like a stack of security
00:52:54.990 --> 00:52:58.400
and you break one thing and you get
piracy that’s a fail! This is basically
00:52:58.400 --> 00:53:02.050
the same idea. Except I have no idea what
you do to do piracy and I don’t care.
00:53:02.050 --> 00:53:09.780
But Sony doesn’t really know how to
architect secure systems.
00:53:09.780 --> 00:53:11.500
That’s it.
00:53:11.500 --> 00:53:14.689
Herald: That’s it, here we go,
that’s your applause!
00:53:14.689 --> 00:53:20.230
applause
00:53:20.230 --> 00:53:21.810
postroll music
00:53:21.810 --> 00:53:32.109
subtitles created by c3subtitles.de
in the year 2017. Join, and help us!