1
00:00:00,000 --> 00:00:13,321
33C3 preroll music
2
00:00:13,321 --> 00:00:16,840
Herald: You have been
here on stage before.
3
00:00:16,840 --> 00:00:20,160
You successfully tampered with the Wii,
4
00:00:20,160 --> 00:00:23,110
You successfully tampered
with the PS3 and got
5
00:00:23,110 --> 00:00:26,840
some legal challenges over there?
6
00:00:26,840 --> 00:00:28,939
marcan: Some unfounded
legal challenges, yes.
7
00:00:28,939 --> 00:00:31,640
Herald: And then you fucked,
and excuse my French over here
8
00:00:31,640 --> 00:00:35,149
– by the way, that is number 8021 to get
9
00:00:35,149 --> 00:00:39,840
the translation on your DECT phone.
10
00:00:39,840 --> 00:00:44,600
So you fucked with the Wii U as well.
11
00:00:44,600 --> 00:00:47,999
“Console Hacking 2016”,
here we go!
12
00:00:47,999 --> 00:00:51,629
marcan: I’m a lazy guy, so I haven’t
turned on my computer yet for the slides.
13
00:00:51,629 --> 00:00:57,180
So let me do that,
hopefully this will work.
14
00:00:57,180 --> 00:01:00,559
My computer is a little bit special.
It runs a lot of Open Source software.
15
00:01:00,559 --> 00:01:05,620
It runs FreeBSD.
16
00:01:05,620 --> 00:01:09,909
applause
17
00:01:09,909 --> 00:01:14,370
It even has things like OpenSSL
in there, and nginx.
18
00:01:14,370 --> 00:01:21,160
And Cairo I think, and WebKit. It runs a
lot of interesting Open Source software.
19
00:01:21,160 --> 00:01:24,980
But we all know that BSD is dying, so
we can make it run something a little bit
20
00:01:24,980 --> 00:01:29,730
more interesting. And hopefully
give a presentation about it.
21
00:01:29,730 --> 00:01:32,530
Let’s see if this works.
22
00:01:36,149 --> 00:01:38,380
It’s a good start, black screen, you know.
23
00:01:38,380 --> 00:01:43,330
It’s syncing to disk
and file system shutting down.
24
00:01:43,330 --> 00:01:48,710
There we go!
applause
25
00:01:48,710 --> 00:01:55,310
continued applause
26
00:01:55,310 --> 00:01:58,610
And yes, I run Gentoo Linux.
27
00:01:58,610 --> 00:02:01,390
applause
28
00:02:01,390 --> 00:02:05,400
This is the “Does Wi-Fi work?” moment.
Hopefully.
29
00:02:07,490 --> 00:02:12,570
NTP, yeah, no… “NTP failed”. Well,
that’s a bit annoying, but it still works.
30
00:02:15,630 --> 00:02:21,250
Hello? Yeah, it takes a bit to boot.
It doesn’t run systemd, you know.
31
00:02:21,250 --> 00:02:25,250
It’s sane, it’s a tiny bit slower,
but it’s sane.
32
00:02:25,250 --> 00:02:30,390
There we go.
applause
33
00:02:30,390 --> 00:02:35,260
This is the “Does my controller
work?” moment.
34
00:02:35,260 --> 00:02:39,517
Bluetooth in Saal 1.
Okay, it does.
35
00:02:39,517 --> 00:02:41,708
Alright, so let’s get started.
36
00:02:49,700 --> 00:02:53,730
So this is “Console Hacking 2016 –
PS4: PC Master Race”.
37
00:02:53,730 --> 00:02:58,350
I apologize for the horrible Nazi joke in
the subtitle, but it’s a Reddit thing.
38
00:02:58,350 --> 00:03:03,069
“PC Master Race”, why? Well.
PS4, is it a PC? Is it not a PC?
39
00:03:03,069 --> 00:03:06,070
But before we get started,
I would like to dedicate this talk
40
00:03:06,070 --> 00:03:09,430
to my good friend Ben Byer
who we all know as “bushing”.
41
00:03:09,430 --> 00:03:11,790
Unfortunately, he passed away
in February of this year and he was
42
00:03:11,790 --> 00:03:15,240
a great hacker, he came to multiple
congresses, one of the nicest people
43
00:03:15,240 --> 00:03:19,040
I’ve ever met. I’m sure that some of you
who have met him would agree with that.
44
00:03:19,040 --> 00:03:23,960
If it weren’t for him, I wouldn’t be here.
So, thank you.
45
00:03:23,960 --> 00:03:30,480
applause
46
00:03:30,480 --> 00:03:34,840
Alright. So, the PS4.
Is it a PC? Is it not a PC?
47
00:03:34,840 --> 00:03:37,220
Well, it’s a little bit different
from previous consoles.
48
00:03:37,220 --> 00:03:42,490
It has x86, it’s an x86 CPU.
It runs FreeBSD, it runs WebKit.
49
00:03:42,490 --> 00:03:45,490
It doesn’t have a hypervisor,
unfortunately.
50
00:03:45,490 --> 00:03:49,849
Then again, the PS3 had a hypervisor
and it was useless, so there you go.
51
00:03:49,849 --> 00:03:52,380
So this is different from the PS3,
but it’s not completely different.
52
00:03:52,380 --> 00:03:54,959
It does have a security processor
that you can just ignore because
53
00:03:54,959 --> 00:03:59,779
it doesn’t secure anything.
So that’s good.
54
00:03:59,779 --> 00:04:02,520
So how to own a PS4? Well, you write
a WebKit exploit and you write
55
00:04:02,520 --> 00:04:07,800
a FreeBSD exploit, duh. Right?
Everything runs WebKit,
56
00:04:07,800 --> 00:04:10,739
and FreeBSD is not exactly the
most secure OS in the world,
57
00:04:10,739 --> 00:04:14,800
especially not with Sony customizations.
So this is completely boring stuff.
58
00:04:14,800 --> 00:04:18,548
Like, what’s the point of talking about
WebKit and FreeBSD exploits?
59
00:04:18,548 --> 00:04:22,089
Instead, this talk is going to be about
something a little bit different.
60
00:04:22,089 --> 00:04:26,040
First of all, after you run an exploit,
well, you know, step 3 “something”,
61
00:04:26,040 --> 00:04:29,770
step 4 “PROFIT”. What is this about?
And not only that, though.
62
00:04:29,770 --> 00:04:32,740
Before you write an exploit, you usually
want to have the code you’re trying
63
00:04:32,740 --> 00:04:38,100
to exploit. And with WebKit and FreeBSD
you kinda do, but not the build they use,
64
00:04:38,100 --> 00:04:41,440
and it’s customized. And it’s annoying to
write an exploit if you don’t have access
65
00:04:41,440 --> 00:04:43,770
to the binary. So how do you get
the binary in the first place?
66
00:04:43,770 --> 00:04:47,690
Well, you dump the code,
that’s an interesting step.
67
00:04:47,690 --> 00:04:51,580
So let’s get started with step zero:
black-box code extraction, the fun way.
68
00:04:51,580 --> 00:04:54,450
A long time ago
in a hackerspace far, far away
69
00:04:54,450 --> 00:04:59,280
fail0verflow got together
after 31C3.
70
00:04:59,280 --> 00:05:02,530
And we looked at the PS4 motherboard
and this is what we saw. So there’s
71
00:05:02,530 --> 00:05:06,000
an Aeolia southbridge, that’s a codename,
by the way. Then there’s the Liverpool APU
72
00:05:06,000 --> 00:05:10,450
which is the main processor.
It’s a GPU and a CPU
73
00:05:10,450 --> 00:05:13,870
which is done by AMD, and
it has some RAM. And then
74
00:05:13,870 --> 00:05:16,250
the southbridge connects to a bunch
of random crap like the USB ports,
75
00:05:16,250 --> 00:05:19,280
a hard disk, which is USB. For some
inexplicable reason the internal disk
76
00:05:19,280 --> 00:05:24,840
on the PS4 is USB. Like it’s SATA to USB,
and then to USB on the southbridge.
77
00:05:24,840 --> 00:05:28,040
Even though it has SATA,
like, what? laughs
78
00:05:28,040 --> 00:05:31,630
The Blu-ray drive is SATA. The Wi-Fi and
Bluetooth are SDIO, and Ethernet is GMII.
79
00:05:31,630 --> 00:05:34,090
Okay, how do we attack this?
Well, GDDR5…
80
00:05:34,090 --> 00:05:38,720
What just…?
Oh. I have a screensaver, apparently!
81
00:05:38,720 --> 00:05:40,960
That’s great.
laughter
82
00:05:40,960 --> 00:05:44,350
I thought I killed that,
let me kill that screensaver real quick.
83
00:05:44,350 --> 00:05:50,960
applause
Something had to fail, it always does.
84
00:05:52,490 --> 00:05:55,310
I mean, of course I can
SSH into my PS4, right?
85
00:05:55,310 --> 00:05:59,500
So there we go, okay.
Could have sworn I’d fix that. Anyway…
86
00:05:59,500 --> 00:06:02,760
Which one of these interfaces
do you attack? Well, you know,
87
00:06:02,760 --> 00:06:06,820
USB, SATA, SDIO, GMII – that’s
the raw ethernet interface, by the way –
88
00:06:06,820 --> 00:06:11,520
all these are CPU-controlled. The CPU
issues commands and the devices reply.
89
00:06:11,520 --> 00:06:16,389
The devices can’t really do anything. They
can’t write to memory or anything like that.
90
00:06:16,389 --> 00:06:19,050
You can exploit USB if you
find a bug in the USB driver,
91
00:06:19,050 --> 00:06:21,370
but we’re back to the no-code issue.
92
00:06:21,370 --> 00:06:24,870
GDDR5, that would be great,
we could just write to our memory
93
00:06:24,870 --> 00:06:27,930
and basically own the entire thing.
But it’s a very high-speed bus.
94
00:06:27,930 --> 00:06:30,160
It’s definitely exploitable.
If you were making a secure system
95
00:06:30,160 --> 00:06:33,840
don’t assume we can’t own GDDR5,
because we will.
96
00:06:33,840 --> 00:06:37,020
But it’s not the path of least resistance,
so we’re not gonna do that.
97
00:06:37,020 --> 00:06:40,150
However, there’s a thing called
PCI Express in the middle there.
98
00:06:40,150 --> 00:06:42,100
Hmm, that’s interesting!
99
00:06:42,100 --> 00:06:45,430
PCIe is very fun for hacking –
even though it might seem intimidating –
100
00:06:45,430 --> 00:06:48,870
because it’s bus mastering,
that means you can DMA to memory.
101
00:06:48,870 --> 00:06:52,759
It’s complicated, and complicated things
are hard to implement properly.
102
00:06:52,759 --> 00:06:58,330
It’s robust. People think that PCIe is this
voodoo-highspeed… No it’s not!
103
00:06:58,330 --> 00:07:00,610
It’s high-speed, but you don’t need
matched traces to make it work.
104
00:07:00,610 --> 00:07:05,440
It will run over wet string. You can hotwire
PCIe with pieces of wire and it will work.
105
00:07:05,440 --> 00:07:09,330
At least at short distances anyway.
Believe me, it’s not as bad as you think.
106
00:07:09,330 --> 00:07:13,310
It’s delay-tolerant, so you
can take your time to reply.
107
00:07:13,310 --> 00:07:16,550
And the drivers are full of fail because
nobody writes a PCIe driver assuming
108
00:07:16,550 --> 00:07:19,520
the device is evil even though of course
everybody should because devices can
109
00:07:19,520 --> 00:07:22,620
and will be evil.
But nobody does that.
110
00:07:22,620 --> 00:07:25,680
So, what can we do?
Well, we have a PCIe link,
111
00:07:25,680 --> 00:07:30,740
let’s cut the lines and plug in the
southbridge to a PC motherboard
112
00:07:30,740 --> 00:07:34,460
that we stick on the side. Now
the southbridge is a PCIe card for us.
113
00:07:34,460 --> 00:07:38,479
And we connect the APU to an FPGA
board which then can pretend to be
114
00:07:38,479 --> 00:07:43,130
a PCIe device. So we can man-in-the-middle
this PCIe bus and it’s now x1 width
115
00:07:43,130 --> 00:07:47,110
instead of x4 because it’s easier that
way, but it will negotiate, that’s fine.
116
00:07:47,110 --> 00:07:50,520
So how do we connect that
motherboard and the FPGA?
117
00:07:50,520 --> 00:07:53,669
There’s of course many ways of doing this.
How many of you have done
118
00:07:53,669 --> 00:07:57,550
any hardware hacking, even Arduino or
anything like that? Raise your hand!
119
00:07:57,550 --> 00:08:02,310
I think that’s about a third to a half
or something like that, at least.
120
00:08:02,310 --> 00:08:04,750
When you hack some hardware,
you build some hardware,
121
00:08:04,750 --> 00:08:10,100
after you blink an LED, what is the first
interface you use to talk to your hardware?
122
00:08:10,100 --> 00:08:14,880
Serial port! So we run
PCIe over RS232 at 115 kBaud
123
00:08:14,880 --> 00:08:16,490
which makes this PCIe…
laughter and applause
124
00:08:21,500 --> 00:08:27,710
I said it was delay-tolerant!
So it makes this PCIe 0.00002x.
125
00:08:27,710 --> 00:08:30,199
And eventually there was a
Gigabit ethernet port on the FPGA
126
00:08:30,199 --> 00:08:35,000
so I upgraded to that, but I only got
around to doing it in one direction.
127
00:08:35,000 --> 00:08:39,019
So now it’s PCIe 0.00002x in one direction
and 0.5x in the other direction
128
00:08:39,019 --> 00:08:42,099
which has to make this one of the most
asymmetric buses in the world.
129
00:08:43,489 --> 00:08:45,870
But it works, believe me.
This is hilarious.
130
00:08:45,870 --> 00:08:50,920
We can run PCIe over serial out. Also, we
were ASCII encoding, so half the bandwidth.
131
00:08:50,920 --> 00:08:52,940
It works fine. It’s fine.
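[Editor's note] The speed figures quoted here can be sanity-checked with a little arithmetic. This is only a sketch: the 250 MB/s per-lane rate (first-generation PCIe x1), the 8N1 serial framing and the two-characters-per-byte ASCII encoding are assumptions, not details given in the talk.

```python
# Back-of-the-envelope link speeds relative to one PCIe lane.
# Assumptions (not from the talk): PCIe 1.x x1 carries about 250 MB/s,
# the UART uses 8N1 framing (10 bits on the wire per byte), and the
# ASCII encoding spends two characters per payload byte.

PCIE_X1_BYTES_PER_S = 250_000_000  # assumed per-lane payload rate


def serial_bytes_per_s(baud: int, ascii_encoded: bool = True) -> float:
    """Payload bytes per second over a UART with 8N1 framing."""
    raw = baud / 10  # start bit + 8 data bits + stop bit
    return raw / 2 if ascii_encoded else raw


def relative_speed(bytes_per_s: float) -> float:
    """Express a byte rate as a multiple of one PCIe lane."""
    return bytes_per_s / PCIE_X1_BYTES_PER_S


serial_ratio = relative_speed(serial_bytes_per_s(115_200))
gigabit_ratio = relative_speed(1_000_000_000 / 8)

print(f"serial:  {serial_ratio:.5f}x")   # serial:  0.00002x
print(f"gigabit: {gigabit_ratio:.1f}x")  # gigabit: 0.5x
```

Under those assumptions the serial link comes out at roughly 0.00002x and the Gigabit Ethernet direction at 0.5x.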
132
00:08:52,940 --> 00:08:56,550
So, PCIe 101.
It’s a reliable packet-switched network.
133
00:08:56,550 --> 00:08:59,270
It uses a thing called
“Transaction Layer Packets”
134
00:08:59,270 --> 00:09:03,440
which are basically just packets you send.
It can be… Memory Read, Memory Write,
135
00:09:03,440 --> 00:09:06,140
IO Read, IO Write,
Configuration Read, Configuration Write.
136
00:09:06,140 --> 00:09:09,600
There can be a message-signaled interrupt
which is a way of saying: “Hey,
137
00:09:09,600 --> 00:09:13,470
listen to me!” by writing
to an address in memory.
138
00:09:13,470 --> 00:09:16,010
Because we can write the thing,
so why not write for interrupts?
139
00:09:16,010 --> 00:09:20,320
It has legacy interrupts
which are basically emulating the old
140
00:09:20,320 --> 00:09:24,430
wire-low-for-interrupt-and-
high-for-no-interrupt thing,
141
00:09:24,430 --> 00:09:25,750
you can tunnel that over PCIe.
142
00:09:25,750 --> 00:09:29,380
And it has completions, which are
basically the replies. So if you read
143
00:09:29,380 --> 00:09:31,930
a value from memory the completion
is what you get back with the value
144
00:09:31,930 --> 00:09:36,040
you tried to read. So that’s PCIe,
we can just go wild with DMA.
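[Editor's note] A toy model may help picture the request and completion flow described here. It is a minimal sketch, not the real wire format: the class names, fields and the `ToyHostMemory` responder are all hypothetical, and only the memory read and write cases are handled.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class TlpKind(Enum):
    MEM_READ = auto()
    MEM_WRITE = auto()
    IO_READ = auto()
    IO_WRITE = auto()
    CFG_READ = auto()
    CFG_WRITE = auto()
    MESSAGE = auto()     # e.g. tunnelled legacy interrupts
    COMPLETION = auto()  # the reply to a read


@dataclass
class Tlp:
    kind: TlpKind
    address: int = 0
    data: bytes = b""
    length: int = 0  # bytes requested, for reads
    tag: int = 0     # lets the requester match a completion to its request


class ToyHostMemory:
    """Stands in for the host: answers memory TLPs from a bytearray."""

    def __init__(self, size: int) -> None:
        self.ram = bytearray(size)

    def handle(self, tlp: Tlp) -> Optional[Tlp]:
        if tlp.kind is TlpKind.MEM_WRITE:
            # Memory writes are posted: nothing comes back.
            self.ram[tlp.address:tlp.address + len(tlp.data)] = tlp.data
            return None
        if tlp.kind is TlpKind.MEM_READ:
            payload = bytes(self.ram[tlp.address:tlp.address + tlp.length])
            return Tlp(TlpKind.COMPLETION, data=payload, tag=tlp.tag)
        raise NotImplementedError(tlp.kind)


host = ToyHostMemory(4096)
host.handle(Tlp(TlpKind.MEM_WRITE, address=0x100, data=b"\xde\xad\xbe\xef"))
reply = host.handle(Tlp(TlpKind.MEM_READ, address=0x100, length=4, tag=7))
print(reply.kind.name, reply.data.hex())
```

A message-signaled interrupt fits the same model: it is just a memory write to an address the interrupt controller watches.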
145
00:09:36,040 --> 00:09:39,250
We can just read all memory, dump
the kernel. Hey, it’s awesome, right?
146
00:09:39,250 --> 00:09:41,470
Except there’s an IOMMU in the APU.
147
00:09:41,470 --> 00:09:46,180
But... first, the IOMMU will protect
the devices. It will only let you access
148
00:09:46,180 --> 00:09:50,430
what memory is mapped to your device.
So the host has to allow you
149
00:09:50,430 --> 00:09:53,070
to read and write to memory.
But just because there’s an IOMMU
150
00:09:53,070 --> 00:09:58,190
doesn’t mean that Sony uses it properly.
Here’s some pseudo-code,
151
00:09:58,190 --> 00:10:01,390
it has a buffer on the stack, it says:
“please read from flash to this buffer”
152
00:10:01,390 --> 00:10:04,810
with the correct length. Can anyone
see the problem with this code?
153
00:10:04,810 --> 00:10:09,290
Well, it maps the buffer and it
reads and it unmaps the buffer.
154
00:10:09,290 --> 00:10:13,100
But IOMMUs don’t just map
byte “foo” to byte “bar”,
155
00:10:13,100 --> 00:10:16,570
they map pages, and
pages are 64k on the PS4.
156
00:10:16,570 --> 00:10:19,910
So Sony has just mapped 64k
of its stack to the device so
157
00:10:19,910 --> 00:10:25,720
it can just DMA straight into the stack,
basically the whole stack, and take over.
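[Editor's note] The page-granularity problem can be shown with a few lines of arithmetic. This is a sketch under stated assumptions: the 64k page size is the figure quoted in the talk, while the buffer address and length are made-up numbers for illustration.

```python
PAGE_SIZE = 64 * 1024  # page size quoted in the talk


def exposed_range(buf_addr: int, buf_len: int, page_size: int = PAGE_SIZE):
    """Return (start, end) of what a device can reach once a buffer is mapped.

    An IOMMU maps whole pages, so the range is rounded outward to page
    boundaries on both sides of the buffer.
    """
    start = buf_addr - (buf_addr % page_size)
    last_byte = buf_addr + buf_len - 1
    end = (last_byte // page_size + 1) * page_size
    return start, end


# Hypothetical numbers: a 512-byte buffer somewhere on a kernel stack.
buf_addr, buf_len = 0x7FFE_1230, 512
start, end = exposed_range(buf_addr, buf_len)
exposed = end - start
print(f"asked for {buf_len} bytes, device can touch {exposed} bytes "
      f"({hex(start)}..{hex(end)})")
```

With these numbers the driver asks for 512 bytes and the device ends up with access to a full 65536-byte page, which is everything else that happens to live next to the buffer.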
158
00:10:25,720 --> 00:10:29,660
Now we got code execution, FreeBSD
kernel dump, and WebKit and OS libs dump,
159
00:10:29,660 --> 00:10:32,500
just from mapping the flash.
160
00:10:32,500 --> 00:10:36,080
Okay, that’s step zero.
We have the code.
161
00:10:36,080 --> 00:10:39,930
But that’s not the PS4 that we did this
on, it was a giant mess of wires.
162
00:10:39,930 --> 00:10:43,019
Someone here knows about that,
you know, flying over on Facebook.
163
00:10:43,019 --> 00:10:46,480
We don’t make a ‘nice’ exploit.
We’ve done that because, as I said,
164
00:10:46,480 --> 00:10:50,089
WebKit, FreeBSD, whatever.
What comes after that?
165
00:10:50,089 --> 00:10:55,010
We want to do something.
Of course we want to run Linux, duh!
166
00:10:55,010 --> 00:10:58,590
How do you go from FreeBSD to Linux?
It’s not a trivial process.
167
00:10:58,590 --> 00:11:02,660
But you use something
that we call “ps4-kexec”.
168
00:11:02,660 --> 00:11:06,640
So how does this work? It’s simple,
right? You just want to run Linux?
169
00:11:06,640 --> 00:11:10,190
Just ‘jmp’ to Linux, right?
Well… kind of.
170
00:11:10,190 --> 00:11:13,180
You need to load Linux into contiguous
physical RAM, set up boot parameters,
171
00:11:13,180 --> 00:11:16,700
shut down FreeBSD cleanly, halt secondary
CPUs, make new pagetables etc.
172
00:11:16,700 --> 00:11:19,540
A lot of random things. I’m not going to
bore you with this crap because you
173
00:11:19,540 --> 00:11:23,459
can read the code. But there’s a lot
of iteration in getting this to work.
174
00:11:23,459 --> 00:11:26,930
Let’s assume that you do all this magical
cleanup and you get Linux into
175
00:11:26,930 --> 00:11:32,850
a nice state and you can ‘jmp’ Linux.
Now we jmp Linux, right? It’s cool.
176
00:11:32,850 --> 00:11:35,440
Yeah, you can technically jmp to Linux,
and it will technically run
177
00:11:35,440 --> 00:11:41,370
…for a little bit. And it will stop.
178
00:11:41,370 --> 00:11:45,290
And you will not get any serial or any
video or anything. What’s going on here?
179
00:11:45,290 --> 00:11:49,430
Let’s talk about hardware.
What is x86?
180
00:11:49,430 --> 00:11:53,050
x86 is a mediocre instruction set
architecture by Intel.
181
00:11:53,050 --> 00:11:56,190
It’s okay, I guess.
It’s not great.
182
00:11:56,190 --> 00:12:00,250
PS4 is definitely x86, it’s x86-64.
183
00:12:00,250 --> 00:12:03,580
What is a PC? Aah!
PC is a horrible, horrible thing
184
00:12:03,580 --> 00:12:07,220
built upon piles and piles of legacy crap
dating back to 1981.
185
00:12:07,220 --> 00:12:10,310
The PS4 is definitely -not- a PC.
186
00:12:10,310 --> 00:12:15,190
That’s practically Sony-level hardware fail,
so it could be, but it’s not.
187
00:12:15,190 --> 00:12:19,480
So what’s going on? A legacy PC
188
00:12:19,480 --> 00:12:22,660
basically has an 8259 Programmable
Interrupt Controller,
189
00:12:22,660 --> 00:12:27,360
an 8253 Programmable Interval Timer,
a UART at I/O 3f8h,
190
00:12:27,360 --> 00:12:29,399
which is the standard address
for a serial port.
191
00:12:29,399 --> 00:12:33,709
It has a PS/2 keyboard controller, 8042.
It has an RTC, a real-time clock
192
00:12:33,709 --> 00:12:35,510
with a CMOS, and everyone
knows the CMOS, right?
193
00:12:35,510 --> 00:12:40,240
MC146818 is the chip number for that. An
ISA bus – even if you think you don’t have
194
00:12:40,240 --> 00:12:43,010
an ISA bus your computer has an ISA bus
inside the southbridge somewhere.
195
00:12:43,010 --> 00:12:48,019
And it has VGA.
The PS4 doesn’t have -any- of these things.
196
00:12:48,019 --> 00:12:51,880
So what do we do?
Let’s look a little bit how a PC works
197
00:12:51,880 --> 00:12:55,760
and how a PS4 works. This is a general
simple PC system. There’s an APU
198
00:12:55,760 --> 00:13:00,170
or an Intel Core CPU with a southbridge,
Intel calls it PCH, AMD FCH.
199
00:13:00,170 --> 00:13:03,750
There’s an interface that is basically
PCIe although Intel calls it DMI and AMD
200
00:13:03,750 --> 00:13:08,270
calls it UMI. DDR3 RAM and a bunch
of peripherals and SATA, whatever.
201
00:13:08,270 --> 00:13:12,120
The PS4 kind of looks like that, right?
So you think this can’t be that dif…
202
00:13:12,120 --> 00:13:15,810
What’s so hard about this?
Because all the crap I mentioned earlier
203
00:13:15,810 --> 00:13:20,410
is in the southbridge on a PC, right?
The PS4 has a southbridge, right?
204
00:13:20,410 --> 00:13:23,870
Right? Right? Umm… so
the southbridge, the AMD standard FCH
205
00:13:23,870 --> 00:13:27,959
implements Intel legacy from 1981.
The Marvell Aeolia
206
00:13:27,959 --> 00:13:31,030
– Marvell is the maker of the PS4
southbridge – implements Intel legacy
207
00:13:31,030 --> 00:13:35,550
from 2002. What does that mean?
Ah! That’s no southbridge,
208
00:13:35,550 --> 00:13:40,300
that’s a Marvell Armada SoC!
So it’s not actually a southbridge,
209
00:13:40,300 --> 00:13:43,760
it was never a southbridge.
It’s an ARM system-on-a-chip CPU
210
00:13:43,760 --> 00:13:47,120
with everything. It’s a descendant
from Intel StrongARM or XScale.
211
00:13:47,120 --> 00:13:49,120
It has a bunch of peripherals.
And what they did is, they stuck
212
00:13:49,120 --> 00:13:53,240
a PCIe bridge on the side and said: “Hey
x86, you can now use all my ARM shit.”
213
00:13:53,240 --> 00:13:56,270
So it exposes all of its ARM peripherals
to the x86. They added some stuff
214
00:13:56,270 --> 00:13:59,100
they really needed for PCs
and it has its own RAM.
215
00:13:59,100 --> 00:14:03,720
Why do they do this? Well, it also runs
FreeBSD on the ARM in standby mode.
216
00:14:03,720 --> 00:14:06,019
And that’s how they do the whole
“download updates in the background,
217
00:14:06,019 --> 00:14:08,760
get content, update, whatever”.
All that crap is because they have
218
00:14:08,760 --> 00:14:12,851
a separate OS on a separate chip running
in standby mode. Okay, that’s great, but
219
00:14:12,851 --> 00:14:17,860
it’s also batshit insane.
laughter
220
00:14:17,860 --> 00:14:21,540
Quick recap: This is what a
PCIe bus number looks like,
221
00:14:21,540 --> 00:14:24,459
sorry, a device number.
It has a bus number, which is 8 bits,
222
00:14:24,459 --> 00:14:27,980
a device number, which is 5 bits,
and a function number, which is 3 bits.
223
00:14:27,980 --> 00:14:31,339
You’ve probably seen this in lspci
if you’ve ever done that.
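[Editor's note] The 8/5/3-bit split can be sketched as a pair of pack and unpack helpers. The function names are hypothetical, and the example address is arbitrary; the output format is the bus:device.function form that lspci prints.

```python
def pack_bdf(bus: int, device: int, function: int) -> int:
    """Pack bus (8 bits), device (5 bits), function (3 bits) into 16 bits."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("field out of range")
    return (bus << 8) | (device << 3) | function


def unpack_bdf(bdf: int) -> tuple[int, int, int]:
    """Split a 16-bit value back into (bus, device, function)."""
    return (bdf >> 8) & 0xFF, (bdf >> 3) & 0x1F, bdf & 0x7


def format_bdf(bdf: int) -> str:
    """Render in the bus:device.function form that lspci prints."""
    bus, device, function = unpack_bdf(bdf)
    return f"{bus:02x}:{device:02x}.{function:x}"


bdf = pack_bdf(0x00, 0x14, 4)  # arbitrary example address
print(hex(bdf), format_bdf(bdf))  # 0xa4 00:14.4
```

Three function bits are why a single device tops out at eight functions.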
224
00:14:31,339 --> 00:14:34,480
This is what a regular southbridge
looks like. It has a USB controller,
225
00:14:34,480 --> 00:14:38,180
a PCI, ISA bridge, SATA, whatever.
And it has a bunch of devices.
226
00:14:38,180 --> 00:14:41,110
So one southbridge pretends
to be multiple devices.
227
00:14:41,110 --> 00:14:43,769
Because you only have three bits
for a function number so you can only have
228
00:14:43,769 --> 00:14:47,200
up to eight functions in one device.
229
00:14:47,200 --> 00:14:48,860
Intel southbridge just says:
“I’m device 14, 16, 1a, 1…,
230
00:14:48,860 --> 00:14:51,860
I’m just a bunch of devices,
and you can talk to all of them.”
231
00:14:51,860 --> 00:14:57,670
If you lspci on a roughly unpatched
Linux kernel on the PS4
232
00:14:57,670 --> 00:15:00,649
you get something like this.
So the Aeolia first of all
233
00:15:00,649 --> 00:15:03,740
clones itself into every PCIe device
because they were too lazy to do
234
00:15:03,740 --> 00:15:08,110
“if device equals my number then
reply, otherwise don’t reply”. No,
235
00:15:08,110 --> 00:15:11,470
they just said: “Oh, just reply to every
single PCIe device that might query”.
236
00:15:11,470 --> 00:15:16,870
Linux sees the southbridge 31 different
times, which is kind of annoying
237
00:15:16,870 --> 00:15:20,380
because it gets really confused when it
sees 31 clones of the same southbridge.
238
00:15:20,380 --> 00:15:24,540
And then it has eight functions:
ACPI, ethernet, SATA, SDMC, PCIe,…
239
00:15:24,540 --> 00:15:27,839
Eight functions, so all three bits.
240
00:15:27,839 --> 00:15:29,790
Turns out, eight functions
are not enough for everybody.
241
00:15:29,790 --> 00:15:34,490
Function no. 4, “PCI Express Glue”, has a
bridge config, MSI interrupt controller,
242
00:15:34,490 --> 00:15:37,410
ICC – we’ll talk about that later –,
HPET timers, Flash controller,
243
00:15:37,410 --> 00:15:44,920
RTC, timers, 2 serial ports, I2C… All
this smashed into one single PCIe device.
244
00:15:44,920 --> 00:15:49,210
Linux has a minimum system requirement
to run on anything.
245
00:15:49,210 --> 00:15:53,520
You need a timer, you need interrupts,
and you need some kind of console.
246
00:15:53,520 --> 00:15:57,010
The PS4 has no PIT, no PIC and no standard
serial so none of the standard PC stuff
247
00:15:57,010 --> 00:16:01,639
is going to work here. The board has
test points for an 8250 standard serial
248
00:16:01,639 --> 00:16:05,529
in a different place. So we run
dmesg over that, okay, fine.
249
00:16:05,529 --> 00:16:08,300
Linux has earlycon which we can
point to a serial port and say:
250
00:16:08,300 --> 00:16:11,221
“Please send all your dmesg here
very early because I really want to see
251
00:16:11,221 --> 00:16:16,030
what’s going on”. Doesn’t need IRQs,
you set console=uart8250,
252
00:16:16,030 --> 00:16:20,420
the type, the address, the speed.
And you’ll see it says 3200 instead of
253
00:16:20,420 --> 00:16:23,420
115 kBaud. That’s because their clock
is different. So you set 3200 but
254
00:16:23,420 --> 00:16:27,540
it really means 115k.
And that gets you dmesg.
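[Editor's note] One plausible reading of the 3200 trick, sketched below, assumes the early console code derives the 16550-style divisor from the classic 1.8432 MHz PC UART clock. The implied real clock is derived from that assumption and is not a figure stated in the talk.

```python
STANDARD_PC_UART_CLK = 1_843_200  # Hz, the classic PC UART input clock


def divisor_for(baud: int, uart_clk: int = STANDARD_PC_UART_CLK) -> int:
    """16550-style divisor: the UART divides its clock by 16 * divisor."""
    return round(uart_clk / (16 * baud))


def actual_baud(divisor: int, uart_clk: int) -> float:
    """Baud rate that really appears on the wire for a given input clock."""
    return uart_clk / (16 * divisor)


# What gets programmed when the kernel is told "3200" and assumes a PC clock.
div = divisor_for(3200)

# If the wire really runs at 115200 with that divisor, the real clock must be:
implied_clk = 115_200 * 16 * div  # derived, not stated in the talk

print(div, implied_clk, actual_baud(div, implied_clk))
```

The point is only that the requested rate and the real rate differ by the ratio of the two clocks, so a deliberately wrong number yields the right speed.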
255
00:16:27,540 --> 00:16:29,710
That actually gets you “Linux booting,
uncompressing”, whatever.
256
00:16:29,710 --> 00:16:32,400
That’s pretty good.
257
00:16:32,400 --> 00:16:36,540
Okay, we need a timer.
Because otherwise everything explodes.
258
00:16:36,540 --> 00:16:40,360
Linux supports the TSC, a built-in CPU
timer which is super nice and super fun.
259
00:16:40,360 --> 00:16:44,420
And PS4 has that. But Linux tries to
calibrate it against the legacy timer
260
00:16:44,420 --> 00:16:47,430
which on the PS4 doesn’t exist
so that’s fail.
261
00:16:47,430 --> 00:16:52,149
So again, the PS4 -really- is not a PC.
262
00:16:52,149 --> 00:16:54,270
What we need to do here is
define a new subarchitecture
263
00:16:54,270 --> 00:16:58,519
because Linux supports this concept.
Says: “this is not a PC, this is a PS4”.
264
00:16:58,519 --> 00:17:01,290
The bootloader tells Linux:
“Hey! This is a PS4!”
265
00:17:01,290 --> 00:17:04,010
And then Linux says: “Okay, I’m not gonna
do the old timestamp calibration,
266
00:17:04,010 --> 00:17:07,829
I’m gonna do it for the PS4” which has
a special code that we wrote
267
00:17:07,829 --> 00:17:11,339
that calibrates against the PS4 timer.
And it disables the legacy crap.
268
00:17:11,339 --> 00:17:13,790
So now this is officially
not a PC anymore.
269
00:17:13,790 --> 00:17:18,539
Now we can talk about ACPI.
270
00:17:18,539 --> 00:17:21,479
You might know ACPI for all its
horribleness and all its evilness
271
00:17:21,479 --> 00:17:25,059
and all its Microsoft-y-ness.
ACPI - most people associate it with
272
00:17:25,059 --> 00:17:28,069
“Suspend” and “Suspend to Hibernate”.
It’s not just power,
273
00:17:28,069 --> 00:17:31,940
it does other stuff, too.
So we need ACPI for PCI config,
274
00:17:31,940 --> 00:17:34,139
for the IOMMU, for the CPU frequency.
275
00:17:34,139 --> 00:17:38,389
The PS4 of course has broken ACPI tables
because, of course it would be.
276
00:17:38,389 --> 00:17:42,190
So we fixed them in ps4-kexec.
277
00:17:42,190 --> 00:17:44,789
Now interrupts. We have timers,
we have serial, we fixed some stuff.
278
00:17:44,789 --> 00:17:48,619
The PS4 does message-signaled interrupts
which is, what I said, the non-legacy,
279
00:17:48,619 --> 00:17:51,490
the nice new thing where you just write
a value, and what you do is you tell
280
00:17:51,490 --> 00:17:55,129
the device when you want to interrupt
“please write this value to this address”.
281
00:17:55,129 --> 00:17:58,450
The device does that, and the CPU
interrupt controller sees that write
282
00:17:58,450 --> 00:18:01,049
and says: “Oh, this is an interrupt”
and then just fires off that interrupt
283
00:18:01,049 --> 00:18:06,490
into the CPU. That’s great.
It’s super fast and very efficient.
284
00:18:06,490 --> 00:18:08,739
And the value directly tells the CPU:
“That’s the interrupt vector you have
285
00:18:08,739 --> 00:18:14,460
to go to”. Okay, that’s the standard MSI
way there. Your computer does MSI that way.
286
00:18:14,460 --> 00:18:19,700
This is how the PS4 does MSI: The Aeolia
ignores the MSI config registers
287
00:18:19,700 --> 00:18:24,419
in the standard location. Instead it
has its own MSI controller,
288
00:18:24,419 --> 00:18:28,279
all stuff that’s in Function 4,
which is that “glue” device.
289
00:18:28,279 --> 00:18:32,460
Each function gets a shared address in
memory to write to and the top 27 bits
290
00:18:32,460 --> 00:18:36,119
of data. And every subfunction, because
they stuffed a lot of things into one place,
291
00:18:36,119 --> 00:18:40,309
only gets the different 5 bits.
And all MSIs originate from Function 4,
292
00:18:40,309 --> 00:18:43,399
so this device has to fire an interrupt,
then it goes to here, and then
293
00:18:43,399 --> 00:18:48,700
that device fires an interrupt. Like… what…
this is all… what the hell is going on?
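[Editor's note] The 27-bit plus 5-bit split described above can be sketched as follows. This assumes a 32-bit MSI data word with the per-subfunction bits at the bottom; the helper names and example values are hypothetical.

```python
SUBFUNC_BITS = 5
SUBFUNC_MASK = (1 << SUBFUNC_BITS) - 1


def msi_data(function_top_bits: int, subfunction: int) -> int:
    """Combine a function's shared top 27 bits with a 5-bit subfunction id."""
    if not 0 <= function_top_bits < (1 << 27):
        raise ValueError("top bits must fit in 27 bits")
    if not 0 <= subfunction <= SUBFUNC_MASK:
        raise ValueError("subfunction must fit in 5 bits")
    return (function_top_bits << SUBFUNC_BITS) | subfunction


def split_msi_data(data: int) -> tuple[int, int]:
    """Recover (shared top bits, subfunction id) from a data word."""
    return data >> SUBFUNC_BITS, data & SUBFUNC_MASK


# Hypothetical example: four subfunctions inside one function.
words = [msi_data(0x2, sub) for sub in range(4)]
print([hex(w) for w in words])  # ['0x40', '0x41', '0x42', '0x43']
```

Every subfunction of one function produces a data word with the same top 27 bits, so only the low five bits tell them apart.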
294
00:18:48,700 --> 00:18:53,769
Seriously, this is really fucked up. And
– the i’s are missing in the front there.
295
00:18:53,769 --> 00:18:59,299
But yeah. So, driver hell. Now the devices
are interdependent. Then the IRQ vector
296
00:18:59,299 --> 00:19:02,831
location is not sequential, so that’s not
gonna work. And you need to modify
297
00:19:02,831 --> 00:19:07,590
all the drivers. This is really painful to
develop for. So what we ended up doing
298
00:19:07,590 --> 00:19:11,950
is there is a core driver that implements
an interrupt controller for this thing.
299
00:19:11,950 --> 00:19:15,779
And then we have to make sure that loads
first, before the device driver. So Linux
300
00:19:15,779 --> 00:19:19,399
has a mechanism for that. And we had to
patch the drivers. Some drivers we patched
301
00:19:19,399 --> 00:19:22,820
to use these interrupts. And others
we wrapped around to use these interrupts.
302
00:19:22,820 --> 00:19:26,350
Unfortunately, because of the top bit
thing, everything has to share one interrupt
303
00:19:26,350 --> 00:19:31,279
within a function. Thankfully, we can fix
that with the IOMMU because it can
304
00:19:31,279 --> 00:19:34,320
redirect interrupts. So we can say:
“Oh, interrupt no. 0 goes to here,
305
00:19:34,320 --> 00:19:39,209
1 goes to here, 2 goes to here…”.
That’s great ’cause it’s consecutive, right?
306
00:19:39,209 --> 00:19:45,490
0 1 2 3 4 5… it’s obviously gonna have
the same top bits. But we have to fix
307
00:19:45,495 --> 00:19:49,152
the ACPI table for that because it’s
broken. But this does work. So this
308
00:19:49,152 --> 00:19:54,109
gets us interrupts that function and
they’re individual. So let’s look at
309
00:19:54,109 --> 00:19:58,220
the check list: we have interrupts, timers,
early serial, late serial with interrupts.
310
00:19:58,220 --> 00:20:03,169
We can get some user space, we can stash
some user space and binaries into the kernel.
311
00:20:03,169 --> 00:20:06,060
And it will boot and you can get a console,
but you get a console and you try
312
00:20:06,060 --> 00:20:12,880
writing commands and sometimes it hangs.
Okay. What’s going on there?
313
00:20:12,880 --> 00:20:16,700
So it turns out that FreeBSD masks
interrupts with an AMD proprietary
314
00:20:16,700 --> 00:20:21,149
register set. We had to clean that up,
too. And that fixes serial,
315
00:20:21,149 --> 00:20:24,729
and all the other interrupts.
This took ages to find. It’s like: “why…
316
00:20:24,729 --> 00:20:26,909
interrupts on CPU serial
sometimes don’t…, yeah”.
317
00:20:26,909 --> 00:20:33,789
I ended up dumping register sets,
and I saw this #FFFFF here, not #FFFFF,
318
00:20:33,789 --> 00:20:39,350
what’s that? But tracking through this
stack to find this was really annoying.
319
00:20:39,350 --> 00:20:45,780
Alright. So we have the basics. We have
like a core platform we can run Linux on,
320
00:20:45,780 --> 00:20:49,500
even though it won’t do anything
interesting. Add drivers!
321
00:20:49,500 --> 00:20:54,450
So we have USB xHCI which has three
controllers in one device. Again, because
322
00:20:54,450 --> 00:20:59,899
“Let’s make it insane!”. We have SDHCI,
that’s SDIO for the Wi-Fi and the Bluetooth.
323
00:20:59,899 --> 00:21:03,509
Needs a non-standard config, it needs
quirks. Ethernet needs more hacks.
324
00:21:03,509 --> 00:21:07,139
It’s still partially broken, it only runs at
Gigabit speed. If you plug in a 100Mbit/s
325
00:21:07,139 --> 00:21:10,320
switch it just doesn’t send any data.
Not sure why.
326
00:21:10,320 --> 00:21:13,809
And then all of this worked fine in
Linux 4.4, and then just three days ago
327
00:21:13,809 --> 00:21:18,190
I think I tried to rebase on 4.9, and so
we have the latest and the greatest.
328
00:21:18,190 --> 00:21:21,249
And everything failed. And DMA didn’t
work. And all the drivers were just
329
00:21:21,249 --> 00:21:24,200
throwing their hands up in the air,
“what’s going on here?”.
330
00:21:24,200 --> 00:21:27,279
exhales
Aeolia strikes back. So.
331
00:21:27,279 --> 00:21:32,549
That’s what… the Aeolia looks like,
normally. So you have… again,
332
00:21:32,549 --> 00:21:36,690
it’s an ARM SoC, it’s really not a device.
It’s like its own little system. But
333
00:21:36,690 --> 00:21:40,750
it maps its low 2 GB of the address space
to memory on the PC. And then the PC
334
00:21:40,750 --> 00:21:45,080
has a window into its registers that it
can use to control those devices.
335
00:21:45,080 --> 00:21:48,429
So the PC can kind of play with the
devices, and the DMA is to the same address
336
00:21:48,429 --> 00:21:53,149
and that works great. Because it’s mapped
in the same place. And then it has its own RAM,
337
00:21:53,149 --> 00:21:58,580
in its own address space. This works fine.
But now we had an IOMMU. Because
338
00:21:58,580 --> 00:22:01,869
we needed it for the interrupts. And the
IOMMU inserts its own address space
339
00:22:01,869 --> 00:22:05,190
in between and says: “Okay, you can map
anything to anything you want, that’s great.“
340
00:22:05,190 --> 00:22:08,320
It’s a page table, you can say “this
address goes to that address.”
341
00:22:08,320 --> 00:22:13,099
Linux 4.4 did this: it would find some
addresses at the bottom of the IOMMU
342
00:22:13,099 --> 00:22:17,659
address space, say: “page 1 goes to this,
page 2 goes to that, page 3 goes to that”.
343
00:22:17,659 --> 00:22:22,870
And say: “device, you can now write to these
pages”. And they go to this place in the x86.
344
00:22:22,870 --> 00:22:28,200
That worked fine. It turns out Linux 4.9,
or somewhere between 4.4 and 4.9
345
00:22:28,200 --> 00:22:32,549
it started doing this: it would map pages
from the top of the IOMMU address space
346
00:22:32,549 --> 00:22:36,749
and that’s fine for the IOMMU but it’s
not in the window in the Aeolia, so
347
00:22:36,749 --> 00:22:42,140
you say “ethernet DMA to address
FExxx”, and instead of DMA-ing
348
00:22:42,140 --> 00:22:49,830
to the RAM on the PC it DMA-s to the RAM
on the Aeolia which is not gonna work.
349
00:22:49,830 --> 00:22:53,980
Effectively the Aeolia implements 31 bit
DMA, not 32 bit DMA because only
350
00:22:53,980 --> 00:23:00,009
the bottom half is usable. It’s like why…
this is all really fucked up, guys!
351
00:23:00,009 --> 00:23:03,799
Seriously. And this is littered all over
the code in Linux, so it needed
352
00:23:03,799 --> 00:23:07,409
more patches, and it works, but, yeah.
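[Editor's sketch] The window problem described above can be modelled in a few lines. This is only a toy, assuming the 2 GB window mentioned in the talk; the function names are hypothetical and nothing here is taken from the actual patches.

```python
# Toy model of the Aeolia DMA window as described in the talk.
# Assumption: bus addresses below 2 GB reach PC memory, everything above
# lands in the Aeolia's own address space. Names are illustrative only.

WINDOW_SIZE = 2 * 1024**3          # low 2 GB window into PC memory
DMA_MASK_31BIT = (1 << 31) - 1     # highest usable bus address


def dma_target(bus_addr: int) -> str:
    """Return where a device DMA to bus_addr ends up."""
    if not 0 <= bus_addr < 1 << 32:
        raise ValueError("bus address must fit in 32 bits")
    return "pc" if bus_addr < WINDOW_SIZE else "aeolia"


def usable_for_dma(bus_addr: int, length: int) -> bool:
    """True if the whole buffer stays within the 31-bit mask."""
    return bus_addr >= 0 and length > 0 and bus_addr + length - 1 <= DMA_MASK_31BIT


# Bottom-up allocation (the Linux 4.4 behaviour described above) works:
print(dma_target(0x0000_1000))      # pc
# Top-down allocation (the later behaviour) misses the window:
print(dma_target(0xFE00_0000))      # aeolia
```

In the kernel a limit like this is usually expressed as a DMA mask on the device; whether the project's patches do exactly that is not stated in the talk.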
353
00:23:07,409 --> 00:23:11,029
Painful. Okay. Devices: Aeolia (?)
devices work.
354
00:23:11,029 --> 00:23:16,259
Now for something completely different.
Who can tell me who this character is?
355
00:23:16,259 --> 00:23:20,659
That’s Starsha from Space Battleship Yamato.
And apparently that’s the code name
356
00:23:20,659 --> 00:23:24,840
for the PS4 graphics chip. Or at least that’s
one of the code names. Because
357
00:23:24,840 --> 00:23:27,940
they don’t seem to be able to agree
on like what the code names are.
358
00:23:27,940 --> 00:23:31,860
It’s got “Liverpool” in some places, and
“Starsha” in other places. Then “ThebeJ”
359
00:23:31,860 --> 00:23:36,210
in other places. And we think Sony calls
it “Starsha” and AMD calls it “Liverpool”
360
00:23:36,210 --> 00:23:39,789
but we’re not sure. We are calling it
“Liverpool” everywhere just to avoid
361
00:23:39,789 --> 00:23:43,660
confusion. Okay.
What’s this GPU about?
362
00:23:43,660 --> 00:23:47,230
Well, it’s an AMD Sea
Islands generation GPU,
363
00:23:47,230 --> 00:23:52,940
which is spelled CI instead of SI because
“S” was taken. It’s similar to other chips
364
00:23:52,940 --> 00:23:57,969
in the generation. So at least that’s
not a batshit crazy new thing.
365
00:23:57,969 --> 00:24:00,950
But it does have quirks and customizations
and oddities and things that don’t work.
366
00:24:00,950 --> 00:24:03,769
What we did is we took Bonaire which is
another GPU that is already supported
367
00:24:03,769 --> 00:24:06,919
by Linux in that generation, and just kind
of added a new chip and said, okay,
368
00:24:06,919 --> 00:24:12,769
do all the Bonaire stuff, and then change
things. And hopefully adapt it to the PS4.
369
00:24:12,769 --> 00:24:16,440
So hacking AMD drivers, okay, well,
they’re open-source but AMD does not
370
00:24:16,440 --> 00:24:20,190
publish register docs. They publish 3D
shader and command queue documentation,
371
00:24:20,190 --> 00:24:24,280
so we get all the user space 3D rendering
commands, that’s documented. But they
372
00:24:24,280 --> 00:24:27,609
don’t publish all the kernel hardware
register documentation. That’s what
373
00:24:27,609 --> 00:24:30,740
we really want for hacking on drivers. So
that’s annoying. And you’re thinking
374
00:24:30,740 --> 00:24:34,389
“the code is the documentation”,
right? “Just read the Linux drivers”.
375
00:24:34,389 --> 00:24:39,299
That’s great. Yeah, but they’re incomplete,
then they have magic numbers, and
376
00:24:39,299 --> 00:24:43,229
it’s, you know, you don’t know if you need
to write a new register that’s not there,
377
00:24:43,229 --> 00:24:47,399
and it really sucks to try to write a GPU
driver by reading other GPU drivers
378
00:24:47,399 --> 00:24:50,840
with no docs. So what do we do? We’re
hackers, right? We google. Everytime
379
00:24:50,840 --> 00:24:54,480
we need information, hopefully Google will
find it because Google knows everything.
380
00:24:54,480 --> 00:24:59,109
And any tip that you could find in any
forum or code dumped somewhere is
381
00:24:59,109 --> 00:25:05,850
great. One of the things we found is we
googled this little string, “R8XXGPU”.
382
00:25:05,850 --> 00:25:10,730
And we get nine results. And the second
result is this place, it’s “Siliconkit”,
383
00:25:10,730 --> 00:25:15,629
token, was that okay? It’s an XML file.
And if we look at that it looks like
384
00:25:15,629 --> 00:25:21,499
it’s an XML file that contains a dump of
the Bonaire GPU register documentation.
385
00:25:21,499 --> 00:25:26,389
But it’s like broken XML, and it’s
incomplete, it stops at one point.
386
00:25:26,389 --> 00:25:31,379
But like: “what’s this doing here?”
And where did this come from, right?
387
00:25:31,379 --> 00:25:35,539
So let’s dig a little deeper. Okay Google,
what do you know about this website?
388
00:25:35,539 --> 00:25:39,789
Well, there’s some random things like
whatthehellno.txt and whatthehellyes.txt
389
00:25:39,789 --> 00:25:46,200
and some Excel files. Those are
really Excel-like XML spreadsheets.
390
00:25:46,200 --> 00:25:50,890
And then there’s a thing in the (?) there
called RAI.GRAMMAR.4.TXT.
391
00:25:50,890 --> 00:25:56,960
I wonder what that is. And it looks like
it’s a grammar, a BNF notation description
392
00:25:56,960 --> 00:26:03,490
for a syntax, of some kind of register
documentation file. This looks like
393
00:26:03,490 --> 00:26:10,749
an AMD internal format but it’s on this
website. Okay. So we have these two URLs,
394
00:26:10,749 --> 00:26:14,559
/pragmatic/bonaire.xml
and /RAI/rai.grammar4.txt.
395
00:26:14,559 --> 00:26:22,199
Let’s try something. How about maybe
/pragmatic/bonaire.rai – nah, it’s a 404.
396
00:26:22,199 --> 00:26:26,539
Okay, /pragmatic/RAI/bonaire.rai – aah!
Bingo!
397
00:26:26,539 --> 00:26:34,869
laughter and applause
398
00:26:34,869 --> 00:26:39,249
So this is a full – almost full Bonaire
register documentation with like
399
00:26:39,249 --> 00:26:44,350
full register field descriptions, breakdowns,
all the addresses. It’s not 100% but
400
00:26:44,350 --> 00:26:48,829
like the vast majority. This seems to
be AMD-internal stuff. And I looked
401
00:26:48,829 --> 00:26:53,469
this guy up, and apparently he worked
at AMD at some point. So…
402
00:26:53,469 --> 00:26:56,849
But yeah… This is really, really helpful
because now you know what everything
403
00:26:56,849 --> 00:27:03,249
means, and debug registers, and… yeah.
So I wrote a working parser for this format.
404
00:27:03,249 --> 00:27:06,559
This guy was effectively writing a parser,
something to convert this thing to XML,
405
00:27:06,559 --> 00:27:10,833
but it was all broken. Oh – he was writing
it in PHP, by the way, so there you go …
406
00:27:10,833 --> 00:27:14,580
So I wrote a working one in Python and
you can dump it and then you can see
407
00:27:14,580 --> 00:27:18,309
what each register means, and it’ll tell
you all the options. You can take
408
00:27:18,309 --> 00:27:22,519
a register dump and map it to the (?)(?)
documented. You can diff dumps,
409
00:27:22,519 --> 00:27:26,529
you can generate defines, it’s very useful
for AMD GPUs. And this, grossly speaking
410
00:27:26,529 --> 00:27:31,109
applies to a lot of AMD GPUs, like they
share a lot of registers. So this is useful
411
00:27:31,109 --> 00:27:36,090
for anyone hacking on AMD GPU stuff. Over
4,000 registers are documented in the …
412
00:27:36,090 --> 00:27:42,019
just in the main GPU address space alone.
That’s great. Okay. So we have some docs.
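[Editor's sketch] Decoding a register dump against field documentation, and diffing two dumps, might look roughly like this. The register layout below is invented for illustration and is not taken from the RAI file or from the project's own tool.

```python
# Sketch of decoding a raw register value into named fields.
# The layout and field names below are made up for illustration;
# they are not taken from any AMD documentation.
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    lsb: int      # lowest bit of the field
    width: int    # number of bits


EXAMPLE_REG = [                 # hypothetical 32-bit register layout
    Field("ENABLE", 0, 1),
    Field("MODE", 1, 3),
    Field("COUNT", 16, 14),
]


def decode(value: int, fields: list[Field]) -> dict[str, int]:
    """Extract every field from a raw register value."""
    return {f.name: (value >> f.lsb) & ((1 << f.width) - 1) for f in fields}


def diff_dumps(old: dict[int, int], new: dict[int, int]) -> dict[int, tuple[int, int]]:
    """Return {offset: (old, new)} for registers whose value changed."""
    common = old.keys() & new.keys()
    return {off: (old[off], new[off]) for off in common if old[off] != new[off]}


print(decode(0x3FFF0005, EXAMPLE_REG))
```

Diffing two such dumps is the kind of comparison described later in the talk, when a desktop APU's registers are compared against the PS4's.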
413
00:27:42,019 --> 00:27:49,969
How do we get to a frame buffer? So if you…
Israel (?) is HDMI it’s easy, right? The GPU
414
00:27:49,969 --> 00:27:52,489
has HDMI, and if you query the GPU
information you actually get that it has
415
00:27:52,489 --> 00:27:57,860
an HDMI port and a DisplayPort port. Okay,
maybe it’s unconnected, that’s fine, right?
416
00:27:57,860 --> 00:28:03,509
But if you actually ask the GPU it tells
you: “HDMI is not connected, DP is connected”.
417
00:28:03,509 --> 00:28:09,919
Okay. Yeah, they have an external HDMI
encoder from DisplayPort to HDMI because
418
00:28:09,919 --> 00:28:13,029
just putting a wire from A to B is too
difficult, because this is Sony, so:
419
00:28:13,029 --> 00:28:19,759
“let’s put a chip that converts some
protocol A to protocol B…” sighs
420
00:28:19,759 --> 00:28:25,700
Yeah, yeah.
applause
421
00:28:25,700 --> 00:28:33,549
It’s a Panasonic DisplayPort to HDMI
bridge, not documented by the way.
422
00:28:33,549 --> 00:28:37,429
It needs config to work, that’s why it
doesn’t just work. Even though some bridges do.
423
00:28:37,429 --> 00:28:41,389
And you’d think, okay, it’s hooked up to the
GPU I2C bus, because GPUs have in the past
424
00:28:41,389 --> 00:28:45,309
used these bridges, and, not this one
particularly but other AMD cards have had
425
00:28:45,309 --> 00:28:48,659
various chips that they stuck in front. And
the code has support for talking to them
426
00:28:48,659 --> 00:28:54,309
through the GPU I2C interface, right?
That’s easy. Yay, you wish – it’s a Sony.
427
00:28:54,309 --> 00:28:57,909
sighs
Enter ICC! So, remember the ICC thing
428
00:28:57,909 --> 00:29:02,169
in the Aeolia – it’s an RPC protocol you
use to send commands to an MCU that is
429
00:29:02,169 --> 00:29:05,549
somewhere else on the motherboard. It’s
a message box system, so you write some
430
00:29:05,549 --> 00:29:09,519
message to a memory place, and then you
tell: “Hey, read this message!” and then
431
00:29:09,519 --> 00:29:12,090
it writes some message back, and it tells
you “Hey, it’s the reply!”.
432
00:29:12,090 --> 00:29:15,019
The Aeolia – not the other GPU – uses it for things like
433
00:29:15,019 --> 00:29:20,989
Power Button, the LEDs, turning the power
on and off, and also the HDMI encoder I2C.
434
00:29:20,989 --> 00:29:25,460
So now we have the dependency from the
GPU driver to the Aeolia driver, two different
435
00:29:25,460 --> 00:29:30,200
PCI devices and two different… sighs
Yeah. And okay, again, ICC, but it’s I2C,
436
00:29:30,200 --> 00:29:34,099
you know, I2C is a simple protocol.
You read a register, you write a register,
437
00:29:34,099 --> 00:29:38,549
that’s all you need. It’s super simple.
Right? Now let’s make a byte code
438
00:29:38,549 --> 00:29:41,479
fucking scripting engine to which you send I2C
commands and delays and bit masking
439
00:29:41,479 --> 00:29:47,029
and everything. And why, Sony, why, like
why would you do this? Well, because
440
00:29:47,029 --> 00:29:50,769
ICC is so slow that if you actually tried
to do one read and one write at a time
441
00:29:50,769 --> 00:29:55,500
it takes 2 seconds to bring up HDMI.
exhales
442
00:29:55,500 --> 00:29:57,039
Yeah…
443
00:29:57,039 --> 00:30:01,820
I don’t even know at this point…
applause
444
00:30:01,820 --> 00:30:04,059
I have no idea.
continued applause
445
00:30:04,059 --> 00:30:10,499
And by the way this thing has commands
where you can send scripts in a script
446
00:30:10,499 --> 00:30:13,849
to be run when certain events happen. So
“Yo dawg, I heard you like scripts, I put
447
00:30:13,849 --> 00:30:16,960
scripts in your scripts so you can I2C
while you I2C”. Like: “let’s just go
448
00:30:16,960 --> 00:30:23,769
even deeper at this point”, right? Yeah.
exhales
449
00:30:23,769 --> 00:30:29,009
Okay. We wrote some code for this,
you need more hacks, it needs all
450
00:30:29,009 --> 00:30:33,599
DisplayPort lanes up, Linux tries to downscale,
doesn’t work. Memory bandwidth calculation
451
00:30:33,599 --> 00:30:37,289
is broken. Mouse cursor size is from the
previous GPU generation for some reason,
452
00:30:37,289 --> 00:30:41,750
I guess they forgot to update that. So
with all this crap, we get a frame buffer.
453
00:30:41,750 --> 00:30:47,159
But X won’t start. Ah. Well, it turns out
that PS4 uses a unified memory architecture
454
00:30:47,159 --> 00:30:52,580
so it has a single memory pool that is
shared between the x86 and the GPU.
455
00:30:52,580 --> 00:30:56,110
And games just put a texture in memory
and say: “Hey, GPU, render this!” and
456
00:30:56,110 --> 00:31:00,889
that works great. And this makes a lot of
sense, and their driver uses this to the
457
00:31:00,889 --> 00:31:06,369
fullest extent. So there’s VRAM,
you know, the legacy… GPUs had
458
00:31:06,369 --> 00:31:10,229
a separate VRAM and all these integrated
chip sets can emulate VRAM using a chunk
459
00:31:10,229 --> 00:31:13,739
of the system memory. And you can usually
configure that in the BIOS if you have
460
00:31:13,739 --> 00:31:18,729
a PC that does this. And PS4 sets it to
16 MB which is actually the lowest possible
461
00:31:18,729 --> 00:31:24,659
setting. And 16 Megs is not enough to have
more than one Full HD frame buffer. So,
462
00:31:24,659 --> 00:31:28,519
obviously, that’s going to explode in
Linux pretty badly. So what we do is
463
00:31:28,519 --> 00:31:31,749
we actually reconfigure the memory
controller in the system to give 1 GB
464
00:31:31,749 --> 00:31:36,719
of RAM to the VRAM, and we did it on the
ps4-kexec. So it’s basically doing like
465
00:31:36,719 --> 00:31:41,519
BIOSy things. We were reconfiguring the
Northbridge at this point to make this work.
466
00:31:41,519 --> 00:31:46,299
But it works. And with this we can get X
to start because it can allocate its frame buffer.
467
00:31:46,299 --> 00:31:53,659
But okay, it’s 3D time, right? – Neeaah,
GPU acceleration doesn’t quite work yet.
468
00:31:53,659 --> 00:31:58,560
So we got at least, you know, X but let’s
talk a bit about the Radeon GPU
469
00:31:58,560 --> 00:32:03,179
for a second. So when you want to draw
something on the GPU you send it a command
470
00:32:03,179 --> 00:32:06,289
and you do this by putting it into a ‘ring’
which is really just a structure in memory,
471
00:32:06,289 --> 00:32:11,499
that’s a (?)(?)(?)(?). And it wraps around.
So that way you can queue things to be done
472
00:32:11,499 --> 00:32:15,600
in the GPU, and then it does it on its own
and you can go and do other things.
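[Editor's sketch] A command ring of the kind described above, reduced to its essentials. This is a generic ring-buffer model with illustrative names, not the Radeon driver's data structure.

```python
# Minimal model of a command ring: the CPU advances a write pointer,
# the consumer (the GPU in the talk) advances a read pointer, and both
# wrap around at the end of the buffer. Names are illustrative.

class Ring:
    def __init__(self, size_words: int):
        self.buf = [0] * size_words
        self.size = size_words
        self.rptr = 0   # next word the consumer will read
        self.wptr = 0   # next word the producer will write

    def free_words(self) -> int:
        # One slot stays empty so that rptr == wptr always means "empty".
        return (self.rptr - self.wptr - 1) % self.size

    def submit(self, words: list[int]) -> None:
        """Queue command words; the caller can then go do other things."""
        if len(words) > self.free_words():
            raise BufferError("ring full")
        for w in words:
            self.buf[self.wptr] = w
            self.wptr = (self.wptr + 1) % self.size

    def consume(self) -> list[int]:
        """Drain everything between the read and write pointers."""
        out = []
        while self.rptr != self.wptr:
            out.append(self.buf[self.rptr])
            self.rptr = (self.rptr + 1) % self.size
        return out
```

Keeping one slot unused is a common way to tell a full ring from an empty one; real hardware may handle this differently.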
473
00:32:15,600 --> 00:32:20,330
There’s a Graphics Ring for drawing,
a Compute Ring for GPGPU, and a DMA Ring
474
00:32:20,330 --> 00:32:24,809
for copying things around. The commands
are processed by the GPU Command Processor
475
00:32:24,809 --> 00:32:32,419
which is really a bunch of different CPUs
inside the GPU. They are called F32.
476
00:32:32,419 --> 00:32:36,570
And they run a proprietary AMD microcode.
So this is a custom architecture.
477
00:32:36,570 --> 00:32:40,419
Also the rings can call out to IBs which
are indirect buffers. So you can say
478
00:32:40,419 --> 00:32:44,999
basically “Call this piece of memory, do
this stuff there, return back to the ring”.
479
00:32:44,999 --> 00:32:48,629
And that’s actually how the user space
thing does things. So this says:
480
00:32:48,629 --> 00:32:51,750
“Draw this stuff” and it tells the kernel:
“Hey, draw this stuff”. And the kernel
481
00:32:51,750 --> 00:32:57,269
tells the GPU: “Jump to that stuff,
read it, come back, keep doing stuff”.
482
00:32:57,269 --> 00:33:01,999
This is basically how most GPUs work but
Radeon specifically works like, you know…
483
00:33:01,999 --> 00:33:06,649
with this F32 stuff. Okay. The driver
complains: “Ring 0 test failed”.
484
00:33:06,649 --> 00:33:10,669
Technically (?), you test them, so at least
you know it has nice diagnostic,
485
00:33:10,669 --> 00:33:13,669
and how does the test work? It’s really
easy. It writes a register with a value,
486
00:33:13,669 --> 00:33:16,649
and then it tells the GPU with a command
“Please write this other value
487
00:33:16,649 --> 00:33:21,159
to the register”, runs it and then checks
to see if the register was actually written
488
00:33:21,159 --> 00:33:29,190
with the new value. So the write doesn’t
happen. Thankfully, thanks to that RAI file
489
00:33:29,190 --> 00:33:32,459
earlier we found some debug registers that
tell you exactly what’s going on inside
490
00:33:32,459 --> 00:33:36,809
the GPU. And it shows the Command
Processor is stuck, waiting for data
491
00:33:36,809 --> 00:33:41,549
in the ring, so it needs more data.
After a NOP command?! Yeah…
492
00:33:41,549 --> 00:33:46,950
NOP is hard, let’s go stalling. So packet
headers in this GPU thing have a size
493
00:33:46,950 --> 00:33:51,700
that is SIZE-2. Whoever thought that was
a good idea. So a 2 word packet
494
00:33:51,700 --> 00:33:58,919
has a size of zero. Then AMD implemented
a 1 word packet with a size of -1.
495
00:33:58,919 --> 00:34:03,309
And old firmware doesn’t support that and
thinks: “Oh it’s 3FFF so I’m just gonna wait
496
00:34:03,309 --> 00:34:08,540
for a shitload of code in the buffer”,
right? It turns out that Hawaii,
497
00:34:08,540 --> 00:34:12,418
which is another GPU in the same gen
has the same problem with old firmware.
498
00:34:12,418 --> 00:34:14,772
So they use a different NOP packet, so
there was an exception in the driver
499
00:34:14,772 --> 00:34:18,940
for this. And we had to add ours to that.
But again – getting to this point, many,
500
00:34:18,940 --> 00:34:23,110
many, many hours of headbanging.
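[Editor's sketch] The "size minus two" encoding and the resulting 3FFF can be reproduced with a little bit arithmetic. The header layout and the NOP opcode below are recalled from the open-source Linux radeon driver and should be treated as assumptions; the helper names are hypothetical.

```python
# Sketch of the "size minus two" count field described above.
# Assumed layout (as in the Linux radeon driver's PACKET3 macro, from memory):
# bits 31:30 = packet type, bits 29:16 = count, bits 15:8 = opcode.

COUNT_MASK = 0x3FFF   # 14-bit count field
TYPE3 = 3
OP_NOP = 0x10         # assumed NOP opcode


def packet3(opcode: int, total_words: int) -> int:
    """Build a type-3 header for a packet of total_words (header included)."""
    count = (total_words - 2) & COUNT_MASK
    return (TYPE3 << 30) | (count << 16) | ((opcode & 0xFF) << 8)


def count_field(header: int) -> int:
    """Extract the raw 14-bit count from a header word."""
    return (header >> 16) & COUNT_MASK


def words_expected_by_old_firmware(header: int) -> int:
    """Firmware that treats the count as unsigned expects count + 2 words."""
    return count_field(header) + 2


two_word = packet3(OP_NOP, 2)
one_word = packet3(OP_NOP, 1)
print(hex(count_field(two_word)))                 # 0x0
print(hex(count_field(one_word)))                 # 0x3fff
print(words_expected_by_old_firmware(one_word))   # 16385
```

A one-word packet encodes its size as -1, which in a 14-bit field is 0x3FFF, so firmware that does not know the special case waits for thousands of words that never arrive.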
501
00:34:23,110 --> 00:34:28,230
Okay. We fixed that. Now it says:
“Ring 3 test failed”.
502
00:34:28,230 --> 00:34:31,069
That’s the SDMA ring. That’s for copying
things in memory and it works
503
00:34:31,069 --> 00:34:34,909
in the same way. It puts a value in RAM.
It tells the SDMA engine: “hey, write
504
00:34:34,909 --> 00:34:40,429
a different value”. And checks. This time
we see the write happens but it writes “0”
505
00:34:40,429 --> 00:34:44,839
instead of the 0xDEADBEEF or whatever.
Okay. So I tried this.
506
00:34:44,839 --> 00:34:48,139
I put two Write commands in the ring
saying: “Write to one place, write to
507
00:34:48,139 --> 00:34:52,518
a different place”. And this time,
what I saw it do is write “1”
508
00:34:52,518 --> 00:34:56,619
to the first destination and “0” to the
second destination. I’m thinking:
509
00:34:56,619 --> 00:35:00,380
“Okay, it’s supposed to write 0xDEADBEEF…”
which is what you see there, it’s…
510
00:35:00,380 --> 00:35:04,450
0xDEADBEEF is that word
with the value. It writes “1”.
511
00:35:04,450 --> 00:35:08,980
Well, there’s a “1” there that
wasn’t there before, it was a “0”,
512
00:35:08,980 --> 00:35:13,640
because of this padding, right? So it
turns out they have an off-by-four
513
00:35:13,640 --> 00:35:17,890
in the SDMA command parser
and it reads from four words later
514
00:35:17,890 --> 00:35:21,670
than it should.
exhales
515
00:35:21,670 --> 00:35:26,910
Again, this took many hours of
headbanging. It was like:
516
00:35:26,910 --> 00:35:32,390
“Randomly try two commands, oh, one, one?”
– “One”.
517
00:35:32,390 --> 00:35:37,779
So it reads four words too late but only
in ring buffers. Indirect buffers work fine.
518
00:35:37,779 --> 00:35:40,940
That’s good because those come from user
space. So we don’t have to muck with those.
519
00:35:40,940 --> 00:35:43,480
We can work around this, because it’s
only used in two places in the kernel,
520
00:35:43,480 --> 00:35:47,540
by using a Fill command instead of a Write
command. That works fine. Again,…
521
00:35:47,540 --> 00:35:52,490
how do they even make these mistakes?!
Okay. But still the GPU doesn’t work.
522
00:35:52,490 --> 00:35:55,640
The ring tests pass but if you tried
to draw you get a bunch of page faults.
523
00:35:55,640 --> 00:35:59,369
And it turns out that what happens is that
on the PS4 you can’t write the page table
524
00:35:59,369 --> 00:36:05,829
registers from actual commands in the GPU
itself. You can write to them from the CPU
525
00:36:05,829 --> 00:36:09,319
directly. You can say just: “Write memory
– memory register write”, and then
526
00:36:09,319 --> 00:36:14,519
it’ll write. But you can’t tell the GPU:
“Please write to the page table register this”.
527
00:36:14,519 --> 00:36:18,520
So the page tables don’t work, the GPU
can’t see any memory, so everything is broken.
528
00:36:18,520 --> 00:36:22,920
Linux uses this, FreeBSD doesn’t. It uses
direct writes. And we think this is maybe
529
00:36:22,920 --> 00:36:27,290
a Firewall somewhere in the Liverpool,
some kind of security thing they added.
530
00:36:27,290 --> 00:36:30,940
We can directly write from the CPU.
But it like breaks the regular…
531
00:36:30,940 --> 00:36:34,830
like it’s not asynchronous anymore. So
this could break things. And it’s a really
532
00:36:34,830 --> 00:36:39,000
hacky solution. I would really like to fix
this. And I’m thinking: “Maybe the firewall
533
00:36:39,000 --> 00:36:42,940
is in the firmware, right?”. But it’s
proprietary and undocumented firmware.
534
00:36:42,940 --> 00:36:47,630
So let’s look at that firmware. It’s
a thing, it needs microcode, a CP thing.
535
00:36:47,630 --> 00:36:51,440
It’s undocumented. But we take the blobs
out of FreeBSD. And that’s great because
536
00:36:51,440 --> 00:36:56,510
we don’t have to ship them. Let’s
dig deeper into those blobs. So how do you
537
00:36:56,510 --> 00:37:00,599
reverse-engineer an unknown CPU
architecture? That’s really easy,
538
00:37:00,599 --> 00:37:05,039
run an instruction and see what it did.
And then just keep doing that. Thankfully,
539
00:37:05,039 --> 00:37:07,710
we can upload custom firmware, so it’s
actually really easy to just have like
540
00:37:07,710 --> 00:37:10,450
a two-instruction firmware that does
something, and then writes a register
541
00:37:10,450 --> 00:37:14,220
to a memory location. And that’s actually
really easy to find. If you first like
542
00:37:14,220 --> 00:37:17,460
write the memory instruction, it’s really
easy to find in the binary because you see
543
00:37:17,460 --> 00:37:23,559
like GPU register offsets that stand out
a bit in one column. So long story short,
544
00:37:23,559 --> 00:37:27,799
we wrote F32DIS which is a disassembler
for the proprietary AMD F32 microcode.
545
00:37:27,799 --> 00:37:31,619
I shamelessly stole the instruction
syntax from ARM. So you may recognize
546
00:37:31,619 --> 00:37:35,130
that if you’ve ever seen an ARM disassembly.
And this is not complete but it can
547
00:37:35,130 --> 00:37:38,980
disassemble every single instruction
in all the firmware in Liverpool for PFP,
548
00:37:38,980 --> 00:37:43,110
ME, CE, MEC and RLC which are five
different blocks in the GPU. As far
549
00:37:43,110 --> 00:37:46,319
as I know that’s never been done before,
all the firmware was like in a voodoo
550
00:37:46,319 --> 00:37:50,099
black magic thing that’s been shipped.
Not even the non-AMD kernel developers
551
00:37:50,099 --> 00:37:54,710
know anything about this. So…
applause
552
00:37:54,710 --> 00:37:57,290
ongoing applause
553
00:37:57,290 --> 00:38:01,839
And you can disassemble the desktop
GPU stuff, too. So this could be good for
554
00:38:01,839 --> 00:38:06,133
debugging strange GPU shenanigans
in non-PS4 stuff.
555
00:38:06,133 --> 00:38:10,660
Alright. Alas, it’s not in the firmware.
It seems to be blocked in hardware.
556
00:38:10,660 --> 00:38:14,510
I found a debug register that actually
says: “there was an access violation
557
00:38:14,510 --> 00:38:17,340
in the bus when you try to write this
thing”. And I tried a bunch of workarounds
558
00:38:17,340 --> 00:38:22,789
and I even bought an AMD APU system,
desktop. Dumped all the registers,
559
00:38:22,789 --> 00:38:26,780
diff’ed them against the one I had on Linux
and tried setting every single value
560
00:38:26,780 --> 00:38:30,880
from the other GPU and hoping I find some
magic bits somewhere, but… no.
561
00:38:30,880 --> 00:38:35,420
They probably have a setting for this,
somewhere, but it’s a sea of ones and zeros,
562
00:38:35,420 --> 00:38:40,210
good luck finding it. It does work with
a CPU write workaround, though.
563
00:38:40,210 --> 00:38:43,769
So, hey, at least we get 3D! And it’s
actually pretty stable, so if there’s
564
00:38:43,769 --> 00:38:49,210
a race condition I’m not really seeing it.
So – checklist! What works,
565
00:38:49,210 --> 00:38:52,640
what doesn’t work. We have interrupts,
and timers – the core thing you need
566
00:38:52,640 --> 00:38:56,490
to run any OS – we have a serial port,
we can shutdown the system and reboot,
567
00:38:56,490 --> 00:38:59,559
and you’ll think that’s funny but actually
that goes through ICC, so again,
568
00:38:59,559 --> 00:39:02,420
at least some interesting code there.
I actually just implemented that about
569
00:39:02,420 --> 00:39:08,700
four hours ago. Because pulling the plug
was getting old. The Power button works.
570
00:39:08,700 --> 00:39:13,280
USB works. There’s a funny story with USB
as it used not to work. And we said:
571
00:39:13,280 --> 00:39:17,430
“Fix it later, there seems to be special
code missing.” And then someone
572
00:39:17,430 --> 00:39:20,499
pulled a repo from the USB-not-working
branch, and tested it, and said:
573
00:39:20,499 --> 00:39:25,450
“It’s working!” It seems we fixed it by
accident, by changing something else.
574
00:39:25,450 --> 00:39:29,170
The hard disk works which is via the USB.
Blu-ray works, I wrote a driver for that,
575
00:39:29,170 --> 00:39:32,170
also four hours ago. – Three hours ago
now? Yeah, something like that.
576
00:39:32,170 --> 00:39:34,930
And I spent 20 minutes looking for someone
in the Hackcenter that had a DVD I could
577
00:39:34,930 --> 00:39:40,400
stick in to try. Apparently I’m from
the past if I ask for DVDs.
578
00:39:40,400 --> 00:39:45,390
But it does work. So that’s good. Wi-Fi
and Bluetooth work.
579
00:39:45,390 --> 00:39:49,119
Ethernet works, except only at GBit speeds.
Frame buffer works. HDMI works.
580
00:39:49,119 --> 00:39:54,829
It’s currently hard-coded to 1080p so…
It does work. We can fix that
581
00:39:54,829 --> 00:40:00,960
by improving the encoder implementation.
3D works with the ugly register write hack.
582
00:40:00,960 --> 00:40:06,659
And SPDIF audio works. So that’s good.
HDMI audio doesn’t work. Mostly because
583
00:40:06,659 --> 00:40:10,450
I only got audio grossly working, in
general, recently, and I haven’t had
584
00:40:10,450 --> 00:40:15,250
a chance to program the encoder to support
the audio stuff yet. Because, again,
585
00:40:15,250 --> 00:40:18,619
it needs more annoying hacks there. And the
real-time clock doesn’t work and everything.
586
00:40:18,619 --> 00:40:23,350
That’s simple, the clock, that device is
simple. But ever since the PS2 the way
587
00:40:23,350 --> 00:40:27,410
Sony has implemented real-time clocks
is that instead of reading and writing
588
00:40:27,410 --> 00:40:29,920
the time on the clock, which is what you
would think is the normal thing to do,
589
00:40:29,920 --> 00:40:33,480
they never write the time on the clock.
Instead, they store an offset from the clock
590
00:40:33,480 --> 00:40:39,579
to the real time, in some kind of storage
location. And there’s a giant mess of…
591
00:40:39,579 --> 00:40:44,269
…registry it’s called, in the PS4, and
I don’t even know where it’s stored.
592
00:40:44,269 --> 00:40:46,970
It might be on the hard drive, it might be
encrypted. So basically, getting
593
00:40:46,970 --> 00:40:50,259
the real-time clock to actually show the
right time involves a pile of nonsense
594
00:40:50,259 --> 00:40:53,980
that I haven’t had the chance to look at
yet. But… we have NTP, right?
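[Editor's sketch] The offset scheme described above boils down to one addition. The values and function names are made up; where the PS4 actually keeps the offset is, as the talk says, not known.

```python
# Sketch of the "never set the clock, store an offset" scheme described above.
# The counter and offset values are hypothetical.

def wall_time(rtc_counter: int, stored_offset: int) -> int:
    """Real time in seconds = free-running RTC counter + stored offset."""
    return rtc_counter + stored_offset


def offset_for(rtc_counter: int, desired_time: int) -> int:
    """'Setting the clock' only recomputes the offset; the RTC is untouched."""
    return desired_time - rtc_counter


counter = 123_456                          # hypothetical raw RTC value
offset = offset_for(counter, 1_482_900_000)
print(wall_time(counter + 60, offset))     # one minute later: 1482900060
```

Without access to the stored offset the raw counter alone does not give the date, which is why a network time source is the easy way out.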
595
00:40:53,980 --> 00:40:59,030
So it’s good enough. – Oh, and we have
Blinkenlights! Important! The Power LED
596
00:40:59,030 --> 00:41:04,329
does some interesting things, if you’re
on Linux. So that’s good.
597
00:41:04,329 --> 00:41:10,610
So – the code: you can get the ps4-kexec
code on our Github page. That has
598
00:41:10,610 --> 00:41:14,910
the kexec and the hardware configuration,
and the bootloader Linux stuff.
599
00:41:14,910 --> 00:41:18,599
You can get the ps4 Linux branch which is
the… our fork of the kernel,
600
00:41:18,599 --> 00:41:22,769
rebased on 4.9 which is the latest (?)
version, I think.
601
00:41:22,769 --> 00:41:26,319
You can get our Radeon patches which are
three, I think, really tiny patches for
602
00:41:26,319 --> 00:41:30,410
user space libraries just to support this
new chip. Really simple stuff, the NOP
603
00:41:30,410 --> 00:41:35,289
thing, and a couple of commands. And the
RAI and F32DIS thing I mentioned.
604
00:41:35,289 --> 00:41:40,779
You can get Radeon tools at that Github
repo. Just pushed that right before the talk.
605
00:41:40,779 --> 00:41:44,089
So if you’re interested – there you go.
And if you’re going after the RAI file, well,
606
00:41:44,089 --> 00:41:47,569
we wanna put you on a run before the guys
at that website realize they really should
607
00:41:47,569 --> 00:41:52,589
take that down! But I’m sure the internet
Wayback Machine has it somewhere.
608
00:41:52,589 --> 00:42:00,279
Okay! That’s everything for the story of
how we got Linux running on the PS4.
609
00:42:00,279 --> 00:42:08,710
And you can reach us at that website
or fail0verflow on Twitter.
610
00:42:08,710 --> 00:42:14,440
applause
Thank you!
611
00:42:14,440 --> 00:42:18,259
ongoing applause
612
00:42:18,259 --> 00:42:24,309
I hope that wasn’t too fast, sorry, I had
to rush through my 89 slides a little bit
613
00:42:24,309 --> 00:42:29,460
because I really wanted to do a demo.
I think this kind of is the demo, right.
614
00:42:29,460 --> 00:42:33,180
But we can try something else.
So maybe I can shut this –
615
00:42:33,180 --> 00:42:39,839
so I can aim with my controller.
616
00:42:39,839 --> 00:42:43,960
This is really not meant as a mouse!
That’s not Right Button.
617
00:42:43,960 --> 00:42:46,809
Come on! Yeah, I think it is…
618
00:42:46,809 --> 00:42:48,810
Close? Close! Maybe…
619
00:42:48,810 --> 00:42:51,099
So we have this little icon here.
I wonder what happens if it works.
620
00:42:51,099 --> 00:42:55,740
Do we have internet access? Hopefully
Wi-Fi works, let’s then just check real quick.
621
00:42:55,740 --> 00:42:57,730
keyboard typing sounds
622
00:42:57,730 --> 00:42:59,849
This could bork really badly if we don’t.
623
00:42:59,849 --> 00:43:02,039
keyboard typing sounds
624
00:43:02,039 --> 00:43:03,500
mumbles ping 8.8.8.8
625
00:43:03,500 --> 00:43:06,009
Yeah, we have internet access.
So, Wi-Fi works!
626
00:43:06,009 --> 00:43:08,710
Okay. I wonder what happens
if we click that!
627
00:43:08,710 --> 00:43:15,160
It takes a while to load.
This is not optimized for…
628
00:43:15,160 --> 00:43:23,859
laughter and applause
marcan laughs
629
00:43:23,859 --> 00:43:28,410
So the CPUs on this thing are
a little bit slow. But…
630
00:43:28,410 --> 00:43:31,990
sounds of the machine
Hey, it works!
631
00:43:31,990 --> 00:43:35,880
And now it’s a real game console!
632
00:43:35,880 --> 00:43:42,089
laughter and applause
633
00:43:42,089 --> 00:43:49,069
And this is… there we go, okay.
634
00:43:49,069 --> 00:43:54,290
So I think we can probably take some Q&A
because this is a little bit slow to load.
635
00:43:54,290 --> 00:43:56,529
But we can try a game, maybe.
636
00:43:56,529 --> 00:44:03,020
Herald: If you are up for Q&A, I think
there will be some questions.
637
00:44:03,020 --> 00:44:07,089
So shall we start with one
from the internet?
638
00:44:07,089 --> 00:44:16,029
Signal Angel: Hey! The internet wants to
know if most of your research will be
639
00:44:16,029 --> 00:44:18,470
published, or if stuff’s
going to stay private.
640
00:44:18,470 --> 00:44:21,992
marcan: All of this… the publishing is
basically the code which… and you know
641
00:44:21,992 --> 00:44:26,660
the explanation I just gave… I said that
everything’s on GitHub. So all the drivers
642
00:44:26,660 --> 00:44:30,950
we wrote, all the… I mean… and in this
case also the spec is the code.
643
00:44:30,950 --> 00:44:34,300
If you really want to I could write some
Wiki pages on this. But roughly speaking,
644
00:44:34,300 --> 00:44:37,890
what’s in the drivers is what we found
out. The really interesting bit,
645
00:44:37,890 --> 00:44:44,269
I think, is that F32 stuff from the AMD
GPU stuff. And that we have a repo for.
646
00:44:44,269 --> 00:44:48,369
But if you have any general questions, or
name a particular device, or any details,
647
00:44:48,369 --> 00:44:54,069
feel free to ask. I don’t know… again, it
would be nice if we wrote a bunch
648
00:44:54,069 --> 00:44:57,220
of docs and everything. But it’s not really
a matter of not wanting to write them,
649
00:44:57,220 --> 00:45:01,250
it’s lazy engineers not wanting to write
documentation. But the code is at least…
650
00:45:01,250 --> 00:45:05,250
the things we have on GitHub are fairly
clean. So.
651
00:45:05,250 --> 00:45:08,630
Herald: Okay, so, someone is piling up
on 4. Guys, if you have questions
652
00:45:08,630 --> 00:45:11,990
you see the microphones over here.
Just pile up over there
653
00:45:11,990 --> 00:45:14,539
and I’m gonna point… 4 please!
654
00:45:14,539 --> 00:45:19,210
Question: Just a small question.
How likely is it that you upstream
655
00:45:19,210 --> 00:45:22,700
some of that stuff? Because… I mean…
656
00:45:22,700 --> 00:45:27,299
marcan: So there’s two sides to that.
One side is that we need to actually
657
00:45:27,299 --> 00:45:31,059
get together and upstream it. The code…
some of it has horrible hacks, some of it
658
00:45:31,059 --> 00:45:36,539
isn’t too bad. So we want to upstream it.
659
00:45:36,539 --> 00:45:42,099
We have to sit down and actually do it.
I think most of the custom x86-based
660
00:45:42,099 --> 00:45:45,280
machine stuff in the kernel is doable.
The drivers are probably doable.
661
00:45:45,280 --> 00:45:49,609
Some people might scream at the interrupt
hacks. But it’s probably not terrible.
662
00:45:49,609 --> 00:45:53,580
And if they have a better way of doing it
I’m all ears, there are other kernel devs.
663
00:45:53,580 --> 00:45:59,589
The Radeon stuff is quite fishy because of
the encoder thing that is like (?) non-standard.
664
00:45:59,589 --> 00:46:03,880
And also understandably
AMD GPU driver developers
665
00:46:03,880 --> 00:46:07,380
that work for AMD may want to have nothing
to do with this. And in fact I know
666
00:46:07,380 --> 00:46:11,570
for a fact that at least
one of them doesn’t. But
667
00:46:11,570 --> 00:46:16,609
they can’t really stop us from upstreaming
things into the Linux kernel, right?
668
00:46:16,609 --> 00:46:20,210
So I think as long as we get it
to a state where it’s doable, it’s fine.
669
00:46:20,210 --> 00:46:23,250
But most likely I think…
laughter
670
00:46:23,250 --> 00:46:27,910
…I think most likely the non-GPU stuff
will go in first if we have a chance
671
00:46:27,910 --> 00:46:30,940
to do that. And of course, if you wanna
try upstreaming it go ahead!
672
00:46:30,940 --> 00:46:33,470
It’s open source, right? So.
673
00:46:33,470 --> 00:46:35,460
Herald: Over to microphone 1, please.
674
00:46:35,460 --> 00:46:42,079
Question: Hi. First I think I should
employ you to try and find Trammell Hudson.
675
00:46:42,079 --> 00:46:48,430
And control him into using your FreeBSD
kexec implementation in heads.
676
00:46:48,430 --> 00:46:55,210
Instead of having to run all of Linux in it,
as a joke. But my real question is:
677
00:46:55,210 --> 00:46:59,160
if the reason you used Gentoo was
because systemd was yet another hurdle
678
00:46:59,160 --> 00:47:00,519
in getting this to run?
679
00:47:00,519 --> 00:47:02,710
laughter
marcan laughs
680
00:47:02,710 --> 00:47:06,430
marcan: I run Gentoo on my main machine,
I run Gentoo on most of the machines
681
00:47:06,430 --> 00:47:10,950
I care about. I do run Arch on a few of
the others, and there I live with systemd.
682
00:47:10,950 --> 00:47:15,661
But the reason why I run Gentoo is, first
it’s what I like and use. And second it’s
683
00:47:15,661 --> 00:47:19,119
super easy to use patches on Gentoo.
You get those things we put onto GitHub,
684
00:47:19,119 --> 00:47:21,549
which are just patch files, it’s not really
a repo. Because they’re so easy
685
00:47:21,549 --> 00:47:24,869
it’s not worth cloning everything. Just
get those patch files, stick them on
686
00:47:24,869 --> 00:47:28,480
/etc/portage/patches/, have a little hook to patch,
and that’s all you need. So it’s really
687
00:47:28,480 --> 00:47:33,070
easy to patch packages in Gentoo,
that’s one of the main reasons.
688
00:47:33,070 --> 00:47:37,730
laughs about something in audience
689
00:47:37,730 --> 00:47:39,599
Herald: No. 3 please!
690
00:47:39,599 --> 00:47:43,550
Question: Will there be new exploits,
new ways to boot Linux
691
00:47:43,550 --> 00:47:48,400
on PS4 with modern firmwares
because finding one
692
00:47:48,400 --> 00:47:51,109
with firmware 1.76 is really rare.
693
00:47:51,109 --> 00:47:52,460
marcan: That was 4.05!
694
00:47:52,460 --> 00:47:58,500
Question: Ah, okay.
marcan: But again, our goal is to focus
695
00:47:58,500 --> 00:48:01,369
on… I just told you the story of the
pre-exploit thing because I think
696
00:48:01,369 --> 00:48:05,089
that’s like a good hacker story, good
knowledge for trying new platforms.
697
00:48:05,089 --> 00:48:07,740
And the Linux thing we’re working on.
The reason why we don’t want to publish
698
00:48:07,740 --> 00:48:11,599
the exploit or really get involved in the
whole exploit scene is that there is
699
00:48:11,599 --> 00:48:17,099
a lot of drama, it’s not rocket science
in that it’s like super custom code,
700
00:48:17,099 --> 00:48:21,400
this is WebKit and FreeBSD. It’s actually not
that hard. And we know for a fact
701
00:48:21,400 --> 00:48:25,751
that several people have reproduced this
on various firmwares. So there’s no need
702
00:48:25,751 --> 00:48:29,980
for us to be the exploit provider. And
we don’t want to get into that because
703
00:48:29,980 --> 00:48:37,420
it’s a giant drama fest as we all know,
anyway. Please DIY it this time!
704
00:48:37,420 --> 00:48:39,470
Question: Okay. Thanks.
705
00:48:39,470 --> 00:48:41,329
Herald: And what is the internet saying?
706
00:48:41,329 --> 00:48:46,440
Signal Angel: The internet wants to know
if you ever had fun with the BSD
707
00:48:46,440 --> 00:48:47,749
on the second processor.
708
00:48:47,749 --> 00:48:52,460
marcan: Oh, that’s a very good question.
I myself haven’t. I don’t know if anyone
709
00:48:52,460 --> 00:48:55,930
else has looked at it briefly. One of the
commands for rebooting will boot
710
00:48:55,930 --> 00:49:01,339
that CPU into FreeBSD. And there’s
probably fun to be had there.
711
00:49:01,339 --> 00:49:03,869
But we haven’t really looked into it.
712
00:49:03,869 --> 00:49:06,819
Herald: And over to 5, please.
713
00:49:06,819 --> 00:49:13,000
Question: I was wondering if any of that
stuff was applicable to the PS4 VR edition
714
00:49:13,000 --> 00:49:18,800
or whatever it’s called, the new one?
Did you ever test it?
715
00:49:18,800 --> 00:49:20,460
marcan: Sorry, say it again!
716
00:49:20,460 --> 00:49:22,359
Question: Sony brought out a new PS4
I thought.
717
00:49:22,359 --> 00:49:24,299
marcan: Oh, the Pro you mean,
the PS4 Pro?
718
00:49:24,299 --> 00:49:26,670
Question: Yes.
marcan: So Linux boots on the Pro,
719
00:49:26,670 --> 00:49:30,289
we got that far. GPU is broken. So we
would like to get this ported to the Pro
720
00:49:30,289 --> 00:49:34,140
and also working. It’s basically an
incremental update, so it’s not that hard,
721
00:49:34,140 --> 00:49:36,999
but the GPU needs a new definition,
new jBullet(?) stuff.
722
00:49:36,999 --> 00:49:40,940
Yeah, you get a lot of C frames
down-burned (?), yeah…
723
00:49:40,940 --> 00:49:45,280
So, as you can see, 3D works,
and, there you go!
724
00:49:45,280 --> 00:49:52,340
synth speech from game
applause
725
00:49:52,340 --> 00:49:56,119
I only have to look up and down in this game!
726
00:49:56,119 --> 00:49:58,230
continued synth speech from game
727
00:49:58,230 --> 00:50:01,019
Herald: Well, then number 3, please.
728
00:50:01,019 --> 00:50:07,679
Question: I want to ask you if you want to
port these Radeon patches to the new
729
00:50:07,679 --> 00:50:16,274
amdgpu driver because AMD now supports
the Southern Islands GPUs?
730
00:50:16,274 --> 00:50:19,354
marcan: Yes, that’s a very good question.
Actually, the first attempt we made
731
00:50:19,354 --> 00:50:22,609
at writing this driver was with amdgpu.
And at the time it wasn’t working at all.
732
00:50:22,609 --> 00:50:26,559
And there was a big concern about its
freshness at the time and it was
733
00:50:26,559 --> 00:50:31,130
experimentally supporting this GPU
generation. I’m told it should work.
734
00:50:31,130 --> 00:50:35,720
So I would like to port this… move to
amdgpu and we have a working
735
00:50:35,720 --> 00:50:38,970
implementation, and we got to clean up
code much better, we know where all
736
00:50:38,970 --> 00:50:42,050
the nits are, I want to try again with
amdgpu and see if that works.
737
00:50:42,050 --> 00:50:47,019
That’s a very good question because the
newer gen might require the driver maybe, so …
738
00:50:47,019 --> 00:50:49,029
Question: Thank you.
Herald: Well then I’m gonna guess we ask
739
00:50:49,029 --> 00:50:50,220
the internet again.
740
00:50:50,220 --> 00:50:56,210
Signal Angel: Okay, the internet states
that about a year ago you argued
741
00:50:56,210 --> 00:51:02,069
with someone on Twitter that the PS4 wasn’t
a PC and now you’re saying that kind of
742
00:51:02,069 --> 00:51:05,330
is something. And what about that?
743
00:51:05,330 --> 00:51:11,249
marcan: So again, the reason for saying
it’s not a PC is that it’s not an IBM
744
00:51:11,249 --> 00:51:17,369
Personal Computer compatible device.
It’s an x86 device that happens to
745
00:51:17,369 --> 00:51:20,470
be structured roughly like a current PC
but if you look at the details
746
00:51:20,470 --> 00:51:24,280
so many things are completely different.
It really isn’t a PC. Like on Linux I had
747
00:51:24,280 --> 00:51:29,730
to define “sub arch PS4”. It’s an x86
but it’s not a PC. And that’s actually
748
00:51:29,730 --> 00:51:32,520
a very important distinction because
there’s a lot of things you have
749
00:51:32,520 --> 00:51:36,210
never heard of that are x86 but not PC.
For example, there’s a high chance
750
00:51:36,210 --> 00:51:40,480
your monitor at home has
an 80186 CPU in it. So, yeah.
751
00:51:40,480 --> 00:51:45,200
Herald: So nobody’s piling at the
microphones any more.
752
00:51:45,200 --> 00:51:47,430
Is there one last question
from the internet?
753
00:51:47,430 --> 00:51:51,299
Signal Angel: Yes, there is.
754
00:51:51,299 --> 00:51:53,819
The question is…
755
00:51:53,819 --> 00:51:59,660
…if there was any
decryption needed.
756
00:51:59,660 --> 00:52:05,509
marcan: No. So this is purely… you
exploit WebKit, you get user mode,
757
00:52:05,509 --> 00:52:08,769
you exploit the kernel, you get kernel
mode. You jump to Linux…
758
00:52:08,769 --> 00:52:12,240
there’s no security like… there’s nothing
like stopping you from doing
759
00:52:12,240 --> 00:52:15,160
all that stuff. There’s a sandbox in
FreeBSD but obviously you exploit
760
00:52:15,160 --> 00:52:20,920
around the sandbox. There’s nothing…
there’s no hypervisor, there’s no monitoring,
761
00:52:20,920 --> 00:52:24,650
there’s nothing like saying: “Oh this code
should not be running.” There’s no
762
00:52:24,650 --> 00:52:29,089
like integrity checking. They have a security
architecture but as is tradition for Sony
763
00:52:29,089 --> 00:52:35,230
you can just walk around it.
laughter
764
00:52:35,230 --> 00:52:37,730
applause
765
00:52:37,730 --> 00:52:42,660
The PS3 was notable for the fact that
the PS Jailbreak which is a USB…
766
00:52:42,660 --> 00:52:47,470
it’s effectively a piracy device
that was released by someone
767
00:52:47,470 --> 00:52:51,510
that basically used a USB exploit
in the kernel and only a USB exploit
768
00:52:51,510 --> 00:52:54,990
in the kernel to effectively enable piracy.
So when you have like a stack of security
769
00:52:54,990 --> 00:52:58,400
and you break one thing and you get
piracy that’s a fail! This is basically
770
00:52:58,400 --> 00:53:02,050
the same idea. Except I have no idea what
you do to do piracy and I don’t care.
771
00:53:02,050 --> 00:53:09,780
But Sony doesn’t really know how to
architect secure systems.
772
00:53:09,780 --> 00:53:11,500
That’s it.
773
00:53:11,500 --> 00:53:14,689
Herald: That’s it, here we go,
that’s your applause!
774
00:53:14,689 --> 00:53:20,230
applause
775
00:53:20,230 --> 00:53:21,810
postroll music
776
00:53:21,810 --> 00:53:32,109
subtitles created by c3subtitles.de
in the year 2017. Join, and help us!