0:00:00.000,0:00:15.740
34c3 intro
0:00:15.740,0:00:22.800
Herald: The next talk is called "End-to-[br]end formal ISA verification of RISC-V
0:00:22.800,0:00:28.410
processors with riscv-formal". I have no[br]idea what this means, but I'm very excited
0:00:28.410,0:00:33.540
to understand what it means and Clifford[br]promised he's going to make sure everyone
0:00:33.540,0:00:39.320
will. Clifford is very well known in the[br]open-source and free-software... free-
0:00:39.320,0:00:45.219
source... oh my gosh... free-and-open-[br]source community and especially he's known
0:00:45.219,0:00:50.300
for the project IceStorm. Please help me[br]welcome Clifford!
0:00:50.300,0:00:57.260
applause
0:00:57.260,0:01:12.170
inaudible[br]Clifford: RISC-V is an open instruction
0:01:12.170,0:01:17.189
set architecture. It's an open ISA, so[br]it's not a processor, but it's a processor
0:01:17.189,0:01:22.860
specification -- an ISA specification --[br]that is free to use for everyone and if
0:01:22.860,0:01:26.789
you happen to have already implemented[br]your own processor at one point in time,
0:01:26.789,0:01:30.880
you might know that it's actually much[br]easier to implement a processor than to
0:01:30.880,0:01:35.560
implement all the tools that you need to[br]compile programs for your processor. And
0:01:35.560,0:01:39.750
if you use something like RISC-V, then you[br]can reuse the tools that are already out
0:01:39.750,0:01:45.170
there, so that's a great benefit. However,[br]for this endeavor we need processors that
0:01:45.170,0:01:50.280
are actually compatible with each other:[br]Processors that implement the RISC-V ISA
0:01:50.280,0:01:54.970
correctly. So, with many other ISAs, we[br]start with one processor and we say "Oh,
0:01:54.970,0:01:59.660
that's the processor" and later on, we[br]figure out there was a bug, and what
0:01:59.660,0:02:03.049
people sometimes do is, they just change[br]the specification, so that the
0:02:03.049,0:02:06.770
specification fits the hardware they[br]actually have. We can't do something like
0:02:06.770,0:02:09.860
that with RISC-V, because there are many[br]implementations out there, all being
0:02:09.860,0:02:14.440
developed in parallel to fit the same[br]specification, so we want to have some
0:02:14.440,0:02:20.670
kind of means to make sure that all these[br]processors actually agree about what the
0:02:20.670,0:02:27.081
ISA specification is. So, what's formal[br]verification? Formal verification is a
0:02:27.081,0:02:32.960
super-broad term. In the context of this[br]talk, I'm talking about hardware model
0:02:32.960,0:02:38.230
checking. More specifically, I'm talking[br]about checking of so-called "safety
0:02:38.230,0:02:43.161
properties". So, we have some hardware[br]design and we have an initial state and we
0:02:43.161,0:02:48.540
would like to know if this hardware design[br]can reach a bad state from the initial
0:02:48.540,0:02:55.280
state. That's formally the problem that we[br]are trying to solve here. And there are
0:02:55.280,0:03:00.180
two means to do that, two different[br]categories of proofs that are bounded and
0:03:00.180,0:03:05.090
unbounded proofs, and with the bounded[br]proofs we only prove that it's impossible
0:03:05.090,0:03:07.720
to reach a bad state within a certain[br]number of cycles.
0:03:07.720,0:03:12.410
So, we give a maximum bound for the length[br]of a counterexample and with unbounded
0:03:12.410,0:03:18.240
proofs, we prove that a bad state can[br]actually never be reached. So, unbounded
0:03:18.240,0:03:23.060
proofs are, of course, better -- if you[br]can make an unbounded proof -- but in many
0:03:23.060,0:03:28.560
cases, this is very hard to achieve, but[br]bounded proofs are something that we can
0:03:28.560,0:03:35.880
do, so I'm talking about bounded proofs[br]here for the most part. So, what's end-to-
0:03:35.880,0:03:40.530
end formal verification? Because that's[br]also in my title. So, historically when
0:03:40.530,0:03:45.110
you formally verify something like a[br]processor, you break down the processor into
0:03:45.110,0:03:49.550
many small components and then you write[br]properties for each component and prove
0:03:49.550,0:03:54.510
for each component individually that they[br]adhere to the properties and then you make
0:03:54.510,0:03:59.290
a more abstract proof that if you put a[br]system together from components that have
0:03:59.290,0:04:04.230
this property, then this system will have[br]the properties that you want. With end-to-
0:04:04.230,0:04:10.340
end verification, we treat the processor as[br]one huge black box and just ask the
0:04:10.340,0:04:15.710
question "Does this one huge thing fit our[br]specification? Have the properties that we
0:04:15.710,0:04:22.190
want?" That has a couple of advantages:[br]It's much, much easier this way to take
0:04:22.190,0:04:26.460
one specification and port it from one[br]processor to another, because we don't
0:04:26.460,0:04:31.850
care about how the processor is built[br]internally, and it's much easier to take
0:04:31.850,0:04:35.970
the specification that we have and[br]actually match it to other specifications
0:04:35.970,0:04:39.520
of the ISA, because we have a[br]specification that says, what is the
0:04:39.520,0:04:45.190
overall behavior we expect from our[br]processor. But the big disadvantage, of
0:04:45.190,0:04:49.700
course, is that it's computationally much[br]more expensive to do end-to-end formal
0:04:49.700,0:04:55.240
verification, and doing this end-to-end[br]verification of a processor against an ISA
0:04:55.240,0:04:59.140
specification is something that[br]historically was always viewed as the
0:04:59.140,0:05:03.550
textbook example of things that you can't[br]do with formal methods, but fortunately
0:05:03.550,0:05:08.670
the solvers... they became much better in[br]the last couple of years and now if we use
0:05:08.670,0:05:14.480
the right tricks, we can do stuff like[br]that with the solvers we have nowadays.
0:05:14.480,0:05:19.570
So, that's riscv-formal. riscv-formal is a[br]framework that allows us to do end-to-end
0:05:19.570,0:05:24.460
formal verification of RISC-V processors[br]against a formal version of the ISA
0:05:24.460,0:05:29.850
specification. So, riscv-formal is not a[br]formally verified processor. Instead, if
0:05:29.850,0:05:35.230
you happen to have a RISC-V processor, you[br]can use riscv-formal to prove that your
0:05:35.230,0:05:40.980
processor conforms to the ISA[br]specification. For the most part, this is
0:05:40.980,0:05:46.120
using bounded methods. Theoretically, you[br]could do unbounded proofs with riscv-
0:05:46.120,0:05:52.230
formal, but it's not the main use case.[br]So, it's good for what we call "bug
0:05:52.230,0:05:56.020
hunting", because maybe there is a[br]counterexample that would show, that the
0:05:56.020,0:06:02.510
processor could diverge from the desired[br]behavior only after 1000 or 5000 cycles, but
0:06:02.510,0:06:06.669
usually, when you have something like a[br]processor and you can't reach a bad state
0:06:06.669,0:06:11.860
within the very short bounds, you have[br]high confidence that actually your
0:06:11.860,0:06:16.910
processor implements the ISA correctly.[br]So, if you have a processor and you would
0:06:16.910,0:06:22.470
like to integrate it with riscv-formal,[br]you need to do 2 things: You need to add a
0:06:22.470,0:06:27.610
special trace port to your processor; it's[br]called the RVFI trace port -- riscv-formal
0:06:27.610,0:06:31.790
interface trace port. And you have to[br]configure riscv-formal so that riscv-
0:06:31.790,0:06:39.520
formal understands the attributes of your[br]processor. So, for example, RISC-V is
0:06:39.520,0:06:44.770
available in a 32-bit and a 64-bit[br]version. You have to tell riscv-formal if
0:06:44.770,0:06:51.020
you want to verify a 32-bit or 64-bit[br]processor. RISC-V is a modular ISA, so
0:06:51.020,0:06:54.120
there are a couple of extensions and you[br]have to tell riscv-formal which
0:06:54.120,0:06:59.350
extensions your processor[br]actually implements.
0:06:59.350,0:07:04.510
And then there're a couple of other things[br]that are transparent for a userland
0:07:04.510,0:07:12.630
process, like if unaligned loads or stores[br]are supported by the hardware natively,
0:07:12.630,0:07:18.010
because RISC-V... the spec only says that[br]when you do an unaligned load or store,
0:07:18.010,0:07:23.980
then a userspace program can expect this[br]load or store to succeed, but it might
0:07:23.980,0:07:28.400
take a long time, because there might be a[br]machine interrupt handler that is
0:07:28.400,0:07:34.949
emulating an unaligned load or store by doing[br]aligned loads and stores, but if we do
0:07:34.949,0:07:40.850
this formal verification of the processor,[br]then the riscv-formal framework must be
0:07:40.850,0:07:44.990
aware: "What is the expected behavior for[br]your course?", "Should it trap when it
0:07:44.990,0:07:50.960
sees an unaligned load or store, or should it[br]just perform the load or store unaligned?"
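As a rough illustration of this configuration step, here is a sketch in the Verilog-define style used by cores such as PicoRV32 in the riscv-formal repository; the macro names and the set of options shown are an approximation and may differ from the current code.

    // Illustrative configuration defines (names modeled on the PicoRV32
    // integration in riscv-formal; treat them as an approximation).
    `define RISCV_FORMAL                 // enable the RVFI trace port in the core
    `define RISCV_FORMAL_NRET 1          // number of retirement channels
    `define RISCV_FORMAL_XLEN 32         // verify a 32-bit (RV32) core; 64 for RV64
    `define RISCV_FORMAL_ILEN 32         // width of the reported instruction word
    `define RISCV_FORMAL_COMPRESSED      // core implements the "C" extension
    `define RISCV_FORMAL_ALIGNED_MEM     // core traps on unaligned loads/stores
                                         // instead of handling them in hardware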
0:07:50.960,0:07:54.250
So, what does this interface look like[br]that you need to implement in your
0:07:54.250,0:07:59.490
processor if you would like to use riscv-[br]formal? This is the current version of the
0:07:59.490,0:08:03.930
riscv-formal interface. Right now, there[br]is no support for floating-point
0:08:03.930,0:08:09.150
instructions and there is no support for[br]CSRs, but this is on the to-do list, so
0:08:09.150,0:08:14.321
this interface will grow larger and larger[br]when we add these additional features, but
0:08:14.321,0:08:18.139
all these additional features will be[br]optional. And one of the reasons is that
0:08:18.139,0:08:22.400
some might implement just small[br]microcontrollers that actually don't have
0:08:22.400,0:08:29.510
floating-point cores or that don't have[br]support for the privileged specification,
0:08:29.510,0:08:35.328
so they don't have CSRs. Through this[br]interface, whenever the core retires an
0:08:35.328,0:08:40.750
instruction, it documents which[br]instruction it retired; so it tells us
0:08:40.750,0:08:45.829
"This is the instruction where I retired;[br]this was the program counter where I found
0:08:45.829,0:08:49.939
the instruction; this is the program[br]counter for the next instruction; these
0:08:49.939,0:08:53.809
are the registers that I read and these[br]are the values that I've observed in the
0:08:53.809,0:08:58.029
register file; this is the register that[br]I've written and this is the value that I
0:08:58.029,0:09:00.639
have written to the register file"[br]All that stuff.
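To make this more concrete, here is a rough sketch of the output side of such an RVFI trace port for a single retirement channel (NRET = 1) on a 32-bit core; the signal names follow the riscv-formal conventions, but the widths and the complete list may differ from the current version of the specification.

    // Hedged sketch of the RVFI trace port (outputs only), NRET = 1, RV32.
    module my_core (
        input             clock,
        input             resetn,
        // ... the regular bus interface of the core ...

        output reg        rvfi_valid,     // an instruction retires this cycle
        output reg [63:0] rvfi_order,     // retirement index of that instruction
        output reg [31:0] rvfi_insn,      // the instruction word that retired
        output reg        rvfi_trap,      // the instruction caused a trap
        output reg        rvfi_halt,      // last instruction before a halt
        output reg        rvfi_intr,      // first instruction of a trap handler
        output reg [ 4:0] rvfi_rs1_addr,  // source registers that were read ...
        output reg [ 4:0] rvfi_rs2_addr,
        output reg [31:0] rvfi_rs1_rdata, // ... and the values observed there
        output reg [31:0] rvfi_rs2_rdata,
        output reg [ 4:0] rvfi_rd_addr,   // destination register that was written ...
        output reg [31:0] rvfi_rd_wdata,  // ... and the value written to it
        output reg [31:0] rvfi_pc_rdata,  // program counter of this instruction
        output reg [31:0] rvfi_pc_wdata,  // program counter of the next instruction
        output reg [31:0] rvfi_mem_addr,  // memory access of this instruction, if any
        output reg [ 3:0] rvfi_mem_rmask,
        output reg [ 3:0] rvfi_mem_wmask,
        output reg [31:0] rvfi_mem_rdata,
        output reg [31:0] rvfi_mem_wdata
    );
        // ... core implementation driving these outputs on retirement ...
    endmodule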
0:09:00.639,0:09:06.949
So, in short, what we document through the[br]riscv-formal interface, is the part of the
0:09:06.949,0:09:12.269
processor state that is observed by an[br]instruction and the change to the state of
0:09:12.269,0:09:16.980
the processor that is performed by an[br]instruction -- like changes to the
0:09:16.980,0:09:24.309
register file or changes to the program[br]counter. And of course, most processors
0:09:24.309,0:09:28.980
actually are superscalar, even those[br]processors that say they're non-
0:09:28.980,0:09:33.579
superscalar. In-order pipelines usually[br]can do stuff like retire memory load
0:09:33.579,0:09:38.079
instructions out of order, in parallel to[br]another instruction that does not write
0:09:38.079,0:09:43.670
the register, things like that. So, even[br]with processors we usually don't think of
0:09:43.670,0:09:48.680
as superscalar processors, even with those[br]processors, we need the capability to
0:09:48.680,0:09:53.712
retire more than one instruction each[br]cycle and this can be done with this
0:09:53.712,0:10:03.839
"NRET" parameter. And we see all the ports[br]5 times wider if NRET is 5. Okay, so then
0:10:03.839,0:10:07.850
we have a processor that implements this[br]interface. What is the verification
0:10:07.850,0:10:14.269
strategy that riscv-formal follows in[br]order to do this proof to formally verify
0:10:14.269,0:10:20.720
that our processor's actually correct? So,[br]there is not one big proof that we run.
0:10:20.720,0:10:26.990
Instead, there is a large number of very[br]small proofs that we run. This is the most
0:10:26.990,0:10:32.559
important trick when it comes to this and[br]there are 2 categories of proofs: One
0:10:32.559,0:10:37.619
category is what I call the "instruction[br]checks". We have one of those proofs for
0:10:37.619,0:10:42.880
each instruction in the ISA specification[br]and each of the channels in the riscv-
0:10:42.880,0:10:48.470
formal interface. So, this is easily a[br]couple of hundred proofs right there
0:10:48.470,0:10:52.120
because you easily have 100 instructions[br]and if you have 2 channels, you already
0:10:52.120,0:10:58.470
have 200 proofs that you have to run. And[br]what these instruction checks do is they
0:10:58.470,0:11:04.079
reset the processor -- or they start in a[br]symbolic state, if you would like to run an
0:11:04.079,0:11:09.199
unbounded proof -- let the processor run for a[br]certain number of cycles and then it
0:11:09.199,0:11:14.540
assumes that in the last cycle, the[br]processor will retire a certain
0:11:14.540,0:11:16.769
instruction. So, if this[br]check checks if the "add"
0:11:16.769,0:11:21.410
instruction works correctly, it assumes[br]that the last instruction retired in the
0:11:21.410,0:11:27.809
last cycle of this bounded check will be[br]an "add" instruction and then it looks at
0:11:27.809,0:11:33.550
all the signals on the riscv-formal[br]interface, to make sure that this is
0:11:33.550,0:11:37.269
compliant with an "add" instruction. It[br]checks if the instruction has been decoded
0:11:37.269,0:11:42.220
correctly; it checks if the register value[br]we write to the register file is actually
0:11:42.220,0:11:47.479
the sum of the values we read from the[br]register file. All that kind of stuff.
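A minimal sketch of this idea, not the actual checker code from the repository: the solver is constrained so that the last cycle of the bounded check retires the instruction under test, and the values reported on the RVFI port are compared against the outputs of the corresponding instruction model (the "spec_*" signals discussed a bit later). The module name and the "check_cycle" signal are illustrative; the constructs are SystemVerilog-style immediate assume/assert as accepted by formal tools.

    // Hedged sketch of an instruction check, single retirement channel.
    module insn_check_sketch (
        input         clock,
        input         check_cycle,     // marks the last cycle of the bounded check
        input         rvfi_valid,
        input  [ 4:0] rvfi_rs1_addr, rvfi_rs2_addr, rvfi_rd_addr,
        input  [31:0] rvfi_rd_wdata, rvfi_pc_wdata,
        input         spec_valid,
        input  [ 4:0] spec_rs1_addr, spec_rs2_addr, spec_rd_addr,
        input  [31:0] spec_rd_wdata, spec_pc_wdata
    );
        always @(posedge clock) begin
            if (check_cycle) begin
                // only consider traces that end with the instruction under test retiring
                assume (rvfi_valid && spec_valid);

                // decode checks: the right registers were read and written
                assert (rvfi_rs1_addr == spec_rs1_addr);
                assert (rvfi_rs2_addr == spec_rs2_addr);
                assert (rvfi_rd_addr  == spec_rd_addr);

                // data check: e.g. for "add", rd must be the sum of rs1 and rs2
                assert (rvfi_rd_wdata == spec_rd_wdata);

                // control-flow check: the next program counter is as specified
                assert (rvfi_pc_wdata == spec_pc_wdata);
            end
        end
    endmodule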
0:11:47.479,0:11:51.030
But, of course, if you just have these[br]instruction checks, there is still a
0:11:51.030,0:11:56.410
certain verification gap, because the core[br]might lie to us: The core might say "Oh, I
0:11:56.410,0:12:00.019
write this value to the register file",[br]but then not write the value to the
0:12:00.019,0:12:06.160
register file, so we have to have a[br]separate set of proofs that do not look at
0:12:06.160,0:12:11.249
the entire riscv-formal interface in one[br]cycle, but look at only a small fraction
0:12:11.249,0:12:15.850
of the riscv-formal interface, but over a[br]span of cycles. So, for example, there is
0:12:15.850,0:12:20.199
one check that says "If I write the[br]register and then later I read the
0:12:20.199,0:12:24.930
register, I better read back the value[br]that I've written to the register file"
0:12:24.930,0:12:33.899
and this I call "consistency checks". So,[br]that's I think what I said already... So
0:12:33.899,0:12:40.170
for each instruction with riscv-formal, we[br]have an instruction model that looks like
0:12:40.170,0:12:45.920
that. So, these are 2 slides: The first[br]slide is just the interface, where we have
0:12:45.920,0:12:51.939
a couple of signals from this riscv-formal[br]interface that we read like the
0:12:51.939,0:12:58.189
instruction that we are executing, the[br]program counter where we found this
0:12:58.189,0:13:04.079
instruction, the register values we read,[br]and then we have a couple of signals that
0:13:04.079,0:13:09.899
are generated by our specification that[br]are outputs of this specification module.
0:13:09.899,0:13:17.601
Like, which registers should we read?[br]Which registers should we write? What
0:13:17.601,0:13:22.129
values should we write to that register?[br]Stuff like that. So, that's the interface.
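As a hedged sketch of what such an instruction model can look like, simplified from the riscv-formal style (the real modules have more ports and handle details like compressed instructions, traps and memory accesses), here is an approximation for the "add immediate" instruction on RV32:

    // Hedged sketch of an instruction model for ADDI (RV32). No assertions
    // here: the module only computes what the instruction is expected to do.
    module rvfi_insn_addi (
        input  [31:0] rvfi_insn,      // instruction word as retired by the core
        input  [31:0] rvfi_pc_rdata,  // program counter of this instruction
        input  [31:0] rvfi_rs1_rdata, // value the core claims to have read from rs1

        output        spec_valid,     // this really is an ADDI instruction
        output [ 4:0] spec_rs1_addr,  // register the core should have read
        output [ 4:0] spec_rd_addr,   // register the core should write
        output [31:0] spec_rd_wdata,  // value the core should write to it
        output [31:0] spec_pc_wdata   // expected next program counter
    );
        // decode the I-type instruction fields
        wire [31:0] insn_imm    = $signed(rvfi_insn[31:20]);
        wire [ 4:0] insn_rs1    = rvfi_insn[19:15];
        wire [ 2:0] insn_funct3 = rvfi_insn[14:12];
        wire [ 4:0] insn_rd     = rvfi_insn[11:7];
        wire [ 6:0] insn_opcode = rvfi_insn[6:0];

        // ADDI is opcode OP-IMM (0010011) with funct3 000
        assign spec_valid    = (insn_opcode == 7'b0010011) && (insn_funct3 == 3'b000);
        assign spec_rs1_addr = insn_rs1;
        assign spec_rd_addr  = insn_rd;

        // writes to x0 are ignored, so the expected write value is zero there
        assign spec_rd_wdata = (insn_rd != 0) ? (rvfi_rs1_rdata + insn_imm) : 32'd0;

        // default assignment: the next PC is 4 bytes after this instruction
        assign spec_pc_wdata = rvfi_pc_rdata + 32'd4;
    endmodule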
0:13:22.129,0:13:26.739
It's the same for all the instructions and[br]then we have a body that looks more
0:13:26.739,0:13:31.190
like that for each instruction, which[br]just decodes the instruction, checks if
0:13:31.190,0:13:34.780
this is actually the instruction the check[br]is for. So, in this case it's an "add
0:13:34.780,0:13:42.030
immediate" instruction. And then you have[br]things like the line near the bottom above
0:13:42.030,0:13:46.290
"default assignments": "assign[br]spec_pc_wdata", for example, says "Okay,
0:13:46.290,0:13:54.680
the next PC must be 4 bytes later than the[br]PC for this instruction. We must increment
0:13:54.680,0:13:58.860
the program counter by a value of 4 when[br]we execute this instruction." Things like
0:13:58.860,0:14:09.509
that. So, you might see, there is no[br]assert here, there are no assertions,
0:14:09.509,0:14:13.639
because this is just the model of what[br]kind of behavior we would expect. And then
0:14:13.639,0:14:17.769
there is a wrapper that instantiates this[br]and instantiates the core and builds the
0:14:17.769,0:14:23.079
proof and there are the assertions. The[br]main reason why we don't have assertions
0:14:23.079,0:14:27.819
here, but instead we output the desired[br]behavior here, is because I can also
0:14:27.819,0:14:32.749
generate monitor cores that can run[br]alongside your core and check in
0:14:32.749,0:14:38.160
simulation or in emulation, on an FPGA, whether[br]your core is doing the right thing. That
0:14:38.160,0:14:43.939
can be very, very helpful if you have a[br]situation where you run your core for
0:14:43.939,0:14:49.940
maybe days and then you have some[br]observable behavior that's not right, but
0:14:49.940,0:14:53.709
maybe there are thousands, even millions, of[br]cycles between the point where you can
0:14:53.709,0:14:58.139
observe that something is wrong and the[br]point where the processor actually started
0:14:58.139,0:15:03.379
diverging from what the specification said[br]and if you can use a monitor core like
0:15:03.379,0:15:10.230
that, then it's much easier to find bugs[br]like this. Okay, so some examples of those
0:15:10.230,0:15:15.410
consistency checks; the list is actually[br]not complete and it varies a little bit
0:15:15.410,0:15:20.079
from processor to processor what kind of[br]consistency checks we can actually run
0:15:20.079,0:15:26.480
with the processor we are looking at.[br]There is a check if the program counter
0:15:26.480,0:15:30.100
for one instruction -- so I have an[br]instruction that says "This is the program
0:15:30.100,0:15:32.339
counter for the instruction and this is[br]the program counter for the next
0:15:32.339,0:15:37.329
instruction" -- and then we can look at[br]the next instruction and we can see is...
0:15:37.329,0:15:40.920
the program counter for that instruction[br]actually matches the next program counter
0:15:40.920,0:15:46.369
value for the previous instruction and[br]they must link together like that, but the
0:15:46.369,0:15:52.129
core might retire instructions out of[br]order. So it might be, that we see the
0:15:52.129,0:15:55.649
first instruction for us and then the[br]second instruction later, but it's also
0:15:55.649,0:15:58.670
possible that we will see the second[br]instruction first and then the first
0:15:58.670,0:16:03.579
instruction later and because of that,[br]there're 2 different checks: One for a
0:16:03.579,0:16:08.939
pair in the non-reversed order and one for a[br]pair of instructions in the reversed order.
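A rough sketch of the idea behind this program-counter consistency check, assuming a Yosys-based formal flow where the (* anyconst *) attribute lets the solver pick which pair of instructions to examine. This covers only the non-reversed case for a single retirement channel and is not the actual checker code; the module name is illustrative.

    // Hedged sketch of a "pc chaining" consistency check, non-reversed order.
    module pc_chain_check_sketch (
        input         clock,
        input         rvfi_valid,
        input  [63:0] rvfi_order,
        input  [31:0] rvfi_pc_rdata,
        input  [31:0] rvfi_pc_wdata
    );
        (* anyconst *) wire [63:0] check_order;  // solver-chosen retirement index

        reg        have_first = 0;
        reg [31:0] first_pc_wdata;

        always @(posedge clock) begin
            if (rvfi_valid && rvfi_order == check_order) begin
                // remember the next-PC value reported for the chosen instruction
                have_first     <= 1;
                first_pc_wdata <= rvfi_pc_wdata;
            end
            if (rvfi_valid && rvfi_order == check_order + 1 && have_first) begin
                // the directly following instruction must start at exactly that PC
                assert (rvfi_pc_rdata == first_pc_wdata);
            end
        end
    endmodule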
0:16:08.939,0:16:13.480
There is one check that checks if[br]register value reads and writes are
0:16:13.480,0:16:21.049
consistent. There is one check that sees[br]if the processor is alive, so when I give
0:16:21.049,0:16:29.139
the processor certain fairness constraints --[br]for example, that the memory will always answer memory
0:16:29.139,0:16:32.600
reads within a certain number of cycles, things like[br]that -- then I can use this to prove that
0:16:32.600,0:16:37.459
the processor will not just suddenly freeze.[br]This is very important and this will also
0:16:37.459,0:16:41.459
prove that the processor is not skipping[br]instruction indices, which is very
0:16:41.459,0:16:45.620
important, because some of the other[br]checks actually depend on the processor
0:16:45.620,0:16:51.100
behaving in this way. And so forth. So,[br]there are a couple of these consistency
0:16:51.100,0:16:55.379
checks and it's a nice exercise to sit[br]down in a group of people and go through
0:16:55.379,0:17:00.679
the list of consistency checks and see[br]which set of them actually is
0:17:00.679,0:17:05.799
meaningful or which set of them actually[br]leaves an interesting verification gap and
0:17:05.799,0:17:11.789
we still need to add checks for this or[br]that processor then. Okay, so what kind of
0:17:11.789,0:17:19.949
bugs can it find? That's a super-hard[br]question, because it's really hard to give
0:17:19.949,0:17:26.209
a complete list. It can definitely find[br]incorrect single-threaded instruction
0:17:26.209,0:17:28.880
semantics. So, if you[br]just implement a instruction
0:17:28.880,0:17:34.350
incorrectly in your core, then this will[br]find it; no question about it. It can find
0:17:34.350,0:17:38.740
a lot of bugs in things like bypassing[br]and forwarding and pipeline interlocks,
0:17:38.740,0:17:46.600
things like that. Things where you reorder[br]stuff in a way you shouldn't reorder them,
0:17:46.600,0:17:51.950
freezes if you have this liveness check,[br]some bugs related to memory interfaces and
0:17:51.950,0:17:58.610
load/store consistency and things like[br]that, but it depends on things like the
0:17:58.610,0:18:04.110
size of your cache lines whether this is a[br]feasible proof or not. Bugs that we can't
0:18:04.110,0:18:09.280
find yet with riscv-formal are things that[br]are not yet covered with the riscv-formal
0:18:09.280,0:18:13.179
interface, like the floating-point stuff[br]or CSRs, but this is all on the to-do
0:18:13.179,0:18:18.280
list, so we are actively working on that[br]and a year from now, this stuff will be
0:18:18.280,0:18:24.650
included. And anything related to[br]concurrency between multiple harts. So
0:18:24.650,0:18:28.210
far, my excuse for that was, that the[br]RISC-V memory model is not completely
0:18:28.210,0:18:33.789
specified yet, so I would not actually[br]know what to check exactly, but right now
0:18:33.789,0:18:38.559
the RISC-V memory model is in the process[br]of being finalized, so I won't have this
0:18:38.559,0:18:45.740
excuse for much, much longer. So, the[br]processors currently supported are PicoRV32,
0:18:45.740,0:18:50.740
which is my own processor, then RISC-V[br]Rocket, which is probably the most famous
0:18:50.740,0:18:56.519
RISC-V implementation, and VexRiscv. And[br]there are also a couple of others, but
0:18:56.519,0:19:04.039
they are not part of the open source[br]release of riscv-formal. So, if you would
0:19:04.039,0:19:11.159
like to add support to riscv-formal for[br]your RISC-V processor, then just check out
0:19:11.159,0:19:15.950
the riscv-formal repository, look at the[br]"cores" directory, see which of the
0:19:15.950,0:19:20.210
supported cores is closest to the core[br]that you actually have and then just copy
0:19:20.210,0:19:27.800
that directory and make a couple of small[br]modifications. So, I have a few minutes
0:19:27.800,0:19:33.980
left to talk about things like cut-points[br]and blackboxes and other abstractions. So,
0:19:33.980,0:19:37.919
the title of this slide could just be[br]"abstractions", because cut-points and
0:19:37.919,0:19:41.760
blackboxes are just abstractions.[br]The idea behind an abstraction in formal
0:19:41.760,0:19:48.221
methods is that I switch out part of my[br]design with a different circuit, one
0:19:48.221,0:19:54.639
that is less[br]constrained: it includes the behavior of
0:19:54.639,0:19:59.040
the original circuit, but might do other[br]stuff as well. So, the textbook example
0:19:59.040,0:20:03.340
would be, I have a design with a counter[br]and usually the counter would just
0:20:03.340,0:20:08.610
increment in steps of 1, but now I create[br]an abstraction that can skip numbers and
0:20:08.610,0:20:15.860
will just increment in strictly increasing[br]steps. And this, of course, includes the
0:20:15.860,0:20:20.039
behavior of the original design, so if I[br]can prove a property with this abstraction
0:20:20.039,0:20:24.889
in place instead of the just-increment-[br]by-1 counter, then we have proven an even
0:20:24.889,0:20:29.620
stronger property, and that includes the[br]same property for the
0:20:29.620,0:20:35.690
original design. And actually this idea of[br]abstractions works very well with riscv-
0:20:35.690,0:20:40.950
formal. So, the main reason why we do[br]abstractions is because it leads to easier
0:20:40.950,0:20:46.540
proofs. So, for example, consider an[br]instruction checker that just checks if
0:20:46.540,0:20:52.280
the core implements the "add" instruction[br]correctly. For this checker, we don't
0:20:52.280,0:20:56.020
actually need a register file that's[br]working. We could replace the register
0:20:56.020,0:21:00.580
file by something that just ignores all[br]writes to it and whenever we read
0:21:00.580,0:21:04.289
something from the register file, it[br]returns an arbitrary value. That would
0:21:04.289,0:21:09.870
still include the behavior of a core with[br]a functional register file, but, because
0:21:09.870,0:21:13.610
the instruction checker does not care[br]about consistency between register file
0:21:13.610,0:21:17.929
writes and register file reads, we can[br]still prove that the instruction is
0:21:17.929,0:21:22.889
implemented correctly, and therefore we[br]get an easier proof. Of course, we can't
0:21:22.889,0:21:27.100
use this abstraction for all the proofs,[br]because there are other proofs that actually
0:21:27.100,0:21:33.330
check if my register file works as I would[br]expect it to work, but if we go through
0:21:33.330,0:21:37.429
the list of proofs and we run all these[br]proofs independently, then you will see
0:21:37.429,0:21:41.960
that for each of them, it's possible to[br]abstract away a large portion of your
0:21:41.960,0:21:47.380
processor and therefore yield[br]an easier proof.
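A minimal sketch of such a register file abstraction, assuming a Yosys-based flow where the (* anyseq *) attribute marks values the solver may choose freely in every cycle; this illustrates the idea described above and is not code from riscv-formal.

    // Hedged sketch: register file abstraction for single-instruction checks.
    // Writes are ignored and reads return unconstrained values, which
    // over-approximates any real register file.
    module regfile_abstract (
        input         clock,
        input         wen,      // write port is accepted ...
        input  [ 4:0] waddr,
        input  [31:0] wdata,    // ... but simply ignored
        input  [ 4:0] raddr1,
        input  [ 4:0] raddr2,
        output [31:0] rdata1,
        output [31:0] rdata2
    );
        // values picked freely by the solver in every cycle
        (* anyseq *) wire [31:0] any_value1;
        (* anyseq *) wire [31:0] any_value2;

        // x0 must still read as zero; everything else is unconstrained
        assign rdata1 = (raddr1 == 5'd0) ? 32'd0 : any_value1;
        assign rdata2 = (raddr2 == 5'd0) ? 32'd0 : any_value2;
    endmodule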
0:21:47.380,0:21:51.169
Depending on what kind of solvers you use,[br]some solvers are actually very capable of
0:21:51.169,0:21:54.779
finding these kinds of abstractions[br]themselves. So in those cases, it doesn't
0:21:54.779,0:21:59.470
really help to add these abstractions[br]manually, but just realizing that the
0:21:59.470,0:22:04.490
potential for these abstractions is there,[br]is something that's very useful when
0:22:04.490,0:22:09.020
guiding your decisions how to split up a[br]large verification problem into smaller
0:22:09.020,0:22:12.610
verification problems, because you would[br]like to split up the problem in a way so
0:22:12.610,0:22:17.100
that the solver is always capable of[br]finding useful abstractions that actually
0:22:17.100,0:22:27.700
lead to easier circuits to prove. With a[br]bounded check, we also have the question
0:22:27.700,0:22:32.850
of what bound to use. Of course,[br]larger bounds are better, but larger
0:22:32.850,0:22:41.860
bounds also yield something that is harder[br]to compute, and if you have a small bound,
0:22:41.860,0:22:45.340
then you have a proof that runs very, very[br]quickly, but maybe you're not very
0:22:45.340,0:22:49.970
confident that it actually has proven[br]something that's relevant for you. So, I
0:22:49.970,0:22:57.350
propose 2 solutions for this: The first[br]solution is, you can use the same solvers
0:22:57.350,0:23:02.490
to find traces that cover certain events[br]and you could write a list and say "I
0:23:02.490,0:23:07.409
would like to see 1 memory read and 1[br]memory write and at least one ALU
0:23:07.409,0:23:10.690
instruction executed" and things like[br]that. And then, you can ask the solver
0:23:10.690,0:23:18.830
"What is the shortest trace that would[br]actually satisfy all this stuff?" And when
0:23:18.830,0:23:24.570
that's a trace of, say, 25 cycles, then you[br]know "Okay, when I look at a proof that's
0:23:24.570,0:23:29.789
25 cycles deep, I know at least these are[br]the cases that are going to be covered."
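A small sketch of how such a cover query can be phrased, using SystemVerilog-style immediate cover statements over the RVFI signals; the module name, event list and the opcode test are just examples.

    // Hedged sketch: ask the solver for the shortest trace that contains at
    // least one memory read, one memory write and one R-type ALU instruction.
    module cover_events_sketch (
        input         clock,
        input         rvfi_valid,
        input  [31:0] rvfi_insn,
        input  [ 3:0] rvfi_mem_rmask,
        input  [ 3:0] rvfi_mem_wmask
    );
        reg seen_mem_read  = 0;
        reg seen_mem_write = 0;
        reg seen_alu_insn  = 0;

        always @(posedge clock) begin
            if (rvfi_valid && rvfi_mem_rmask != 0) seen_mem_read  <= 1;
            if (rvfi_valid && rvfi_mem_wmask != 0) seen_mem_write <= 1;
            if (rvfi_valid && rvfi_insn[6:0] == 7'b0110011) seen_alu_insn <= 1; // OP opcode

            // the length of the shortest trace reaching this cover statement
            // gives a lower bound for a meaningful proof depth
            cover (seen_mem_read && seen_mem_write && seen_alu_insn);
        end
    endmodule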
0:23:29.789,0:23:34.730
But more important, I think, is: Usually[br]when you have a processor, you already
0:23:34.730,0:23:41.269
found bugs and it's a good idea to not[br]just fix the bugs and forget about them,
0:23:41.269,0:23:46.610
but preserve some way of re-introducing[br]the bugs, just to see if your testing
0:23:46.610,0:23:51.509
framework works. So,[br]if you have already a couple of bugs
0:23:51.509,0:23:55.120
and you know "It took me a week to find[br]that and took me a month to find that",
0:23:55.120,0:24:00.549
the best thing is to just add the bugs to[br]your design again and see "What are the
0:24:00.549,0:24:04.610
bounds that are necessary for riscv-formal[br]to actually discover those bugs" and then
0:24:04.610,0:24:09.009
you will have some degree of confidence[br]that other similar bugs would also have
0:24:09.009,0:24:15.779
been found with the same bounds. So,[br]"Results": I have found bugs in pretty much
0:24:15.779,0:24:23.029
every implementation I looked at; I found[br]bugs in all 3 processors; we found bugs in
0:24:23.029,0:24:28.139
Spike, which is the official implementation[br]of RISC-V in C and I found
0:24:28.139,0:24:33.760
a way to formally verify my specification[br]against Spike and in some cases where I found
0:24:33.760,0:24:37.250
a difference between my specification[br]and Spike, it turned out that it
0:24:37.250,0:24:41.830
was actually a bug in the English language[br]specification. So, because of that, I also
0:24:41.830,0:24:47.950
found bugs in the English language[br]specification with riscv-formal. "Future
0:24:47.950,0:24:53.519
work": Multipliers already supported,[br]floating-point is still on the to-do list,
0:24:53.519,0:25:01.600
64-bit is half done. We would like to add[br]support for CSRs. We would like to add
0:25:01.600,0:25:06.120
support for more cores, but this is[br]something that I would like to do slowly,
0:25:06.120,0:25:10.649
because adding more cores also means we[br]have less flexibility with changing
0:25:10.649,0:25:17.970
things. And better integration with non-[br]free tools, because right now all of that
0:25:17.970,0:25:22.600
runs with open source tools that I also[br]happen to write -- so, I wrote those
0:25:22.600,0:25:27.580
tools, but some people actually don't want[br]to use my open source tools; they would
0:25:27.580,0:25:31.260
like to use commercial tools -- and[br]it's on the to-do list to have better
0:25:31.260,0:25:35.750
integration with those tools. Maybe,[br]because I don't get licenses for those
0:25:35.750,0:25:41.049
tools, so we will see how this works.[br]That's it. Do we have still time for
0:25:41.049,0:26:02.709
questions?[br]applause
0:26:02.709,0:26:05.380
Clifford: So, I'd say we start with[br]questions at microphone 1.
0:26:05.380,0:26:11.240
Herald: I'm sorry; here we go. We have 2[br]questions; we have time for 2 questions
0:26:11.240,0:26:15.720
and we're going to start with microphone[br]number 1, please.
0:26:15.720,0:26:22.779
M1: Hello, thanks for your talk and for[br]your work. First question: You told about
0:26:22.779,0:26:26.690
riscv-formal interface.[br]Clifford: Yes.
0:26:26.690,0:26:34.199
M1: So, does a vendor ship the final[br]processor with this interface available?
0:26:34.199,0:26:39.529
Clifford: Oh, that's a great question,[br]thank you. This interface has only output
0:26:39.529,0:26:45.290
ports and actually when you implement this[br]interface, you should not add something to
0:26:45.290,0:26:49.230
your design that's not needed to generate[br]those output ports. So, what you can do
0:26:49.230,0:26:53.510
is, you can take the version of your core[br]with that interface and the version of the
0:26:53.510,0:26:58.480
core without that interface. Then, in your[br]synthesis script, just remove those output
0:26:58.480,0:27:03.720
ports and then run a formal equivalence[br]check between that version and the version
0:27:03.720,0:27:10.820
that you actually deploy on your ASIC.[br]M1: Thanks; and one short question: When
0:27:10.820,0:27:17.381
people say formal verification, usually[br]others think "Oh, if it is verified, it is
0:27:17.381,0:27:26.220
excellent, absolutely excellent." Do you[br]plan to say that it will find all the
0:27:26.220,0:27:30.649
errata for the processor?[br]Clifford: Well, it depends on what kind of
0:27:30.649,0:27:35.990
proof you run. Most of the work I do is with[br]bounded proofs and there you only get a
0:27:35.990,0:27:40.830
certain degree of confidence, because you[br]only see bugs that can occur within a
0:27:40.830,0:27:46.160
certain number of cycles from reset, but[br]if you want, you can also run a complete
0:27:46.160,0:27:52.539
proof, where you start with a symbolic[br]state instead of a reset state and then
0:27:52.539,0:27:57.510
you can make sure that you actually[br]check the entire reachable state space,
0:27:57.510,0:28:01.509
but that's a very, very hard thing to do.[br]So, that's not a weekend project. Just
0:28:01.509,0:28:05.730
adding the riscv-formal interface and[br]running some bounded proofs is probably a
0:28:05.730,0:28:10.560
weekend project if you already have[br]your RISC-V processor.
0:28:10.560,0:28:13.400
M1: Thank you.[br]Herald: Thank you. We actually do not have
0:28:13.400,0:28:17.919
time for any more questions, but there[br]will be time after the talk to ask you
0:28:17.919,0:28:22.130
questions... maybe?[br]Clifford: Yeah. So, maybe you can find me
0:28:22.130,0:28:27.390
at the open FPGA assembly, which is part[br]of the hardware hacking area.
0:28:27.390,0:28:32.740
Herald: Super. Very great job to put that[br]much information into 30 minutes. Please
0:28:32.740,0:28:35.830
help me thank Cliff[br]for his wonderful talk.
0:28:35.830,0:28:44.232
applause
0:28:44.232,0:28:49.447
34c3 outro
0:28:49.447,0:29:06.000
subtitles created by c3subtitles.de[br]in the year 2018. Join, and help us!