[34c3 intro]

Herald: The next talk is called "End-to-end formal ISA verification of RISC-V processors with riscv-formal". I have no idea what this means, but I'm very excited to understand what it means, and Clifford promised he's going to make sure everyone will. Clifford has been very well known in the open-source and free-software... free-source... oh my gosh... free-and-open-source community, and he's especially known for the project IceStorm. Please help me welcome Clifford!

[applause]

[inaudible]

Clifford: RISC-V is an open instruction set architecture. It's an open ISA, so it's not a processor, but a processor specification -- an ISA specification -- that is free for everyone to use. And if you happen to have implemented your own processor at some point, you might know that it's actually much easier to implement a processor than to implement all the tools that you need to compile programs for your processor. If you use something like RISC-V, then you can reuse the tools that are already out there, so that's a great benefit. However, for this endeavor we need processors that are actually compatible with each other: processors that implement the RISC-V ISA correctly. With many other ISAs, we start with one processor and we say "Oh, that's the processor", and later on we figure out there was a bug, and what people sometimes do is just change the specification so that the specification fits the hardware they actually have. We can't do something like that with RISC-V; there are many implementations out there, all being developed in parallel to fit the same specification, so we want some kind of means to make sure that all these processors actually agree about what the ISA specification is.

So, what's formal verification? Formal verification is a super-broad term. In the context of this talk, I'm talking about hardware model checking. More specifically, I'm talking about checking so-called "safety properties". We have some hardware design and we have an initial state, and we would like to know if this hardware design can reach a bad state from the initial state. That's formally the problem that we are trying to solve here. And there are two ways to do that, two different categories of proofs: bounded and unbounded proofs. With bounded proofs we only prove that it's impossible to reach a bad state within a certain number of cycles.
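As a rough sketch, this bounded check can be written as a single satisfiability query over k steps. The symbols here are standard notation rather than anything from the talk: I is the initial-state predicate, T the transition relation, and P the safety property.

```latex
% Bounded check of depth k: can a state violating the safety property P be
% reached within k steps from an initial state satisfying I, via the
% transition relation T?
\exists\, s_0, \dots, s_k :\quad
  I(s_0) \;\wedge\; \bigwedge_{i=0}^{k-1} T(s_i, s_{i+1})
         \;\wedge\; \bigvee_{i=0}^{k} \neg P(s_i)
```

If this formula is unsatisfiable, there is no counterexample of length up to k; if it is satisfiable, the satisfying assignment is a concrete counterexample trace.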
So, we give a maximum bound for the length of a counterexample. With unbounded proofs, we prove that a bad state can never be reached. Unbounded proofs are, of course, better -- if you can make an unbounded proof -- but in many cases this is very hard to achieve, while bounded proofs are something that we can do, so I'm talking about bounded proofs here for the most part.

So, what's end-to-end formal verification? Because that's also in my title. Historically, when you formally verify something like a processor, you break the processor down into many small components, then you write properties for each component and prove for each component individually that it adheres to its properties, and then you make a more abstract proof that if you put a system together from components that have these properties, then the system will have the properties that you want. With end-to-end verification, we treat the processor as one huge black box and just ask the question "Does this one huge thing fit our specification? Does it have the properties that we want?" That has a couple of advantages: it's much, much easier this way to take one specification and port it from one processor to another, because we don't care about how the processor is built internally, and it's much easier to take the specification that we have and actually match it against other specifications of the ISA, because we have a specification that says what overall behavior we expect from our processor. But the big disadvantage, of course, is that it's computationally much more expensive to do end-to-end formal verification. Doing this end-to-end verification of a processor against an ISA specification is something that historically was always held up as the textbook example of things you can't do with formal methods, but fortunately the solvers have become much better in the last couple of years, and now, if we use the right tricks, we can do stuff like that with the solvers we have nowadays.

So, that's riscv-formal. riscv-formal is a framework that allows us to do end-to-end formal verification of RISC-V processors against a formal version of the ISA specification. So, riscv-formal is not a formally verified processor. Instead, if you happen to have a RISC-V processor, you can use riscv-formal to prove that your processor conforms to the ISA specification. For the most part, this uses bounded methods. Theoretically, you could do unbounded proofs with riscv-formal, but it's not the main use case.
So, it's good for what we call "bug hunting", because maybe there is a counterexample that would show that the processor could diverge from the desired behavior within 1000 or 5000 cycles, but usually, when you have something like a processor and you can't reach a bad state within these fairly short bounds, you have high confidence that your processor actually implements the ISA correctly.

So, if you have a processor and you would like to integrate it with riscv-formal, you need to do two things: you need to add a special trace port to your processor -- it's called the RVFI trace port, the riscv-formal interface trace port -- and you have to configure riscv-formal so that it understands the attributes of your processor. For example, RISC-V is available in a 32-bit and a 64-bit version, so you have to tell riscv-formal whether you want to verify a 32-bit or a 64-bit processor. RISC-V is a modular ISA, so there are a couple of extensions, and you have to tell riscv-formal which extensions your processor actually implements. And then there are a couple of other things that are transparent to a userland process, like whether unaligned loads or stores are supported by the hardware natively. The RISC-V spec only says that when you do an unaligned load or store, a userspace program can expect this load or store to succeed, but it might take a long time, because there might be a machine-mode interrupt handler that is emulating the unaligned access by doing aligned loads and stores. But if we do this formal verification of the processor, then the riscv-formal framework must be aware: what is the expected behavior for your core? Should it trap when it sees an unaligned load or store, or should it just perform the access unaligned?

So, what does this interface look like that you need to implement in your processor if you would like to use riscv-formal? This is the current version of the riscv-formal interface. Right now, there is no support for floating-point instructions and there is no support for CSRs, but this is on the to-do list, so this interface will grow larger and larger as we add these additional features -- but all these additional features will be optional. One of the reasons is that some people might implement just small microcontrollers that don't have floating-point units, or that don't have support for the privileged specification, so they don't have CSRs.
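As a rough illustration, the RVFI outputs on a 32-bit core with a single retirement channel (NRET = 1) look roughly like this. The signal names follow the riscv-formal interface, but the exact set of signals and their widths here are only a sketch, not the authoritative port list from the spec.

```verilog
// Sketch of the RVFI trace port on a 32-bit core with NRET = 1.
module my_rv32_core (
    input         clk,
    input         resetn,
    // ... regular memory/interrupt ports of the core ...

    output        rvfi_valid,      // high for one cycle when an instruction retires
    output [63:0] rvfi_order,      // running index of retired instructions
    output [31:0] rvfi_insn,       // the retired instruction word
    output        rvfi_trap,       // instruction caused a trap
    output        rvfi_halt,       // instruction halted the core
    output        rvfi_intr,       // first instruction of a trap/interrupt handler
    output [ 4:0] rvfi_rs1_addr,   // source registers and the values observed
    output [ 4:0] rvfi_rs2_addr,
    output [31:0] rvfi_rs1_rdata,
    output [31:0] rvfi_rs2_rdata,
    output [ 4:0] rvfi_rd_addr,    // destination register and the value written
    output [31:0] rvfi_rd_wdata,
    output [31:0] rvfi_pc_rdata,   // program counter of this instruction
    output [31:0] rvfi_pc_wdata,   // program counter of the next instruction
    output [31:0] rvfi_mem_addr,   // memory access performed by this instruction
    output [ 3:0] rvfi_mem_rmask,
    output [ 3:0] rvfi_mem_wmask,
    output [31:0] rvfi_mem_rdata,
    output [31:0] rvfi_mem_wdata
);
    // ... core implementation drives these outputs whenever it retires an instruction ...
endmodule
```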
Through this interface, whenever the core retires an instruction, it documents which instruction it retired. So it tells us: "This is the instruction that I retired; this was the program counter where I found the instruction; this is the program counter for the next instruction; these are the registers that I read and these are the values that I observed in the register file; this is the register that I wrote and this is the value that I wrote to the register file" -- all that stuff. So, in short, what we document through the riscv-formal interface is the part of the processor state that is observed by an instruction and the change to the processor state that is performed by that instruction -- like changes to the register file or changes to the program counter.

And of course, most processors are actually superscalar, even those processors that say they're non-superscalar. In-order pipelines can usually do things like retire memory load instructions out of order, in parallel with another instruction that does not write a register, things like that. So even with processors we usually don't think of as superscalar, we need the capability to retire more than one instruction each cycle, and this is done with the "NRET" parameter: all the ports become five times wider if NRET is 5.

Okay, so now we have a processor that implements this interface. What is the verification strategy that riscv-formal follows in order to formally verify that our processor is actually correct? There is not one big proof that we run. Instead, there is a large number of very small proofs. This is the most important trick here, and there are two categories of proofs. One category is what I call the "instruction checks". We have one of those proofs for each instruction in the ISA specification and each of the channels in the riscv-formal interface. So this is easily a couple of hundred proofs right there, because you easily have 100 instructions, and if you have 2 channels, you already have 200 proofs to run. What these instruction checks do is reset the processor -- or start it in a symbolic state, if you would like to run an unbounded proof -- let the processor run for a certain number of cycles, and then assume that in the last cycle, the processor retires a certain instruction.
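A minimal sketch of the shape of such an instruction check is shown below. This is simplified and uses hypothetical names (the module, the CHECK_DEPTH parameter, the exact assertion set), not the actual riscv-formal wrapper code. The spec_* inputs come from an instruction model like the one sketched a bit further down; the rvfi_* inputs come from the core's trace port.

```verilog
// Simplified sketch of an instruction check for one retirement channel.
module insn_check #(
    parameter CHECK_DEPTH = 15   // the bound: cycle at which we pin the retirement
) (
    input        clk,
    input        rvfi_valid, rvfi_trap,
    input [ 4:0] rvfi_rs1_addr, rvfi_rs2_addr, rvfi_rd_addr,
    input [31:0] rvfi_rd_wdata, rvfi_pc_wdata,
    input        spec_valid,
    input [ 4:0] spec_rs1_addr, spec_rs2_addr, spec_rd_addr,
    input [31:0] spec_rd_wdata, spec_pc_wdata
);
    reg [31:0] cycle = 0;
    always @(posedge clk) cycle <= cycle + 1;

    always @(posedge clk) begin
        if (cycle == CHECK_DEPTH) begin
            // Only consider traces in which the instruction under test
            // retires here without trapping; everything else is left free.
            assume (rvfi_valid && !rvfi_trap && spec_valid);

            // The core must agree with the instruction model.
            assert (rvfi_rs1_addr == spec_rs1_addr);
            assert (rvfi_rs2_addr == spec_rs2_addr);
            assert (rvfi_rd_addr  == spec_rd_addr);
            assert (rvfi_rd_addr == 0 || rvfi_rd_wdata == spec_rd_wdata);
            assert (rvfi_pc_wdata == spec_pc_wdata);
        end
    end
endmodule
```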
So, if this check checks whether the "add" instruction works correctly, it assumes that the last instruction retired in the last cycle of this bounded check is an "add" instruction, and then it looks at all the signals on the riscv-formal interface to make sure that they are consistent with an "add" instruction. It checks that the instruction has been decoded correctly; it checks that the register value we write to the register file is actually the sum of the values we read from the register file -- all that kind of stuff.

But of course, if you just have these instruction checks, there is still a certain verification gap, because the core might lie to us: the core might say "Oh, I write this value to the register file", but then not write the value to the register file. So we have a separate set of proofs that do not look at the entire riscv-formal interface in one cycle, but look at only a small fraction of the riscv-formal interface over a span of cycles. For example, there is one check that says "If I write a register and then later I read that register, I had better read back the value that I wrote to the register file". These I call "consistency checks".

So, that's what I think I said already... For each instruction, riscv-formal has an instruction model that looks like this. These are two slides: the first slide is just the interface. There we have a couple of signals from the riscv-formal interface that we read, like the instruction that we are executing, the program counter where we found this instruction, and the register values we read, and then we have a couple of signals that are generated by our specification, the outputs of this specification module: which registers should we read, which register should we write, what value should we write to that register, stuff like that. So that's the interface; it's the same for all the instructions. And then we have a body that looks more like this. For each instruction it just decodes the instruction and checks that this is actually the instruction the check is for -- in this case it's an "add immediate" instruction. And then you have things like the line near the bottom, above "default assignments": "assign spec_pc_wdata", for example, says "Okay, the next PC must be 4 bytes after the PC of this instruction; we must increment the program counter by 4 when we execute this instruction." Things like that.
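A condensed sketch of such an instruction model for "add immediate" could look like the following. It is a hand-written approximation rather than the real generated riscv-formal code, so module and port names are only indicative.

```verilog
// Sketch of an instruction model for ADDI (RV32I). Inputs are values observed
// on the riscv-formal interface; spec_* outputs describe the expected effect.
module insn_addi_spec (
    input  [31:0] rvfi_insn,       // retired instruction word
    input  [31:0] rvfi_pc_rdata,   // PC of this instruction
    input  [31:0] rvfi_rs1_rdata,  // value observed for rs1
    output        spec_valid,      // high if rvfi_insn decodes as ADDI
    output [ 4:0] spec_rs1_addr,
    output [ 4:0] spec_rs2_addr,
    output [ 4:0] spec_rd_addr,
    output [31:0] spec_rd_wdata,
    output [31:0] spec_pc_wdata
);
    // I-type decoding
    wire [11:0] imm    = rvfi_insn[31:20];
    wire [ 4:0] rs1    = rvfi_insn[19:15];
    wire [ 2:0] funct3 = rvfi_insn[14:12];
    wire [ 4:0] rd     = rvfi_insn[11:7];
    wire [ 6:0] opcode = rvfi_insn[6:0];

    assign spec_valid    = (opcode == 7'b0010011) && (funct3 == 3'b000); // OP-IMM, ADDI
    assign spec_rs1_addr = rs1;
    assign spec_rs2_addr = 5'd0;                       // ADDI has no rs2
    assign spec_rd_addr  = rd;
    // rd <- rs1 + sign-extended immediate (dropped if rd is x0)
    assign spec_rd_wdata = (rd == 0) ? 32'd0 : rvfi_rs1_rdata + {{20{imm[11]}}, imm};
    // default assignment: PC advances by 4
    assign spec_pc_wdata = rvfi_pc_rdata + 32'd4;

    // Note: no assertions here -- this module only describes expected behavior.
endmodule
```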
So, you might see there is no assert here, there are no assertions, because this is just a model of the behavior we expect. Then there is a wrapper that instantiates this model, instantiates the core, and builds the proof, and that's where the assertions are. The main reason why we don't have assertions here, but instead output the desired behavior, is that I can also generate monitor cores that run alongside your core and check, in simulation or in emulation on an FPGA, whether your core is doing the right thing. That can be very, very helpful if you have a situation where you run your core for maybe days and then you see some observable behavior that's not right, but there are thousands, even millions, of cycles between the point where you can observe that something is wrong and the point where the processor actually started diverging from what the specification says. If you can use a monitor core like that, then it's much easier to find bugs like this.

Okay, so some examples of those consistency checks. The list is not complete, and it varies a little bit from processor to processor which consistency checks we can actually run with the processor we are looking at. There is a check for the program counter: an instruction reports "this is the program counter for this instruction and this is the program counter for the next instruction", and then we can look at the next instruction and see whether its program counter actually matches the next-program-counter value of the previous instruction -- they must link together like that. But the core might retire instructions out of order, so it might be that we see the first instruction first and the second instruction later, but it's also possible that we see the second instruction first and the first instruction later, and because of that there are two different checks: one for a pair of instructions in the non-reversed order and one for a pair in the reversed order. There is one check that checks whether register value reads and writes are consistent -- a sketch of that one follows below. And there is one check that verifies that the processor is alive: if I give the processor certain fairness constraints, for example that the memory will always answer a read within some number of cycles, then I can use this to prove that the processor will not just suddenly freeze.
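A minimal sketch of that register read/write consistency check is shown below. The name check_reg, the flag names, and the single-channel, in-order simplification are assumptions made here for illustration; the real check also has to handle multiple retirement channels and out-of-order retirement via rvfi_order.

```verilog
// Sketch of a register read/write consistency check for a single retirement
// channel that retires in program order.
module reg_consistency_check (
    input        clk,
    input        rvfi_valid,
    input [ 4:0] rvfi_rs1_addr, rvfi_rs2_addr, rvfi_rd_addr,
    input [31:0] rvfi_rs1_rdata, rvfi_rs2_rdata, rvfi_rd_wdata
);
    // Solver-chosen but fixed register index to watch (Yosys formal attribute;
    // other tools have equivalents).
    (* anyconst *) wire [4:0] check_reg;

    reg        written = 0;
    reg [31:0] written_value;

    always @(posedge clk) begin
        if (rvfi_valid) begin
            // Any read of check_reg must observe the last value written to it.
            if (written && rvfi_rs1_addr == check_reg)
                assert (rvfi_rs1_rdata == written_value);
            if (written && rvfi_rs2_addr == check_reg)
                assert (rvfi_rs2_rdata == written_value);

            // Record the most recent write (writes to x0 are ignored).
            if (rvfi_rd_addr == check_reg && check_reg != 0) begin
                written       <= 1;
                written_value <= rvfi_rd_wdata;
            end
        end
    end
endmodule
```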
This liveness check is very important, and it also proves that the processor is not skipping instruction indices, which matters because some of the other checks depend on the processor behaving that way. And so forth. So, there are a couple of these consistency checks, and it's a nice exercise to sit down in a group of people, go through the list of consistency checks, and see which set of them is actually meaningful, or which set of them leaves an interesting verification gap so that we still need to add checks for this or that processor.

Okay, so what kind of bugs can it find? That's a super-hard question, because it's really hard to give a complete list. It can definitely find incorrect single-threaded instruction semantics: if you just implement an instruction incorrectly in your core, this will find it, no question about it. It can find a lot of bugs in things like bypassing and forwarding and pipeline interlocks -- things where you reorder stuff in a way you shouldn't -- freezes, if you have this liveness check, and some bugs related to memory interfaces and load/store consistency, although it depends on things like the size of your cache lines whether that is a feasible proof or not. Bugs that we can't find yet with riscv-formal are things that are not yet covered by the riscv-formal interface, like the floating-point stuff or CSRs, but this is all on the to-do list; we are actively working on that, and a year from now this stuff will be included. And anything related to concurrency between multiple harts. So far, my excuse for that was that the RISC-V memory model is not completely specified yet, so I would not actually know what to check exactly, but right now the RISC-V memory model is in the process of being finalized, so I won't have this excuse for much, much longer.

So, the processors currently supported are PicoRV32, which is my own processor, then RISC-V Rocket, which is probably the most famous RISC-V implementation, and VexRiscv. There are also a couple of others, but they are not part of the open-source release of riscv-formal. So, if you would like to add support for your RISC-V processor to riscv-formal, just check out the riscv-formal repository, look at the "cores" directory, see which of the supported cores is closest to the core that you actually have, and then copy that directory and make a couple of small modifications.

So, I have a few minutes left to talk about things like cut-points and blackboxes and other abstractions.
So, the title of this slide could just be "abstractions", because cut-points and blackboxes are just abstractions. The idea behind an abstraction in formal methods is that I swap out part of my design for a different circuit that is less constrained: it includes the behavior of the original circuit, but might do other stuff as well. The textbook example would be: I have a design with a counter, and usually the counter would just increment in steps of 1, but now I create an abstraction that can skip numbers and will just increment in strictly increasing steps; a sketch of such a counter abstraction follows below. This, of course, includes the behavior of the original design, so if I can prove a property with this abstraction in place instead of the just-increment-by-1 counter, then we have proven an even stronger property, and that includes the same property for the original design.

And this idea of abstractions actually works very well with riscv-formal. The main reason why we do abstractions is that it leads to easier proofs. For example, consider an instruction checker that just checks whether the core implements the "add" instruction correctly. For this checker, we don't actually need a working register file. We could replace the register file with something that just ignores all writes, and whenever we read something from it, returns an arbitrary value. That would still include the behavior of a core with a functional register file, but because the instruction checker does not care about consistency between register file writes and register file reads, we can still prove that the instruction is implemented correctly, and therefore we get an easier proof. Of course, we can't use this abstraction for all the proofs, because there are other proofs that actually check whether the register file works as I expect it to work, but if we go through the list of proofs and run them all independently, you will see that for each of them it's possible to abstract away a large portion of your processor and therefore get an easier proof.

Depending on what kind of solvers you use, some solvers are actually very capable of finding these kinds of abstractions themselves.
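A minimal sketch of that counter abstraction, written as a tool-agnostic formal Verilog module (the module name and the width of the step input are assumptions for illustration, not anything from riscv-formal):

```verilog
// Original circuit: a counter that increments by exactly 1 each cycle.
// Abstraction: a counter that increments by an arbitrary amount >= 1 each
// cycle. Every behavior of the original is a behavior of the abstraction,
// so a property proven on the abstraction also holds for the original.
module counter_abstraction (
    input  wire        clk,
    input  wire        rst,
    input  wire [31:0] step,   // left unconstrained except for the assumption below
    output reg  [31:0] count
);
    always @(posedge clk) begin
        if (rst)
            count <= 0;
        else
            count <= count + step;
    end

`ifdef FORMAL
    // Strictly increasing steps; step == 1 recovers the original counter.
    always @(*) assume (step >= 1);
`endif
endmodule
```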
In the cases where the solver finds these abstractions by itself, it doesn't really help to add them manually, but just realizing that the potential for these abstractions is there is very useful when deciding how to split up a large verification problem into smaller verification problems, because you would like to split up the problem in such a way that the solver is always capable of finding useful abstractions that actually lead to easier circuits to prove.

With a bounded check, we also have the question of what bound to use. Of course, larger bounds are better, but larger bounds also yield something that is harder to compute, and if you have a small bound, then you have a proof that runs very, very quickly, but maybe you're not very confident that it has actually proven something that's relevant for you. So, I propose two solutions for this. The first solution is that you can use the same solvers to find traces that cover certain events. You could write a list and say "I would like to see one memory read and one memory write and at least one ALU instruction executed", things like that, and then you can ask the solver "What is the shortest trace that actually satisfies all this?" -- a sketch of such a cover check follows below. When that's a trace of, say, 25 cycles, then you know: "Okay, when I look at a proof that's 25 cycles deep, I know at least these are the cases that are going to be covered."

But more important, I think, is this: usually when you have a processor, you have already found bugs, and it's a good idea to not just fix the bugs and forget about them, but to preserve some way of re-introducing them, just to see whether your testing framework works. So, if you already have a couple of bugs and you know "it took me a week to find that one and a month to find that one", the best thing is to add the bugs to your design again and see what bounds are necessary for riscv-formal to actually discover those bugs, and then you have some degree of confidence that other, similar bugs would also have been found with the same bounds.

So, "Results": I have found bugs in pretty much every implementation I looked at; I found bugs in all three processors; we found bugs in Spike, which is the official implementation of RISC-V in C. I found a way to formally verify my specification against Spike, and in some cases where I found a difference between my specification and Spike, it turned out to be a bug in the English-language specification. So, because of that, I also found bugs in the English-language specification with riscv-formal.
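That cover-based calibration of the bound could be sketched like this, for a single retirement channel. The module name, the flag names, and the particular event list are made up for illustration.

```verilog
// Sketch: ask the solver for the shortest trace in which the core retires at
// least one memory read, one memory write and one register-register ALU
// instruction. The depth of that trace is a useful lower bound for the
// depth of the bounded proofs.
module bound_cover (
    input        clk,
    input        rvfi_valid,
    input [31:0] rvfi_insn,
    input [ 3:0] rvfi_mem_rmask, rvfi_mem_wmask
);
    reg seen_mem_read  = 0;
    reg seen_mem_write = 0;
    reg seen_alu_op    = 0;

    always @(posedge clk) begin
        if (rvfi_valid) begin
            if (rvfi_mem_rmask != 0)          seen_mem_read  <= 1;
            if (rvfi_mem_wmask != 0)          seen_mem_write <= 1;
            if (rvfi_insn[6:0] == 7'b0110011) seen_alu_op    <= 1;  // RV32I OP (ADD, SUB, ...)
        end
        // Reached once all three events have occurred in one trace.
        cover (seen_mem_read && seen_mem_write && seen_alu_op);
    end
endmodule
```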
"Future 00:24:47.950 --> 00:24:53.519 work": Multipliers already supported, floating-point is still on the to-do list, 00:24:53.519 --> 00:25:01.600 64-bit is half done. We would like to add support for CSRs. We would like to add 00:25:01.600 --> 00:25:06.120 support for more cores, but this is something that I would like to do slowly, 00:25:06.120 --> 00:25:10.649 because adding more cores also means we have less flexibility with changing 00:25:10.649 --> 00:25:17.970 things. And better integration with non- free tools, because right now all of that 00:25:17.970 --> 00:25:22.600 runs with open source tools that I also happen to write -- so, I wrote those 00:25:22.600 --> 00:25:27.580 tools, but some people actually don't want to use my open source tools; they would 00:25:27.580 --> 00:25:31.260 like to use the commercial tools -- and it's the to-do list that I have better 00:25:31.260 --> 00:25:35.750 integration with those tools. Maybe, because I don't get licenses to those 00:25:35.750 --> 00:25:41.049 tools, so we will see how this works. That's it. Do we have still time for 00:25:41.049 --> 00:26:02.709 questions? applause 00:26:02.709 --> 00:26:05.380 Clifford: So, I'd say we start with questions at 1. 00:26:05.380 --> 00:26:11.240 Herald: I'm sorry; here we go. We have 2 questions; we have time for 2 questions 00:26:11.240 --> 00:26:15.720 and we're going to start with microphone number 1, please. 00:26:15.720 --> 00:26:22.779 M1: Hello, thanks for your talk and for your work. First question: You told about 00:26:22.779 --> 00:26:26.690 riscv-formal interface. Clifford: Yes. 00:26:26.690 --> 00:26:34.199 M1: So, does vendor ship the final processor with this interface available? 00:26:34.199 --> 00:26:39.529 Clifford: Oh, that's a great question, thank you. This interface has only output 00:26:39.529 --> 00:26:45.290 ports and actually when you implement this interface, you should not add something to 00:26:45.290 --> 00:26:49.230 your design that's not needed to generate those output ports. So, what you can do 00:26:49.230 --> 00:26:53.510 is, you can take the version of your core with that interface, the version of the 00:26:53.510 --> 00:26:58.480 core without that interface. Then, in your synthesis script, just remove those output 00:26:58.480 --> 00:27:03.720 ports and then run a formal equivalence check between that version and the version 00:27:03.720 --> 00:27:10.820 that you actually deploy on your ASIC. M1: Thanks; and one short question: When 00:27:10.820 --> 00:27:17.381 people say formal verification, usually others think "Oh, if it is verified, it is 00:27:17.381 --> 00:27:26.220 excellent, absolutely excellent. Do you plan to say that it will find all the 00:27:26.220 --> 00:27:30.649 erato for the processor? Clifford: Well, it depends on what kind of 00:27:30.649 --> 00:27:35.990 proof you run. The most work I do is with bounded proofs and there you only get a 00:27:35.990 --> 00:27:40.830 certain degree of confidence, because you only see bugs that can occur within a 00:27:40.830 --> 00:27:46.160 certain number of cycles from reset, but if you want, you can also run a complete 00:27:46.160 --> 00:27:52.539 proof, where you start with a symbolic state instead of a reset state and then 00:27:52.539 --> 00:27:57.510 you can can make sure that you actually check the entire reachable state space, 00:27:57.510 --> 00:28:01.509 but that's a very, very hard thing to do So, that's not a weekend project. 
Just adding the riscv-formal interface and running some bounded proofs is probably a weekend project if you already have your RISC-V processor.

M1: Thank you.

Herald: Thank you. We actually do not have time for any more questions, but there will be time after the talk to ask you questions... maybe?

Clifford: Yeah. So, maybe you can find me at the open FPGA assembly, which is part of the hardware hacking area.

Herald: Super. A very great job to put that much information into 30 minutes. Please help me thank Cliff for his wonderful talk.

[applause]

[34c3 outro]

Subtitles created by c3subtitles.de in the year 2018. Join, and help us!