36C3 - SELECT code_execution FROM * USING SQLite;

Edit subtitles

0:00 - 0:18

36C3 preroll music
0:18 - 0:25

Herald Angel: On my left, I have Omer
Gull, and Omer Gull is gonna tell us - A
0:25 - 0:33

SELECT code_execution FROM * USING SQLite.
Omer, Omer Gull, that's your talk, your stage, have fun!
0:33 - 0:42

Omer Gull: Thank you. Thank you. Hello,
everyone. Welcome to my talk, SELECT code
0:42 - 0:49

execution from just about anything using
SQLite, where we will gain code execution
0:49 - 0:54

using malicious SQLite databases. So my
name is Omer Gull. I'm a vulnerability
0:54 - 1:00

researcher from Tel Aviv. I've been
working in Check Point Research for the
1:00 - 1:06

past three years, and I've recently moved
on to a new startup called Hunters.AI.
1:06 - 1:11

Our agenda for today: So we'll start with a
little motivation and the backstory for
1:11 - 1:18

this research. Then we'll have a brief
SQLite introduction, and we'll examine the
1:18 - 1:23

attack surface given to a malicious
database. Then we will discuss some
1:23 - 1:29

previous work done in the field of SQLite
exploitation and think about exploiting
1:29 - 1:36

memory corruption bugs, using nothing but
pure SQL. We'll then demonstrate our own
1:36 - 1:41

innovative technique, called "Query
Oriented Programing", or QOP, and take it
1:41 - 1:48

for a spin in a couple of demos. We'll
wrap things up with some future work
1:48 - 1:54

possibilities and some conclusion. So the
motivation for this research is quite
1:54 - 2:01

obvious. SQLite is one of the most
deployed pieces of software out there.
2:01 - 2:08

Whether it's PHP 5, PHP 7, Android, iOS,
Mac OS, it is now built into Windows 10.
2:08 - 2:14

It's in Firefox and Chrome. This list
could continue forever. Yet querying an
2:14 - 2:20

SQLite database is considered safe.
Hopefully, by the end of this talk, you
2:20 - 2:27

will realize why this is not necessarily
the case. So, it all began with password
2:27 - 2:32

stealers, which is pretty strange, and
there are many, many password stealers in
2:32 - 2:38

the wild, but the story is usually the
same. First of all, a computer gets
2:38 - 2:43

infected. Then some malware collects the
storage credentials, as they are
2:43 - 2:49

maintained by various clients. Now, some
of these clients actually store your
2:49 - 2:55

secrets within SQLite databases. So the
malware just ships these SQLite databases
2:55 - 3:02

to its C2 server, where the secrets are
extracted and stored within a collective
3:02 - 3:08

database with the rest of the loot. So,
one day, my colleague and I, Omri, we were
3:08 - 3:13

looking at the leaked sources of a very
well known password stealers. Then we
3:13 - 3:18

thought to ourself: "These guys are just
harvesting a bunch of our databases and
3:18 - 3:25

parse them in their own backend. Can we
actually leverage the load and query of an
3:25 - 3:30

untrusted database to our advantage?" And
if we could, this could have much bigger
3:30 - 3:37

implications, just because SQLite is used
in countless scenarios. And so began the
3:37 - 3:46

longest CTF challenge of my life so far.
So, SQLite. Unlike most SQL databases,
3:46 - 3:53

SQLite does not have that client server
architecture. Instead, it simply reads and
3:53 - 3:59

write files directly to the file system.
So, you have one complete database, with
3:59 - 4:03

multiple tables, and indices, and
triggers, and views, and everything is
4:03 - 4:10

contained within the single file. So let's
examine the attack surface given to a
4:10 - 4:15

potentially malicious SQLite database. So
again, this is a snippet of code, of a
4:15 - 4:21

very well known password stealer, and we
have two main points of interest here:
4:21 - 4:26

First of all, we have sqlite3_open, where
our potentially malicious database is
4:26 - 4:32

loaded, and some parsing is going on. And,
obviously, we have the query itself,
4:32 - 4:37

right? The SELECT statement. Now, do note
that we have no control over that
4:37 - 4:41

statement, right. It is hardcoded within
our target. It tries to extract the
4:41 - 4:47

secrets out of our database. Yet we do
control the content, so we might have some
4:47 - 4:52

effect on what's going on there, right.
So, starting with the first point, the
4:52 - 4:58

sqlite3_open. This is just a bunch of
setup and configuration code. Then we move
4:58 - 5:03

on to really straightforward header
parsing, and the header itself is not that
5:03 - 5:10

long, just 100 bytes. And thirdly, it was
already fuzzed to death by AFL. So it's
5:10 - 5:17

probably not a very promising path to
pursue. But the SEL- the sqlite3_query
5:17 - 5:22

might be a bit more interesting, because
using SQLite author's words, "The SELECT
5:22 - 5:29

statement is the most complicated command
in the SQL language". Now, you might be
5:29 - 5:36

aware that behind the scenes, SQLite is a
virtual machine. So every query must first
5:36 - 5:42

be compiled to some proprietary bytecode.
And this is also known as the preparation
5:42 - 5:48

step. So sqlite3_prepare would walk and
expand the query. So, for example, every
5:48 - 5:56

time you select an asterisk, it simply
rewrite this asterisk as all column names.
5:56 - 6:02

So, sqlite3LocateTable will actually
verified that all the relevant objects
6:02 - 6:07

that you are querying actually exist, and
will locate them in memory. Where does it
6:07 - 6:14

locate them? So, every SQLite database has
a table called sqlite_master, and this is
6:14 - 6:20

actually the schema that is defining the
database. And this is its structure. So,
6:20 - 6:26

for every object in the database, you have
an entry, so its type, like whether it's a
6:26 - 6:32

table or view, and its name, and at the
very bottom you can see something called
6:32 - 6:41

SQL. And SQL is actually the DDL that is
describing the object. And DDL stands for
6:41 - 6:47

data definition language, and you can sort
of look at it like header files in C. So,
6:47 - 6:52

they are used to define the structures,
and names, and types of the objects within
6:52 - 6:58

the database. Furthermore, they appear in
plaintext within the file. So, let me show
6:58 - 7:03

you an example. Here I opened the SQLite
interpreter, I create a table, and I
7:03 - 7:08

insert some values into it. Then I quit
the interpreter, and now I hex dump the
7:08 - 7:15

file that was created. And you can see,
highlighted in yellow, the DDL statement
7:15 - 7:20

that is part of the master schema. And at
the very bottom, you can also see the
7:20 - 7:26

values. So, let's go back to query
preparation. We have sqlite3LocateTable
7:26 - 7:31

that attempts to find the structure that
is describing the table that we are
7:31 - 7:37

interested in querying. So it goes on and
reads the schema available in
7:37 - 7:41

sqlite_master that we just described. And
if it's the first time that it is doing
7:41 - 7:48

so, it has some callback function for
every of these DDL statements. The
7:48 - 7:53

callback function would actually validate
the DDL, and then it will go on and build
7:53 - 7:58

the internal structures of the object in
question. So then we thought about the
7:58 - 8:06

concept of DDL patching. What if I simply
replace the SQL query within the DDL? So
8:06 - 8:11

it turns out that there is a slight
problem with it. And this is the callback
8:11 - 8:16

function that I mentioned earlier, and as
you can tell, the DDL is first verified to
8:16 - 8:23

begin with "CREATE ". And only if it does
then we continue with the preparation. So,
8:23 - 8:29

this is definitely a constraint, right?
Our DDL must begin with "CREATE ". Yet it
8:29 - 8:35

does leave some room for flexibility,
because judging by SQLite documentation,
8:35 - 8:41

many things can be created. We can create
indexes, and tables, and triggers, and
8:41 - 8:47

views, and something we still don't quite
understand, called virtual tables. So,
8:47 - 8:53

then we thought about "CREATE VIEW",
because view is simply a pre-packaged
8:53 - 9:00

SELECT statement, and views are queried
very similarly to tables. So, selecting a
9:00 - 9:05

column out of a table is semantically
equivalent to selecting a column out of a
9:05 - 9:11

view. Then when we thought about the
concept of query hijacking. We are going
9:11 - 9:17

to patch sqlite_master DDL with views
instead of tables. Now our patched views
9:17 - 9:23

can actually have any SELECT subquery that
we wish. And now with this subquery I can
9:23 - 9:27

suddenly interact with the SQLite
interpreter. And this is a huge step
9:27 - 9:32

forward! We just turned an uncontrollable
query to something that we have some
9:32 - 9:38

control over. So let me show you query
hijacking by example. So let's say that
9:38 - 9:44

some original database had a single table,
and this is the DDL that is defining it.
9:44 - 9:49

So it's called "dummy" and it has two
columns. So, obviously any target software
9:49 - 9:53

would try to query it in the following
way: It would just try to select these
9:53 - 9:58

columns out of the table, right. Yet the
following view can actually hijack this
9:58 - 10:03

query. I create a view, it has just the
same name, and just the same amount of
10:03 - 10:08

columns, and each column is named just the
same way, and you can see that now every
10:08 - 10:14

column can have any sub query that I wish,
highlighted in blue at the bottom. So
10:14 - 10:19

again, let me show you a practical example
of it. Here, I created a view called
10:19 - 10:24

"dummy" with cola and colb, and the first
column is utilizing the sqlite_version()
10:24 - 10:28

function, and that's a built in function
that simply returns the SQLite version,
10:28 - 10:35

obviously. The second column is utilizing
SQLite's own implementation of printf.
10:35 - 10:40

That's right, they have all these really
surprising features and capabilities. So,
10:40 - 10:46

let's see that from the supposedly - from
the target side. So, anyone trying to
10:46 - 10:52

select out of these column, is suddenly
executing our functions. So, at the left
10:52 - 10:57

you can see the SQLite version, and on the
right you can see the printf that was
10:57 - 11:02

executed on the target side. So again,
this is a huge step forward. We just
11:02 - 11:09

gained some control over that query,
right. And the question is, what can we do
11:09 - 11:15

with this control? Does SQLite have any
system commands? Can maybe - maybe we can
11:15 - 11:23

read and write to some other files on the
file system. So, this was a good point to
11:23 - 11:28

stop and look at some previous work done
in the field, because obviously, we are
11:28 - 11:35

not the first to notice SQLite's huge
potential, in terms of exploitation. A
11:35 - 11:39

reasonable place to start is SQLite
injection, because this is sort of a
11:39 - 11:46

similar scenario, right? Someone malicious
has some control on an SQL query. So,
11:46 - 11:52

there are a couple of known tricks in SQL
injection with SQLite. The first one has
11:52 - 11:56

something to do with attaching another
database, and then creating a table, and
11:56 - 12:01

inserting some strings into it. And
because, as I mentioned earlier, every -
12:01 - 12:07

every database is just a file, so this is
somewhat of an arbitrary file, right, on
12:07 - 12:11

the file system. Yet we do have this
constraint, if you remember, that we can't
12:11 - 12:18

ATTACH, because our DDL must begin with
"CREATE". Another cool trick is abusing
12:18 - 12:24

the load extension function. And here you
can see how you can potentially load a
12:24 - 12:28

remote DLL. In this case, the
meterpreter.dll, but obviously this very
12:28 - 12:35

dangerous function is disabled by default.
So again, no go. What about memory
12:35 - 12:42

corruption in SQLite, because SQLite is
really complex and it's all written in C.
12:42 - 12:48

So in his amazing blog post "Finding Bugs
in SQLite, the easy way", Michal Zalewski,
12:48 - 12:55

the author of AFL, described how he found
22 bugs in just under 30 minutes of
12:55 - 13:02

fuzzing. And actually, since then, since
that was version 3.8.10, that was in 2015,
13:02 - 13:07

SQLite actually started using AFL as an
integral part of the remarkable test
13:07 - 13:12

suite. Yet these memory corruption bugs
all proved to be really difficult to
13:12 - 13:17

exploit without some convenient
environment. Yet, the security research
13:17 - 13:25

community soon found the perfect target,
and it was called WebSQL. So, WebSQL is
13:25 - 13:30

essentially an API for storing data in
databases, and it is queried from
13:30 - 13:36

JavaScript and it has an SQLite backend.
Also, it is available in Chrome and
13:36 - 13:42

Safari. So here you can see a very simple
example of how to interact with WebSQL
13:42 - 13:51

from JavaScript. But in other words, what
I'm hearing here is that we have some
13:51 - 13:56

untrusted input to SQLite and it is
reachable from any website on the Internet
13:56 - 14:02

in two of the world's most popular
browsers. And suddenly these bugs, these
14:02 - 14:07

memory corruptions can now be leveraged
with the knowledge and - with the
14:07 - 14:11

knowledge and comfort of JavaScript
exploitation, right. The JavaScript
14:11 - 14:16

interpreter exploitation that we got, we
got pretty good over the years. So there
14:16 - 14:20

have been really several really impressive
research that were published regarding
14:20 - 14:29

WebSQL from really low hanging fruits like
CVE-2015-7036, that was an untrusted
14:29 - 14:35

pointer dereference in the fts3_tokenizer(),
to some more complex exploit as presented
14:35 - 14:41

in Blackhat 2017 by the awesome Chaitin
team, that found a type confusion in the
14:41 - 14:46

FTS optimizer, to the very recent Magellan
bugs found and exploited by Tencent, that
14:46 - 14:52

found an integer overflow in the FTS
segment reader. And if you are paying even
14:52 - 14:58

a tiny bit of attention by now, you must
see an interesting pattern arises. All
14:58 - 15:05

these vulnerable functions start with FTS.
So what is FTS? I have never heard of it.
15:05 - 15:12

And actually googling it just left me more
confused. After some time, I came to the
15:12 - 15:18

realization that FTS stands for "full text
search", and it is something called a
15:18 - 15:23

virtual table module and it allows for
some really cool textual search on a set
15:23 - 15:28

of documents. Or like the SQLite authors
described it, It's "like Google for your
15:28 - 15:34

SQlite databases". So virtual tables allow
for some pretty cool functionality in
15:34 - 15:39

SQLite, whether it's this free text search
or a virtual table module called RTREE,
15:39 - 15:45

that does some really clever geographical
indexing, or a virtual table called CSV,
15:45 - 15:50

that lets you treat your database as a CSV
file. And these virtual tables are
15:50 - 15:56

actually queried just like regular tables.
Yet behind the scenes, some dark magic
15:56 - 16:02

happens. And after every query, there is
some callback function that is invoked and
16:02 - 16:07

it works on something called shadow
tables. Now, shadow tables would be best
16:07 - 16:14

explained by example. So let's say that I
create a virtual table using that FTS
16:14 - 16:20

virtual table module and I insert a string
into it. Now, obviously, to allow for some
16:20 - 16:24

efficient search, I need to have some
metadata, right? I need to have some
16:24 - 16:29

offsets or indexes or tokens or stuff like
that. So - and obviously they're all text,
16:29 - 16:35

right? So that one virtual table is
actually, it's raw text, and metadata is
16:35 - 16:41

stored among three shadow tables. So the
raw text would go to vt_content and the
16:41 - 16:46

metadata would go to vt_segments and
vt_segdir. And in time, these shadow
16:46 - 16:52

tables actually have interfaces passing
information between them, right. Because
16:52 - 16:56

the metadata is storing all these
pointers, so you need to pass them between
16:56 - 17:02

each other. And these interfaces proved to
be really, really trusting in their
17:02 - 17:10

nature. And it makes them a really fertile
ground for bug hunting. So let me show you
17:10 - 17:16

a bug that I found in the RTREE virtual
table module. So, RTREE virtual table
17:16 - 17:21

module is available now in MacOS and iOS,
and it's really cool because now it's also
17:21 - 17:25

built into Windows 10. And as I've
mentioned, it does some really clever
17:25 - 17:31

geographical indexing. Now, the DDL is
supposed to be the following. Any RTREE
17:31 - 17:37

virtual table is supposed to begin with
"id", that needs to be an integer. Then
17:37 - 17:42

you have some X and Y coordinates. So
obviously every RTREE interface would
17:42 - 17:48

expect "id" to be an integer. But if I
create a virtual table and I insert into
17:48 - 17:54

"id" something that is definitely not an
integer, then I use one of these RTREE
17:54 - 18:00

interfaces, rtreenode at the very bottom,
you see that we got this crash, this out-
18:00 - 18:09

of-bounds read on the heap. And this
example is a crash in Windows 10. So
18:09 - 18:15

that's pretty good. We have established
that virtual table has bugs. And now,
18:15 - 18:21

using query hijacking technique, we can
suddenly trigger these bugs on our target,
18:21 - 18:26

which is a C2 of the password stealer, and
will cause it to segfault. And this is
18:26 - 18:32

nice, but actually gaining flow control
over our target requires us to have some
18:32 - 18:37

form of scripting, right? We want to
bypass ASLR and do all these crazy things.
18:37 - 18:42

Yet we don't have JavaScript, we don't
have JavaScript arrays and variables and,
18:42 - 18:48

like, logic statements, like if and and
loops and stuff like that. However, we do
18:48 - 18:55

vaguely recall hearing somewhere that SQL
is turing complete. So we decided to put
18:55 - 19:01

it to the test from exploitation
perspective, and we started creating our
19:01 - 19:07

own primitive wish list for exploitation.
So if it would create a full exploit
19:07 - 19:11

exploiting memory corruption bugs with
nothing but SQL, what capabilities do we
19:11 - 19:16

want? So, obviously, to bypass ASLR and
these kind of things, we want to leak some
19:16 - 19:21

memory. We need to have an info leak. And
if you've done any pwning in your past,
19:21 - 19:27

you must be familiar with really common
tasks like unpacking 64-bit pointers and
19:27 - 19:31

doing some pointer arithmetics, right?
Because we had an info leak, we converted
19:31 - 19:36

- we read this pointer, and it's a little
endian and so we need to flip it, and now
19:36 - 19:40

we want to calculate, let's say where's
the base of libsqlite so we can find some
19:40 - 19:46

more functions maybe. So we need some
pointer arithmetics. Obviously, after
19:46 - 19:51

reading pointers and manipulating them, we
want to pack them again and write them
19:51 - 19:58

somewhere. Obviously, writing a single
pointer is never enough. We want to create
19:58 - 20:03

fake objects in memory, like more complex
objects than this one pointer. And
20:03 - 20:08

finally, we would like to heap spray
because this might be really useful. So,
20:08 - 20:13

the question remains, can all this
exploitation be done with nothing but SQL?
20:13 - 20:19

So, the answer is "yes, it is". And I
proudly present to you Query Oriented
20:19 - 20:26

Programing, or QOP. And to demonstrate
QOP, we are going to exploit the unfixed
20:26 - 20:37

CVE-2015-7036. And you might ask yourself
"What? How come a four year old bug is
20:37 - 20:44

still unfixed?" And this is a great point
to our argument. This CVE was only ever
20:44 - 20:50

considered dangerous in the context of
untrusted WebSQL. So it was mitigated
20:50 - 20:55

accordingly, right. It is blacklisted
unless SQLite is compiled with is certain
20:55 - 21:00

flag. So, obviously browsers are not
compiled with this flag anymore. But let
21:00 - 21:05

me show you who is compiled with this
flag. So, we have PHP 5 and PHP 7, in
21:05 - 21:09

charge on most of the Internet, and iOS
and MacOS and probably so many other
21:09 - 21:15

targets that we just didn't have the time
to go over. So, let's explain this
21:15 - 21:21

vulnerability a little bit. I've mentioned
that it's in the FTS tokenizer. So, a
21:21 - 21:27

tokenizer is just a set of rules to
extract terms from documents or queries.
21:27 - 21:33

And the default tokanizer, that is named
"simple", just split the strings by
21:33 - 21:39

whitespaces. However, if you like, you can
register your own custom tokenizer. You
21:39 - 21:44

can just pass a C function and you
actually register this custom tokenizer
21:44 - 21:50

with the function fts3_tokenizer() in an
SQL query. This is a bit where it's all
21:50 - 21:56

repeated slowly. You pass a row pointer
to a C function in an SQL query. This is
21:56 - 22:01

absolutely insane. To be honest, after
studying this for quite a while, I still
22:01 - 22:06

don't understand how to use this feature
outside of my exploit.
22:06 - 22:16

audience laughing
clapping
22:16 - 22:18

So fts3_tokenizer()
22:18 - 22:23

is actually an overloaded function, and if
you call it with one argument that is the
22:23 - 22:28

name of a tokenizer, you get back the
address of that tokenizer, and to make it
22:28 - 22:35

a bit more human readable, or somewhat
human, we'll use the hex decoder. And you
22:35 - 22:40

can now see that we actually got an info
leak to libsqlite3. Now, because it's
22:40 - 22:43

little endian, so it's the other way
around, so we need to reverse it, but this
22:43 - 22:48

is already pretty cool. If you call
fts3_tokenizer() with two arguments, the
22:48 - 22:53

first one being a name of a tokenizer, and
the second one, again, is a row pointer,
22:53 - 22:58

this is absolutely insane, you rewrite the
address of that tokenizer. So now,
22:58 - 23:05

whenever someone will try to use a virtual
table, so it will instantiate our default
23:05 - 23:11

tokenizer, right. It will crash and burn.
And this is pretty amazing. Let's have a
23:11 - 23:17

short recap. So, we've established that
SQLite is a wonderful one-shot for many
23:17 - 23:22

targets, right? It's absolutely
everywhere. And it is a complex machine
23:22 - 23:27

that is written in C. Now, with query
hijacking, we can start triggering these
23:27 - 23:33

bugs, and we aim to write a full exploit,
implementing all necessary primitives
23:33 - 23:40

using SQL queries. Our exploitation game
plan is as follows: We will leak some
23:40 - 23:46

pointers, and then we'll calculate some
function addresses. We'll then create a
23:46 - 23:52

fake tokenizer object with some pointer to
system(). We will override the default
23:52 - 23:58

tokenizer and trigger our malicious
tokenizer. Then something will happen, and
23:58 - 24:03

obviously by the end of the process we
should be able to profit somehow, right?
24:03 - 24:09

So, starting with memory leak, an info
leak to libsqlite. So, you already know
24:09 - 24:13

how to do it, right? We've seen
fts3_tokenizer(), but we still have a tiny
24:13 - 24:19

problem of the little-endian pointer. So
we need to flip it. Now, surely we can use
24:19 - 24:25

the SUBSTR function and read this pointer
two characters at a time in a reverse
24:25 - 24:30

fashion, and then simply concatenate
everything throughout the pointer. So, we
24:30 - 24:35

get a SELECT query that looks the
following, but now we have our pointer.
24:35 - 24:40

This is great. What about an info leak to
the heap? We want to know where the heap
24:40 - 24:45

is located. So, to do that trick, I'm
going to do something pretty similar to
24:45 - 24:49

the RTREE bug that we've found. So, again,
I'm going to confuse some shadow table
24:49 - 24:58

interface. So, we created a virtual table
and inserted some values into it. And now
24:58 - 25:02

we're about to confuse the match
interface. So, the match interface does
25:02 - 25:07

many things, but behind the scenes, it
just finds - it serves the pointer in
25:07 - 25:11

memory to where the text is located,
right. It's this metadata, cool things
25:11 - 25:16

that virtual table has. So we're going to
confuse it, and instead of passing it to
25:16 - 25:21

another virtual table interface, we'll
simply pass it to the hex decoder, so we
25:21 - 25:26

will decode this raw pointer. And you can
see that, again in little-endian, but now
25:26 - 25:33

we have a link to the heap. We can cross
that off the list. And before we go on to
25:33 - 25:39

unpacking this pointer, we have a very
basic problem. How do we even save these
25:39 - 25:45

things? Because unlike browser WebSQL, we
don't have JavaScript variables or arrays
25:45 - 25:50

to use and then abuse them later, but we
need to create some complex logic, right?
25:50 - 25:55

We need to calculate the function address
and create things in memory, but how can
25:55 - 25:59

we do it? Obviously, with SQLite, when you
want to save some values, you need to have
25:59 - 26:05

insert statements, but we can only create
tables and views and index and triggers.
26:05 - 26:11

Then we thought about chaining this view
together to use them sort of like a
26:11 - 26:16

pseudo-variable. So again, let me show you
an example. Here, I create a view, it's
26:16 - 26:21

called "little-endian leak", right? And
again, I abuse the fts3_tokenizer()
26:21 - 26:26

function. Now I create another view on top
of it, and this one is called "leak", and
26:26 - 26:30

it's flipping it using the SUBSTR trick
that you know from before. But notice how
26:30 - 26:34

I referred to the first view, I referred
to "little-endian leak". So again, I do it
26:34 - 26:40

throughout the pointer, and eventually
what I have is a pseudo-variable that's
26:40 - 26:47

called "leak". And when I select from it,
I get the expected result. So now we can
26:47 - 26:51

really move forward. Now we can start
building some more complex things based on
26:51 - 26:57

this logic. And now we can go to unpacking
the pointers. So, we want to calculate a
26:57 - 27:03

base of an image, for example, or maybe
find the beginning of the heap. So first
27:03 - 27:09

of all, we want to convert our pointers to
integers. So, again, we're going to start
27:09 - 27:15

and read these pointers one character at a
time and in reverse fashion using SUBSTR.
27:15 - 27:21

And to get the value of this hex
character, we're going to use INSTR, that
27:21 - 27:26

is just like strchar, and using the
following string we'll get the value of
27:26 - 27:31

the hex character. Now, because it is one-
based, you have on the right the minus
27:31 - 27:40

one. Then I need to have some shifting,
like, dark magic, and then I go and just
27:40 - 27:45

concatenate everything throughout the
pointer. So the result is this monster
27:45 - 27:50

query. But eventually, when all of this is
done, I get an integer that is the
27:50 - 27:57

unpacked version of our initial leak. So I
successfully unpacked this pointer and we
27:57 - 28:02

now have integers at hand, so we can cross
that off the list as well. We know how to
28:02 - 28:08

convert pointers to integers. Now, pointer
arithmetics, right, because we want to
28:08 - 28:13

have the addresses of some functions in
memory and actually, with integer, this is
28:13 - 28:19

super simple. All we need to do is use
some more sub-queries. So, on the left you
28:19 - 28:22

can see that I'm referring to the, now,
pseudo-variables that we have, the
28:22 - 28:29

unpacked leak, and on the right I can
either have, like, subtract a silly
28:29 - 28:33

constant like I did here, or I can
actually use another pseudo-variable to
28:33 - 28:40

make it a bit more dynamic, can make some
a bit more reliable. So, eventually, what
28:40 - 28:47

I'm ending up with is the libsqlite base
in an integer form. So, we've read some
28:47 - 28:54

pointers and we manipulated them. Now it's
a good time to write them back. And
28:54 - 29:01

obviously we're all used to "char" being
the exact opposite of "hex". And you can
29:01 - 29:06

see that it works fairly well on most of
the values, but bigger integers were
29:06 - 29:11

actually translated to their two-byte
code-points. So this was a huge obstacle
29:11 - 29:20

for us. So, after bashing our head against
the documentation for quite a while, we
29:20 - 29:25

suddenly had the strangest epiphany. We
realized that our exploit is actually a
29:25 - 29:31

database. And if I want any conversion to
take place, I can simply create, ahead of
29:31 - 29:38

time, this key-value map and simply query
it to translate whatever value I want to
29:38 - 29:44

another value with sub-queries. So this is
the python function that I've used and you
29:44 - 29:48

can see that there's a very simple for
loop going from 0 to FF and just inserting
29:48 - 29:55

values to a table called "hex_map" with
its key-map value. And now our conversions
29:55 - 30:01

are using sub-queries. So again, let me
show you by example. You can see that I'm
30:01 - 30:07

selecting "val" from hex map, this key-
value map, where "int" is equal to, and
30:07 - 30:11

then I go and then doing some more like
shifting and modulo dark magic, but
30:11 - 30:18

eventually, what I'm ending up with is a
packed version of our libsqlite base. Now
30:18 - 30:22

it's, we have a packed little-endian
pointer, so we can cross that off the list
30:22 - 30:30

as well. As I've mentioned, writing a
single pointer is definitely useful, but
30:30 - 30:35

it's not enough. We want to be faking
complete objects, right? All the cool kids
30:35 - 30:40

are doing it and it's a pretty powerful
primitive. And if you actually recall, we
30:40 - 30:46

have to do it because fts3_tokenizer()
requires us to assign a tokenizer module.
30:46 - 30:51

Now, what is a tokenizer module? How does
it look? So, this is the beginning of its
30:51 - 30:55

structure and there is an "iVersion",
that's an integer at the beginning, we
30:55 - 31:00

don't really care about it, but following
it are three function pointers. We have
31:00 - 31:05

"xCreate", which is the constructor of the
tokenizer, and "xDestroy", which is the
31:05 - 31:11

destructor. We need to have both of them
valid so we don't crash during our
31:11 - 31:14

exploitation. The third function pointer
is really interesting, because this is
31:14 - 31:18

what actually tokenizes the string. So we
have a function pointer that we are
31:18 - 31:24

passing a controllable string into. This
would be a perfect place to put our system
31:24 - 31:33

gadget, right. So by now, I've used a fair
share of my SQL knowledge, right. But I do
31:33 - 31:38

have one more trick up my sleeve and
that's JOIN queries, right? Because I
31:38 - 31:44

learned about it at a time in the past. So
we're now going to create a fake tokenizer
31:44 - 31:48

view and we're going to concatenate a
bunch of "A"'s and then, using a JOIN
31:48 - 31:53

query, we'll concatenate it with a pointer
to simple_create and a pointer to
31:53 - 31:58

simple_destroy and then a bunch of "B"'s.
Now let's verify it from a low-level
31:58 - 32:03

debugger and you can actually see that at
some place in memory, we have a bunch of
32:03 - 32:07

"A"'s followed by a pointer to
simple_create, followed by a pointer to
32:07 - 32:14

simple_destroy, and a bunch of "B"'s. So,
we're almost done. But we need one more
32:14 - 32:20

primitive for this exploit. And this is
because we already have our malicious
32:20 - 32:25

tokenizer, and we know where the heap is
located, but we are not quite sure where
32:25 - 32:31

our tokenizer is. So this is a great time
for some heap spraying, right. And ideally
32:31 - 32:37

this would be some repetitive form of our
fake object primitive. So we've thought
32:37 - 32:45

about repeat, but sadly SQLite did not
implement it like mySQL. So, like anyone
32:45 - 32:51

else, we went to Stack Overflow, and we
found this really elegant solution. So
32:51 - 32:59

we're going to use the zeroblob function
that simply returns a blob of N zeros. And
32:59 - 33:04

then we will replace each of that zeros
with our fake tokenizer. And we're going
33:04 - 33:08

to do it ten thousand times, as you can
see above. Again, let's verifiy it with a
33:08 - 33:13

debugger. So, you see a bunch of "A"'s and
it's kind of hard to see, because these
33:13 - 33:17

are really bad colors, but we also got
perfect consistency, because these
33:17 - 33:25

structures repeat themselves every 20 hex
bytes. So we created a pretty good heap
33:25 - 33:31

spraying capabilities and we are done with
our exploitation primitive wish list, so
33:31 - 33:38

we can go back to our initial target.
Again, this is the code snippet of a very
33:38 - 33:43

well known password stealer. And at the
bottom, you can see that it's trying to
33:43 - 33:49

SELECT to extract the secrets by selecting
a column called "BodyRich" from a table
33:49 - 33:55

called "Notes". So, we're going to prepare
a little surprise for him, right? We're
33:55 - 33:59

going to create a view that is called
"Notes". And it has three sub-queries in a
33:59 - 34:04

column called "BodyRich". And each of
these sub-queries is actually a QOP chain
34:04 - 34:09

on its own. So if you remember my
exploitation game plan, we're going to
34:09 - 34:14

start with heap spray, and then we will
override the default tokenizer, and then
34:14 - 34:19

we will trigger our malicious tokenizer.
And you might ask yourself, what is
34:19 - 34:25

heap_spray? Obviously, heap_spray is a QOP
chain that utilizes our heap spraying
34:25 - 34:33

capabilities, right. We are spraying ten
thousand instances of our fake tokenizer,
34:33 - 34:38

that is a JOIN query of a bunch of "A"'s
and then some pointers like
34:38 - 34:44

p64_simple_create. And the party goes on,
because p64_ simple_create is actually
34:44 - 34:52

derived from u64_simple_create, right,
with our pointer-packing capabilities. And
34:52 - 34:57

this is just turtles all the way down,
because you have u64_simple_create, that
34:57 - 35:04

is derived from libsqlite_base plus some
constant. And this goes back to the
35:04 - 35:10

unpacked leak version, right, minus some
constant. So again, we're using our
35:10 - 35:16

pointer arithmetics capabilities. We can
continue with u64_leak being derived from
35:16 - 35:23

the almost initial leak. And we'll wrap up
by showing how the leak is actually
35:23 - 35:29

derived from the initial vulnerability
using fts3_tokenizer. And this was just
35:29 - 35:35

one out of three QOP chains that we used
in this exploit. And by now, every time
35:35 - 35:40

that I described this exploit, this is how
I must look. And to be honest, this is how
35:40 - 35:45

I feel. But luckily for you guys, you
don't have to look and feel like me
35:45 - 35:50

because we created QOP.py, and it is
available on Checkpoint Research GitHub.
35:50 - 35:57

And suddenly these crazy long chains can
now be created with four easy lines of
35:57 - 36:01

python. And it feels like pwntools if you're
familiar with it, so you can go ahead and
36:01 - 36:09

play with it and not look crazy on stage.
So, we'll go to our first demo. We'll own
36:09 - 36:16

a password stealer backend running the
latest PHP 7. So obviously that's a model
36:16 - 36:20

that we created with the leaked sources
and you can see all the infected victims,
36:20 - 36:30

right. Cool. Now we'll try to go to our
webshell, to p.php. Obviously it still
36:30 - 36:36

does not exist, we get a 404. Moving to
the attacker's computer. So, we see that
36:36 - 36:42

we have two scripts here. First, we're
going to use QOP.py, that will generate a
36:42 - 36:48

malicious database. Let's see, the
database was created. Now we're going to
36:48 - 36:53

emulate an infection. We're going to send
our malicious database to the C2 server as
36:53 - 36:57

if we were infected by a password stealer.
And because this process takes a bit of
36:57 - 37:01

time, we can look at all the cool DDL
statements, right. So you see that we
37:01 - 37:07

started with some bin leak and heap leak
and then we unpacked them. And at the very
37:07 - 37:11

bottom, you can see that our end gadget is
echoing the simplest webshell to p.php,
37:11 - 37:19

right. And this is the same page that we
just tried to reach. So hopefully, yeah -
37:19 - 37:27

great, it's done. Now we go back to the
password stealer backend. We go to p.php
37:27 - 37:35

and we got 200. Now, let's execute some
code on it. whoami. www-data. And
37:35 - 37:44

obviously we need to go for /etc/password.
Yay.
37:44 - 37:55

applause
So, what just happened is that we've shown
37:55 - 38:02

that, given just the query to our
malicious a database, we can execute code
38:02 - 38:08

on the querying process. Now, given the
fact that SQLite is so popular, this
38:08 - 38:13

really opens up the door to a wide range
of attacks. Let's explore another use case
38:13 - 38:21

that is completely different. And our next
target is going to be iOS persistency. So,
38:21 - 38:28

iOS uses SQLite extensively, and
persistency is really hard to achieve,
38:28 - 38:35

because all executable files must be
signed. Yet, SQLite databases are not
38:35 - 38:40

signed, right, they're data-only. There is
no need to sign them. And iOS and MacOS
38:40 - 38:44

are both compiled with
ENABLE_FTS3_TOKENIZER. That's the
38:44 - 38:51

dangerous compile-time flag. So, my plan
is to regain code execution after reboot
38:51 - 38:57

by replacing an arbitrary SQLite DB. And
to do this, I'm going to target that
38:57 - 39:03

contacts database, and its name is
"AddressBook.sqlite". So these are two
39:03 - 39:09

tables in the original database. They have
no significant meaning, right. Just for
39:09 - 39:15

example. And I'm going to create malicious
contacts DB, and I'm going to start with
39:15 - 39:20

two malicious DDL statements that you're
already familiar by now. First of all,
39:20 - 39:26

we'll override the default organizer
"simple" to a bunch of "A"'s. Then, our
39:26 - 39:31

second DDL statement would actually
instantiate this malicious trigger, so it
39:31 - 39:35

will actually crash the program, right,
because every time you create a virtual
39:35 - 39:40

table module, someone is trying to go to
the constructor of the tokenizer. So then
39:40 - 39:45

it will crash. Now, what I'm doing next is
I'm going to go over each and every of the
39:45 - 39:51

original table and rewrite them using our
query hijacking technique, right. So,
39:51 - 39:56

instead of the columns that it expected,
we are going to redirect the execution to
39:56 - 40:00

the override statement and the crash
statement. So, we did it for one table, we
40:00 - 40:08

also do it for the others. Now we reboot
and voila, we get the following CVE and
40:08 - 40:12

secure boots was bypassed. And if you pay
close attention, this is really cool,
40:12 - 40:19

because we see that the crash happened at
0x414141...49. And this is exactly what we
40:19 - 40:24

expected to happen, right, because this is
where our constructor should be, at an
40:24 - 40:31

offset of eight after the first integer of
the version. But actually, there is more.
40:31 - 40:36

We get a bonus here. Because the contacts
DB is actually used by many, many
40:36 - 40:41

different processes. So Contacts and
Facetime and Springboard and WhatsApp and
40:41 - 40:47

Telegram and XPCProxy and so many others.
And a lot of these processes are way more
40:47 - 40:51

privileged than others. And we've
established that we can now execute code
40:51 - 40:57

on the process that queries our malicious
database. So we also got a privilege
40:57 - 41:01

escalation in this process, right. And
this is really cool. And there's nothing
41:01 - 41:06

special about the contacts database.
Actually, any shared database can be used.
41:06 - 41:12

All we need is something to be writeable
by a weak user and then being queried by a
41:12 - 41:17

stronger user, right. And obviously all
these techniques and bugs were reported to
41:17 - 41:21

Apple and they're all fixed. And these are
the CVEs if you want to go and read about
41:21 - 41:27

it later. Now, to wrap things up, if
you'll take anything away from this talk,
41:27 - 41:34

I don't want it to be the crazy SQL
gymnastics or a bunch of CVE numbers. I
41:34 - 41:39

want it to be the following. Querying it
database might not be safe, whether if
41:39 - 41:45

it's across reboots or between users or
processes, querying a database might not
41:45 - 41:51

be safe with query hijacking. And now,
with query hijacking and query oriented
41:51 - 41:55

programing, these memory corruptions that
we can trigger can actually be reliably
41:55 - 42:00

exploited with nothing but SQL. We don't
need JavaScript anymore. We don't need
42:00 - 42:06

WebSQL. And we truly think that this is
just the tip of the iceberg. So far,
42:06 - 42:12

SQLite, super popular, yet it was only
assessed from the very narrow lands of
42:12 - 42:18

WebSQL, and as much as browser pwning is
exciting, SQLite has so much more
42:18 - 42:25

potential. So we do have some thoughts
about possible future work with this.
42:25 - 42:30

Obviously, something really cool to do
would be expanding our primitives to some
42:30 - 42:36

stronger primitives, right. We want to
gain things like absolute read and write.
42:36 - 42:44

And my sketchy POC exploit was pretty
silly because it had many constants in it,
42:44 - 42:49

like you've seen, but actually if you used
the internal function of SQLite, like, if
42:49 - 42:54

during the exploitation you use functions
like sqlite3_version(), you ask the
42:54 - 42:58

interpreter, what version are you, what
compile options where you compiled with,
42:58 - 43:04

you can dynamically build these QOP chains
as you go and actually target the specific
43:04 - 43:08

target environment that you are
exploiting. Like, actually utilizing the
43:08 - 43:12

fact that our exploit is a database, we
can create this really cool exploit
43:12 - 43:17

polyglot, and we think that these
techniques can be used to privilege
43:17 - 43:23

escalate so many situations, because the
developers never had it in mind that now,
43:23 - 43:28

we can take any database that is writeable
by a weak user and queried by a strong
43:28 - 43:36

user, and suddenly we can try to privilege
our escalation. Another cool thing to do
43:36 - 43:40

would be to note that many of the
primitives that I've shown you are not
43:40 - 43:45

exclusive to SQLite, right?
The pointer packing and unpacking and
43:45 - 43:50

heap spraying, and all these things are
not exclusive to SQLite. So it would be
43:50 - 43:54

really interesting to take these
primitives and see if we can go ahead and
43:54 - 44:00

exploit other memory corruption bugs in
different database engines.
44:00 - 44:02

Thank you very much.
44:02 - 44:09

Applause
44:09 - 44:14

Herald Angel: Omer Gull, thank you very
much. That gives us a lot of time for
44:14 - 44:18

questions. So we do have three microphones
here in the hall: Number one, Number two and
44:18 - 44:24

Number three, if you have questions,
please line up. Do we have some questions
44:24 - 44:31

from the net already? No. All right. Then
we're gonna start with microphone number
44:31 - 44:34

two.
Microphone two: Yeah. So the question is
44:34 - 44:43

regarding the hijacking of the CREATE
something. So you mentioned that at the
44:43 - 44:49

beginning of the research was assuming
that CREATE was the first order of
44:49 - 44:55

checking and then a space following the
CREATE word and then it could create all
44:55 - 45:00

of other things. Now, my question is, if
that was changed following your report,
45:00 - 45:07

because this seems like a way to expose
larger attack surface then. Well, most of
45:07 - 45:11

the other bugs. So I just wonder if they
changed. And I mean, what basically what
45:11 - 45:14

was the mitigation if and if that was all
of it?
45:14 - 45:18

Omer Gull: Yeah. So the escalate people,
we're really more concerned with specific
45:18 - 45:23

bugs and not exploitation techniques. And
this is really sad because we all know
45:23 - 45:27

that you can kill bugs, but exploitation
techniques is what sticks. So no, they
45:27 - 45:34

didn't change this verification. And
actually this validation of create space
45:34 - 45:39

was actually added not so long ago. Before
that, you can have any DDL statement
45:39 - 45:43

that you want. So the situation is not
really good over there.
45:43 - 45:46

Microphone two: Good for them and good
luck in the future. And that was the
45:46 - 45:49

question.
Herald Angel: All right. We head over to
45:49 - 45:52

microphone one, please.
Microphone one: Did you, maybe by
45:52 - 45:56

accident, attack a server which was used
for password stealing stealing?
45:56 - 46:00

Omer Gull: No, obviously, I would never do
that. I would never attack anyone. This is
46:00 - 46:04

just in our lab on POC. Right.
Microphone one: Thank you.
46:04 - 46:08

Omer Gull: Your passwords are safe with
the Stealers.
46:08 - 46:10

laughter
46:10 - 46:12

Herald Angel: Right. Nobody is queueing
anymore. Do we have questions from the
46:12 - 46:16

net? Nothing over there. Well, here we go.
Omer Gull: Thank you.
46:16 - 46:18

Herald Angel: Omer Gull, thank you very much.
46:18 - 46:21

applause
46:21 - 46:25

36c3 outro music
46:25 - 46:47

subtitles created by c3subtitles.de
in the year 2019. Join, and help us!

Title:: 36C3 - SELECT code_execution FROM * USING SQLite;
Description:: more » « less
Video Language:: English
Duration:: 46:47

	C3Subtitles edited English subtitles for 36C3 - SELECT code_execution FROM * USING SQLite;
	C3Subtitles edited English subtitles for 36C3 - SELECT code_execution FROM * USING SQLite;
	dajo-congress edited English subtitles for 36C3 - SELECT code_execution FROM * USING SQLite;
	Teilkasko edited English subtitles for 36C3 - SELECT code_execution FROM * USING SQLite;
	Teilkasko edited English subtitles for 36C3 - SELECT code_execution FROM * USING SQLite;
	dajo-congress edited English subtitles for 36C3 - SELECT code_execution FROM * USING SQLite;
	C3Subtitles edited English subtitles for 36C3 - SELECT code_execution FROM * USING SQLite;
	C3Subtitles edited English subtitles for 36C3 - SELECT code_execution FROM * USING SQLite;

English subtitles

Revisions

Revision 8 Edited

C3Subtitles

36C3 - SELECT code_execution FROM * USING SQLite;

Revisions

Our website uses cookies

Operating cookies (Required)