
Marijn Haverbeke - The Rust That Could Have Been

  • 0:04 - 0:09
    [ applause ]
  • 0:10 - 0:12
    Yeah, I'm glad so many people
    still showed up
  • 0:12 - 0:15
    at the very end of the conference.
  • 0:15 - 0:20
    So yeah, I'm going to be talking about
    mostly a number of features
  • 0:20 - 0:23
    that were part of the Rust language
    at some point,
  • 0:23 - 0:24
    but no longer are.
  • 0:25 - 0:29
    And, why this is the case
    and usually why this is a good thing.
  • 0:30 - 0:33
    So, as Ryan said, I'm Marijn Haverbeke.
  • 0:33 - 0:37
    If you are into Javascript,
    you might have seen my name before.
  • 0:42 - 0:44
    Rust has been under development
    for a while:
  • 0:44 - 0:46
    about ten years now, I think.
  • 0:46 - 0:51
    First - long stretch - was just
    Graydon working in isolation and...
  • 0:51 - 0:55
    Who knows what kind of ideas and
    experiments he abandoned at that point.
  • 0:55 - 0:59
    There's not even a like code repository
    from that time public,
  • 0:59 - 1:02
    so it's like the prehistory.
  • 1:02 - 1:04
    And then, at some point,
  • 1:04 - 1:08
    Mozilla adopted the project
    and assembled a team.
  • 1:09 - 1:12
    From a little before that point,
    we do have Git history,
  • 1:12 - 1:15
    and at that point,
    everything was discussed in issues
  • 1:15 - 1:17
    and mailing lists, and there's...
  • 1:18 - 1:19
    lots of records.
  • 1:19 - 1:23
    I was part of this pretty much -
    the initial team - where we...
  • 1:24 - 1:29
    moved from the original OCaml-based
    compiler to a Rust-based compiler,
  • 1:29 - 1:32
    and we got a bunch more people
    involved in the language.
  • 1:32 - 1:34
    This was a period of...
  • 1:34 - 1:37
    like, the first real experience
    with the language
  • 1:37 - 1:39
    and a bunch of people
    from different backgrounds...
  • 1:41 - 1:42
    giving their opinions on the language
  • 1:42 - 1:45
    and trying to push it
    into their favorite direction.
  • 1:45 - 1:48
    And, we had a lot of like
    experiments and dead ends
  • 1:48 - 1:51
    and overhauls and false starts and...
  • 1:53 - 1:55
    just lots and lots of churn.
  • 1:55 - 1:59
    We would sometimes come up with
    a breaking change in the morning,
  • 1:59 - 2:01
    and then have a patch ready
    in the afternoon
  • 2:01 - 2:03
    and then convince someone
    to merge it later on and then...
  • 2:04 - 2:08
    Because there was only one codebase,
    we'd just fix everything right away
  • 2:08 - 2:09
    and people could continue working.
  • 2:09 - 2:11
    There was some...
  • 2:12 - 2:15
    some trickiness with actually
    getting like a compiler...
  • 2:15 - 2:18
    that compiles the current code
    after you'd make a breaking change,
  • 2:18 - 2:21
    so you'd first change the compiler,
    then upload a snapshot,
  • 2:21 - 2:23
    then change the code,
    and then you could -
  • 2:23 - 2:26
    everyone could - proceed
    with the new snapshot.
  • 2:26 - 2:29
    And then, of course,
    a year and a half ago - I think -
  • 2:31 - 2:33
    the team cut version 1.0,
  • 2:33 - 2:37
    and then the process changed entirely,
    so now it's like...
  • 2:38 - 2:39
    everything stays backwards compatible.
  • 2:39 - 2:44
    It's impressive how seriously they have
    been taking backward compatibility.
  • 2:46 - 2:50
    Experiments, like RFCs,
    move very slowly
  • 2:50 - 2:52
    and there has to be a wide consensus
  • 2:52 - 2:55
    and it has to fit
    within the current codebase.
  • 2:55 - 2:58
    So, that's a whole different stage again.
  • 2:59 - 3:03
    I'm going to be mostly talking about
    the period where I was part of the team,
  • 3:03 - 3:06
    which was 2011-2012,
  • 3:06 - 3:08
    and which was probably
    the wildest period
  • 3:08 - 3:13
    in terms of features cut,
    features changed - stuff like that.
  • 3:15 - 3:17
    So, it may seem a bit ridiculous
  • 3:17 - 3:20
    that we put so much time into
    really complicated features
  • 3:20 - 3:22
    just to end up dropping them again.
  • 3:23 - 3:28
    But, I think it's kind of an essential
    part of getting a complex design
  • 3:28 - 3:30
    like a programming language right that,
  • 3:30 - 3:33
    unless you're a super-genius,
    you won't really see in advance
  • 3:33 - 3:36
    what the implications
    and the interactions between
  • 3:36 - 3:40
    the various parts of the system are
    and you have to try it and see
  • 3:41 - 3:44
    how well you can make it work
    and how well it fits into the system,
  • 3:44 - 3:46
    and sometimes you later have to just...
  • 3:47 - 3:48
    abandon it again.
  • 3:49 - 3:53
    I think that's part of
    a healthy design process for like...
  • 3:54 - 3:57
    mere mortals who need to
    actually see how something works
  • 3:57 - 3:59
    before they can evaluate it.
  • 4:00 - 4:04
    I'm structuring this talk about...
    around a number of visions
  • 4:04 - 4:07
    that were part of the language
    and then dropped again.
  • 4:07 - 4:12
    And, I'll try to explain why I think that,
    in every case,
  • 4:12 - 4:15
    it was a real good decision to drop them.
  • 4:15 - 4:18
    But, it's still interesting to see
  • 4:19 - 4:22
    what the original visions were
    and what we did end up with.
  • 4:23 - 4:28
    So, these are typestate,
    a structural type system and
  • 4:29 - 4:33
    lightweight processes,
    and finally, garbage collection.
  • 4:34 - 4:36
    Let's start with typestate.
  • 4:38 - 4:41
    Typestate is... it was actually
  • 4:41 - 4:45
    an important point in initial
    announcements of the language,
  • 4:45 - 4:47
    and people were very excited about it.
  • 4:48 - 4:51
    What typestate does is basically
    allows you to...
  • 4:51 - 4:56
    allows the compiler to know more
    about a value than just its type.
  • 4:56 - 4:57
    So, an example would be...
  • 4:58 - 4:59
    This is something of type socket,
  • 4:59 - 5:04
    but we also happen to know that it's open
    or this is something of type array,
  • 5:04 - 5:07
    but we happen to know... or say vector
    in the current terminology,
  • 5:07 - 5:10
    but we happen to know that
    it's not empty.
  • 5:10 - 5:13
    Something like that allowing you to add
  • 5:14 - 5:17
    more safety to your program -
    more static guarantees.
  • 5:19 - 5:23
    So, when you're programming,
    you usually have some mental model
  • 5:23 - 5:27
    of why the thing you are doing
    right now is...
  • 5:27 - 5:30
    is valid - is not going to crash -
  • 5:30 - 5:32
    like if you're not just making
    random changes
  • 5:32 - 5:35
    and seeing if the tests pass,
    you will have
  • 5:36 - 5:39
    some mental model of your... program.
  • 5:40 - 5:42
    And, to a certain degree,
    depending on the language,
  • 5:42 - 5:45
    you can tell the compiler
    about this model
  • 5:45 - 5:46
    and the compiler can then check
  • 5:46 - 5:49
    whether you are applying
    your model consistently.
  • 5:49 - 5:53
    So, the simple case is just types -
    that you're actually passing the type
  • 5:53 - 5:55
    that you think you are passing somewhere.
  • 5:55 - 5:59
    And, if you don't, then,
    instead of finding out at runtime,
  • 5:59 - 6:02
    you find out at compile time -
    and this is nice.
  • 6:02 - 6:03
    There's a kind of...
  • 6:04 - 6:08
    this computer does not have the fonts
    that my computer had but...
  • 6:08 - 6:10
    Imagine arrowheads on both sides.
  • 6:10 - 6:14
    There's a kind of spectrum
    on which languages fall
  • 6:14 - 6:16
    in terms of how much
  • 6:17 - 6:19
    you can actually communicate
    to the compiler,
  • 6:19 - 6:21
    so down on one side,
    there's Javascript,
  • 6:21 - 6:24
    and it's like... syntactically correct.
  • 6:24 - 6:26
    Okay, let's go ahead - let's run it.
  • 6:26 - 6:29
    And then, there's like,
    on the... way on the other side,
  • 6:29 - 6:32
    there's languages like Coq,
    which require you to actually construct
  • 6:32 - 6:37
    a formal proof that your program does
    what it's supposed to do -
  • 6:37 - 6:40
    that it does so in bounded time,
    in bounded space,
  • 6:42 - 6:46
    which means you're making
    a lot less mistakes.
  • 6:46 - 6:48
    But, on the other hand,
    it's like a major...
  • 6:49 - 6:53
    a major project to write a small program
    in such a language.
  • 6:54 - 6:58
    And, there is a reason that
    not everyone is writing their web servers
  • 6:58 - 6:59
    in Coq or whatever.
  • 6:59 - 7:01
    And, Rust kind of falls in the middle.
  • 7:01 - 7:04
    It does have quite a bit
    of static guarantees
  • 7:04 - 7:06
    and it helps quite a lot.
  • 7:06 - 7:11
    But, it still aims to be ergonomic,
    like easy to program in,
  • 7:12 - 7:14
    where you don't have to
  • 7:15 - 7:18
    spend too much time
    working on these things.
  • 7:20 - 7:25
    And I... one way to see the history
    of programming languages is kind of...
  • 7:26 - 7:29
    one aspect of it at least is
  • 7:29 - 7:33
    that we've been finding better
    and better vocabulary to...
  • 7:33 - 7:38
    to describe these things we know about
    our program to the compiler
  • 7:39 - 7:41
    in a way that's actually convenient.
  • 7:41 - 7:43
    So, if you have a really terrible
    type system,
  • 7:43 - 7:46
    that's often worse than
    no type system at all.
  • 7:46 - 7:49
    If I have to choose to write something
    in Java or Javascript,
  • 7:49 - 7:52
    I'll just take Javascript,
    thank you very much.
  • 7:52 - 7:54
    But, we're getting better at this,
  • 7:54 - 7:56
    and Rust is making
    a big contribution here,
  • 7:56 - 7:59
    and like bringing a real,
    modern type system
  • 7:59 - 8:00
    to the systems programming space.
  • 8:00 - 8:03
    And the ownership model is - I think -
  • 8:03 - 8:05
    just really, really good.
  • 8:05 - 8:08
    I unfortunately wasn't on the team
    anymore when this was introduced,
  • 8:08 - 8:10
    so I can't take any credit for it, but..
  • 8:10 - 8:12
    I think that this is
    the most exciting part of Rust
  • 8:12 - 8:14
    and it's exactly this kind of thing
    where you...
  • 8:14 - 8:18
    where the compiler knows
    what you're trying to do and
  • 8:18 - 8:21
    tells you when you're
    violating your model.
  • 8:23 - 8:25
    So, back to typestate.
  • 8:26 - 8:28
    It looked somewhat like this.
  • 8:30 - 8:32
    You could define predicates,
  • 8:32 - 8:35
    which is this "pure function not empty"
    at the top.
  • 8:39 - 8:42
    The extra information that
    the compiler had about your values
  • 8:42 - 8:45
    came in the form of these predicates holding.
  • 8:45 - 8:49
    But, these were actually just predicates
    written in normal Rust code
  • 8:49 - 8:50
    that were supposed to be pure.
  • 8:50 - 8:54
    There was a concept of an effect system
    at that point - which is also gone now.
  • 8:54 - 8:59
    But, they just took a value and said:
    "Okay, I hold or I don't hold."
  • 9:00 - 9:05
    And, then you could define
    for your functions,
  • 9:05 - 9:07
    preconditions and postconditions.
  • 9:07 - 9:09
    So, you could say, for example,
  • 9:09 - 9:13
    this function "last" here
    demands that its first argument
  • 9:13 - 9:17
    has the "not empty" predicate
    holding on it,
  • 9:17 - 9:21
    because you can't take the last element
    from an empty array...
  • 9:24 - 9:27
    Then, before you could pass
    such a value to such a function
  • 9:27 - 9:32
    you'd have to convince the compiler
    that this predicate held at this point.
  • 9:32 - 9:34
    For some things,
    this worked relatively well:
  • 9:34 - 9:37
    the compiler was very clever
    in propagating its information
  • 9:37 - 9:39
    through the control flow graph
    and like taking it
  • 9:39 - 9:42
    from the post conditions
    of the functions you called.
  • 9:44 - 9:45
    But here, you have, for example,
  • 9:45 - 9:49
    I create an array
    and then I want to pass it to last.
  • 9:49 - 9:52
    But, it's not okay: I first have to...
  • 9:53 - 9:55
    check that it's not empty.
  • 9:55 - 9:59
    And, this is actually... this check
    would insert a runtime test,
  • 9:59 - 10:03
    a call to the predicate,
    and then panic if it failed.
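
The predicate, the precondition on "last" and the runtime check can be approximated in current Rust with a wrapper type. This is only a sketch of the idea, not the original typestate syntax (the slide is not part of this transcript); the names `NonEmpty`, `not_empty` and `check_not_empty` are made up for illustration.

```rust
/// A vector known to be non-empty. In a real program this would live in
/// its own module so the field stays private and `check_not_empty` is
/// the only way to construct one.
struct NonEmpty(Vec<i32>);

/// The "predicate": ordinary code that inspects the value.
fn not_empty(v: &[i32]) -> bool {
    !v.is_empty()
}

/// Rough analogue of a check statement: run the predicate at runtime,
/// panic if it fails, and on success record the fact in the type.
fn check_not_empty(v: Vec<i32>) -> NonEmpty {
    assert!(not_empty(&v), "predicate not_empty failed");
    NonEmpty(v)
}

/// `last` states its precondition through its parameter type,
/// so it never has to handle the empty case.
fn last(v: &NonEmpty) -> i32 {
    v.0[v.0.len() - 1]
}

fn main() {
    let v = vec![1, 2, 3];
    // `last(&v)` would not compile: a plain Vec carries no proof.
    let checked = check_not_empty(v);
    println!("{}", last(&checked));
}
```

As in the talk's example, the vector literal is obviously non-empty, yet the proof still has to be obtained through a runtime test, because the compiler treats the predicate as opaque code.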
  • 10:06 - 10:08
    But actually, I mean,
    this array isn't empty:
  • 10:08 - 10:10
    this is very easy to prove.
  • 10:10 - 10:13
    But, because the compiler
    only saw these predicates as
  • 10:13 - 10:15
    like opaque pieces of code,
  • 10:15 - 10:17
    it couldn't actually reason about them -
  • 10:17 - 10:19
    it could only take what you told it.
  • 10:19 - 10:22
    Like, if you checked...
    there were variants of check,
  • 10:22 - 10:26
    one of them which just was
    like an unsafe form of
  • 10:26 - 10:28
    "just believe me: this holds."
  • 10:28 - 10:30
    So, that might also have been
    appropriate here
  • 10:30 - 10:32
    because I'm really sure that
    this array is not empty.
  • 10:34 - 10:37
    And then, there was one version that
  • 10:38 - 10:42
    ensured that the compiler already
    statically knew at this point
  • 10:42 - 10:45
    that something held... that it was
    kind of an assertion of "okay...
  • 10:46 - 10:50
    "I must know this at this point,
    but don't insert a runtime check.
  • 10:51 - 10:54
    "I want to have a static error
    if it's not provable."
  • 10:56 - 11:00
    But, in my experience,
    the effect of this system
  • 11:00 - 11:04
    was mostly that you would be
    littering your code with check statements
  • 11:04 - 11:06
    and they would also panic at runtime.
  • 11:06 - 11:10
    So, the amount of static guarantees
    wasn't very great
  • 11:10 - 11:14
    because often... usually the compiler
    couldn't really help a lot with
  • 11:14 - 11:17
    like reasoning about when
    they actually held and when they didn't.
  • 11:18 - 11:21
    It was in the compiler
    for a long time still
  • 11:21 - 11:24
    but eventually it was dropped because
    it was just not pulling its weight.
  • 11:24 - 11:28
    So, in terms of experiments
    in good expressive ways
  • 11:28 - 11:31
    to express these kind of things,
    I think this was a failed experiment.
  • 11:31 - 11:34
    It existed in some research
    languages before
  • 11:34 - 11:38
    but it's never really made it into
    a big mainstream typed language -
  • 11:38 - 11:40
    for good reason, I think.
  • 11:42 - 11:43
    So, so much for that.
  • 11:44 - 11:47
    Next topic is "structural typing."
  • 11:47 - 11:52
    So, in typing systems,
    you have two concepts
  • 11:53 - 11:55
    where structural typing is...
  • 11:55 - 11:58
    say you have a function type,
    which has a few arguments types,
  • 11:58 - 12:00
    and a return type,
    and you want to compare it
  • 12:00 - 12:02
    to another function type.
  • 12:02 - 12:06
    So, you're just going to look at
    the fields in the function:
  • 12:06 - 12:08
    does it have the same amount
    of arguments,
  • 12:08 - 12:13
    are its arguments of compatible types,
    is its return type of a compatible type?
  • 12:13 - 12:15
    And, that's structural.
  • 12:15 - 12:17
    On the other hand,
    there is nominal typing,
  • 12:17 - 12:20
    where you just say:
    "Where is this type declared?
  • 12:20 - 12:22
    "What's the name of this type?" -
    and it has to be the same.
  • 12:22 - 12:25
    So, Rust's structs currently work
    this way,
  • 12:25 - 12:28
    as do enums -
    two types are only compatible
  • 12:28 - 12:31
    if they are actual instances of
    the thing that was declared
  • 12:31 - 12:34
    in the same point in the code.
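
A small sketch of the distinction in current Rust: two structs with identical fields are different types, while function types only compare signatures. The type and function names here are illustrative.

```rust
// Two structs with the same shape are still distinct, nominal types.
struct Point { x: f64, y: f64 }
struct Velocity { x: f64, y: f64 }

fn norm(p: Point) -> f64 {
    (p.x * p.x + p.y * p.y).sqrt()
}

// Function types are compared structurally: any function with this
// signature fits, whatever its name or origin.
fn apply(f: fn(f64, f64) -> f64, a: f64, b: f64) -> f64 {
    f(a, b)
}

fn add(a: f64, b: f64) -> f64 { a + b }
fn mul(a: f64, b: f64) -> f64 { a * b }

fn main() {
    let v = Velocity { x: 3.0, y: 4.0 };
    // `norm(v)` is rejected: expected `Point`, found `Velocity`.
    // The conversion has to be spelled out field by field.
    let p = Point { x: v.x, y: v.y };
    println!("{}", norm(p));
    println!("{} {}", apply(add, 2.0, 3.0), apply(mul, 2.0, 3.0));
}
```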
  • 12:36 - 12:39
    Initially, structs were structural types.
  • 12:39 - 12:43
    So, this curly braces thing there
    is syntax for a struct type,
  • 12:43 - 12:48
    with two fields - x and y -
    of type "float."
  • 12:48 - 12:52
    And, the type declaration
    just defines an alias for the type.
  • 12:53 - 12:58
    This is just like a name for the type
    record with two float fields.
  • 13:00 - 13:04
    So, if I define a function,
    which takes an argument of this point type,
  • 13:04 - 13:08
    I can call it with just a record
    constructed on the fly
  • 13:08 - 13:11
    without any record name involved.
  • 13:12 - 13:14
    Records themselves don't have a name:
  • 13:14 - 13:16
    they just have a structure
    in this system, and...
  • 13:17 - 13:20
    it's kind of nice and lightweight
    and minimal,
  • 13:20 - 13:24
    and often you don't even bother
    to give your record a name
  • 13:24 - 13:25
    if you only use it a few times.
  • 13:27 - 13:29
    So, where you would now
    probably use a tuple,
  • 13:29 - 13:34
    you could use a record
    with nice descriptive field names.
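
To make the trade-off concrete, here is a sketch in current Rust, where tuples are the remaining structural aggregate. The function and type names are hypothetical.

```rust
// A tuple is structural: it can be built on the fly, but its fields
// are only positions.
fn min_max_tuple(values: &[f64]) -> (f64, f64) {
    let min = values.iter().copied().fold(f64::INFINITY, f64::min);
    let max = values.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    (min, max)
}

// A nominal struct gives the fields names, at the cost of a declaration.
struct Range { min: f64, max: f64 }

fn min_max_struct(values: &[f64]) -> Range {
    let (min, max) = min_max_tuple(values);
    Range { min, max }
}

fn main() {
    let data = [2.0, -1.0, 5.0];
    let (lo, hi) = min_max_tuple(&data);
    let r = min_max_struct(&data);
    println!("{} {} / {} {}", lo, hi, r.min, r.max);
}
```

A structural record would have offered the descriptive field names of `Range` without the separate declaration.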
  • 13:34 - 13:37
    I kind of liked it for programming with...
  • 13:39 - 13:42
    But, I'll come back later
    to why this part was removed.
  • 13:43 - 13:46
    Another aspect of this was object types,
  • 13:47 - 13:49
    whereas structure types
    were only compatible
  • 13:49 - 13:53
    if they had actually
    the exact same fields and...
  • 13:54 - 13:55
    in the same order.
  • 13:55 - 13:58
    They weren't reordered
    because of C compatibility
  • 13:58 - 14:02
    and they had to be the exact same
    to be able to compile it efficiently
  • 14:02 - 14:05
    because then all code
    that interacted with such a record
  • 14:05 - 14:07
    knew how it was laid out in memory.
  • 14:08 - 14:12
    Objects were a more dynamic feature,
    and here...
  • 14:14 - 14:19
    any object type that has a subset
    of the fields -
  • 14:19 - 14:22
    fields are always methods,
    so they're always functions -
  • 14:23 - 14:26
    this object type has is compatible,
    so I could,
  • 14:27 - 14:30
    if I define the type - a collection of T -
    with these two -
  • 14:30 - 14:33
    it's probably not a very great abstraction
    but just bear with me -
  • 14:33 - 14:37
    with a length and
    an item accessor method,
  • 14:37 - 14:41
    you could take any object that has
    I don't know what kind of...
  • 14:42 - 14:44
    methods with also these two,
  • 14:44 - 14:47
    and you could treat it
    as a collection of T.
  • 14:48 - 14:52
    So, these were both the types
    of the concrete objects
  • 14:52 - 14:56
    and also serve the role of interfaces,
    which is kind of nice in terms of
  • 14:56 - 15:00
    how many concepts you need
    to do object-oriented programming.
  • 15:00 - 15:02
    And, you could also even use it
    as a kind of...
  • 15:04 - 15:08
    checked duck typing,
    where you define your function
  • 15:08 - 15:10
    and you just say:
    "I'm only going to call length
  • 15:10 - 15:12
    "on the thing that I'm getting,"
  • 15:12 - 15:15
    and then, anything that had a length
    method could be passed in -
  • 15:15 - 15:18
    you don't even need to formally define
    an interface name or anything,
  • 15:18 - 15:22
    it's just all like structural by name.
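
For comparison, a sketch of how the "only call length" case looks in current Rust, where the requirement has to be named as a trait and implemented explicitly for each type. `HasLength` and `describe` are hypothetical names, not part of any library.

```rust
// Hypothetical trait standing in for "anything with a length method".
trait HasLength {
    fn length(&self) -> usize;
}

impl HasLength for Vec<i32> {
    fn length(&self) -> usize { self.len() }
}

impl HasLength for String {
    fn length(&self) -> usize { self.chars().count() }
}

// Only `length` is ever called; the trait bound makes that checkable.
fn describe<T: HasLength>(thing: &T) -> String {
    format!("{} item(s)", thing.length())
}

fn main() {
    println!("{}", describe(&vec![1i32, 2, 3]));
    println!("{}", describe(&String::from("abc")));
}
```

With the old object types, the two `impl` blocks would have been unnecessary: having a matching method was enough.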
  • 15:24 - 15:26
    So...
  • 15:28 - 15:29
    One implication of this was
  • 15:29 - 15:32
    that because code that used them
    didn't know their size,
  • 15:32 - 15:34
    they always had to be heap allocated.
  • 15:34 - 15:36
    I think they even always
    had to be garbage collected.
  • 15:38 - 15:43
    Any calls to them will be going
    through a dispatch table - a vtable.
  • 15:43 - 15:46
    So, they're so much more heavyweight...
  • 15:48 - 15:50
    compared to the rest of the language.
  • 15:50 - 15:52
    We were finding that, in the compiler,
  • 15:52 - 15:54
    we were kind of shying away from them
  • 15:54 - 15:58
    unless we absolutely needed
    polymorphism, because
  • 15:59 - 16:02
    they were more heavyweight
    than necessary in many situations.
  • 16:06 - 16:08
    Then, at some point...
  • 16:09 - 16:11
    Well, there was also
  • 16:11 - 16:13
    a lot of machinery involved
    in actually doing this,
  • 16:13 - 16:17
    like upcasting to a type with less methods
  • 16:17 - 16:20
    because then you needed to
    allocate a new vtable,
  • 16:20 - 16:23
    preferably statically,
    which forwarded to the old object
  • 16:23 - 16:25
    and created a wrapper.
  • 16:26 - 16:29
    It was conceptually simple
    but not terribly simple to implement.
  • 16:29 - 16:31
    And then, at some point...
  • 16:32 - 16:35
    we got more Haskell people
    on the team
  • 16:35 - 16:37
    and we all started agitating for
  • 16:37 - 16:40
    a type class kind of implementation
  • 16:40 - 16:43
    interface thing that we ended up with now.
  • 16:44 - 16:47
    And, because no one really
    liked these objects very much,
  • 16:47 - 16:50
    we migrated to that.
  • 16:50 - 16:55
    And, I think they just fit
    with the language much better.
  • 16:55 - 16:58
    They don't require you
    to put something on the heap.
  • 16:58 - 17:04
    They don't require indirect calls unless
    you actually are using polymorphism.
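
A minimal sketch of that difference in current Rust, using a made-up `Shape` trait: generic code is dispatched statically and needs no heap allocation, while the vtable is only involved when `dyn` is requested.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 { self.0 * self.0 }
}

impl Shape for Rect {
    fn area(&self) -> f64 { self.0 * self.1 }
}

// Static dispatch: compiled separately per concrete type,
// and the value can live on the stack.
fn area_static<S: Shape>(s: &S) -> f64 {
    s.area()
}

// Dynamic dispatch: calls go through a vtable, opted into with `dyn`.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    println!("{}", area_static(&Square(2.0)));
    let shapes: Vec<Box<dyn Shape>> =
        vec![Box::new(Square(2.0)), Box::new(Rect(2.0, 3.0))];
    println!("{}", total_area(&shapes));
}
```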
  • 17:05 - 17:07
    So, I think that's a win.
  • 17:08 - 17:15
    But, now that we had implementations,
    which affect a specific type,
  • 17:17 - 17:19
    structural records
    also became problematic,
  • 17:19 - 17:22
    because if you're using a record that
    happens to have the same shape
  • 17:22 - 17:24
    in two completely independent contexts
  • 17:25 - 17:29
    and they both define say
    a to-string implementation of it,
  • 17:29 - 17:30
    then these will clash,
  • 17:30 - 17:33
    even though the actual usages
    have nothing to do with each other,
  • 17:33 - 17:36
    they will both be trying to
    implement the same interface -
  • 17:36 - 17:37
    the same trait on it.
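
The clash can be illustrated from the other side. With nominal structs, two types of the same shape can each carry their own to-string implementation; if the two were one structural type, the two `impl` blocks below would target the same type and conflict. `Point` and `Size` are illustrative names.

```rust
use std::fmt;

// Same shape, unrelated meaning, as if declared in unrelated code.
struct Point { x: f64, y: f64 }
struct Size { x: f64, y: f64 }

impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

impl fmt::Display for Size {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}x{}", self.x, self.y)
    }
}

fn main() {
    println!("{}", Point { x: 1.5, y: 2.5 });
    println!("{}", Size { x: 1.5, y: 2.5 });
}
```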
  • 17:38 - 17:40
    That doesn't work, so...
  • 17:40 - 17:43
    Well, no one cared that much
    about structural records either,
  • 17:43 - 17:47
    so they became nominal
    for this reason at that point.
  • 17:47 - 17:48
    And now, of course,
  • 17:48 - 17:51
    people say functions
    are still structural -
  • 17:51 - 17:53
    they don't make much sense
    in any other way.
  • 17:53 - 17:58
    But, the like heavy emphasis
    on structural typing was abandoned...
  • 17:59 - 18:01
    again for good reasons.
Description:

This talk will describe a number of language concepts and features that were in the pre-1.0 Rust language at some point but were ultimately abandoned, such as typestate, garbage collection, structural types, and a more or less classical object system. I’ll go over the reasons they were abandoned, and try to convince you that the Rust we have now is the best Rust yet.

---
For more go to https://rustfest.eu or follow us on Twitter: https://twitter.com/rustfest

Video Language:
English
Duration:
31:11
