-
Hi, I'm Mira Murati.
-
I'm the chief technology officer
at OpenAI, the company that created
-
ChatGPT.
-
I really wanted to work on AI
-
because it has the potential
to really improve
-
almost every aspect of life
and help us tackle really hard challenges.
-
Hi, I'm Cristobal Valenzuela,
CEO and co-founder of Runway
-
Runway is a research company
that builds AI algorithms
-
for storytelling and video creation.
-
Chatbots like ChatGPT
are based on a new type of AI
-
technology
that's called large language models.
-
So instead of a typical neural network
which trains on a specific task
-
like how to recognize faces
or images, a large language
-
model is trained on the largest amount
of information possible,
-
such as everything
available on the Internet.
-
It uses this training to then be able
-
to generate completely new information,
-
like to write essays or poems,
have conversations, or even write code.
-
The possibilities seem endless,
-
but how does this work
and what are its shortcomings?
-
Let's dive in.
-
While a chatbot built on a large
language model may seem magical,
-
it works
based on some really simple ideas.
-
In fact, most of the magic of AI
is based on very simple math concepts
-
from statistics applied billions of times
using fast computers.
-
The AI uses probabilities to predict
the text that you want it to produce
-
based on all the previous text
that it has been trained on.
-
Suppose that we want to train
a large language model
-
to read every play written
by William Shakespeare
-
so that it could write new plays
in the same style.
-
We'd start with all the texts
from Shakespeare's plays
-
stored letter by letter in a sequence.
-
Next, we'd analyze each letter
-
to see what letter
is most likely to come next. After an I,
-
the next most likely letters
to show up in Shakespeare's plays are
-
S or N; after an S,
-
they're T, C, or H, and so on.
-
This creates a table of probabilities.
-
With just this,
we can try to generate new writing.
-
We pick a random letter to start.
-
Then, starting with that first letter,
-
we can see
what's most likely to come next.
-
We don't always have to pick
the most popular choice
-
because that would lead
to repetitive cycles.
-
Instead, we pick randomly.
-
Once we have the next letter,
we repeat the process
-
to find the next letter
and then the next one and so on.
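-
The table-and-sampling loop described here can be sketched in a few lines of Python. (The tiny corpus below is just a stand-in for the full plays.)

```python
import random
from collections import defaultdict

# Tiny stand-in corpus; the real training text would be
# every Shakespeare play, stored letter by letter.
corpus = "to be or not to be that is the question"

# Build the table of probabilities: for each letter,
# count which letters follow it in the training text.
follow_counts = defaultdict(lambda: defaultdict(int))
for current, nxt in zip(corpus, corpus[1:]):
    follow_counts[current][nxt] += 1

def next_letter(letter):
    """Pick the next letter at random, weighted by how often
    each candidate followed `letter` in the training text."""
    options = follow_counts[letter]
    return random.choices(list(options), weights=list(options.values()))[0]

# Generate: start from a random letter, then repeatedly
# sample the next one from the table.
text = random.choice(corpus)
for _ in range(40):
    text += next_letter(text[-1])
print(text)
```

Note that the sampling is weighted but still random, which is exactly why the output wanders instead of repeating the most common pair over and over.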
-
Okay, well,
that doesn't look at all like Shakespeare.
-
It's not even English,
but it's a first step.
-
This simple system might not seem
even remotely intelligent,
-
but as we build up from here,
you'll be surprised where it goes.
-
The problem in the last example
is that at any point the AI only considers
-
a single letter to pick what comes next.
-
That's not enough context,
and so the output is not helpful.
-
What if we could
-
train it to consider
a sequence of letters, like sentences
-
or paragraphs, to give it more context
to pick the next one?
-
To do this, we don't use a simple table
of probabilities.
-
We use a neural network.
-
A neural network is a computer system
that is loosely inspired
-
by the neurons in the brain.
-
It is trained on a body of information,
and with enough training,
-
it can learn to take in new information
and give simple answers.
-
The answers always include probabilities
-
because there can be many options.
-
Now let's take a neural network
and train it on
-
all the letter sequences
in Shakespeare's plays to learn
-
what letter is likely
to come next at any point.
-
Once we do this,
the neural network can take
-
any new sequence and predict
what could be a good next letter.
-
Sometimes the answer is
obvious, but usually it's not.
-
It turns out
-
this new approach works
better, much better.
-
By looking at a long enough
sequence of letters, the AI
-
can learn complicated patterns, and
it uses those to produce all-new text.
-
It starts
the same way, with a starting letter,
-
and then uses probabilities
to pick the next letter and so on.
-
But this time, the probabilities are based
-
on the entire context
of what came beforehand.
-
As you see, this works surprisingly well.
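-
A real neural network has many more layers than fits in a short sketch, but the core idea, predicting the next letter from a window of previous letters, can be shown with the simplest possible "network": a single layer of weights trained by gradient descent. (The corpus, the three-letter context size, and the training settings are all assumptions for illustration.)

```python
import numpy as np

rng = np.random.default_rng(0)

corpus = "to be or not to be that is the question "
chars = sorted(set(corpus))
idx = {c: i for i, c in enumerate(chars)}
V = len(chars)          # size of the "alphabet"
CONTEXT = 3             # how many previous letters the model sees

# Training pairs: (three-letter context, the letter that follows it).
X = [[idx[c] for c in corpus[i:i + CONTEXT]]
     for i in range(len(corpus) - CONTEXT)]
y = np.array([idx[corpus[i + CONTEXT]]
              for i in range(len(corpus) - CONTEXT)])

def encode(contexts):
    """One-hot encode each context and flatten it into one input vector."""
    out = np.zeros((len(contexts), CONTEXT * V))
    for row, ctx in enumerate(contexts):
        for pos, c in enumerate(ctx):
            out[row, pos * V + c] = 1.0
    return out

W = rng.normal(0.0, 0.1, (CONTEXT * V, V))  # the network's weights

def probs(x):
    """Softmax over the output layer: a probability for every letter."""
    logits = x @ W
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Train with plain gradient descent on cross-entropy loss.
Xe = encode(X)
for _ in range(500):
    grad = probs(Xe)
    grad[np.arange(len(y)), y] -= 1.0   # d(loss)/d(logits) for softmax
    W -= 0.5 * Xe.T @ grad / len(y)

# Ask the trained model what letter likely follows "tio".
p = probs(encode([[idx[c] for c in "tio"]]))
print("most likely after 'tio':", chars[int(p.argmax())])
```

The output of `probs` is always a full distribution over the alphabet, which matches the point above: the answers always include probabilities, because there can be many options.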
-
Now, a system like ChatGPT uses a similar approach,
-
but with three very important additions.
-
First,
instead of just training on Shakespeare,
-
it looks at all the information
it can find on the Internet,
-
including all the articles on Wikipedia
or all the code on GitHub.
-
Second,
instead of learning and predicting letters
-
from just the 26 choices in the alphabet,
it looks at tokens
-
which are either full words,
word parts, or even code.
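-
Real tokenizers typically use byte-pair encoding, which builds those word parts by repeatedly merging the most frequent pair of adjacent symbols. Here is a minimal sketch of that idea (the four sample words are an invented illustration, not the real training data):

```python
from collections import Counter

def byte_pair_merges(words, num_merges):
    """Learn merge rules by repeatedly fusing the most frequent
    adjacent pair of symbols: the core of byte-pair encoding."""
    vocab = [list(w) for w in words]   # start from single letters
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for w in vocab:
            for a, b in zip(w, w[1:]):
                pairs[(a, b)] += 1
        if not pairs:
            break
        best = max(pairs, key=pairs.get)   # most frequent adjacent pair
        merges.append(best)
        # Apply the merge everywhere in the vocabulary.
        new_vocab = []
        for w in vocab:
            out, i = [], 0
            while i < len(w):
                if i + 1 < len(w) and (w[i], w[i + 1]) == best:
                    out.append(w[i] + w[i + 1])
                    i += 2
                else:
                    out.append(w[i])
                    i += 1
            new_vocab.append(out)
        vocab = new_vocab
    return merges, vocab

merges, tokenized = byte_pair_merges(["lower", "lowest", "newer", "newest"], 4)
print(merges)
print(tokenized)
```

After a few merges, frequent letter runs like "we" fuse into single tokens, so the model predicts over word parts instead of one letter at a time.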
-
And the third
-
difference
is that a system of this complexity
-
needs a lot of human tuning to make sure
it produces reasonable results
-
in a wide variety of situations,
while also protecting against problems
-
like producing highly biased
or even dangerous content.
-
Even after we do this tuning,
it's important to note that this system
-
is still just using random probabilities
to choose words.
-
A large language model can produce
-
unbelievable results that seem like magic,
-
but because it's not actually magic,
it can often get things wrong.
-
And when it gets things wrong, people ask, does
-
a large language
model have actual intelligence?
-
Discussions about AI often spark
-
philosophical debates
about the meaning of intelligence.
-
Some argue that a neural network
producing words
-
using probabilities
doesn't really have intelligence.
-
But what isn't under debate
is that large language
-
models produce amazing results
-
with applications in many fields.
-
This technology is already being used
to create apps and websites,
-
help produce movies and video games,
and even discover new drugs.
-
The rapid acceleration of
AI will have enormous impacts on society,
-
and it's important for everybody
to understand this technology.
-
What I'm looking forward to
is the amazing things
-
people will create with AI,
and I hope you dive in to learn
-
more about how AI works
and explore what you can build with it.