>> Okay, so we have got three little bits of machine learning
here, and there are a lot of tools and techniques inside each of them.
>> Mm-hm.
>> And I think that's great and we are going to be
trying to teach you a lot of those tools and techniques
and sort of ways to connect them together. So, by the
way, as Michael is pointing out, there are ways
that these might help each other. Unsupervised learning might help supervised
learning, but it's actually much deeper than that. It turns out that
even though unsupervised learning is clearly not the same as supervised
learning, at the level we've described it, in some ways they're exactly
the same thing. In supervised learning you have some bias: oh,
it's a quadratic function, induction makes sense, all these kinds
of assumptions you make. And in unsupervised learning, I told
you that we don't know whether this clustering is better than
that clustering, whether dividing by sex is better than dividing by
height, or hair color, or whatever. But ultimately, you
make some decision about how to cluster, and that means
implicitly there's some assumed set of labels: oh, I think things
that look alike should somehow be clustered, things that are near
one another should be clustered together. So, in some ways, it's
still kind of like supervised learning. You
can certainly turn any supervised learning
problem into an unsupervised learning problem.
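The point about clustering carrying an implicit bias can be made concrete. Here is a minimal sketch, with hypothetical data and a hypothetical helper, showing that "things near one another should be clustered together" is itself an assumption baked into the distance function:

```python
# A minimal sketch of the implicit bias hidden in clustering: choosing
# "things near one another belong together" is an assumption, just like
# assuming "the target is quadratic" in supervised learning.
# Data, centers, and the helper name are hypothetical illustrations.

def cluster_by_distance(points, centers):
    """Assign each 1-D point to its nearest center.

    The distance function IS the assumed bias: swap it out, and the
    "right" clustering changes even though the data does not.
    """
    clusters = {c: [] for c in centers}
    for p in points:
        nearest = min(centers, key=lambda c: abs(p - c))
        clusters[nearest].append(p)
    return clusters

heights = [150, 152, 155, 178, 180, 183]  # hypothetical measurements
print(cluster_by_distance(heights, centers=[153, 180]))
# prints {153: [150, 152, 155], 180: [178, 180, 183]}
```

Replacing `abs(p - c)` with some other notion of similarity (hair color, say) would produce a different partition of the same data, which is exactly the sense in which clustering implicitly assumes a set of labels.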
>> Mm, mm.
>> Right. So in fact, all of these
problems are really the same kind of problem.
>> Yeah, well, there are two things that I'd
want to add to that. One is that, in some
sense, in many cases, you can formulate all
of these different problems as some form of optimization.
In supervised learning, you want something that labels data well,
so the thing you're trying to optimize is: find me a function
that does that. We're going to score it. In reinforcement learning, we're
trying to find a behavior that scores well. And in unsupervised learning, we
usually have to make up some kind of criterion, and then
we find a way of clustering the data, organizing the data, so that
it scores well. So that was the first point I wanted to
make. The other one is: if you divide things by sex and you're
a virgin, then there are numerical instability issues.
>> Did you learn about that on the street?
>> I learned it in a math book.
>> Ugh. I'm going to move on from there. So here's the thing.
>> Alright.
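The "learning as optimization" framing in the discussion above can be sketched in a few lines. This is a hypothetical toy example, not anything from the course: supervised learning as a search over candidate functions for the one that scores best on labeled data.

```python
# A minimal sketch of supervised learning as optimization: define a score
# for candidate functions, then pick the candidate that optimizes it.
# The data, the candidate family, and the names are hypothetical.

def squared_error(f, data):
    """Score a candidate function against labeled data; lower is better."""
    return sum((f(x) - y) ** 2 for x, y in data)

data = [(1, 2), (2, 4), (3, 6)]  # hypothetical labeled examples

# The bias here is the assumed hypothesis class: functions of the form y = m*x.
candidates = [lambda x, m=m: m * x for m in (1, 2, 3)]

best = min(candidates, key=lambda f: squared_error(f, data))
print(best(10))  # prints 20, since y = 2x fits the data perfectly
```

Reinforcement learning and unsupervised learning fit the same template: only the thing being searched over (behaviors, clusterings) and the scoring criterion change.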
>> Everything Michael just said, except the last part, is
true. But there's actually a sort of deeper thing going
on, to me. If you think about the commonalities of
everything we just said, it boils down to one thing, data.
Data, data, data, data, data. Data is king in
machine learning. Now, Michael would call himself a computer scientist.
>> Oh, yeah.
>> And I would call myself a computationalist.
>> What?
>> I'm in a College of Computing, not a Department
of Computer Science. I believe in computing and computation as being
the ultimate thing. So I'd call myself a computationalist and Michael
would probably agree with that just to keep this discussion moving.
>> Let's say.
>> Right, so we're computationalists, we
believe in computing. That's a good thing.
>> Sure.
>> Many of our colleagues, who do computation,
think in terms of algorithms. They think in terms
of: what is the series of steps I need to take in order to solve some problem?
>> I think about algorithms.
>> Or they might think in terms of theorems. If I try to
describe this problem in a particular way,
is it solvable implicitly by some algorithm?
>> Yeah.
>> And truthfully, machine learning is a lot
of that. But the difference between the person who's
trying to solve a problem as an AI
person or as a computing person and someone who's
trying to solve a problem as a machine learning person is
that the algorithm stops being central and the data starts being central.
And so what I hope you get out of this class,
or at least part of what you get, is an understanding
that you have to believe the data, you have to do
something with the data, you have to be consistent with the data.
The algorithms that fall out of all that are algorithms, but
they're algorithms that take the data as primary. Or at least important.
>> I'm going to go with coequal.
>> So algorithms and
data are coequal.
>> Coequal.
>> Well, if you believe in Lisp, they're the same thing.
>> Exactly.
>> They knew that back in the seventies.
>> So it turns out we do agree on those things.
>> Whew that was close.
>> Excellent. So, the rest of the semester will go exactly
like this except you won't see us. You'll see our hands though
>> This side, this side. There you go.
>> You'll see our hands though. Thank you Michael
>> S'alright.
>> [LAUGH] Well.
>> What? [LAUGH]
>> That was good. That took me back to when I was four.
>> Señor, Señor Wences.
>> Hm?
>> It's called Señor Wences.
>> Yes I know. I remember that.
>> Yeah, okay. Mm-hm.
>> I'm not that much younger than you. Ten, 12 years old.
>> Come on.
>> You count gray hairs. Anyway, the point is
the rest of the semester will go like this.
We will talk about supervised learning and a whole
series of algorithms, step back a little bit and talk
about the theory behind them. And try to connect theory of machine
learning with theory of computing notions. Or at least that kind of
basic idea. What does it mean to be a hard problem versus
an easier problem. We'll move into
randomized optimization and unsupervised learning. Where we
will talk about all the issues that we brought up here and
try to connect them back to some of the things that we
did in the section on supervised learning. And then finally, we will
spend our time on reinforcement learning
and the generalization of traditional reinforcement
learning that involves multiple agents. So we'll talk a little bit
>> Hm.
>> of game theory, which Michael loves to talk about and I love
to talk about. And the applications of all the stuff that we've
been learning to solving problems. How to actually act in the world.
How to build that robot to do something, or build that agent
to play a game, or to teach you how to do whatever
you need to be taught how to do. But at the end
of the day, we're going to teach you how to think about data, how
to think about algorithms, and how to build artifacts that, you know,
will learn.
>> Let's do this thing.
>> Excellent. Alright, well, thank you, Michael.
>> Sure.
>> I will see you next time we're in the same place, at the same time.