Okay, I'm going to show you how I arrived at my answer. So
I'm going to select this line, and I'm actually going to use the Python
IDE to figure out the structure of this document.
Here we are in the IDE. I'm going to paste in that first line.
So I pasted in that first line. It's in a variable called
reddit_front. We can go ahead and run len on it. We see
it's 26,000 characters. Okay, so the first thing I'm going to do is
import json. And then I'm going to parse this document as JSON
using the loads function in the json module. So now I've got
this big dictionary, j, and it's got all this stuff in it. Actually,
that wasn't very useful. It just printed everything. So let's look at j.keys().
We can see there are two keys here: kind and data. Data is
almost certainly the one we want. Let's look at that. Oh, another
bunch of stuff. Let's look at the keys on this. This has just
four keys: after, before, children, and modhash. Children is going to be the one
we want. The other ones are just simple little values. So let's look
at children. Now we're starting to get somewhere. Let's look at the keys
of this. It's a list. So it's probably a list of links, which
is kind of what we're expecting. Let's look at one of these. Again,
a bunch of crap. But let's look at the keys for the first
element in the links, or rather in this children list. We can see that
it has kind and data. So let's look at the data for this
guy. We're starting to get a little bit closer. Let's see what the
keys are for this guy. Aha, perfect. And we can see that 'ups' is
actually in this. So if I were to look up ups, we can see that
it is an integer: the number of ups on this link. So that's how
I found this. So, looking at our whole JSON document, we're going to look at
data, we're going to look at children, and
then for each of the children, we're going to
sum up the ups. If I were to change zero to one, to find
the second element in the list, we can see that we get another value. I'm
going to take this piece of code with me into the IDE, and we're going to write
a function to add up all the ups. Okay, here we are in the IDE,
and what we want to do is sum up all
of the ups. So I can say sum, and I'm going to
say c['data']['ups'] for c in j['data']['children']. Basically what
I'm doing is iterating over j['data']['children'], which
we know is a list. For c, each element in
that list, I'm going to look up ['data']['ups'] on that object
c. And then we're going to sum it up using the Python
built-in function sum, and I'm just going to return that.
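The summing step just described can be sketched roughly as follows. This is a minimal sketch, not the exact code from the video: the function name total_ups and the tiny sample string are made up for illustration, and the sample stands in for the much larger front-page document pasted into reddit_front.

```python
import json

# Hypothetical, heavily shortened stand-in for the pasted Reddit front page.
# Only the nesting (data -> children -> data -> ups) follows the walkthrough.
reddit_front = '''
{"kind": "Listing",
 "data": {"after": null, "before": null, "modhash": "",
          "children": [
              {"kind": "t3", "data": {"title": "first link", "ups": 10}},
              {"kind": "t3", "data": {"title": "second link", "ups": 32}}
          ]}}
'''

def total_ups(json_text):
    """Parse a listing and add up the 'ups' value of every child link."""
    j = json.loads(json_text)  # parse the string first, otherwise j is undefined
    return sum(c['data']['ups'] for c in j['data']['children'])

print(total_ups(reddit_front))  # 10 + 32 = 42 for the sample above
```

Parsing inside the function keeps j defined wherever the function is called, which sidesteps the "j is not defined" error that comes up when the string was never loaded.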
Let's give that a run. j is not defined. That
means I didn't load the actual string of the Reddit
front page into a JSON object. So let's do
that. Let's run that again. Here we go. Now, here's your
answer: 103978. Simple enough. Now, what I wanted you to
accomplish there was just to learn how to load this
JSON, and then manipulate the data structure a little
bit. And you can see it's just like manipulating any Python
data structure, because JSON maps very cleanly to what we
have already been working with in Python: dictionaries,
lists, integers, floats, and that sort of
thing. Pretty handy there. You are now a JSON expert.
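As a recap of the exploration, here is a rough sketch of walking down the structure one level at a time and checking which Python type each JSON value becomes. The sample string is invented for illustration (including the "score" field, which is there only to show a float); the real front-page document has many more keys.

```python
import json

# Invented miniature listing; only the nesting mirrors the walkthrough.
sample = (
    '{"kind": "Listing", "data": {"after": null, "before": null, '
    '"modhash": "", "children": [{"kind": "t3", "data": '
    '{"title": "hello", "ups": 7, "score": 6.5}}]}}'
)

j = json.loads(sample)

top_keys = sorted(j.keys())            # keys of the whole document
data_keys = sorted(j['data'].keys())   # keys one level down
children = j['data']['children']       # a list, so it has no keys
first = children[0]                    # change 0 to 1 for the second element
link = first['data']                   # the dictionary that holds 'ups'

print(top_keys)    # ['data', 'kind']
print(data_keys)   # ['after', 'before', 'children', 'modhash']

# JSON objects, arrays, numbers, and null arrive as ordinary Python values.
print(type(j).__name__)              # dict
print(type(children).__name__)       # list
print(type(link['ups']).__name__)    # int
print(type(link['score']).__name__)  # float
print(j['data']['after'])            # None
```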