1
00:00:00,000 --> 00:00:05,000
One of the things we'd like to be able to do is to compute statistics on lists.
2
00:00:05,000 --> 00:00:10,000
Imagine that we've got a set of n nodes and we've got L, which is a list of values--
3
00:00:10,000 --> 00:00:16,000
one for each of the nodes in the network and we want to compute statistics.
4
00:00:16,000 --> 00:00:21,000
What are statistics? A statistic is actually quite a simple idea.
5
00:00:21,000 --> 00:00:24,000
It's just a number that summarizes a list of numbers.
6
00:00:24,000 --> 00:00:28,000
Say we have a list L--for example, a list of centrality scores.
7
00:00:28,000 --> 00:00:33,000
There are lots of different statistics that we can imagine that would summarize
8
00:00:33,000 --> 00:00:37,000
this list of numbers--for example, how many numbers are in the list.
9
00:00:37,000 --> 00:00:42,000
What's the largest number in the list? What's the total of all the scores in the list?
10
00:00:42,000 --> 00:00:47,000
How many scores are in the list that are between 2 and 3 inclusive?
11
00:00:47,000 --> 00:00:52,000
These are all different statistics, and some of these may be more useful than others,
12
00:00:52,000 --> 00:00:55,000
but there's lots of different statistics that we might want to compute.
13
00:00:55,000 --> 00:01:00,000
In general when you're doing an analysis of large structures like social networks,
14
00:01:00,000 --> 00:01:04,000
we need some way of summarizing this large amount of data--you can't just present the data
15
00:01:04,000 --> 00:01:09,000
in the raw form--it's too much for people to think of all at once.
16
00:01:09,000 --> 00:01:14,000
Computing statistics ends up being a really important operation.
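The four example statistics mentioned in this segment (how many numbers, the largest number, the total, and how many fall between 2 and 3 inclusive) can each be computed in one line. The following is a minimal Python sketch, not code from the lecture: the function name `summarize`, the dictionary keys, and the sample scores are all made up for illustration, and it assumes L is non-empty.

```python
def summarize(L, low=2, high=3):
    """Return a few single-number summaries of the list L.

    Hypothetical helper for illustration; assumes L is non-empty,
    since max() raises ValueError on an empty list.
    """
    return {
        "count": len(L),      # how many numbers are in the list
        "largest": max(L),    # the largest number in the list
        "total": sum(L),      # the total of all the scores
        # how many scores are between low and high, inclusive
        "in_range": sum(1 for x in L if low <= x <= high),
    }


# Made-up centrality scores, one per node in a 5-node network.
scores = [1.5, 2.0, 2.5, 3.0, 4.0]
stats = summarize(scores)
print(stats)
```

Each entry reduces the whole list to one number, which is all a statistic is in the sense used here.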