1
00:00:00,480 --> 00:00:02,070
There are a lot of times when we want to build

2
00:00:02,070 --> 00:00:05,800
a reverse index from a data set, to allow for some

3
00:00:05,800 --> 00:00:09,800
faster searching. If you've ever read some sort of reference book,

4
00:00:09,800 --> 00:00:12,670
you've probably used an index. So here's a book I like,

5
00:00:12,670 --> 00:00:15,680
A Brief History of Time. And the subtitle is From the

6
00:00:15,680 --> 00:00:18,000
Big Bang to Black Holes. So let's say I just want

7
00:00:18,000 --> 00:00:20,620
to read about the big bang. Well, the only thing I can

8
00:00:20,620 --> 00:00:25,590
do is crawl through all of this text, and look for

9
00:00:25,590 --> 00:00:28,280
the words "big bang". But of course, I'm not going to

10
00:00:28,280 --> 00:00:32,360
do that. Somebody's already gone through and put in the work in

11
00:00:32,360 --> 00:00:35,720
advance to create an index for me. So at the end

12
00:00:35,720 --> 00:00:37,920
of the book, I can just go to the index, and

13
00:00:37,920 --> 00:00:40,910
on the first page we have B, and I see big

14
00:00:40,910 --> 00:00:44,740
bang. And these are all the pages where I can find

15
00:00:44,740 --> 00:00:47,160
it. And of course you can do the same thing with

16
00:00:47,160 --> 00:00:51,620
the web, except instead of page numbers, we'd have links to web pages.

17
00:00:51,620 --> 00:00:54,740
This is a common problem, and so there's a design pattern to

18
00:00:54,740 --> 00:00:58,640
solve it. Why don't you take a shot at building your own inverted index?
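
The book-index idea from the transcript can be sketched in a few lines of Python. This is only a minimal illustration, not the course's solution: the function name `build_inverted_index`, the sample `pages` dictionary, and the simple regex tokenizer are all assumptions made for the example. Each word maps to the sorted list of page ids where it appears, the same way the book's index maps "big bang" to page numbers.

```python
import re
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase word to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        # Tokenize on runs of letters, digits, and apostrophes (a simplification).
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(doc_id)
    # Convert the sets to sorted lists so lookups return a stable order.
    return {word: sorted(ids) for word, ids in index.items()}


# Hypothetical "pages" standing in for a book or a set of web pages.
pages = {
    1: "The big bang marks the beginning of time",
    2: "Black holes bend space and time",
    3: "After the big bang the universe expanded",
}

index = build_inverted_index(pages)
print(index["bang"])  # [1, 3]
print(index["time"])  # [1, 2]
```

The work of scanning every page happens once, up front; after that, finding where a word occurs is a single dictionary lookup instead of a crawl through all of the text. For the web case mentioned in the transcript, the integer ids could be swapped for URLs.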