[Script Info] Title: [Events] Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text Dialogue: 0,0:00:17.08,0:00:18.19,Default,,0000,0000,0000,,TODD SCHNEIDER: All right. We're, we're good.\NThank you. Dialogue: 0,0:00:18.19,0:00:19.91,Default,,0000,0000,0000,,Sorry for the delay. Classic. Dialogue: 0,0:00:19.91,0:00:22.27,Default,,0000,0000,0000,,Even in the future nothing works. Welcome. Dialogue: 0,0:00:22.27,0:00:26.24,Default,,0000,0000,0000,,I am Todd. I'm an engineer at Rap Genius. Dialogue: 0,0:00:26.24,0:00:31.64,Default,,0000,0000,0000,,And today's talk is going to be about data\Nscience with a live tutorial. Dialogue: 0,0:00:31.64,0:00:34.36,Default,,0000,0000,0000,,And before we get into the live coding component, Dialogue: 0,0:00:34.36,0:00:36.07,Default,,0000,0000,0000,,I wanted to show you all a project I Dialogue: 0,0:00:36.07,0:00:39.03,Default,,0000,0000,0000,,built previously, which kind of serves as\Nthe inspiration Dialogue: 0,0:00:39.03,0:00:41.47,Default,,0000,0000,0000,,for this talk. Sort of. So this is a Dialogue: 0,0:00:41.47,0:00:45.44,Default,,0000,0000,0000,,website called weddingcrunchers dot com. What\Nis Wedding Crunchers? Dialogue: 0,0:00:45.44,0:00:48.11,Default,,0000,0000,0000,,It's a place where you can track the, the Dialogue: 0,0:00:48.11,0:00:50.98,Default,,0000,0000,0000,,popularity of words and phrases in the New\NYork Dialogue: 0,0:00:50.98,0:00:54.45,Default,,0000,0000,0000,,Times wedding section over the past thirty-some\Nyears. Dialogue: 0,0:00:54.45,0:00:56.13,Default,,0000,0000,0000,,And a lot of you might be wondering why Dialogue: 0,0:00:56.13,0:00:58.64,Default,,0000,0000,0000,,on earth would this be interesting or relevant\Nor Dialogue: 0,0:00:58.64,0:01:01.53,Default,,0000,0000,0000,,funny or anything, and I hope to convince\Nyou Dialogue: 0,0:01:01.53,0:01:04.36,Default,,0000,0000,0000,,of that very quickly. 
Here is an example Dialogue: 0,0:01:04.36,0:01:07.22,Default,,0000,0000,0000,,wedding announcement from the New York Times.\NThis one's Dialogue: 0,0:01:07.22,0:01:08.03,Default,,0000,0000,0000,,from 1985. Dialogue: 0,0:01:08.03,0:01:08.97,Default,,0000,0000,0000,,If you don't know, if you don't live in Dialogue: 0,0:01:08.97,0:01:11.26,Default,,0000,0000,0000,,New York or read the New York Times, the wedding Dialogue: 0,0:01:11.26,0:01:14.28,Default,,0000,0000,0000,,section has a certain cultural cachet. It's\Nkind of Dialogue: 0,0:01:14.28,0:01:15.72,Default,,0000,0000,0000,,an honor to be listed in there and it's Dialogue: 0,0:01:15.72,0:01:18.58,Default,,0000,0000,0000,,got a very resume-like structure. People get\Nto brag Dialogue: 0,0:01:18.58,0:01:20.11,Default,,0000,0000,0000,,about where they went to school and what they Dialogue: 0,0:01:20.11,0:01:20.98,Default,,0000,0000,0000,,do. Dialogue: 0,0:01:20.98,0:01:23.05,Default,,0000,0000,0000,,So here is an example. You know, Diane deCordova Dialogue: 0,0:01:23.05,0:01:25.27,Default,,0000,0000,0000,,is marrying Michael Monro Lewis. They both\Nwent to Dialogue: 0,0:01:25.27,0:01:28.25,Default,,0000,0000,0000,,Princeton. They graduated cum laude. You know,\Nshe works Dialogue: 0,0:01:28.25,0:01:30.44,Default,,0000,0000,0000,,at Morgan Stanley. He works at Salomon Brothers\Nin Dialogue: 0,0:01:30.44,0:01:32.61,Default,,0000,0000,0000,,New York and they're gonna go to London. And Dialogue: 0,0:01:32.61,0:01:34.43,Default,,0000,0000,0000,,this should be a little familiar to a bunch Dialogue: 0,0:01:34.43,0:01:35.42,Default,,0000,0000,0000,,of you. Dialogue: 0,0:01:35.42,0:01:37.87,Default,,0000,0000,0000,,Mr. Lewis, an associate at Salomon Brothers,\Nis Michael Lewis. Dialogue: 0,0:01:37.87,0:01:40.60,Default,,0000,0000,0000,,He wrote Liar's Poker, the famous\Nbook about Dialogue: 0,0:01:40.60,0:01:42.81,Default,,0000,0000,0000,,his experience there. 
And before, before he\Nwas a Dialogue: 0,0:01:42.81,0:01:45.71,Default,,0000,0000,0000,,famous writer, he was just another New York\NTimes Dialogue: 0,0:01:45.71,0:01:49.63,Default,,0000,0000,0000,,wedding announced person. Dialogue: 0,0:01:49.63,0:01:51.56,Default,,0000,0000,0000,,And so what Wedding Crunchers does is it takes Dialogue: 0,0:01:51.56,0:01:54.56,Default,,0000,0000,0000,,the entire corpus of New York Times wedding\Nannouncements Dialogue: 0,0:01:54.56,0:01:57.41,Default,,0000,0000,0000,,back to 1981 and you can search for words Dialogue: 0,0:01:57.41,0:01:59.52,Default,,0000,0000,0000,,and phrases and you can see how common those Dialogue: 0,0:01:59.52,0:02:01.80,Default,,0000,0000,0000,,words and phrases are, you know, by year.\NIt's Dialogue: 0,0:02:01.80,0:02:03.32,Default,,0000,0000,0000,,like, this is a good one that's relevant to Dialogue: 0,0:02:03.32,0:02:06.41,Default,,0000,0000,0000,,people here. You know, banker and programmer.\NYou know, Dialogue: 0,0:02:06.41,0:02:08.98,Default,,0000,0000,0000,,for example, when you list so-and-so is a\Nbanker Dialogue: 0,0:02:08.98,0:02:11.78,Default,,0000,0000,0000,,or is a programmer in the announcement and\Nyou Dialogue: 0,0:02:11.78,0:02:13.70,Default,,0000,0000,0000,,see, over time, you know, banker used to be Dialogue: 0,0:02:13.70,0:02:18.45,Default,,0000,0000,0000,,way more commonly used than programmer in\Nthese announcements. Dialogue: 0,0:02:18.45,0:02:21.14,Default,,0000,0000,0000,,And only just this year, in 2014, programmer\Nhas Dialogue: 0,0:02:21.14,0:02:28.14,Default,,0000,0000,0000,,finally overtaken banker as, you know, the,\Nthe place, Dialogue: 0,0:02:28.19,0:02:29.89,Default,,0000,0000,0000,,you know, the people getting married in New\NYork, Dialogue: 0,0:02:29.89,0:02:32.77,Default,,0000,0000,0000,,who are part of society, come from. 
Another\Ngood Dialogue: 0,0:02:32.77,0:02:35.17,Default,,0000,0000,0000,,one is, if you look at Goldman Sachs and Dialogue: 0,0:02:35.17,0:02:37.60,Default,,0000,0000,0000,,Google - is my internet on? Good. Dialogue: 0,0:02:37.60,0:02:41.15,Default,,0000,0000,0000,,So here's another good one. So Goldman Sachs,\Nyou Dialogue: 0,0:02:41.15,0:02:44.12,Default,,0000,0000,0000,,know, classic New York financial institution.\NGoogle, new kid Dialogue: 0,0:02:44.12,0:02:47.16,Default,,0000,0000,0000,,on the block. Tech scene. Boom. Taking over. Dialogue: 0,0:02:47.16,0:02:49.80,Default,,0000,0000,0000,,And, you know, this is obviously fun, and\Nit's Dialogue: 0,0:02:49.80,0:02:52.44,Default,,0000,0000,0000,,amusing. But it's also actually pretty insightful\Nfor a Dialogue: 0,0:02:52.44,0:02:55.76,Default,,0000,0000,0000,,relatively simple concept. I mean, this one\Ngraph tells Dialogue: 0,0:02:55.76,0:02:58.74,Default,,0000,0000,0000,,a pretty powerful story of, you know, New\NYork, Dialogue: 0,0:02:58.74,0:03:01.75,Default,,0000,0000,0000,,the finance capital of the world. Meanwhile,\Nwe Dialogue: 0,0:03:01.75,0:03:03.55,Default,,0000,0000,0000,,have this sort of emerging tech scene. You\Nknow, Dialogue: 0,0:03:03.55,0:03:05.15,Default,,0000,0000,0000,,Google may be the biggest player in the kind Dialogue: 0,0:03:05.15,0:03:06.96,Default,,0000,0000,0000,,of new tech world. Dialogue: 0,0:03:06.96,0:03:09.51,Default,,0000,0000,0000,,And now, when you turn to the society pages Dialogue: 0,0:03:09.51,0:03:11.21,Default,,0000,0000,0000,,to see who's getting married, you know, there's\Nmore Dialogue: 0,0:03:11.21,0:03:13.97,Default,,0000,0000,0000,,employees from Google than there are from\NGoldman Sachs. Dialogue: 0,0:03:13.97,0:03:16.75,Default,,0000,0000,0000,,And that's, you know, kind of an interesting thing\Nin Dialogue: 0,0:03:16.75,0:03:17.74,Default,,0000,0000,0000,,the world. 
Dialogue: 0,0:03:17.74,0:03:20.50,Default,,0000,0000,0000,,And so what we're gonna do today is build Dialogue: 0,0:03:20.50,0:03:25.12,Default,,0000,0000,0000,,something just like Wedding Crunchers, except,\Ninstead of using Dialogue: 0,0:03:25.12,0:03:28.28,Default,,0000,0000,0000,,the text of wedding announcements to analyze,\Nwe're going Dialogue: 0,0:03:28.28,0:03:32.67,Default,,0000,0000,0000,,to look at all of the RailsConf talk abstracts. Dialogue: 0,0:03:32.67,0:03:34.08,Default,,0000,0000,0000,,And so, you know, hopefully this is, this\Nis Dialogue: 0,0:03:34.08,0:03:36.55,Default,,0000,0000,0000,,interesting to people here and, I always say,\Nyou Dialogue: 0,0:03:36.55,0:03:38.71,Default,,0000,0000,0000,,know, if there's only one thing you take from Dialogue: 0,0:03:38.71,0:03:41.32,Default,,0000,0000,0000,,this talk, really, what it should be is that, Dialogue: 0,0:03:41.32,0:03:43.71,Default,,0000,0000,0000,,you know, work on a problem that's interesting\Nto Dialogue: 0,0:03:43.71,0:03:46.26,Default,,0000,0000,0000,,you. Because, especially when you're dealing\Nwith data science, Dialogue: 0,0:03:46.26,0:03:47.59,Default,,0000,0000,0000,,a lot of it's pretty messy and then you Dialogue: 0,0:03:47.59,0:03:49.29,Default,,0000,0000,0000,,have to go through scraping stuff as we'll\Nget Dialogue: 0,0:03:49.29,0:03:51.88,Default,,0000,0000,0000,,into, and it's easy to get frustrated and\Nkind Dialogue: 0,0:03:51.88,0:03:53.81,Default,,0000,0000,0000,,of lost and like, if you're not working on Dialogue: 0,0:03:53.81,0:03:55.45,Default,,0000,0000,0000,,something that you care about, and something\Nthat you Dialogue: 0,0:03:55.45,0:03:58.06,Default,,0000,0000,0000,,really want to know, kind of, the final result, Dialogue: 0,0:03:58.06,0:04:00.11,Default,,0000,0000,0000,,it's just much easier to get distracted and\Nkind Dialogue: 0,0:04:00.11,0:04:01.07,Default,,0000,0000,0000,,of, ultimately, bail. 
Dialogue: 0,0:04:01.07,0:04:03.82,Default,,0000,0000,0000,,So, again, if you take one thing, just work Dialogue: 0,0:04:03.82,0:04:07.55,Default,,0000,0000,0000,,on something that is interesting to you. So\Nthe Dialogue: 0,0:04:07.55,0:04:09.82,Default,,0000,0000,0000,,particular kind of analysis we're gonna do\Nis something Dialogue: 0,0:04:09.82,0:04:12.68,Default,,0000,0000,0000,,called n-gram analysis. And I have a little\Nexample Dialogue: 0,0:04:12.68,0:04:14.19,Default,,0000,0000,0000,,set up here. So what is an n-gram? You Dialogue: 0,0:04:14.19,0:04:15.80,Default,,0000,0000,0000,,may have heard the word before. Dialogue: 0,0:04:15.80,0:04:19.10,Default,,0000,0000,0000,,Really, all it means is, you know, a Dialogue: 0,0:04:19.10,0:04:23.83,Default,,0000,0000,0000,,run of n consecutive words as part of a sentence. So\Nlike, Dialogue: 0,0:04:23.83,0:04:26.03,Default,,0000,0000,0000,,the example's very simple, just one sentence. This\Ntalk is Dialogue: 0,0:04:26.03,0:04:28.00,Default,,0000,0000,0000,,boring. What are the, what are the one-grams Dialogue: 0,0:04:28.00,0:04:30.33,Default,,0000,0000,0000,,in this sentence? It's just the words. This,\Ntalk, Dialogue: 0,0:04:30.33,0:04:32.78,Default,,0000,0000,0000,,is, and boring. The two-grams are every pair Dialogue: 0,0:04:32.78,0:04:35.84,Default,,0000,0000,0000,,of consecutive words. This talk, talk is,\Nis boring, Dialogue: 0,0:04:35.84,0:04:37.22,Default,,0000,0000,0000,,and so on. 
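The one-gram and two-gram breakdown of "This talk is boring" can be sketched in a few lines of plain Ruby. This is only an illustration, not the code from the talk: the method name `ngrams_of` is hypothetical, and it assumes words are separated by whitespace.

```ruby
# Return every run of n consecutive words in a sentence.
# Assumption: words are whitespace-separated; punctuation is left untouched.
# `ngrams_of` is a hypothetical name used only for this sketch.
def ngrams_of(sentence, n)
  sentence.split.each_cons(n).map { |words| words.join(" ") }
end

p ngrams_of("This talk is boring", 1) # => ["This", "talk", "is", "boring"]
p ngrams_of("This talk is boring", 2) # => ["This talk", "talk is", "is boring"]
```

`each_cons(n)` slides a window of n words along the sentence, which is exactly the "every pair of consecutive words" idea generalized to any n.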
Dialogue: 0,0:04:37.22,0:04:38.15,Default,,0000,0000,0000,,And so what we need to be able to Dialogue: 0,0:04:38.15,0:04:40.89,Default,,0000,0000,0000,,do in order to build, you know, a graph Dialogue: 0,0:04:40.89,0:04:43.30,Default,,0000,0000,0000,,like this, is we need to take a term Dialogue: 0,0:04:43.30,0:04:45.16,Default,,0000,0000,0000,,that's, you know, relevant to RailsConf, say\Nsomething like Dialogue: 0,0:04:45.16,0:04:46.96,Default,,0000,0000,0000,,Ember or whatever, and we need to be able Dialogue: 0,0:04:46.96,0:04:48.76,Default,,0000,0000,0000,,to look up, you know, for each year how Dialogue: 0,0:04:48.76,0:04:51.30,Default,,0000,0000,0000,,many times this, you know, word or n-gram Dialogue: 0,0:04:51.30,0:04:53.61,Default,,0000,0000,0000,,appears in the data. Dialogue: 0,0:04:53.61,0:04:55.55,Default,,0000,0000,0000,,And so that is what we are going to Dialogue: 0,0:04:55.55,0:04:58.79,Default,,0000,0000,0000,,build. And I have this brief little outline\Nhere. Dialogue: 0,0:04:58.79,0:05:01.02,Default,,0000,0000,0000,,There's kind of three steps. And this is pretty Dialogue: 0,0:05:01.02,0:05:04.63,Default,,0000,0000,0000,,general to, to any data project. You know,\Nstep Dialogue: 0,0:05:04.63,0:05:06.72,Default,,0000,0000,0000,,one is gonna be just gathering the data, getting Dialogue: 0,0:05:06.72,0:05:09.66,Default,,0000,0000,0000,,it in some usable form. Step two is gonna Dialogue: 0,0:05:09.66,0:05:11.26,Default,,0000,0000,0000,,be kind of the analysis part where we do Dialogue: 0,0:05:11.26,0:05:14.05,Default,,0000,0000,0000,,the n-gram calculation and store the results.\NAnd then Dialogue: 0,0:05:14.05,0:05:15.79,Default,,0000,0000,0000,,step three is gonna be to create a nice Dialogue: 0,0:05:15.79,0:05:19.26,Default,,0000,0000,0000,,little front-end interface that lets us investigate,\Nvisualize and Dialogue: 0,0:05:19.26,0:05:20.81,Default,,0000,0000,0000,,see what we've done. 
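The lookup described here, "for each year, how many times does this n-gram appear", can be sketched with a nested Hash. Everything below is an assumption made for illustration: the talks are plain hashes with `:year` and `:abstract` keys, a Ruby Hash stands in for whatever store ends up holding the results, and the method names (`build_ngram_index`, `counts_by_year`, `sample_talks`) are hypothetical, as is the sample data.

```ruby
# Split text into lowercase n-grams (runs of n consecutive words).
def simple_ngrams(text, n)
  text.downcase.split.each_cons(n).map { |words| words.join(" ") }
end

# Build { n-gram => { year => count } } from a list of talks.
# Assumption: each talk is a Hash with :year and :abstract keys.
def build_ngram_index(talks, n)
  index = Hash.new { |hash, gram| hash[gram] = Hash.new(0) }
  talks.each do |talk|
    simple_ngrams(talk[:abstract], n).each do |gram|
      index[gram][talk[:year]] += 1
    end
  end
  index
end

# Look up the per-year counts for one term; unknown terms give an empty Hash.
def counts_by_year(index, term)
  index.fetch(term.downcase, {})
end

# Made-up sample data, not real RailsConf abstracts.
def sample_talks
  [
    { year: 2013, abstract: "Ember and Rails together" },
    { year: 2014, abstract: "Why Ember" },
    { year: 2014, abstract: "Ember data science with Ruby" }
  ]
end

index = build_ngram_index(sample_talks, 1)
p counts_by_year(index, "Ember")
```

Each series on a graph like the one shown is then just one of these per-year hashes, usually divided by the number of talks in that year so that years of different sizes are comparable.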
Dialogue: 0,0:05:20.81,0:05:23.30,Default,,0000,0000,0000,,Now unfortunately, you know, in a, in a thirty Dialogue: 0,0:05:23.30,0:05:26.02,Default,,0000,0000,0000,,minute talk we can't possibly do all of this. Dialogue: 0,0:05:26.02,0:05:28.69,Default,,0000,0000,0000,,So we're gonna focus more on items one and Dialogue: 0,0:05:28.69,0:05:31.49,Default,,0000,0000,0000,,two and less so on three, and even then Dialogue: 0,0:05:31.49,0:05:33.10,Default,,0000,0000,0000,,it's too much. So, you know, I sort of Dialogue: 0,0:05:33.10,0:05:34.69,Default,,0000,0000,0000,,used the analogy, it'll be a bit like watching Dialogue: 0,0:05:34.69,0:05:37.42,Default,,0000,0000,0000,,TV on the Food Network, where we might, you Dialogue: 0,0:05:37.42,0:05:40.04,Default,,0000,0000,0000,,know, throw something in the oven, mysteriously\Nsomething else Dialogue: 0,0:05:40.04,0:05:42.01,Default,,0000,0000,0000,,pops out of the other oven even though it's, Dialogue: 0,0:05:42.01,0:05:43.76,Default,,0000,0000,0000,,where did that come from? Dialogue: 0,0:05:43.76,0:05:46.09,Default,,0000,0000,0000,,But not to worry. Everything is also on GitHub. Dialogue: 0,0:05:46.09,0:05:47.87,Default,,0000,0000,0000,,There's a repo I'll share with you at the Dialogue: 0,0:05:47.87,0:05:50.34,Default,,0000,0000,0000,,end. So anything that we don't cover or that Dialogue: 0,0:05:50.34,0:05:51.98,Default,,0000,0000,0000,,we cover too quickly or something, you'll\Nbe able Dialogue: 0,0:05:51.98,0:05:53.78,Default,,0000,0000,0000,,to see sort of the, the full version on Dialogue: 0,0:05:53.78,0:05:55.74,Default,,0000,0000,0000,,GitHub. Dialogue: 0,0:05:55.74,0:05:57.77,Default,,0000,0000,0000,,So let us jump in now to step one, Dialogue: 0,0:05:57.77,0:06:00.19,Default,,0000,0000,0000,,which is, you know, gathering the data. And\Nso Dialogue: 0,0:06:00.19,0:06:01.91,Default,,0000,0000,0000,,let's take a look back at the, the RailsConf Dialogue: 0,0:06:01.91,0:06:03.08,Default,,0000,0000,0000,,website again. 
So we have to figure out how Dialogue: 0,0:06:03.08,0:06:06.46,Default,,0000,0000,0000,,we're gonna model a, a RailsConf talk in our Dialogue: 0,0:06:06.46,0:06:09.89,Default,,0000,0000,0000,,database. So like, what, you know, attributes\Ndoes a, Dialogue: 0,0:06:09.89,0:06:13.34,Default,,0000,0000,0000,,do a, excuse me, does a RailsConf talk have? Dialogue: 0,0:06:13.34,0:06:14.29,Default,,0000,0000,0000,,And it's like, one thing we see is they Dialogue: 0,0:06:14.29,0:06:17.67,Default,,0000,0000,0000,,all have titles. So that looks like something.\NThey Dialogue: 0,0:06:17.67,0:06:20.09,Default,,0000,0000,0000,,have speakers. You know, there's this thing,\Nwhich is Dialogue: 0,0:06:20.09,0:06:23.33,Default,,0000,0000,0000,,the abstract, and then there's the bio. And\Nthat's Dialogue: 0,0:06:23.33,0:06:25.47,Default,,0000,0000,0000,,probably it. That's probably all we need. Dialogue: 0,0:06:25.47,0:06:27.67,Default,,0000,0000,0000,,So that's pretty simple. And, you know, I\Nhave Dialogue: 0,0:06:27.67,0:06:30.00,Default,,0000,0000,0000,,the little migration I've already run here.\NBut here Dialogue: 0,0:06:30.00,0:06:31.79,Default,,0000,0000,0000,,are attributes for talks. It's just the year,\Nyou Dialogue: 0,0:06:31.79,0:06:33.91,Default,,0000,0000,0000,,know, what, what conference were we actually\Nat. The Dialogue: 0,0:06:33.91,0:06:36.11,Default,,0000,0000,0000,,title of the talk, the speaker, the abstract,\Nand Dialogue: 0,0:06:36.11,0:06:37.57,Default,,0000,0000,0000,,the bio. Dialogue: 0,0:06:37.57,0:06:41.49,Default,,0000,0000,0000,,And so also, that's, again, pretty straightforward.\NThe Gemfile Dialogue: 0,0:06:41.49,0:06:45.09,Default,,0000,0000,0000,,is also very simple. It's mostly pretty\Nboilerplate. Dialogue: 0,0:06:45.09,0:06:47.83,Default,,0000,0000,0000,,Rails 4, Ruby 2.1. 
The only gems I wanted Dialogue: 0,0:06:47.83,0:06:49.41,Default,,0000,0000,0000,,to call out here are, we're gonna use Nokogiri Dialogue: 0,0:06:49.41,0:06:52.31,Default,,0000,0000,0000,,for, you know, fetching, or, parsing websites\Nand kind Dialogue: 0,0:06:52.31,0:06:54.23,Default,,0000,0000,0000,,of scraping the data we need. We're gonna\Nuse Dialogue: 0,0:06:54.23,0:06:56.39,Default,,0000,0000,0000,,Postgres as our main data store and we're gonna Dialogue: 0,0:06:56.39,0:06:58.22,Default,,0000,0000,0000,,use Redis to build the sort of index that Dialogue: 0,0:06:58.22,0:07:00.18,Default,,0000,0000,0000,,we can ultimately use to look up, you know, Dialogue: 0,0:07:00.18,0:07:02.39,Default,,0000,0000,0000,,how common a word is. Dialogue: 0,0:07:02.39,0:07:05.39,Default,,0000,0000,0000,,And so one thing that's not here is, like, Dialogue: 0,0:07:05.39,0:07:09.01,Default,,0000,0000,0000,,you know, gem 'fancy-data-algorithm'. And a\Nlot Dialogue: 0,0:07:09.01,0:07:10.69,Default,,0000,0000,0000,,of people, this is kind of where Ruby often Dialogue: 0,0:07:10.69,0:07:13.37,Default,,0000,0000,0000,,gets a bad reputation of, you know, not being Dialogue: 0,0:07:13.37,0:07:16.04,Default,,0000,0000,0000,,supportive of scientific computing or whatever.\NAnd other languages Dialogue: 0,0:07:16.04,0:07:18.59,Default,,0000,0000,0000,,have more, more support. But my claim is that Dialogue: 0,0:07:18.59,0:07:20.52,Default,,0000,0000,0000,,it's really not that important. You can get\Na Dialogue: 0,0:07:20.52,0:07:23.51,Default,,0000,0000,0000,,ton of mileage out of very simple tools that Dialogue: 0,0:07:23.51,0:07:24.21,Default,,0000,0000,0000,,you can build yourself. Dialogue: 0,0:07:24.21,0:07:25.81,Default,,0000,0000,0000,,You know, you don't need a fancy gem or Dialogue: 0,0:07:25.81,0:07:28.36,Default,,0000,0000,0000,,any fancy algorithm. Those things are cool\Ntoo and Dialogue: 0,0:07:28.36,0:07:30.74,Default,,0000,0000,0000,,they have their place. 
But they're not needed\Na Dialogue: 0,0:07:30.74,0:07:33.35,Default,,0000,0000,0000,,lot of the time. And, you know, Ruby is Dialogue: 0,0:07:33.35,0:07:36.21,Default,,0000,0000,0000,,a wonderful language for, especially, scraping\Nstuff from the Dialogue: 0,0:07:36.21,0:07:38.45,Default,,0000,0000,0000,,web. There's a ton of support there. And so Dialogue: 0,0:07:38.45,0:07:40.98,Default,,0000,0000,0000,,I don't think that the, the lack of, you Dialogue: 0,0:07:40.98,0:07:43.51,Default,,0000,0000,0000,,know, fancy algorithm gems should necessarily\Nbe a deterrent Dialogue: 0,0:07:43.51,0:07:44.44,Default,,0000,0000,0000,,at all. Dialogue: 0,0:07:44.44,0:07:46.96,Default,,0000,0000,0000,,And so hopefully part of this talk is convincing Dialogue: 0,0:07:46.96,0:07:49.65,Default,,0000,0000,0000,,people that Ruby and Rails are actually quite\Nwell-suited Dialogue: 0,0:07:49.65,0:07:50.94,Default,,0000,0000,0000,,to problems like this. Dialogue: 0,0:07:50.94,0:07:53.56,Default,,0000,0000,0000,,OK. So now we actually need to write some Dialogue: 0,0:07:53.56,0:07:56.25,Default,,0000,0000,0000,,code to scrape the talks. And you know, if Dialogue: 0,0:07:56.25,0:07:57.42,Default,,0000,0000,0000,,you've ever done anything like this before,\Nyou know Dialogue: 0,0:07:57.42,0:07:59.52,Default,,0000,0000,0000,,that Chrome Inspector is your best friend.\NSo let's Dialogue: 0,0:07:59.52,0:08:02.50,Default,,0000,0000,0000,,fire that up. We're gonna inspect element,\Nand so Dialogue: 0,0:08:02.50,0:08:04.07,Default,,0000,0000,0000,,like, we actually, what we need to do now Dialogue: 0,0:08:04.07,0:08:06.89,Default,,0000,0000,0000,,is take, you know, this HTML on the page Dialogue: 0,0:08:06.89,0:08:09.12,Default,,0000,0000,0000,,and turn it into a database record that we Dialogue: 0,0:08:09.12,0:08:11.89,Default,,0000,0000,0000,,can then, you know, use to our advantage later. 
Dialogue: 0,0:08:11.89,0:08:13.05,Default,,0000,0000,0000,,And so it looks like, you know, all the Dialogue: 0,0:08:13.05,0:08:16.63,Default,,0000,0000,0000,,talks are in these session classes. So that's\Nsomething. Dialogue: 0,0:08:16.63,0:08:19.85,Default,,0000,0000,0000,,We can look in here. This looks like something. Dialogue: 0,0:08:19.85,0:08:23.47,Default,,0000,0000,0000,,So let's make this bigger. Dialogue: 0,0:08:23.47,0:08:25.04,Default,,0000,0000,0000,,And you know it helps to, well, it's kind Dialogue: 0,0:08:25.04,0:08:29.06,Default,,0000,0000,0000,,of essential to be decent with CSS selectors\Nhere, Dialogue: 0,0:08:29.06,0:08:32.15,Default,,0000,0000,0000,,because that's how we're going to basically\Nfind stuff. Dialogue: 0,0:08:32.15,0:08:34.72,Default,,0000,0000,0000,,So let's see, OK, so there's eighty-one session\Ndivs. Dialogue: 0,0:08:34.72,0:08:37.99,Default,,0000,0000,0000,,That sounds about right. I happen to know\Nthat Dialogue: 0,0:08:37.99,0:08:42.23,Default,,0000,0000,0000,,mine is number seventy-eight, so let's, let's\Nlook at Dialogue: 0,0:08:42.23,0:08:44.36,Default,,0000,0000,0000,,that. And so here we are. So we need Dialogue: 0,0:08:44.36,0:08:46.97,Default,,0000,0000,0000,,to, again, the, the things we're mod- or,\Nthe Dialogue: 0,0:08:46.97,0:08:50.25,Default,,0000,0000,0000,,attributes we're storing at the title, the\Nspeaker, the Dialogue: 0,0:08:50.25,0:08:52.68,Default,,0000,0000,0000,,abstract, and the bio. And so we're gonna\Nneed Dialogue: 0,0:08:52.68,0:08:54.85,Default,,0000,0000,0000,,to pull these things out. Dialogue: 0,0:08:54.85,0:08:57.63,Default,,0000,0000,0000,,So let's see. It looks like the, the title Dialogue: 0,0:08:57.63,0:09:00.49,Default,,0000,0000,0000,,is in this h1 element inside the header. So Dialogue: 0,0:09:00.49,0:09:04.83,Default,,0000,0000,0000,,let's just make sure that works. You know,\Nheader Dialogue: 0,0:09:04.83,0:09:08.45,Default,,0000,0000,0000,,h1. That looks right. 
Dialogue: 0,0:09:08.45,0:09:13.65,Default,,0000,0000,0000,,The, the speaker looks to be the header h2. Dialogue: 0,0:09:13.65,0:09:16.06,Default,,0000,0000,0000,,Cool. Dialogue: 0,0:09:16.06,0:09:20.64,Default,,0000,0000,0000,,Now the abstract is in this p tag, so Dialogue: 0,0:09:20.64,0:09:23.13,Default,,0000,0000,0000,,we can do something like this. But this is Dialogue: 0,0:09:23.13,0:09:26.49,Default,,0000,0000,0000,,actually not quite right. So what's wrong\Nwith this? Dialogue: 0,0:09:26.49,0:09:30.14,Default,,0000,0000,0000,,Well, the abstract ends, you know, suited\Nto the Dialogue: 0,0:09:30.14,0:09:32.31,Default,,0000,0000,0000,,problem. The bio here is also in the p Dialogue: 0,0:09:32.31,0:09:35.31,Default,,0000,0000,0000,,tag. Originally a math guy. And we've actually\Npulled Dialogue: 0,0:09:35.31,0:09:37.01,Default,,0000,0000,0000,,all the p-tags. So we need a way of Dialogue: 0,0:09:37.01,0:09:38.94,Default,,0000,0000,0000,,not doing that. And this is where you just Dialogue: 0,0:09:38.94,0:09:40.20,Default,,0000,0000,0000,,need to know a little bit of CSS. Not Dialogue: 0,0:09:40.20,0:09:42.55,Default,,0000,0000,0000,,very complicated. But if you use the little\Ngreater Dialogue: 0,0:09:42.55,0:09:44.80,Default,,0000,0000,0000,,than guy, what this says is only take the Dialogue: 0,0:09:44.80,0:09:47.21,Default,,0000,0000,0000,,p tags that are immediate descendants of the\Nsession Dialogue: 0,0:09:47.21,0:09:50.39,Default,,0000,0000,0000,,div. And so now we have, you know, only Dialogue: 0,0:09:50.39,0:09:51.06,Default,,0000,0000,0000,,the abstract. Dialogue: 0,0:09:51.06,0:09:54.34,Default,,0000,0000,0000,,And lastly, you know, the bio is just in Dialogue: 0,0:09:54.34,0:09:58.46,Default,,0000,0000,0000,,its own little section. So something like\Nthat. Cool. Dialogue: 0,0:09:58.46,0:10:00.19,Default,,0000,0000,0000,,So that is the jQuery version of it. We Dialogue: 0,0:10:00.19,0:10:03.18,Default,,0000,0000,0000,,need to do this, though, in Ruby. 
And as Dialogue: 0,0:10:03.18,0:10:05.25,Default,,0000,0000,0000,,I said, this does sometimes get a little tedious. Dialogue: 0,0:10:05.25,0:10:07.34,Default,,0000,0000,0000,,But let's, let's write the code. So I have Dialogue: 0,0:10:07.34,0:10:12.16,Default,,0000,0000,0000,,this empty method - create_railsconf_2014_talks.\NAnd also this method Dialogue: 0,0:10:12.16,0:10:14.76,Default,,0000,0000,0000,,I've written already called fetch_and_parse,\Nwhich just gets a Dialogue: 0,0:10:14.76,0:10:16.61,Default,,0000,0000,0000,,URL and sends it to nokogiri, which we can Dialogue: 0,0:10:16.61,0:10:17.69,Default,,0000,0000,0000,,then use to do our CSS selectors. Dialogue: 0,0:10:17.69,0:10:20.51,Default,,0000,0000,0000,,So let, let's just write this. So we can Dialogue: 0,0:10:20.51,0:10:27.40,Default,,0000,0000,0000,,say doc is fetch_and_parse. The url is this.\NLet's Dialogue: 0,0:10:27.40,0:10:33.94,Default,,0000,0000,0000,,see if this works in the console. Dialogue: 0,0:10:33.94,0:10:40.94,Default,,0000,0000,0000,,Of course, in here. Do I have internet? Nice. Dialogue: 0,0:10:47.36,0:10:52.70,Default,,0000,0000,0000,,So we can then check the same thing. Again. Dialogue: 0,0:10:52.70,0:10:57.83,Default,,0000,0000,0000,,Looks right. Let's find my talk, which, this\Npart Dialogue: 0,0:10:57.83,0:10:59.31,Default,,0000,0000,0000,,I couldn't possibly tell you. When you use\Nthe Dialogue: 0,0:10:59.31,0:11:01.61,Default,,0000,0000,0000,,nokogiri, the eq thing, you have to add two Dialogue: 0,0:11:01.61,0:11:04.33,Default,,0000,0000,0000,,from whatever jQuery does. So I'm number 80\Nnow. Dialogue: 0,0:11:04.33,0:11:06.57,Default,,0000,0000,0000,,Don't ask me why. I couldn't possibly tell\Nyou. Dialogue: 0,0:11:06.57,0:11:10.21,Default,,0000,0000,0000,,But maybe someone here knows. Be curious to\Nfind Dialogue: 0,0:11:10.21,0:11:10.78,Default,,0000,0000,0000,,out. Dialogue: 0,0:11:10.78,0:11:11.92,Default,,0000,0000,0000,,AUDIENCE: ?? 
(00:11:13) Dialogue: 0,0:11:11.92,0:11:15.40,Default,,0000,0000,0000,,T.S.: So there it is. There's the title. So Dialogue: 0,0:11:15.40,0:11:17.38,Default,,0000,0000,0000,,let us now write some code here. We have Dialogue: 0,0:11:17.38,0:11:21.52,Default,,0000,0000,0000,,our, our document. We're gonna go through\Neach session. Dialogue: 0,0:11:21.52,0:11:24.32,Default,,0000,0000,0000,,The CSS method is kind of like, you know, Dialogue: 0,0:11:24.32,0:11:28.90,Default,,0000,0000,0000,,the selector for nokogiri. Each elements.\NSo each of Dialogue: 0,0:11:28.90,0:11:35.37,Default,,0000,0000,0000,,these we're gonna create a talk. Dialogue: 0,0:11:35.37,0:11:38.39,Default,,0000,0000,0000,,And again. So the year we already know is Dialogue: 0,0:11:38.39,0:11:45.39,Default,,0000,0000,0000,,2014. The title we're gonna say is, elm.css("header\Nh1").inner_text. Dialogue: 0,0:11:48.30,0:11:55.30,Default,,0000,0000,0000,,Speaker, header h2, dun nuh nuh dun nuh nuh Dialogue: 0,0:12:00.46,0:12:04.52,Default,,0000,0000,0000,,nuh. Gettin' there. Dialogue: 0,0:12:04.52,0:12:09.95,Default,,0000,0000,0000,,All right. So I think this will probably work. Dialogue: 0,0:12:09.95,0:12:13.98,Default,,0000,0000,0000,,Let's find out. And so we're back in here. Dialogue: 0,0:12:13.98,0:12:19.47,Default,,0000,0000,0000,,Just to prove to you that I'm not lying, Dialogue: 0,0:12:19.47,0:12:23.45,Default,,0000,0000,0000,,2014 dot count. There's none of them. And,\Nwhat'd Dialogue: 0,0:12:23.45,0:12:26.44,Default,,0000,0000,0000,,I call this method? This guy. Delayed::Job. Dialogue: 0,0:12:26.44,0:12:33.44,Default,,0000,0000,0000,,All right. So we just did something. Did it Dialogue: 0,0:12:33.44,0:12:40.44,Default,,0000,0000,0000,,work? Nice. We got eighty-one talks. Most\Nimportantly, let's, Dialogue: 0,0:12:41.15,0:12:42.39,Default,,0000,0000,0000,,we have my talk. That's the, that's the only Dialogue: 0,0:12:42.39,0:12:46.76,Default,,0000,0000,0000,,one that matters anyway. 
And so, you know,\Nyou Dialogue: 0,0:12:46.76,0:12:48.26,Default,,0000,0000,0000,,might be thinking now, like, you know, what\Nthe Dialogue: 0,0:12:48.26,0:12:50.12,Default,,0000,0000,0000,,heck, I came to the, the data science talk, Dialogue: 0,0:12:50.12,0:12:52.33,Default,,0000,0000,0000,,not the scraping talk. You know, to that,\NI Dialogue: 0,0:12:52.33,0:12:56.02,Default,,0000,0000,0000,,would say, tough luck. They're the same thing.\NYou Dialogue: 0,0:12:56.02,0:12:57.88,Default,,0000,0000,0000,,know, you might not, you might not want to Dialogue: 0,0:12:57.88,0:13:00.04,Default,,0000,0000,0000,,hear it, but guess what, this is usually the Dialogue: 0,0:13:00.04,0:13:02.02,Default,,0000,0000,0000,,most important part of the entire project. Dialogue: 0,0:13:02.02,0:13:04.96,Default,,0000,0000,0000,,It's the hardest part, you know, because guess\Nwhat, Dialogue: 0,0:13:04.96,0:13:07.08,Default,,0000,0000,0000,,just because we got the 2014 talks, you know, Dialogue: 0,0:13:07.08,0:13:08.63,Default,,0000,0000,0000,,now we have to get the 2013 talks. And Dialogue: 0,0:13:08.63,0:13:10.88,Default,,0000,0000,0000,,the 2012 talks. And they're all on different\Nwebsites. Dialogue: 0,0:13:10.88,0:13:12.89,Default,,0000,0000,0000,,They all have different structures. You know,\Nyou're gonna Dialogue: 0,0:13:12.89,0:13:15.09,Default,,0000,0000,0000,,have to write different code to get each type Dialogue: 0,0:13:15.09,0:13:17.12,Default,,0000,0000,0000,,of website. It's a pain. And this is why Dialogue: 0,0:13:17.12,0:13:19.24,Default,,0000,0000,0000,,I said earlier, you know, really make sure\Nyou're Dialogue: 0,0:13:19.24,0:13:21.16,Default,,0000,0000,0000,,working on something you care about. 
Because\Nit's just Dialogue: 0,0:13:21.16,0:13:24.30,Default,,0000,0000,0000,,not fun to like, like, ugh, in 2008 they Dialogue: 0,0:13:24.30,0:13:26.85,Default,,0000,0000,0000,,separated the speakers and the abstracts.\NAnd it's like, Dialogue: 0,0:13:26.85,0:13:29.26,Default,,0000,0000,0000,,it's just, it's annoying, but again, it's\Nthe most Dialogue: 0,0:13:29.26,0:13:30.29,Default,,0000,0000,0000,,important part I would say. Dialogue: 0,0:13:30.29,0:13:32.92,Default,,0000,0000,0000,,You know, so much of data science is taking Dialogue: 0,0:13:32.92,0:13:35.96,Default,,0000,0000,0000,,data that's either unstructured or structured\Nin the wrong Dialogue: 0,0:13:35.96,0:13:39.02,Default,,0000,0000,0000,,format for you and, you know, getting it into Dialogue: 0,0:13:39.02,0:13:40.51,Default,,0000,0000,0000,,the way, you know, into the structure that\Nyou Dialogue: 0,0:13:40.51,0:13:43.41,Default,,0000,0000,0000,,need to do whatever analysis you want to do. Dialogue: 0,0:13:43.41,0:13:45.13,Default,,0000,0000,0000,,So in this case, that's taking, you know,\NHTML Dialogue: 0,0:13:45.13,0:13:47.93,Default,,0000,0000,0000,,on a page and converting it into a Postgres Dialogue: 0,0:13:47.93,0:13:49.43,Default,,0000,0000,0000,,database. Dialogue: 0,0:13:49.43,0:13:52.80,Default,,0000,0000,0000,,And so we have done that now. And again, Dialogue: 0,0:13:52.80,0:13:53.92,Default,,0000,0000,0000,,take my word that, you know, I've done this Dialogue: 0,0:13:53.92,0:13:56.98,Default,,0000,0000,0000,,for the other years as well, back to 2007, Dialogue: 0,0:13:56.98,0:14:00.93,Default,,0000,0000,0000,,and so we have a total of 497 talks Dialogue: 0,0:14:00.93,0:14:04.22,Default,,0000,0000,0000,,in here from RailsConfs over the years. And\Nso Dialogue: 0,0:14:04.22,0:14:06.87,Default,,0000,0000,0000,,that's cool. That's basically our dataset\Nthat we're gonna Dialogue: 0,0:14:06.87,0:14:07.20,Default,,0000,0000,0000,,use. 
Dialogue: 0,0:14:07.20,0:14:08.73,Default,,0000,0000,0000,,And so we can sort of move on to, Dialogue: 0,0:14:08.73,0:14:11.00,Default,,0000,0000,0000,,you know, step two of the project here, which Dialogue: 0,0:14:11.00,0:14:14.00,Default,,0000,0000,0000,,is, you know, do the n-gram calculation and\Nstore Dialogue: 0,0:14:14.00,0:14:16.80,Default,,0000,0000,0000,,the results. And so let's go back to talk.rb. Dialogue: 0,0:14:16.80,0:14:18.70,Default,,0000,0000,0000,,All this by the way is just in, you Dialogue: 0,0:14:18.70,0:14:21.92,Default,,0000,0000,0000,,know, app/models/talk.rb. That's where all\Nthis code is. Dialogue: 0,0:14:21.92,0:14:25.56,Default,,0000,0000,0000,,And I have another empty method somewhere\Ncalled def Dialogue: 0,0:14:25.56,0:14:27.59,Default,,0000,0000,0000,,ngrams. And so this method, we're gonna need\Nto Dialogue: 0,0:14:27.59,0:14:29.81,Default,,0000,0000,0000,,give, you know, it goes on a talk. So Dialogue: 0,0:14:29.81,0:14:32.40,Default,,0000,0000,0000,,given a value of n, calculate on the ngrams Dialogue: 0,0:14:32.40,0:14:34.54,Default,,0000,0000,0000,,from that talk's abstract. Dialogue: 0,0:14:34.54,0:14:36.16,Default,,0000,0000,0000,,And so, what are we gonna do here? So Dialogue: 0,0:14:36.16,0:14:43.16,Default,,0000,0000,0000,,again, let's look at, talk dot mine. Dot abstract. Dialogue: 0,0:14:43.55,0:14:45.16,Default,,0000,0000,0000,,So here's the abstract, and we need to, you Dialogue: 0,0:14:45.16,0:14:48.92,Default,,0000,0000,0000,,know, get ngrams out of this. And so the Dialogue: 0,0:14:48.92,0:14:51.06,Default,,0000,0000,0000,,first thing, I've written a little helper\Nmethod over Dialogue: 0,0:14:51.06,0:14:54.05,Default,,0000,0000,0000,,here. Which I've just tacked on a string called Dialogue: 0,0:14:54.05,0:14:57.41,Default,,0000,0000,0000,,normalized_for_ngrams. And you know, what\Ndoes this do? 
Well, Dialogue: 0,0:14:57.41,0:14:59.64,Default,,0000,0000,0000,,it downcases it, cause we're gonna do case\Ninsensitive. Dialogue: 0,0:14:59.64,0:15:01.56,Default,,0000,0000,0000,,There might be cases where you want to keep Dialogue: 0,0:15:01.56,0:15:03.82,Default,,0000,0000,0000,,case sensitivity. Whatever. Doesn't really\Nmatter. In this case Dialogue: 0,0:15:03.82,0:15:06.06,Default,,0000,0000,0000,,we're gonna go case insensitive. Dialogue: 0,0:15:06.06,0:15:08.88,Default,,0000,0000,0000,,Squish is a nice, convenient method that will\Nkind Dialogue: 0,0:15:08.88,0:15:11.46,Default,,0000,0000,0000,,of standardize the white space for you. So\Nlike, Dialogue: 0,0:15:11.46,0:15:13.99,Default,,0000,0000,0000,,if there's any trailing or leading white space,\Nand Dialogue: 0,0:15:13.99,0:15:16.60,Default,,0000,0000,0000,,if there's like a bunch of middle white space, Dialogue: 0,0:15:16.60,0:15:18.73,Default,,0000,0000,0000,,this will, it'll kill the beginning and ending\Nand Dialogue: 0,0:15:18.73,0:15:20.63,Default,,0000,0000,0000,,it'll turn anything in the middle into a single Dialogue: 0,0:15:20.63,0:15:21.22,Default,,0000,0000,0000,,space. Dialogue: 0,0:15:21.22,0:15:22.23,Default,,0000,0000,0000,,So that way you just don't have to worry Dialogue: 0,0:15:22.23,0:15:25.13,Default,,0000,0000,0000,,about things like double spaces or, you know,\Nother, Dialogue: 0,0:15:25.13,0:15:26.82,Default,,0000,0000,0000,,other weird things that can happen. Cause\Nof course Dialogue: 0,0:15:26.82,0:15:28.60,Default,,0000,0000,0000,,it's the web. Whatever can go wrong will go Dialogue: 0,0:15:28.60,0:15:31.51,Default,,0000,0000,0000,,wrong. So make sure that your data's in\Nsome Dialogue: 0,0:15:31.51,0:15:33.36,Default,,0000,0000,0000,,kind of standardized format. Dialogue: 0,0:15:33.36,0:15:36.50,Default,,0000,0000,0000,,And the last thing I've done is removed punctuation.
Dialogue: 0,0:15:36.50,0:15:38.36,Default,,0000,0000,0000,,And the reason for that is just cause like, Dialogue: 0,0:15:38.36,0:15:40.28,Default,,0000,0000,0000,,you know, there's commas, periods, colons,\Nall sorts of Dialogue: 0,0:15:40.28,0:15:42.93,Default,,0000,0000,0000,,stuff like that. We don't really care about\Nit. Dialogue: 0,0:15:42.93,0:15:44.71,Default,,0000,0000,0000,,And so let's just kill any character that's\Nnot Dialogue: 0,0:15:44.71,0:15:46.54,Default,,0000,0000,0000,,either a space or a word character. This is Dialogue: 0,0:15:46.54,0:15:49.45,Default,,0000,0000,0000,,kind of the, little like, Ruby special regex\Nthing. Dialogue: 0,0:15:49.45,0:15:53.04,Default,,0000,0000,0000,,So we're gonna kill punctuation. Dialogue: 0,0:15:53.04,0:15:54.19,Default,,0000,0000,0000,,And so we can actually just mess with this Dialogue: 0,0:15:54.19,0:15:56.61,Default,,0000,0000,0000,,in the console maybe. So let's take our little Dialogue: 0,0:15:56.61,0:16:00.46,Default,,0000,0000,0000,,example sentence. You know, this talk is boring.\NAnd Dialogue: 0,0:16:00.46,0:16:04.24,Default,,0000,0000,0000,,let's normalize that for ngrams. OK. All it\Ndid Dialogue: 0,0:16:04.24,0:16:07.71,Default,,0000,0000,0000,,was downcase it. And now we want to get Dialogue: 0,0:16:07.71,0:16:09.41,Default,,0000,0000,0000,,that into an array of words, which we can Dialogue: 0,0:16:09.41,0:16:13.06,Default,,0000,0000,0000,,just do with split. Cool. Dialogue: 0,0:16:13.06,0:16:16.83,Default,,0000,0000,0000,,And now there's actually this neat little\NRuby enumerable Dialogue: 0,0:16:16.83,0:16:18.29,Default,,0000,0000,0000,,thing, which I didn't know about until pretty\Nrecently. 
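The normalization step described above can be sketched in plain Ruby. This is only an approximation of the talk's normalized_for_ngrams helper, whose exact source is not shown in the transcript: the talk uses ActiveSupport's squish, and here strip plus a whitespace gsub stands in for it so the snippet runs without Rails. The function form (rather than a method on String) and the sample sentence's extra spacing are assumptions made for illustration.

```ruby
# Plain-Ruby stand-in for the normalization helper described in the talk.
# Assumption: strip + gsub replaces ActiveSupport's String#squish.
def normalized_for_ngrams(text)
  text.downcase
      .gsub(/[^\w\s]/, "") # drop anything that is not a word character or whitespace
      .gsub(/\s+/, " ")    # collapse runs of whitespace into a single space
      .strip               # remove leading and trailing whitespace
end

puts normalized_for_ngrams("  This talk   is boring. ")
# prints: this talk is boring

p normalized_for_ngrams("This talk is boring.").split
# prints: ["this", "talk", "is", "boring"]
```

Whitespace is collapsed after punctuation is removed here, so that something like "a - b" does not leave a double space behind.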
Dialogue: 0,0:16:18.29,0:16:21.80,Default,,0000,0000,0000,,each_cons, which stands for each consecutive.\NAnd it Dialogue: 0,0:16:21.80,0:16:25.38,Default,,0000,0000,0000,,takes an argument, a single number, like two,\Nand Dialogue: 0,0:16:25.38,0:16:27.28,Default,,0000,0000,0000,,what this says is give me all of the, Dialogue: 0,0:16:27.28,0:16:29.78,Default,,0000,0000,0000,,you know, consecutive pairs of two. So if\Nwe Dialogue: 0,0:16:29.78,0:16:32.44,Default,,0000,0000,0000,,to_a this, now we have this array of arrays, Dialogue: 0,0:16:32.44,0:16:34.18,Default,,0000,0000,0000,,which looks like exactly what we want. Dialogue: 0,0:16:34.18,0:16:36.87,Default,,0000,0000,0000,,This talk, talk is, and is boring. And so Dialogue: 0,0:16:36.87,0:16:38.31,Default,,0000,0000,0000,,the last thing we can do there is we Dialogue: 0,0:16:38.31,0:16:43.69,Default,,0000,0000,0000,,can just map that array to make these just Dialogue: 0,0:16:43.69,0:16:44.19,Default,,0000,0000,0000,,phrases. Dialogue: 0,0:16:44.19,0:16:46.86,Default,,0000,0000,0000,,So cool. So this is actually the entirety\Nof Dialogue: 0,0:16:46.86,0:16:49.82,Default,,0000,0000,0000,,our ngrams method, is just, you know, this\Ncode Dialogue: 0,0:16:49.82,0:16:51.63,Default,,0000,0000,0000,,right here. So let's copy and paste this into Dialogue: 0,0:16:51.63,0:16:56.50,Default,,0000,0000,0000,,the old method here. So we want. We're doing Dialogue: 0,0:16:56.50,0:17:03.04,Default,,0000,0000,0000,,this on the abstract. Let's get some new lines Dialogue: 0,0:17:03.04,0:17:04.08,Default,,0000,0000,0000,,here. Dialogue: 0,0:17:04.08,0:17:09.84,Default,,0000,0000,0000,,All right, cool. So again, just to recap,\Nyou Dialogue: 0,0:17:09.84,0:17:12.04,Default,,0000,0000,0000,,take the abstract, we normalize it, which\Nmeans, you Dialogue: 0,0:17:12.04,0:17:14.88,Default,,0000,0000,0000,,know, downcase and kill the punctuation. We\Nsplit it Dialogue: 0,0:17:14.88,0:17:17.29,Default,,0000,0000,0000,,to words. Uh, wait.
Actually this should not\Nbe Dialogue: 0,0:17:17.29,0:17:21.12,Default,,0000,0000,0000,,two. That should be n. And then we join Dialogue: 0,0:17:21.12,0:17:24.22,Default,,0000,0000,0000,,those. So let's, let's see if this worked. Dialogue: 0,0:17:24.22,0:17:31.22,Default,,0000,0000,0000,,So talk dot mine again. And one. OK. So Dialogue: 0,0:17:31.36,0:17:32.77,Default,,0000,0000,0000,,here are all the one grams, which is just Dialogue: 0,0:17:32.77,0:17:36.24,Default,,0000,0000,0000,,the sequence of words. And that looks correct.\NAnd Dialogue: 0,0:17:36.24,0:17:41.87,Default,,0000,0000,0000,,all of the two grams. Also looks correct,\NI Dialogue: 0,0:17:41.87,0:17:45.37,Default,,0000,0000,0000,,think. Yeah. To get, get a, yeah, OK, perfect. Dialogue: 0,0:17:45.37,0:17:47.62,Default,,0000,0000,0000,,And so this is kind of the, the method Dialogue: 0,0:17:47.62,0:17:50.69,Default,,0000,0000,0000,,we're gonna use to decompose these talks into\Njust, Dialogue: 0,0:17:50.69,0:17:53.80,Default,,0000,0000,0000,,you know, an array of words and phrases. And Dialogue: 0,0:17:53.80,0:17:55.93,Default,,0000,0000,0000,,so what is the next step, now that we Dialogue: 0,0:17:55.93,0:17:57.55,Default,,0000,0000,0000,,have this method? Well, the next step is we Dialogue: 0,0:17:57.55,0:17:59.47,Default,,0000,0000,0000,,have to build these indexes that we're actually\Ngonna Dialogue: 0,0:17:59.47,0:18:03.66,Default,,0000,0000,0000,,use to look up, you know, the final results. Dialogue: 0,0:18:03.66,0:18:05.14,Default,,0000,0000,0000,,And so for that, we're gonna use redis. 
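The recap above (normalize, split into words, each_cons(n), join) can be written out roughly as follows. This is a sketch, not the code from the repo: in the talk, ngrams is an instance method on the Talk model and reads the talk's abstract, whereas this standalone version takes the text as an argument (an assumption made so it runs outside Rails).

```ruby
# Standalone sketch of the ngrams calculation walked through in the talk.
# Assumption: text is passed in directly instead of read from a Talk record.
def ngrams(text, n)
  words = text.downcase.gsub(/[^\w\s]/, "").split # normalize, then split on whitespace
  words.each_cons(n).map { |group| group.join(" ") } # every run of n consecutive words
end

p ngrams("This talk is boring.", 1)
# prints: ["this", "talk", "is", "boring"]

p ngrams("This talk is boring.", 2)
# prints: ["this talk", "talk is", "is boring"]
```

If the text has fewer than n words, each_cons yields nothing and the result is an empty array.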
Dialogue: 0,0:18:05.14,0:18:07.18,Default,,0000,0000,0000,,Now, we don't have sort of enough time to Dialogue: 0,0:18:07.18,0:18:10.99,Default,,0000,0000,0000,,really get totally into the details of redis.\NBut, Dialogue: 0,0:18:10.99,0:18:12.04,Default,,0000,0000,0000,,you know, the, the thing that we're really\Ngonna Dialogue: 0,0:18:12.04,0:18:14.76,Default,,0000,0000,0000,,use is the, the sorted set data structure,\Nwhich Dialogue: 0,0:18:14.76,0:18:16.44,Default,,0000,0000,0000,,I'd definitely encourage you to check out.\NIt's a Dialogue: 0,0:18:16.44,0:18:19.16,Default,,0000,0000,0000,,great data structure. Great feature of redis.\NAnd so Dialogue: 0,0:18:19.16,0:18:20.21,Default,,0000,0000,0000,,what is a sorted set? Dialogue: 0,0:18:20.21,0:18:22.73,Default,,0000,0000,0000,,Well, it's got the word set in it, so Dialogue: 0,0:18:22.73,0:18:24.72,Default,,0000,0000,0000,,that tells you something. It's, you know,\Nunique elements. Dialogue: 0,0:18:24.72,0:18:27.06,Default,,0000,0000,0000,,And the, the neat feature of a sorted set Dialogue: 0,0:18:27.06,0:18:28.99,Default,,0000,0000,0000,,is that each element in the set also has Dialogue: 0,0:18:28.99,0:18:32.36,Default,,0000,0000,0000,,a score associated with it. So the way we Dialogue: 0,0:18:32.36,0:18:34.67,Default,,0000,0000,0000,,can use this is, remember, again, the question\NI'm Dialogue: 0,0:18:34.67,0:18:36.61,Default,,0000,0000,0000,,gonna answer is, like, you know, if someone\Nsearches Dialogue: 0,0:18:36.61,0:18:38.56,Default,,0000,0000,0000,,for Ember, you know, how many times was Ember Dialogue: 0,0:18:38.56,0:18:40.43,Default,,0000,0000,0000,,mentioned in 2007. How many times was it mentioned Dialogue: 0,0:18:40.43,0:18:42.17,Default,,0000,0000,0000,,in 2008. How many times was it mentioned in Dialogue: 0,0:18:42.17,0:18:42.66,Default,,0000,0000,0000,,2009? 
Dialogue: 0,0:18:42.66,0:18:44.61,Default,,0000,0000,0000,,So we're gonna have one sorted set for each Dialogue: 0,0:18:44.61,0:18:47.70,Default,,0000,0000,0000,,year, where the members of the sorted set\Nare Dialogue: 0,0:18:47.70,0:18:50.26,Default,,0000,0000,0000,,all the words and phrases that appeared in\NRailsConf Dialogue: 0,0:18:50.26,0:18:54.10,Default,,0000,0000,0000,,talks, and the scores are the number of times Dialogue: 0,0:18:54.10,0:18:56.42,Default,,0000,0000,0000,,that those ngrams appeared. Dialogue: 0,0:18:56.42,0:18:58.40,Default,,0000,0000,0000,,And then, you know, redis is very efficient\Nabout Dialogue: 0,0:18:58.40,0:19:00.25,Default,,0000,0000,0000,,this zscore method. You can look up. It's\Nlike Dialogue: 0,0:19:00.25,0:19:02.59,Default,,0000,0000,0000,,this command right here would say, OK, in\Nthe Dialogue: 0,0:19:02.59,0:19:05.99,Default,,0000,0000,0000,,sorted set for 2014, get me the score associated Dialogue: 0,0:19:05.99,0:19:09.25,Default,,0000,0000,0000,,with the member ember. And that's gonna tell\Nyou, Dialogue: 0,0:19:09.25,0:19:11.56,Default,,0000,0000,0000,,you know, some number. Like, three or whatever.\NIs Dialogue: 0,0:19:11.56,0:19:14.34,Default,,0000,0000,0000,,the number of times it gets mentioned. Dialogue: 0,0:19:14.34,0:19:15.84,Default,,0000,0000,0000,,So what we have to do is build these Dialogue: 0,0:19:15.84,0:19:18.80,Default,,0000,0000,0000,,sorted sets. One for each year again. And\Nagain Dialogue: 0,0:19:18.80,0:19:23.59,Default,,0000,0000,0000,,I have an empty method called generate_ngram_data_by_year.\NSo iterate Dialogue: 0,0:19:23.59,0:19:26.11,Default,,0000,0000,0000,,through all talks from a given year, you know, Dialogue: 0,0:19:26.11,0:19:27.39,Default,,0000,0000,0000,,calculate the ngram counts and add it to the Dialogue: 0,0:19:27.39,0:19:29.94,Default,,0000,0000,0000,,appropriate redis sorted set. So let's write\Nthat. 
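To make the sorted-set idea concrete without a running Redis server, here is a small in-memory model of the two commands the talk relies on. FakeSortedSets is a hypothetical class invented for this illustration; it only mimics the argument order and return behavior of ZINCRBY (key, increment, member) and ZSCORE (key, member), and it is not how Redis stores the data internally.

```ruby
# Hypothetical in-memory stand-in for Redis sorted sets (illustration only).
class FakeSortedSets
  def initialize
    @sets = {} # key (a year) => { member (an ngram) => score }
  end

  # Like ZINCRBY key increment member: add to a member's score, creating it if absent.
  def zincrby(key, increment, member)
    set = (@sets[key] ||= {})
    set[member] = set.fetch(member, 0.0) + increment
  end

  # Like ZSCORE key member: the member's score, or nil when it is not in the set.
  def zscore(key, member)
    @sets.fetch(key, {})[member]
  end
end

store = FakeSortedSets.new
3.times { store.zincrby(2014, 1, "ember") }
p store.zscore(2014, "ember")    # prints: 3.0
p store.zscore(2014, "backbone") # prints: nil
```

Each member appears once per set, and its score is just a running count, which is all the per-year lookup needs.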
Dialogue: 0,0:19:29.94,0:19:32.45,Default,,0000,0000,0000,,So one thing we need to do is make Dialogue: 0,0:19:32.45,0:19:34.46,Default,,0000,0000,0000,,sure we're not double counting. So if we have Dialogue: 0,0:19:34.46,0:19:37.24,Default,,0000,0000,0000,,an old sorted set sitting around, let's delete\Nit. Dialogue: 0,0:19:37.24,0:19:40.21,Default,,0000,0000,0000,,So let's, redis.delete year. We need to decide\Nwhat Dialogue: 0,0:19:40.21,0:19:43.46,Default,,0000,0000,0000,,values of n we're gonna use. So let's just Dialogue: 0,0:19:43.46,0:19:46.21,Default,,0000,0000,0000,,say one, two, and three, meaning we're gonna\Ncalculate Dialogue: 0,0:19:46.21,0:19:48.19,Default,,0000,0000,0000,,all the one grams, two grams, three grams.\NAnything Dialogue: 0,0:19:48.19,0:19:49.70,Default,,0000,0000,0000,,longer than that and it's sort of, like, what's Dialogue: 0,0:19:49.70,0:19:51.74,Default,,0000,0000,0000,,even the point. You're getting into pretty\Nspecific sentences. Dialogue: 0,0:19:51.74,0:19:53.11,Default,,0000,0000,0000,,There's not gonna be a lot of repetition. Dialogue: 0,0:19:53.11,0:19:55.79,Default,,0000,0000,0000,,So now we need to iterate through each talk Dialogue: 0,0:19:55.79,0:20:02.79,Default,,0000,0000,0000,,for the given years. Where(:year => year).find_each.\NAnd then Dialogue: 0,0:20:05.79,0:20:07.86,Default,,0000,0000,0000,,for each talk we need to iterate through each Dialogue: 0,0:20:07.86,0:20:14.33,Default,,0000,0000,0000,,value of n. And then for each value of Dialogue: 0,0:20:14.33,0:20:15.61,Default,,0000,0000,0000,,n, what do we need to do? We need Dialogue: 0,0:20:15.61,0:20:17.48,Default,,0000,0000,0000,,to calculate the ngram, so do talk dot ngrams. Dialogue: 0,0:20:17.48,0:20:19.06,Default,,0000,0000,0000,,This is the method we just wrote. We're gonna Dialogue: 0,0:20:19.06,0:20:19.99,Default,,0000,0000,0000,,pass it n. Dialogue: 0,0:20:19.99,0:20:22.65,Default,,0000,0000,0000,,Do |ngram|. 
Dialogue: 0,0:20:22.65,0:20:26.49,Default,,0000,0000,0000,,And then finally, we're going to add this\Nto Dialogue: 0,0:20:26.49,0:20:29.33,Default,,0000,0000,0000,,the relevant redis sorted set. So the command\Nfor Dialogue: 0,0:20:29.33,0:20:30.05,Default,,0000,0000,0000,,that is redis.zincrby. Dialogue: 0,0:20:30.05,0:20:34.67,Default,,0000,0000,0000,,And this goes, you give it a year, you Dialogue: 0,0:20:34.67,0:20:38.77,Default,,0000,0000,0000,,give it a number, like one, and you give Dialogue: 0,0:20:38.77,0:20:40.32,Default,,0000,0000,0000,,it what are you incrementing. Dialogue: 0,0:20:40.32,0:20:42.78,Default,,0000,0000,0000,,OK. So let's look at this method now. We're Dialogue: 0,0:20:42.78,0:20:45.02,Default,,0000,0000,0000,,gonna take, give it a year. We're gonna go Dialogue: 0,0:20:45.02,0:20:48.42,Default,,0000,0000,0000,,through every talk from that year. We're gonna\Ngo Dialogue: 0,0:20:48.42,0:20:50.63,Default,,0000,0000,0000,,through values of n, which is one, two and Dialogue: 0,0:20:50.63,0:20:53.20,Default,,0000,0000,0000,,three, so let's say one, OK. Get the talk. Dialogue: 0,0:20:53.20,0:20:55.29,Default,,0000,0000,0000,,Calculate all of its one grams. And then for Dialogue: 0,0:20:55.29,0:20:59.15,Default,,0000,0000,0000,,each one gram, add to the year sorted set Dialogue: 0,0:20:59.15,0:21:02.87,Default,,0000,0000,0000,,the value of one for that ngram. And then Dialogue: 0,0:21:02.87,0:21:05.14,Default,,0000,0000,0000,,do that just a bunch of times. Dialogue: 0,0:21:05.14,0:21:07.55,Default,,0000,0000,0000,,So let's see if this works. Dialogue: 0,0:21:07.55,0:21:14.48,Default,,0000,0000,0000,,Let's reload. Again to prove I'm not lying.\NThere's Dialogue: 0,0:21:14.48,0:21:21.36,Default,,0000,0000,0000,,nothing in redis at the moment. Oops. Gotta\Ndo Dialogue: 0,0:21:21.36,0:21:22.42,Default,,0000,0000,0000,,talk. Dialogue: 0,0:21:22.42,0:21:29.42,Default,,0000,0000,0000,,Let's worry about those Delayed::Jobs. Perfect.\NDrink break. 
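The loop just described (each talk in a year, each value of n, each ngram, increment by one) can be sketched without Rails or Redis. Everything here is a stand-in: TALKS is made-up sample data, a plain Hash replaces the Redis sorted set, and the method name ngram_counts_for_year is hypothetical. In the talk's version, the counts[ngram] += 1 line corresponds to the zincrby call with an increment of one.

```ruby
# Made-up sample data standing in for Talk records (not real abstracts).
TALKS = [
  { year: 2014, abstract: "Ember and Rails. Ember is fun." },
  { year: 2013, abstract: "Rails is fun." }
].freeze

def ngrams(text, n)
  words = text.downcase.gsub(/[^\w\s]/, "").split
  words.each_cons(n).map { |group| group.join(" ") }
end

# Count every 1-, 2- and 3-gram across all talks from one year.
def ngram_counts_for_year(talks, year, ns = [1, 2, 3])
  counts = Hash.new(0) # starting from empty plays the role of deleting the old set
  talks.select { |talk| talk[:year] == year }.each do |talk|
    ns.each do |n|
      ngrams(talk[:abstract], n).each { |ngram| counts[ngram] += 1 }
    end
  end
  counts
end

counts = ngram_counts_for_year(TALKS, 2014)
p counts["ember"]    # prints: 2
p counts["ember is"] # prints: 1
p counts["rails is"] # prints: 0
```

Because punctuation is stripped before splitting, ngrams can span sentence boundaries, so "rails ember" is counted in the 2014 sample even though a period separates the two words.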
Dialogue: 0,0:21:30.42,0:21:33.02,Default,,0000,0000,0000,,So it's going through each year now. And each Dialogue: 0,0:21:33.02,0:21:34.56,Default,,0000,0000,0000,,talk in each year, counting up all the words Dialogue: 0,0:21:34.56,0:21:39.49,Default,,0000,0000,0000,,and phrases and building our sorted sets.\NAnd it Dialogue: 0,0:21:39.49,0:21:40.44,Default,,0000,0000,0000,,is done. Dialogue: 0,0:21:40.44,0:21:43.05,Default,,0000,0000,0000,,So let's see what we got in here now. Dialogue: 0,0:21:43.05,0:21:46.78,Default,,0000,0000,0000,,OK, cool. So we got these keys. Let's, let's Dialogue: 0,0:21:46.78,0:21:48.04,Default,,0000,0000,0000,,look into one of these. One of the nice Dialogue: 0,0:21:48.04,0:21:49.61,Default,,0000,0000,0000,,things about the sorted set is you can, of Dialogue: 0,0:21:49.61,0:21:52.91,Default,,0000,0000,0000,,course, sort by it. And so the command here Dialogue: 0,0:21:52.91,0:21:55.95,Default,,0000,0000,0000,,is zrevrange. So we can do the 2014 sorted Dialogue: 0,0:21:55.95,0:21:58.87,Default,,0000,0000,0000,,set. So this is gonna give us the top Dialogue: 0,0:21:58.87,0:22:01.47,Default,,0000,0000,0000,,ten, or actually eleven, top eleven, you know,\Nngrams Dialogue: 0,0:22:01.47,0:22:03.91,Default,,0000,0000,0000,,in 2014. So let's see. Dialogue: 0,0:22:03.91,0:22:09.09,Default,,0000,0000,0000,,And we can actually add :with_scores => true.\NSo Dialogue: 0,0:22:09.09,0:22:11.76,Default,,0000,0000,0000,,the most common words and phrases from 2014\NRailsConf Dialogue: 0,0:22:11.76,0:22:16.64,Default,,0000,0000,0000,,talk abstracts. Not very surprising. The,\Nto, and, a, Dialogue: 0,0:22:16.64,0:22:20.20,Default,,0000,0000,0000,,of, in, you, how. Rails. OK. Rails makes the Dialogue: 0,0:22:20.20,0:22:21.11,Default,,0000,0000,0000,,number ten. Dialogue: 0,0:22:21.11,0:22:23.52,Default,,0000,0000,0000,,So there you go.
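The "top ten, or actually eleven" remark follows from how ZREVRANGE treats its range: both the start and stop indexes are inclusive, so 0 through 10 returns eleven members. A plain-Ruby sketch of that ranking, using a Hash with invented counts in place of the real 2014 sorted set, might look like this. The helper name top_ngrams is hypothetical.

```ruby
# Invented counts for illustration; these are not the real 2014 numbers.
COUNTS = { "the" => 40, "rails" => 12, "to" => 31, "ember" => 3 }.freeze

# Rough equivalent of ZREVRANGE key start stop WITHSCORES:
# highest score first, with start and stop both inclusive.
def top_ngrams(counts, start, stop)
  counts.sort_by { |_ngram, score| -score }[start..stop]
end

p top_ngrams(COUNTS, 0, 1)
# prints: [["the", 40], ["to", 31]]

p top_ngrams(COUNTS, 0, 10).length
# prints: 4
```

The sample avoids ties on purpose: Redis breaks ties between equal scores lexicographically, which this sort_by sketch does not reproduce.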
Dialogue: 0,0:22:23.52,0:22:25.25,Default,,0000,0000,0000,,Now we can also, let's just have a little Dialogue: 0,0:22:25.25,0:22:28.37,Default,,0000,0000,0000,,fun here. See what some of the sort top Dialogue: 0,0:22:28.37,0:22:30.48,Default,,0000,0000,0000,,non-trivial ones are. Obviously you could\Nwrite some code, Dialogue: 0,0:22:30.48,0:22:32.95,Default,,0000,0000,0000,,maybe kill stop words. Stuff like that. If\Nyou Dialogue: 0,0:22:32.95,0:22:34.69,Default,,0000,0000,0000,,don't care about them. Dialogue: 0,0:22:34.69,0:22:40.33,Default,,0000,0000,0000,,But, so. Rails. Can code. This talk. Most\Npopular Dialogue: 0,0:22:40.33,0:22:44.62,Default,,0000,0000,0000,,two-word phrase. Pretty good. How to. Ruby\Ndevelopers. Eh, Dialogue: 0,0:22:44.62,0:22:46.40,Default,,0000,0000,0000,,this looks pretty, pretty relevant, right.\NI mean, these Dialogue: 0,0:22:46.40,0:22:51.22,Default,,0000,0000,0000,,are not words you'd be surprised to see in Dialogue: 0,0:22:51.22,0:22:53.29,Default,,0000,0000,0000,,a RailsConf talk abstract. Dialogue: 0,0:22:53.29,0:22:56.22,Default,,0000,0000,0000,,So those, you know, are the most common words. Dialogue: 0,0:22:56.22,0:22:57.29,Default,,0000,0000,0000,,So we now have this. We have this for Dialogue: 0,0:22:57.29,0:22:58.51,Default,,0000,0000,0000,,every year, by the way. So we can also Dialogue: 0,0:22:58.51,0:23:01.44,Default,,0000,0000,0000,,do something, this is the same thing for 2011. Dialogue: 0,0:23:01.44,0:23:04.28,Default,,0000,0000,0000,,Whatever. And the last piece of code we're\Ngoing Dialogue: 0,0:23:04.28,0:23:05.74,Default,,0000,0000,0000,,to write, is we need to be able to Dialogue: 0,0:23:05.74,0:23:06.77,Default,,0000,0000,0000,,query this data. 
Dialogue: 0,0:23:06.77,0:23:08.94,Default,,0000,0000,0000,,So, you know, the actual, sort of, website\Nor Dialogue: 0,0:23:08.94,0:23:11.59,Default,,0000,0000,0000,,finished product, you're gonna have to, you\Nknow, search Dialogue: 0,0:23:11.59,0:23:13.43,Default,,0000,0000,0000,,for a term. And you're gonna have to go Dialogue: 0,0:23:13.43,0:23:15.74,Default,,0000,0000,0000,,look up in your data, you know, what, what Dialogue: 0,0:23:15.74,0:23:19.34,Default,,0000,0000,0000,,are the relevant values for that term. Dialogue: 0,0:23:19.34,0:23:21.30,Default,,0000,0000,0000,,And so, how we're gonna do this. Well, the Dialogue: 0,0:23:21.30,0:23:23.50,Default,,0000,0000,0000,,first thing we gotta remember is that we normal- Dialogue: 0,0:23:23.50,0:23:27.41,Default,,0000,0000,0000,,remember we did this normalized_for_ngrams\Nthing. So Dialogue: 0,0:23:27.41,0:23:28.92,Default,,0000,0000,0000,,we have to do that again, because what if Dialogue: 0,0:23:28.92,0:23:31.10,Default,,0000,0000,0000,,someone searches for a capitalized word or\Nfor something Dialogue: 0,0:23:31.10,0:23:32.99,Default,,0000,0000,0000,,with punctuation. We have to process it the\Nexact Dialogue: 0,0:23:32.99,0:23:35.74,Default,,0000,0000,0000,,same way that we processed our input. Otherwise\Nit Dialogue: 0,0:23:35.74,0:23:38.89,Default,,0000,0000,0000,,won't match. So let's just do that. Dialogue: 0,0:23:38.89,0:23:42.81,Default,,0000,0000,0000,,And then we have this constant ALL_YEARS.\NAnd we're Dialogue: 0,0:23:42.81,0:23:45.95,Default,,0000,0000,0000,,gonna iterate through that with each_with_object\Nwith a Dialogue: 0,0:23:45.95,0:23:47.30,Default,,0000,0000,0000,,hash. Let's just build up a hash. That's probably Dialogue: 0,0:23:47.30,0:23:52.00,Default,,0000,0000,0000,,the easy way to do it. Do |year, hash|. Dialogue: 0,0:23:52.00,0:23:57.55,Default,,0000,0000,0000,,And the, the relevant redis command, again,\Nis zscore.
Dialogue: 0,0:23:57.55,0:24:03.70,Default,,0000,0000,0000,,So we can do redis dot zscore(). We're gonna Dialogue: 0,0:24:03.70,0:24:05.87,Default,,0000,0000,0000,,look up in the hash for that year, the Dialogue: 0,0:24:05.87,0:24:08.47,Default,,0000,0000,0000,,term. And we need to put this actually in Dialogue: 0,0:24:08.47,0:24:13.74,Default,,0000,0000,0000,,the hash. And so, and then we need to Dialogue: 0,0:24:13.74,0:24:16.29,Default,,0000,0000,0000,,to_i that in case it's nil. Dialogue: 0,0:24:16.29,0:24:19.10,Default,,0000,0000,0000,,OK. So this now, what does this say? ALL_YEARS Dialogue: 0,0:24:19.10,0:24:22.86,Default,,0000,0000,0000,,is just, you know, 2007 through 2014. Go through Dialogue: 0,0:24:22.86,0:24:25.89,Default,,0000,0000,0000,,each of those years. And then build up our Dialogue: 0,0:24:25.89,0:24:27.61,Default,,0000,0000,0000,,hash so that the hash, the key of the Dialogue: 0,0:24:27.61,0:24:30.50,Default,,0000,0000,0000,,year, maps to the value of, you know, the Dialogue: 0,0:24:30.50,0:24:33.89,Default,,0000,0000,0000,,number of times that term appeared in that\Nyear. Dialogue: 0,0:24:33.89,0:24:38.18,Default,,0000,0000,0000,,So let's, again, see if that works. Talk dot Dialogue: 0,0:24:38.18,0:24:43.64,Default,,0000,0000,0000,,query, you know, ruby or something. Cool.\NSo in Dialogue: 0,0:24:43.64,0:24:47.33,Default,,0000,0000,0000,,2007 it was mentioned 52 times, 2014 22 times. Dialogue: 0,0:24:47.33,0:24:50.23,Default,,0000,0000,0000,,Whatever. We can, I guess, we said Ember originally. Dialogue: 0,0:24:50.23,0:24:54.31,Default,,0000,0000,0000,,And there you go. It was not mentioned until Dialogue: 0,0:24:54.31,0:24:58.37,Default,,0000,0000,0000,,this year. Which is also kind of telling. Dialogue: 0,0:24:58.37,0:25:01.69,Default,,0000,0000,0000,,And so this is basically, you know, all of Dialogue: 0,0:25:01.69,0:25:04.10,Default,,0000,0000,0000,,the kind of step two code you need. 
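Putting the query steps together (normalize the search term the same way as the input, walk ALL_YEARS with each_with_object, look up the score, and to_i it so a missing term becomes zero) gives something like the sketch below. SCORES_BY_YEAR is a hypothetical stand-in for the Redis sorted sets, filled in for only two years using the figures mentioned in the talk; the other years are left empty purely to show the nil-to-zero behavior and are not real results.

```ruby
ALL_YEARS = (2007..2014).to_a

# Stand-in for the per-year sorted sets. Only 2007 and 2014 are filled in,
# using numbers quoted in the talk; the remaining years are deliberately empty.
SCORES_BY_YEAR = {
  2007 => { "ruby" => 52.0 },
  2014 => { "ruby" => 22.0, "ember" => 3.0 }
}.freeze

def query(term, scores_by_year = SCORES_BY_YEAR)
  # Process the search term exactly like the indexed text, or it will not match.
  normalized = term.downcase.gsub(/[^\w\s]/, "").split.join(" ")
  ALL_YEARS.each_with_object({}) do |year, hash|
    score = scores_by_year.fetch(year, {})[normalized] # plays the role of zscore
    hash[year] = score.to_i                            # nil.to_i is 0
  end
end

result = query("Ember!")
p result[2014] # prints: 3
p result[2008] # prints: 0
p result.keys == ALL_YEARS # prints: true
```

Returning a zero for every year, rather than skipping years with no match, keeps the result in a shape a chart can plot directly.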
That's Dialogue: 0,0:25:04.10,0:25:06.84,Default,,0000,0000,0000,,sort of the ngram calculation, store the results.\NAnd Dialogue: 0,0:25:06.84,0:25:09.84,Default,,0000,0000,0000,,again, I reiterate, like, everything we just\Ndid, is Dialogue: 0,0:25:09.84,0:25:12.83,Default,,0000,0000,0000,,kind of trivially simple. There's no fancy\Nalgorithms. It's Dialogue: 0,0:25:12.83,0:25:15.22,Default,,0000,0000,0000,,just counting, you know, putting stuff in\Nthe right Dialogue: 0,0:25:15.22,0:25:17.17,Default,,0000,0000,0000,,data structure. Accessing it in sort of the\Nright Dialogue: 0,0:25:17.17,0:25:18.27,Default,,0000,0000,0000,,way. Dialogue: 0,0:25:18.27,0:25:20.94,Default,,0000,0000,0000,,And I just think there's something like pretty,\Nyou Dialogue: 0,0:25:20.94,0:25:23.18,Default,,0000,0000,0000,,know, insightful about that, that you don't\Nneed to Dialogue: 0,0:25:23.18,0:25:26.39,Default,,0000,0000,0000,,do fancy things all the time. And that often Dialogue: 0,0:25:26.39,0:25:28.59,Default,,0000,0000,0000,,the kind of the coolest results will come\Nfrom Dialogue: 0,0:25:28.59,0:25:30.75,Default,,0000,0000,0000,,something simple. Dialogue: 0,0:25:30.75,0:25:31.77,Default,,0000,0000,0000,,And so, as I said, the last thing we're Dialogue: 0,0:25:31.77,0:25:33.14,Default,,0000,0000,0000,,gonna do here is create this nice front end Dialogue: 0,0:25:33.14,0:25:35.97,Default,,0000,0000,0000,,interface that lets us investigate the results.\NYou know, Dialogue: 0,0:25:35.97,0:25:37.99,Default,,0000,0000,0000,,unfortunately, we don't really have time to\Nget into Dialogue: 0,0:25:37.99,0:25:40.32,Default,,0000,0000,0000,,that. It is all on the GitHub. But, I Dialogue: 0,0:25:40.32,0:25:42.94,Default,,0000,0000,0000,,will tell you, I use pie charts as a Dialogue: 0,0:25:42.94,0:25:46.10,Default,,0000,0000,0000,,nice library, front-end library that makes\Nit very simple Dialogue: 0,0:25:46.10,0:25:47.45,Default,,0000,0000,0000,,to get charts up and running. 
It's actually\Nnot Dialogue: 0,0:25:47.45,0:25:48.42,Default,,0000,0000,0000,,that much code. Dialogue: 0,0:25:48.42,0:25:49.89,Default,,0000,0000,0000,,And I've done this already. So let's start\Nup Dialogue: 0,0:25:49.89,0:25:54.04,Default,,0000,0000,0000,,a server. And, oops. Let's fire up the localhost. Dialogue: 0,0:25:54.04,0:25:58.95,Default,,0000,0000,0000,,And so here we are. The abstractogram is our Dialogue: 0,0:25:58.95,0:26:00.01,Default,,0000,0000,0000,,app. So what are we, what are we gonna Dialogue: 0,0:26:00.01,0:26:01.08,Default,,0000,0000,0000,,search for here? Dialogue: 0,0:26:01.08,0:26:03.92,Default,,0000,0000,0000,,Let's see. I, you, we or something. And there Dialogue: 0,0:26:03.92,0:26:05.33,Default,,0000,0000,0000,,we go. So there, there it is. The number Dialogue: 0,0:26:05.33,0:26:08.73,Default,,0000,0000,0000,,of times the word you appears in each year. Dialogue: 0,0:26:08.73,0:26:11.10,Default,,0000,0000,0000,,Looks pretty flat. So, you know, the, these\Nare Dialogue: 0,0:26:11.10,0:26:13.10,Default,,0000,0000,0000,,kind of constant. Anyone have any, anything\Nelse they Dialogue: 0,0:26:13.10,0:26:15.54,Default,,0000,0000,0000,,want to search for? Let's try ember, backbone. Dialogue: 0,0:26:15.54,0:26:19.37,Default,,0000,0000,0000,,All right. Let's say, we got, Postgres I heard. Dialogue: 0,0:26:19.37,0:26:24.11,Default,,0000,0000,0000,,All right. I guess we could all say, let's Dialogue: 0,0:26:24.11,0:26:28.64,Default,,0000,0000,0000,,say SQL. No one cares about Postgres this year. Dialogue: 0,0:26:28.64,0:26:32.70,Default,,0000,0000,0000,,Service. SOA. Oh, there is sort of a rising Dialogue: 0,0:26:32.70,0:26:35.85,Default,,0000,0000,0000,,trend of service-oriented architecture. Dialogue: 0,0:26:35.85,0:26:36.32,Default,,0000,0000,0000,,Anything else? Dialogue: 0,0:26:36.32,0:26:41.42,Default,,0000,0000,0000,,TDD. That's a good one. TDD. Testing. Test-driven,\Nhow Dialogue: 0,0:26:41.42,0:26:48.42,Default,,0000,0000,0000,,about.
So there we go. I'm sorry? Dialogue: 0,0:26:48.91,0:26:53.74,Default,,0000,0000,0000,,Rest. That's a trick one though, cause rest\Nis Dialogue: 0,0:26:53.74,0:26:55.48,Default,,0000,0000,0000,,also like a real word that, you know, like, Dialogue: 0,0:26:55.48,0:26:57.44,Default,,0000,0000,0000,,the rest of the time will be something else. Dialogue: 0,0:26:57.44,0:27:04.15,Default,,0000,0000,0000,,And. Refactor. Let's see. Ooh. That's a good\None. Dialogue: 0,0:27:04.15,0:27:09.63,Default,,0000,0000,0000,,DHH. Wow. Peaked 2011, peak DHH. Let's see,\Nwe Dialogue: 0,0:27:09.63,0:27:11.57,Default,,0000,0000,0000,,got, Heroku is a good one. On the rise. Dialogue: 0,0:27:11.57,0:27:13.70,Default,,0000,0000,0000,,I like we can just look at Ruby and Dialogue: 0,0:27:13.70,0:27:15.41,Default,,0000,0000,0000,,Rails. This is actually, I think, pretty relevant.\NIt's Dialogue: 0,0:27:15.41,0:27:18.98,Default,,0000,0000,0000,,like, what are people talking about? Not Rails\Nanymore. Dialogue: 0,0:27:18.98,0:27:20.27,Default,,0000,0000,0000,,We got to find something new to talk about. Dialogue: 0,0:27:20.27,0:27:22.73,Default,,0000,0000,0000,,You know, it's like, too many RailsConfs.\NAnd, in Dialogue: 0,0:27:22.73,0:27:25.35,Default,,0000,0000,0000,,fact, this actually came up at the, you know, Dialogue: 0,0:27:25.35,0:27:27.12,Default,,0000,0000,0000,,there was a speaker meeting, whatever, and\Neveryone was Dialogue: 0,0:27:27.12,0:27:29.49,Default,,0000,0000,0000,,talking about how, you know, their talks weren't\Nactually Dialogue: 0,0:27:29.49,0:27:30.60,Default,,0000,0000,0000,,about Rails. 
Dialogue: 0,0:27:30.60,0:27:32.88,Default,,0000,0000,0000,,And, you know, maybe this is actually an insightful Dialogue: 0,0:27:32.88,0:27:35.64,Default,,0000,0000,0000,,statement, that, you know, the, the community\Nhas obviously Dialogue: 0,0:27:35.64,0:27:37.71,Default,,0000,0000,0000,,gotten very large and there's just a ton of Dialogue: 0,0:27:37.71,0:27:38.35,Default,,0000,0000,0000,,other stuff to talk about. People have been\Ntalking Dialogue: 0,0:27:38.35,0:27:41.30,Default,,0000,0000,0000,,about Rails for a long time. And so, you Dialogue: 0,0:27:41.30,0:27:42.91,Default,,0000,0000,0000,,know, here I am giving a talk that's not Dialogue: 0,0:27:42.91,0:27:46.06,Default,,0000,0000,0000,,really directly about Rails. But, so maybe\Nthis is Dialogue: 0,0:27:46.06,0:27:47.37,Default,,0000,0000,0000,,like a real trend that people are just finding Dialogue: 0,0:27:47.37,0:27:49.04,Default,,0000,0000,0000,,other stuff to talk about. Dialogue: 0,0:27:49.04,0:27:53.08,Default,,0000,0000,0000,,And that is pretty cool. So I promised that Dialogue: 0,0:27:53.08,0:27:56.47,Default,,0000,0000,0000,,I would show you the repo or whatever on Dialogue: 0,0:27:56.47,0:27:59.61,Default,,0000,0000,0000,,GitHub. You can just do bit.ly slash railsconfdata.\NIt's Dialogue: 0,0:27:59.61,0:28:02.06,Default,,0000,0000,0000,,just the code. Everything we've looked at\Ntoday. Plus Dialogue: 0,0:28:02.06,0:28:04.42,Default,,0000,0000,0000,,some more stuff. It's actually running live\Non the Dialogue: 0,0:28:04.42,0:28:07.40,Default,,0000,0000,0000,,internet at abstractogram dot herokuapp dot\Ncom. Dialogue: 0,0:28:07.40,0:28:09.68,Default,,0000,0000,0000,,I figure the internet's probably not working,\Nbut let's Dialogue: 0,0:28:09.68,0:28:16.68,Default,,0000,0000,0000,,see. Yup. Classic. And, you know, otherwise\Nthat is Dialogue: 0,0:28:16.81,0:28:19.65,Default,,0000,0000,0000,,it. And thank you for listening. 
And I think Dialogue: 0,0:28:19.65,0:28:20.45,Default,,0000,0000,0000,,we have time for questions.