0:00:00.156,0:00:03.683 This is The Rundown, I'm Hari Sreenivasan, we're talking about words today. 0:00:03.683,0:00:09.732 Joining me now is lexicographer Erin McKean, she's the CEO and founder of Wordnik.com. 0:00:09.732,0:00:10.885 Thanks for joining us. 0:00:10.885,0:00:12.535 You're very welcome. Thank you. 0:00:12.535,0:00:18.356 Google recently launched a kind of a website or a database if you will, along with some folks at Harvard— 0:00:18.356,0:00:23.531 the NGRAM, which allows people to search for words through hundreds and hundreds and thousands 0:00:23.531,0:00:28.427 of books and periodicals and so forth that have gone back for decades. 0:00:28.427,0:00:30.250 What did you do when you first heard about it? 0:00:30.250,0:00:36.033 We were very excited when we realized that Google was releasing the NGRAM data under a very open license 0:00:36.033,0:00:41.181 because it means that lots of people can take that data and try and do cool things with it. 0:00:41.181,0:00:44.957 And of course at Wordnik, we're all about trying to do cool things with words. 0:00:44.957,0:00:50.320 And so the data is based on something like 5 percent of the Google Books corpus, 0:00:50.320,0:00:53.878 which is not a lot, but it's a lot of words. 0:00:53.878,0:00:59.966 What does it teach you about the English language to have access to the occurrence of words over time? 0:00:59.966,0:01:07.814 Right now, you can think of the kind of science behind the NGRAM viewer as like what, 0:01:07.814,0:01:10.696 say, early antibiotics were like. 0:01:10.696,0:01:16.173 They aren't very targeted, so you can't really tell the difference between, say, 0:01:16.173,0:01:19.151 the word "pretty" when it means "good looking" 0:01:19.151,0:01:23.670 versus the word "pretty" when it's in a construction, like, "That was a pretty neat thing." 0:01:23.670,0:01:29.233 Are we using more new words now? Is the rate of the English language's growth increasing?
0:01:29.233,0:01:36.768 Right now we can measure it better than we ever have been able to do before, so in the paper that the 0:01:36.768,0:01:40.232 researchers from Google and from Harvard published in Science, 0:01:40.232,0:01:45.124 they were talking about that they notice more new words appearing over time. 0:01:45.124,0:01:49.897 And also something that I was very happy to have people from Google and Harvard backing me up on 0:01:49.897,0:01:54.082 that they estimated that 52 percent of the words that they looked at 0:01:54.082,0:01:56.752 were not in the dictionaries that they checked. 0:01:56.752,0:01:58.259 How is that even possible? 0:01:58.259,0:02:02.972 Well, there are lots and lots of words that happen just once, nonce words, 0:02:02.972,0:02:06.746 that if you are making a print dictionary you just don't have room to put them in. 0:02:06.746,0:02:09.724 And for someone who hasn't been to Wordnik, what's the difference between 0:02:09.724,0:02:12.963 Wordnik and going to one of the other online dictionaries? 0:02:12.963,0:02:17.691 So Wordnik has about six times as many words as most of the other online dictionaries. 0:02:17.691,0:02:22.994 So we show you as much information as we can about as many words as we can. 0:02:22.994,0:02:26.260 So if there's a traditional dictionary definition, we'll show you that. 0:02:26.260,0:02:29.264 But if we only have three really good sentences from say 0:02:29.264,0:02:32.842 the Wall Street Journal, or Forbes, or the Huffington Post, we'll show you that 0:02:32.842,0:02:38.824 and say, "Hey, real journalists are using this word. You can take their sentences as a model." 0:02:38.824,0:02:40.888 Since it is getting kind of close to the new year, 0:02:40.888,0:02:46.139 what are some of the top words of 2010 or 2011 that you're seeing? 
0:02:46.139,0:02:50.371 It's interesting, people always want to have the top words of the year, but usually 0:02:50.371,0:02:58.024 words kind of incubate underground like seeds for a while until they pop up into popular consciousness. 0:02:58.024,0:03:02.805 A couple of words that I've been really interested in lately are all kinda 0:03:02.805,0:03:08.343 negative technology consequences words, like geoslavery. 0:03:08.343,0:03:10.798 And what does geoslavery mean? 0:03:10.798,0:03:17.773 So geoslavery is the idea that with all the GPS functionality and tracking on people's cellphones 0:03:17.773,0:03:25.949 that abusive partners and spouses can use that data to keep tighter tabs on their partners, 0:03:25.949,0:03:29.189 with the idea that they're really trying to enforce behavior limits. 0:03:29.189,0:03:31.331 What else is popping up like a seed? 0:03:31.331,0:03:39.167 I really like the word aftercrimes, which is made by analogy to aftershocks. 0:03:39.167,0:03:43.869 So it's little crimes that pop up in an area after a major crime has occurred there. 0:03:43.869,0:03:49.381 So what's the end goal for Wordnik? Does it become the dictionary of choice for everyone? 0:03:49.381,0:03:52.803 We're trying to map the whole English language. 0:03:52.803,0:03:55.285 What we'd really like to be is GPS for words 0:03:55.285,0:03:58.707 and show you as much information about as many words as possible. 0:03:58.707,0:04:02.364 All right, Erin McKean, CEO and founder of Wordnik, lexicographer. 0:04:02.364,0:04:04.480 Thanks for joining us and happy wording. 0:04:04.480,0:04:06.100 Thanks so much. 0:04:06.100,0:04:09.123 I'm Hari Sreenivasan, this is The Rundown. Stay with us.