English subtitles

Aude Explores Coordinated Migration - Data Analysis with R


Showing Revision 3 created 05/24/2016 by Udacity Robot.

  1. We want you to develop a mindset of being both curious and skeptical when
  2. you work with data. To help you get into this mindset, I want to share
  3. another conversation that I had with Aude. In this next video, I want you
  4. to listen to Aude's work and look
  5. out for how she demonstrated this exact mindset.
  6. >> So we gathered all the hometowns and current cities from the
  7. users, and I was looking at conditional probabilities given
  8. a hometown: what is the probability that you currently live
  9. in each of the different cities? For example, given that your
  10. hometown is New York, what is the probability that you
  11. live in Chicago, or that you
  12. still live in New York, or that you live in
  13. San Francisco or Paris, and so on? And what I
  14. was expecting is that, at least, the most likely city
  15. where you would live right now would be the city where
  16. your hometown is. If you grew up in
  17. Chicago, the most likely place that you're going to
  18. be now is still Chicago. You could have
  19. moved as well, but the most likely place would remain your
  20. hometown. But I saw a fair number of cases
  21. where the most likely current city was different from
  22. the hometown, and that was with a
  23. fairly high probability. I was really surprised. I was wondering
  24. if I had a problem in my computations, if there were some issues upstream of
  25. what I was doing. So I decided to put all the cities on a map: all the pairs of
  26. hometowns and current cities for which the most
  27. likely current city was different from the hometown. And
  28. what we saw on this map was really fascinating
  29. because it was really not what we were expecting.
  30. It was not a bug in the code. We were really seeing patterns arise. Here we only
  31. plotted pairs of hometown and current city, so
  32. there's no movement between the pairs but what we
  33. see is that a lot of these cities for which the most likely
  34. current city is different from the hometown are in western Africa, or in
  35. India, or in Turkey, which we were not
  36. necessarily expecting at the beginning. And there were a
  37. lot of small cities all moving to the same current
  38. city and so we decided to dig a bit
  39. more into it. One thing that happens is that sometimes
  40. the distribution of the current city is very flat.
  41. Given that you grew up in, let's say Paris, maybe
  42. you're still living in Paris but maybe you live in
  43. like one of the thousand cities in the suburbs and
  44. so the distribution is really flat and so we
  45. had to decide what was considered a coordinated
  46. migration. We decided, yes, the probability of moving to
  47. that city is high enough that we're considering it.
  48. And the other thing we have to think about is that if you look at the map at
  49. the world scale or if you zoom to a
  50. very specific area, you don't want to see the same things.
  51. So we also wanted to have interactivity in the visualization. And so we
  52. decided to use D3, which is a JavaScript-based visualization framework that
  53. enables you to have a lot of interactivity with your data,
  54. and it enabled us to do a lot of that exploration and so on.
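
The analysis Aude describes (estimating the probability of each current city given a hometown, then keeping the hometowns whose most likely current city is a different city with a high enough probability) can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the video: the city pairs are invented toy data, the function names are hypothetical, and the 0.5 cutoff is an assumed value, since the transcript does not say which threshold was actually chosen.

```python
from collections import Counter, defaultdict


def conditional_current_city(pairs):
    """Estimate P(current city | hometown) from (hometown, current_city) pairs."""
    counts = defaultdict(Counter)
    for hometown, current in pairs:
        counts[hometown][current] += 1

    probs = {}
    for hometown, counter in counts.items():
        total = sum(counter.values())
        probs[hometown] = {city: n / total for city, n in counter.items()}
    return probs


def flag_hometowns(probs, min_prob=0.5):
    """Return hometowns whose most likely current city is a different city.

    min_prob is an assumed cutoff: it filters out flat distributions, where
    the top city wins with only a small share of the users.
    """
    flagged = {}
    for hometown, dist in probs.items():
        top_city, top_p = max(dist.items(), key=lambda kv: kv[1])
        if top_city != hometown and top_p >= min_prob:
            flagged[hometown] = (top_city, top_p)
    return flagged


# Invented toy data: one (hometown, current city) pair per user.
pairs = (
    [("Chicago", "Chicago")] * 3
    + [("Chicago", "New York")]
    + [("Village A", "Big City")] * 3
    + [("Village A", "Village A")]
    + [("Village B", "City X")] * 2
    + [("Village B", "City Y"), ("Village B", "City Z"), ("Village B", "Village B")]
)

probs = conditional_current_city(pairs)
flagged = flag_hometowns(probs, min_prob=0.5)
print(flagged)
```

With this toy data only "Village A" is flagged. "Village B" also has a most likely current city that differs from the hometown, but its distribution is flat (the top city holds only 40% of the users), so it falls below the assumed cutoff. Choosing that cutoff is the judgment call Aude mentions when deciding what counts as a coordinated migration.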