< Return to Video

https:/.../52f6698f-a7dd-4943-86eb-ad7501790760-fc4269eb-1399-4a33-9140-ad8c0105606b.mp4?invocationId=08872c6d-6303-ec11-a9e9-0a1a827ad0ec

  • Not Synced
    1
    00:00:04,700 --> 00:00:08,900
    Hello and welcome to see us five thirty three, introduction to data science.
  • Not Synced
    2
    00:00:08,900 --> 00:00:12,650
    This is the introduction video that's going to give you an overview of what it is going to we're going
  • Not Synced
    3
    00:00:12,650 --> 00:00:17,960
    to be talking about this semester and provide you with guidance for how to get started with the course.
  • Not Synced
    4
    00:00:17,960 --> 00:00:21,650
    Are learning outcomes for this video are to introduce the class subject.
  • Not Synced
    5
    00:00:21,650 --> 00:00:24,290
    What is data science for our purposes?
  • Not Synced
    6
    00:00:24,290 --> 00:00:30,650
    For you to understand the learning outcomes of the course, for you to understand how this class is structured at a high level,
  • Not Synced
    7
    00:00:30,650 --> 00:00:39,890
    we'll be supplementing this with discussion in class and then also to know where to get help throughout the semester as you're in the course.
  • Not Synced
    8
    00:00:39,890 --> 00:00:44,030
    To get started, though, I want to talk about what data science is.
  • Not Synced
    9
    00:00:44,030 --> 00:00:49,460
    And there are many different people are going to have many different definitions of data science.
  • Not Synced
    10
    00:00:49,460 --> 00:00:59,730
    But one of the things that. But the definition I've been using as I've built out this class is that data science is the use
  • Not Synced
    11
    00:00:59,730 --> 00:01:06,150
    of data to provide quantitative insights on questions of scientific business or social interest.
  • Not Synced
    12
    00:01:06,150 --> 00:01:13,420
    There may be many different things to which we want to apply the science we may have maybe in a business context,
  • Not Synced
    13
    00:01:13,420 --> 00:01:18,570
    but we want to gain understanding about the effectiveness of our business processes.
  • Not Synced
    14
    00:01:18,570 --> 00:01:26,040
    We may want to evaluate a change to some aspect of how we carry out our business or how we manufacture, provide our products.
  • Not Synced
    15
    00:01:26,040 --> 00:01:30,990
    We may want to understand the impacts of some of our business decisions.
  • Not Synced
    16
    00:01:30,990 --> 00:01:38,310
    We may have a scientific question. What we are trying to produce generalizable knowledge to understand the world around us.
  • Not Synced
    17
    00:01:38,310 --> 00:01:40,170
    We may have a social interest,
  • Not Synced
    18
    00:01:40,170 --> 00:01:47,670
    particularly if we're deploying data science in the context of a nonprofit or of a government agency or of an educational institution.
  • Not Synced
    19
    00:01:47,670 --> 00:02:00,640
    What we're trying to understand, social dynamics of human behavior. We're trying to understand maybe the impact of a policy on on.
  • Not Synced
    20
    00:02:00,640 --> 00:02:13,880
    For example. We may have social interest where particularly if we are deploying.
  • Not Synced
    21
    00:02:13,880 --> 00:02:19,640
    We may have social interest, particularly if we are deploying data science in a nonprofit or a government agency
  • Not Synced
    22
    00:02:19,640 --> 00:02:23,720
    or an educational setting where we're trying to understand social dynamics,
  • Not Synced
    23
    00:02:23,720 --> 00:02:28,700
    we're trying to understand the effectiveness and the impact of policies and programs
  • Not Synced
    24
    00:02:28,700 --> 00:02:35,110
    and various other purposes to inform our social mission with quantitative insights.
  • Not Synced
    25
    00:02:35,110 --> 00:02:43,780
    To give you some more about how I go about thinking about this and some more of the background of how I've been designing the class,
  • Not Synced
    26
    00:02:43,780 --> 00:02:51,310
    I find this quote from Sergios Mondo's Introduction to science and Technology Studies useful in a book.
  • Not Synced
    27
    00:02:51,310 --> 00:02:55,990
    He says, By itself, some piece of data has no meaning.
  • Not Synced
    28
    00:02:55,990 --> 00:03:01,330
    Data is only given meaning as evidence by the people who make use of it.
  • Not Synced
    29
    00:03:01,330 --> 00:03:04,990
    And this means that we can't just go find a bunch of data.
  • Not Synced
    30
    00:03:04,990 --> 00:03:15,610
    And, oh, we have the answer. We have to do work to get our data into a form where it can actually provide answers to the questions we care about.
  • Not Synced
    31
    00:03:15,610 --> 00:03:23,610
    We need to do work to frame our questions in a way that we can actually.
  • Not Synced
    32
    00:03:23,610 --> 00:03:31,770
    Answer them with the data that we have available or that we can obtain, and this is a human process and it's also a social process,
  • Not Synced
    33
    00:03:31,770 --> 00:03:37,200
    which is one of the reasons why we're going to be using our own class time for
  • Not Synced
    34
    00:03:37,200 --> 00:03:44,820
    discussion and application exercises is because we need the data of the data,
  • Not Synced
    35
    00:03:44,820 --> 00:03:50,310
    gains its meaning and becomes evidence for particular conclusions of particular answers.
  • Not Synced
    36
    00:03:50,310 --> 00:03:56,370
    Because we go through the process of interpretation and we go through the process of presenting our
  • Not Synced
    37
    00:03:56,370 --> 00:04:02,250
    interpretations for other to others and discussing and debating how we came to the conclusions,
  • Not Synced
    38
    00:04:02,250 --> 00:04:06,030
    the conclusions. We came to the level of support they have from the data.
  • Not Synced
    39
    00:04:06,030 --> 00:04:15,900
    So we're going to be practicing that a lot in this class of going from data to evidentiary meaning as a human process and pursuit of all of this.
  • Not Synced
    40
    00:04:15,900 --> 00:04:24,180
    There are a number of learning outcomes for this course. The first one is that I want you to be able to explore a data set.
  • Not Synced
    41
    00:04:24,180 --> 00:04:29,550
    Someone gives you some data and you first need to be able to get your bearings and understand what data you have,
  • Not Synced
    42
    00:04:29,550 --> 00:04:35,520
    be able to assess whether or not it would answer particular questions and what questions you might be able to answer with it.
  • Not Synced
    43
    00:04:35,520 --> 00:04:39,960
    You need to be able to define a question that we can answer.
  • Not Synced
    44
    00:04:39,960 --> 00:04:48,200
    So this is a. This is not an easy process when you're talking about quite a bit more about it in the next two videos.
  • Not Synced
    45
    00:04:48,200 --> 00:04:53,720
    But taking a goal that we have and turning it into a question that we can actually answer
  • Not Synced
    46
    00:04:53,720 --> 00:04:59,990
    through a data analysis is a process that takes work and we're going to be learning that.
  • Not Synced
    47
    00:04:59,990 --> 00:05:09,350
    I want you to be able to then actually carry out your analysis in a way that is documented so other people can understand what you did,
  • Not Synced
    48
    00:05:09,350 --> 00:05:18,710
    it's reproducible, so others can do the same analysis. And it also doesn't unnecessarily waste computing resources to come to the answers.
  • Not Synced
    49
    00:05:18,710 --> 00:05:27,730
    You make efficient use of the resources we have available to us. And also then we can scale to being doing analysis on very large data sets.
  • Not Synced
    50
    00:05:27,730 --> 00:05:35,320
    It's not enough to just do an analysis, I want you to be able to present the results of your analysis both through visuals,
  • Not Synced
    51
    00:05:35,320 --> 00:05:42,130
    charts and graphics and the written argument to be able to communicate to other people,
  • Not Synced
    52
    00:05:42,130 --> 00:05:48,910
    your peers and the class myself in the future, your advisers, your supervisors at work,
  • Not Synced
    53
    00:05:48,910 --> 00:05:56,650
    what it is that you learn from the data and why your conclusions are a reliable and defensible interpretation of the data.
  • Not Synced
    54
    00:05:56,650 --> 00:06:00,640
    I want you to be able to identify weaknesses in data analysis.
  • Not Synced
    55
    00:06:00,640 --> 00:06:03,990
    No data analysis is perfect.
  • Not Synced
    56
    00:06:03,990 --> 00:06:10,710
    There are going to be weaknesses and downsides and we'll have to make tradeoffs when we make decisions of how we analyze the data,
  • Not Synced
    57
    00:06:10,710 --> 00:06:19,480
    but also not all weaknesses are created equal. I want you to be able to assess whether the impact a weakness has on the correctness and utility.
  • Not Synced
    58
    00:06:19,480 --> 00:06:25,530
    The result? Some weaknesses are fatal flaws, so we can no longer trust the results we get.
  • Not Synced
    59
    00:06:25,530 --> 00:06:29,640
    Other weaknesses, however, are.
  • Not Synced
    60
    00:06:29,640 --> 00:06:37,650
    Caveats that we need to acknowledge and account for when we apply the results, but they don't fundamentally undermine their validity.
  • Not Synced
    61
    00:06:37,650 --> 00:06:42,600
    I also want you to be able to reason about the ethical implications of your work.
  • Not Synced
    62
    00:06:42,600 --> 00:06:46,560
    There's a variety of different frameworks and perspectives that we can take on the
  • Not Synced
    63
    00:06:46,560 --> 00:06:49,920
    ethics of data science work that we're going to be talking about this semester.
  • Not Synced
    64
    00:06:49,920 --> 00:06:57,450
    But I want you to be able to think about and account for and assess the ethical implications of data science work.
  • Not Synced
    65
    00:06:57,450 --> 00:07:03,900
    And then finally, I want you to understand how various the broader picture of data science,
  • Not Synced
    66
    00:07:03,900 --> 00:07:11,580
    the various techniques and applications fit into a framework to give you a map of the of the space,
  • Not Synced
    67
    00:07:11,580 --> 00:07:16,560
    and particularly with the role this class plays in Boise State's graduate curriculum.
  • Not Synced
    68
    00:07:16,560 --> 00:07:22,890
    I want you I want to give you a framework that you can use to relate what you're going to learn in other classes,
  • Not Synced
    69
    00:07:22,890 --> 00:07:27,480
    such as machine learning or large scale data analysis or recommender systems or social
  • Not Synced
    70
    00:07:27,480 --> 00:07:34,790
    media mining together and develop a coherent picture of what it is to do data science.
  • Not Synced
    71
    00:07:34,790 --> 00:07:39,590
    Support these outcomes, there are a number of components of the class. The first is the videos and readings.
  • Not Synced
    72
    00:07:39,590 --> 00:07:44,990
    You're watching one of those videos right now. This is going to be the primary mechanism for content delivery.
  • Not Synced
    73
    00:07:44,990 --> 00:07:51,200
    I'm not planning to lecture live in class. I'll be doing a little bit of lecture style things here and there as we need to
  • Not Synced
    74
    00:07:51,200 --> 00:07:55,610
    clear up things that that are confusing or things you have questions about.
  • Not Synced
    75
    00:07:55,610 --> 00:08:00,800
    But our primary content delivery is going to be through these prerecorded lecture videos.
  • Not Synced
    76
    00:08:00,800 --> 00:08:05,780
    The website is your guide to all of this. Each week is going to have a page that lays out the videos,
  • Not Synced
    77
    00:08:05,780 --> 00:08:12,760
    the readings that you need to do in order to be prepared for class, prepared for the assignments and to learn the material.
  • Not Synced
    78
    00:08:12,760 --> 00:08:19,120
    Are in class, meetings are going to be focused on discussing and applying the materials, some of that will be open discussion,
  • Not Synced
    79
    00:08:19,120 --> 00:08:24,790
    some of that will be guided application exercises particularly carried out with your classmates.
  • Not Synced
    80
    00:08:24,790 --> 00:08:27,860
    We're going to be having some quizzes, particularly each Thursday.
  • Not Synced
    81
    00:08:27,860 --> 00:08:33,250
    There's going to be a quiz before class that assesses your understanding of the material and give both
  • Not Synced
    82
    00:08:33,250 --> 00:08:38,730
    you and I an initial check on how well your understanding it and whether you're ready to apply it.
  • Not Synced
    83
    00:08:38,730 --> 00:08:41,560
    In our work and in the assignment,
  • Not Synced
    84
    00:08:41,560 --> 00:08:47,110
    we're going to have assignments that come at a relatively steady pace throughout the semester, one every other week.
  • Not Synced
    85
    00:08:47,110 --> 00:08:51,550
    They give you more extended place to practice and develop your skills and demonstrate
  • Not Synced
    86
    00:08:51,550 --> 00:08:55,990
    your mastery of the material that we're going to be covering this semester. And then finally,
  • Not Synced
    87
    00:08:55,990 --> 00:09:06,680
    we have two midterm exams that a final exam to give an additional check of your conceptual knowledge of what it is that we're doing in this class.
  • Not Synced
    88
    00:09:06,680 --> 00:09:08,450
    As we go through the semester,
  • Not Synced
    89
    00:09:08,450 --> 00:09:13,950
    I expect you're probably going to have questions and you're going to need some help and there's a variety of places you can get it.
  • Not Synced
    90
    00:09:13,950 --> 00:09:19,760
    You can get it in our class meetings, both from your peers in the class and from me.
  • Not Synced
    91
    00:09:19,760 --> 00:09:23,630
    You can get it at any time through the class forum on Piazza.
  • Not Synced
    92
    00:09:23,630 --> 00:09:29,330
    I strongly encourage you to ask questions and answer each other's questions on Piazza when you ask a question,
  • Not Synced
    93
    00:09:29,330 --> 00:09:35,510
    I'm probably not going to answer it immediately. I'm going to give some time in order to see if others have an answer first.
  • Not Synced
    94
    00:09:35,510 --> 00:09:41,150
    Oftentimes, helping other people answer questions strengthens our own understanding of the material.
  • Not Synced
    95
    00:09:41,150 --> 00:09:46,790
    So I encourage you to make use of Piazza both to answer your get your questions answered
  • Not Synced
    96
    00:09:46,790 --> 00:09:50,990
    and to develop and practice your understanding of the material through helping each other.
  • Not Synced
    97
    00:09:50,990 --> 00:09:56,180
    I am going to be having office hours. I'm going to be having them physically in my office so long as health permits.
  • Not Synced
    98
    00:09:56,180 --> 00:10:00,710
    And I will also be able to have remote office hours by appointment.
  • Not Synced
    99
    00:10:00,710 --> 00:10:04,670
    I do ask that if you have any inquiries about the class that you direct them through Piazza.
  • Not Synced
    100
    00:10:04,670 --> 00:10:09,080
    If it's something that's just for me, perhaps about grades or something,
  • Not Synced
    101
    00:10:09,080 --> 00:10:13,020
    you can send a private message to the instructors on Piazza that's going to let me keep all
  • Not Synced
    102
    00:10:13,020 --> 00:10:18,170
    my course communications in one place so I don't accidentally lose something in my email.
  • Not Synced
    103
    00:10:18,170 --> 00:10:23,180
    And finally, you can use the Internet as a resource where there's a lot of documentation, resources.
  • Not Synced
    104
    00:10:23,180 --> 00:10:31,860
    Many of our readings are going to come from various places on the Internet. There's many more sites where you can search for solutions and help also.
  • Not Synced
    105
    00:10:31,860 --> 00:10:37,140
    I encourage you to make use of public Q&A sites when you're a practicing data scientist after this class.
  • Not Synced
    106
    00:10:37,140 --> 00:10:39,960
    The Internet is at your disposal for getting your work done.
  • Not Synced
    107
    00:10:39,960 --> 00:10:47,600
    There's no reason not to go ahead and practice, make use of the resources that are available to you in this class.
  • Not Synced
    108
    00:10:47,600 --> 00:10:54,920
    I just ask that you do it in a way that's in keeping with the principles of academic integrity,
  • Not Synced
    109
    00:10:54,920 --> 00:10:59,240
    you can go to a public Q&A site and you can ask, I'm trying to do this thing.
  • Not Synced
    110
    00:10:59,240 --> 00:11:05,100
    I'm having a problem. And ask a specific question about the problem that you're getting stuck on.
  • Not Synced
    111
    00:11:05,100 --> 00:11:12,660
    I don't want you to just copy copying a piece of the assignment description pasted in a question answer site and ask, how do I do this?
  • Not Synced
    112
    00:11:12,660 --> 00:11:19,410
    Demonstrate your learning of the material and how you go about identifying where it is that you're stuck and
  • Not Synced
    113
    00:11:19,410 --> 00:11:26,520
    asking a question that's going to help uncover the missing piece that you need in order to move forward.
  • Not Synced
    114
    00:11:26,520 --> 00:11:34,030
    So it's all of that. This class is designed to provide flexibility in learning and to use each modality to its best advantage.
  • Not Synced
    115
    00:11:34,030 --> 00:11:39,180
    There are tradeoffs. I'm not going to pretend like there's no trade off the how I've designed this class.
  • Not Synced
    116
    00:11:39,180 --> 00:11:46,920
    But with this design, we can do the content delivery me lecturing in a way where you can go back and watch videos later.
  • Not Synced
    117
    00:11:46,920 --> 00:11:51,330
    You can catch up. If you have to miss some material, you can speed up the video.
  • Not Synced
    118
    00:11:51,330 --> 00:12:02,400
    If I talk to slow. And we're going to focus our in class time on things that can only be done through Synchronoss discussion.
  • Not Synced
    119
    00:12:02,400 --> 00:12:06,270
    Where we're actually engaging with the material together and talking to each
  • Not Synced
    120
    00:12:06,270 --> 00:12:10,560
    other about it and these things together with the various parts of the class,
  • Not Synced
    121
    00:12:10,560 --> 00:12:16,020
    I hope are going to allow us to achieve our learning outcomes for the semester. I hope you have a great semester.
  • Not Synced
    122
    00:12:16,020 --> 00:12:20,070
    I hope you learn a lot. I also expect to learn from you every time I teach.
  • Not Synced
    123
    00:12:20,070 --> 00:12:28,050
    It's a learning experience for me as well. And I learn more about the subject and about how to teach and communicate about it from my students.
  • Not Synced
    124
    00:12:28,050 --> 00:12:34,900
    So let's learn together.
  • Not Synced
Title:
https:/.../52f6698f-a7dd-4943-86eb-ad7501790760-fc4269eb-1399-4a33-9140-ad8c0105606b.mp4?invocationId=08872c6d-6303-ec11-a9e9-0a1a827ad0ec
Video Language:
English
Duration:
12:34

English subtitles

Incomplete

Revisions