all right so here's a little help file on dealing with missing values in our mini survey data basically the reason that we have any concern that there are missing values is that for each case for which there's a missing value in any analysis that includes that case it'll be omitted so things per se like multiple regression or something like that you end up losing a lot of cases if you have a few missing values scatter around a number of different variables so what I've done here is I've highlighted the cases that are the cases they had something missing with yellow highlighting and I just did that by hand there was no magic formula although I could have made some sort of conditional if statement sort of thing that would have done that but I was feeling lazy and there weren't that many cases what I've gone ahead and done is I've calculated the mean which for this variable or something missing is one point six five to one seven for the median in the mode and just to kind of think about these are all central tendencies of this variable and a lot of a cheap and dirty way to deal with missing values is substitute in the central tendency and then for linear source of variables often times the mean would be a good choice in this case we have kind of these ordered levels in terms of our Likert scales and in that case sometimes maybe the median might be superior one advantage of the mean is that if you substitute the mean it'll be one point six five two which is obviously not one of the choices and you can clearly see which ones were substituted in this case just for our purposes here I'm highlighting them by hand the other thing I'm going to do is I'm going to put an equation here that for these cases where they are being substituted instead of putting the value there I'm going to put equals to the median here and I'm gonna make the twenty nine stay in place there by putting a dollar sign in front of it make it an absolute reference and that means that I can copy this and paste it in each of these subsequent spots regardless of which row it's in it's always going to be grabbing the value from road 29 for that particular column all right so I can come here and I can substitute that in and here when I substitute it we'll see it changes in here and so here this one this one and you can see these ones here these questions were how much he uses different statistics software probably the best guess is actually the median rather than the mean in that case because only one person here used that so anyway this is probably the dominant category here if someone left it blank they probably haven't used it let me see let's drag this over a little bit and I can also I can fill these across it'll still work so I don't have to just paste paste paste I can do a whole row of them like that can go in oops I need to recopy so I can copy any of these paste it in here and it'll work and it'll keep grabbing the observation from the twenty-ninth cell so I need to copy those in there okay and so at the end of this I'm gonna have a data set that is almost all values Oh something I should say is that I calculated the mean median and mode before I started making these changes if you hadn't you'd get a circular reference warning so instead what you want to do is copy and paste the values here instead of the formulas for the mean median mode and that's what I already did and so that's why we see that there let's see here's a couple more that I hadn't highlighted so I'll copy this and I'll paste it there you can see I'm pasting the formatting at the same time which is kind of convenient boom okay hold on okay and then we have a case here computers at home awesome one listed three or more I'm just gonna go ahead and change that to a three I know it could be more than that but for our purposes it's uh we have low medium and high more or less okay and I'll look along here see if there any more missing cases there are a couple so I'm going to go ahead and copy formatting the rule to look for the value from the 29 throw from here I'm gonna paste it into the remaining items now this was not very efficient yeah I wouldn't do this if I had a lot of a lot of cases but for our purposes I think this will work and we could talk about how to automate these sort of a steps later so now what I'm going to do is I'm going to copy everything in this sheet like this copy and then I'm gonna just shoot empty no it's not I'm gonna go and add yet another sheet I'm gonna paste special so right click and then click paste special words and then I'm gonna click on formats ok and then I'm gonna click on paste special values ok and so then I got my highlighting and I got my numbers but I didn't get my formulas which I appreciate so now this data set here I'm gonna go ahead and clear it clear this stuff off the bottom clear contents cuz I don't need it anymore and now I'm ready to do some analysis on this data yeah so this would be a good starting point for that so pretty much we're done processing the data in terms of missing cases and stuff gonna get ready to start doing the next step which would be constructing an index which I'll make a brief video about here in a second