1 00:00:00,027 --> 00:00:02,035 This video introduces JSON. 2 00:00:02,035 --> 00:00:04,096 Let's start by talking about its pronunciation. 3 00:00:04,096 --> 00:00:07,097 Some people call it Jason, and some call it J-sahn. 4 00:00:07,097 --> 00:00:09,021 I did a little bit of 5 00:00:09,021 --> 00:00:10,079 investigation and discovered that the 6 00:00:10,079 --> 00:00:12,062 original developer of JSON calls 7 00:00:12,062 --> 00:00:15,014 it JSON, so I'll do that too. 8 00:00:15,014 --> 00:00:18,089 Like XML, JSON can be thought of as a data model, 9 00:00:18,089 --> 00:00:20,051 an alternative to the relational data 10 00:00:20,051 --> 00:00:21,098 model that is more 11 00:00:21,098 --> 00:00:24,084 appropriate for semi-structured data. 12 00:00:24,084 --> 00:00:26,041 In this video I'll introduce the 13 00:00:26,041 --> 00:00:27,096 basics of JSON and I'll 14 00:00:27,096 --> 00:00:29,022 actually compare JSON to the 15 00:00:29,022 --> 00:00:32,059 relational data model and I'll compare it to XML. 16 00:00:32,059 --> 00:00:34,007 But it's not crucial to have 17 00:00:34,007 --> 00:00:37,015 watched those videos to get something out of this one. 18 00:00:37,015 --> 00:00:38,006 Now among the three models 19 00:00:38,006 --> 00:00:40,000 - the relational model, XML, and 20 00:00:40,000 --> 00:00:41,008 JSON - JSON is by 21 00:00:41,008 --> 00:00:43,046 a large margin the newest, 22 00:00:43,046 --> 00:00:44,072 and it does show: there aren't 23 00:00:44,072 --> 00:00:46,057 as many tools for JSON 24 00:00:46,057 --> 00:00:48,071 as we have for XML and 25 00:00:48,071 --> 00:00:51,006 certainly not as many as we have for relational. 26 00:00:51,006 --> 00:00:54,069 JSON stands for JavaScript Object Notation, 27 00:00:54,069 --> 00:00:56,002 although it's evolved to become pretty 28 00:00:56,002 --> 00:00:59,062 much independent of JavaScript at this point. 29 00:00:59,062 --> 00:01:03,006 The little snippet of JSON in the corner right now is mostly for decoration. 
30 00:01:03,006 --> 00:01:05,077 We'll talk about the details in just a minute. 31 00:01:05,077 --> 00:01:07,058 Now JSON was designed 32 00:01:07,058 --> 00:01:09,068 originally for what's called 33 00:01:09,068 --> 00:01:11,033 serializing data objects. 34 00:01:11,033 --> 00:01:13,021 That is, taking the objects that 35 00:01:13,021 --> 00:01:14,055 are in a program and sort 36 00:01:14,055 --> 00:01:15,067 of writing them down in a 37 00:01:15,067 --> 00:01:18,077 serial fashion, typically in files. 38 00:01:18,077 --> 00:01:19,097 One thing about JSON 39 00:01:19,097 --> 00:01:21,086 is that it is human readable, 40 00:01:21,086 --> 00:01:23,023 similar to the way XML 41 00:01:23,023 --> 00:01:25,025 is human readable, and it is 42 00:01:25,025 --> 00:01:27,004 often used for data interchange. 43 00:01:27,004 --> 00:01:28,093 So, for writing out, say, 44 00:01:28,093 --> 00:01:30,031 the objects in a program so that 45 00:01:30,031 --> 00:01:32,009 they can be exchanged with another 46 00:01:32,009 --> 00:01:34,029 program and read into that one. 47 00:01:34,029 --> 00:01:36,005 Also, just more generally, because 48 00:01:36,005 --> 00:01:38,012 JSON is not as rigid 49 00:01:38,012 --> 00:01:40,003 as the relational model, it's generally 50 00:01:40,003 --> 00:01:42,015 useful for representing and for 51 00:01:42,015 --> 00:01:43,049 storing data that doesn't 52 00:01:43,049 --> 00:01:47,053 have rigid structure, what we've been calling semi-structured data. 53 00:01:47,053 --> 00:01:49,009 As I mentioned, JSON is 54 00:01:49,009 --> 00:01:51,044 no longer closely tied to 55 00:01:51,044 --> 00:01:54,002 JavaScript. Many different programming languages do 56 00:01:54,002 --> 00:01:56,000 have parsers for reading JSON 57 00:01:56,000 --> 00:01:57,056 data into the program and 58 00:01:57,056 --> 00:02:00,011 for writing out JSON data as well. 
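As a rough illustration of serializing objects and reading them back, here is a minimal sketch using Python's standard json module. The object and its labels are made up for this example and are not taken from the video's data file.

```python
import json

# Hypothetical in-memory object; the labels and values are invented for illustration.
book = {"title": "Some Title", "price": 85, "in_print": True, "remark": None}

# Serialize: write the object down as human-readable JSON text.
text = json.dumps(book, indent=2)

# Deserialize: another program could read the same text back into its own objects.
restored = json.loads(text)

print(text)
print(restored == book)
```

The same text could be written to a file and read by a program in a different language, which is the data interchange use mentioned above.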
59 00:02:00,011 --> 00:02:01,064 Now, let's talk about the basic 60 00:02:01,064 --> 00:02:03,035 constructs in JSON, and as 61 00:02:03,035 --> 00:02:06,052 we will see these constructs are recursively defined. 62 00:02:06,052 --> 00:02:08,024 We'll use the example JSON 63 00:02:08,024 --> 00:02:09,067 data shown on the screen 64 00:02:09,067 --> 00:02:11,052 and that data is also available 65 00:02:11,052 --> 00:02:14,035 in a file for download from the website. 66 00:02:14,035 --> 00:02:18,029 The basic atomic values in JSON are fairly typical. 67 00:02:18,029 --> 00:02:21,063 We have numbers, we have strings. 68 00:02:21,063 --> 00:02:23,067 We also have Boolean values, 69 00:02:23,067 --> 00:02:24,057 although there are none of those 70 00:02:24,057 --> 00:02:29,019 in this example, that's true and false, and null values. 71 00:02:29,019 --> 00:02:30,061 There are two types of composite 72 00:02:30,061 --> 00:02:34,049 values in JSON: objects and arrays. 73 00:02:34,049 --> 00:02:36,015 Objects are enclosed in curly 74 00:02:36,015 --> 00:02:37,066 braces and they consist 75 00:02:37,066 --> 00:02:40,029 of sets of label-value pairs. 76 00:02:40,029 --> 00:02:41,069 For example, we have an 77 00:02:41,069 --> 00:02:44,087 object here that has a first name and a last name. 78 00:02:44,087 --> 00:02:46,076 We have a more - 79 00:02:46,076 --> 00:02:48,063 bigger, let's say, object here 80 00:02:48,063 --> 00:02:51,076 that has ISBN, price, edition, and so on. 81 00:02:51,076 --> 00:02:53,028 When we do our JSON demo, 82 00:02:53,028 --> 00:02:55,043 we'll go into these constructs in more detail. 83 00:02:55,043 --> 00:02:57,067 At this point, we're just introducing them. 84 00:02:57,067 --> 00:02:59,046 The second type of composite 85 00:02:59,046 --> 00:03:01,016 value in JSON is arrays, 86 00:03:01,016 --> 00:03:03,004 and arrays are enclosed in square 87 00:03:03,004 --> 00:03:06,000 brackets with commas between the array elements. 
88 00:03:06,000 --> 00:03:07,039 Actually we have commas in the objects 89 00:03:07,039 --> 00:03:10,053 as well, and arrays are lists of values. 90 00:03:10,053 --> 00:03:11,089 For example, we can see 91 00:03:11,089 --> 00:03:13,062 here that authors is a 92 00:03:13,062 --> 00:03:16,031 list of author objects. 93 00:03:16,031 --> 00:03:18,002 Now I mentioned that the constructs 94 00:03:18,002 --> 00:03:20,049 are recursive. Specifically, the values 95 00:03:20,049 --> 00:03:22,018 inside arrays can be anything: 96 00:03:22,018 --> 00:03:23,091 they can be other arrays or objects, 97 00:03:23,091 --> 00:03:26,018 or base values, and the values 98 00:03:26,018 --> 00:03:27,066 making up the label-value 99 00:03:27,066 --> 00:03:29,004 pairs in objects can also 100 00:03:29,004 --> 00:03:32,032 be any composite value or a base value. 101 00:03:32,032 --> 00:03:33,066 And I did want to 102 00:03:33,066 --> 00:03:34,092 mention, by the way, that sometimes 103 00:03:34,092 --> 00:03:36,012 this word label here for 104 00:03:36,012 --> 00:03:39,077 label-value pairs is called a "property". 105 00:03:39,077 --> 00:03:41,052 So, just like XML, JSON 106 00:03:41,052 --> 00:03:44,001 has some basic structural requirements in 107 00:03:44,001 --> 00:03:45,035 its format but it doesn't 108 00:03:45,035 --> 00:03:47,081 have a lot of requirements in terms of uniformity. 109 00:03:47,081 --> 00:03:49,072 We have a couple of examples 110 00:03:49,072 --> 00:03:51,053 of heterogeneity in here, for 111 00:03:51,053 --> 00:03:52,069 example, this book has an 112 00:03:52,069 --> 00:03:53,084 edition and the other one 113 00:03:53,084 --> 00:03:57,071 doesn't; this book has a remark and the other one doesn't. 114 00:03:57,071 --> 00:03:59,036 But we'll see many more examples 115 00:03:59,036 --> 00:04:00,079 of heterogeneity when we do 116 00:04:00,079 --> 00:04:03,088 the demo and look into JSON data in more detail. 
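The recursive definition of JSON values can be sketched in a few lines of Python. The data below is a made-up stand-in with the same general shape as the on-screen example (books with authors, one book with an edition and one with a remark); the placeholder values and the describe function are assumptions for illustration, not the actual file from the website.

```python
import json

# Hypothetical data: objects nested in arrays nested in an object.
doc = json.loads("""
{
  "Books": [
    {"ISBN": "ISBN-1", "Price": 85, "Edition": 3,
     "Authors": [{"First_Name": "A", "Last_Name": "B"},
                 {"First_Name": "C", "Last_Name": "D"}]},
    {"ISBN": "ISBN-2", "Price": 60, "Remark": "placeholder remark",
     "Authors": [{"First_Name": "E", "Last_Name": "F"}]}
  ]
}
""")

def describe(value):
    """Replace every atomic value by the name of its JSON type, recursing into composites."""
    if isinstance(value, dict):       # object: a set of label-value pairs
        return {label: describe(v) for label, v in value.items()}
    if isinstance(value, list):       # array: an ordered list of values
        return [describe(v) for v in value]
    if value is None:
        return "null"
    if isinstance(value, bool):       # must be tested before numbers: bool is a kind of int in Python
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    return "string"

print(json.dumps(describe(doc), indent=2))
```

The function has one case per construct, and the two composite cases call it again on their contents, which mirrors the recursive definition. Nothing in it requires the two books to carry the same labels.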
117 00:04:03,088 --> 00:04:06,043 Now let's compare JSON and the relational model. 118 00:04:06,043 --> 00:04:07,047 We will see that many of 119 00:04:07,047 --> 00:04:09,029 the comparisons are fairly similar 120 00:04:09,029 --> 00:04:12,067 to when we compared XML to the relational model. 121 00:04:12,067 --> 00:04:15,094 Let's start with the basic structures underlying the data model. 122 00:04:15,094 --> 00:04:18,081 So, the relational model is based on tables. 123 00:04:18,081 --> 00:04:20,007 We set up the structure of a 124 00:04:20,007 --> 00:04:22,007 table, a set of columns, and 125 00:04:22,007 --> 00:04:25,003 then the data becomes rows in those tables. 126 00:04:25,003 --> 00:04:27,044 JSON is based instead on 127 00:04:27,044 --> 00:04:29,046 sets, the sets of label-value 128 00:04:29,046 --> 00:04:34,009 pairs, and arrays, and as we saw, they can be nested. 129 00:04:34,009 --> 00:04:35,064 One of the big differences between 130 00:04:35,064 --> 00:04:38,004 the two models, of course, is the schema. 131 00:04:38,004 --> 00:04:39,083 So the relational model has a 132 00:04:39,083 --> 00:04:41,074 schema fixed in advance, 133 00:04:41,074 --> 00:04:43,001 you set it up before you 134 00:04:43,001 --> 00:04:44,008 have any data loaded and then 135 00:04:44,008 --> 00:04:47,052 all data needs to conform to that schema. 136 00:04:47,052 --> 00:04:48,005 JSON, on the other 137 00:04:48,005 --> 00:04:52,011 hand, typically does not require a schema in advance. 138 00:04:52,011 --> 00:04:53,062 In fact, the schema and the 139 00:04:53,062 --> 00:04:55,033 data are kind of mixed together, 140 00:04:55,033 --> 00:04:56,065 just like in XML, and 141 00:04:56,065 --> 00:04:58,054 this is often referred to as 142 00:04:58,054 --> 00:05:00,068 self-describing data, where the 143 00:05:00,068 --> 00:05:04,037 schema elements are within the data itself. 
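To make the schema contrast concrete, here is a small sketch using Python's built-in sqlite3 and json modules. The table name, column names, and values are hypothetical, and SQLite stands in for relational systems in general.

```python
import json
import sqlite3

# Relational side: the schema (a set of columns) is fixed before any data is loaded.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (isbn TEXT, price INTEGER)")
conn.execute("INSERT INTO book (isbn, price) VALUES (?, ?)", ("ISBN-1", 85))

# A row with a column the schema does not declare is rejected.
try:
    conn.execute(
        "INSERT INTO book (isbn, price, remark) VALUES (?, ?, ?)",
        ("ISBN-2", 60, "placeholder remark"),
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# JSON side: each object carries its own labels, so the second book can simply add one.
books = json.loads(
    '[{"isbn": "ISBN-1", "price": 85},'
    ' {"isbn": "ISBN-2", "price": 60, "remark": "placeholder remark"}]'
)
labels_per_book = [sorted(book) for book in books]

print(rejected)
print(labels_per_book)
```

The labels travelling inside each object are what "self-describing" refers to: a reader of the JSON learns the structure from the data itself rather than from a separate table definition.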
144 00:05:04,037 --> 00:05:05,092 And this is of course typically 145 00:05:05,092 --> 00:05:08,021 more flexible than the relational model. 146 00:05:08,021 --> 00:05:10,006 But there are advantages to having a schema 147 00:05:10,006 --> 00:05:12,031 as well, definitely. 148 00:05:12,031 --> 00:05:13,076 As far as queries go, one 149 00:05:13,076 --> 00:05:15,016 of the nice features of the 150 00:05:15,016 --> 00:05:16,063 relational model is that there 151 00:05:16,063 --> 00:05:21,096 are simple, expressive languages for querying the database. 152 00:05:21,096 --> 00:05:23,092 In terms of JSON, although a 153 00:05:23,092 --> 00:05:25,096 few new things have been proposed, 154 00:05:25,096 --> 00:05:27,095 at this point there's nothing widely 155 00:05:27,095 --> 00:05:29,058 used for querying JSON data. 156 00:05:29,058 --> 00:05:31,001 Typically JSON data is 157 00:05:31,001 --> 00:05:34,055 read into a program and it's manipulated programmatically. 158 00:05:34,055 --> 00:05:35,086 Now let me interject that this 159 00:05:35,086 --> 00:05:38,069 video is being made in February 2012. 160 00:05:38,069 --> 00:05:40,064 So it is possible 161 00:05:40,064 --> 00:05:42,037 that some JSON query languages 162 00:05:42,037 --> 00:05:44,005 will emerge and become 163 00:05:44,005 --> 00:05:45,008 widely used; there is just 164 00:05:45,008 --> 00:05:46,081 nothing widely used at this point. 165 00:05:46,081 --> 00:05:47,093 There are some proposals. 166 00:05:47,093 --> 00:05:49,094 There's a JSONPath language, 167 00:05:49,094 --> 00:05:52,037 JSON Query, a language called Jaql. 168 00:05:52,037 --> 00:05:53,056 It may be that just like 169 00:05:53,056 --> 00:05:55,001 XML, the query languages are 170 00:05:55,001 --> 00:05:57,014 gonna follow the prevalent use 171 00:05:57,014 --> 00:05:59,013 of the data format or the data model. 172 00:05:59,013 --> 00:06:01,057 But that has not happened yet, as of February 2012. 173 00:06:01,057 --> 00:06:04,002 How about ordering? 
174 00:06:04,002 --> 00:06:07,003 One aspect of the relational model is that it's an unordered model. 175 00:06:07,003 --> 00:06:08,069 It's based on sets and 176 00:06:08,069 --> 00:06:10,023 if we want to see relational 177 00:06:10,023 --> 00:06:14,015 data in sorted order then we put that inside a query. 178 00:06:14,015 --> 00:06:16,003 In JSON, we have arrays as 179 00:06:16,003 --> 00:06:19,028 one of the basic data structures, and arrays are ordered. 180 00:06:19,028 --> 00:06:20,088 Of course, there's also the fact, like 181 00:06:20,088 --> 00:06:22,025 XML, that JSON data is 182 00:06:22,025 --> 00:06:24,044 often or usually written to files 183 00:06:24,044 --> 00:06:26,071 and files themselves are naturally ordered, 184 00:06:26,071 --> 00:06:27,068 but the ordering of the data 185 00:06:27,068 --> 00:06:30,043 in files usually isn't relevant, 186 00:06:30,043 --> 00:06:31,082 sometimes it is, but 187 00:06:31,082 --> 00:06:33,062 typically not. Finally, in 188 00:06:33,062 --> 00:06:35,003 terms of implementation, for the 189 00:06:35,003 --> 00:06:37,016 relational model, there are 190 00:06:37,016 --> 00:06:39,091 systems that implement the relational model natively. 191 00:06:39,091 --> 00:06:42,074 They're generally quite 192 00:06:42,074 --> 00:06:44,058 efficient and powerful systems. 193 00:06:44,058 --> 00:06:46,027 For JSON, we haven't yet 194 00:06:46,027 --> 00:06:48,009 seen standalone database systems 195 00:06:48,009 --> 00:06:49,078 that use JSON as their data 196 00:06:49,078 --> 00:06:51,036 model; instead JSON is 197 00:06:51,036 --> 00:06:54,058 more typically coupled with programming languages. 198 00:06:54,058 --> 00:06:56,069 One thing I should add, however: 199 00:06:56,069 --> 00:07:00,076 JSON is used in NoSQL systems. 200 00:07:00,076 --> 00:07:02,061 We do have videos about NoSQL 201 00:07:02,061 --> 00:07:05,029 systems; you may or may not have watched those yet. 
202 00:07:05,029 --> 00:07:08,065 There's a couple of different ways that JSON is used in those systems. 203 00:07:08,065 --> 00:07:10,024 One of them is just as 204 00:07:10,024 --> 00:07:11,087 a format for reading data 205 00:07:11,087 --> 00:07:14,082 into the systems and writing data out from the systems. 206 00:07:14,082 --> 00:07:15,063 The other way that it is 207 00:07:15,063 --> 00:07:17,002 used is that some of the 208 00:07:17,002 --> 00:07:18,007 NoSQL systems are what are 209 00:07:18,007 --> 00:07:20,022 called "Document Management Systems" where 210 00:07:20,022 --> 00:07:22,041 the documents themselves may contain 211 00:07:22,041 --> 00:07:24,017 JSON data and then the systems 212 00:07:24,017 --> 00:07:26,023 will have special features for manipulating 213 00:07:26,023 --> 00:07:29,056 the JSON in the documents that are stored by the system. 214 00:07:29,056 --> 00:07:32,007 Now let's compare JSON and XML. 215 00:07:32,007 --> 00:07:35,031 This is actually a hotly debated comparison right now. 216 00:07:35,031 --> 00:07:37,083 There is significant overlap in 217 00:07:37,083 --> 00:07:40,023 the usage of JSON and XML. 218 00:07:40,023 --> 00:07:41,006 Both of them are very 219 00:07:41,006 --> 00:07:43,099 good for putting semi-structured data 220 00:07:43,099 --> 00:07:46,019 into a file format 221 00:07:46,019 --> 00:07:48,027 and using it for data interchange. 222 00:07:48,027 --> 00:07:49,051 And so because there's so 223 00:07:49,051 --> 00:07:50,097 much overlap in what they're used 224 00:07:50,097 --> 00:07:54,002 for, it's not surprising that there's significant debate. 225 00:07:54,002 --> 00:07:55,004 I'm not gonna take sides. 226 00:07:55,004 --> 00:07:57,069 I'm just going to try to give you a comparison. 227 00:07:57,069 --> 00:07:58,008 Let's start by looking at the 228 00:07:58,008 --> 00:08:02,065 verbosity of expressing data in the two languages. 
229 00:08:02,065 --> 00:08:03,098 So it is the case 230 00:08:03,098 --> 00:08:05,075 that XML is, in general, 231 00:08:05,075 --> 00:08:08,033 a little more verbose than JSON. 232 00:08:08,033 --> 00:08:09,099 So the same data expressed in 233 00:08:09,099 --> 00:08:11,018 the two formats will tend to 234 00:08:11,018 --> 00:08:12,063 have more characters in XML than JSON 235 00:08:12,063 --> 00:08:14,026 and you can see that 236 00:08:14,026 --> 00:08:16,053 in our examples because our big 237 00:08:16,053 --> 00:08:18,007 JSON example was actually pretty 238 00:08:18,007 --> 00:08:20,054 much the same data that we used when we showed XML. 239 00:08:20,054 --> 00:08:22,028 And the reason for 240 00:08:22,028 --> 00:08:23,026 XML being a bit more 241 00:08:23,026 --> 00:08:24,095 verbose largely has to 242 00:08:24,095 --> 00:08:26,094 do actually with closing tags, 243 00:08:26,094 --> 00:08:29,022 and some other features. 244 00:08:29,022 --> 00:08:30,057 But I'll let you judge 245 00:08:30,057 --> 00:08:32,058 for yourself whether the somewhat 246 00:08:32,058 --> 00:08:35,064 longer expression of XML is a problem. 247 00:08:35,064 --> 00:08:37,061 Second is complexity, and here, 248 00:08:37,061 --> 00:08:39,029 too, most people would say 249 00:08:39,029 --> 00:08:42,036 that XML is a bit more complex than JSON. 250 00:08:42,036 --> 00:08:45,098 I'm not sure I entirely agree with that comparison. 251 00:08:45,098 --> 00:08:47,069 If you look at the subset 252 00:08:47,069 --> 00:08:49,021 of XML that people really 253 00:08:49,021 --> 00:08:51,019 use, you've got attributes, 254 00:08:51,019 --> 00:08:52,052 sub-elements and text, and 255 00:08:52,052 --> 00:08:54,008 that's more or less it. 256 00:08:54,008 --> 00:08:55,029 If you look at JSON, you've got 257 00:08:55,029 --> 00:08:58,038 your basic values and you've got your objects and your arrays. 
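Going back to the verbosity point for a moment, a tiny comparison can be sketched with Python's standard library. The record below is hypothetical, and a single small record is only an illustration, not a measurement of either format in general.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record, written compactly by hand in each format.
xml_text = (
    "<Book><ISBN>ISBN-1</ISBN><Price>85</Price>"
    "<Author><First_Name>A</First_Name><Last_Name>B</Last_Name></Author></Book>"
)
json_text = (
    '{"Book":{"ISBN":"ISBN-1","Price":85,'
    '"Author":{"First_Name":"A","Last_Name":"B"}}}'
)

# Both strings parse, so each one is well-formed in its own format.
xml_root = ET.fromstring(xml_text)
json_root = json.loads(json_text)

# Every XML element name appears twice (opening and closing tag),
# while each JSON label appears once.
print(len(xml_text), len(json_text))
```

In this sketch the difference comes almost entirely from the closing tags, which matches the explanation given above.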
258 00:08:58,038 --> 00:08:59,043 I think the issue is that 259 00:08:59,043 --> 00:09:01,036 XML has a lot of 260 00:09:01,036 --> 00:09:03,038 extra stuff that goes along with it. 261 00:09:03,038 --> 00:09:06,001 So if you read the entire XML specification, 262 00:09:06,001 --> 00:09:08,012 it will take you a long time. 263 00:09:08,012 --> 00:09:10,003 JSON, you can grasp the 264 00:09:10,003 --> 00:09:12,068 entire specification a little bit more quickly. 265 00:09:12,068 --> 00:09:14,034 Now let's turn to validity. 266 00:09:14,034 --> 00:09:16,018 And by validity, I mean the 267 00:09:16,018 --> 00:09:18,063 ability to specify constraints or 268 00:09:18,063 --> 00:09:20,067 restrictions or a schema on 269 00:09:20,067 --> 00:09:22,068 the structure of data 270 00:09:22,068 --> 00:09:24,002 in one of these models, and 271 00:09:24,002 --> 00:09:27,007 have it enforced by tools or by a system. 272 00:09:27,007 --> 00:09:28,063 Specifically in XML we 273 00:09:28,063 --> 00:09:30,029 have the notion of document type 274 00:09:30,029 --> 00:09:32,045 descriptors, or DTDs; we also 275 00:09:32,045 --> 00:09:34,062 have XML Schema which 276 00:09:34,062 --> 00:09:38,025 gives us XSDs, XML Schema Descriptors. 277 00:09:38,025 --> 00:09:39,082 And these are schema-like 278 00:09:39,082 --> 00:09:41,065 things that we can specify, and 279 00:09:41,065 --> 00:09:42,089 we can have our data checked to 280 00:09:42,089 --> 00:09:43,009 make sure it conforms to the 281 00:09:43,009 --> 00:09:45,099 schema, and these are, I would say, 282 00:09:45,099 --> 00:09:49,018 fairly widely used at this point for XML. 283 00:09:49,018 --> 00:09:51,039 For JSON, there's something called JSON Schema. 
284 00:09:51,039 --> 00:09:53,097 And, you know, similar to 285 00:09:53,097 --> 00:09:55,045 XML Schema, it's a way 286 00:09:55,045 --> 00:09:57,008 to specify the structure and then 287 00:09:57,008 --> 00:09:58,092 we can check that JSON conforms to 288 00:09:58,092 --> 00:10:02,057 that, and we will see some of that in our demo. 289 00:10:02,057 --> 00:10:04,088 The current status, as of February 290 00:10:04,088 --> 00:10:07,014 2012, is that this is 291 00:10:07,014 --> 00:10:09,015 not widely used at this point. 292 00:10:09,015 --> 00:10:11,054 But again, it could really just be evolution. 293 00:10:11,054 --> 00:10:14,018 If we look back 294 00:10:14,018 --> 00:10:15,074 at XML, as it was originally 295 00:10:15,074 --> 00:10:17,076 proposed, probably we didn't 296 00:10:17,076 --> 00:10:18,077 see a whole lot of use 297 00:10:18,077 --> 00:10:20,007 of DTDs, and in fact not 298 00:10:20,007 --> 00:10:22,076 of XSDs for sure until later on. 299 00:10:22,076 --> 00:10:26,008 So we'll just have to see whether JSON evolves in a similar way. 300 00:10:26,008 --> 00:10:31,011 Now the programming interface is where JSON really shines. 301 00:10:31,011 --> 00:10:34,096 The programming interface for XML can be fairly clunky. 302 00:10:34,096 --> 00:10:37,054 The XML model, the attributes 303 00:10:37,054 --> 00:10:39,007 and sub-elements and so on, 304 00:10:39,007 --> 00:10:41,024 don't typically match the model 305 00:10:41,024 --> 00:10:43,083 of data inside a programming language. 306 00:10:43,083 --> 00:10:47,002 In fact, that's something called the impedance mismatch. 
307 00:10:47,002 --> 00:10:48,072 The impedance mismatch 308 00:10:48,072 --> 00:10:50,005 has been discussed in database 309 00:10:50,005 --> 00:10:52,084 systems actually, for decades 310 00:10:52,084 --> 00:10:54,058 because one of the original 311 00:10:54,058 --> 00:10:56,045 criticisms of relational database 312 00:10:56,045 --> 00:10:57,076 systems is that the data 313 00:10:57,076 --> 00:10:59,072 structures used in the database, 314 00:10:59,072 --> 00:11:01,071 specifically tables, didn't match 315 00:11:01,071 --> 00:11:04,042 directly with the data structures in programming languages. 316 00:11:04,042 --> 00:11:05,031 So there had to be some manipulation 317 00:11:05,031 --> 00:11:09,014 at the interface between programming languages and the database system and that's the mismatch. 318 00:11:09,014 --> 00:11:13,005 So that same impedance mismatch 319 00:11:13,005 --> 00:11:15,015 is pretty much present 320 00:11:15,015 --> 00:11:17,079 in XML, whereas in JSON there is 321 00:11:17,079 --> 00:11:19,083 really a more direct mapping 322 00:11:19,083 --> 00:11:23,099 between many programming languages and the structures of JSON. 323 00:11:23,099 --> 00:11:25,088 Finally, let's talk about querying. 324 00:11:25,088 --> 00:11:27,012 I've already touched on this 325 00:11:27,012 --> 00:11:28,008 a bit, but JSON does not 326 00:11:28,008 --> 00:11:31,023 have any mature, widely 327 00:11:31,023 --> 00:11:33,042 used query languages at this point. 328 00:11:33,042 --> 00:11:34,058 For XML we do have 329 00:11:34,058 --> 00:11:36,009 XPath, we have XQuery, 330 00:11:36,009 --> 00:11:39,041 we have XSLT. 331 00:11:39,041 --> 00:11:41,008 Maybe not all of 332 00:11:41,008 --> 00:11:42,091 them are widely used but there's 333 00:11:42,091 --> 00:11:44,004 no question that XPath at least and 334 00:11:44,004 --> 00:11:46,093 XSL are used quite a bit. 335 00:11:46,093 --> 00:11:48,045 As far as JSON goes, there 336 00:11:48,045 --> 00:11:50,066 is a proposal called JSONPath. 
337 00:11:50,066 --> 00:11:52,018 It looks actually quite a lot 338 00:11:52,018 --> 00:11:55,003 like XPath; maybe it'll catch on. 339 00:11:55,003 --> 00:11:56,076 There's something called JSON Query. 340 00:11:56,076 --> 00:11:58,057 Doesn't look so much like 341 00:11:58,057 --> 00:12:01,068 XML Query, I mean, XQuery. 342 00:12:01,068 --> 00:12:02,095 And finally, there has been a 343 00:12:02,095 --> 00:12:07,007 proposal called the Jaql language, but 344 00:12:07,007 --> 00:12:08,073 again as of February 2012 345 00:12:08,073 --> 00:12:10,011 all of these are still very 346 00:12:10,011 --> 00:12:13,016 early, so we just don't know what's going to catch on. 347 00:12:13,016 --> 00:12:16,029 So now let's talk about the validity of JSON data. 348 00:12:16,029 --> 00:12:17,073 So JSON data that's 349 00:12:17,073 --> 00:12:19,071 syntactically valid simply needs 350 00:12:19,071 --> 00:12:22,055 to adhere to the basic structural requirements. 351 00:12:22,055 --> 00:12:24,011 As a reminder, that would be 352 00:12:24,011 --> 00:12:25,061 that we have sets of label-value 353 00:12:25,061 --> 00:12:27,035 pairs, we have arrays 354 00:12:27,035 --> 00:12:29,033 of values and our values 355 00:12:29,033 --> 00:12:31,018 are from predefined types. 356 00:12:31,018 --> 00:12:34,006 And again, these values here are defined recursively. 357 00:12:34,006 --> 00:12:35,081 So we start with a JSON 358 00:12:35,081 --> 00:12:37,019 file and we send 359 00:12:37,019 --> 00:12:39,001 it to a parser. The parser 360 00:12:39,001 --> 00:12:40,051 may determine that the file 361 00:12:40,051 --> 00:12:42,038 has syntactic errors, or if 362 00:12:42,038 --> 00:12:44,045 the file is syntactically correct then 363 00:12:44,045 --> 00:12:47,074 it can be parsed into objects in a programming language. 
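The parser step just described can be sketched with Python's standard json module. The helper name parse_or_report and the sample strings are hypothetical; the point is only that the parser either reports a syntactic error or hands back ordinary program objects.

```python
import json

def parse_or_report(text):
    """Return (objects, None) for syntactically valid JSON, or (None, message) otherwise."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        return None, f"line {err.lineno}, column {err.colno}: {err.msg}"

# Syntactically correct: parsed into Python dicts and lists.
good, good_err = parse_or_report('{"Authors": [{"First_Name": "A"}]}')

# The closing bracket and brace are missing, so the parser reports an error.
bad, bad_err = parse_or_report('{"Authors": [{"First_Name": "A"}')

print(good, good_err)
print(bad, bad_err)
```

Note that this stage knows nothing about which labels or types the data is supposed to have; it checks only the basic structural requirements.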
364 00:12:47,074 --> 00:12:49,076 Now if we're interested in semantically 365 00:12:49,076 --> 00:12:51,005 valid JSON, that is, 366 00:12:51,005 --> 00:12:52,064 JSON that conforms to 367 00:12:52,064 --> 00:12:54,045 some constraints or a schema, 368 00:12:54,045 --> 00:12:55,067 then in addition to checking the 369 00:12:55,067 --> 00:12:57,088 basic structural requirements, we check 370 00:12:57,088 --> 00:13:00,084 whether the JSON conforms to the specified schema. 371 00:13:00,084 --> 00:13:02,034 If we use a language like JSON 372 00:13:02,034 --> 00:13:03,084 Schema, for example, we put 373 00:13:03,084 --> 00:13:05,079 a specification in as a 374 00:13:05,079 --> 00:13:07,096 separate file, and in 375 00:13:07,096 --> 00:13:09,058 fact JSON Schema is expressed in 376 00:13:09,058 --> 00:13:11,002 JSON itself, as we'll see 377 00:13:11,002 --> 00:13:12,063 in our demo. We send it 378 00:13:12,063 --> 00:13:13,087 to a validator and that 379 00:13:13,087 --> 00:13:15,099 validator might find that there 380 00:13:15,099 --> 00:13:16,009 are some syntactic errors or 381 00:13:16,009 --> 00:13:17,075 it may find that there are 382 00:13:17,075 --> 00:13:19,031 some semantic errors, so the 383 00:13:19,031 --> 00:13:21,049 data could be correct syntactically 384 00:13:21,049 --> 00:13:23,055 but not conform to the schema. 385 00:13:23,055 --> 00:13:25,074 If it's both syntactically and semantically 386 00:13:25,074 --> 00:13:26,008 correct then it can move 387 00:13:26,008 --> 00:13:28,017 on to the parser, where 388 00:13:28,017 --> 00:13:30,009 it will be parsed, again, into 389 00:13:30,009 --> 00:13:32,035 objects in a programming language. 390 00:13:32,035 --> 00:13:36,022 So to summarize, JSON stands for JavaScript Object Notation. 391 00:13:36,022 --> 00:13:38,006 It's a standard for taking data 392 00:13:38,006 --> 00:13:41,058 objects and serializing them into a format that's human readable. 
393 00:13:41,058 --> 00:13:43,029 It's also very useful for 394 00:13:43,029 --> 00:13:46,001 exchanging data between programs, 395 00:13:46,001 --> 00:13:48,012 and for representing and storing 396 00:13:48,012 --> 00:13:51,022 semi-structured data in a flexible fashion. 397 00:13:51,022 --> 00:13:52,077 In the next video we'll go 398 00:13:52,077 --> 00:13:55,002 live with a demonstration of JSON. 399 00:13:55,002 --> 00:13:56,026 We'll use a couple of JSON 400 00:13:56,026 --> 00:13:57,006 editors, we'll take a 401 00:13:57,006 --> 00:13:59,045 look at the structure of JSON 402 00:13:59,045 --> 00:14:01,008 data, when it's syntactically correct. 403 00:14:01,008 --> 00:14:03,054 We'll demonstrate how it's very 404 00:14:03,054 --> 00:14:05,005 flexible when our data might 405 00:14:05,005 --> 00:14:06,085 be irregular, and we'll also 406 00:14:06,085 --> 00:14:09,047 demonstrate schema checking using 407 00:14:09,047 --> 99:59:59,999 an example of JSON Schema.
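Ahead of that demo, the two-stage idea of checking syntax first and constraints second can be sketched in plain Python. This is not JSON Schema and does not use any schema library; the REQUIRED rules, the validate function, and the sample data are all hypothetical stand-ins for a real schema and validator.

```python
import json

# Hypothetical constraints: every book must have a string "ISBN" and a numeric "Price".
REQUIRED = {"ISBN": str, "Price": (int, float)}

def conforms(book):
    """Check one parsed value against the hypothetical constraints."""
    if not isinstance(book, dict):
        return False
    for label, expected in REQUIRED.items():
        value = book.get(label)
        # Python's bool is a subtype of int, so reject true/false explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            return False
    return True

def validate(text):
    """Return 'syntactic error', 'semantic error', or 'valid' for a JSON array of books."""
    try:
        data = json.loads(text)          # stage 1: basic structural requirements
    except json.JSONDecodeError:
        return "syntactic error"
    if not isinstance(data, list) or not all(conforms(b) for b in data):
        return "semantic error"          # stage 2: well-formed, but breaks the constraints
    return "valid"

print(validate('[{"ISBN": "ISBN-1", "Price": 85}]'))
print(validate('[{"ISBN": "ISBN-1"}]'))
print(validate('[{"ISBN": '))
```

A real JSON Schema validator would read the constraints from a separate JSON file rather than from Python code, but the three possible outcomes are the same ones described in this video.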