-
Engineering (VisualEditor)
Against the Odds
-
Alright
-
Hello, everybody. I'm Trevor Parscal.
-
This is Roan Kattouw and Ed Sanders,
we work at the Wikimedia Foundation, specifically on the VisualEditor team,
-
and today we want to talk to you a little bit about
some of the challenges that we have had to deal with
-
over the past couple of years
-
and how working on VisualEditor is essentially engineering 'Against the odds'.
-
And so in fact we have found that the odds are against us.
-
But can anyone guess what makes creating VisualEditor so difficult ?
-
Like what sort of things are standing in our way ?
-
- The community
The community.. definitely NOT !
-
The green shirt ?
- Ehm browser support ?
-
Having to support like everything like IE6 and mobile ?
- Oh [....] I don't think we ever claimed IE6 support...
-
Anyone else ?
- Working with wiki text
-
Ah, yeah OK now we are getting good
So yeah definitely differences in Operating Systems
-
different kinds of computers people are using..
That includes like laptops, tablets, phones.
-
We are moving towards getting mobile support going.
-
And also, it turns out even for modern browsers this is still a problem. I mean, these are not IE6.
-
But even in modern browsers there are a lot of huge differences, even at this point in time.
-
And they don't really seem to agree on how standards should be done
-
at least not when you get down to the nitty gritty details.
-
And eh we sort of demand a little rigor
-
But also yes, wikitext. It is a huge issue for us. There are a lot of ambiguities, there are a lot of limitations,
-
there are a lot of design flaws and they bubble up into the User Interface
-
and users get really confused about what is going on.
-
because it doesn't quite work the way Microsoft Word works,
-
because wikitext is not the same as Microsoft Word.
It's a data? format.
-
And so we have to work very hard just to maintain compatibility.
-
And eh... isn't that so sad.
-
But we are really hoping to share some of these experiences with you,
-
not necessarily to make you cry, but because we think there are some interesting challenges,
-
most of which we have overcome and there are some kinda exciting stories to tell.
-
So to tell the first one, I'm going to switch over to Roan Kattouw, who is going to talk to us about Wikitext.
-
Alright, so
Let's talk wikitext.
-
Wikitext, like Trevor said, is a limiting factor to us.
-
And one of the main problems is that if you are a VisualEditor user,
-
if you are like a VisualEditor-only user, then as far as you are concerned wikitext does not exist.
-
Like it is our job and our goal to pretend to these people that wikitext has never been there
-
and will never be there and they can live in this sort of fantasyland where they don't have to deal with it.
-
Which works [...] for most of our [...] but the reality is that wikitext sort of [..] there.
-
and users might want to do things that you can't do nicely or can't do at all in wikitext.
-
And so it ends up being that wikitext limits the users in what they can do.
-
We can't let them do things that are not representable in wikitext or that produce very ugly wikitext
-
because people will get upset.
-
Which means that certain things are not allowed, but they are totally arbitrary and as a user you don't know why.
-
because you don't know about wikitext, so you don't know about these completely random limitations.
-
I'll show a few examples of like headaches we've had in our [..]
-
In wikitext you can do nested lists.
You can do this perfectly well in HTML as well.
-
It's perfectly normal, although the way HTML wants you to do this is kind of weird,
-
where it wants you to nest the 2nd level list at the end of the previous list item.
-
Which is a little bit weird, but whatever fine, it's how everyone does it.
-
If you insert a line break there, after 'Nested list' and you don't put a star there, then it ends the list completely.
-
and your 'Bar' ends up completely outside of the list.
That's also fine
-
But, if you wanted to put text after the nested list but within the same list item,
-
so you have a list item 'Foo' and then a nested list and then 'Bar'. You can't do that in wikitext.
-
It is perfectly valid HTML, it's something that initially would almost ... if you did certain things in VisualEditor, [...]
-
but you can't do it in wikitext. So we take all sorts of pains to not allow you to do this, because it creates nightmares.
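-
The restriction described above can be pictured as a check on a list item's children. This is only a minimal sketch under assumptions: the `ItemChild` tree shape and the function name `fitsWikitextList` are invented for illustration and are not VisualEditor's actual data model.

```typescript
// Hypothetical tree shape for one list item (not VisualEditor's real model).
interface TextChild { kind: "text"; value: string; }
interface ListChild { kind: "list"; items: ItemChild[][]; }
type ItemChild = TextChild | ListChild;

// Returns false when text follows a nested list inside the same list item,
// which is the case the talk says wikitext list syntax cannot express.
function fitsWikitextList(item: ItemChild[]): boolean {
  let seenNestedList = false;
  for (const child of item) {
    if (child.kind === "list") {
      seenNestedList = true;
      // Every item of the nested list has to pass the same check.
      if (!child.items.every((nested) => fitsWikitextList(nested))) {
        return false;
      }
    } else if (seenNestedList) {
      return false; // "Foo", nested list, then "Bar" in one item
    }
  }
  return true;
}
```

An editor built this way could run the check before applying an edit and refuse, or restructure, anything that fails it.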
-
But this is of course completely non-transparent to anyone.
-
And in general we tend to ignore that newline characters exist.
-
You also can't really use preformatted text in lists. Wikitext allows you to do it if you use an HTML tag here.
-
But.. and that actually works just fine.
-
But it doesn't really work, or it's not really useful
-
because you try to do this...
-
which, you wouldn't be using a pre tag if you didn't care about the way your newlines were produced,
-
so you probably are going to add a newline in there.
-
Then it happily breaks your list for you and throws 'Bar' outside the list and into its own paragraph,
-
because clearly that's what you meant, because clearly knows everything [...] a list.
-
This is superevil. It also shows you how sort of this whole notion of using HTML some places in wikitext
-
doesn't really work, or HTML and wikitext don't really get along that well.
-
Obviously this means that if you have preformatted text in a list and you try to press Enter in it,
-
then VisualEditor needs to do, well not that, because that doesn't work.
-
and if you try to create this HTML here...
-
If you try to create something like this, like if the user tried to press Enter inside preformatted text in a list,
-
this is what you probably would expect to happen, except that it is completely [..]
-
[...] so maybe we shouldn't let you do that.
-
So instead what happens is that if you press Enter inside the list item, then it splits the list item.
-
We never try to insert any sort of line break like thing there, because we know it explodes.
-
And also if you select text in a list and try to make it preformatted, we break out of the list.
-
Because we know that preformatted text in a list is major trouble.
-
Of course this only makes sense if you [...] and if you are the user, this makes no sense whatsoever.
-
You can also if you [..] in a list, you can also do this in wikitext and it will do what you expect.
-
It will create a paragraph break inside your list item.
-
This isn't the most beautiful wikitext of all time, but it kind of works.
-
You...
-
This works, it's not very beautiful..
-
This is basically what you would get if we [..] work in their own way and insert a paragraph break.
-
So obviously the solution is to not let the user create paragraph breaks in list items and they have no idea why.
-
One of my favorite examples is the [...] link syntax in wikitext is so amazing.
-
You would think that you can totally embed an image inside the text of a link.
-
So what this is trying to do is trying to create a link to 'Page' with the link text being 'Foo', an image and then 'Bar'.
-
This seems reasonable, you can totally do this in HTML, you can totally have an image that's like part of a link action.
-
If you try to do this in wikitext then everything freaks out and
-
it decides that the image is more important than the link and it just barfs a bunch of wikitext all over your page.
-
If we let you try to create a link across an image, then it would try to generate the wikitext like that.
-
which would then when you render it back would create this garbage and so we cannot really do that
-
and if you try to link across an image, it will silently cut the link at the image and [..] because we know links and images are trouble
-
Users have no idea.
-
If you wanted to do this, this completely sensible HTML, you should totally be able to do this in VisualEditor,
-
except we don't let you, because if you then try to put the wikitext back, it will be [..]
-
So those are things that were just broken in wikitext, which means you can't do them in wikitext,
-
but there are also things that you can kinda do in wikitext except the result is so ugly,
-
that people that do use wikitext yell at us whenever we do it,
-
and so we also try and nudge users away from them.
-
One of these things is with links.
-
This is a feature in wikitext where you have link trails.
-
So if you try to create a link to 'Foo', followed by the word 'bar', you would think that you would [..] like that
-
but in fact that doesn't actually work that way, because wikitext is helpful and does link trailing.
-
Where if you write text after a link, it automatically moves it into the link.
-
So if you are trying to do the conversion like on the first line, it would actually break on you,
-
because if you then try to convert it back to HTML you'd get something different.
-
and it would actually go and help you and put bar inside the link
-
and so instead we have to use a nowiki tag and people started crying and yelling at us for inserting all these [..]
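-
To make the link-trail problem concrete, here is a minimal sketch of a serializer that guards against it. It is not VisualEditor's or Parsoid's real code: the node types and `serializeInline` are invented for illustration, and the trail pattern assumes an English-style wiki where lowercase ASCII letters directly after `]]` are absorbed into the link.

```typescript
// Assumption: English-style link trail (lowercase ASCII letters only).
const LINK_TRAIL = /^[a-z]+/;

// Hypothetical inline node types, for illustration only.
interface LinkNode { kind: "link"; target: string; }
interface TextNode { kind: "text"; value: string; }
type InlineNode = LinkNode | TextNode;

function serializeInline(nodes: InlineNode[]): string {
  let out = "";
  nodes.forEach((node, i) => {
    if (node.kind === "link") {
      out += `[[${node.target}]]`;
      const next = nodes[i + 1];
      // Text right after the link would be pulled into the link when the
      // wikitext is rendered again, so separate them with an empty nowiki tag.
      if (next !== undefined && next.kind === "text" && LINK_TRAIL.test(next.value)) {
        out += "<nowiki/>";
      }
    } else {
      out += node.value;
    }
  });
  return out;
}
```

With this sketch, a link to 'Foo' followed directly by the text 'bar' comes out as `[[Foo]]<nowiki/>bar`, which round-trips but is exactly the kind of output wikitext editors complain about.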
-
The solution to this problem, if you can call it that, is to, when users type text at the end of the link,
-
try and make sure that by default that text is always part of the link and so you try and nudge them into creating
-
content that looks like the thing on the 2nd line, which works and try to make it harder for them
-
to create content that looks like the thing on the 3rd line, because we know that's trouble
-
This kind of behavior steering is, kind of like, it sort of promotes harmony,
-
because VE users ended up creating wikitext that looked hideous and that people were upset about..
-
But.. you know, it's not a real solution and it's completely [...] to people who use it.
-
You can also put spaces at the beginning of paragraphs, which [...] VE.
-
You would think that we would just generate wikitext with a space at the beginning,
-
unfortunately it doesn't work that way, because a space [..] meaningful [...] a tag.
-
So what we actually have to do is wrap the space with a nowiki tag and again people get upset at us.
-
We haven't even solved this problem yet. We think that we might want to detect these spaces
-
and deliberately ignore them and throw them away.
-
But you know, you can't really do that, that's not a real solution to a problem.
-
It's just like trying to steer around various pitfalls again.
-
Here is another example. You can type '#3' and also you can try to naively serialize that.
-
But it doesn't do what you expect, because # means that you are trying to do a numbered list.
-
So you have to wrap the # in a nowiki tag.
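-
Both the leading-space case and the '#3' case come down to protecting the first character of a paragraph. The following is a rough sketch only: `escapeParagraphStart` is a hypothetical helper, and it assumes the first character is the only thing that needs escaping, which a real serializer could not assume.

```typescript
// At the start of a line, "#", "*", ":" and ";" begin list items in wikitext,
// and a leading space begins a preformatted block.
const LINE_START_SYNTAX = /^[#*:; ]/;

// Hypothetical helper: wraps a meaningful first character in nowiki tags so
// the paragraph is rendered as the literal text the user typed.
function escapeParagraphStart(text: string): string {
  if (!LINE_START_SYNTAX.test(text)) {
    return text;
  }
  return `<nowiki>${text.charAt(0)}</nowiki>${text.slice(1)}`;
}
```

The output is faithful to what was typed, but it is the nowiki noise that the talk says people get upset about.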
-
Which again, we try to discourage people from doing this, by throwing: "What you typed looks like wikitext"
-
"This won't really work, are you sure you want to do this?"
-
A more egregious example of this would be this, where the user actually tries to type wikitext in VE.
-
We come up with a big flashy warning: "This won't work the way you expect it would"
-
and if you do save, the result will actually have nowiki tags.
We can't really prevent that, because this is what we
-
have to do, to give a faithful representation [...]
-
So instead we just surface a warning and go like: "Well people don't really like you to do this"
-
But we can't really stop them.
-
We have also considered converting this to a link, but we are trying to pretend wikitext doesn't exist.
-
so that would be incredibly confusing for people who have never used wikitext
-
and they mash some buttons and all of a sudden they have a link and they have no idea why.
-
So I talked about solutions to these kinds of problems,
-
and these are just a few examples, there are like a lot of other things that we steer around,
-
but they are not really solutions, they are 'solutions' in air quotes, because they are like the lesser of all evils.
-
we should try and nudge people to do, what we think is the right thing, but the reason it is the right thing,
-
is [...] understand, so it's just a bunch of crappy workarounds. It's not really [...]
-
So now Trevor is going to talk about Browsers, which I'm sure we can all agree are lovely things.
-
I recommend them.
-
So more specifically to the [...]
-
I'm really only going to be talking about a browser feature known as ContentEditable.
-
It is a very magical feature that browsers provide.
-
Probably the richest feature that they have ever introduced.
-
It makes HTML like this..
editable by simply putting a switch.
-
Job done. Let's go home now. We are all sorted.
-
But.. It turns out that of all the wonderful features browsers have given us over the past decade or so
-
this is not one of them.
ContentEditable is really inconsistent.
-
It is very overzealous and it's very annoy able.
-
You are basically always in a defensive position, trying to figure out what just happened.
-
It doesn't even have events. It does weird things.
-
Based on what day of the week it is, which way the wind is blowing.
-
Depending on which browser you are in and what OS you are on.
-
I think there is a math.random() somewhere in there. I'm not really sure.
-
We had this idea early on... let's just avoid ContentEditable full stop.
-
We worked on the WikiEditor.
-
We tried to make a syntax highlighting version of wikitext editing in a ContentEditable and we ran into a lot of issues.
-
So we just figured, let's just make a synthetic surface, this is what Google Docs does for instance.
-
[..] ContentEditable, all the text selection and the blinking cursor and all of that is done with just rendering divs
-
And we did it.
And it worked.
-
But.. even though we had full control, we also had all the responsibility to implement everything from scratch.
-
And that is where we got this limitation.
Things like spellcheck and when you type on mobile
-
pressing spacebar twice, what does it do.
And different IMEs, and what happens if you swipe.
-
And mobile text selection is also a big problem, because it works very differently and looks very different
-
from the way it does in desktop browsers.
-
All these things were stacking up against us and we sort of realized that maybe we have to revisit ContentEditable,
-
because we were quite limited in what we were able to accomplish.
-
So we realized that the reason that our ContentEditable experience was so bad,
-
was because we were making the DOM the center of our application.
-
That everything in our application, the Data was in the DOM
-
and the ContentEditable was having its way with it, without our permission, most of the time.
-
And the View, which was ContentEditable, was right there in the DOM.
-
And of course the controller was relying on events from the DOM. That's what put us in that defensive position.
-
The truth is that this was the way ContentEditable was designed to be used.
-
So it seemed sensible. But it really set us up for failure.
-
We realized that what we needed to do, was to come up with a different architecture completely.
-
Where we simply just use ContentEditable for the little things that it is good for, like rendering
-
text selection and some, but not all, of our input, because even the events that it gives sometimes are telling lies
-
so we pretty much have to monitor it, repeatedly, over and over and over and check up and see what it is doing
-
It is sort of resource intensive.
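-
The "monitor it over and over" idea can be sketched as a small observer that compares what the view shows now with what it showed last time. Everything here is an assumption made for illustration: the class name, the callback shape, and the injected `readView` function, which stands in for reading the current text out of the ContentEditable node so that the sketch does not depend on a real DOM.

```typescript
// Hypothetical polling observer, not VisualEditor's actual implementation.
class SurfaceObserver {
  private lastSeen: string;
  private readView: () => string;
  private onChange: (previous: string, current: string) => void;

  constructor(
    readView: () => string,
    onChange: (previous: string, current: string) => void,
  ) {
    this.readView = readView;
    this.onChange = onChange;
    this.lastSeen = readView();
  }

  // Check the view ourselves instead of trusting the browser's events, and
  // report any difference so the model can be brought up to date.
  poll(): boolean {
    const current = this.readView();
    if (current === this.lastSeen) {
      return false;
    }
    const previous = this.lastSeen;
    this.lastSeen = current;
    this.onChange(previous, current);
    return true;
  }
}
```

In a browser, `poll()` would be called on a timer and after input events; the repeated comparison is where the cost mentioned above comes from.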
-
This whole experience has really just taught us one thing: Browsers are dangerous.
-
They lie to you, they will waste your time.
-
They are bound to make you look foolish, and they are toys.
-
Using a browser to make an application on Wikipedia, is like using my 7 year old daughter's
-
play kitchen to run a restaurant. Everything seems to be there, but you got to use your imagination.
-
And that is what I think a lot of VisualEditor engineering is, coming up with very imaginative ways to
-
overcome the toy-like state of browsers.
-
Which we hit all the time. So my recommendation to anybody doing [..] development, is to: Protect yourself.
-
By scrapping. And that is exactly what we do. That is the architecture that we came up with.
-
You scrap the DOM completely, to the point where we are really seeing it as a thin rendering client
-
with some input and keep browsers on a very short leash and to give them a minimal amount of control.
-
That is my recommendation to you all. And now to talk about the troubles of Operating Systems..
-
and how they affect our work, is: Ed Sanders
-
Hello
-
So I'm going to talk to you about something specifically bad about ContentEditable,
-
which is: copy-paste.
-
Here is the good news: ContentEditable gives you copy paste.
-
You can press Ctrl-C to put stuff on the clipboard, and when you press Ctrl-V it takes it out of the clipboard
-
and that is pretty much half the job done for you. We don't have to [..] the data in some fake clipboard
-
intercept key events and [..]
-
There is even more good news. Brought to you by Clippy. It has a bunch of events that are actually quite useful.
-
So not only are there oncopy and onpaste events. But they happen before the copy and before the paste.
-
So if you have the oncopy event fired, that means the data hasn't been written to the clipboard yet.
-
Which turns out to be really useful, because you can actually move someone's selection,
-
just before it writes the data to the clipboard and change what goes into the clipboard.
-
Even more good news, there is an API, with getData and setData, so you can write stuff directly into the clipboard.
-
This is only available on copy and paste events. And we can directly set whatever the hell we want in there.
-
With an asterisk, because if there is anyone from Mozilla here, we need to talk...
-
On to the not so good news.
-
It's really bad. It's really inconsistent. Copy paste is to ContentEditable, what ContentEditable is to browsers.
-
It's like, it's the worst of the worst.
-
If we are talking about Trevor's play kitchen
-
This is like using Play-Doh to make a play kitchen to run the restaurant.
-
It's just, every browser, there is no documentation, every browser does it somewhat differently,
-
and this may or may not change next week.
-
As I mentioned earlier, that beautiful API that would let us do whatever we want in getting the clipboard data,
-
take whatever we want out... only works in Chrome.
-
So unfortunately, we have to support more than one browser, so we need to come up with another method.
-
That will work in Firefox and maybe even IE one day.
-
Let's have a look at what we actually want to do in VisualEditor.
-
We have a couple of use cases. There is the easiest one.
-
Which is to copy some text from the top of your document and paste it a bit lower.
-
In that case, you cannot choose a favorite clipboard, because we can just use the internal data
-
that is already in our model, as Trevor mentioned we have taken the model out of ContentEditable.
-
we store all our data in the [..] So every time we get copy, we can say: "Oh, you selected from 1 to 6"
-
and then when you hit paste, it gives us the same thing, you can just move it out.
-
We might also want to copy from one VE instance to another.
-
That means, getting all this rich data, like templates or images, and making sure that goes into the
-
clipboard properly.
-
And we also might want to copy into a Word document.
So we need to make sure that what goes into the Word
-
document is actually, clean HTML.
-
So you might have thought, just copy between the instances, well we can just take our internal data and JSON serialize it.
-
But then if you paste that into Word, then you are going to be like, wow what just happened there.
-
I copied like a paragraph in an editor and now I've got [..]
-
And people may also want to copy stuff from the website and paste it into the.. maybe not so much Wikipedia,
-
because you know [..]
-
So some solutions
-
1. We have this thing called DM HTML, which isn't the ContentEditable HTML, because ContentEditable HTML
-
will randomly [..] your HTML, we keep a [..] copy of the DM HTML, which is what gets sent back to the server
-
when you hit save. So we can put that on the clipboard.
-
If you put a template that is a table [..] to the table, replace it with a little marker saying: This is info box [..]
-
So if you put that in the clipboard, when you paste to another VisualEditor instance, you get a template and not just the rendering of a template.
-
which is a table and a bunch of random stuff, [..]
-
If we can also put in the clipboard some sort of marker, for internal copy paste,
-
so we can say, oh this copy actually came from VisualEditor, then when we hit paste we can say:
-
Well this was a VisualEditor copy, was it from the same window that we are currently in?
-
If it is, then we can just draw the data from our internal data structure.
-
So [..]
Things get easy
-
We can use the plain clipboard data. We put our DM HTML in the HTML part of the clipboard.
-
BTW. the clipboard has a plain text area, an HTML area and sort of random custom metadata.
-
And then we can set a custom key pointing to some internal store for the actual internal data, if they need to be a paste.
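-
The easy path just described can be sketched roughly as follows. This is a sketch under stated assumptions, not the real VisualEditor code: `ClipboardLike` mirrors the `getData`/`setData` pair that browsers expose on `event.clipboardData` during copy and paste events, while the format name `text/x-example-clipboard-key`, the class `InternalClipboard`, and the store layout are invented for illustration.

```typescript
// Minimal stand-in for the clipboard object available on copy/paste events.
interface ClipboardLike {
  setData(format: string, data: string): void;
  getData(format: string): string;
}

// Hypothetical custom format used to mark copies made by this editor.
const KEY_FORMAT = "text/x-example-clipboard-key";

type PasteResult<T> =
  | { source: "internal"; data: T }
  | { source: "html"; html: string };

class InternalClipboard<T> {
  private store = new Map<string, T>();
  private counter = 0;
  private instanceId: string;

  constructor(instanceId: string) {
    this.instanceId = instanceId;
  }

  // On copy: plain text and DM HTML for everyone else, plus a key that only
  // this editor instance can resolve back to its internal data.
  onCopy(clipboard: ClipboardLike, modelSlice: T, dmHtml: string, plainText: string): void {
    const key = `${this.instanceId}-${this.counter++}`;
    this.store.set(key, modelSlice);
    clipboard.setData("text/plain", plainText);
    clipboard.setData("text/html", dmHtml);
    clipboard.setData(KEY_FORMAT, key);
  }

  // On paste: use the internal data if the key is ours, else fall back to HTML.
  onPaste(clipboard: ClipboardLike): PasteResult<T> {
    const data = this.store.get(clipboard.getData(KEY_FORMAT));
    if (data !== undefined) {
      return { source: "internal", data };
    }
    return { source: "html", html: clipboard.getData("text/html") };
  }
}
```

In a real copy handler the event's default action also has to be prevented, otherwise the browser overwrites the clipboard with its own serialization of the selection.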
-
And that's not too bad. There are a few problems with actually setting the data directly [..]
-
Here is the hard way, which is what happens if you have Firefox or Internet Explorer.
-
You hit copy and as you saw earlier, we have really early events. So we take your selection, which is in the CE.
-
And we then generate the DM HTML, put it in a hidden text area, move our selection to the text area, then we wait a bit, wait for the copy to happen. [..]
-
And then we put everything back where we found it. So slightly convoluted process.
-
Job done.
-
Not so fast.
-
Mo' problems
-
The ContentEditable clipboard has a habit of tweaking your HTML, so that it looks better.
-
Internet Explorer in particular throws away lots of whitespace, that's not so bad.
-
Firefox, [..] we need to talk again. They throw away, a whole bunch of attributes, such as RDFa.
-
That's quite unfortunate for us, because we use RDFa to define all template metadata.
-
So if you are in Firefox and you try to do this, you just throw away all your template metadata and you'd just be left with [..]
-
We can work around this by serializing all the templates
-
which we think CE might destroy, which we put in another template [..] another attribute [..] won't destroy.
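-
One way to picture this workaround is to pack the attributes that might get stripped into a single carrier attribute before copying, and unpack them after pasting. The sketch below works on plain attribute maps rather than real DOM elements, and it is an assumption throughout: the carrier name `data-saved-attrs`, the list of fragile attributes, and both function names are invented for illustration.

```typescript
type Attrs = Record<string, string>;

// RDFa attribute names assumed to be at risk of being dropped on paste.
const FRAGILE = ["typeof", "about", "property", "rel", "resource"];
// Hypothetical carrier attribute assumed to survive the clipboard.
const CARRIER = "data-saved-attrs";

// Before copy: stash fragile attributes as JSON in the carrier attribute.
function protectAttrs(attrs: Attrs): Attrs {
  const saved: Attrs = {};
  for (const name of FRAGILE) {
    const value = attrs[name];
    if (value !== undefined) {
      saved[name] = value;
    }
  }
  if (Object.keys(saved).length === 0) {
    return { ...attrs };
  }
  return { ...attrs, [CARRIER]: JSON.stringify(saved) };
}

// After paste: restore the stashed attributes and drop the carrier.
function restoreAttrs(attrs: Attrs): Attrs {
  const { [CARRIER]: packed, ...rest } = attrs;
  if (packed === undefined) {
    return rest;
  }
  return { ...rest, ...(JSON.parse(packed) as Attrs) };
}
```

This only helps as long as the browser leaves the carrier attribute alone, which, as the talk notes about these hacks in general, may or may not stay true.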
-
Element destruction
-
Empty spans, we use empty spans in Mozilla to store this internal templating and then Mozilla decides to destroy it.
-
Because you didn't mean to have an empty span there right ?
-
So a little hack we can do there is to have a [..] and that sort of protects it... atm.
-
And reordering. If you sort of copy just a table cell and try to paste that in a paragraph it might sort of wrap it in tables, it might move the paragraph outside [..]
-
so we need to make sure that when we do do this fake paste [..] trip, that we paste into the right sort of content.
-
End..
Almost
-
As I've mentioned. We don't think we've got this completely sorted out.
-
We get reports of bugs. We'd love you to try and just fire up VisualEditor in your browser,
-
copy and paste and stuff and then people get it to break again.
-
And also as I've mentioned a few times, [..]
-
And that was that.