Intro

Herald: So welcome to this evening's next talk with the wonderfully broken title, I love it, "Emoji domains and how wonderfully broken they are" by a very, very wonderful person, Jennifer, who is a web developer, and you wouldn't believe it, her nick is "unicorn", here is a unicorn! ... Hi! This is Jennifer. Tell us everything about emoji domains, and why they are so rotten broken. You go!

dysphoricUnicorn: Yeah, thank you a lot. Exactly, we are speaking about these wonderfully broken things, and the talk will be kind of like... I start with a bit of an info dump about the history of emoji domains and what they actually are, and then I will talk about my personal experience breaking things with them.

So yeah, let's start right off with the history. DNS was standardized in 1987, with a very limited character set. So, you can see, like, only Roman letters and some numbers, and, like, four non-letters. These are definitely not sufficient for many languages, and it's a very Euro-centric view, or not even just Euro-centric, it's actually very centered on the English language. It was clear that this wouldn't suffice, so in 1996, internationalized domain names were proposed, which allow encoding characters that are not officially supported into this very small character set, so that browsers could simply convert them on the fly.

Sources kind of disagree on when exactly this went live, or when you were able to use it for the first time. The IDNA2003 standard allowed the support, but the first emoji domains were actually registered in 2001. Interesting about these is that, in 2001, emojis weren't part of Unicode yet. So you can see these examples, like the "hot springs", those do show as emoji, which is because they are both emoji and Unicode pictographs. So, not actually emoji domains at the time, but by now they have kind of been converted into emojis. Back then, they were just pictographs.
I couldn't really find out if those domains actually resolved if you entered the pictographs back then, or if it was just someone hoping they would rise in price once IDNA2003, or whatever standard would implement it, went live. There was also an IDNA2003 normalization, but that is not too interesting for us because we just want to look at the emoji side of things. IDNA2008 actually banned emoji for most major TLDs, because of concerns that they would be used for phishing domains that look very similar to actual, other domains. Like, every letter exists as an emoji, to be able to make country flags, so that could be used for phishing, and they decided to ban it for most major TLDs that comply with IDNA2008. Important to my little story: in 2020, the Emoji 13 standard added the transgender pride flag emoji. You'll see why that's important later.

So what actually is this punycode encoding? It's a non-human-readable representation of Unicode characters. So you can see this symbol here would be translated to "xn--c8h", which obviously doesn't make much sense to type in, but your browser would take care of this. So DNS didn't have to be changed, it's only inside your browser that these conversions happen. Compatible browsers, depending on which browser you use, will translate either non-transparently or semi-transparently. Firefox, for example, as a mitigation against these phishing attempts, does allow you to enter emoji or other Unicode characters, but as soon as you hit enter, the URL bar will show this "xn--" domain. Safari, as far as I know, does not do it transparently, so you will not know what exactly the punycode representation is of what you just entered. And different TLDs only support a specific subset; as I said, IDNA2008 actually banned it. Fun fact I forgot on the last slide: IDNA2008 went live in 2010, which is kind of confusing, but whatever.
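
The punycode step described above can be tried out in a few lines of Python. This is only a minimal sketch, not what any browser actually runs: it uses Python's built-in `punycode` codec, which implements the raw RFC 3492 encoding and performs none of the IDNA mapping or validation steps. The helper names `to_ace_label` and `from_ace_label` are made up for this illustration, and the choice of U+26A7 as the example symbol is an assumption about what was on the slide; its encoding happens to match the string read out in the talk.

```python
ACE_PREFIX = "xn--"  # prefix that marks a punycode-encoded DNS label


def to_ace_label(label: str) -> str:
    """Encode a single DNS label with raw punycode (no IDNA validation)."""
    if label.isascii():
        return label  # plain ASCII labels are left untouched
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def from_ace_label(label: str) -> str:
    """Decode a single 'xn--' label back to Unicode; pass others through."""
    if label.lower().startswith(ACE_PREFIX):
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label


print(to_ace_label("\u26a7"))       # xn--c8h
print(to_ace_label("b\u00fccher"))  # xn--bcher-kva
```

Because this skips IDNA processing entirely, it will happily encode strings that a standards-compliant resolver or registrar would reject or alter.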
Different TLDs only support specific charsets, most don't support emoji, but there are TLDs that have "supporting emoji" as their main selling point. TLDs that most people wouldn't want to use unless they are simply interested in emoji.

Why did I end up breaking things with it? In early 2011... not 2011, 2021... this year, I was unemployed and looking for interesting ways to build my portfolio. I knew that emoji were somewhat supported, but I didn't know how exactly it worked. I just knew that there were some people that had emoji domains, and I was kind of happy that a transgender pride emoji had been added, so I decided, well, maybe it's a good idea to get some domain that contains this transgender pride emoji, to also kind of become less interesting for bigoted potential employers.

So, yeah, let's register a domain with that emoji. Well... that seems to be a bit more difficult, because these domains, even though you never really encounter them, seemed to be sold out. Nothing that I looked up worked, and actually the web interface broke a bit, but more on that later. Well... none of these domains actually resolve to anything: .dev does not support emoji at all, and Namecheap doesn't support emoji even with top-level domains that do support them. So I had to go to another registrar, which was a bit annoying because I thought, well, I like everything in one place, not specifically because I love Namecheap or anything. But, whatever.

A few months later, I am now the proud owner of "transgender pride flag purple heart . ws". At least, that's what I think. So I set out to build a small demo page for it, deploy it on my server and test it and - wow. My server usually isn't that slow. Timeouts... the route looks okay inside my reverse proxy, trying again, and after a long time, I end up with this wonderful error message. So: "We're sorry, that domain is invalid."
It also does not show the transgender pride flag anymore, but that could simply be down to their webfont not supporting it yet, because it was just added in Emoji 13. At least that's what I thought at that point. Obviously, I was a bit scared because, well, I had just spent 10 euros on something and... I didn't really know when I would have a stable income again. I did this to find a new job, and German unemployment benefits are really difficult to get, so I was a bit scared. But GoDaddy didn't sell me an invalid domain, and they also definitely did not scam me, because if you enter these exact characters that apparently are invalid, it does resolve to my server. When I looked at the GoDaddy web interface, it also showed these three characters: the purple heart, the white flag and the transgender symbol. It's simply not the domain that I had entered into the emoji domain search engine. So it wasn't just their webfont that didn't support it.

And that is caused by the wonderful zero-width joiners. To avoid having tons of similar emoji, each with their own code, many emoji are created by combining others. So you have the skin tone modifiers, for example, or the country flags, that are a combination of different emoji with a zero-width joiner. The transgender pride flag is a combination of a white flag and a transgender symbol with a zero-width joiner in between. And the thing is, punycode does not really support them, so it was simply dropped during conversion while I bought my domain.

But that's not everything. Because I still had this project, I still wanted emoji domains, and my interest was piqued, I wanted to try out what else I could break. To avoid spending even more money on this project, I moved my testing to subdomains, which was a good idea because I have way more control over subdomains than I have over regular ones. I can register them with any registrar, so I could use my go-to registrar.
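
The zero-width-joiner composition described above can be made visible by listing the code points of the flag. This is a small sketch, not a reconstruction of what the registrar did: the talk only says that the joiner was dropped during conversion, so `drop_joiners` is a hypothetical stand-in that simulates that one effect. The sequence used here also contains variation selectors (U+FE0F), which the talk does not mention; they are part of the commonly used form of this emoji.

```python
import unicodedata

ZWJ = "\u200d"   # ZERO WIDTH JOINER
VS16 = "\ufe0f"  # VARIATION SELECTOR-16 (requests emoji presentation)

# White flag + ZWJ + transgender symbol, as one emoji sequence.
TRANS_FLAG = "\U0001F3F3" + VS16 + ZWJ + "\u26a7" + VS16


def describe(text: str) -> list[str]:
    """List every code point in `text` with its Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


def drop_joiners(text: str) -> str:
    """Hypothetical stand-in for a conversion step that discards ZWJ."""
    return text.replace(ZWJ, "")


for line in describe(TRANS_FLAG):
    print(line)

stripped = drop_joiners(TRANS_FLAG)
# Without the joiner, the string is a different label with different punycode.
print(TRANS_FLAG.encode("punycode") != stripped.encode("punycode"))
```

Once the joiner is gone, the flag and the symbol are two separate characters, so a domain built from the stripped string is simply a different domain from the one typed into the search box.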
I can register whatever strings I want, so even invalid punycode. I can register them under a TLD that does not allow emoji, because it's not a second-level domain but a third-level domain. And, yeah, let's see what browsers do about that.

So I created the subdomain "transgender pride flag . dysphoric . dev". Firefox converts it to xn-- and I'm not gonna say all that. Chromium converts it to a different string. And if you plug either of those into a converter, it will tell you that both are invalid punycode. However, both are understood and routed, so I simply added an [unintelligible] all-route to my reverse proxy, so that both would work. If you use dig, which is a command-line tool that lets you look up domain records, first of all, it doesn't do the punycode conversion at all, so I had to use one of the strings that one of my browsers gave me. But when I used that string, it also gave me this: "It's not a valid IDNA2008 name. Disable validation using these tool parameters." It also didn't tell me that I needed both. So I added the first and then, oh, you still need the second. But, whatever. Once both were added, I was able to get correct results and my site was reachable.

The next thing I thought of was, what if I move my domain to a non-supported registrar? Because, as I just talked about, Namecheap does not actually allow emoji domains, and I was interested to see how their web interface would handle it. Sadly, it simply did not handle it at all, because they don't support .ws domains. I wasn't really going to contact their support team to try and still get it, because this was only a simple thing and they are probably simply not interested in hosting that domain, because it breaks their web interface if you try to. Or other things about emoji domains break their web interface, so I don't really see why their support team would actually be on my side here.

So, what about email? Because, apparently, email clients really enjoy breaking.
From my experience at least. Do they break with emoji? When trying to add an emoji domain as a sender, my mail server actually broke because validation was run after the punycode-to-Unicode conversion, which caused an uncaught exception. That was surprising. It's already fixed, but the patch is not released yet, so I couldn't test it yet. But there's still the local part, which I could already control as much as I wanted to, and the [unintelligible], so Thunderbird simply ignored it and showed the punycode, and Apple Mail dropped the zero-width joiner and also showed the punycode under the thing where it shows the exact domain. So, mixed results, nothing too spectacular, no exceptions or crashing clients or anything interesting like that, sadly.

What did I learn doing this? Well, obviously emoji domains are very buggy. Implementations vary from browser to browser, so you can have the same input string and get different punycodes out of it. So testing in just one browser definitely is not enough. Well, it never is, but here especially it isn't. And you may be able to buy a domain that won't work as you would think, which can cause quite the annoyance. But it's still a lot of fun to mess around with this stuff, just not for productive use.

I like to end my talks by telling people to join a labor union. That doesn't have anything to do with this, but that's what I do for some reason. And I've also got a blog post about this where I've written it up, and I would publish the slides under the wonderful domain "poop emoji nycode . ws". It's just a link to my regular blog for now. I'm sorry. I think I went a bit fast, but I still thank you for your time and I'm open to questions.

Herald: [talks, but no sound is audible]

Herald: I'm online... oops, I'm sorry. I'm awfully sorry, my machine is slow. I muted myself about half a minute ago. Thank you for that beautiful talk, Jennifer. I had to grin a couple of times, because it was great and it made my day. And actually we have a question.
The question is in German, I'll say it in English: why is DNSSEC so complicated for emoji domains?

dysphoricUnicorn: Well, because no one actually really likes emoji domains except the people who sell them. At least that was my experience looking up things for that. So, they are kind of disallowed in the standard, but some top-level domains just ignore the standard and still let you register them, and it's just something that people who implement things don't want to think about at all. I haven't actually tried DNSSEC, but it's just something that is easily forgotten because it shouldn't actually exist, which may be a bit harsh, but...

Herald: Is - you remember the ringtone fads when smartphones didn't exist yet - is this just a fad like this ringtone thing, and it will just disappear within the next couple of years, or would you think emojis are here to stay? Is this serious?

dysphoricUnicorn: I think emojis are here to stay, but not within domains or... like, it was possible since 2001, kind of, but at least since 2011, when the first actual emoji domain was registered. But most domains that are, like, popular examples already don't resolve anymore, or resolve to sites that say "emoji domains". So emoji domains definitely are not much more than a fad or a nice, funny thing to just look at for a bit. However, emojis as a whole are such a large part of our culture, I don't think they're going to go away any time soon, because it's been more than ten years and the annoying downloadable ringtones were popular for a bit less time, I think.

Herald: This is a question that I actually wanted to ask myself as well, because I run my own email server as well and... which email server software are you talking about? Do you know about support in the others?

dysphoricUnicorn: Errm...

Herald: What do you use as software on your email server?
dysphoricUnicorn: My email server is running on Mailu, which is a set of Docker containers that are specially made to work together to make setting up an email server as painless as possible, for free. So I haven't actually tested any other servers, however in theory they shouldn't actually have any issues. The part of Mailu that failed wasn't actually the mail server part. It was simply a parser. So, in theory, with another mail server, it should work, if they didn't also mess up parsing at some point.

Herald: Somebody asked here, is there a list of top-level domains that support emojis, and somebody posted an answer: Wikipedia. Is that correct? Wikipedia has such a list?

dysphoricUnicorn: It has, but the list isn't actually correct. That is the English Wikipedia. It lists at least one domain that no longer supports emojis, which is actually kind of a big political thing where they removed support. So the Wikipedia list is not complete, or contains too much. There are, however, registrars that are specialized in emoji domains, and those will have current lists. So, I had .ws as one of them. It's not the red heart emoji, though, because that's invalid punycode, and so I don't really know what to enter in my URL bar to get to them other than searching it on Google, so...

Herald: [laughs] Next question: is there a difference between a single emoji and multiple emoji chained together as a second- or third-level domain?

dysphoricUnicorn: It's just different punycode, depending on how many emoji you have, but theoretically, the implementation for this, I think the technical term was ASCII-to-Unicode something, which is, like, an algorithm to convert it, does handle multiple emoji similarly. Or - it should work without any issues if one of the two works.

Herald: Are there any emoji first-level domains?

dysphoricUnicorn: No. There are not.
There are punycode first-level domains, because there are languages that simply do not use the same letters as English does, so punycode first-level domains exist, but no emoji first-level domains at this point. Maybe there will be, but I kind of doubt it, because to the people in charge of this, emoji domains are kind of an eyesore, from what I could read, so...

Herald: Talking about eyesores: I always have the impression that, at least to the old coders, diacritical signs in themselves were considered an eyesore. You know, those funny little dots the German-speaking people have up there. Don't talk about the Czechs and the Poles. Now, my name contains such a diacritical sign, my first name is André, and I've been fighting with all kinds of inputs that say 7-bit ASCII and nothing else. Do diacritical signs still break domains?

dysphoricUnicorn: They should not, because they are actually the reason why IDNs exist. It was actually proposed by someone who has one of those signs in his name and probably just wanted the domain with his name. This was the actual reason why we have punycode in the first place, and supporting emoji was kind of an unwanted side effect. So in theory, it should work without issues, but still many people don't think about it enough when implementing their own thing, so you can never be too certain that it will. But it should.

Herald: seventy posted here, seventy obviously runs Windows, and in Windows 10, the emoji menu opens with the combination of the Windows key and the full stop. Is that common already or is that new? I think it's common by now, it's been implemented, and ever since then everybody's been using emojis. And there is also a remark that says "MS Outlook has actually pretty good unicode-punycode support but still don't try emojis".
I remember a story from when the Yugoslav wars broke out, especially the ones in Bosnia: there were about a hundred thousand Bosnians that fled to Switzerland, and about fifteen thousand were granted citizenship, but they couldn't be registered in the citizenship register, because that only supported 7-bit or 8-bit ASCII but no diacritical sign of [unintelligible]. I think they fixed it by now, but that was quite a thing some years back. I see no further question, - oh, there is one ... [coughs] ... one... [coughs] excuse me, that came in right now... [coughs]... is there a uniform way to generate punycode over multiple platforms? Mobiles do not work well with entering Unicode numbers, as we all know.

dysphoricUnicorn: I'm not sure I understood this correctly. The easiest way, which I used during my testing, was a simple online converter, which would work on every platform. And actually my system doesn't have a shortcut for emoji, so I would always copy and paste from Emojipedia into an online punycode converter and just use it from there. Because I don't actually use emoji that much.

Herald: Okay, we've come to the end of our time. We still would have another minute or two, but we have no more questions. Thank you in the meantime for coming and holding this talk. You have another talk. I think it's tomorrow?

Outro

Subtitles created by c3subtitles.de in the year 2021. Join, and help us!