-
Intro
-
Herald: So welcome to this evening's next
talk with the wonderfully broken title, I
-
love it, "Emoji domains and how
wonderfully broken they are" by a very,
-
very wonderful person, Jennifer, who is a
web developer, and you wouldn't believe
-
it, her nick is "unicorn", here is a
unicorn! ... Hi! Jennifer this is, tell us
-
everything about emoji domains, and why
they are so rotten broken. You go!
-
dysphoricUnicorn: Yeah, thank you a lot.
Exactly, we are speaking about these
-
wonderfully broken things and the talk
will be kind of like... I start with a bit
-
of an intro dump about the history of
emoji domains and what they actually are
-
and then I will talk about my personal
experience breaking things with them. So
-
yeah, let's start right off with the
history. So, DNS was standardized in
-
1987, with a very limited character set.
So, you can see, like, only Roman letters
-
and some numbers, and, like, four non-
letters. So these are definitely not
-
sufficient for many languages and it's a
very euro-centric view, or not even just
-
euro-centric, but it's actually very
centered on the English language and it
-
was clear that this won't suffice so in
1996, internationalized domain names were
-
proposed, which allow encoding characters
that are not supported or that are not
-
officially supported into this very small
character set so that browsers could
-
simply convert them on the fly. This...
sources kind of disagree when this exactly
-
went live or when you could start it, when
you were able to use it for the first
-
time. The IDNA2003 standard allowed the
support, but the first emoji domains were
-
actually registered in 2001. Interesting
about these is that, in 2001, emojis
-
weren't part of Unicode yet. So you can
see these examples, like the "hot springs"
-
those do show as emoji, which is because
they are both emoji and Unicode pictographs.
-
So, not actually emoji domains at the
time, but right now, they were kind of
-
converted into emojis. Back then, they
were just pictographs. I couldn't really
-
find out if those domains actually
resolved if you entered the pictographs
-
back then or if it was just someone who
just was hoping they would rise in price
-
once IDNA2003 or whatever standard would
implement it, went live. So there was also
-
an IDNA2003 normalization, but that is not
too interesting for us because we just
-
want to look at the emoji side of things.
IDNA2008 actually banned emoji for most
-
major TLDs, because of concerns that it
would be used for phishing domains that
-
looked very similar to actual, other
domains. Like every character exists as an
-
emoji, to be able to make country
flags, so that could be used for phishing
-
and they decided to ban it for most major
TLDs that comply with IDNA2008. Important
-
to my little story, in 2020, the Emoji 13
standard added the transgender pride flag
-
emoji. You'll see why that's important
later. So what actually is this punycode
-
encoding? It's a non-human-readable representation
of Unicode characters. So you can see this
-
symbol here would be translated to "xn--c8h",
which obviously doesn't make
-
much sense to type in but your browser
would take care of this. So, DNS didn't have to
-
be changed, it's only inside your browser
that these conversions happen. Compatible
-
browsers, depending on which browser you
use, will either intransparently or
-
semitransparently translate, Firefox for
example, as a mitigation to these phishing
-
attempts, does allow you to enter emoji or
other Unicode characters, but as soon as
-
you hit enter it will, the URL bar will
show this xn-dash-dash domain. Safari, as
-
far as I know, does not do it
transparently, so you will not know what
-
exactly the punycode representation is of
what you were just entering. And different
-
TLDs only support a specific subset as I
said, IDNA2008 actually banned it. Fun
-
fact, I forgot on the last slide: IDNA2008
went live in 2010 which is kind of
-
confusing, but whatever. Different TLDs
only support specific charsets, most don't
-
support emoji, but there are TLDs that
have "supporting emoji" as their main
-
selling point. TLDs that most people
wouldn't want to use unless they just
-
simply are interested in emoji. Why did I
end up breaking things with it? In early
-
2011... not 2011, 2021... this year - I
was unemployed and looking for interesting
-
ways to build my portfolio. I knew that
emoji were somewhat supported but I didn't
-
know what, how exactly it worked, I just
knew that there were some people that had
-
emoji domains and I was kind of happy that
there was a transgender pride emoji
-
added, so I decided, well, maybe it's a
good idea to add some domain that contains
-
this transgender pride emoji to also kind
of become less interesting for bigoted
-
potential employers. So, yeah, let's
register a domain with that emoji. Well...
-
that seems to be a bit more difficult
because these domains, even though you
-
never really encounter them, seemed to be
sold out. Nothing that I looked up worked,
-
and actually the web interface broke a
bit, but more on that later. Well... none
-
of these domains actually resolve to
anything: .dev does not support emoji at
-
all and Namecheap doesn't support emoji
even with top-level domains that do
-
support them. So, I had to go to another
registrar, which was a bit annoying
-
because I thought, well, I like everything
in one place, not specifically because I love
-
Namecheap or anything. But, whatever. A few
months later, I am now the proud owner of
-
"transgender pride flag purple heart .
ws". At least, that's what I thought. So, I
-
just set up to build a small demo page for
it, and deploy it on my server and test it
-
and - wow. My server usually isn't that
slow. Timeouts... the route looks okay
-
inside my reverse proxy, trying again, and
after a long time, I end up with this
-
wonderful error message. So we're sorry,
that domain is invalid. It also does not
-
show the transgender pride flag anymore,
but that could simply be down to their
-
webfont not supporting it yet because it
was just added to emoji 13, at least
-
that's what I thought at that point.
Obviously, I was a bit scared because,
-
well, I just spent 10 euros on something
and... I didn't really know when I would
-
have a stable income again so I did this
to find a new job and German unemployment
-
benefits are really difficult to get, so I
was a bit scared, but GoDaddy didn't sell
-
me some invalid domain or they also
definitely did not scam me, because if you
-
enter these exact characters that
apparently are invalid, it does resolve to
-
my server. So when I looked at the
GoDaddy web interface, it also showed
-
these three characters, the purple heart,
the white flag and the transgender symbol.
-
It's simply not the domain that I had
entered into the emoji domain search
-
engine. It wasn't just their webfont that
doesn't support it. And that is caused by
-
the wonderful zero-width joiners. To avoid
having tons of similar emoji, each with
-
their own code, many emoji are created by
combining others. So you have the skintone
-
modifiers for example or the country
flags, that are a combination of different
-
emoji with a zero-width joiner. The
transgender pride flag is a combination of
-
a white flag and a transgender symbol with
a zero-width joiner in between. And the
-
thing is, punycode does not really support
them so it was simply just dropped during
-
conversion while I bought my domain. But
that's not everything. Because I still had
-
this project, I still wanted emoji domains
and my interest was piqued so I wanted to
-
try out what else I could break. To avoid
spending even more money on this
-
project, I just moved my testing to
subdomains, which was a good idea because I
-
have way more control over subdomains
than I have over regular ones. I can
-
register them with any registrar, so I
could use my go-to registrar. I can
-
register whatever strings I want, so even
invalid punycode. I can register them
-
under a TLD that does not allow it because
it's not a second-level domain but a
-
third-level domain. And, yeah, let's see
what browsers do about that. So I created
-
the subdomain "transgender pride flag .
dysphoric . dev". Firefox converts it to
-
xn-- and I'm not gonna say all that.
Chromium converts it to a different
-
string. If you plug either of those
into a converter, it will tell you
-
that both are invalid punycode. However,
both are understood and routed, so I just
-
simply added an [unintelligible] all-route
to my reverse proxy, so that both would
-
work. If you use dig, which is a command-
line tool that lets you look up domain
-
records - first of all, it doesn't do the
punycode conversion at all, so I had to
-
use one of the strings that one of my
browsers gave me, but when I use that
-
string it also gave me this "It's not a
valid IDNA2008 name. Disable validation
-
using these tool parameters." also didn't
tell me that I needed both. So I added the
-
first and then, oh, you still need the
second. But, whatever. Once both were
-
added, I was able to get correct results
and my site was reachable. The next thing
-
I thought of was, what if I move my
domain to a non-supported registrar,
-
because as I just talked about, Namecheap
does not actually allow emoji domains and I
-
was interested to see how their web interface
would handle it. Sadly, it simply did not
-
handle it at all, because they don't support
.ws domains. I wasn't really going to
-
contact their support team to try and
still get it because this was only a
-
simple thing, and they are probably just
simply not interested in hosting that
-
domain because it breaks their web
interface if you try to. Or other things
-
about emoji domains break their web
interface, so I don't really see why their
-
support team would actually be on my side
here. So, what about email? Because,
-
apparently, email clients really enjoy
breaking. From my experience at least. Do
-
they break with emoji? When trying to add
an emoji domain as a sender, my mail
-
server actually broke because validation
was run after punycode to unicode
-
conversion, which caused an uncaught
exception, which was surprising, it's
-
already fixed but the patch is not
released yet so I couldn't yet test it.
-
But there's still the local part which I
could already control as much as I wanted
-
to and the [unintelligible] so Thunderbird
simply ignored it and showed the punycode
-
and Apple Mail dropped the zero-width
joiner and also showed the punycode under
-
the thing where it shows the exact domain.
So, mixed results, nothing too spectacular,
-
no exceptions or crashing clients or
anything interesting like that, sadly.
-
What did I learn doing this? Well,
obviously emoji domains are very buggy.
-
Implementations vary from browser to
browser so you can have the same input
-
string and get different punycodes out of
it, so testing in just one browser
-
definitely is not enough, well, it never
is, but here especially it isn't. And, you
-
may be able to buy a domain that won't
work as you would think which can cause
-
quite the annoyance. But it's still a lot
of fun to mess around with this stuff,
-
just not for productive use. I like to end
my talks by telling people to join a
-
labor union that doesn't have anything to
do with this but that's what I do for some
-
reason. And I've also got a blog post
about this where I've written it up and I
-
would publish the slides under the
wonderful domain "poop emoji nycode . ws".
-
It's just a link to my regular blog for
now. I'm sorry. I think I went a bit fast
-
but I still thank you for your time and
I'm open to questions.
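
The zero-width-joiner problem described in the talk can be reproduced offline. The following is a minimal Python sketch, not the registrar's actual logic: `strip_invisible` is a hypothetical stand-in for whatever mapping step dropped the joiner, and the label order (flag, then heart) is assumed from the spoken domain name. It uses only Python's built-in `punycode` codec.

```python
import unicodedata

ZWJ = "\u200d"   # ZERO WIDTH JOINER
VS16 = "\ufe0f"  # VARIATION SELECTOR-16 (requests emoji presentation)

# Transgender pride flag: white flag + VS16 + ZWJ + transgender symbol + VS16
TRANS_FLAG = "\U0001F3F3" + VS16 + ZWJ + "\u26A7" + VS16
PURPLE_HEART = "\U0001F49C"


def strip_invisible(label: str) -> str:
    """Hypothetical mapping step: drop the joiner and variation selectors."""
    return label.replace(ZWJ, "").replace(VS16, "")


def to_a_label(label: str) -> str:
    """Encode one label with the bare Punycode codec (no IDNA validation)."""
    return "xn--" + label.encode("punycode").decode("ascii")


def from_a_label(a_label: str) -> str:
    """Decode an xn-- label back to Unicode."""
    return a_label[4:].encode("ascii").decode("punycode")


label = TRANS_FLAG + PURPLE_HEART
stripped = strip_invisible(label)

# List the code points that make up the label as it was typed.
for ch in label:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}")

print("as typed:      ", to_a_label(label))
print("joiner dropped:", to_a_label(stripped))
```

In this sketch the bare codec round-trips the joiner without complaint, which suggests the loss happens in a mapping step before encoding rather than in Punycode itself. Once the joiner is gone, what remains is a white flag, a transgender symbol and a purple heart as three separate characters.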
-
Herald: [talks, but no sound is audible]
Herald: I'm online... oops, I'm sorry. I'm
-
awfully sorry, my machine is slow. I muted
myself about half a minute ago. Thank you
-
for that beautiful talk, Jennifer. I had
to grin a couple of times, because it was
-
great and it made my day. And actually we
have a question. The question is in
-
German, I'll say it in English: why is DNSSEC
so complicated for emoji domains?
-
dysphoricUnicorn: Well, because no one
actually really likes emoji domains except
-
the people who sell them. At least that
was my experience looking up things for
-
that. So, they are kind of disallowed in the
standard, but some top-level domains just
-
ignore the standard and still let you register
them and it's just something that people
-
who implement things don't want to think
about at all. I haven't actually tried
-
DNSSEC, but it's just something that is
easily forgotten because it shouldn't
-
actually exist, which may
be a bit harsh, but...
-
Herald: Is - you remember the ringtone
fads when smartphones didn't exist yet -
-
is this just a fad like this ringtone
thing and it will just disappear within
-
the next couple of years or would you
think emojis are here to stay? Is this
-
serious?
dysphoricUnicorn: I think emojis are here
-
to stay but not within domains or... like,
it was possible since 2001, kind of, but
-
at least since 2011 when the first actual
emoji domain was registered. But most
-
domains that are, like, popular examples
already don't resolve anymore or resolve
-
to sites that say "emoji domains". So,
emoji domains definitely are not much more
-
than a fad or a nice, funny thing to just
look at for a bit. However, emojis as a
-
whole are such a large part of our
culture, I don't think they're going to go
-
away any time soon because it's been more
than ten years and the annoying
-
downloadable ringtones were popular
for a bit less time, I think.
-
Herald: This is a question that I actually
wanted to ask myself as well, because I
-
run my own email server as well and...
which email server software do you talk
-
about? Do you know about
supporting the others?
-
dysphoricUnicorn: Errm...
Herald: What do you use as a software on
-
your email server?
dysphoricUnicorn: My email server is
-
running on Mailu, which is a set of
Docker containers that are specially made
-
to work together to make setting up an
email server as painless as possible for
-
free. So I haven't actually tested any
other servers, however in theory they
-
shouldn't actually have any issues. So, the part
of Mailu that failed wasn't actually the
-
mail server part. It was simply a parser.
So, in theory, with another mail server,
-
it should work, if they didn't also mess
up parsing at some point.
-
Herald: Somebody asked here, is there a
list of top-level domains that support
-
emojis, and somebody posted an answer:
Wikipedia. Is that correct? Wikipedia has
-
such a list?
dysphoricUnicorn: It has, but it isn't
-
actually correct, the list that it has on
the English Wikipedia. It lists at
-
least one domain that no longer supports
emojis which is actually kind of a big
-
political thing where they removed
support. So, the Wikipedia list is not
-
complete or contains too much. There are,
however, registrars, that are specialized
-
in emoji domains and those will have
current lists. So, I had .ws as one of
-
them. It's not the red heart emoji, though
because that's invalid punycode and so I
-
don't really know what to enter in my URL
bar to get to them other than searching
-
it on Google, so...
Herald: laughs Next question, is there a
-
difference between single punycode and
multiple emoji chained together as a
-
second or third level domain?
dysphoricUnicorn: It's just different
-
punycode, depending on how many emoji you
have but theoretically, the implementation
-
for this would just, I think the technical
term was ASCII-to-Unicode something, which
-
is like, an algorithm to convert it, does
handle multiple emoji similarly. Or - it
-
should work without any
issues if one of the two works.
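
As a rough illustration of the answer above, here is a small Python sketch that encodes a label with one emoji and a label with two emoji chained together. It uses only the bare built-in `punycode` codec, so no IDNA mapping or validation is applied; `encode_label` and `decode_label` are names made up for this example.

```python
def encode_label(label: str) -> str:
    """Encode one Unicode label with the bare Punycode codec."""
    return "xn--" + label.encode("punycode").decode("ascii")


def decode_label(a_label: str) -> str:
    """Decode an xn-- label back to Unicode."""
    return a_label[4:].encode("ascii").decode("punycode")


single = "\u26A7"              # transgender symbol
chained = "\u26A7\U0001F49C"   # transgender symbol + purple heart

print(encode_label(single))    # xn--c8h
print(encode_label(chained))   # a longer, different xn-- string
```

Both labels go through the same algorithm; the chained one simply produces a longer string, and both decode back to their original characters.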
-
Herald: Are there any emoji
first-level domains?
-
dysphoricUnicorn: No. There are not. There
are punycode first-level domains, because
-
there are languages that simply do not use
the same letters as English does, so
-
punycode first-level domains are existent
but no emoji first-level domains at this
-
point. Maybe there will be, but I kind of
doubt it because, to the people in charge of
-
this, emoji domains are kind of an eyesore
from what I could read, so...
-
Herald: Talking about eye sores: I always
have the impression, that at least to the
-
old coders, diacritical signs in
themselves were considered an eye sore.
-
You know, those funny little dots that
German-speaking people have up there.
-
Don't talk about the Czech and the Poles.
Now, my name contains such a diacritical
-
sign, my first name is André and I've been
fighting with all kinds of inputs that say
-
7 Bit ASCII and nothing else. Do
diacritical signs still break domains?
-
dysphoricUnicorn: They should not, because
they are actually the reason why IDNs exist. So it
-
was actually proposed by someone who has
one of those signs in his name and probably
-
just wanted the domain with his name. This
was the actual reason why we have punycode
-
in the first place and supporting emoji
was kind of an unwanted side effect. So in
-
theory, it should work without issues but
still many people don't think about it
-
enough when implementing their own thing,
so you can never be too certain that it
-
will. But it should.
Herald: seventy posted here, seventy
-
obviously runs a Windows, and in Windows
10, the emoji menu opens with the combination of
-
the Windows key and the full stop. Is that
common already or is that new? I think
-
it's common by now, it's been implemented
and ever since then everybody's been using
-
emojis. And there is also a remark that
says "MS Outlook has actually pretty good
-
unicode-punycode support but still don't
try emojis". I remember a story about when
-
the Bosnian wars broke, when the Yugoslav
war broke, especially the ones in Bosnia
-
broke out, there were about a hundred
thousand Bosnians that fled to
-
Switzerland, and about fifteen thousand
were granted citizenship, but they
-
couldn't be registered in the citizenship
register, because that only supported
-
7-bit or 8-bit ASCII but no diacritical
sign of [unintelligible]. I think they
-
fixed it by now but that was quite a thing
some years back. I see no further
-
question, - oh, there is one ... coughs
... one... coughs excuse me that came in
-
right now... coughs... is there a
uniform way to generate punycode over
-
multiple platforms? Mobiles do not work
well with entering unicode numbers as we
-
all know.
dysphoricUnicorn: I'm not sure I
-
understood this correctly. The easiest
way that I used during my testing was a
-
simple online converter that would work
on every page. And actually my system
-
doesn't have a shortcut for emoji so I
would always copy and paste from
-
emojipedia into an online punycode converter
and just use it from there. Because I
-
don't actually use emoji that much.
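
For anyone who would rather not paste into an online converter, the same conversion can be sketched in a few lines of Python, which behaves the same on any platform where Python runs. This is an illustrative sketch only: `domain_to_ascii` and `domain_to_unicode` are hypothetical helper names, they use the bare built-in `punycode` codec, and they deliberately skip the mapping and validation rules of IDNA2003, IDNA2008 or UTS #46, so results may differ from what a given browser produces.

```python
def domain_to_ascii(domain: str) -> str:
    """Convert each non-ASCII label of a domain to its xn-- form."""
    out = []
    for label in domain.split("."):
        if label.isascii():
            out.append(label.lower())
        else:
            out.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(out)


def domain_to_unicode(domain: str) -> str:
    """Convert each xn-- label of a domain back to Unicode."""
    out = []
    for label in domain.split("."):
        if label.lower().startswith("xn--"):
            out.append(label[4:].encode("ascii").decode("punycode"))
        else:
            out.append(label)
    return ".".join(out)


print(domain_to_ascii("\u26A7.example"))     # xn--c8h.example
print(domain_to_unicode("xn--c8h.example"))
```

The escapes such as `\u26A7` avoid the need to type the emoji itself, which helps on devices without an emoji keyboard shortcut.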
Herald: Okay, we've come to the end of our
-
time. We still would have another minute
or two, but we have no more questions.
-
Thank you in the meantime for coming and
holding this talk. You have another talk.
-
I think it's tomorrow?
-
Outro
-
Subtitles created by c3subtitles.de
in the year 2021. Join, and help us!