-
Not Synced
I am Nicolas Dandrimont.
-
Not Synced
I am going to talk to you about a year of fedmsg in Debian.
-
Not Synced
We had a problem before with infrastructure in distributions.
-
Not Synced
Services are a bit like people.
-
Not Synced
There are dozens of services maintained by many people
-
Not Synced
and each of those services has its own way of communicating with the rest of the world
-
Not Synced
Meaning that if you want to spin up a new service
-
Not Synced
that needs to talk to other services in the distribution
-
Not Synced
which is basically any service you want to include
-
Not Synced
you will need to implement a bunch of communication systems
-
Not Synced
For instance, in the Debian infrastructure
-
Not Synced
we have our archive software, which is dak,
-
Not Synced
that mostly uses emails and databases to communicate.
-
Not Synced
The metadata is available in an RFC822 format with no real API.
-
Not Synced
The database is not public either.
-
Not Synced
The build queue management software, which is wanna-build,
-
Not Synced
polls a database every so often to know what needs to get built.
-
Not Synced
There is no API outside of its database
-
Not Synced
that isn't public either
-
Not Synced
Our bug tracking system, which is called debbugs,
-
Not Synced
works via email, stores its data in flat files, for now,
-
Not Synced
and exposes a read-only SOAP API.
-
Not Synced
For our source control management, pushes to the distro-provided repos on alioth
-
Not Synced
can trigger an IRC bot or some emails
-
Not Synced
but there is no real central notification mechanism.
-
Not Synced
We have some kludges that are available to overcome those issues.
-
Not Synced
We have the Ultimate Debian Database
-
Not Synced
which contains a snapshot of a lot of the databases that are underlying the Debian infrastructure
-
Not Synced
This means that every so often,
-
Not Synced
there is a cron that runs and imports data from a service here, a service there.
-
Not Synced
There is no realtime data.
-
Not Synced
It's useful for distro-wide QA stuff because you don't need to have realtime data
-
Not Synced
But when you want some notification for trying to build a new package or something
-
Not Synced
That doesn't work very well
-
Not Synced
and the consistency between the data sources is not guaranteed.
-
Not Synced
We have another central notification system, which is the package tracking system
-
Not Synced
which also is cron-triggered or email-triggered
-
Not Synced
You can update the data from the BTS using ??
-
Not Synced
But you can subscribe to email updates on a given package
-
Not Synced
But the messages are not uniform,
-
Not Synced
they can't be machine parsed.
-
Not Synced
There are a few headers but they are not sufficient to know what the messages are about.
-
Not Synced
And it's still not realtime.
-
Not Synced
The Fedora people invented something that could improve stuff which is called fedmsg.
-
Not Synced
It was actually introduced in 2009.
-
Not Synced
It's a unified message bus that can reduce the coupling between different services of a distribution.
-
Not Synced
Services can subscribe to one or several message topics, register callbacks and react to events
-
Not Synced
that are triggered by all the services in the distribution.
-
Not Synced
There is a bunch of stuff that is already implemented in fedmsg.
-
Not Synced
You get a stream of data with all the activity in your infrastructure which allows you to do statistics for instance
-
Not Synced
You decouple interdependent services because you can swap one thing with another
-
Not Synced
Or just listen to the messages and start doing stuff directly without having to ?? a database or something.
-
Not Synced
You can get a pluggable unified notification system that can gather all the events in the project and send them by email, by IRC
-
Not Synced
on your mobile phone, on your desktop, everywhere you want.
-
Not Synced
Fedora people use fedmsg to implement a badge system
-
Not Synced
which is some kind of gamification of the development process of the distribution
-
Not Synced
They implemented a live web dashboard
-
Not Synced
They implemented an IRC feed.
-
Not Synced
And then they also got some bots banned on social networks because they were flooding
-
Not Synced
How does it work?
-
Not Synced
Well, the first idea was to use AMQP as implemented by qpid
-
Not Synced
Basically, you take all your services and you have them send their messages to a central broker
-
Not Synced
and then you have several listeners that can send messages to clients.
-
Not Synced
There were a few issues with this.
-
Not Synced
Basically, you have a single point of failure at the central broker
-
Not Synced
And the brokers weren't really reliable.
-
Not Synced
When they tested it under load, the brokers were tipping over ??
-
Not Synced
The actual implementation of fedmsg uses 0mq.
-
Not Synced
Basically what you get is not a single broker.
-
Not Synced
You get a mesh of interconnected services.
-
Not Synced
Basically, you can connect only to the service that you want to listen to.
-
Not Synced
The big drawback of this is that each and every service has to open up a port on the public Internet
-
Not Synced
for people to be able to connect to it.
-
Not Synced
There are some solutions for that which I will talk about.
-
Not Synced
But the main advantage is that you have no central broker
-
Not Synced
And they got like a hundred-fold speedup over the previous implementation.
-
Not Synced
You also have an issue with service discovery
-
Not Synced
You can write a broker which gives you back your single point of failure
-
Not Synced
You can use DNS, which means that you can say "Hey, I added a new service, let's use this SRV record to get to it"
-
Not Synced
Or you can distribute a text file.
-
Not Synced
Last year, during the Google Summer of Code, I mentored Simon Chopin
-
Not Synced
who implemented the DNS solution for integration in fedmsg in Debian.
-
Not Synced
The Fedora people as they control their whole infrastructure just distribute a text file
-
Not Synced
with the list of servers that are sending fedmsg messages
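
As a rough sketch of the text-file approach: the talk does not describe the format of the file Fedora distributes, so the one-endpoint-per-line layout, the service names, and the addresses below are all assumptions made for illustration.

```python
def parse_endpoints(text):
    """Parse a hypothetical endpoint list: one 'name address' pair per line."""
    endpoints = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, address = line.split(None, 1)
        # One service may publish from several addresses.
        endpoints.setdefault(name, []).append(address.strip())
    return endpoints


# Hypothetical example content; host names and ports are made up.
EXAMPLE = """
# service   0mq address
dak         tcp://dak.example.org:3000
debbugs     tcp://bugs.example.org:3000
debbugs     tcp://bugs.example.org:3001
"""

endpoints = parse_endpoints(EXAMPLE)
```

A client would then connect only to the addresses of the services it cares about, which is the property of the mesh described earlier.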
-
Not Synced
How do you use it?
-
Not Synced
This is the Fedora topology.
-
Not Synced
I didn't have much time to do the Debian one.
-
Not Synced
It's much simpler. I'll talk about it later.
-
Not Synced
Basically, the messages are split in topics where you have a hierarchy of topics.
-
Not Synced
It's really easy to filter out the things that you want to listen to.
-
Not Synced
For instance, you can filter all the messages that concern package upload by using the dak service.
-
Not Synced
Or everything that involves a given package or something else.
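
To illustrate filtering on a topic hierarchy, here is a minimal sketch that assumes dotted topic names. The topic strings and payload fields are hypothetical, and this is plain Python rather than fedmsg's own subscription code.

```python
def matches(topic, prefix):
    """True if `topic` equals `prefix` or sits below it in the dotted hierarchy."""
    return topic == prefix or topic.startswith(prefix + ".")


def filter_messages(messages, prefix):
    """Keep only the messages whose topic falls under `prefix`."""
    return [m for m in messages if matches(m["topic"], prefix)]


# Hypothetical topics and payloads, for illustration only.
messages = [
    {"topic": "org.debian.dak.upload", "msg": {"package": "hello"}},
    {"topic": "org.debian.debbugs.report", "msg": {"package": "hello"}},
    {"topic": "org.debian.dakota.test", "msg": {"package": "other"}},
]

# Everything sent by the (hypothetical) dak service.
uploads = filter_messages(messages, "org.debian.dak")
# Everything that involves a given package, whatever the service.
about_hello = [m for m in messages if m["msg"].get("package") == "hello"]
```

Matching on `prefix + "."` rather than on the bare prefix keeps a service named `dakota` from being mistaken for `dak`.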
-
Not Synced
Publishing messages is really trivial.
-
Not Synced
From Python, you only have to import the module,
-
Not Synced
do fedmsg.publish with a dict of the data that you want to send
-
Not Synced
And that's it, your message is published.
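
A minimal sketch of the publishing step. So that it runs without fedmsg or a configured bus, the publish function is passed in as a parameter; the `announce_upload` helper, the topic, the module name, and the payload fields are assumptions made for this example.

```python
def announce_upload(publish, package, version):
    """Send one 'upload' event through the given publish callable."""
    payload = {"package": package, "version": version}
    # With fedmsg installed, `publish` would be fedmsg.publish, which takes
    # a topic, the sending module name, and a JSON-serializable dict.
    publish(topic="upload", modname="example", msg=payload)
    return payload


# Stand-in publisher that only records what it is given.
sent = []


def fake_publish(topic, modname, msg):
    sent.append((modname, topic, msg))


announce_upload(fake_publish, "hello", "2.9-1")
```

On a machine where the bus is set up, one would `import fedmsg` and pass `fedmsg.publish` in place of `fake_publish`.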
-
Not Synced
From the shell, it's really easy too.
-
Not Synced
You just have a command called fedmsg-logger that you can pipe some input to
-
Not Synced
And it goes on the bus, so it's really simple.
-
Not Synced
Receiving messages is trivial too.
-
Not Synced
In Python, you load the configuration
-
Not Synced
and you just have an iterator
-
Not Synced
(video problems, resume at 10:10)
-
Not Synced
was a replay mechanism with just a sequence number
-
Not Synced
which will have your client query the event senders for new messages that you would have missed
-
Not Synced
in case of a network failure ??
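
The recording is cut just before this point, so the following is only a toy model of a sequence-number replay, not fedmsg's actual protocol. The `Sender` and `Client` classes and their methods are hypothetical.

```python
class Sender:
    """Toy event sender that numbers its messages and keeps a history."""

    def __init__(self):
        self.history = []

    def emit(self, body):
        message = {"seq": len(self.history) + 1, "body": body}
        self.history.append(message)
        return message

    def replay(self, after, upto):
        """Return stored messages with after < seq < upto."""
        return [m for m in self.history if after < m["seq"] < upto]


class Client:
    """Toy client that notices gaps in sequence numbers and asks for a replay."""

    def __init__(self, sender):
        self.sender = sender
        self.last_seq = 0
        self.received = []

    def deliver(self, message):
        if message["seq"] > self.last_seq + 1:
            # Gap detected: fetch what was missed before handling this message.
            self.received.extend(self.sender.replay(self.last_seq, message["seq"]))
        if message["seq"] > self.last_seq:
            self.received.append(message)
            self.last_seq = message["seq"]


sender = Sender()
client = Client(sender)
client.deliver(sender.emit("first"))
sender.emit("second")  # lost on the network, never delivered
sender.emit("third")   # also lost
client.deliver(sender.emit("fourth"))
bodies = [m["body"] for m in client.received]
```

The client only needs to remember the last sequence number it saw per sender; the senders carry the cost of keeping a history to replay from.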
-
Not Synced
That's how basically the system works.
-
Not Synced
Now, what about fedmsg in Debian
-
Not Synced
During the last Google Summer of Code, a lot happened thanks to Simon Chopin's involvement
-
Not Synced
He did most of the packaging of fedmsg and its dependencies
-
Not Synced
It means that you can just apt-get install fedmsg and get it running
-
Not Synced
It's available in sid, jessie and wheezy-backports
-
Not Synced
He adapted the code of fedmsg to make it distribution agnostic
-
Not Synced
So he had a lot of support from upstream developers in Fedora to make that happen
-
Not Synced
They are really excited to have their stuff being used by Debian or by other organizations
-
Not Synced
?? fedmsg was the right solution for event notification
-
Not Synced
And finally, we bootstrapped the Debian bus by using mailing-list subscriptions
-
Not Synced
to get bug notifications and package upload notifications
-
Not Synced
and on mentors.debian.net which is a service I can control, so it's easy to add new stuff to it.
-
Not Synced
What then?
-
Not Synced
After the Google Summer of Code, there were some packaging adaptations to make it easier to run services based on fedmsg,
-
Not Synced
proper backports and maintenance of the bus
-
Not Synced
Which mostly means keeping the software up-to-date
-
Not Synced
because the upstream is really active and responsive to bug reports
-
Not Synced
It's really nice to work with them
-
Not Synced
Since July 14th 2013 which is the day we started sending messages on the bus
-
Not Synced
we had around 200k messages split across 155k bug mails and 45k uploads
-
Not Synced
which proves that Debian is a really active project, I guess
-
Not Synced
[laughs]
-
Not Synced
The latest development with fedmsg is the packaging of Datanommer
-
Not Synced
Which is a database component that can store messages that have been sent to the bus
-
Not Synced
It allows Fedora to do queries on their messages
-
Not Synced
and give people the achievements that they did like "yeah, you got a hundred build failures"
-
Not Synced
or stuff like that [laughs]
-
Not Synced
One big issue with fedmsg that I mentioned earlier is that Debian services are widely distributed
-
Not Synced
Sometimes, firewall restrictions are out of Debian's control
-
Not Synced
which is also the case with the Fedora infrastructure
-
Not Synced
because some of their servers are hosted within Red Hat
-
Not Synced
and Red Hat networking sometimes doesn't want to open firewall ports
-
Not Synced
So we need a way for services to push their messages instead of having clients pull the messages
-
Not Synced
There is a component in fedmsg which has been created by the Fedora people which is called fedmsg-relay
-
Not Synced
Which basically is just a tube where you push your message using a 0mq socket
-
Not Synced
and it then pushes it to the subscribers on the other side
-
Not Synced
It just allows you to bypass firewalls
-
Not Synced
The issue is that it uses a non-standard port and a non-standard protocol
-
Not Synced
It's just 0mq so it basically puts your data on the wire and that's it.
-
Not Synced
So, I am pondering a way for services to push their messages using more classic web services
-
Not Synced
You will take your JSON dictionary and push it by POST through HTTPS
-
Not Synced
And then after that send the message to the bus
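
The speaker presents this only as an idea, so what follows is a guess at its shape and not an existing Debian service. The sketch is a small WSGI application that accepts a POSTed JSON dictionary and hands it to a publish callable. The names `make_gateway` and `post` and the status codes are hypothetical choices, and HTTPS termination and sender authentication are left to the web server in front of it.

```python
import io
import json


def make_gateway(publish):
    """Build a WSGI app that forwards POSTed JSON objects to `publish`."""

    def app(environ, start_response):
        if environ.get("REQUEST_METHOD") != "POST":
            start_response("405 Method Not Allowed", [("Allow", "POST")])
            return [b""]
        length = int(environ.get("CONTENT_LENGTH") or 0)
        try:
            payload = json.loads(environ["wsgi.input"].read(length).decode("utf-8"))
        except ValueError:  # covers both bad UTF-8 and bad JSON
            payload = None
        if not isinstance(payload, dict):
            start_response("400 Bad Request", [("Content-Type", "text/plain")])
            return [b"expected a JSON object\n"]
        publish(payload)  # hand the message over to the bus
        start_response("202 Accepted", [("Content-Type", "text/plain")])
        return [b"queued\n"]

    return app


def post(app, body):
    """Call the WSGI app directly with a fake POST request; return the status."""
    seen = []
    environ = {
        "REQUEST_METHOD": "POST",
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }
    app(environ, lambda status, headers: seen.append(status))
    return seen[0]


bus = []  # stand-in for the real bus
gateway = make_gateway(bus.append)
ok_status = post(gateway, json.dumps({"package": "hello"}).encode("utf-8"))
bad_status = post(gateway, b"not json")
```

Because the service only makes an outgoing HTTPS request, it does not need to open a port of its own, which is the firewall problem described above.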
-
Not Synced
Which I think will make it easier to integrate with other Debian services
-
Not Synced
This was a really short talk
-
Not Synced
I hope there are some discussions afterwards
-
Not Synced
In conclusion, ??
-
Not Synced
I am really glad ??
-
Not Synced
For the moment, it's really apart from the Debian infrastructure
-
Not Synced
So the big challenge will be to try to integrate fedmsg into the Debian infrastructure
-
Not Synced
Use it for real
-
Not Synced
If you want to contact me, I am olasd
-
Not Synced
I am here for the whole conference
-
Not Synced
If you want to talk to me about it, if you want to help me,
-
Not Synced
I am a little bit alone on this project, so I'll be glad if someone would join
-
Not Synced
I'll be glad to hold a hacking session later this week
-
Not Synced
Thanks for your attention
-
Not Synced
[applause]
-
Not Synced
Was it this clear?