I am Nicolas Dandrimont. I am going to talk to you about a year of fedmsg in Debian.

We had a problem before with infrastructure in distributions. Services are a bit like people. There are dozens of services maintained by many people, and each of those services has its own way of communicating with the rest of the world. Meaning that if you want to spin up a new service that needs to talk to other services in the distribution, which is basically any service you want to include, you will need to implement a bunch of communication systems.

For instance, in the Debian infrastructure we have our archive software, which is dak, that mostly uses emails and databases to communicate. The metadata is available in an RFC822 format with no real API. The database is not public either. The build queue management software, which is wanna-build, polls a database every so often to know what needs to get built. There is no API outside of its database, which isn't public either. Our bug tracking system, which is called debbugs, works via email, stores its data in flat files, for now, and exposes a read-only SOAP API. For source control management, pushes to the distro-provided repos on alioth can trigger an IRC bot or some emails, but there is no real central notification mechanism.

We have some kludges that are available to overcome those issues. We have the Ultimate Debian Database, which contains a snapshot of a lot of the databases that are underlying the Debian infrastructure. This means that every so often, there is a cron job that runs and imports data from a service here, a service there. There is no realtime data. It's useful for distro-wide QA stuff because you don't need to have realtime data. But when you want some notification for trying to build a new package or something, that doesn't work very well, and the consistency between the data sources is not guaranteed.
We have another central notification system, which is the Package Tracking System, which is also cron-triggered or email-triggered. You can update the data from the BTS using ?? But you can subscribe to email updates on a given package. But the messages are not uniform, they can't be machine parsed. There are a few headers but they are not sufficient to know what the messages are about. And it's still not realtime.

The Fedora people invented something that could improve stuff, which is called fedmsg. It was actually introduced in 2009. It's a unified message bus that can reduce the coupling between different services of a distribution. Services can subscribe to one or several message topics, register callbacks and react to events that are triggered by all the services in the distribution.

There is a bunch of stuff that is already implemented in fedmsg. You get a stream of data with all the activity in your infrastructure, which allows you to do statistics, for instance. You decouple interdependent services because you can swap one thing with another, or just listen to the messages and start doing stuff directly without having to ?? a database or something. You can get a pluggable unified notification system that can gather all the events in the project and send them by email, by IRC, on your mobile phone, on your desktop, everywhere you want.

Fedora people use fedmsg to implement a badge system, which is some kind of gamification of the development process of the distribution. They implemented a live web dashboard. They implemented an IRC feed. And then they also got some bot bans on social networks because they were flooding.

How does it work? Well, the first idea was to use AMQP as implemented by qpid. Basically, you take all your services and you have them send their messages to a central broker, and then you have several listeners that can send messages to clients. There were a few issues with this.
Basically, you have a single point of failure at the central broker. And the brokers weren't really reliable. When they tested it under load, the brokers were tipping over ??

The actual implementation of fedmsg uses 0mq. Basically what you get is not a single broker. You get a mesh of interconnected services. Basically, you can connect only to the service that you want to listen to. The big drawback of this is that each and every service has to open up a port on the public Internet for people to be able to connect to it. There are some solutions for that, which I will talk about. But the main advantage is that you have no central broker. And they got like a hundred-fold speedup over the previous implementation.

You also have an issue with service discovery. You can write a broker, which gives you back your single point of failure. You can use DNS, which means that you can say "Hey, I added a new service, let's use this SRV record to get to it". Or you can distribute a text file. Last year, during the Google Summer of Code, I mentored Simon Chopin, who implemented the DNS solution for integration in fedmsg in Debian. The Fedora people, as they control their whole infrastructure, just distribute a text file with the list of servers that are sending fedmsg messages.

How do you use it? This is the Fedora topology. I didn't have much time to do the Debian one. It's much simpler. I'll talk about it later. Basically, the messages are split in topics, where you have a hierarchy of topics. It's really easy to filter out the things that you want to listen to. For instance, you can filter all the messages that concern package uploads by using the dak service. Or everything that involves a given package, or something else.

Publishing messages is really trivial. From Python, you only have to import the module, do fedmsg.publish with a dict of the data that you want to send, and that's it, your message is published. From the shell, it's really easy too.
You just have a command called fedmsg-logger that you can pipe some input to, and it goes on the bus, so it's really simple.

Receiving messages is trivial too. In Python, you load the configuration and you just have an iterator (video problems, resume at 10:10) was a replay mechanism with just a sequence number, which will have your client query the event senders for new messages that you would have missed in case of a network failure ?? That's basically how the system works.

Now, what about fedmsg in Debian? During the last Google Summer of Code, a lot happened thanks to Simon Chopin's involvement. He did most of the packaging of fedmsg and its dependencies. It means that you can just apt-get install fedmsg and get it running. It's available in sid, jessie and wheezy-backports. He adapted the code of fedmsg to make it distribution agnostic. He had a lot of support from upstream developers in Fedora to make that happen. They are really excited to have their stuff being used by Debian or by other organizations ?? fedmsg was the right solution for event notification. And finally, we bootstrapped the Debian bus by using mailing-list subscriptions to get bug notifications and package upload notifications, and on mentors.debian.net, which is a service I can control, so it's easy to add new stuff to it.

What then?
After the Google Summer of Code, there were some packaging adaptations to make it easier to run services based on fedmsg, proper backports, and maintenance of the bus, which mostly means keeping the software up-to-date, because the upstream is really active and responsive to bug reports. It's really nice to work with them.

Since July 14th 2013, which is the day we started sending messages on the bus, we had around 200k messages, split across 155k bug mails and 45k uploads, which proves that Debian is a really active project, I guess [laughs]

The latest development with fedmsg is the packaging of Datanommer, which is a database component that can store messages that have been sent to the bus. It allows Fedora to do queries on their messages and give people the achievements that they did, like "yeah, you got a hundred build failures" or stuff like that [laughs]

One big issue with fedmsg, as I said earlier, is that Debian services are widely distributed. Some of the time, firewall restrictions are out of Debian's control, which is also the case with the Fedora infrastructure, because some of their servers are hosted within Red Hat and Red Hat networking sometimes doesn't want to open firewall ports. So we need a way for services to push their messages instead of having clients pull the messages.

There is a component in fedmsg which has been created by the Fedora people, which is called fedmsg-relay. It basically is just a tube where you push your message using a 0mq socket and it then pushes it to the subscribers on the other side. It just allows you to bypass firewalls. The issue is that it uses a non-standard port and a non-standard protocol. It's just 0mq, so it basically puts your data on the wire and that's it.
So, I am pondering a way for services to push their messages using more classic web services. You will take your JSON dictionary and push it by POST through HTTPS, and then after that send the message to the bus. Which I think will make it easier to integrate with other Debian services.

This was a really short talk. I hope there will be some discussion afterwards. In conclusion, ?? I am really glad ?? For the moment, it's really apart from the Debian infrastructure. So the big challenge will be to try to integrate fedmsg into the Debian infrastructure. Use it for real.

If you want to contact me, I am olasd. I am here for the whole conference. If you want to talk to me about it, if you want to help me, I am a little bit alone on this project, so I'll be glad if someone would join. I'll be glad to hold a hacking session later this week.

Thanks for your attention [applause] Was it this clear?
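The HTTPS push idea mentioned in the closing part of the talk can be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing fedmsg component: the names `handle_push` and `make_handler`, the payload shape (a `topic` string plus a `msg` object), and the `publish` callback are all hypothetical, and TLS is assumed to be terminated by a web server placed in front of this code.

```python
import json
from http.server import BaseHTTPRequestHandler


def handle_push(body, publish):
    """Validate a pushed JSON document and hand it to `publish`.

    Returns an (HTTP status, text) pair. The expected shape,
    {"topic": str, "msg": dict}, is an assumption made for this sketch.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers both malformed JSON and undecodable bytes.
        return 400, "body is not valid JSON"
    if not isinstance(payload, dict):
        return 400, "expected a JSON object"
    topic, msg = payload.get("topic"), payload.get("msg")
    if not isinstance(topic, str) or not topic or not isinstance(msg, dict):
        return 400, "expected a 'topic' string and a 'msg' object"
    # `publish` is a hypothetical callback that puts the message on the bus.
    publish(topic, msg)
    return 202, "accepted"


def make_handler(publish):
    """Build a request handler class bound to a given publish callback."""

    class PushHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Read exactly the announced number of bytes from the request.
            length = int(self.headers.get("Content-Length", 0))
            status, text = handle_push(self.rfile.read(length), publish)
            self.send_response(status)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.end_headers()
            self.wfile.write(text.encode("utf-8"))

    return PushHandler
```

With something like this, a service behind a restrictive firewall would only need an outgoing HTTPS connection, and `publish` could wrap whatever actually sends the message to the bus. Serving it locally could be as small as `http.server.HTTPServer(("127.0.0.1", 8080), make_handler(publish)).serve_forever()`.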