We hit a major milestone today on Dendrite, our next-generation golang homeserver: Dendrite received its first messages!!
Before you get too excited, please understand that Dendrite is still a pre-alpha work in progress – whilst we successfully created some rooms on an instance and sent a bunch of messages into them via the Client-Server API, most other functionality (e.g. receiving messages via /sync, logging in, registering, federation etc) is yet to exist. It cannot yet be used as a homeserver. However, this is still a huge step in the right direction, as it demonstrates the core DAG functionality of Matrix is intact, and the beginnings of a usable Client-Server API are hooked up.
The architecture of Dendrite is genuinely interesting – please check out the wiring diagram if you haven’t already. The idea is that the server is broken down into a series of components which process streams of data stored in Kafka-style append-only logs. Each component scales horizontally (you can have as many as required to handle load), which is an enormous win over Synapse’s monolithic design. The components are also decoupled from one another by the logs, letting them run on entirely different machines as required. Please note that whilst the initial implementation is using Kafka for convenience, the actual append-only log mechanism is abstracted away – in future we expect to see configurations of Dendrite which operate entirely from within a single go executable (using go channels as the log mechanism), as well as alternatives to Kafka for distributed solutions.
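As a rough illustration of the idea (not Dendrite's actual code, which is written in Go), the sketch below models an append-only log and a component that consumes it while tracking its own read offset. The names `AppendOnlyLog` and `Component` are invented for this example, and an in-memory list stands in for Kafka.

```python
from dataclasses import dataclass, field


@dataclass
class AppendOnlyLog:
    """In-memory stand-in for a Kafka-style topic: entries are only ever appended."""
    entries: list = field(default_factory=list)

    def append(self, entry) -> int:
        self.entries.append(entry)
        return len(self.entries) - 1  # offset of the entry just written

    def read_from(self, offset: int) -> list:
        return self.entries[offset:]


@dataclass
class Component:
    """A consumer that remembers how far through its input log it has read."""
    name: str
    source: AppendOnlyLog
    sink: AppendOnlyLog
    offset: int = 0

    def poll(self) -> int:
        # Read everything written since the last poll and emit results downstream.
        batch = self.source.read_from(self.offset)
        for entry in batch:
            self.sink.append({"processed_by": self.name, "event": entry})
        self.offset += len(batch)
        return len(batch)


input_log = AppendOnlyLog()
output_log = AppendOnlyLog()
roomserver = Component("roomserver", input_log, output_log)

input_log.append({"type": "m.room.message", "body": "hello"})
input_log.append({"type": "m.room.message", "body": "world"})
handled = roomserver.poll()        # consumes both new entries
handled_again = roomserver.poll()  # nothing new since the last poll
```

Because a component only needs the log and its own offset, the producer and consumer never call each other directly, which is what allows them to run on different machines.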
The components which have taken form so far are the central roomserver service, which is responsible (as the name suggests) for maintaining the state and integrity of one or more rooms – authorizing events into the room DAG; storing them in postgres; tracking the auth chain of events where needed; etc. Much of the core matrix DAG logic of the roomserver is provided by gomatrixserverlib. The roomserver receives events sent by users via the ‘client room send’ component (and ‘federation backfill’ component, when that exists). The ‘client room send’ component (and in future also ‘client sync’) is provided by the clientapi service – which is what as of today is successfully creating rooms and events and relaying them to the roomserver!
The actual events we’ve been testing with are the history of the Matrix Core room: around 10k events. Right now the roomserver (and the postgres DB that backs it) are the main bottleneck in the pipeline rather than clientapi, so it’s been interesting to see how rapidly the roomserver can consume its log of events. As of today’s benchmark, on a generic dev workstation and an entirely unoptimised roomserver (i.e. no caching whatsoever) running on a single core, we’re seeing it ingest the room history at over 350 events per second. The vast majority of this work is going into encoding/decoding JSON or waiting for postgres: with a simple event cache to avoid repeatedly hitting the DB for every auth and state event, we expect this to increase significantly. And then as we increase the number of cores, kafka partitions and roomserver instances it should scale fairly arbitrarily(!)
For context, the main synapse process for Matrix.org currently maxes out persisting events at between 15 and 20 per second (although it is also spending a bunch of time relaying events to the various worker processes, and other miscellanies). As such, an initial benchmark for Dendrite of 350 msgs/s really does look incredibly promising.
You may be wondering where this leaves Synapse? Well, a major driver for implementing Dendrite has been to support the growth of the main matrix.org server, which currently persists around 10 events/s (whilst emitting around 1500 events/s). We have exhausted most of the low-hanging fruit for optimising Synapse, and have got to the point where the architectural fixes required are of similar shape and size to the work going into Dendrite. So, whilst Synapse is going to be around for a while yet, we’re putting the majority of our long-term plans into Dendrite, with a distinct degree of urgency as we race against the ever-increasing traffic levels on the Matrix.org server!
Finally, you may be wondering what happened to Dendron, our original experiment in Golang servers. Well: Dendron was an attempt at a strangler pattern rewrite of Synapse, acting as a shim in front of Synapse which could gradually swap out endpoints with their golang implementations. In practice, the operational complexity it introduced, as well as the amount of room for improvement (at the time) we had in Synapse, and the relatively tight coupling to Synapse’s existing architecture, storage & schema, meant that it was far from a clear win – and effectively served as an excuse to learn Go. As such, we’ve finally formally killed it off as of last week – Matrix.org is now running behind a normal haproxy, and Dendron is dead. Meanwhile, Dendrite (aka Dendron done Right ;) is very much alive, progressing fast, free from the shackles of Synapse.
We’ll try to keep the blog updated with progress on Dendrite as it continues to grow!
Hi everyone. I’m Kegan, one of the core developers at matrix.org. This is the first in a series on the matrix.org IRC bridge. The aim of this series is to try to give a behind the scenes look at how the IRC bridge works, what kinds of problems we encountered, and how we plan to scale in the future. This post looks at how the IRC bridge actually works.
Firstly, what is “bridging”? The simple answer is that it is a program which maps between different messaging protocols so that users on different protocols can communicate with each other. Some protocols may have features which are not supported in the other (typing notifications in Matrix, DCC – direct file transfers – in IRC). This means that bridging will always be “inferior” to just using the respective protocol. That being said, where there is common ground a bridge can work well; all messaging protocols support sending and receiving text messages for example. As we’ll see however, the devil is in the detail…
A lot of existing IRC bridges for different protocols share one thing in common: they use a single global bot to bridge traffic. This bot listens to all messages from IRC, and sends them to the other network. The bot also listens for messages from users on the other network, and sends messages on their behalf to IRC. This is a lot easier than having to maintain dedicated TCP connections for each user. However, it isn’t a great experience for IRC users as they:
- Don’t know who is reading messages on a channel as there is just 1 bot in the membership list.
- Cannot PM users on the other network.
- Cannot kick/ban users on the other network without affecting everyone else.
- Cannot bing/mention users on the other network easily (tab completion).
We made the decision very early on that we would keep dedicated TCP connections for each Matrix user. This means every Matrix user has their own tiny IRC client. This has its own problems:
- It involves multiple connections to the IRCd so you need special permission to set up an i:line.
- You need to be able to support identification of individual users (via ident or unique IPv6 addresses).
- With all these connections to the same IRC channels, you need to have some way to identify which incoming messages have already been handled and which have not.
So now that we have a way to send and receive messages, how do we map the rooms/channels between protocols? This isn’t as easy as you may think. We can have a single static one-to-one mapping:
- All messages to #channel go to !abcdef:matrix.org.
- All messages from !abcdef:matrix.org go to #channel.
- All PMs between @alice:matrix.org and Bob go to !wxyz:matrix.org and the respective PM on IRC.
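A static one-to-one mapping like this can be sketched as a pair of lookup tables. This is only an illustration: the table and function names are invented, and pairing #channel with !abcdef:matrix.org is an assumption made for the example rather than the bridge's actual configuration format.

```python
from typing import Optional

# Hypothetical static configuration, using the example identifiers from the text.
CHANNEL_TO_ROOM = {"#channel": "!abcdef:matrix.org"}
# The reverse direction is derived, so the mapping stays strictly one-to-one.
ROOM_TO_CHANNEL = {room: channel for channel, room in CHANNEL_TO_ROOM.items()}

# PM rooms are keyed by the (Matrix user, IRC nick) pair taking part in them.
PM_ROOMS = {("@alice:matrix.org", "Bob"): "!wxyz:matrix.org"}


def room_for_channel(channel: str) -> Optional[str]:
    """Matrix room that receives messages sent to this IRC channel."""
    return CHANNEL_TO_ROOM.get(channel)


def channel_for_room(room_id: str) -> Optional[str]:
    """IRC channel that receives messages sent to this Matrix room."""
    return ROOM_TO_CHANNEL.get(room_id)


def pm_room_for(matrix_user: str, irc_nick: str) -> Optional[str]:
    """Matrix room used for PMs between one Matrix user and one IRC nick."""
    return PM_ROOMS.get((matrix_user, irc_nick))
```

Anything not in the tables returns `None`, which a caller would treat as "not bridged".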
In order to make PMs secure, we need to limit who can access the room. This is done by making the Matrix PM room “invite-only”. This can cause problems though if the Matrix user ever leaves that room: they won’t be able to ever re-join! The IRC bridges get around this by allowing Matrix users to replace their dedicated PM room with a new room, and by checking to make sure that the Matrix user is inside the room before sending messages.
Then you have the problem of “ownership” of rooms. Who should be able to kick users in a bridged room? There are two main scenarios to consider:
- The IRC channel has existed for a while and there are existing IRC channel operators.
- The IRC channel does not exist, but there are existing Matrix moderators.
In the first case, we want to defer ownership to the channel operators. This is what happens by default for all bridged IRC channels on matrix.org. The Matrix users have no power in the room, and are at the mercy of the IRC channel operators. The channel operators are represented by virtual Matrix users in the room. However, they do not have any power level: they are at the same level as real Matrix users. Why? The bridge does this because, unlike IRC, it’s not possible in Matrix to bring a user to the same level as yourself (e.g. +o), and then downgrade them back to a regular user (e.g. -o). Instead, the bridge bot itself acts as a custodian for the room, and performs privileged IRC operations (topic changing, kickbans, etc) on the IRC channel operator’s behalf.
In the second case, we want to defer ownership to the Matrix moderators. This is what happens when you “provision a room” in Matrix. The bridge will PM a currently online channel operator and ask for their permission to bridge to Matrix. If they accept, the bridge is made and the power levels in the pre-existing Matrix room are left untouched, giving moderators in Matrix control over the room. However, this power doesn’t extend completely to IRC. If a Matrix moderator grants moderator powers to another Matrix user, this will not be mapped to IRC. Why? It’s not possible for the bridge to give chanops to any random user on any random IRC channel, so it cannot always honour the request. This relies on the humans on either side of the bridge to communicate and map power accordingly. This is done on purpose as there is no 100% perfect mapping between IRC powers and Matrix powers: it’s always going to need a compromise which only a human can make.
Finally, there is the problem of one-to-many mappings. It is possible to have two Matrix rooms bridged to the same IRC channel. The problem occurs when a Matrix user in one room speaks. The bridge can easily map that to IRC, but unless it also maps it back to Matrix, the message will never make it to the 2nd Matrix room. The bridge cannot control/puppet the Matrix user who spoke, so instead it creates a virtual Matrix user to represent that real Matrix user and then sends the message into the 2nd Matrix room. Needless to say, this can be quite confusing and we strongly discourage one-to-many mappings for this reason.
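To make the one-to-many case concrete, here is a minimal sketch of the fan-out it requires. The function name, the tuple layout and the `virtual:` prefix are all placeholders invented for this example. It only shows that one message turns into one IRC delivery plus one virtual-user delivery for every other bridged room.

```python
from typing import Dict, List, Tuple

# Each delivery is (destination, sender as seen at the destination, body).
Delivery = Tuple[str, str, str]


def fan_out(origin_room: str, sender: str, body: str,
            rooms_for_channel: Dict[str, List[str]]) -> List[Delivery]:
    deliveries: List[Delivery] = []
    for channel, rooms in rooms_for_channel.items():
        if origin_room not in rooms:
            continue
        # The message goes to IRC as the real Matrix user's own connection...
        deliveries.append((channel, sender, body))
        # ...and to every *other* bridged room as a virtual stand-in user,
        # because the bridge cannot speak as the real Matrix user there.
        for room in rooms:
            if room != origin_room:
                deliveries.append((room, f"virtual:{sender}", body))
    return deliveries


bridged = {"#channel": ["!abcdef:matrix.org", "!second:matrix.org"]}
result = fan_out("!abcdef:matrix.org", "@alice:matrix.org", "hi", bridged)
```

The second room sees the message from a different sender than the first, which is the source of the confusion described above.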
Mapping Matrix messages to IRC is rather easy for the most part. Messages are passed from the Homeserver to the bridge via the AS API, and the bridge sends a textual representation of the message to IRC using the IRC connection for that Matrix user. The exact form of the text for images, videos and long text can be quite subjective, and there is inevitably some data loss along the way. For example, you can send big text headings, tables and lists in Matrix, but there is no equivalent on IRC. Thankfully, most Matrix users are sending the corresponding markdown and so the formatting can be reasonably preserved by just sending the plaintext (markdown) body.
Mapping IRC messages to Matrix is more difficult: not because it’s hard to represent the message in Matrix, but because of the architecture of the bridge. The bridge maintains separate connections for each Matrix user. This means the bridge might have, for example, 5 users (and hence connections) on the same channel. When an IRC user sends a message, the bridge gets 5 copies of the message. How does the bridge know:
- If the message has already been sent?
- If the message is an intentional duplicate?
The IRC protocol does not have message IDs, so the bridge cannot de-duplicate messages as they arrive. Instead, it “nominates” a single user’s connection to be responsible for delivering messages from that channel. This introduces another problem though. Long-lived TCP connections are fickle things, and can fail without any kind of visible warning until you try to send bytes down them. If a user’s connection drops, another user needs to take over responsibility for delivering messages. This is what the “IRC Event Broker” class does. It allows users to “steal” messages if the bridge has any indication that the connection in charge has dropped. This technique has worked well for us, and gives us the ability to have more robust connections to the channel than with one TCP connection alone.
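The nominate-and-steal idea can be sketched roughly as follows. This is not the bridge's real “IRC Event Broker” class: the class name, method names and bookkeeping here are assumptions made to illustrate the technique.

```python
class EventBroker:
    """Decides which connection's copy of a channel message gets delivered."""

    def __init__(self) -> None:
        self._owner = {}  # channel -> connection currently nominated for it
        self._alive = {}  # connection -> whether it is believed to be connected

    def join(self, channel: str, conn_id: str) -> None:
        self._alive[conn_id] = True
        # The first connection to join a channel is nominated for it.
        self._owner.setdefault(channel, conn_id)

    def mark_dropped(self, conn_id: str) -> None:
        # Called when the bridge has any indication the connection has failed.
        self._alive[conn_id] = False

    def should_deliver(self, channel: str, conn_id: str) -> bool:
        owner = self._owner.get(channel)
        if owner is not None and self._alive.get(owner, False):
            # A healthy owner exists: only its copy is forwarded to Matrix.
            return owner == conn_id
        # No owner, or the owner has dropped: this connection "steals" the channel.
        self._owner[channel] = conn_id
        return True


broker = EventBroker()
broker.join("#matrix", "alice")
broker.join("#matrix", "bob")
# Both connections see the same IRC message; only the nominated copy is delivered.
first_round = [broker.should_deliver("#matrix", c) for c in ("alice", "bob")]
broker.mark_dropped("alice")
stolen = broker.should_deliver("#matrix", "bob")  # bob takes over delivery
```

Note that this sketch cannot tell an intentional duplicate from a redundant copy any better than the prose describes: it sidesteps the question by only ever listening to one connection per channel.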
Admin rooms are private Matrix rooms between a real Matrix user and the bridge bot. They allow the Matrix user to control their connection to IRC. They allow:
- The IRC nick to be changed.
- The ability to issue /whois commands.
- The ability to bypass the bridge and send raw IRC commands directly down the TCP connection (e.g. MODE commands).
- The ability to save a NickServ password for use when the bridge reconnects you.
- The ability to disconnect from the network entirely.
To perform these actions, Matrix users send a text message which starts with a command name, e.g. !whois $ARG. Like all commands, you expect to get a reply once you’ve issued it. However, IRC makes this extremely difficult to do. There is no request/response pair like there is with HTTP requests. Instead, the IRC server may:
- Ignore the request entirely.
- Send an error you’re aware of (in the RFC/most servers).
- Send some information which can be assumed to indicate success.
- Send an error you’re unaware of.
- Send some information which sometimes indicates success.
This makes it very difficult to know if a request succeeded or failed, and I’ll go into more detail in the next post which focuses on problems we’ve encountered when developing the IRC bridge. This room is also used to inform the Matrix user about general information about their IRC connection, such as when their connection has been lost, or if there are any errors (e.g. “requires chanops to do this action”). The bridge makes no effort to parse these errors, because it doesn’t always know what caused the error to happen.
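The command format itself is simple to parse. The sketch below is illustrative only: `!whois` is the one command named in this post, and the other command names in `KNOWN_COMMANDS` are placeholders for the actions listed earlier, not a statement of what the bridge actually accepts.

```python
from typing import List, NamedTuple, Optional


class AdminCommand(NamedTuple):
    name: str
    args: List[str]


# Only "whois" comes from the text; the remaining names are hypothetical.
KNOWN_COMMANDS = {"whois", "nick", "raw", "storepass", "quit"}


def parse_admin_command(body: str) -> Optional[AdminCommand]:
    """Parse an admin-room message such as "!whois Bob" into a command."""
    body = body.strip()
    if not body.startswith("!"):
        return None  # ordinary chatter in the admin room, not a command
    parts = body[1:].split()
    if not parts:
        return None  # a bare "!" with no command name
    name = parts[0].lower()
    if name not in KNOWN_COMMANDS:
        return None
    return AdminCommand(name, parts[1:])
```

Parsing is the easy half: as described above, working out whether the IRC server actually honoured the resulting request is where the difficulty lies.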
Developing a comprehensive IRC bridge is a very difficult task. This post has outlined a few of the ways in which we’ve designed our bridge, and some of the general problems in this field. The bridge is constantly improving as we discover new edge cases with the plethora of IRCd implementations out there. The next post will look at some of these edge cases and look back at some previous outages and examine why they occurred.
Bridges come in many flavours, and we need consistent terminology within the Matrix community to ensure everyone (users, developers, core team) is on the same page. This post is primarily intended for bridge developers to refer to when building bridges.
The most recent version of this document is here (source) but we’re also posting it as a blog post for visibility.
Types of rooms
Bridges can register themselves as controlling chunks of room aliases namespace, letting Matrix users join remote rooms transparently if they /join #freenode_#wherever:matrix.org or similar. The resulting Matrix room is typically automatically bridged to the single target remote room. Access control for Matrix users is typically managed by the remote network’s side of the room. This is called a portal room, and is useful for jumping into remote rooms without any configuration needed whatsoever – using Matrix as a ‘bouncer’ for the remote network.
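As a sketch of how an alias inside such a namespace might be resolved to a remote room, the snippet below matches aliases shaped like the example above. The regular expression and function name are assumptions for illustration, and real bridges define their own namespace patterns.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern for aliases such as #freenode_#wherever:matrix.org:
# a network prefix, an underscore, the remote channel name, then the server name.
ALIAS_PATTERN = re.compile(
    r"^#(?P<network>[a-z0-9]+)_(?P<channel>#[^:]+):(?P<server>.+)$"
)


def parse_portal_alias(alias: str) -> Optional[Tuple[str, str]]:
    """Return (network, remote channel) if the alias is in the bridge's namespace."""
    match = ALIAS_PATTERN.match(alias)
    if match is None:
        return None  # not ours: leave the alias to the homeserver
    return match.group("network"), match.group("channel")
```

An alias such as #matrix:matrix.org falls outside the pattern and is left alone, which is how ordinary rooms and portal rooms can share one alias space.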
Alternatively, an existing Matrix room can be plumbed into one or more specific remote rooms by configuring a bridge (which can be run by anyone). For instance, #matrix:matrix.org is plumbed into #matrix on Freenode, matrixdotorg/#matrix on Slack, etc. Access control for Matrix users is necessarily managed by the Matrix side of the room. This is useful for using Matrix to link together different communities.
Migrating rooms between a portal & plumbed room is currently a bit of a mess, as there’s not yet a way for users to remove portal rooms once they’re created, so you can end up with a mix of portal & plumbed users bridged into a room, which looks weird from both the Matrix and non-Matrix viewpoints. https://github.com/matrix-org/matrix-appservice-irc/issues/387 tracks this.
Types of bridges (simplest first):
Bridgebot-based bridges
The simplest way to exchange messages with a remote network is to have the bridge log into the network using one or more predefined users called bridge bots – typically called MatrixBridge or similar. These relay traffic on behalf of the users on the other side, but it’s a terrible experience as all the metadata about the messages and senders is lost. This is how the telematrix matrix<->telegram bridge currently works.
Bot-API (aka Virtual user) based bridges
Some remote systems support the idea of injecting messages from ‘fake’ or ‘virtual’ users, which can be used to represent the Matrix-side users as unique entities in the remote network. For instance, Slack’s inbound webhooks let remote bots be created on demand, letting Matrix users be shown cosmetically correctly in the timeline as virtual users. However, the resulting virtual users aren’t real users on the remote system, so don’t have presence/profile and can’t be tab-completed or direct-messaged etc. They also have no way to receive typing notifs or other richer info which may not be available via bot APIs. This is how the current matrix-appservice-slack bridge works.
Simple puppeted bridge
This is a richer form of bridging, where the bridge logs into the remote service as if it were a real 3rd party client for that service. As a result, the Matrix user has to already have a valid account on the remote system. In exchange, the Matrix user ‘puppets’ their remote user, such that other users on the remote system aren’t even aware they are speaking to a user via Matrix. The full semantics of the remote system are available to the bridge to expose into Matrix. However, the bridge has to handle the authentication process to log the user into the remote service.
This is essentially how the current matrix-appservice-irc bridge works (if you configure it to log into the remote IRC network as your ‘real’ IRC nickname). matrix-appservice-gitter is being extended to support both puppeted and bridgebot-based operation. It’s how the experimental matrix-appservice-tg bridge works.
Going forwards we’re aiming for all bridges to be at least simple puppeted, if not double-puppeted.
Double-puppeted bridge
A simple ‘puppeted bridge’ allows the Matrix user to control their account on their remote network. However, ideally this puppeting should work in both directions, so if the user logs into (say) their native telegram client and starts conversations, sends messages etc, these should be reflected back into Matrix as if the user had done them there. This requires the bridge to be able to puppet the Matrix side of the bridge on behalf of the user.
This is the holy-grail of bridging; matrix-puppet-bridge is a community project that tries to facilitate development of double puppeted bridges, having done so for several networks. The main obstacle is working out an elegant way of having the bridge auth with Matrix as the matrix user (which requires some kind of scoped access_token delegation).
Server-to-server bridging
Some remote protocols (IRC, XMPP, SIP, SMTP, NNTP, GnuSocial etc) support federation – either open or closed. The most elegant way of bridging to these protocols would be to have the bridge participate in the federation as a server, directly bridging the entire namespace into Matrix.
We’re not aware of anyone who’s done this yet.
Sidecar bridge
Finally: the types of bridging described above assume that you are synchronising the conversation history of the remote system into Matrix, so it may be decentralised and exposed to multiple users within the wider Matrix network.
This can cause problems where the remote system may have arbitrarily complicated permissions (ACLs) controlling access to the history, which will then need to be correctly synchronised with Matrix’s ACL model, without introducing security issues such as races. We already see some problems with this on the IRC bridge, where history visibility for +i and +k channels has to be carefully synchronised with the Matrix rooms.
You can also hit problems with other network-specific features not yet having equivalent representation in the Matrix protocol (e.g. ephemeral messages, or op-only messages – although arguably that’s a type of ACL).
One solution could be to support an entirely different architecture of bridging, where the Matrix client-server API is mapped directly to the remote service, meaning that ACL decisions are delegated to the remote service, and conversations are not exposed into the wider Matrix. This is effectively using the bridge purely as a 3rd party client for the network (similar to Bitlbee). The bridge is only available to a single user, and conversations cannot be shared with other Matrix users as they aren’t actually Matrix rooms. (Another solution could be to use Active Policy Servers at last as a way of centralising and delegating ACLs for a room)
This is essentially an entirely different product to the rest of Matrix, and whilst it could be a solution for some particularly painful ACL problems, we’re focusing on non-sidecar bridges for now.
Hey everyone! As of last week, we are now bridging irc.gimp.org (GIMPNet) for all your GTK+/GNOME needs! It’s running a bleeding-edge version of the IRC bridge which supports basic chanops syncing from IRC to Matrix. This means that if an IRC user gives chanops to a Matrix connection, the bridge will give that Matrix user moderator privileges in the room, allowing them to set the room topic/avatar/alias/etc! We hope this will make customising Matrix-bridged rooms a lot easier.
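The chanops syncing described above boils down to translating an IRC mode change into a Matrix power level change. The sketch below is a guess at the shape of that logic, not the bridge's code: the function name is invented, and 50 is assumed as the moderator level because that is the conventional value in Matrix rooms.

```python
MODERATOR_LEVEL = 50  # assumed: the conventional "moderator" power level


def apply_mode_change(power_levels: dict, mode: str, matrix_user_id: str) -> dict:
    """Return a copy of the room's power levels after an IRC mode change."""
    users = dict(power_levels.get("users", {}))
    if mode == "+o":
        # Never lower someone who already holds a higher level in the room.
        current = users.get(matrix_user_id, 0)
        users[matrix_user_id] = max(current, MODERATOR_LEVEL)
    # Other modes are ignored here: only granting chanops is described above.
    return {**power_levels, "users": users}


before = {"users": {"@bridgebot:matrix.org": 100}}
after = apply_mode_change(before, "+o", "@alice:matrix.org")
```

The function returns a new dictionary rather than mutating its input, so the caller can compare old and new levels before sending anything to the homeserver.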
For a more complete list of current and future bridged IRC networks, see the official wishlist.
We are very happy to again be one of the organisations selected for Google Summer of Code (GSoC)!
Last year we had two students working on Matrix projects over the summer – you can read the retrospective here – and now we are again offering students the chance to work on Matrix as part of GSoC! Currently we are in the stage where students can propose interesting project ideas to any of the open source organisations picked by Google. Of course, we encourage students to get in touch with us and discuss their ideas before writing their application – please come say hi in the #gsoc:matrix.org room!
We are very eager to see what ideas students come up with. We have added our own ideas here, but students are expected to do some research and come up with their own ideas for projects. We have also written down some general tips on what to include in the application.
Applications can be submitted from March 20th, so there’s still plenty of time to have a play with Matrix and come up with a cool project idea!
We just pushed Synapse 0.19.2, which contains a single but important fix for protecting event visibility when accessed via the /context API. Please upgrade from https://github.com/matrix-org/synapse!
Changes in synapse v0.19.2 (2017-02-20)
- Fix bug with event visibility check in /context/ API. Thanks to Tokodomo for pointing it out! (PR #1929)
Since FOSDEM we’ve seen even more interest in Matrix than normal, and we’ve been having some problems getting the Matrix.org homeserver to keep up with demand. This has resulted in performance being slightly slower than normal at peak times, but the main impact has been the additional traffic exacerbating outages on the homeserver – either by revealing new failure modes, or making it harder to recover rapidly after something goes wrong.
Specifically: on Friday afternoon we had a service disruption caused by someone sending an unusual event into Matrix HQ. It turns out that both matrix-android-sdk and matrix-ios-sdk based clients (e.g. Riot/Android and iOS) handled this naively by simply resyncing the room state… which has been fine in the past, but not when you have several hundred clients actively syncing the room, and resulted in a thundering herd effect which overloaded the server for ~10 mins or so whilst they all resynced the room (which, in turn, nowadays, involves calculating and syncing several MB of JSON state to each client). The traffic load was then high enough that it took a further 10-20 minutes for the server to fully catch up and recover after the herd had dissipated. We then had a repeat performance on Monday morning of the same failure mode.
Similarly, we had disruption last night after a user who hadn’t used the service for ages logged on for the first time and rapidly caught up on a few rooms which literally had *millions* of unread messages in them. Generally this would be okay, but the combination of loaded DB and the sheer number of notifications being deleted ended up with 4 long-running DB deletes in parallel. This seems to have caused postgres to lock the event_push_actions table more aggressively than we’d expect, blocking other queries which were trying to access it… causing most requests to block until the deletes were over. At the current traffic volumes this meant that the main synapse process tried to serve thousands of simultaneous requests as they stacked up, ran out of filehandles within about 10 minutes, and wedged the whole synapse solid before the DB could unblock. Irritatingly, it turns out our end-to-end monitoring has a bug where it in turn can crash on receiving a 500 from synapse, so despite having PagerDuty all set up and running (and having been receiving pages for traffic delays over the last few weeks)… we didn’t get paged when we got actual failed traffic rather than slow traffic, which delayed resolving the issue. Finally, whilst rolling out a fix this afternoon, we again hit issues with the traffic load causing more problems than we were expecting, making a routine redeploy distinctly more disruptive.
So, what are we doing about this?
- Fix the root causes:
- The ‘android/iOS thundering herd’ bug is being addressed on both the android/iOS side (fixing the naive behaviour) and the server side. A temporary mitigation is now in place which moves the server-side code to worker processes so that in the worst case it can’t take out the main synapse process and can scale better.
- The ‘event_push_actions table is inefficient’ bug had already been fixed – so this was a matter of rushing through the hotfix to matrix.org before we saw a recurrence.
- Move to faster hardware. Our current DB master is a “fast when we bought it 5 years ago” machine whose IO is simply starting to saturate (6x 300GB 10krpm disks in RAID5, fwiw), which is maxing out at around 500IOPS and 20MB/s of random access, and acting as a *very* hard limit to the current synapse performance. We’re currently in the process of evaluating SSD-backed IO for the DB (in fact, we’re already running a DB slave), and assuming this tests out okay we’re hoping to migrate next week, which should give us a 10x-20x speed up on disk IO and buy considerable headroom. Watch this space for details.
- Make synapse faster. We’re continuing to plug away at optimisations (e.g. stuff like this), but these are reaching the point of diminishing returns, especially relative to the win from faster hardware.
- Fix the end-to-end monitoring. This already happened.
- Load-test before deploying. This is hard, as you really need to test against precisely the same traffic profile as live traffic, and that’s hard to simulate. We’re thinking about ways of fixing this, but the best solution is probably going to be clustering and being able to do incremental redeploys to gradually test new changes. On which note:
- Fix synapse’s architectural deficiencies to support clustering, allowing for rolling zero-downtime redeploys, and better horizontal scalability to handle traffic spikes like this. We’re choosing not to fix this in synapse, but we are currently in full swing implementing dendrite as a next-generation homeserver in Golang, architected from the outset for clustering and horizontal scalability. N.B. most of the exciting stuff is happening on feature branches and gomatrixserverlib atm. Also, we’re deliberately taking the time to try to get it right this time, unlike bits of synapse which were something of a rush job. It’ll be a few weeks before dendrite is functional enough to even send a message (let alone finish the implementation), but hopefully faster hardware will give the synapse deployment on matrix.org enough headroom for us to get dendrite ready to take over when the time comes!
The good news of course is that you can run your own synapse today to avoid getting caught up in this operational fun & games, and unless you’re planning to put tens of thousands of daily active users on the server you should be okay!
Meanwhile, please accept our apologies for the instability and be assured that we’re doing everything we can to get out of this turbulence as rapidly as possible.
We’re a little late with this, but Synapse 0.19.1 was released last week. The only change is a bugfix to a regression in room state replication that snuck in during the performance improvements that landed in 0.19.0. Please upgrade if you haven’t already. We’ve also fixed the Debian repository to make installing Synapse easier on Jessie by including backported packages for stuff like Twisted where we’re forced to use the latest releases.
You can grab it from https://github.com/matrix-org/synapse/ as always.
Changes in synapse v0.19.1 (2017-02-09)
- Fix bug where state was incorrectly reset in a room when synapse received an event over federation that did not pass auth checks (PR #1892)
FOSDEM this year was even more crazy and incredible than ever – with attendance up from 6,000 to 9,000 folks, it’s almost impossible to describe the atmosphere. Matt Jordan from Asterisk describes it as DisneyWorld for OSS Geeks, but it’s even more than that: it’s basically a corporeal representation of the whole FOSS movement. There is no entrance fee; there is no intrusive sponsorship; there is no corporate presence: it’s just a venue for huge numbers of FOSS projects and their users and communities to come together in one place (the Université Libre de Bruxelles) and talk and learn. Imagine if someone built a virtual world with storefronts for every open source project imaginable, where you could chat to the core team, geek out with other users, or gather in auditoriums to hear updates on the latest projects & ideas. Well, this is FOSDEM… except even better, it’s in real life. With copious amounts of Belgian beer.
Anyway: this year we had our normal stand on the 2nd floor of K building, sharing the Realtime Lounge chill-out space with the XMPP Standards Foundation. This year we had a larger representation than ever before with Matthew, Erik and Luke from the London team as well as Manu & Yannick from Rennes – which is just as well given all 5 of us ended up speaking literally non-stop from 10am to 6pm on both Saturday & Sunday (and then into the night as proceedings deteriorated/evolved into an impromptu Matrix meetup with Coffee, uhoreg, tadzik, realitygaps and others!). The level of interest at the Matrix booth was frankly phenomenal: a major change from the last two FOSDEMs in that this year pretty much everyone had already heard of Matrix, and were most likely to want to enthuse about features and bugs in Synapse or Riot, or geek out about writing new bridges/bots/clients, or trying to work out a way to incorporate Matrix into their own projects or companies.
Synapse 0.19 and Riot 0.9.7 were also released on Saturday to try to ensure that anyone joining Matrix for the first time at FOSDEM was on the latest & greatest code – especially given the performance and E2E fixes present in both. Amazingly the last-minute release didn’t backfire: if you haven’t upgraded to Synapse 0.19 we recommend doing so asap. And if you’re a Riot user, make sure you’re on the latest version :)
We were very lucky to have two talks accepted this year: the main one in the Security Track on the Jansen main stage telling the tale of how we added end-to-end encryption to Matrix via Olm & Megolm – and the other in the Decentralised Internet room (AW1.125), focusing on the unsolved future problems of decentralised accounts, identity and reputation in Matrix. Both talks were well attended, with huge queues for the Decentralised Internet room: we can only apologise to everyone who queued for 20+ minutes only to still not be able to get in. Hopefully next year FOSDEM will allocate a larger room for decentralisation! On the plus side, this year FOSDEM did an amazing job of videoing the sessions – livestreaming every talk, and automatically publishing the recordings (via a fantastic ‘publish your own talk’ web interface) – so many of the people who couldn’t get into the room (as well as the rest of the world) were able to watch it live anyway via the stream.
You can watch the videos of the talks from the FOSDEM website here and here. Both talks necessarily include similar exposition for folks unfamiliar with Matrix, so apologies for the duplication – also, the “future of decentralised communication” talk ended up a bit rushed; 20 minutes is not a lot of time to both explain Matrix and give an overview of the challenges we face in fixing spam, identity, moderation etc. But if you like hearing overenthusiastic people talking too fast about how amazing Matrix is, you may wish to check out the videos :) You can also get at the slides as PDF here (E2E Encryption) and here (Future of Decentralisation).
Huge thanks to everyone who came to the talks or came and spoke to us at the stand or around the campus. We had an amazing time, and are already looking forward to next year!
Matthew & the team
We’re happy to announce the release of Synapse 0.19.0 (same as 0.19.0-rc4) today, just in time for anyone discovering Matrix for the first time at FOSDEM 2017! In fact, here’s Erik doing the release right now (with moral support from Luke):
This is a pretty big release, with a bunch of new features and lots and lots of debugging and optimisation work following on from some of the dramas that we had with 0.18 over the Christmas break. The biggest things are:
- IPv6 Support (unless you have an IPv6 only resolver), thanks to contributions from Glyph from Twisted and Kyrias!
- A new API for tracking the E2E devices present in a room (required for fixing many of the remaining E2E bugs…)
- A rewrite of the ‘state resolution’ algorithm to be orders of magnitude more performant
- Lots of tuning to the caching logic.
If you’re already running a server, please upgrade! And if you’re not, go grab yourself a brand new Synapse from GitHub. Debian packages will follow shortly (as soon as Erik can figure out the necessary backporting required for Twisted 16.6.0).
And here’s the full changelog…
No changes since RC 4.
Changes in synapse v0.19.0-rc4 (2017-02-02)
- Bump cache sizes for common membership queries (PR #1879)
Changes in synapse v0.19.0-rc3 (2017-02-02)
- Fix email push in pusher worker (PR #1875)
- Make presence.get_new_events a bit faster (PR #1876)
- Make /keys/changes a bit more performant (PR #1877)
Changes in synapse v0.19.0-rc2 (2017-02-02)
- Include newly joined users in /keys/changes API (PR #1872)
Changes in synapse v0.19.0-rc1 (2017-02-02)
- Add support for specifying multiple bind addresses (PR #1709, #1712, #1795, #1835). Thanks to @kyrias!
- Add /account/3pid/delete endpoint (PR #1714)
- Add config option to configure the Riot URL used in notification emails (PR #1811). Thanks to @aperezdc!
- Add username and password config options for turn server (PR #1832). Thanks to @xsteadfastx!
- Implement device lists updates over federation (PR #1857, #1861, #1864)
- Implement /keys/changes (PR #1869, #1872)
- Improve IPv6 support (PR #1696). Thanks to @kyrias and @glyph!
- Log which files we saved attachments to in the media_repository (PR #1791)
- Linearize updates to membership via PUT /state/ to better handle multiple joins (PR #1787)
- Limit number of entries to prefill from cache on startup (PR #1792)
- Remove full_twisted_stacktraces option (PR #1802)
- Measure size of some caches by sum of the size of cached values (PR #1815)
- Measure metrics of string_cache (PR #1821)
- Reduce logging verbosity (PR #1822, #1823, #1824)
- Don’t clobber a displayname or avatar_url if provided by an m.room.member event (PR #1852)
- Better handle 401/404 response for federation /send/ (PR #1866, #1871)
- Fix ability to change password to a non-ascii one (PR #1711)
- Fix push getting stuck due to looking at the wrong view of state (PR #1820)
- Fix email address comparison to be case insensitive (PR #1827)
- Fix occasional inconsistencies of room membership (PR #1836, #1840)
- Don’t block messages sending on bumping presence (PR #1789)
- Change device_inbox stream index to include user (PR #1793)
- Optimise state resolution (PR #1818)
- Use DB cache of joined users for presence (PR #1862)
- Add an index to make membership queries faster (PR #1867)
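To give a flavour of one of the caching changes listed above ("Measure size of some caches by sum of the size of cached values", PR #1815), here is a minimal sketch of the general idea: an LRU cache whose capacity is counted by the total size of its values rather than by the number of entries. This is a toy illustration only, not Synapse's actual cache code; the `SizedLruCache` class, its `size_of` parameter and the example room and user IDs are all hypothetical names made up for the example.

```python
from collections import OrderedDict


class SizedLruCache:
    """Toy LRU cache bounded by the summed size of its values.

    Hypothetical illustration only; this is not Synapse's cache class.
    """

    def __init__(self, max_size, size_of=len):
        self.max_size = max_size      # upper bound on the summed value sizes
        self.size_of = size_of        # callback that measures one value
        self._entries = OrderedDict() # ordered oldest -> most recently used
        self.current_size = 0

    def get(self, key, default=None):
        if key not in self._entries:
            return default
        # Reading an entry marks it as the most recently used.
        self._entries.move_to_end(key)
        return self._entries[key]

    def set(self, key, value):
        # Replacing a key: stop counting the size of the old value first.
        if key in self._entries:
            self.current_size -= self.size_of(self._entries.pop(key))
        self._entries[key] = value
        self.current_size += self.size_of(value)
        # Evict least recently used entries until we are back under the limit.
        # A single value larger than max_size ends up evicting itself.
        while self.current_size > self.max_size and self._entries:
            _, evicted = self._entries.popitem(last=False)
            self.current_size -= self.size_of(evicted)


# Cache of room -> joined users, limited to 5 users in total across all rooms.
cache = SizedLruCache(max_size=5)
cache.set("room_a", ["@alice:example.org", "@bob:example.org"])
cache.set("room_b", ["@carol:example.org"])
cache.get("room_a")  # touch room_a so that room_b becomes the eviction candidate
cache.set("room_c", ["@dave:example.org", "@erin:example.org", "@frank:example.org"])
```

With a plain entry-count limit, one room with thousands of members would cost the same as a room with two. Counting the sum of the value sizes instead keeps memory use roughly proportional to the configured limit, which is presumably the motivation for the change.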