As you may know, for the last few months anoa (Andrew) and APWhiteHat have been working on Dendrite, the next generation Matrix homeserver, written in Go. We asked for an update on their progress, and Andrew provided the blog post below. Serious progress has been made on Dendrite this summer!
Hey everyone, my name is Andrew Morgan and I've been working full-time over the summer on Dendrite, our next-generation Matrix homeserver. Over the last two months, I've seen the project transform from a somewhat functioning toy server to a near-production-ready homeserver that is working towards complete feature support. I've appreciated the thought put into the project since day one, and enjoy the elegance of the multi-component design. Documentation is fairly decent at the moment, and comments are plentiful throughout the codebase, while the code itself tends towards simple and maintainable rather than complex and unmanageable.
In Dendrite, we are taking this one step further by introducing OpenTracing, a language- and platform-agnostic framework for tracking the journey of an endpoint call from incoming request to outgoing response, with every method, hierarchy change and database call in between. It will be immensely useful in tracking down performance issues, as well as providing insight into the most critical paths throughout the codebase and where we should focus most of our optimization efforts. It also comes with a lovely dashboard courtesy of Jaeger.
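To give a feel for what tracing a request through nested calls means, here is a minimal sketch in Python. It is a toy built only on the standard library, not the OpenTracing API and not Dendrite's code; the Tracer class, the span names and the handle_request function are all made up for illustration.

```python
import time
from contextlib import contextmanager


class Tracer:
    """Toy tracer (hypothetical): records nested spans and their durations."""

    def __init__(self):
        self.finished = []  # (nesting depth, span name, seconds), in completion order
        self._stack = []    # names of the spans that are currently open

    @contextmanager
    def span(self, name):
        depth = len(self._stack)
        self._stack.append(name)
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self._stack.pop()
            self.finished.append((depth, name, elapsed))


def handle_request(tracer):
    # Hypothetical endpoint: one parent span wrapping two child spans.
    with tracer.span("handle_sync_request"):
        with tracer.span("check_auth"):
            pass
        with tracer.span("db_query"):
            time.sleep(0.01)  # stand-in for a database call


if __name__ == "__main__":
    tracer = Tracer()
    handle_request(tracer)
    for depth, name, seconds in tracer.finished:
        print("  " * depth + f"{name}: {seconds * 1000:.1f} ms")
```

Child spans finish before their parent, so the parent's duration includes theirs. A real tracing backend collects the same kind of parent/child timing data and renders it as a timeline.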
We've also started to see some people running Dendrite in live environments, which is incredibly exciting for us to see! While Dendrite is not considered production-ready yet (though it moves closer every day), if you are interested in giving it a go please consult the quickstart installation guide. We look forward to any feedback you may have!
We've just released Synapse 0.33.0! This is a major performance upgrade which speeds up /sync (i.e. receiving messages) by a factor of almost 2! This has already made a massive difference to the CPU usage and snappiness of the matrix.org homeserver since we rolled it out a few days ago - you can see the drop in sync worker CPU just before midday on July 17th; previously we were regularly hitting the CPU ceiling (at which point everything grinds to a halt) - now we're back down hovering between 40% and 60% CPU (at the current load). This is actually fixing a bug which crept in around Synapse 0.31, so please upgrade - especially if Synapse has been feeling slower than usual recently, and especially if you are still on Synapse 0.31.
Meanwhile we have a lot of new stuff coming on the horizon - a whole new algorithm for state resolution (watch this space for details); incremental state resolution (at last!) to massively speed up state resolution and mitigate extremities build up (and speed up the synapse master process, which is now the bottleneck again on the matrix.org homeserver); better admin tools for managing resource usage, and all the Python3 porting work (with associated speedups and RAM & GC improvements). Fun times ahead!
The full changelog follows below; as always you can grab Synapse from https://github.com/matrix-org/synapse. Thanks for flying Matrix!
On Monday (2018-06-11) we had an incident where #matrix:matrix.org was hijacked by a malicious user pretending to join the room immediately after its creation in 2014 and then setting an m.room.power_levels event 'before' the correct initial power_levels event for the room.
Under normal circumstances this should be impossible because the initial m.room.power_levels for a room should be set before its m.room.join_rules event, meaning users who join the room are subject to its power levels. However, back before we'd even released Synapse, the first two rooms ever created in Matrix (#test:matrix.org and #matrix:matrix.org) were manually created and set the join_rules before the power_levels event, letting users join before the room's power_levels were defined, and so were vulnerable to this attack. We've since re-created #matrix:matrix.org - please re-/join the room if you haven't already!
As a defensive measure, we are releasing a security update of Synapse (0.31.2) today which changes the rules used to authenticate power_level events, such that we fail-safe rather than fail-deadly if the existing auth mechanisms fail. In practice this means changing the default power level required to set state to be 50 rather than 0 if there is no power_levels event present, thus meaning that only the room creator can set the initial power_levels event.
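The rule change can be sketched roughly as follows. This is an illustration only, not Synapse's actual auth code: the function names and user IDs are made up, and it assumes that when no power_levels event exists the room creator is treated as level 100 and everyone else as level 0.

```python
CREATOR_DEFAULT_LEVEL = 100  # assumption: creator's level when no power_levels event exists


def user_level(user_id, creator_id, power_levels):
    """Power level of a user; power_levels is the event content, or None if absent."""
    if power_levels is None:
        return CREATOR_DEFAULT_LEVEL if user_id == creator_id else 0
    users = power_levels.get("users", {})
    return users.get(user_id, power_levels.get("users_default", 0))


def required_state_level(power_levels, fail_safe=True):
    """Level needed to send a state event."""
    if power_levels is None:
        # Old behaviour: 0 (anyone may set state). New behaviour: 50.
        return 50 if fail_safe else 0
    return power_levels.get("state_default", 50)


def may_send_state(user_id, creator_id, power_levels, fail_safe=True):
    needed = required_state_level(power_levels, fail_safe)
    return user_level(user_id, creator_id, power_levels) >= needed


if __name__ == "__main__":
    creator = "@creator:example.org"   # hypothetical user IDs
    attacker = "@mallory:example.org"
    # No power_levels event in the room yet:
    print(may_send_state(attacker, creator, None, fail_safe=False))
    print(may_send_state(attacker, creator, None, fail_safe=True))
    print(may_send_state(creator, creator, None, fail_safe=True))
```

Under the old default, a joined user at level 0 meets the required level of 0 and so could set the first power_levels event. With the default raised to 50, only the creator clears the bar.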
We are not aware of anyone abusing this (other than the old #matrix:matrix.org room) but we'd rather be safe than sorry, so would recommend that everyone upgrade as soon as possible.
This of course constitutes a change to the spec, so full technical details and ongoing discussion around the Matrix Spec Change proposal can be followed over at MSC1304. EDIT: if you are aware of your server participating in rooms whose first power_levels event is deliberately set by a different user to their creator, please let us know asap (and don't upgrade!)
This work is all part of a general push to finalise, harden and fully specify the Server-Server API as we push towards a long-awaited stable release of Matrix!
As always, you can get the new update from https://github.com/matrix-org/synapse/releases/tag/v0.31.2 or from any of the sources mentioned at https://github.com/matrix-org/synapse.
Thanks, and apologies for the inconvenience.
m.room.power_levels event in force in the room. (PR #3397)
Discussion around the Matrix Spec change proposal for this change can be followed at https://github.com/matrix-org/matrix-doc/issues/1304.
We've been able to start investing more time in advancing the Matrix Specification itself over the last month or so thanks to Ben joining the core team (and should be able to accelerate even more with uhoreg joining in a few weeks!) The first step in the new wave of work has been to provide much better infrastructure for the process of actually evolving the spec - whether that's from changes proposed by the core team or the wider Matrix Community.
So, without further ado, we'd like to introduce https://matrix.org/docs/spec/proposals - a dashboard for all the spec change proposals we've accumulated so far (ignoring most of the ones which have already been merged), as well as a clearer workflow for how everyone can help improve the Matrix spec itself. Part of this is introducing a formal numbering system - e.g. MSC1228 stands for Matrix Spec Change 1228 (where 1228 is the ID of the GitHub issue in the github.com/matrix-org/matrix-doc repository that tracks the proposal).
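Since an MSC number is simply the issue ID, mapping one to the other is mechanical. A minimal sketch (the helper name msc_issue_url is made up for illustration):

```python
import re

ISSUE_BASE = "https://github.com/matrix-org/matrix-doc/issues/"


def msc_issue_url(msc_id):
    """Turn an identifier such as 'MSC1228' into the URL of its tracking issue."""
    match = re.fullmatch(r"MSC(\d+)", msc_id.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not an MSC identifier: {msc_id!r}")
    return ISSUE_BASE + match.group(1)


if __name__ == "__main__":
    print(msc_issue_url("MSC1228"))
```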
Please note that these are NOT like XEPs or RFCs - i.e. optional proposals or add-ons to the protocol; instead they are literally proposals for changes to the Matrix Spec itself. Once merged into the spec, they are only of historical interest.
We've also created a new room: #matrix-spec:matrix.org to discuss specific spec proposal changes - please join if you want to track how proposals are evolving! (Conversation is likely to fork off into per-proposal rooms or overflow into #matrix-dev:matrix.org or #matrix-architecture:matrix.org depending on traffic levels, however).
Feedback would be much appreciated on this - so please head over to #matrix-spec:matrix.org and let us know how it feels and how it could do better.
This is also a major step towards properly formalising Matrix.org's governance model - hopefully the changes above are sufficient to improve the health of the evolution of the Spec as we work towards an initial stable release later this year, and then you should expect to see a spec proposal for formal governance once we've (at last!) exited beta :)
Huge thanks to Ben for putting this together, and thanks to everyone who's contributed so far to the spec - we're looking forward to working through the backlog of proposals and turning them all into merged spec PRs!!
Many people will have noticed disruption in #matrix:matrix.org and #matrix-dev:matrix.org on Sunday, when a validation bug in Synapse was exploited which allowed a malicious event to be inserted into the room with a 'depth' value that made the rooms temporarily unusable. Whilst a transient workaround was found at the time (thanks to /dev/ponies, kythyria and Po Shamil for the workaround and to Half-Shot for working on a proposed fix), we're doing an urgent release of Synapse 0.28.1 to provide a temporary solution which will mitigate the attack across all rooms in upgraded servers and un-break affected ones. Meanwhile we have a full long-term fix on the horizon (hopefully) later this week. This vulnerability has already been exploited in the wild; please upgrade as soon as possible.
Synapse 0.28.1 is available from https://github.com/matrix-org/synapse/releases/tag/v0.28.1 as normal.
The 'depth' parameter is used primarily as a way for servers to signal the intended cosmetic order of their events within a room (particularly when the room's message graph has gaps in it due to the server being offline, or due to users backfilling old disconnected chunks of conversation). This means that affected rooms may experience message ordering problems until a full long-term fix is provided, which we're working on currently (and tentatively involves no longer trusting 'depth' information from servers). For full details you can see the proposal documents for the temporary fix in 0.28.1 and the options for the imminent long-term fix.
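To see why trusting a server-supplied 'depth' is fragile, consider this small sketch. It does not reproduce the actual exploit or the fix in 0.28.1; the event dictionaries and the cosmetic_order helper are made up, and it assumes nothing more than "events are displayed in order of their claimed depth".

```python
def cosmetic_order(events):
    """Order event IDs by claimed depth, breaking ties by arrival order."""
    indexed = sorted(enumerate(events), key=lambda pair: (pair[1]["depth"], pair[0]))
    return [event["id"] for _, event in indexed]


if __name__ == "__main__":
    # Events listed in the order they arrived (all values hypothetical).
    events = [
        {"id": "A", "depth": 1},
        {"id": "B", "depth": 2},
        {"id": "X", "depth": 10**18},  # a server claims an absurdly large depth
        {"id": "C", "depth": 3},       # honest event sent after X
    ]
    print(cosmetic_order(events))
```

The event with the inflated depth sorts after everything else, even though an honest event arrived later. A single server can therefore distort the ordering for everyone that trusts the value.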
We'd like to acknowledge jzk for identifying the vulnerability, and Max Dor for providing feedback on the fixes.
As a general reminder, Synapse is still beta (as is the Matrix spec) and the federation API particularly is still being debugged and refined and is pre-r0.0.0. For the benefit of the whole community, please disclose vulnerabilities and exploits responsibly by emailing [email protected] or DMing someone from +matrix:matrix.org. Thanks.
Heads up that we made an emergency release of Riot/Web 0.13.5 a few hours ago to fix an XSS vulnerability found and reported by walle303 - many thanks for disclosing it responsibly. Please upgrade to Riot/Web 0.13.5 asap. If you're using riot.im/app or riot.im/develop this simply means hitting Refresh; otherwise please upgrade your Riot deployment as soon as possible. Alpine, Debian and Fedora/RPM packages are already updated - huge thanks to the maintainers for the fast turnaround.
The issue lies in the relatively obscure external_url feature, which lets bridges specify a URL for bridged events, letting Riot/Web users link through to the 'original' event (e.g. a twitter URL on a bridged tweet). The option is hidden in a context menu and labelled "Source URL", and is only visible on events which have the external_url field set. Unfortunately Riot/Web didn't sanitise the URL correctly, allowing a malicious URL to be injected - and this has been the case since the feature landed in Riot 0.9.0 (Nov 2016).
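As a general illustration of this class of bug, one common defence is to accept only URLs with an explicitly allowed scheme before turning them into links. The sketch below is in Python and is not Riot's actual fix (Riot/Web is JavaScript); the function name and the choice of allowed schemes are assumptions made for the example.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumption: only plain web links are wanted


def safe_external_url(url):
    """Return the URL if its scheme is on the allow-list, otherwise None."""
    if not isinstance(url, str):
        return None
    candidate = url.strip()
    try:
        scheme = urlsplit(candidate).scheme.lower()
    except ValueError:
        return None  # malformed URL
    return candidate if scheme in ALLOWED_SCHEMES else None


if __name__ == "__main__":
    print(safe_external_url("https://twitter.com/matrixdotorg/status/1"))
    print(safe_external_url("javascript:alert(1)"))
```

An allow-list is generally safer than trying to block known-bad schemes, since anything unexpected is rejected by default.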
If you're not able to upgrade to Riot/Web 0.13.5 for some reason, then please do not use the 'Source URL' option in the event context menu.
Apologies for the inconvenience,