Matrix.org status update – July 2017

Hi folks,

Thought it was worth giving a quick status update on what’s going on since our last blog post, which explained the funding situation Matrix has found itself in.  The TL;DR is that we’re still here; things are moving faster than ever (not least as we refocus on getting everything needed to get Matrix funded and sustainable in the longer term), but we still need concrete support from the community (both company sponsorship and personal donations) to ensure things keep going at the current rate.

Funding Status

So, the good news is that we had a great initial response to last week’s call to help – right now we have 199 people signed up on Patreon (go on, be the 200th! you know you want to :D), ~30 on Liberapay, and 14 bitcoin donations.  This adds up to just over $2000/month – which is getting close to our initial Patreon goal of $2500/month to help support half the cost of the less senior devs working on Matrix. Endless thanks to everyone who has donated – especially the 19 folks (18 on Patreon, one on Liberapay) who have so generously pledged $50 or more a month!! Meanwhile, if you’re reading this and you haven’t pledged support yet – *please* consider heading over to Patreon or Liberapay or Bitcoin 1LxowEgsquZ3UPZ68wHf8v2MDZw82dVmAE and helping keep the project running.  Literally every dollar counts.

While Patreon & friends are headed in the right direction to support one developer, we still have another 10 people working on all the various core components of Matrix itself who need to be supported in the near future.  (We look to be safe for the next month or two, but beyond that we’re counting on having solved this problem ;).  Right now we are hoping that companies who believe in Matrix and/or are building services on top will step up to sponsor development – as it’s pretty obvious that accelerating Dendrite, final E2E, Groups etc will improve professional Matrix-based services immeasurably.  If this sounds like you, please get in touch asap.

We’re also able to provide paid consulting and development (and prioritised development) services on Matrix (through Vector, the for-profit company responsible for Riot) for large pieces of work – for instance, if you’re anxious to see enterprise-focused Matrix features land sooner rather than later, please reach out.

Exciting news is that we already have one concrete offer of paid consulting work from a very major company who happens to love Matrix, building out Integrations capabilities which should directly benefit the wider Matrix ecosystem – and we also are very proud to announce our very first official corporate sponsor (see the next section for details)!  However, we still have a long way to go, so don’t be shy about getting in touch: we need your support!

Heads up that we’ve also started our various reward schemes for supporters – folks donating more than $5 on Patreon will have already heard most of this update in the first episode of the video blog that Amandine & I posted last Friday; and folks donating more than $10 will have heard some of the other details first hand through the broadcast of the global team weekly sync on Monday!  We’re still figuring out how to get these rewards over to Liberapay & bitcoin supporters (not helped by both services being anonymous…).  We haven’t yet opened up the #matrix-supporters:matrix.org room as maintaining the access list is effectively blocked on Groups landing.  We also want to use Groups to manage the various lists of supporters around the place, so apologies that we haven’t got the lists published yet!

Finally on the funding side of things: we’re setting up the Matrix.org Foundation non-profit legal entity this week, letting us accept donations and sponsorship in a way which can directly fund the core developers (more details as we have them).  As soon as it’s incorporated, we’ll be able to sign up fully on Liberapay to accept donations there.

Announcing UpCloud: our very first official Matrix.org Corporate Sponsor!

As hinted above, we’re incredibly excited and happy that UpCloud have signed up as our first official corporate sponsor.  UpCloud has already been hosting all of Matrix.org’s infrastructure for the last 6 months (no mean feat, given the scale of the Matrix.org synapse & postgres!) – and last week they committed to extend their sponsorship further to help the project out in our time of need.

We’ve been very impressed with UpCloud’s service since migrating over back in February – particularly their spectacularly fast block IO (~500MB/s write, ~10,000 IOPS) which is incredibly useful for running a huge synapse deployment like Matrix.org’s – and they have a great footprint of datacentres around the world.

They also like Matrix so much that they’ve written this great tutorial for getting Synapse up and running on their hosts – and best of all, they have a special $25 discount for anyone in the Matrix community who wishes to use them: check out https://www.upcloud.com/matrix/ for the details!

We’d like to thank them profusely for being first in line to support us – and we look forward to seeing how far we can push their hardware over the coming months! :D

Development Status

Finally, loads and loads of stuff is happening on Matrix itself.  The main headlines are:

  • Groups.  Work in Synapse and matrix-react-sdk is happening at breakneck speed to get Groups out the door as soon as possible, so we can use them both to support the funding drive and in general to implement one of the most asked-for features of Matrix: the ability to group rooms together into a well-defined community (similar to Slack Teams or Discord Servers etc).  The way Groups work is to let users define groupings of both users and rooms; you can also define metadata for the group to let you build homepages similar to the one which Riot/Web sprouted a few months ago.  You can then refer to the group of users when inviting/banning/kicking etc – or when managing your own roomlist.  We think it’s going to completely change how people use Matrix, and can’t wait to see it land on riot.im/develop, although it’s still a few weeks away.
  • E2E Crypto.  We have three main things remaining here, after which E2E should be much much more usable for day-to-day purposes:
    1. Fixing the matrix-js-sdk to store crypto state in indexeddb rather than localstorage, to prevent multiple browser tabs racing and corrupting localstorage (which provides no locking mechanism).  This turns out to be much more of an epic than we thought, as indexeddb’s APIs are all strictly async, resulting in a whole bunch of previously synchronous APIs in matrix-js-sdk needing to become async too, as well as requiring us to switch promises library at least from Q to Bluebird.  However, most of this is now done so hopefully the new storage layer will land shortly.  https://github.com/vector-im/riot-web/issues/2325 is the bug tracking this one…
    2. Fixing the overall UX of managing devices in a room (including key shares).  https://github.com/vector-im/riot-web/issues/4522 is the bug for this one :)  Relatedly we also need to ensure invitees can decrypt messages in e2e rooms before they join (if history visibility allows it) – this is https://github.com/vector-im/riot-web/issues/3821
    3. Fixing the UX of verifying devices (including cross-signing devices), to minimise the pain in verifying device ownership. https://github.com/vector-im/riot-web/issues/2142 is the master bug for this.
  • Integrations.  A large slice of the team is working on our next-generation integration hosting platform, which is starting to look unspeakably awesome.  We’ll be yelling loudly about this once there’s something to see and play with…
  • Rich Text Editor.  This was originally a GSoC project from last year, but is finally on by default now in matrix-react-sdk – letting users author their messages with full WYSIWYG behaviour and critically have a radically improved autocompletion UI/UX, including emoji, user names, room names, etc.  You can check it out at riot.im/develop already :)
  • Mentions.  We’re finally semantically tagging references to users in messages so that they can be displayed nicely in the UI, and help with highlighting notifications!  This is due as soon as the Rich Text Editor work has finished.
  • Mobile SDKs.  The iOS & Android teams are currently on a mission to get parity between the iOS & Android SDKs and matrix-react-sdk.  This is stuff like implementing the new User Search API; Membership Event List Summaries; Dark theme(!); Translations; etc.  Progress is looking good!
  • Synapse performance.  Many many optimisations when calculating push rules when sending messages, which was taking up a substantial amount of the send path time.  Synapse develop looks to have reduced this significantly now – and as of Monday we’re running the new optimisations on Matrix.org.
  • Dendrite.  Lots of work going into implementing Invitations currently, including improvements to the overall append-only log architecture to support them nicely.
  • Riot-Static.  This is one of our GSoC projects this year, written by Michael Telatynski (t3chguy) – providing a full static (no-JS) read-only view of Matrix, suitable for dumb web browsers and search engines.  It’s looking really exciting (although needs CSS) – there’s a copy currently deployed over at https://stormy-bastion-98790.herokuapp.com/.
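
To make the localstorage race from the E2E Crypto item above a bit more concrete, here is a toy model of two tabs sharing one store. It is only a sketch and not matrix-js-sdk code: the store, the "sessions" key and the tab names are all hypothetical, and the "transactional" version simply stands in for the guarantee that a read-modify-write runs as one indivisible step.

```python
# Toy model of two browser tabs sharing one key-value store.
# Hypothetical names throughout; this is not matrix-js-sdk code.

def unlocked_update(store, tab_sessions):
    """Every tab reads the shared value first, then each writes back its own copy."""
    snapshots = [list(store["sessions"]) for _ in tab_sessions]  # all tabs read
    for snapshot, session in zip(snapshots, tab_sessions):
        snapshot.append(session)
        store["sessions"] = snapshot  # last writer wins; earlier writes are lost
    return store["sessions"]


def transactional_update(store, tab_sessions):
    """Each tab's read-modify-write completes before the next tab starts."""
    for session in tab_sessions:
        current = list(store["sessions"])
        current.append(session)
        store["sessions"] = current
    return store["sessions"]


racy = unlocked_update({"sessions": []}, ["tab-A", "tab-B"])
safe = transactional_update({"sessions": []}, ["tab-A", "tab-B"])
print(racy)  # ['tab-B']
print(safe)  # ['tab-A', 'tab-B']
```

In the unlocked case tab-A's write is silently overwritten, which is the kind of corruption a store with no locking cannot prevent.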

Meanwhile, there’s a tonne of stuff happening in the community – an excellent summary may be found at this Community Round Up blog post by uhoreg!

So: this is where things stand right now – the team is sprinting away getting all the stuff above landed, and meanwhile I’m spending most of my life worrying about funding.  We’ll try to keep blogging more regularly to give better visibility on progress on both the funding & development situation, as well as to ensure there’s a written public record as well as the regular supporter-only updates.  However, for the latest realtime updates and sneak previews and tidbits you’ll probably want to sign up on Patreon or Liberapay :D

–Matthew

A Call to Arms: Supporting Matrix!

Hi folks,

TL;DR: if you like Matrix (and especially if you’re building stuff on it), please support us via Patreon or Liberapay to keep the core team able to work on it full-time, otherwise the project is going to be seriously impacted.  And if you’re a company who is invested in Matrix (e.g. itching for Dendrite), please get in touch ASAP if you’d like to sponsor core development work from the team.  And if you’re a philanthropic billionaire who believes in our ideals of decentralisation, encryption, and open communication as a basic human right – we’d love to hear from you too O:-)

I was expecting this blog post to be the Matrix Summer Special, focusing entirely on the incredible progress and updates we’ve made in the last few months in Matrix.  However, instead I’m going to talk about something different and literally critical to Matrix’s success.

As many people know, Matrix.org development has historically been exclusively and very generously sponsored by a large multinational telecoms infrastructure company for whom most of the core team once built telco messaging apps.  However, despite the project progressing better than ever (more on that later), we have just had our funding dramatically cut by >60%.

We seem to be suffering from a darkly amusing paradox, as the rationale from our corporate overlords is essentially: “Wow! Matrix is doing great and growing well – and you seem to have all sorts of exciting people and companies using and building on it.  But we’ve been footing the whole development bill since the outset in May 2014, and this simply doesn’t feel fair.  We’re happy to keep funding though – but only if others do too!”.  In other words, in some ways we are a victim of our own success…

So we now find ourselves in the situation that despite the project looking better than ever and having a tonne of amazing stuff in the pipeline, we are suddenly missing the funding to keep the core team working on it.  And the team is quite sizeable – reflecting the ambition and size of Matrix: right now we have effectively 11 people working specifically on Matrix itself: 1 on Synapse, 1 on Dendrite, 1 on e2e crypto, 2 on matrix-react-sdk (which powers Riot/Web), 2 on matrix-ios-kit / matrix-ios-sdk, 2 on matrix-android-sdk, 1 on bridges, and me (Matthew) managing the overall project.  (This ignores folks who overlap the team who are working specifically on Riot stuff).

Over the last few years we’ve had countless people ask if they can financially support Matrix. We haven’t been able to accept it for various reasons, but now is the time for us to step towards a more independent setup, and avoid a repeat of the situation we’re currently facing by opening up to external support.

So we need help from the community to keep going!  Please head over to Patreon or Liberapay and put some money in the meter (or send some bitcoin to 1LxowEgsquZ3UPZ68wHf8v2MDZw82dVmAE). In return, you’ll get to keep Matrix evolving at a decent rate, be a member of the upcoming +supporters:matrix.org group (complete with flair badges!), and other benefits like access to #matrix-supporters:matrix.org – a new dedicated room for prioritised support, discounted goodies from Riot once paid services arrive, access to a weekly supporters-only status podcast(!), and of course receive our eternal thanks. :)

Meanwhile, if you’re a company who depends on Matrix: please get in touch. We have the option for you to sponsor core Matrix development (e.g. Dendrite) or for us to provide you with more targeted support or feature development.  We’re already talking to several organisations who want to accelerate Dendrite specifically – and the more support we have there the faster we can go.

We’d also like to thank UpCloud for sponsoring hosting for the Matrix.org synapse instances – UpCloud has been coping impressively with the massive I/O and CPU/RAM requirements we have, and we recommend them unreservedly for folks looking to run their own homeservers.

Finally, one of the longer term plans to help fund Matrix is to get sponsorship from Riot, once Riot starts offering paid services. So, if you’re an investor who’s interested in the for-profit sides of Riot (paid integrations and paid Matrix hosting) then please get in touch with the Riot team ASAP!

Moving forward we are confident that we can secure funding, through sponsorship and Riot paid services, but in truth this decision caught us by surprise and so we need help both long term and right now!

And whatever the funding situation, we’re of course always looking for contributions for code, bug reports, or just spreading the word about the project too! :)

Status Update

(or scroll to next section to see why this is bigger than “just” decentralised encrypted communication)

Despite the funding issue, the project really is going very well. Our vital stats (as seen through the lens of the matrix.org synapse) are looking like this:


And meanwhile, looking back at the last big update (Holiday Special 2016), we can compare our progress with our goals for 2017 thus far:

  • Getting E2E Encryption out of beta ASAP.

This has progressed massively – we haven’t really yelled about it yet, but latest https://riot.im/develop/ now finally implements the ability to share message keys between clients to let them decrypt older history and fix “unable to decrypt” errors (Mobile coming soon).  Meanwhile various root causes of “unable to decrypt” errors have been gradually eliminated; I can’t actually remember the last time I saw one! Once key-sharing and improved device verification UX is fully tested and tuned we should be able to declare E2E out of beta.

  • Ensuring we can scale beyond Synapse – i.e. implement Dendrite

Likewise, Dendrite is on track: we’ve implemented all the Hard Stuff which forms the skeleton of Dendrite (core federation, message signing, /sync, message sending, media repository etc) – which takes us to over 50% of Phase 1. After phase 1, we will have an initial usable release for all the core functionality.  Synapse’s performance has also improved enormously this year.

  • Getting as many bots and bridges into Matrix as possible, and doing everything we can to support them, host them and help them be as high quality as possible – making the public federated Matrix network as useful and diverse as possible.

Bridges and bots continue – from the core team we have a ‘puppeting’ Telegram bridge (matrix-appservice-tg), and from the wider community we have Discord, Skype, Signal, new Rocket.Chat and more.  Getting them polished and live is certainly an area where we need more manpower though.

  • Supporting Riot’s leap to the mainstream, ensuring Matrix has at least one killer app.

Riot has been sprouting new releases every few weeks, with a huge emphasis on improving UX:

  • an entirely new streamlined sign-up process
  • the new concept of home pages
  • a user directory search that actually works
  • internationalised to 27 languages
  • compact layout
  • loads of desktop improvements
  • piwik analytics support; etc.

There is still a lot of UX work to be done, but it’s converging fast on being a great entry point into the Matrix ecosystem, driving its growth across different groups and communities.

Meanwhile, a massive update to the iOS & Android apps just landed yesterday, switching to an entirely new UI layout to separate People from Rooms, synchronized Read Markers, and more!

  • Adding the final major missing features:
    • Customisable User Profiles (this is almost done, actually)

This is still hovering at ‘almost done’, and will be needed for some of the implementation of Groups (see below).

  • Groups (i.e. ability to define groups of users, and perform invites, powerlevels, etc. per-group as well as per-user)

Groups are in testing in Synapse too!  These will probably be the single biggest change to Matrix that we’ve seen since E2E encryption landed: it changes the dynamic of the whole network, given users can explicitly declare allegiance to different groups, which in turn have their own home pages and directories etc.  It lets users form communities, and declare their participation in those communities (if desired), and also lets rooms be grouped together.  One of our single biggest requests has been “subrooms” and we’re incredibly excited to see how well Groups solve this.

  • Threading

Sadly no progress on Threading so far this year.

  • Editable events (and Reactions)

We’re hoping to get looking at this (at last!) once Groups are done.

  • Maturing and polishing the spec (we are way overdue a new release)

You’ll have noticed that in the “how many people work on Matrix?” stats above, we didn’t mention anyone working on the Spec.  Because right now there isn’t anyone explicitly maintaining it, unfortunately; updates are done best-effort when everyone’s primary responsibilities allow it.  That said, there’s quite a lot of good stuff currently unreleased on HEAD. This is something which is obviously critical to fix once we have sustainable funding sorted again.  We can only apologise to folks like the Ruma developers who have suffered from the spec lag. :(

  • Improving VoIP – especially conferencing.

VoIP is improving lots on iOS, thanks to Denis Morozov’s GSoC project, and meanwhile we have all new conferencing powered by Jitsi on the horizon in the next few weeks too.

  • Reputation/Moderation management (i.e. spam/abuse filtering).

Lots of thinking about this (see below), but no development yet.

  • Much-needed SDK performance work on matrix-{react,ios,android}-sdk.

About 40% of the desired performance work has happened here (although not all has gone live yet).

  • …and a few other things which would be premature to mention right now. :D

All will be revealed in the next week or two – but suffice it to say that Integrations are going to be getting a Lot More Useful™. :)

Reflections

There are very very few people actually working professionally on trying to build general-purpose open communication networks and protocols.  There’s us, some XMPP, IRCv3 and GNU Social/Mastodon folks, GNU Ring, Tox, Briar, Secure Scuttlebutt, IPFS, Status.im, Ricochet… and that’s literally all the major projects I can think of (sorry if I missed you!).  There’s probably only 50 developers in total working in this domain as their day job.

Meanwhile, there are literally hundreds of thousands of folks trudging away building more and more near-indistinguishable proprietary closed communication systems – trapping users inside ever more silos and fragmenting the basic ability to communicate on the ‘net.  It’s like a world where the open web was pushed into a tiny underground resistance, and everyone else was trapped in the walled gardens of AOL and Compuserve (or more contemporarily: Facebook, Twitter, WhatsApp etc).

In other words: the whole world of decentralised communication desperately needs your support.  This is a clear case of user choice and freedom: to give users the ability to pick who they trust with their data and metadata, without being forced into unilaterally trusting the Silicon Valley megacorps.  And this, dear Reader, is your chance to fix the world for the greater good. Seriously, the Matrix team is one of a handful in the world in a position to continue to push things in the right direction and avoid us falling into a permanent dystopia where communication is even more closed and proprietary than the Public Telephone Network!

Finally, there’s an even bigger issue at stake here than open communication.  As an open network, people can literally publish whatever content they like into Matrix – same as the web or the internet itself.  As a result, there’s scope for spam; abusive/malicious content; propaganda; and generally the whole spectrum of the best and worst of humanity.  Now, if we were a centralised system like Facebook, we might hire thousands of content moderators to frantically impose a rulebook on ‘acceptable’ content.  Or we might build invisible filter-bubbles for our users based on their social graph, cocooning them from scary unfamiliar content outside their social circles and reinforcing their preconceptions (whilst the resulting self-affirmation keeps them coming back, viewing more and more ads).

But we’re decentralised, and we have no absolute moral arbiter, and nor should we – on an open network it should be up to users and users alone to define and manage their own worldview and alignment.  Plus we are not fiscally obligated to keep users coming back to view more ads no matter what.  Instead, we are forced to confront the fundamental problem: building tools which empower users to curate and visualise their own content filters; letting them filter out the stuff they’re not interested in or find repellant… while still helping them be aware of their own viewpoint and the shape of the world beyond it.  We haven’t really started building this yet, but in the long term our feeling is that these tools will literally be vital for the survival of the human race (e.g. exposing anti-climate-change propaganda for what it is or helping users opt out of World War 3) – let alone the success of decentralisation.  A world where users blindly consume propaganda is doomed, and it’s a fascinating situation that the same tools which will allow Matrix users to tune out the rooms, users and conversations they’re not interested in could be directly applied to the bigger global problem.

So: Matrix needs you. Please become a supporter on Patreon or Liberapay, and help us save the world :)

– Matthew, Amandine & the whole Matrix.org team.

Synapse 0.21.1 released!

Hi folks – we forgot to mention that Synapse 0.21.1 was released last week.  This contains an important fix to the report-stats option, which was otherwise failing to report any usage stats for folks who had the option turned on.

This is a good opportunity to note that the report-stats option is really really important for the ongoing health of the Matrix ecosystem: when raising funding to continue working on Matrix we have to be able to demonstrate how the ecosystem is growing and why it’s a good idea to support us to work on it. In practice, the data we collect is: hostname, synapse version & uptime, total_users, total_nonbridged_users, total_room_count, daily_active_users, daily_active_rooms, daily_messages and daily_sent_messages.
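
For a rough idea of what such a report looks like as data, here is a minimal sketch. The counter names come from the list above (with total_nonbridged_users written as a single identifier); the keys "hostname", "synapse_version" and "uptime_seconds", the function name and every value are assumptions made up for illustration, not Synapse’s actual wire format.

```python
import json

# Counter names are taken from the list in this post. The identity keys,
# the function name and all values are hypothetical.
ALLOWED_COUNTERS = {
    "total_users", "total_nonbridged_users", "total_room_count",
    "daily_active_users", "daily_active_rooms",
    "daily_messages", "daily_sent_messages",
}


def build_stats_report(hostname, version, uptime_seconds, counters):
    """Combine server identity with aggregate counters into a JSON-ready dict."""
    unexpected = set(counters) - ALLOWED_COUNTERS
    if unexpected:
        # Only aggregate counters are allowed through.
        raise ValueError(f"unexpected fields: {sorted(unexpected)}")
    report = {
        "hostname": hostname,
        "synapse_version": version,
        "uptime_seconds": uptime_seconds,
    }
    report.update(counters)
    return report


example = build_stats_report(
    "example.org", "0.21.1", 86400,
    {"total_users": 12, "daily_active_users": 5, "daily_messages": 340},
)
print(json.dumps(example, sort_keys=True))
```

The allow-list is just one way to keep the report limited to aggregate counts like the ones listed above.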

Folks: if you have turned off report-stats for whatever reason, please consider upgrading to 0.21.1 and turning it back on.

In return, the plan is that we’ll start to publish an official Grafana of the anonymised aggregate stats, probably embedded into the frontpage of Matrix.org, and then you and everyone else can have a view of the state of the Matrix universe. And critically, it’ll really help us continue to justify $ to spend on growing the project!

You can get Synapse 0.21.1 from https://github.com/matrix-org/synapse or https://github.com/matrix-org/synapse/releases/tag/v0.21.1 as normal.

Use you a Matrix for Great Good!

Hi all,

We’re currently looking into different ways that Matrix is being used in the wild, and an important question that has come up is whether anyone is using Matrix yet for decentralised communication in parts of the world where centralised communication poses a problem – due to bad connectivity or privacy concerns.  Similarly we’d love to hear from anyone who is seriously trialling Matrix’s end-to-end encryption for use in geographies where privacy is a particularly big issue for human rights.

So, if anyone has stories (anecdotal or otherwise) about how they’re using or planning to use Matrix to make the world a better place, in a location where that’s particularly critical, please can you let us know as soon as possible (@matthew:matrix.org or @Amandine:matrix.org).  This is fairly urgent because we’re currently looking at various options for how to prioritise effort and funding for Matrix, and if there are people out there who are depending on Matrix in this manner it would significantly help us support them!

thanks,

Matthew, Amandine & the team.

Update on Matrix.org homeserver reliability

Hi folks,

We’ve had a few outages over the last week on the Matrix.org homeserver which have caused problems for folks using bridges or accounts hosted on matrix.org itself – we’d like to apologise to everyone who’s been caught in the crossfire.  In the interests of giving everyone visibility on what’s going on and what we’re doing about it (and so folks can learn from our mistakes! :), here’s a quick writeup (all times are UTC):

  • 2017-05-04 21:05: The datacenter where we host matrix.org performs an emergency unscheduled upgrade of the VM host where the main matrix.org HS & DB master lives.  This means a live-migration of the VM onto another host, which freezes the (huge) VM for 9 minutes, during which service is (obviously) down.  Monitoring fires; we start investigating and try to get in on the console, but by the point we’re considering failing over to the hot-spare, the box has come back and recovers fine other than a load spike as all the traffic catches up.  The clock however is off by 9 minutes due to its world having paused.
  • 2017-05-04 22:30: We step NTP on the host to fix the clock (maximum slew rate on ISC ntpd is 500ppm, meaning it would take weeks to reconverge naturally, during which time we’re issuing messages with incorrect timestamps).
  • 2017-05-05 01:25: Network connectivity breaks between the matrix.org homeserver and the DC where all of our bridges/bots are hosted.
  • 2017-05-05 01:40: Monitoring alerts fire for bridge traffic activity and are picked up.  After trying to restart the VPN between the DCs a few times, it turns out that the IP routes needed for the VPN have mysteriously disappeared.
  • 2017-05-05 02:23: Routes are manually re-added and VPN recovers and traffic starts flowing again.  It turns out that the routes are meant to be maintained by a post-up script in /etc/network/interfaces, which assumes that /sbin/ip is on the path.  Historically this hasn’t been a problem as the DHCP lease on the host has never expired (only been renewed every 6 hours) – but the time disruption caused by the live-migration earlier means that on this renewal cycle the lease actually expires and the routes are lost and not re-added.  Basic bridging traffic checks are done (e.g. Freenode->Matrix).
  • 2017-05-05 08:30: Turns out that whilst Freenode->Matrix traffic was working, Matrix->Freenode was wedged due to a missing HTTP timeout in the AS logic on Synapse.  Synapse is restarted and the bug fixed.
  • …the week goes by…
  • 2017-05-11 18:00: (Unconnected to the rest of this outage, an IRC DDoS on GIMPnet causes intermittent load problems and delayed messages on matrix.org; we turn off the bridge for a few hours until it subsides).
  • 2017-05-12 02:50: The postgres partition on the matrix.org DB master diskfills and postgres halts.  Monitoring alerts fire (once, phone alerts), but the three folks on call manage to sleep through their phone ringing.
  • 2017-05-12 04:45: Folks get woken up and notice the outage; clear up diskspace; restart postgres. Meanwhile, synapse appears to recover, other than mysteriously refusing to send messages from local users.  Investigation commences in the guts of the database to try to see what corruption has happened.
  • 2017-05-12 06:00: We realise that nobody has actually restarted synapse since the DB outage began, and the failure is probably a known issue where worker replication can fail and cause the master synapse process to fail to process writes.  Synapse is restarted; everything recovers (including bridges).
  • 2017-05-12 06:20: Investigation into the cause of the diskfill reveals it to be due to postgres replication logs (WALs) stacking up on the DB master, due to replication having broken to a DB slave during the previous networking outage.  Monitoring alerts triggered but weren’t escalated due to a problem in PagerDuty.
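
As a back-of-the-envelope check on the NTP step in the timeline above: a 9-minute offset corrected at no more than 500ppm needs about 12.5 days to drain. This sketch assumes the correction runs continuously at the full rate, which is the best case; the function name is ours, not part of any NTP tooling.

```python
def slew_seconds(offset_seconds, max_slew_ppm):
    """Wall-clock seconds needed to absorb an offset when the clock may only
    be adjusted by max_slew_ppm microseconds per elapsed second."""
    return offset_seconds * 1_000_000 / max_slew_ppm


offset = 9 * 60                      # the 9-minute freeze from the live-migration
needed = slew_seconds(offset, 500)   # 500ppm limit quoted in the timeline
print(needed)                        # 1080000.0 seconds
print(needed / 86_400)               # 12.5 days
```

That is nearly two weeks of wrong timestamps in the best case, which is why stepping the clock was the pragmatic choice.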

Lessons learned:

  • Test your networking scripts and always check your box self-recovers after a restart (let alone a DHCP renewal).
  • Don’t use DHCP in production datacenters unless you really have no choice; it just adds potential ways for everything to break.
  • We need better end-to-end monitoring for bridged traffic.
  • We need to ensure HS<->Bridge traffic is more reliable (improved by fixing timeout logic in synapse).
  • We need better monitoring and alerting of DB replication traffic.
  • We need to escalate PagerDuty phone alerts more aggressively (done).
  • We need better alerting for disk fill thresholds (especially “time until fill”, remembering to take into account the emergency headroom reserved by the filesystem for the superuser).
  • We should probably have scripts to rapidly (or even automatically) switch between synapse master & hot-spare, and to promote DB slaves in the event of a master failure.
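
To illustrate the “time until fill” point from the list above, here is a small sketch of the arithmetic. All of the numbers are invented, and the 5% reserve is an assumption (a common default on ext filesystems). The key detail is that a non-root process such as postgres runs out of space when the unreserved portion is gone, not when the disk is physically full.

```python
import os


def free_bytes_for_unprivileged(path):
    """Bytes an ordinary (non-root) process can still write under path.
    Unix only: f_bavail already excludes the superuser's reserved blocks."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def seconds_until_full(free_bytes, fill_rate_bytes_per_s):
    """Estimated time left, or None if the disk is not currently filling."""
    if fill_rate_bytes_per_s <= 0:
        return None
    return free_bytes / fill_rate_bytes_per_s


# Hypothetical partition: 1000 GB total, 900 GB used, 5% (50 GB) reserved.
GB = 10**9
naive_free = (1000 - 900) * GB            # ignores the reserved headroom
usable_free = (1000 - 900 - 50) * GB      # what a non-root process really has
rate = 10 * 10**6                         # WALs piling up at 10 MB/s (made up)

print(seconds_until_full(naive_free, rate))   # 10000.0
print(seconds_until_full(usable_free, rate))  # 5000.0
```

In this made-up case an alert based on raw free space would overestimate the remaining time by a factor of two.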

Hopefully this is the last we’ve seen of this root cause; we’ll be working through the todo list above.  Many apologies again for the instability – however please do remember that you can (and should!) run your own homeserver & bridges to stay immune to whatever operational dramas we have with the matrix.org instance!
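
For anyone curious what the missing-timeout failure mode from the timeline looks like in miniature, here is a sketch using Python’s asyncio. Synapse itself is built on Twisted, so this is not its actual fix; the function names are hypothetical and the “request” is just a coroutine that never answers.

```python
import asyncio


async def wedged_request():
    """Stand-in for an HTTP call to a peer that never answers."""
    await asyncio.Event().wait()  # this event is never set, so it blocks forever


async def quick_request():
    """Stand-in for a healthy peer."""
    return "ok"


async def send_with_timeout(request, timeout_s):
    """Return the response, or None if the peer does not answer in time."""
    try:
        return await asyncio.wait_for(request(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None  # caller can retry or back off instead of hanging


wedged = asyncio.run(send_with_timeout(wedged_request, 0.05))
healthy = asyncio.run(send_with_timeout(quick_request, 0.05))
print(wedged)   # None
print(healthy)  # ok
```

Without the timeout, the first call would never return and everything queued behind it would stay wedged, much like the Matrix->Freenode traffic did.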