Tech


Synapse 0.26 released!

2018-01-05 — Tech — Matthew Hodgson

Hi folks,

Synapse 0.26 is here (with no changes since RC1 which we released just before Christmas). It's a general maintenance release, albeit with a few new features but mainly lots of bugfixes and general refinements. Enjoy!

As always, you can get it from https://github.com/matrix-org/synapse/releases/tag/v0.26.0.

Changes in synapse v0.26.0 (2018-01-05)

No changes since v0.26.0-rc1

Changes in synapse v0.26.0-rc1 (2017-12-13)

Features:

Add ability for ASes to publicise groups for their users (PR #2686)
Add all local users to the user_directory and optionally search them (PR #2723)
Add support for custom login types for validating users (PR #2729)

Changes:

Update example Prometheus config to new format (PR #2648) Thanks to @krombel!
Rename redact_content option to include_content in Push API (PR #2650)
Declare support for r0.3.0 (PR #2677)
Improve upserts (PR #2684, #2688, #2689, #2713)
Improve documentation of workers (PR #2700)
Improve tracebacks on exceptions (PR #2705)
Allow guest access to group APIs for reading (PR #2715)
Support for posting content in federation_client script (PR #2716)
Delete devices and pushers on logouts etc (PR #2722)

Bug fixes:

Fix database port script (PR #2673)
Fix internal server error on login with ldap_auth_provider (PR #2678) Thanks to @jkolo!
Fix error on sqlite 3.7 (PR #2697)
Fix OPTIONS on preview_url (PR #2707)
Fix error handling on dns lookup (PR #2711)
Fix wrong avatars when inviting multiple users when creating room (PR #2717)
Fix 500 when joining matrix-dev (PR #2719)

Synapse 0.25 is out... as is Matrix Specification 0.3(!!!)

2017-11-15 — Tech — Matthew Hodgson

Hi all,

Today is a crazy release day here - not only do we have Synapse 0.25, but we've also made a formal release of the Matrix Specification (CS API) for the first time in 16 months!

Matrix CS API 0.3

Talking first about the spec update: the workflow of the Matrix spec is that new experimental features get added to an /unstable API prefix, and then whenever we release the Matrix spec, these get moved over to being part of the /r0 prefix (or whatever version we happen to be on). We've been very constrained on manpower to work on the spec over the last ~18 months, but we've been keeping it up-to-date on a best effort basis, with a bit of help from the wider community. As such, this latest release does not contain all the latest APIs (and certainly not experimental ones like Groups/Communities which are still evolving), but it does release all of the unstable ones which we've managed to document and which are considered stable enough to become part of the 'r0' prefix. Going forwards, we're hoping that the wider community will help us fill in the remaining gaps (i.e. propose PRs against the matrix-org/matrix-doc repository to formalise the various spec drafts flying around the place) - and we're also hoping (if/when funding crisis is abated) to locate full-time folk to work on the spec.

The full changelog for 0.3 of the spec is:

Breaking changes:
- Change the rule kind of .m.rule.contains_display_name from underride to override. This works with all known clients which support push rules, but any other clients implementing the push rules API should be aware of this change. This makes it simple to mute rooms correctly in the API (#373).
- Remove /tokenrefresh from the API (#395).
- Remove requirement that tokens used in token-based login be macaroons (#395).
Changes to the API which will be backwards-compatible for clients:
- Add filename parameter to POST /_matrix/media/r0/upload (#364).
- Document CAS-based client login and the use of m.login.token in /login (#367).
- Make origin_server_ts a mandatory field of room events (#379).
- Add top-level account_data key to the responses to GET /sync and GET /initialSync (#380).
- Add is_direct flag to POST /createRoom and invite member event. Add 'Direct Messaging' module (#389).
- Add contains_url option to RoomEventFilter (#390).
- Add filter optional query param to /messages (#390).
- Add 'Send-to-Device messaging' module (#386).
- Add 'Device management' module (#402).
- Require that User-Interactive auth fallback pages call window.postMessage to notify apps of completion (#398).
- Add pagination and filter support to /publicRooms. Change response to omit fields rather than return null. Add estimate of total number of rooms in list. (#388).
- Allow guest accounts to use a number of endpoints which are required for end-to-end encryption. (#751).
- Add key distribution APIs, for use with end-to-end encryption. (#894).
- Add m.room.pinned_events state event for rooms. (#1007).
- Add mention of ability to send Access Token via an Authorization Header.
- New endpoints:
  - GET /joined_rooms (#999).
  - GET /rooms/{roomId}/joined_members (#999).
  - GET /account/whoami (#1063).
  - GET /media/{version}/preview_url (#1064).
Spec clarifications:
- Add endpoints and logic for invites and third-party invites to the federation spec and update the JSON of the request sent by the identity server upon 3PID binding (#997)
- Fix "membership" property on third-party invite upgrade example (#995)
- Fix response format and 404 example for room alias lookup (#960)
- Fix examples of m.room.member event and room state change, and added a clarification on the membership event sent upon profile update (#950).
- Spell out the way that state is handled by POST /createRoom (#362).
- Clarify the fields which are applicable to different types of push rule (#365).
- A number of clarifications to authentication (#371).
- Correct references to user_id which should have been sender (#376).
- Correct inconsistent specification of redacted_because fields and their values (#378).
- Mark required fields in response objects as such (#394).
- Make m.notice description a bit harder in its phrasing to try to dissuade the same issues that occurred with IRC (#750).
- GET /user/{userId}/filter/{filterId} requires authentication (#1003).
- Add some clarifying notes on the behaviour of rooms with no m.room.power_levels event (#1026).
- Clarify the relationship between username and user_id in the /register API (#1032).
- Clarify rate limiting and security for content repository. (#1064).

...and you can read the spec itself of course over at https://matrix.org/docs/spec. It's worth noting that we have slightly bent the rules by including three very minor 'breaking changes' in 0.3, but all for features which to our knowledge nobody is depending on in the wild. Technically this should mean bumping the major version prefix (i.e. moving to r1), but given how minor and nonimpacting these are we're turning a blind eye this time.
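Two of the additions in the changelog above (sending the access token in an Authorization header, and the new GET /account/whoami endpoint) fit together naturally. Here's a minimal sketch of what such a request could look like from Python's standard library. The homeserver URL and token are placeholders, the helper name is ours, and the request is only built, not sent:

```python
import urllib.request


def build_whoami_request(homeserver_url: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET /account/whoami request.

    The access token travels in an Authorization header rather than
    in an access_token query parameter.
    """
    url = homeserver_url.rstrip("/") + "/_matrix/client/r0/account/whoami"
    return urllib.request.Request(
        url,
        headers={"Authorization": "Bearer " + access_token},
        method="GET",
    )


# Placeholder values: substitute your own homeserver and token.
req = build_whoami_request("https://matrix.example.org", "MY_ACCESS_TOKEN")
print(req.get_method(), req.full_url)
```

Passing the resulting object to urllib.request.urlopen() would actually perform the call against a real homeserver.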

Meanwhile, Synapse 0.25 is out!

This is a medium-sized release; the main thing being to support configurable room visibility within groups (so that whenever you add a room to a group, you're not forced into sharing its existence with the general public, but can choose to just tell group members about it). There's also a bunch of useful bug fixes and some performance improvements, including lots of contributions from the community this release (thank you!). Full release notes are:

Changes in synapse v0.25.0 (2017-11-15)

Bug fixes:

Fix port script (PR #2673)

Changes in synapse v0.25.0-rc1 (2017-11-14)

Features:

Add is_public to groups table to allow for private groups (PR #2582)
Add a route for determining who you are (PR #2668) Thanks to @turt2live!
Add more features to the password providers (PR #2608, #2610, #2620, #2622, #2623, #2624, #2626, #2628, #2629)
Add a hook for custom rest endpoints (PR #2627)
Add API to update group room visibility (PR #2651)

Changes:

Ignore tags when generating URL preview descriptions (PR #2576) Thanks to @maximevaillancourt!
Register some /unstable endpoints in /r0 as well (PR #2579) Thanks to @krombel!
Support /keys/upload on /r0 as well as /unstable (PR #2585)
Front-end proxy: pass through auth header (PR #2586)
Allow ASes to deactivate their own users (PR #2589)
Remove refresh tokens (PR #2613)
Automatically set default displayname on register (PR #2617)
Log login requests (PR #2618)
Always return is_public in the /groups/:group_id/rooms API (PR #2630)
Avoid no-op media deletes (PR #2637) Thanks to @spantaleev!
Fix various embarrassing typos around user_directory and add some doc. (PR #2643)
Return whether a user is an admin within a group (PR #2647)
Namespace visibility options for groups (PR #2657)
Downcase UserIDs on registration (PR #2662)
Cache failures when fetching URL previews (PR #2669)

Bug fixes:

Fix port script (PR #2577)
Fix error when running synapse with no logfile (PR #2581)
Fix UI auth when deleting devices (PR #2591)
Fix typo when checking if user is invited to group (PR #2599)
Fix the port script to drop NUL values in all tables (PR #2611)
Fix appservices being backlogged and not receiving new events due to a bug in notify_interested_services (PR #2631) Thanks to @xyzz!
Fix updating rooms avatar/display name when modified by admin (PR #2636) Thanks to @farialima!
Fix bug in state group storage (PR #2649)
Fix 500 on invalid utf-8 in request (PR #2663)

Finally...

If you haven't noticed already, Riot/Web 0.13 is out today, as is Riot/iOS 0.6.2 and Riot/Android 0.7.4. These contain massive improvements across the board - particularly mainstream Communities support at last on Riot/Web; CallKit/PushKit on Riot/iOS thanks to Denis Morozov (GSoC 2017 student for Matrix) and Share Extension on iOS thanks to Aram Sargsyan (also GSoC 2017 student!); and End-to-end Key Sharing on Riot/Android and a full rewrite of the VoIP calling subsystem on Android.

Rather than going on about it here, though, there's a full write-up over on the Riot Blog.

And so there you go - new releases for eeeeeeeeveryone! Enjoy! :)

--Matthew, Amandine & the team.

Synapse 0.22.0 released!

2017-07-06 — Tech — Thomas Lant

Hi Synapsefans,

Synapse 0.22.0 has just been released! This release lands a few interesting features:

The new User directory API which supports Matrix clients' providing a much more intuitive and effective user search capability by exposing a list of:
- Everybody your user shares a room with, and
- Everybody in a public room your homeserver knows about
New support for server admins, including a Shutdown Room API (to remove a room from a local server) and a Media Quarantine API (to render a media item inaccessible without its actually being deleted)

As always there are lots of bug fixes and performance improvements, including increasing the default cache factor size from 0.1 to 0.5 (should improve performance for those running their own homeservers).
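To give a feel for what the cache factor change means in practice, here's a rough sketch. It assumes the factor acts as a simple multiplier on each cache's base size; the base size below is a made-up number for illustration and not a real Synapse default:

```python
def effective_cache_size(base_size: int, cache_factor: float) -> int:
    """Scale a cache's base size by the cache factor (assumed to be a plain multiplier)."""
    return int(base_size * cache_factor)


# Hypothetical base size of 10,000 entries, purely for illustration.
base = 10_000
for factor in (0.1, 0.5):
    print(f"cache factor {factor}: {effective_cache_size(base, factor)} entries")
```

Under that assumption, the new default keeps five times as many entries per cache, trading extra memory for fewer database hits.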

You can get Synapse 0.22.0 from https://github.com/matrix-org/synapse or https://github.com/matrix-org/synapse/releases/tag/v0.22.0 as normal.

Changes in synapse v0.22.0 (2017-07-06)

No changes since v0.22.0-rc2

Changes in synapse v0.22.0-rc2 (2017-07-04)

Changes:

Improve performance of storing user IPs (PR #2307, #2308)
Slightly improve performance of verifying access tokens (PR #2320)
Slightly improve performance of event persistence (PR #2321)
Increase default cache factor size from 0.1 to 0.5 (PR #2330)

Bug fixes:

Fix bug with storing registration sessions that caused frequent CPU churn (PR #2319)

Changes in synapse v0.22.0-rc1 (2017-06-26)

Features:

Add a user directory API (PR #2252, and many more)
Add shutdown room API to remove room from local server (PR #2291)
Add API to quarantine media (PR #2292)
Add new config option to not send event contents to push servers (PR #2301) Thanks to @cjdelisle!

Changes:

Various performance fixes (PR #2177, #2233, #2230, #2238, #2248, #2256, #2274)
Deduplicate sync filters (PR #2219) Thanks to @krombel!
Correct a typo in UPGRADE.rst (PR #2231) Thanks to @aaronraimist!
Add count of one time keys to sync stream (PR #2237)
Only store event_auth for state events (PR #2247)
Store URL cache preview downloads separately (PR #2299)

Bug fixes:

Fix users not getting notifications when AS listened to that user_id (PR #2216) Thanks to @slipeer!
Fix users without push set up not getting notifications after joining rooms (PR #2236)
Fix preview url API to trim long descriptions (PR #2243)
Fix bug where we used cached but unpersisted state group as prev group, resulting in broken state on restart (PR #2263)
Fix removing of pushers when using workers (PR #2267)
Fix CORS headers to allow Authorization header (PR #2285) Thanks to @krombel!

Synapse 0.21.1 released!

2017-06-22 — Tech — Matthew Hodgson

Hi folks - we forgot to mention that Synapse 0.21.1 was released last week. This contains an important fix to the report-stats option, which was otherwise failing to report any usage stats for folks who had the option turned on.

This is a good opportunity to note that the report-stats option is really really important for the ongoing health of the Matrix ecosystem: when raising funding to continue working on Matrix we have to be able to demonstrate how the ecosystem is growing and why it's a good idea to support us to work on it. In practice, the data we collect is: hostname, synapse version & uptime, total_users, total_nonbridged users, total_room_count, daily_active_users, daily_active_rooms, daily_messages and daily_sent_messages.

Folks: if you have turned off report-stats for whatever reason, please consider upgrading to 0.21.1 and turning it back on.

In return, the plan is that we'll start to publish an official Grafana of the anonymised aggregate stats, probably embedded into the frontpage of Matrix.org, and then you and everyone else can have a view of the state of the Matrix universe. And critically, it'll really help us continue to justify $ to spend on growing the project!

You can get Synapse 0.21.1 from https://github.com/matrix-org/synapse or https://github.com/matrix-org/synapse/releases/tag/v0.21.1 as normal.

Synapse 0.21.0 is released!

2017-05-18 — Tech — Thomas Lant

Hi all,

Synapse 0.21.0 was released a moment ago. This release lands a number of performance improvements and stability fixes, plus a couple of small features.

For those of you upgrading https://github.com/matrix-org/synapse has the details as usual. Full changelog follows:

Changes in synapse v0.21.0 (2017-05-17)

Features:

Add per user rate-limiting overrides (PR #2208)
Add config option to limit maximum number of events requested by /sync and /messages (PR #2221) Thanks to @psaavedra!

Changes:

Various small performance fixes (PR #2201, #2202, #2224, #2226, #2227, #2228, #2229)
Update username availability checker API (PR #2209, #2213)
When purging, don't de-delta state groups we're about to delete (PR #2214)
Documentation to check synapse version (PR #2215) Thanks to @hamber-dick!
Add an index to event_search to speed up purge history API (PR #2218)

Bug fixes:

Fix API to allow clients to upload one-time-keys with new sigs (PR #2206)

Changes in synapse v0.21.0-rc2 (2017-05-08)

Changes:

Always mark remotes as up if we receive a signed request from them (PR #2190)

Bug fixes:

Fix bug where users got pushed for rooms they had muted (PR #2200)

Changes in synapse v0.21.0-rc1 (2017-05-08)

Features:

Add username availability checker API (PR #2183)
Add read marker API (PR #2120)

Changes:

Enable guest access for the 3pl/3pid APIs (PR #1986)
Add setting to support TURN for guests (PR #2011)
Various performance improvements (PR #2075, #2076, #2080, #2083, #2108, #2158, #2176, #2185)
Make synctl a bit more user friendly (PR #2078, #2127) Thanks @APwhitehat!
Replace HTTP replication with TCP replication (PR #2082, #2097, #2098, #2099, #2103, #2014, #2016, #2115, #2116, #2117)
Support authenticated SMTP (PR #2102) Thanks @DanielDent!
Add a counter metric for successfully-sent transactions (PR #2121)
Propagate errors sensibly from proxied IS requests (PR #2147)
Add more granular event send metrics (PR #2178)

Bug fixes:

Fix nuke-room script to work with current schema (PR #1927) Thanks @zuckschwerdt!
Fix db port script to not assume postgres tables are in the public schema (PR #2024) Thanks @jerrykan!
Fix getting latest device IP for user with no devices (PR #2118)
Fix rejection of invites to unreachable servers (PR #2145)
Fix code for reporting old verify keys in synapse (PR #2156)
Fix invite state to always include all events (PR #2163)
Fix bug where synapse would always fetch state for any missing event (PR #2170)
Fix a leak with timed out HTTP connections (PR #2180)
Fix bug where we didn't time out HTTP requests to ASes (PR #2192)

Docs:

Clarify doc for SQLite to PostgreSQL port (PR #1961) Thanks @benhylau!
Fix typo in synctl help (PR #2107) Thanks @HarHarLinks!
web_client_location documentation fix (PR #2131) Thanks @matthewjwolff!
Update README.rst with FreeBSD changes (PR #2132) Thanks @feld!
Clarify setting up metrics (PR #2149) Thanks @encks!

Update on Matrix.org homeserver reliability

2017-05-12 — Tech — Matthew Hodgson

Hi folks,

We've had a few outages over the last week on the Matrix.org homeserver which have caused problems for folks using bridges or accounts hosted on matrix.org itself - we'd like to apologise to everyone who's been caught in the crossfire. In the interests of giving everyone visibility on what's going on and what we're doing about it (and so folks can learn from our mistakes! :), here's a quick writeup (all times are UTC):

2017-05-04 21:05: The datacenter where we host matrix.org performs an emergency unscheduled upgrade of the VM host where the main matrix.org HS & DB master lives. This means a live-migration of the VM onto another host, which freezes the (huge) VM for 9 minutes, during which service is (obviously) down. Monitoring fires; we start investigating and try to get in on the console, but by the point we're considering failing over to the hot-spare, the box has come back and recovers fine other than a load spike as all the traffic catches up. The clock however is off by 9 minutes due to its world having paused.
2017-05-04 22:30: We step NTP on the host to fix the clock (maximum clock skew on ISC ntpd is 500ppm, meaning it would take weeks to reconverge naturally, during which time we're issuing messages with incorrect timestamps).
2017-05-05 01:25: Network connectivity breaks between the matrix.org homeserver and the DC where all of our bridges/bots are hosted.
2017-05-05 01:40: Monitoring alerts fire for bridge traffic activity and are picked up. After trying to restart the VPN between the DC a few times, it turns out that the IP routes needed for the VPN have mysteriously disappeared.
2017-05-05 02:23: Routes are manually re-added and VPN recovers and traffic starts flowing again. It turns out that the routes are meant to be maintained by a post-up script in /etc/network/interfaces, which assumes that /sbin/ip is on the path. Historically this hasn't been a problem as the DHCP lease on the host has never expired (only been renewed every 6 hours) - but the time disruption caused by the live-migration earlier means that on this renewal cycle the lease actually expires and the routes are lost and not re-added. Basic bridging traffic checks are done (e.g. Freenode->Matrix).
2017-05-05 08:30: Turns out that whilst Freenode->Matrix traffic was working, Matrix->Freenode was wedged due to a missing HTTP timeout in the AS logic on Synapse. Synapse is restarted and the bug fixed.
...the week goes by...
2017-05-11 18:00: (Unconnected to the rest of this outage, an IRC DDoS on GIMPnet causes intermittent load problems and delayed messages on matrix.org; we turn off the bridge for a few hours until it subsides).
2017-05-12 02:50: The postgres partition on the matrix.org DB master diskfills and postgres halts. Monitoring alerts fire (once, phone alerts), but the three folks on call manage to sleep through their phone ringing.
2017-05-12 04:45: Folks get woken up and notice the outage; clear up diskspace; restart postgres. Meanwhile, synapse appears to recover, other than mysteriously refusing to send messages from local users. Investigation commences in the guts of the database to try to see what corruption has happened.
2017-05-12 06:00: We realise that nobody has actually restarted synapse since the DB outage began, and the failure is probably a known issue where worker replication can fail and cause the master synapse process to fail to process writes. Synapse is restarted; everything recovers (including bridges).
2017-05-12 06:20: Investigation into the cause of the diskfill reveals it to be due to postgres replication logs (WALs) stacking up on the DB master, due to replication having broken to a DB slave during the previous networking outage. Monitoring alerts triggered but weren't escalated due to a problem in PagerDuty.
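For the curious, the "would take weeks" estimate in the 22:30 entry can be sanity-checked with a bit of arithmetic. This is a back-of-the-envelope sketch which assumes ntpd slews continuously at the full 500ppm limit (in practice it would converge more slowly than this best case):

```python
SLEW_LIMIT_PPM = 500          # maximum slew rate quoted in the timeline above
CLOCK_OFFSET_SECONDS = 9 * 60  # the VM was frozen for 9 minutes


def seconds_to_slew(offset_seconds: float, slew_ppm: float) -> float:
    """Wall-clock time needed to absorb an offset at a constant slew rate.

    At N ppm the clock gains or loses N microseconds per second, so
    correcting `offset_seconds` takes offset * 1,000,000 / N seconds.
    """
    return offset_seconds * 1_000_000 / slew_ppm


best_case = seconds_to_slew(CLOCK_OFFSET_SECONDS, SLEW_LIMIT_PPM)
print(f"{best_case:.0f} seconds, i.e. {best_case / 86400:.1f} days")
```

That works out to 12.5 days at the very best, which is why stepping the clock was the pragmatic choice.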

Lessons learned:

Test your networking scripts and always check your box self-recovers after a restart (let alone a DHCP renewal).
Don't use DHCP in production datacenters unless you really have no choice; it just adds potential ways for everything to break.
We need better end-to-end monitoring for bridged traffic.
We need to ensure HS<->Bridge traffic is more reliable (improved by fixing timeout logic in synapse).
We need better monitoring and alerting of DB replication traffic.
We need to escalate PagerDuty phone alerts more aggressively (done).
We need better alerting for disk fill thresholds (especially "time until fill", remembering to take into account the emergency headroom reserved by the filesystem for the superuser).
We should probably have scripts to rapidly (or even automatedly) switch between synapse master & hot-spare, and to promote DB slaves in the event of a master failure.
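As an illustration of the "time until fill" idea from the list above, here's a minimal sketch that extrapolates linearly from two disk usage samples. The 5% reserved fraction is an assumption (it matches the usual ext filesystem default for superuser-reserved blocks), and the function name and sample figures are ours:

```python
from typing import Optional


def hours_until_full(
    total: float,
    used_then: float,
    used_now: float,
    hours_between: float,
    reserved_fraction: float = 0.05,  # assumed superuser-reserved headroom
) -> Optional[float]:
    """Estimate hours until a non-root process (e.g. postgres) sees the disk as full.

    Space reserved for the superuser is subtracted first, since an
    unprivileged process hits "disk full" before the raw capacity is used up.
    Returns None if usage is flat or shrinking.
    """
    usable = total * (1 - reserved_fraction)
    growth_per_hour = (used_now - used_then) / hours_between
    if growth_per_hour <= 0:
        return None
    remaining = max(usable - used_now, 0)
    return remaining / growth_per_hour


# Made-up example: a 1000 GB partition growing from 800 GB to 830 GB in 6 hours.
estimate = hours_until_full(1000, 800, 830, 6)
print(f"roughly {estimate:.0f} hours until fill")
```

Alerting on this estimate dropping below some threshold (rather than on a fixed percentage) gives whoever is on call a sense of how urgent the problem actually is.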

Hopefully this is the last we've seen of this root cause; we'll be working through the todo list above. Many apologies again for the instability - however please do remember that you can (and should!) run your own homeserver & bridges to stay immune to whatever operational dramas we have with the matrix.org instance!

Synapse 0.20.0 is released!

2017-04-11 — Tech — Matthew Hodgson

Hi folks,

Synapse 0.20.0 was released a few hours ago - this is a major new release with loads of stability and performance fixes and some new features too. The main headlines are:

Support for using phone numbers as 3rd party identifiers as well as email addresses! This is huge: letting you discover other users on Matrix based on whether they've linked their phone number to their matrix account, and letting you log in using your phone number as your identifier if you so desire. Users of systems like WhatsApp should find this both familiar and useful ;)
Fixes some very nasty failure modes where the state of a room could be reset if a homeserver received an event it couldn't verify. Folks who have suffered rooms suddenly losing their name/icon/topic should particularly upgrade - this won't fix the rooms retrospectively (your server will need to rejoin the room), but it should fix the problem going forwards.
Improves the retry schedule over federation significantly - previously there were scenarios where synapse could try to retry aggressively on servers which were offline. This fixes that.
Significant performance improvements to /publicRooms, /sync, and other endpoints.
Lots of juicy bug fixes.
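To illustrate the general idea behind a gentler retry schedule, here's a sketch of capped exponential backoff. To be clear, this is not Synapse's actual algorithm: the starting interval, multiplier and cap are all made-up values for illustration.

```python
from typing import List


def retry_intervals(
    initial_seconds: float = 60.0,   # hypothetical first wait
    multiplier: float = 5.0,         # hypothetical growth factor
    max_seconds: float = 86400.0,    # hypothetical cap of one day
    attempts: int = 6,
) -> List[float]:
    """Return the wait before each successive retry to an unreachable server.

    Each wait is `multiplier` times the previous one, capped at `max_seconds`,
    so a server which stays offline is contacted less and less often.
    """
    intervals = []
    interval = initial_seconds
    for _ in range(attempts):
        intervals.append(min(interval, max_seconds))
        interval *= multiplier
    return intervals


print(retry_intervals())
```

The point of the cap is that a long-dead server still gets probed occasionally, so that federation resumes once it comes back.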

We highly recommend upgrading (or installing!) asap - https://github.com/matrix-org/synapse has the details as usual. Full changelog follows:

Changes in synapse v0.20.0 (2017-04-11)

Bug fixes:

Fix joining rooms over federation where not all servers in the room saw the new server had joined (PR #2094)

Changes in synapse v0.20.0-rc1 (2017-03-30)

Features:

Add delete_devices API (PR #1993)
Add phone number registration/login support (PR #1994, #2055)

Changes:

Use JSONSchema for validation of filters. Thanks @pik! (PR #1783)
Reread log config on SIGHUP (PR #1982)
Speed up public room list (PR #1989)
Add helpful texts to logger config options (PR #1990)
Minor /sync performance improvements. (PR #2002, #2013, #2022)
Add some debug to help diagnose weird federation issue (PR #2035)
Correctly limit retries for all federation requests (PR #2050, #2061)
Don't lock table when persisting new one time keys (PR #2053)
Reduce some CPU work on DB threads (PR #2054)
Cache hosts in room (PR #2060)
Batch sending of device list pokes (PR #2063)
Speed up persist event path in certain edge cases (PR #2070)

Bug fixes:

Fix bug where current_state_events renamed to current_state_ids (PR #1849)
Fix routing loop when fetching remote media (PR #1992)
Fix current_state_events table to not lie (PR #1996)
Fix CAS login to handle PartialDownloadError (PR #1997)
Fix assertion to stop transaction queue getting wedged (PR #2010)
Fix presence to fallback to last_active_ts if it beats the last sync time. Thanks @Half-Shot! (PR #2014)
Fix bug when federation received a PDU while a room join is in progress (PR #2016)
Fix resetting state on rejected events (PR #2025)
Fix installation issues in readme. Thanks @ricco386 (PR #2037)
Fix caching of remote servers' signature keys (PR #2042)
Fix some leaking log context (PR #2048, #2049, #2057, #2058)
Fix rejection of invites not reaching sync (PR #2056)

Opening up cyberspace with Matrix and WebVR!

2017-04-04 — Tech — Matthew Hodgson

TL;DR: here's the demo!

Hi everyone,

Today is a special day, the sort of day where you take a big step towards an ultimate dream. Starting Matrix and seeing it gaining momentum is already huge for us, a once in a lifetime opportunity. But one of the crazier things which drove us to create Matrix is the dream of creating cyberspace; the legendary promised land of the internet.

Whether it's the Matrix of Neuromancer, the Metaverse of Snow Crash or the Other Plane of True Names, an immersive 3D environment where people can meet from around the world to communicate, create and share is the ultimate expression of the Internet's potential as a way to connect: the idea of an open, neutral, decentralised virtual reality within the 'net.

This is essentially the software developer equivalent of lying on your back at night, looking up at the stars, and wondering if you'll ever fly among them... and Matrix is not alone in dreaming of this! There have been many walled-garden virtual worlds over the years - Second Life, Habbo Hotel, all of the MMORPGs, Project Sansar etc. And there have been decentralised worlds which lack the graphics but share the vision - whether it's FidoNet, Usenet, IRC servers, XMPP, the blogosphere or Matrix as it's used today. And there are a few ambitious projects like Croquet/OpenCobalt, Open Simulator, JanusVR or High Fidelity which aim for a decentralised cyberspace, albeit without defining an open standard.

But despite all this activity, where is the open cyberspace? Where is the universal fabric which could weave these worlds together? Where is the VR equivalent of The Web itself? Where is the neutral communication layer that could connect the galaxies of different apps and users into a coherent reality? How do you bridge between today's traditional web apps and their VR equivalents?

Aside from cultural ones, we believe there are three missing ingredients which have been technically holding back the development of an open cyberspace so far:

The hardware
Client software support (i.e. apps)
A universal real-time data layer to store the space

Nowadays the hardware problem is effectively solved: the HTC Vive, Oculus Rift and even Google Cardboard have brought VR displays to the general public. Meanwhile, accelerometers and head-tracking turn normal screens into displays for immersive content without even needing goggles, giving everyone a window into a virtual world.

Client software is a more interesting story: while many custom and proprietary VR apps already exist out there, almost none of them can connect to servers other than the ones run by their own vendors, or even to other services and apps. An open neutral cyberspace is just like the web: it needs the equivalent of web browsers, i.e. ubiquitous client apps which can connect to any servers and services written by any vendors and hosted by any providers, communicating together via an open common language. And while web browsers of course exist, until very recently there was no way to link them into VR hardware.

This has changed with the creation of WebVR by Mozilla - defining an API to let browsers render VR content, gracefully degrading across hardware and platforms such that you get the best possible experience all the way from a top-end gaming PC + Vive, down to tapping on a link on a simple smartphone. WebVR is a genuine revolution: suddenly every webapp on the planet can create a virtual world! And frameworks like A-Frame, aframe-react and React VR make it incredibly easy and fun to do.

So looking back at our list, the final missing piece is nothing less than a backbone: some kind of data layer to link these apps together. Right now, all the WebVR apps out there are little islands - each its own isolated walled garden, and there is no standard way to provide shared experiences. There is no standard way for users to communicate between these worlds (or between the VR and non-VR web) - be that by messaging, VoIP, Video or even VR interactions. There is no standard way to define an avatar, its location and movement within a world, or how it might travel between worlds. And finally, there is no standard way to describe the world's state in general: each webapp is free to manage its scene and its content any way it likes; there is nowhere to expose the realtime scene-graph of the world such that other avatars, bots, apps, services etc. can interact with it. In the same way, there is no standard way to exchange messages or reuse a user profile between messaging apps today. If cyberspace is taking shape as we speak, it is definitely not taking the path of openness. At least not yet.

Predictably enough, it's this last point of the 'missing data layer for cyberspace' which we've been thinking about with Matrix: an open, neutral, decentralised meta-network powering or connecting these worlds. To start with, we've made Matrix available as a generic communications layer for VR, taking WebVR (via A-Frame) and combining it with matrix-js-sdk, as an open, secure and decentralised way to place voip calls, video calls and instant messaging both within and between WebVR apps and the rest of the existing Matrix ecosystem (e.g. apps like Riot).

In fact, the best way is to test it live: we've put together a quick demo at https://matrix.org/vrdemo to show it off, so please give it a go!

In the demo you get:

a virtual lobby, providing a 1:1 WebRTC video call via Matrix through to a 'guide' user of your choice anywhere else in Matrix (VR or not). From the lobby you can jump into two other apps:
a video conference, calling between all the participants of a given Matrix room in VR (no interop yet with other Matrix apps)
a 'virtual tourism' example, featuring a 1:1 WebRTC video call with a guide, superimposed over the top of the user going skiing through 360 degree video footage.

Video calling requires a WebRTC-capable browser (Chrome or Firefox). Unfortunately no iOS browsers support it yet. If you have dedicated VR hardware (Vive or Rift), you'll have to configure your browser appropriately to use the demo - see https://webvr.rocks for the latest details.

Needless to say, the demo's open sourced under the Apache License like all things Matrix - you can check out the code from https://github.com/matrix-org/matrix-vr-demo. Huge kudos to Richard Lewis, Rob Swain and Ben Lewis for building this out - and to Aidan Taub and Tom Elliott for providing the 360 degree video footage!

The demo is quite high-bandwidth and hardware intensive, so here's a video of it in action, just in case:

Now, it's important to understand that here we're using Matrix as a standard communications API for VR, but we're not using Matrix to store any VR world data (yet). The demo uses plain A-Frame via aframe-react to render its world: we are not providing an API which exposes the world itself onto the network for folks to interact with and extend. This is because Matrix is currently optimised for storing and synchronising two types of data structure: decentralised timelines of conversation data, and arbitrary decentralised key-value data (e.g. room names, membership, topics).
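To make the two data structures concrete, here's a deliberately tiny, single-process sketch of a room holding a timeline plus a key-value state map keyed by (event type, state key). It ignores everything that makes Matrix interesting (federation, state resolution, auth), and the class and method names are ours rather than anything from Synapse or matrix-js-sdk:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Room:
    # Ordered history of every event sent to the room.
    timeline: List[dict] = field(default_factory=list)
    # Current state: latest content for each (event type, state key) pair.
    state: Dict[Tuple[str, str], dict] = field(default_factory=dict)

    def send(self, event: dict) -> None:
        """Append an event to the timeline; state events also update the key-value map."""
        self.timeline.append(event)
        if "state_key" in event:
            self.state[(event["type"], event["state_key"])] = event["content"]


room = Room()
room.send({"type": "m.room.name", "state_key": "", "content": {"name": "Lobby"}})
room.send({"type": "m.room.message", "content": {"msgtype": "m.text", "body": "hi"}})
room.send({"type": "m.room.name", "state_key": "", "content": {"name": "VR Lobby"}})

print(len(room.timeline), room.state[("m.room.name", "")])
```

A scene graph doesn't map neatly onto either structure: it's neither a linear history nor a flat set of keys, which is the gap the next paragraph describes.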

However, the job of storing arbitrary world data requires storing and flexibly querying it as an object graph of some kind - e.g. as a scene graph hierarchy. Doing this efficiently whilst supporting Matrix's decentralised eventual consistency model is tantamount to evolving Matrix into being a generic decentralised object-graph database (whilst upholding the constraints of that virtual world). This is tractable, but it's a bunch more work than just supporting the eventually-consistent timeline & key-value store we have today. It's something we're thinking about though. :)

Also, Matrix is currently not super low-latency - on a typical busy Synapse deployment event transmission between clients has a latency of 50-200ms (ignoring network). This is fine for instant messaging and setting up VoIP calls etc, but useless for publishing the realtime state of a virtual world: having to wait 200ms to be told that something happened in an interactive virtual world would be a terrible experience. However, there are various fixes we can do here: Matrix itself is getting faster; Dendrite is expected to be one or two orders of magnitude faster than Synapse. We could also use Matrix simply as a signalling layer in order to negotiate a lower latency realtime protocol to describe world data updates (much as we use Matrix as a signalling layer for setting up RTP streams for VoIP calls or MIDI sessions).

Finally, you need somewhere to store the world assets themselves (textures, sounds, Collada or GLTF assets, etc). This is no different to using attachments in Matrix today - this could be as plain HTTP, or via the Matrix decentralised content store (mxc:// URLs), or via something like IPFS.
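
As a rough illustration of the mxc:// option, the sketch below turns a content URI into a plain HTTP download URL. The `mxc_to_http` helper, the example media ID and the homeserver URL are all hypothetical. The `/_matrix/media/r0/download/` path is the media repository endpoint as we understand it from the r0 Client-Server API, so treat it as an assumption and check the spec version your server implements.

```python
from urllib.parse import quote, urlparse


def mxc_to_http(mxc_url: str, homeserver: str) -> str:
    """Map mxc://<server>/<media_id> onto an HTTP download URL on `homeserver`."""
    parsed = urlparse(mxc_url)
    media_id = parsed.path.lstrip("/")
    if parsed.scheme != "mxc" or not parsed.netloc or not media_id:
        raise ValueError(f"not a valid mxc:// URL: {mxc_url!r}")
    # Assumed r0 media download path: /download/<origin server>/<media id>
    return (
        f"{homeserver.rstrip('/')}/_matrix/media/r0/download/"
        f"{parsed.netloc}/{quote(media_id)}"
    )


# Hypothetical texture asset uploaded to the matrix.org content repository.
print(mxc_to_http("mxc://matrix.org/abc123", "https://matrix.org"))
```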

This said, it's only a matter of time before someone starts storing world data itself in Matrix. We have more work to do before it's a tight fit, but this has always been one of the long-term goals of the project and we're going to do everything we can to support it well.

So: this is the future we're thinking of. Obviously work on today's Matrix servers, clients, spec & bridges is our focus and priority right now and we have lots of work left there - but the longer term plan is critical too. Communication in VR is pretty much a blank canvas right now, and Matrix can be the connecting fabric for it - which is unbelievably exciting. Right now our demo is just a PoC - we'd encourage all devs reading this to have a think about how to extend it, and how we all can build together the new frontier of cyberspace!

Finally, if you're interested in chatting more about VR on Matrix, come hang out over at #vr:matrix.org!

Matthew, Amandine & the Matrix team

Amandine grins at the future in the newest skunkworks zone of the London https://t.co/y2YCHNIbgU HQ... :> :D 😈 pic.twitter.com/K5xBz7U9o2
— Matrix (@matrixdotorg) March 18, 2017

Dendrite receives its first messages!!!

2017-03-15 — Tech — Matthew Hodgson

Hi all,

We hit a major milestone today on Dendrite, our next-generation golang homeserver: Dendrite received its first messages!!

Before you get too excited, please understand that Dendrite is still a pre-alpha work in progress - whilst we successfully created some rooms on an instance and sent a bunch of messages into them via the Client-Server API, most other functionality (e.g. receiving messages via /sync, logging in, registering, federation etc) is yet to exist. It cannot yet be used as a homeserver. However, this is still a huge step in the right direction, as it demonstrates the core DAG functionality of Matrix is intact, and the beginnings of a usable Client-Server API are hooked up.

The architecture of Dendrite is genuinely interesting - please check out the wiring diagram if you haven't already. The idea is that the server is broken down into a series of components which process streams of data stored in Kafka-style append-only logs. Each component scales horizontally (you can have as many as required to handle load), which is an enormous win over Synapse's monolithic design. The components are also decoupled from one another by the logs, letting them run on entirely different machines as required. Please note that whilst the initial implementation is using Kafka for convenience, the actual append-only log mechanism is abstracted away - in future we expect to see configurations of Dendrite which operate entirely from within a single go executable (using go channels as the log mechanism), as well as alternatives to Kafka for distributed solutions.
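
The decoupling idea can be sketched in a few lines. This is not Dendrite's code (which is written in Go): `AppendOnlyLog`, `Component` and the `accept` handler are hypothetical stand-ins, and an in-memory list plays the role of a Kafka topic. The point it tries to show is that each component only knows about the logs it reads from and writes to, and tracks its own read offset.

```python
from typing import Callable, Optional


class AppendOnlyLog:
    """In-memory stand-in for a Kafka-style topic: entries are only ever appended."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, entry: dict) -> int:
        self._entries.append(entry)
        return len(self._entries) - 1  # offset of the new entry

    def read_from(self, offset: int) -> list[dict]:
        return self._entries[offset:]


class Component:
    """Consumes one log from its own offset and optionally writes to another."""

    def __init__(
        self,
        source: AppendOnlyLog,
        handler: Callable[[dict], Optional[dict]],
        sink: Optional[AppendOnlyLog] = None,
    ) -> None:
        self.source, self.handler, self.sink = source, handler, sink
        self.offset = 0

    def poll(self) -> int:
        """Process everything appended since the last poll; return the count."""
        entries = self.source.read_from(self.offset)
        for entry in entries:
            result = self.handler(entry)
            if result is not None and self.sink is not None:
                self.sink.append(result)
        self.offset += len(entries)
        return len(entries)


def accept(event: dict) -> dict:
    # Hypothetical stand-in for the roomserver's checks: accept everything.
    return {**event, "accepted": True}


input_log, output_log = AppendOnlyLog(), AppendOnlyLog()
roomserver = Component(input_log, accept, output_log)

seen: list[str] = []
downstream = Component(output_log, lambda e: seen.append(e["body"]))

input_log.append({"body": "first"})
input_log.append({"body": "second"})
roomserver.poll()
downstream.poll()
print(seen)
```

Because the only shared thing is the log, swapping the list for Kafka, or for an in-process queue, would not change the components themselves, which is the abstraction the paragraph above describes.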

The main component which has taken form so far is the central roomserver service, which is responsible (as the name suggests) for maintaining the state and integrity of one or more rooms: authorizing events into the room DAG, storing them in postgres, tracking the auth chain of events where needed, etc. Much of the core matrix DAG logic of the roomserver is provided by gomatrixserverlib. The roomserver receives events sent by users via the 'client room send' component (and 'federation backfill' component, when that exists). The 'client room send' component (and in future also 'client sync') is provided by the clientapi service - which, as of today, is successfully creating rooms and events and relaying them to the roomserver!

The actual events we've been testing with are the history of the Matrix Core room: around 10k events. Right now the roomserver (and the postgres DB that backs it) are the main bottleneck in the pipeline rather than clientapi, so it's been interesting to see how rapidly the roomserver can consume its log of events. As of today's benchmark, on a generic dev workstation and an entirely unoptimised roomserver (i.e. no caching whatsoever) running on a single core, we're seeing it ingest the room history at over 350 events per second. The vast majority of this work is going into encoding/decoding JSON or waiting for postgres: with a simple event cache to avoid repeatedly hitting the DB for every auth and state event, we expect this to increase significantly. And then as we increase the number of cores, kafka partitions and roomserver instances it should scale fairly arbitrarily(!)

For context, the main synapse process for Matrix.org currently maxes out persisting events at somewhere between 15 and 20 per second (although it is also spending a bunch of time relaying events to the various worker processes, and other miscellanies). As such, an initial benchmark for Dendrite of 350 msgs/s really does look incredibly promising.
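
For a sense of scale, the arithmetic behind that comparison can be worked through directly. The figures below are the ones quoted in this post (around 10k events, over 350 events/s, 15 to 20 events/s); the variable names are just for illustration, and since the Synapse figure comes from a busy production process the comparison is indicative rather than like-for-like.

```python
ROOM_HISTORY_EVENTS = 10_000        # "around 10k events"
DENDRITE_EVENTS_PER_SEC = 350       # lower bound from the roomserver benchmark
SYNAPSE_EVENTS_PER_SEC = (15, 20)   # quoted range for the main synapse process

# Time to ingest the whole room history at each rate.
dendrite_seconds = ROOM_HISTORY_EVENTS / DENDRITE_EVENTS_PER_SEC
synapse_seconds = [ROOM_HISTORY_EVENTS / rate for rate in SYNAPSE_EVENTS_PER_SEC]

# Ratio of the benchmark to each end of the Synapse range.
speedups = [DENDRITE_EVENTS_PER_SEC / rate for rate in SYNAPSE_EVENTS_PER_SEC]

print(f"Dendrite: {dendrite_seconds:.1f}s")
print(f"Synapse: {synapse_seconds[1]:.0f}s to {synapse_seconds[0]:.0f}s")
print(f"Ratio: {speedups[1]:.1f}x to {speedups[0]:.1f}x")
```

In other words, the room history goes in within about half a minute at the benchmark rate, versus several minutes at Synapse's ceiling.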

You may be wondering where this leaves Synapse? Well, a major driver for implementing Dendrite has been to support the growth of the main matrix.org server, which currently persists around 10 events/s (whilst emitting around 1500 events/s). We have exhausted most of the low-hanging fruit for optimising Synapse, and have got to the point where the architectural fixes required are of similar shape and size to the work going into Dendrite. So, whilst Synapse is going to be around for a while yet, we're putting the majority of our long-term plans into Dendrite, with a distinct degree of urgency as we race against the ever-increasing traffic levels on the Matrix.org server!

Finally, you may be wondering what happened to Dendron, our original experiment in Golang servers. Well: Dendron was an attempt at a strangler pattern rewrite of Synapse, acting as a shim in front of Synapse which could gradually swap out endpoints with their golang implementations. In practice, the operational complexity it introduced, as well as the amount of room for improvement (at the time) we had in Synapse, and the relatively tight coupling to Synapse's existing architecture, storage & schema, meant that it was far from a clear win - and effectively served as an excuse to learn Go. As such, we've finally formally killed it off as of last week - Matrix.org is now running behind a normal haproxy, and Dendron is dead. Meanwhile, Dendrite (aka Dendron done Right ;) is very much alive, progressing fast, free from the shackles of Synapse.

We'll try to keep the blog updated with progress on Dendrite as it continues to grow!

How do I bridge thee? Let me count the ways...

2017-03-11 — Tech — Matthew Hodgson

Bridges come in many flavours, and we need consistent terminology within the Matrix community to ensure everyone (users, developers, core team) is on the same page. This post is primarily intended for bridge developers to refer to when building bridges.

The most recent version of this document is here (source) but we're also posting it as a blog post for visibility.

🔗Types of rooms

🔗Portal rooms

Bridges can register themselves as controlling chunks of the room alias namespace, letting Matrix users join remote rooms transparently if they /join #freenode_#wherever:matrix.org or similar. The resulting Matrix room is typically automatically bridged to the single target remote room. Access control for Matrix users is typically managed by the remote network's side of the room. This is called a portal room, and is useful for jumping into remote rooms without any configuration needed whatsoever - using Matrix as a 'bouncer' for the remote network.
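
As a sketch of how such a namespace claim might be checked, the snippet below matches room aliases against a regex taken from a registration-style structure. The dictionary mirrors the `namespaces` / `aliases` shape of an application service registration as we understand it, but the regex, the `bridge_claims_alias` helper and the choice of a full-string match are assumptions for illustration rather than how any particular homeserver implements it.

```python
import re

# Hypothetical registration fragment: the regex and server name are illustrative.
REGISTRATION = {
    "namespaces": {
        "aliases": [{"exclusive": True, "regex": r"#freenode_.*:matrix\.org"}],
    }
}


def bridge_claims_alias(alias: str, registration: dict) -> bool:
    """Return True if any alias namespace in the registration matches `alias`."""
    return any(
        re.fullmatch(namespace["regex"], alias) is not None
        for namespace in registration["namespaces"]["aliases"]
    )


# A portal alias falls inside the bridge's namespace...
print(bridge_claims_alias("#freenode_#wherever:matrix.org", REGISTRATION))
# ...whereas an ordinary Matrix room alias does not.
print(bridge_claims_alias("#matrix:matrix.org", REGISTRATION))
```

When a user joins an alias inside the claimed namespace, the homeserver can hand the lookup to the bridge, which creates the portal room on demand.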

🔗Plumbed rooms

Alternatively, an existing Matrix room can be plumbed into one or more specific remote rooms by configuring a bridge (which can be run by anyone). For instance, #matrix:matrix.org is plumbed into #matrix on Freenode, matrixdotorg/#matrix on Slack, etc. Access control for Matrix users is necessarily managed by the Matrix side of the room. This is useful for using Matrix to link together different communities.

Migrating rooms between a portal & plumbed room is currently a bit of a mess, as there's not yet a way for users to remove portal rooms once they're created, so you can end up with a mix of portal & plumbed users bridged into a room, which looks weird from both the Matrix and non-Matrix viewpoints. https://github.com/matrix-org/matrix-appservice-irc/issues/387 tracks this.

🔗Types of bridges (simplest first):

🔗Bridgebot-based bridges

The simplest way to exchange messages with a remote network is to have the bridge log into the network using one or more predefined users called bridge bots - typically called MatrixBridge or MatrixBridge[123] etc. These relay traffic on behalf of the users on the other side, but it's a terrible experience as all the metadata about the messages and senders is lost. This is how the telematrix matrix<->telegram bridge currently works.

🔗Bot-API (aka Virtual user) based bridges

Some remote systems support the idea of injecting messages from 'fake' or 'virtual' users, which can be used to represent the Matrix-side users as unique entities in the remote network. For instance, Slack's inbound webhooks let remote bots be created on demand, letting Matrix users be shown cosmetically correctly in the timeline as virtual users. However, the resulting virtual users aren't real users on the remote system, so don't have presence/profile and can't be tab-completed or direct-messaged etc. They also have no way to receive typing notifs or other richer info which may not be available via bot APIs. This is how the current matrix-appservice-slack bridge works.
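
The difference between the two approaches so far is easiest to see in what the remote network ends up receiving. The two functions below are purely illustrative (hypothetical names and message shape, not taken from any of the bridges mentioned): the first relays everything through a single bridge bot, the second gives each Matrix user their own cosmetic identity.

```python
def relay_via_bridge_bot(sender: str, body: str) -> dict:
    """Everything arrives as the one bot; the real sender survives only as text."""
    return {"remote_user": "MatrixBridge", "text": f"<{sender}> {body}"}


def relay_via_virtual_user(sender: str, body: str) -> dict:
    """Each Matrix user is shown under a name derived from their own user ID."""
    localpart = sender.lstrip("@").split(":", 1)[0]
    return {"remote_user": localpart, "text": body}


print(relay_via_bridge_bot("@alice:matrix.org", "hello"))
print(relay_via_virtual_user("@alice:matrix.org", "hello"))
```

In the bridge bot case the remote side only ever sees one account, so anything keyed on the sender (avatars, mentions, per-user history) is lost; the virtual user case restores the name but, as noted above, not a real account behind it.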

🔗Simple puppeted bridge

This is a richer form of bridging, where the bridge logs into the remote service as if it were a real 3rd party client for that service. As a result, the Matrix user has to already have a valid account on the remote system. In exchange, the Matrix user 'puppets' their remote user, such that other users on the remote system aren't even aware they are speaking to a user via Matrix. The full semantics of the remote system are available to the bridge to expose into Matrix. However, the bridge has to handle the authentication process to log the user into the remote service.

This is essentially how the current matrix-appservice-irc bridge works (if you configure it to log into the remote IRC network as your 'real' IRC nickname). matrix-appservice-gitter is being extended to support both puppeted and bridgebot-based operation. It's how the experimental matrix-appservice-tg bridge works.

Going forwards we're aiming for all bridges to be at least simple puppeted, if not double-puppeted.

🔗Double-puppeted bridge

A simple 'puppeted bridge' allows the Matrix user to control their account on their remote network. However, ideally this puppeting should work in both directions, so if the user logs into (say) their native telegram client and starts conversations, sends messages etc, these should be reflected back into Matrix as if the user had done them there. This requires the bridge to be able to puppet the Matrix side of the bridge on behalf of the user.

This is the holy-grail of bridging; matrix-puppet-bridge is a community project that tries to facilitate development of double puppeted bridges, having done so for several networks. The main obstacle is working out an elegant way of having the bridge auth with Matrix as the matrix user (which requires some kind of scoped access_token delegation).

🔗Server-to-server bridging

Some remote protocols (IRC, XMPP, SIP, SMTP, NNTP, GnuSocial etc) support federation - either open or closed. The most elegant way of bridging to these protocols would be to have the bridge participate in the federation as a server, directly bridging the entire namespace into Matrix.

We're not aware of anyone who's done this yet.

🔗Sidecar bridge

Finally: the types of bridging described above assume that you are synchronising the conversation history of the remote system into Matrix, so it may be decentralised and exposed to multiple users within the wider Matrix network.

This can cause problems where the remote system may have arbitrarily complicated permissions (ACLs) controlling access to the history, which will then need to be correctly synchronised with Matrix's ACL model, without introducing security issues such as races. We already see some problems with this on the IRC bridge, where history visibility for +i and +k channels has to be carefully synchronised with the Matrix rooms.

You can also hit problems with other network-specific features not yet having equivalent representation in the Matrix protocol (e.g. ephemeral messages, or op-only messages - although arguably that's a type of ACL).

One solution could be to support an entirely different architecture of bridging, where the Matrix client-server API is mapped directly to the remote service, meaning that ACL decisions are delegated to the remote service, and conversations are not exposed into the wider Matrix. This is effectively using the bridge purely as a 3rd party client for the network (similar to Bitlbee). The bridge is only available to a single user, and conversations cannot be shared with other Matrix users as they aren't actually Matrix rooms. (Another solution could be to use Active Policy Servers at last as a way of centralising and delegating ACLs for a room)

This is essentially an entirely different product to the rest of Matrix, and whilst it could be a solution for some particularly painful ACL problems, we're focusing on non-sidecar bridges for now.