GDPR on matrix.org

If you’ve connected to the matrix.org homeserver today, you’ll have noticed some activity in support of GDPR compliance. The most obvious of these is an invite from System Alerts (aka @server:matrix.org):

We’ve rolled out the System Alerts feature to communicate important platform information to all of a homeserver’s users. Today, we’re using it to communicate the arrival of our new (and much-improved) Privacy Notice and Terms and Conditions to users on matrix.org.

The System Alerts service takes the form of an (unrejectable) invite to a room. We took this approach to support maximum compatibility with the myriad Matrix clients (since all Matrix clients can support conversations in a room 😊).

When we first rolled out System Alerts, we didn’t allow users to leave the System Alerts room. Sorry! We got a bit overexcited – we’ve fixed that now (though please do provide your agreement before you leave).

What do I need to do?

At some point today the System Alerts service will provide you with a unique link, directing you to review the new terms and provide your agreement.

For us to process your personal data lawfully, it’s really important that we know you understand and agree to our Privacy Notice and Terms and Conditions. For that reason, we will shortly be blocking any users who haven’t indicated their acceptance, so please act quickly when you receive your link.

Once the block is enabled, users who haven’t accepted the terms will see an error when they try to send a message, join a room, or send an invite. This error will also include the unique link to review and accept the terms, so users who haven’t seen the message from System Alerts will know what to do.
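
For client authors, handling that error might look roughly like the sketch below. It is illustrative only: the `M_CONSENT_NOT_GIVEN` error code and the `consent_uri` field name are assumptions about the shape of the error body, and `extract_consent_link` is a hypothetical helper rather than part of any SDK.

```python
import json
from typing import Optional

# Assumed error code for "terms not yet accepted"; not confirmed by this post.
CONSENT_ERRCODE = "M_CONSENT_NOT_GIVEN"


def extract_consent_link(response_body: str) -> Optional[str]:
    """Return the consent link from an error body, or None for any other response."""
    try:
        payload = json.loads(response_body)
    except ValueError:
        return None  # Not JSON at all, so not a consent error.
    if not isinstance(payload, dict):
        return None
    if payload.get("errcode") != CONSENT_ERRCODE:
        return None
    # "consent_uri" is an assumed field name for the unique link.
    link = payload.get("consent_uri")
    return link if isinstance(link, str) else None


example = json.dumps({
    "errcode": CONSENT_ERRCODE,
    "error": "Please review and accept the terms",
    "consent_uri": "https://example.com/consent?u=alice",
})
print(extract_consent_link(example))
```

A client that gets a link back can show it to the user instead of a generic "message failed to send" error.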

Don’t worry if you’re reading this some time after May 25 – accepting the terms at any time will unblock message sending on your account, and you won’t have missed any messages sent to you.

If you have any thoughts or suggestions on the legal documentation, you can provide comment via github.

Synapse v0.30.0 released today!

It’s release o’clock – GDPR time!!!!

v0.30.0 sees the introduction of Server Notices, which provides a channel whereby server administrators can send messages to users on the server, as well as Consent Management for tracking whether users have agreed to the terms and conditions set by the administrator of a server – and blocking access to the server until they have.

Together, these features support GDPR compliance by providing a client-agnostic means to contact users and ask for consent/agreement to a Privacy Notice.

For more information about our approach to GDPR compliance take a look here (although be aware that our position has evolved a bit; see the upcoming new privacy policy for the Matrix.org homeserver for details).

Additionally there are a host of bug fixes and refactors as well as an enhancement to our Dockerfile.

Get it now from https://github.com/matrix-org/synapse/releases/tag/v0.30.0

Changes in synapse v0.30.0 (2018-05-24)

‘Server Notices’ are a new feature introduced in Synapse 0.30. They provide a
channel whereby server administrators can send messages to users on the server.

They are used as part of communication of the server policies (see Consent Tracking),
however the intention is that they may also find a use for features such
as “Message of the day”.

This feature is specific to Synapse, but uses standard Matrix communication mechanisms,
so should work with any Matrix client. For more details see here

Further Server Notices/Consent Tracking Support:

  • Allow overriding the server_notices user’s avatar (PR #3273)
  • Use the localpart in the consent uri (PR #3272)
  • Support for putting %(consent_uri)s in messages (PR #3271)
  • Block attempts to send server notices to remote users (PR #3270)
  • Docs on consent bits (PR #3268)
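
The `%(consent_uri)s` placeholder mentioned above uses Python's %-style named formatting. As a rough illustration of how such a message could be filled in (the `render_notice` helper and the example URL are hypothetical, not Synapse's actual code):

```python
def render_notice(template: str, consent_uri: str) -> str:
    """Fill the %(consent_uri)s placeholder using %-style named formatting."""
    return template % {"consent_uri": consent_uri}


template = "Please review and accept our terms at %(consent_uri)s"
print(render_notice(template, "https://example.com/consent?u=alice"))
```

With this style of formatting, a literal percent sign in the message would need to be written as `%%`.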

Changes in synapse v0.30.0-rc1 (2018-05-23)

GDPR Support:

  • ConsentResource to gather policy consent from users (PR #3213)
  • Move RoomCreationHandler out of synapse.handlers.Handlers (PR #3225)
  • Infrastructure for a server notices room (PR #3232)
  • Send users a server notice about consent (PR #3236)
  • Reject attempts to send event before privacy consent is given (PR #3257)
  • Add a ‘has_consented’ template var to consent forms (PR #3262)
  • Fix dependency on jinja2 (PR #3263)

Features:

  • Cohort analytics (PR #3163, #3241, #3251)
  • Add lxml to docker image for web previews (PR #3239) Thanks to @ptman!
  • Add in flight request metrics (PR #3252)

Changes:

  • Remove unused update_external_syncs (PR #3233)
  • Use stream rather than depth ordering for push actions (PR #3212)
  • Make purge_history operate on tokens (PR #3221)
  • Don’t support limitless pagination (PR #3265)

Bug Fixes:

  • Fix logcontext resource usage tracking (PR #3258)
  • Fix error in handling receipts (PR #3235)
  • Stop the transaction cache caching failures (PR #3255)

Synapse 0.29.1 Released!

It’s release time, people! Not to be outdone by our friends on the Riot web team, Synapse v0.29.1 lands today.

v0.29.1 contains an officially supported docker image (many thanks to the contribution from @kaiyou), continued progress towards Python 3 (thanks to @NotAFile) – as well as a heap of refactorings and bug fixes.

Something worth noting is a potentially breaking change in the error code that /login returns in the Client Server API. Details follow, but the change closes a gap between Synapse behaviour and the spec.

We’d like to give huge thanks to Silvio Fricke and Andreas Peters for writing and maintaining Synapse’s first Dockerfile, as well as allmende, jcgruenhage, ptman, and ilianaw for theirs!  The new Dockerfile from kaiyou has ended up being merged into the main synapse tree and we’re going to try to maintain it going forwards, but folks should use whichever one they prefer.

You can pick it up from https://github.com/matrix-org/synapse/releases/tag/v0.29.1 and thanks to everyone who tested the release candidate.

Changes in synapse v0.29.1 (2018-05-17)

Changes:

  • Update docker documentation (PR #3222)

Changes in synapse v0.29.0 (2018-05-16)

No changes since v0.29.0-rc1

Changes in synapse v0.29.0-rc1 (2018-05-14)

Potentially breaking change:

  • Make Client-Server API return 401 for invalid token (PR #3161). This changes Synapse’s Client-Server API to return a 401 error code instead of 403 when the access token is unrecognised. This is the behaviour required by the specification, but some clients may be relying on the old, incorrect behaviour. Thanks to @NotAFile for fixing this.
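
Clients that talk to homeservers on both sides of this change may want to accept either status code for a while. A minimal sketch, assuming the error body carries the `M_UNKNOWN_TOKEN` error code (taken from the Client-Server spec rather than from this changelog); `is_unknown_token` is a hypothetical helper:

```python
def is_unknown_token(status: int, errcode: str) -> bool:
    """True if the response indicates an unrecognised access token.

    Accepts 401 (the new behaviour) as well as 403 (the old behaviour),
    so the check works against servers before and after the change.
    """
    return status in (401, 403) and errcode == "M_UNKNOWN_TOKEN"


print(is_unknown_token(401, "M_UNKNOWN_TOKEN"))
print(is_unknown_token(403, "M_FORBIDDEN"))
```

Matching on the error code as well as the status avoids treating an ordinary permission failure as a logged-out session.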

Features:

  • Add a Dockerfile for synapse (PR #2846) Thanks to @kaiyou!

Changes – General:

  • nuke-room-from-db.sh: added postgresql option and help (PR #2337) Thanks to @rubo77!
  • Part user from rooms on account deactivate (PR #3201)
  • Make ‘unexpected logging context’ into warnings (PR #3007)
  • Set Server header in SynapseRequest (PR #3208)
  • remove duplicates from groups tables (PR #3129)
  • Improve exception handling for background processes (PR #3138)
  • Add missing consumeErrors to improve exception handling (PR #3139)
  • reraise exceptions more carefully (PR #3142)
  • Remove redundant call to preserve_fn (PR #3143)
  • Trap exceptions thrown within run_in_background (PR #3144)

Changes – Refactors:

  • Refactor /context to reuse pagination storage functions (PR #3193)
  • Refactor recent events func to use pagination func (PR #3195)
  • Refactor pagination DB API to return concrete type (PR #3196)
  • Refactor get_recent_events_for_room return type (PR #3198)
  • Refactor sync APIs to reuse pagination API (PR #3199)
  • Remove unused code path from member change DB func (PR #3200)
  • Refactor request handling wrappers (PR #3203)
  • transaction_id, destination defined twice (PR #3209) Thanks to @damir-manapov!
  • Refactor event storage to prepare for changes in state calculations (PR #3141)
  • Use deferred.addTimeout instead of time_bound_deferred (PR #3127, #3178)
  • Use run_in_background in preference to preserve_fn (PR #3140)

Changes – Python 3 migration:

Bug Fixes:

  • synapse fails to start under Twisted >= 18.4 (PR #3157) Thanks to @Half-Shot!
  • Fix a class of logcontext leaks (PR #3170)
  • Fix a couple of logcontext leaks in unit tests (PR #3172)
  • Fix logcontext leak in media repo (PR #3174)
  • Escape label values in prometheus metrics (PR #3175, #3186)
  • Fix ‘Unhandled Error’ logs with Twisted 18.4 (PR #3182) Thanks to @Half-Shot!
  • Fix logcontext leaks in rate limiter (PR #3183)
  • notifications: Convert next_token to string according to the spec (PR #3190) Thanks to @mujx!
  • nuke-room-from-db.sh: fix deletion from search table (PR #3194) Thanks to @rubo77!
  • add guard for None on purge_history api (PR #3160) Thanks to @krombel!

GDPR Compliance in Matrix

Hi all,

As the May 25th deadline looms, we’ve had lots and lots of questions about how GDPR (the EU’s new General Data Protection Regulation legislation) applies to Matrix and to folks running Matrix servers – and so we’ve written this blog post to try to spell out what we’re doing as part of maintaining the Matrix.org server (and bridges and hosted integrations etc), in case it helps folks running their own servers.

The main controversial point is how to handle Article 17 of the GDPR: ‘Right to Erasure’ (aka Right to be Forgotten).  The question is particularly interesting for Matrix, because as a relatively new protocol with somewhat distinctive semantics it’s not always clear how the rules apply – and there’s no case law to seek inspiration from.

The key question boils down to whether Matrix should be considered more like email (where people would be horrified if senders could erase their messages from your mail spool), or more like Facebook (where people would be horrified if their posts were visible anywhere after they avail themselves of their right to erasure).

Solving this requires making a judgement call, which we’ve approached from two directions: firstly, considering what the spirit of the GDPR is actually trying to achieve (in terms of empowering users to control their data and have the right to be forgotten if they regret saying something in a public setting) – and secondly, considering the concrete legal obligations which exist.  

The conclusion we’ve ended up with is to (obviously) prioritise that Matrix can support all the core concrete legal obligations that GDPR imposes on it – whilst also having a detailed plan for the full ‘spirit of the GDPR’ where the legal obligations are ambiguous.  The idea is to get as much of the longer term plan into place as soon as possible, but ensure that the core stuff is in place for May 25th.

Please note that we are still talking to GDPR lawyers, and we’d also very much appreciate feedback from the wider Matrix community – i.e. this plan is very much subject to change.  We’re sharing it now to ensure everyone sees where our understanding stands today.

The current todo list breaks down into the following categories. Most of these issues have matching github IDs, which we’ll track in a progress dashboard.

Right to Erasure

We’re opting to follow the email model, where the act of sending an event (i.e. message) into a room shares a copy of that message with everyone who is currently in that room. This means that in the privacy policy (see Consent below) users will have to agree that a copy of their messages will be transferred to whoever they are addressing. This is also the model followed by IM systems such as WhatsApp, Twitter DMs or (almost) Facebook Messenger.

This means that if a user invokes their right to erasure, we will need to ensure that their events will only ever be visible to users who already have a copy – and must never be served to new users or the general public.  Meanwhile, data which is no longer accessible by any user must of course be deleted entirely.

In the email analogy: this is like saying that you cannot erase emails that you have sent other people; you cannot try to rewrite history as witnessed by others… but you can erase your emails from a public mail archive or search engine and stop them from being visible to anyone else.

It is important to note that GDPR Erasure is completely separate from the existing Matrix functionality of “redactions” which let users remove events from the room. A “redaction” today represents a request for the human-facing details of an event (message, join/leave, avatar change etc) to be removed.  Technically, there is no way to enforce a redaction over federation, but there is a “gentlemen’s agreement” that this request will be honoured.

The alternative to the ‘email-analogue’ approach would have been to let users automatically apply the existing redact function to all of the events they have ever submitted to a public room. The problem here is that defining a ‘public room’ is subtle, especially to uninformed users: for instance, if a message was sent in a private room (and so didn’t get erased), what happens if that room is later made public? Conversely, if right-to-erasure removed messages from all rooms, it would end up destroying the history integrity of 1:1 conversations, which pretty much everyone agrees is abhorrent. Hence our conclusion to protect erased users from being visible to the general public (or anyone who comes snooping around after the fact) – while preserving their history from the perspective of the people they were talking to at the time.

In practice, our core to-do list for Right to Erasure is:

  • As a first cut, provide Article 17 right-to-erasure at a per-account granularity. The simplest UX for this will be an option when calling the account deactivation API to request erasure as well as deactivation. There will be a 30 day grace period, and (ideally) a 2FA confirmation (if available) to avoid the feature being abused.
  • Homeservers must delete events that nobody has access to any more (i.e. if all the users in a room have GDPR-erased themselves).  If users have deactivated their accounts without GDPR-erasure, then the data will persist in case they reactivate in future.
  • Homeservers must delete media that nobody has access to any more.  This is hard, as media is referenced by mxc:// URLs which may be shared across multiple events (e.g. stickers or forwarded events, including E2E encrypted events), and moreover mxc:// URLs aren’t currently authorized.  As a first cut, we track which user uploaded the mxc:// content, and if they erase themselves then the content will also be erased.
  • Homeservers must not serve up unredacted events over federation to users who were not in the room at the time.  This poses some interesting problems in terms of the privacy implications of sharing MXIDs of erased users over federation – see “GDPR erasure of MXIDs” below.
  • Matrix must specify a way of informing both servers and clients (especially bots and bridges) of GDPR erasures (as distinct from redactions), so that they can apply the appropriate erasure semantics.
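
To make the intended semantics concrete, here is a rough sketch of the visibility and deletion rules described above. It is a toy model under stated assumptions, not Synapse code: the `Event` class, its `members_at_send` field, and both helper functions are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class Event:
    sender: str
    # Users who were in the room when the event was sent (hypothetical field).
    members_at_send: FrozenSet[str]


def may_serve(event: Event, requester: str, erased_users: Set[str]) -> bool:
    """Decide whether an event may be served to `requester`."""
    if event.sender not in erased_users:
        return True  # The normal history-visibility rules would apply here.
    # Erased sender: only people who already had a copy may still see it.
    return requester in event.members_at_send


def is_deletable(event: Event, erased_users: Set[str]) -> bool:
    """An event can be deleted once everyone who held a copy has been erased."""
    return event.members_at_send <= erased_users


event = Event(
    sender="@alice:example.com",
    members_at_send=frozenset({"@alice:example.com", "@bob:example.com"}),
)
erased = {"@alice:example.com"}
print(may_serve(event, "@bob:example.com", erased))
print(may_serve(event, "@carol:example.com", erased))
```

In this model Bob, who was in the room at the time, keeps his view of the conversation, while Carol, who arrives later, never sees the erased user's events.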

GDPR erasure of Matrix IDs

One interesting edge case that comes out of GDPR erasure is that we need a way to stop GDPR-erased events from leaking out over federation – when in practice they are cryptographically signed into the event Directed Acyclic Graph (DAG) of a given room.  Today, we can remove the message contents (and preserve the integrity of the room’s DAG) via redaction – but this still leaves personally identifying information in the form of the Matrix IDs (MXIDs) of the sender of these events.

In practice, this could be quite serious: imagine that you join a public chatroom for some sensitive subject (e.g. #hiv:example.com) and then later on decide that you want to erase yourself from the room.  It would be very undesirable if any new homeserver joining that room received a copy of the DAG showing that your MXID had sent thousands of events into the room – especially if your MXID was clearly identifying (i.e. your real name).

Mitigating this is a hard problem, as MXIDs are baked into the DAG for a room in many places – not least to identify which servers are participating in a room.  The problem is made even worse by the fact that in Matrix, server hostnames themselves are often personally identifying (for one-person homeservers sitting on a personal domain).

We’ve spent quite a lot of time reasoning through how to fix this situation, and a full technical spec proposal for removing MXIDs from events can be found at https://docs.google.com/document/d/1ni4LnC_vafX4h4K4sYNpmccS7QeHEFpAcYcbLS-J21Q. The high level proposal is to switch to giving each user a different ID in the form of a cryptographic public key for every room they participate in, and maintaining a mapping of today’s MXIDs to these per-user-per-room keys. In the event of a GDPR erasure, these mappings can be discarded, pseudonymising the user and avoiding correlation across different rooms. We’d also switch to using cryptographic public keys as the identifiers for Rooms, Events and Users (for cross-room APIs like presence).
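
As a toy illustration of the mapping idea (not the proposal's actual design), the sketch below keeps a table from (MXID, room) pairs to per-room identifiers and discards a user's entries on erasure. The `PseudonymStore` class is hypothetical, and a random token stands in for a real public key so that the example needs only the standard library.

```python
import secrets
from typing import Dict, Optional, Tuple


class PseudonymStore:
    """Maps (MXID, room) pairs to per-room identifiers."""

    def __init__(self) -> None:
        self._by_user_room: Dict[Tuple[str, str], str] = {}

    def id_for(self, mxid: str, room_id: str) -> str:
        """Return the user's identifier in this room, creating it on first use."""
        key = (mxid, room_id)
        if key not in self._by_user_room:
            # Stand-in for generating a key pair and keeping its public half.
            self._by_user_room[key] = secrets.token_hex(32)
        return self._by_user_room[key]

    def lookup(self, per_room_id: str) -> Optional[str]:
        """Resolve a per-room identifier back to an MXID, if still mapped."""
        for (mxid, _room), value in self._by_user_room.items():
            if value == per_room_id:
                return mxid
        return None

    def erase(self, mxid: str) -> None:
        """Discard every mapping for a user, leaving only unlinkable IDs."""
        self._by_user_room = {
            k: v for k, v in self._by_user_room.items() if k[0] != mxid
        }


store = PseudonymStore()
room_a_id = store.id_for("@alice:example.com", "!a:example.com")
room_b_id = store.id_for("@alice:example.com", "!b:example.com")
store.erase("@alice:example.com")
print(store.lookup(room_a_id))
```

After erasure the identifiers still exist inside each room's history, but nothing ties them back to the MXID or to each other.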

This is obviously a significant protocol change, and we’re not going to do it lightly – we’re still waiting for legal confirmation on whether we need it for May 25th (it may be covered as an intrinsic technical limitation of the system).  However, the good news is that it paves the way towards many other desirable features: the ability to migrate accounts between homeservers; the ability to solve the problem of how to handle domain names being reused (or hijacked); the ability to decouple homeservers from DNS so that they can run clientside (for p2p matrix); etc.  The chances are high that this proposal will land in the relatively near future (especially if mandated by GDPR), so input is very appreciated at this point!

Consent

GDPR describes six lawful bases for processing personal data.  For those running Matrix servers, it seems the best route to compliance is the most explicit and active one: consent.

Consent requires that our users are fully informed as to exactly how their data will be used, where it will be stored, and (in our case) the specific caveats associated with a decentralised, federated communication system. They are then asked to provide their explicit approval before using (or continuing to use) the service.

In order to gather consent in a way that doesn’t break all of the assorted Matrix clients connecting to matrix.org today, we have identified both an immediate- and a long-term approach.

The (immediate-term) todo list for gathering consent is:

  • Modify Synapse to serve up a simple ‘consent tool’ static webapp to display the privacy notice/terms and conditions and gather consent to this API.
    • Add a ‘consent API’ to the CS API which lets a server track whether a given user has consented to the server’s privacy policy or not.
  • Send emails and push notifications to advise users of the upcoming change (and link through to the consent tool)
  • Develop a bot that automatically connects to all users (new and existing), posting a link to the consent tool.  This bot can also be used in the future as a general ‘server notice channel’ for letting server admins inform users of privacy policy changes; planned downtime; security notices etc.
  • Modify synapse to reject message send requests for all users who have not yet provided consent
    • return a useful error message which contains a link to the consent tool
  • Making our anonymised user analytics for Riot.im ‘opt in’ rather than ‘opt out’ – this isn’t a requirement of GDPR (since our analytics are fully anonymised) but reflects our commitment to user data sovereignty
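
The "reject and return a link" step from the list above could be sketched as follows. Everything here is an assumption for illustration: the `ConsentRequired` exception, the `u` and `h` query parameters, and the idea of signing the link with an HMAC so that it cannot be forged for another user are not taken from the consent tool itself.

```python
import hashlib
import hmac
from typing import Dict
from urllib.parse import urlencode


class ConsentRequired(Exception):
    """Raised when a user must accept the current terms before sending."""

    def __init__(self, consent_uri: str) -> None:
        super().__init__("Consent required: " + consent_uri)
        self.consent_uri = consent_uri


def build_consent_uri(base_url: str, secret: bytes, localpart: str) -> str:
    """Build a per-user link, signed so it cannot be forged for another user."""
    mac = hmac.new(secret, localpart.encode("utf-8"), hashlib.sha256).hexdigest()
    return base_url + "?" + urlencode({"u": localpart, "h": mac})


def check_consent(
    localpart: str,
    consented_versions: Dict[str, str],
    current_version: str,
    base_url: str,
    secret: bytes,
) -> None:
    """Raise ConsentRequired unless the user accepted the current policy version."""
    if consented_versions.get(localpart) != current_version:
        raise ConsentRequired(build_consent_uri(base_url, secret, localpart))


consents = {"alice": "1.0"}
check_consent("alice", consents, "1.0", "https://example.com/consent", b"s3cret")
try:
    check_consent("bob", consents, "1.0", "https://example.com/consent", b"s3cret")
except ConsentRequired as err:
    print(err.consent_uri)
```

Tracking the accepted policy version, rather than a simple yes/no flag, means users can be asked again when the terms change.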

Long-term:

  • Add a User Interactive Auth flow for the /register API to gather consent at register
  • As an alternative to the bot:
    • Fix user authentication in general to distinguish between ‘need to reauthorize without destroying user data’ and ‘destroy user data and login again’, so we can use the re-authorize API to gather consent via /login without destroying user data on the client.
    • port the /login API to use User Interactive Auth and also use it to gather consent for existing users when logging in

Deactivation

Account deactivation (the ability to terminate your account on your homeserver) intersects with GDPR in a number of places.

Todo list for account deactivation:

  • Remove deactivated users from all rooms – this finally solves the problem where deactivated users leave zombie users around on bridged networks.
  • Remove deactivated users from the homeserver’s user directory
  • Remove all 3PID bindings associated with a deactivated user from the identity servers
  • Improve the account deactivation UX to make sure users understand the full consequences of account deactivation

Portability

GDPR states that users have a right to extract their data in a structured, commonly used and machine-readable format.

In the medium term we would like to develop this as a core feature of Matrix (i.e. an API for exporting your logs and other data, or for that matter account portability between Matrix servers), but in the immediate term we’ll be meeting our obligations by providing a manual service.

The immediate todo list for data portability is:

  • Expose a simple interface for people to request their data
  • Implement the necessary tooling to provide full message logs (as a csv) upon request.  As a first cut this would be the result of manually running something like select * from events where user=?.
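
A first cut of that tooling might look something like the sketch below. It runs against an in-memory SQLite database with a deliberately simplified, hypothetical `events` table (the column names, including `sender`, are assumptions and do not reflect Synapse's real schema), and `export_events_csv` is an illustrative helper.

```python
import csv
import io
import sqlite3


def export_events_csv(conn: sqlite3.Connection, user_id: str) -> str:
    """Return all events sent by `user_id` as CSV text, with a header row."""
    cursor = conn.execute(
        "SELECT event_id, room_id, sender, body FROM events "
        "WHERE sender = ? ORDER BY event_id",
        (user_id,),  # Parameterised to avoid SQL injection.
    )
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return out.getvalue()


# Toy data standing in for a homeserver database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_id TEXT, room_id TEXT, sender TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("$1", "!a", "@alice:example.com", "hello"),
        ("$2", "!a", "@bob:example.com", "hi"),
        ("$3", "!b", "@alice:example.com", "bye, all"),
    ],
)
print(export_events_csv(conn, "@alice:example.com"))
```

Using the `csv` module rather than joining strings by hand keeps messages that contain commas or quotes intact.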

Other

GDPR mandates rules for all the personal data stored by a business, so there are some broader areas to bear in mind which aren’t really Matrix specific, including:

  • Making a clear statement as to how data is processed if you apply for a job
  • Ensuring you are seeking appropriate consent for cookies
  • Making sure all the appropriate documentation, processes and training materials are in place to meet GDPR obligations.

Conclusion

So, there you have it. We’ll be tracking progress in github issues and an associated dashboard over the coming weeks; for now https://github.com/matrix-org/synapse/issues/1941 (for Right to Erasure) or https://github.com/vector-im/riot-meta/issues/149 (GDPR in general) is as good a place as any to gather feedback. Alternatively, feel free to comment on the original text of this blog post: https://docs.google.com/document/d/1JTEI6RENnOlnCwcU2hwpg3P6LmTWuNS9S-ZYDdjqgzA.

It’s worth noting that we feel that GDPR is an excellent piece of legislation from the perspective of forcing us to think more seriously about our privacy – it has forced us to re-prioritise all sorts of long-term deficiencies in Matrix (e.g. dependence on DNS; improving User Interactive authentication; improving logout semantics etc).  There’s obviously a lot of work to be done here, but hopefully it should all be worth it!

 

Synapse 0.28.0 Released!

Well now, today sees the release of Synapse 0.28.0!

This release is particularly exciting as it’s a major bump mainly thanks to lots and lots of contributions from the wider community – including support for running Synapse on PyPy (thanks Valodim) and lots of progress towards official Python3 support (thanks notafile)!! However, almost all the changes are under the hood (and some are quite major), so this is more a performance, bugfix and synapse internals release than one adding many new APIs or features.

As always, you can get it from https://github.com/matrix-org/synapse/releases/tag/v0.28.0 and thanks to everyone who tested the release candidates.

Changes in synapse v0.28.0 (2018-04-26)

Bug Fixes:

  • Fix quarantine media admin API and search reindex (PR #3130)
  • Fix media admin APIs (PR #3134)

Changes in synapse v0.28.0-rc1 (2018-04-24)

Minor performance improvement to federation sending and bug fixes.

(Note: This release does not include the state resolution changes discussed in Matrix Live)

Features:

  • Add metrics for event processing lag (PR #3090)
  • Add metrics for ResponseCache (PR #3092)

Changes:

  • Synapse on PyPy (PR #2760) Thanks to @Valodim!
  • move handling of auto_join_rooms to RegisterHandler (PR #2996) Thanks to @krombel!
  • Improve handling of SRV records for federation connections (PR #3016) Thanks to @silkeh!
  • Document the behaviour of ResponseCache (PR #3059)
  • Preparation for py3 (PR #3061, #3073, #3074, #3075, #3103, #3104, #3106, #3107, #3109, #3110) Thanks to @NotAFile!
  • update prometheus dashboard to use new metric names (PR #3069) Thanks to @krombel!
  • use python3-compatible prints (PR #3074) Thanks to @NotAFile!
  • Send federation events concurrently (PR #3078)
  • Limit concurrent event sends for a room (PR #3079)
  • Improve R30 stat definition (PR #3086)
  • Send events to ASes concurrently (PR #3088)
  • Refactor ResponseCache usage (PR #3093)
  • Clarify that SRV may not point to a CNAME (PR #3100) Thanks to @silkeh!
  • Use str(e) instead of e.message (PR #3103) Thanks to @NotAFile!
  • Use six.itervalues in some places (PR #3106) Thanks to @NotAFile!
  • Refactor store.have_events (PR #3117)

Bug Fixes:

  • Return 401 for invalid access_token on logout (PR #2938) Thanks to @dklug!
  • Return a 404 rather than a 500 on rejoining empty rooms (PR #3080)
  • fix federation_domain_whitelist (PR #3099)
  • Avoid creating events with huge numbers of prev_events (PR #3113)
  • Reject events which have lots of prev_events (PR #3118)

 

Matrix and Riot confirmed as the basis for France’s Secure Instant Messenger app

Hi folks,

We’re incredibly excited that the Government of France has confirmed it is in the process of deploying a huge private federation of Matrix homeservers spanning the whole government, and developing a fork of Riot.im for use as their official secure communications client! The goal is to replace usage of WhatsApp or Telegram for official purposes.

It’s an unbelievably wonderful situation that we’re living in a world where governments genuinely care about openness, open source and open-standard based communications – and Matrix’s decentralisation and end-to-end encryption are a perfect fit for intra- and inter-governmental communication. Congratulations to France for going decentralised and supporting FOSS! We understand the whole project is going to be released entirely open source (other than the operational bits) – development is well under way and an early proof of concept is already circulating within various government entities.

I’m sure there will be more details from their side as the project progresses, but meanwhile here’s the official press release, and an English translation too. We expect this will drive a lot of effort into maturing Synapse/Dendrite, E2E encryption and matrix-{react,ios,android}-sdk, which is great news for the whole Matrix ecosystem! The deployment is going to be speaking pure Matrix and should be fully compatible with other Matrix clients and projects in addition to the official client.

So: exciting times for Matrix. Needless to say, if you work on Open Government projects in other countries, please get in touch – we’re seeing that Matrix really is a sweet spot for these sorts of use cases and we’d love to help get other deployments up and running. We’re also hoping it’s going to help iron out many of the UX kinks we have in Riot.im today as we merge stuff back. We’d like to thank DINSIC (the Department responsible for the project) for choosing Matrix, and can’t wait to see how the project progresses!

English Translation:

The French State creates its own secure instant messenger

By the summer of 2018, the French State will have its own instant messenger, an alternative to WhatsApp and Telegram.

It will guarantee secure, end-to-end encrypted conversations without degradation of the user experience. It will be compatible with any mobile device or desktop, state or personal. In fact until now the installation of applications like WhatsApp or Telegram was not possible on professional mobile phones, which hindered easy sharing of information and documents.

Led by the Interministerial Department of State Digital, Information and Communication Systems (DINSIC), the project is receiving contributions from the National Agency for Information System Security (ANSSI), the IT Directorship (DSI) of the Armed Forces and the Ministry of Europe and Foreign Affairs.

The tool developed is based on open source software (Riot) that implements an open standard (Matrix). Powered by a Franco-British startup (New Vector), and benefiting from many contributions, this communication standard has already caught the attention of other states such as the Netherlands and Canada, with whom DINSIC collaborates closely.

The Matrix standard and its open source software are also used by private companies such as Thales, which has driven the teams to come together to ensure the interoperability of their tools and cooperate in the development of free and open source software.

After 3 months of development for a very limited cost, this tool is currently being tested in the State Secretary for Digital, DINSIC and in the IT departments of different ministries. It should be rolled out during the summer in administrations and cabinets.

“With this new French solution, the state is demonstrating its ability to work in an agile manner to meet concrete needs by using open source tools and very low development costs. Sharing information in a secure way is essential not only for companies but also for a more fluid dialogue within administrations.” – Mounir Mahjoubi, Secretary of State to the Prime Minister, in charge of Digital.