2   Unpadded Base64

Unpadded Base64 refers to 'standard' Base64 encoding as defined in RFC 4648, without "=" padding. Specifically, where RFC 4648 requires that encoded data be padded to a multiple of four characters using = characters, unpadded Base64 omits this padding.

For reference, RFC 4648 uses the following alphabet for Base64:

Value Encoding  Value Encoding  Value Encoding  Value Encoding
    0 A            17 R            34 i            51 z
    1 B            18 S            35 j            52 0
    2 C            19 T            36 k            53 1
    3 D            20 U            37 l            54 2
    4 E            21 V            38 m            55 3
    5 F            22 W            39 n            56 4
    6 G            23 X            40 o            57 5
    7 H            24 Y            41 p            58 6
    8 I            25 Z            42 q            59 7
    9 J            26 a            43 r            60 8
   10 K            27 b            44 s            61 9
   11 L            28 c            45 t            62 +
   12 M            29 d            46 u            63 /
   13 N            30 e            47 v
   14 O            31 f            48 w
   15 P            32 g            49 x
   16 Q            33 h            50 y

Examples of strings encoded using unpadded Base64:

UNPADDED_BASE64("") = ""
UNPADDED_BASE64("f") = "Zg"
UNPADDED_BASE64("fo") = "Zm8"
UNPADDED_BASE64("foo") = "Zm9v"
UNPADDED_BASE64("foob") = "Zm9vYg"
UNPADDED_BASE64("fooba") = "Zm9vYmE"
UNPADDED_BASE64("foobar") = "Zm9vYmFy"

When decoding Base64, implementations SHOULD accept input with or without padding characters wherever possible, to ensure maximum interoperability.
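A minimal Python sketch of this encoding, using only the standard library base64 module. The function names encode_unpadded_base64 and decode_unpadded_base64 are illustrative and not part of the specification. The decoder restores any missing padding before decoding, so it accepts both padded and unpadded input.

```python
import base64


def encode_unpadded_base64(data: bytes) -> str:
    """Encode bytes as standard-alphabet Base64 and strip the "=" padding."""
    return base64.b64encode(data).decode("ascii").rstrip("=")


def decode_unpadded_base64(text: str) -> bytes:
    """Decode standard-alphabet Base64, with or without "=" padding."""
    stripped = text.rstrip("=")
    # Restore the padding RFC 4648 expects: pad up to a multiple of four.
    padded = stripped + "=" * (-len(stripped) % 4)
    return base64.b64decode(padded)
```

For instance, encode_unpadded_base64(b"foob") gives "Zm9vYg", and both "Zm9vYg" and "Zm9vYg==" decode back to b"foob".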

3   Signing JSON

Various points in the Matrix specification require JSON objects to be cryptographically signed. This requires us to encode the JSON as a binary string. Unfortunately the same JSON can be encoded in different ways by changing how much white space is used or by changing the order of keys within objects.

Signing an object therefore requires it to be encoded as a sequence of bytes using Canonical JSON, computing the signature for that sequence and then adding the signature to the original JSON object.

3.1   Canonical JSON

We define the canonical JSON encoding for a value to be the shortest UTF-8 JSON encoding with dictionary keys lexicographically sorted by Unicode code point. Numbers in the JSON must be integers in the range [-(2**53)+1, (2**53)-1].

We pick UTF-8 as the encoding as it should be available to all platforms and JSON received from the network is likely to be already encoded using UTF-8. We sort the keys to give a consistent ordering. We force integers to be in the range where they can be accurately represented using IEEE double precision floating point numbers since a number of JSON libraries represent all numbers using this representation.

import json

def canonical_json(value):
    return json.dumps(
        value,
        # Encode code-points outside of ASCII as UTF-8 rather than \u escapes
        ensure_ascii=False,
        # Remove unnecessary white space.
        separators=(',',':'),
        # Sort the keys of dictionaries.
        sort_keys=True,
        # Encode the resulting unicode as UTF-8 bytes.
    ).encode("UTF-8")
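Note that json.dumps does not itself enforce the restriction on numbers. The following sketch walks a value and rejects floats and out-of-range integers before it is encoded. The name check_canonical_numbers is hypothetical, and raising ValueError is a design choice of this example rather than something the specification requires.

```python
MAX_INT = 2**53 - 1
MIN_INT = -(2**53) + 1


def check_canonical_numbers(value):
    """Raise ValueError if value contains a number Canonical JSON forbids."""
    # bool is a subclass of int in Python, so handle it before the int check.
    if isinstance(value, bool) or value is None or isinstance(value, str):
        return
    if isinstance(value, int):
        if not MIN_INT <= value <= MAX_INT:
            raise ValueError("integer out of range: %d" % value)
        return
    if isinstance(value, float):
        raise ValueError("floats are not permitted: %r" % value)
    if isinstance(value, dict):
        for item in value.values():
            check_canonical_numbers(item)
        return
    if isinstance(value, (list, tuple)):
        for item in value:
            check_canonical_numbers(item)
        return
    raise ValueError("not a JSON value: %r" % (value,))
```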

3.1.1   Grammar

Adapted from the grammar in http://tools.ietf.org/html/rfc7159 by removing insignificant whitespace, fractions, exponents and redundant character escapes.

value     = false / null / true / object / array / number / string
false     = %x66.61.6c.73.65
null      = %x6e.75.6c.6c
true      = %x74.72.75.65
object    = %x7B [ member *( %x2C member ) ] %x7D
member    = string %x3A value
array     = %x5B [ value *( %x2C value ) ] %x5D
number    = [ %x2D ] int
int       = %x30 / ( %x31-39 *digit )
digit     = %x30-39
string    = %x22 *char %x22
char      = unescaped / %x5C escaped
unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
escaped   = %x22 ; "    quotation mark  U+0022
          / %x5C ; \    reverse solidus U+005C
          / %x62 ; b    backspace       U+0008
          / %x66 ; f    form feed       U+000C
          / %x6E ; n    line feed       U+000A
          / %x72 ; r    carriage return U+000D
          / %x74 ; t    tab             U+0009
          / %x75.30.30.30 (%x30-37 / %x62 / %x65-66) ; u000X
          / %x75.30.30.31 (%x30-39 / %x61-66)        ; u001X

3.2   Signing Details

JSON is signed by encoding the JSON object, without its signatures member or the keys grouped under unsigned, using the canonical encoding described above. The resulting bytes are then signed using the signature algorithm, and the signature is encoded using unpadded Base64. The Base64 signature is added to an object under the signing key identifier. That object is added to the signatures object under the name of the entity signing it, and the signatures object is added back to the original JSON object along with the unsigned object.

The signing key identifier is the concatenation of the signing algorithm and a key identifier. The signing algorithm identifies the algorithm used to sign the JSON. The currently supported value for signing algorithm is ed25519 as implemented by NaCl (http://nacl.cr.yp.to/). The key identifier is used to distinguish between different signing keys used by the same entity.
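In the examples in this section the two parts are joined with a colon, as in ed25519:1. A small sketch of splitting such an identifier back into its parts follows. The function name split_key_identifier is illustrative, and rejecting identifiers with an empty part is an assumption of this example.

```python
def split_key_identifier(key_identifier: str):
    """Split "algorithm:key_id" into its two parts."""
    algorithm, separator, key_id = key_identifier.partition(":")
    if not separator or not algorithm or not key_id:
        raise ValueError("malformed signing key identifier: %r" % key_identifier)
    return algorithm, key_id
```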

The unsigned object and the signatures object are not covered by the signature. Therefore intermediate entities can add unsigned data such as timestamps and additional signatures.

{
   "name": "example.org",
   "signing_keys": {
     "ed25519:1": "XSl0kuyvrXNj6A+7/tkrB9sxSbRi08Of5uRhxOqZtEQ"
   },
   "unsigned": {
      "age_ts": 922834800000
   },
   "signatures": {
      "example.org": {
         "ed25519:1": "s76RUgajp8w172am0zQb/iPTHsRnb4SkrzGoeCOSFfcBY2V/1c8QfrmdXHpvnc2jK5BD1WiJIxiMW95fMjK7Bw"
      }
   }
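}

# encode_canonical_json and encode_base64 are assumed to implement the
# Canonical JSON and unpadded Base64 encodings described above.
def sign_json(json_object, signing_key, signing_name):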
    signatures = json_object.pop("signatures", {})
    unsigned = json_object.pop("unsigned", None)

    signed = signing_key.sign(encode_canonical_json(json_object))
    signature_base64 = encode_base64(signed.signature)

    key_id = "%s:%s" % (signing_key.alg, signing_key.version)
    signatures.setdefault(signing_name, {})[key_id] = signature_base64

    json_object["signatures"] = signatures
    if unsigned is not None:
        json_object["unsigned"] = unsigned

    return json_object

3.3   Checking for a Signature

To check if an entity has signed a JSON object an implementation does the following:

  1. Checks if the signatures member of the object contains an entry with the name of the entity. If the entry is missing then the check fails.
  2. Removes any signing key identifiers from the entry with algorithms it doesn't understand. If there are no signing key identifiers left then the check fails.
  3. Looks up verification keys for the remaining signing key identifiers either from a local cache or by consulting a trusted key server. If it cannot find a verification key then the check fails.
  4. Decodes the base64 encoded signature bytes. If base64 decoding fails then the check fails.
  5. Removes the signatures and unsigned members of the object.
  6. Encodes the remainder of the JSON object using the Canonical JSON encoding.
  7. Checks the signature bytes against the encoded object using the verification key. If this fails then the check fails. Otherwise the check succeeds.
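The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: check_signature is a hypothetical name, verify_keys stands in for the local cache or key server lookup, and each verification key is assumed to expose a verify(message, signature) method that raises an exception on failure. The sketch treats the check as successful if any one of the entity's signatures verifies; the steps do not spell out how several signatures from one entity are combined, so that is also an assumption.

```python
import base64
import json


def canonical_json(value):
    return json.dumps(
        value, ensure_ascii=False, separators=(",", ":"), sort_keys=True
    ).encode("UTF-8")


def check_signature(json_object, entity_name, verify_keys, supported=("ed25519",)):
    """Return True if entity_name has a valid signature on json_object.

    verify_keys is assumed to map signing key identifiers to objects with a
    verify(message, signature) method that raises an exception on failure.
    """
    # Step 1: find the entity's entry in "signatures".
    entry = json_object.get("signatures", {}).get(entity_name)
    if not entry:
        return False
    # Step 2: keep only identifiers whose algorithm is understood.
    key_ids = [k for k in entry if k.split(":", 1)[0] in supported]
    # Step 3: keep only identifiers we have a verification key for.
    key_ids = [k for k in key_ids if k in verify_keys]
    if not key_ids:
        return False
    # Steps 5 and 6: drop "signatures" and "unsigned", then encode.
    remainder = {
        k: v for k, v in json_object.items() if k not in ("signatures", "unsigned")
    }
    message = canonical_json(remainder)
    for key_id in key_ids:
        try:
            # Step 4: decode the signature, accepting padded or unpadded input.
            text = entry[key_id].rstrip("=")
            signature = base64.b64decode(text + "=" * (-len(text) % 4))
            # Step 7: check the signature against the encoded object.
            verify_keys[key_id].verify(message, signature)
            return True
        except Exception:
            continue
    return False
```

Building a new dictionary for the remainder, rather than popping members as sign_json does, leaves the caller's object unmodified.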

4   Security Threat Model

4.1   Denial of Service

The attacker could attempt to prevent delivery of messages to or from the victim in order to:

  • Disrupt service or marketing campaign of a commercial competitor.
  • Censor a discussion or censor a participant in a discussion.
  • Perform general vandalism.

4.1.1   Threat: Resource Exhaustion

An attacker could cause the victim's server to exhaust a particular resource (e.g. open TCP connections, CPU, memory, disk storage).

4.1.2   Threat: Unrecoverable Consistency Violations

An attacker could send messages which created an unrecoverable "split-brain" state in the cluster such that the victim's servers could no longer derive a consistent view of the chatroom state.

4.1.3   Threat: Bad History

An attacker could convince the victim to accept invalid messages which the victim would then include in their view of the chatroom history. Other servers in the chatroom would reject the invalid messages and potentially reject the victim's messages as well since they depended on the invalid messages.

4.1.4   Threat: Block Network Traffic

An attacker could try to firewall traffic between the victim's server and some or all of the other servers in the chatroom.

4.1.5   Threat: High Volume of Messages

An attacker could send large volumes of messages to a chatroom with the victim, making the chatroom unusable.

4.1.6   Threat: Banning users without necessary authorisation

An attacker could attempt to ban a user from a chatroom without the necessary authorisation.

4.2   Spoofing

An attacker could try to send a message claiming to be from the victim without the victim having sent the message in order to:

  • Impersonate the victim while performing illicit activity.
  • Obtain privileges of the victim.

4.2.1   Threat: Altering Message Contents

An attacker could try to alter the contents of an existing message from the victim.

4.2.2   Threat: Fake Message "origin" Field

An attacker could try to send a new message purporting to be from the victim with a phony "origin" field.

4.3   Spamming

The attacker could try to send a high volume of solicited or unsolicited messages to the victim in order to:

  • Find victims for scams.
  • Market unwanted products.

4.3.1   Threat: Unsolicited Messages

An attacker could try to send messages to victims who do not wish to receive them.

4.3.2   Threat: Abusive Messages

An attacker could send abusive or threatening messages to the victim.

4.4   Spying

The attacker could try to access message contents or metadata for messages sent by the victim or to the victim that were not intended to reach the attacker in order to:

  • Gain sensitive personal or commercial information.
  • Impersonate the victim using credentials contained in the messages (e.g. password reset messages).
  • Discover who the victim was talking to and when.

4.4.1   Threat: Disclosure during Transmission

An attacker could try to expose the message contents or metadata during transmission between the servers.

4.4.2   Threat: Disclosure to Servers Outside Chatroom

An attacker could try to convince servers within a chatroom to send messages to a server it controls that was not authorised to be within the chatroom.

4.4.3   Threat: Disclosure to Servers Within Chatroom

An attacker could take control of a server within a chatroom to expose message contents or metadata for messages in that room.

5   Cryptographic Test Vectors

To assist in the development of compatible implementations, the following test values may be useful for verifying the cryptographic event signing code.

5.1   Signing Key

The following test vectors all use the 32-byte value given by the following Base64-encoded string as the seed for generating the ed25519 signing key:

SIGNING_KEY_SEED = decode_base64(
    "YJDBA9Xnr2sVqXD9Vj7XVUnmFZcZrlw8Md7kMW+3XA1"
)
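As a quick sanity check, the seed string above is 43 characters of unpadded Base64, which decodes to 32 bytes. The sketch below confirms this with the Python standard library; decode_base64 here is a local stand-in for whatever unpadded Base64 decoder an implementation uses.

```python
import base64


def decode_base64(text: str) -> bytes:
    """Decode Base64 that may be missing its "=" padding."""
    stripped = text.rstrip("=")
    return base64.b64decode(stripped + "=" * (-len(stripped) % 4))


SIGNING_KEY_SEED = decode_base64(
    "YJDBA9Xnr2sVqXD9Vj7XVUnmFZcZrlw8Md7kMW+3XA1"
)

# The ed25519 seed must be exactly 32 bytes long.
assert len(SIGNING_KEY_SEED) == 32
```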

In each case, the server name and key ID are as follows:

SERVER_NAME = "domain"

KEY_ID = "ed25519:1"

5.2   JSON Signing

Given an empty JSON object:

{}

The JSON signing algorithm should emit the following signed JSON:

{
    "signatures": {
        "domain": {
            "ed25519:1": "K8280/U9SSy9IVtjBuVeLr+HpOB4BQFWbg+UZaADMtTdGYI7Geitb76LTrr5QV/7Xg4ahLwYGYZzuHGZKM5ZAQ"
        }
    }
}

Given the following JSON object with data values in it:

{
    "one": 1,
    "two": "Two"
}

The JSON signing algorithm should emit the following signed JSON:

{
    "one": 1,
    "signatures": {
        "domain": {
            "ed25519:1": "KqmLSbO39/Bzb0QIYE82zqLwsA+PDzYIpIRA2sRQ4sL53+sN6/fpNSoqE7BP7vBZhG6kYdD13EIMJpvhJI+6Bw"
        }
    },
    "two": "Two"
}
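For both vectors above, the bytes passed to the signature algorithm are the Canonical JSON encoding of the object with signatures and unsigned removed. The sketch below reproduces those bytes using the canonical_json function from section 3.1, repeated here so that the example is self-contained; signed_bytes is an illustrative name. Verifying the signatures themselves additionally requires an ed25519 implementation, which is not shown.

```python
import json


def canonical_json(value):
    return json.dumps(
        value, ensure_ascii=False, separators=(",", ":"), sort_keys=True
    ).encode("UTF-8")


def signed_bytes(json_object):
    """Return the bytes that a signature over json_object covers."""
    remainder = {
        k: v for k, v in json_object.items() if k not in ("signatures", "unsigned")
    }
    return canonical_json(remainder)


print(signed_bytes({}))                                          # b'{}'
print(signed_bytes({"two": "Two", "one": 1, "signatures": {}}))  # b'{"one":1,"two":"Two"}'
```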

5.3   Event Signing

Given the following minimally-sized event:

{
    "event_id": "$0:domain",
    "origin": "domain",
    "origin_server_ts": 1000000,
    "signatures": {},
    "type": "X",
    "unsigned": {
        "age_ts": 1000000
    }
}

The event signing algorithm should emit the following signed event:

{
    "event_id": "$0:domain",
    "hashes": {
        "sha256": "6tJjLpXtggfke8UxFhAKg82QVkJzvKOVOOSjUDK4ZSI"
    },
    "origin": "domain",
    "origin_server_ts": 1000000,
    "signatures": {
        "domain": {
            "ed25519:1": "2Wptgo4CwmLo/Y8B8qinxApKaCkBG2fjTWB7AbP5Uy+aIbygsSdLOFzvdDjww8zUVKCmI02eP9xtyJxc/cLiBA"
        }
    },
    "type": "X",
    "unsigned": {
        "age_ts": 1000000
    }
}

Given the following event containing redactable content:

{
    "content": {
        "body": "Here is the message content"
    },
    "event_id": "$0:domain",
    "origin": "domain",
    "origin_server_ts": 1000000,
    "type": "m.room.message",
    "room_id": "!r:domain",
    "sender": "@u:domain",
    "signatures": {},
    "unsigned": {
        "age_ts": 1000000
    }
}

The event signing algorithm should emit the following signed event:

{
    "content": {
        "body": "Here is the message content"
    },
    "event_id": "$0:domain",
    "hashes": {
        "sha256": "onLKD1bGljeBWQhWZ1kaP9SorVmRQNdN5aM2JYU2n/g"
    },
    "origin": "domain",
    "origin_server_ts": 1000000,
    "type": "m.room.message",
    "room_id": "!r:domain",
    "sender": "@u:domain",
    "signatures": {
        "domain": {
            "ed25519:1": "Wm+VzmOUOz08Ds+0NTWb1d4CZrVsJSikkeRxh6aCcUwu6pNC78FunoD7KNWzqFn241eYHYMGCA5McEiVPdhzBA"
        }
    },
    "unsigned": {
        "age_ts": 1000000
    }
}