Older documentation
This documentation hasn't been updated in a while. Some information might no longer be valid.
There may be more up to date information in the new documentation section.
Creating a simple read-only Matrix client
I created matrix-enact as a fun way to render Matrix rooms - it essentially "performs" the room history by progressively speaking each message event in chronological order. In this way, matrix-enact is effectively a simple, read-only Matrix client. Let's see how it was built.
This article will introduce two important concepts in Matrix, specifically in the Matrix Client-Server API:
- guest access
- the
/context
endpoint, which gets messages before and after a given event
Not using the matrix-js-sdk
Although written in JavaScript (and Reactjs), this project does not use the matrix-js-sdk, it makes direct HTTP calls to the Matrix Client-Server API. Because there are only three endpoints we need to hit, we can keep the project very light by not including an SDK.
Get Guest access_token
Matrix allows for guest access
by
providing an interface to register a new guest user and be immediately given an
access token. To do this we call the /register
endpoint with a query param
kind
set to guest
. In matrix-enact, this looks like:
import axios from 'axios';
var url = "https://matrix.org/_matrix/client/r0/register?kind=guest";
const res = await axios.post(url, {});
const { data } = await res;
// data.access_token will contain the access token, we must store it
Once we have the access token, we use it in the same way as if logged in with a normal user.
Translate a Room Alias to a Room ID
In the UI, the user can enter either a room alias or a room ID. Whichever they enter, to get message content from a room we need the ID. This means we need to detect if an alias has been entered, and if so get the correct room ID for that alias:
// we know that if the first character is a '#', we have an alias not an id
if (this.state.roomEntry[0] === "#") {
var getIdUrl = "https://matrix.org/_matrix/client/r0/directory/room/";
getIdUrl += encodeURIComponent(this.state.roomEntry);
const res = await axios.get(getIdUrl);
const { data } = await res;
// data.room_id contains the room id for the alias
}
/context
endpoint
We use the /context
endpoint to get chronological history of a room timeline.
Looking at this section of the Client-Server API we see:
This API returns a number of events that happened just before and after the specified event. This allows clients to get the context surrounding an event.
To get messages from this endpoint we need to provide a room id and the event id we want context for. Check out the comments in the code below to follow along.
async loadScriptFromEventId(startEventId) {
// first we construct the url as per the CS API
const url = `https://matrix.org/_matrix/client/r0/rooms/${encodeURIComponent(roomId)}/context/${encodeURIComponent(startEventId)}?limit=100&access_token=${this.state.accessToken}`;
axios.get(url).then(res => {
// make an array to store the events from the response
var newEvents = [];
// we only want the events that follow our start events
newEvents = newEvents.concat(res.data.events_after);
// and we only want events that contain a body field, i.e. that are messages
newEvents = newEvents.filter(e => e.content.body);
// finally, since we're using React for this app,
// we store these messages in the state object
this.setState({events: this.state.events.concat(newEvents)});
});
}
Loop until we have enough messages
Notice the previous URL we hit when calling /context
. We specified a limit
value of 100
. In fact, 100
is usually the limit enforced by the homeserver.
This limit refers to the number of events, not the number of messages -
remember that we are filtering them in the code above.
If we say that we want our script to be 50 lines long, but after filtering we
are left with only 30 messages, what should we do? Get more events after the
latest one, and append the new events to our script. Knowing that we have taken
a value from the form to be stored in state.messageCount
, and in the previous
section we inserted message events into state.events
, we can compare these
two variables, and if needed, call loadScriptFromEventId()
again with the
last known event.
if (this.state.messageCount > this.state.events.length) {
// get last known event
var lastEvent = res.data.events_after[res.data.events_after.length - 1];
this.loadScriptFromEventId(lastEvent.event_id);
} else {
this.setState({events: this.state.events.slice(0, this.state.messageCount), statusMessage: "Done"});
}
Using the Web Audio API
The Web Audio API is a massive topic, out of the scope of this article. We'll cover just enough to be able to show the "happy path" of performing Text-to-Speech (TTS) sequentially.
To deliver a line as audio, the fundamental code is as follows:
var utterance = new SpeechSynthesisUtterance();
utterance.text = "some string";
var someVoice = window.speechSynthesis.getVoices()[0];
utterance.voice = someVoice;
window.speechSynthesis.speak(utterance);
To find out when an utterance ends, attach a function to the onend event:
utterance.onend = function() {
// do something when the line ends
};
Knowing that we can perform TTS on strings we provide, and that we can call a function when a line ends, from here it's easy to see how we can use the list of messages to "enact" the message history.
Using the Web Audio API with React
We will:
- assign each user a random voice from TTS voices available in the current browser
- trigger each line sequentially and with the correct voice, thus giving the impression of a script being performed
Let's create a nextLine()
function in our App
component, and use this to
insert lines associated with "Parts", meaning that each part is a separate user
with an assigned voice.
nextLine() {
var line = this.state.line;
if (! this.state.events[line]) return;
var newPart = this.state.events[line].sender;
if (! this.state.parts.find(p =>{return p.name === newPart;})) {
this.setState({
parts: this.state.parts.concat([{
name: newPart,
voice: voices[getRandomInt(0, voices.length)]
}])
})
}
this.setState({
script: this.state.script.concat(this.state.events[line]),
line: this.state.line + 1,
nextText: "Continue"
});
}
By incrementing the line
counter, we progress through the script, adding a
line at a time to the correct Part
.
During rendering, the App renders an array of Part
Components, which in turn
render an array of lines, filtered for that particular Part:
const lines = this.props.script.map((line, lineNumber) => {
line.lineNumber = lineNumber;
return line;
}).filter(l => l.sender === part.name);
Knowing that in React, the constructor
for a Component is called only once, we
perform the TTS process itself inside the constructor method:
class Line extends Component {
constructor(props) {
super(props);
var utterance = new SpeechSynthesisUtterance();
var nextLine = this.props.nextLine;
utterance.text = this.props.lineText;
utterance.voice = this.props.part.voice;
synth.speak(utterance);
}
}
Finally, we'll use what we already learned about the onend
event to insert the
next line:
class Line extends Component {
constructor(props) {
super(props);
var utterance = new SpeechSynthesisUtterance();
var nextLine = this.props.nextLine;
utterance.onend = function(a) {
nextLine();
};
utterance.text = this.props.lineText;
utterance.voice = this.props.part.voice;
synth.speak(utterance);
}
}
In this way, nextLine() is called in a loop, meaning that the lines are added to React sequentially, and spoken aloud as they are added.
Conclusion
This article covered a lot of ground:
- Matrix Guess access
- the
/context/
API endpoint - filtering content from Matrix events
- passing these strings to the Web Audio API
To learn more about Matrix development, check out the Matrix Documentation.