MIDI over Audio


as_MoA is about transporting MIDI data over audio.

My first attempt was sending MIDI out via analog audio. This kinda worked, but it was very fragile and difficult.

My second attempt was sending MIDI out via SPDIF. This worked solidly, and is in theory capable of transmitting a lot of data besides a single MIDI stream.

The downside is that the only way to get this to work in a DAW is via a special plugin, one which appears as an instrument (with MIDI input and audio output). But most DAWs don’t really let you send the same kind of data to an "instrument" and to the "MIDI Interface". For example, you’d often be able to send MIDI Sync (Clock) out of the MIDI interface, but can you send it to a plugin?

MIDI Sync is just one small problem, and it can even be sorta fixed: the special plugin could generate its own tempo clock (like many plugins with built-in sequencers already do) and send that out.
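To make the "plugin generates its own clock" idea a bit more concrete, here is a rough sketch of where the clock bytes would land in the audio stream. The only facts used are standard MIDI ones: the Timing Clock byte is 0xF8 and there are 24 of them per quarter note. The function name and parameters are made up for illustration, and a real plugin would take the tempo from the host rather than from an argument.

```python
MIDI_CLOCK = 0xF8          # MIDI real-time "Timing Clock" status byte
CLOCKS_PER_QUARTER = 24    # MIDI clock resolution per quarter note


def clock_sample_positions(bpm, fs, n_ticks):
    """Sample indices at which a clock byte would be placed.

    Hypothetical helper: bpm is the tempo, fs the sampling rate in Hz.
    Each position is rounded from the ideal (fractional) time, so the
    rounding error never accumulates from tick to tick.
    """
    samples_per_tick = fs * 60.0 / (bpm * CLOCKS_PER_QUARTER)
    return [int(i * samples_per_tick + 0.5) for i in range(n_ticks)]


positions = clock_sample_positions(bpm=120, fs=44100, n_ticks=5)
print(positions)
```

At 120 BPM and Fs=44100Hz a clock tick comes every 918.75 samples, so the timing jitter from snapping to whole samples stays below one sample period.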

The bigger problem is going in the opposite direction: MIDI in to the DAW. This would require a different "special" plugin, one which takes audio input and emits MIDI. That is probably much more exotic for some DAWs.


The idea with as_MoA2 was to just take a MIDI byte (8 bits) and encode it as amplitude, perhaps together with a few other bits for other things. Thus a single MIDI byte can be stored in a single audio sample value on one channel.
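A minimal sketch of what such an encoding could look like. The bit layout here is purely an assumption for illustration (a 16-bit sample word, the low 8 bits holding the MIDI byte, bit 8 acting as a "this sample carries data" flag); it is not the layout as_MoA2 actually uses, and the names are hypothetical.

```python
# Hypothetical 16-bit sample layout (NOT the actual as_MoA2 format):
#   bit 8     : flag, set when the sample carries a MIDI byte
#   bits 0..7 : the MIDI byte itself
#   the remaining bits stay free for "other things"
MIDI_VALID_FLAG = 0x0100
MIDI_DATA_MASK = 0x00FF
EMPTY_SAMPLE = 0x0000


def encode_midi_byte(byte):
    """Pack one MIDI byte into one audio sample word."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("MIDI byte must be in 0..255")
    return MIDI_VALID_FLAG | byte


def decode_sample(sample):
    """Return the MIDI byte carried by a sample, or None if it carries none."""
    if sample & MIDI_VALID_FLAG:
        return sample & MIDI_DATA_MASK
    return None


note_on = encode_midi_byte(0x90)
print(hex(note_on), decode_sample(note_on), decode_sample(EMPTY_SAMPLE))
```

Some kind of flag is needed because 0x00 is a perfectly valid MIDI data byte, so an all-zero sample can't double as "nothing here". A scheme like this also only survives a bit-exact path: any gain change, dither or resampling between the plugin and the output would corrupt the data.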

Can the audio handle it?

Maths time: the MIDI bitrate is 31250 bits per second.
A MIDI byte takes 10 bit periods at that rate (a start bit, 8 data bits and a stop bit).
The maximum data rate is thus 3125 bytes per second.

At a sampling rate of 44100Hz, if the MIDI stream was fully busy, the MIDI bytes would be coming every 14.112 audio samples. If using the above encoding scheme, that means roughly one audio sample out of every 14 audio samples would be used to encode+store the MIDI byte, while the other 13 audio samples are left unused. Note that this is on one channel.

In theory, at Fs=44100Hz and stereo audio, up to 28 full MIDI output streams can be transmitted this way (14 per channel, times 2 channels). At higher sampling rates things get better.
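The arithmetic above is easy to check for other sampling rates. This is just a small sketch of the same calculation; the function names are mine, and it assumes every MIDI stream is fully busy and that one sample carries exactly one byte.

```python
MIDI_BIT_RATE = 31250      # bits per second
BITS_PER_MIDI_BYTE = 10    # start bit + 8 data bits + stop bit


def midi_bytes_per_second():
    return MIDI_BIT_RATE / BITS_PER_MIDI_BYTE


def samples_per_midi_byte(fs):
    """Audio samples that pass (per channel) while one MIDI byte is sent."""
    return fs / midi_bytes_per_second()


def max_midi_streams(fs, channels=2):
    """Whole fully-busy MIDI streams that fit, one byte per sample."""
    per_channel = (fs * BITS_PER_MIDI_BYTE) // MIDI_BIT_RATE
    return per_channel * channels


for fs in (44100, 48000, 96000):
    print(fs, round(samples_per_midi_byte(fs), 3), max_midi_streams(fs))
```

With stereo audio this gives 28 streams at 44100Hz, 30 at 48000Hz and 60 at 96000Hz.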


A special hardware box is gonna be needed to take the audio from above and "decode" the MIDI data, or whatever data was actually encoded, and turn it into usable outputs. In my attempt, I used an 8-bit XMEGA microcontroller and an SPDIF interface chip.


The XMEGA is too slow and doesn’t have a peripheral to handle the fast audio clock, especially since I wanted this thing to also work at least up to Fs=96kHz. So I used some good old shift registers to convert the fast serial audio into parallel data. The XMEGA then gets interrupted only once per audio sample, and it reads the data as parallel pin values. It immediately figures out whether this audio sample contains anything, and if it contains a MIDI byte, it tosses it to a UART and you get a traditional MIDI output.
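The per-sample logic is simple enough to sketch in a few lines. This is a simulation of the idea rather than the actual firmware: the flag bit and word layout are the same kind of hypothetical assumption as before, FakeUart is a stand-in for the hardware UART, and on_sample plays the role of the once-per-sample interrupt handler.

```python
MIDI_VALID_FLAG = 0x0100   # hypothetical "sample carries a MIDI byte" flag
MIDI_DATA_MASK = 0x00FF    # hypothetical position of the MIDI byte


class FakeUart:
    """Stand-in for the hardware UART that drives the MIDI output."""

    def __init__(self):
        self.sent = []

    def write_byte(self, byte):
        self.sent.append(byte)


def on_sample(word, uart):
    """What the once-per-sample interrupt would do with the parallel data."""
    if word & MIDI_VALID_FLAG:
        uart.write_byte(word & MIDI_DATA_MASK)


# A Note On message (0x90 0x3C 0x64) spread over a mostly empty buffer,
# spaced 15 samples apart so the UART is never asked to exceed MIDI speed.
samples = [0] * 45
for index, byte in zip((0, 15, 30), (0x90, 0x3C, 0x64)):
    samples[index] = MIDI_VALID_FLAG | byte

uart = FakeUart()
for word in samples:
    on_sample(word, uart)

print([hex(b) for b in uart.sent])
```

The handler does almost nothing for the empty samples, which is what makes a once-per-sample interrupt workable on a slow 8-bit part.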