theallelectricsmartgrid

Deep Vocoder

The Deep Vocoder (private/src/DeepVocoder.hpp) is an experimental spectral processing system. The name “Deep” is a reference to the Deep Listening practices of Pauline Oliveros.

A traditional vocoder imposes the per-band amplitude envelopes of an analysis signal (the modulator) onto a bank of fixed bandpass filters acting on a synthesis signal (the carrier). The Deep Vocoder takes a different approach: it analyzes an incoming audio signal using a Spectral Modeling Synthesis (SMS) approach and uses the extracted partials to “quantize” the fundamental pitches generated by the Nonagon sequencer.

Spectral Analysis

The Deep Vocoder maintains a rolling buffer of incoming audio. Every hop (x_H samples), it performs a spectral analysis (m_spectralModel.ExtractAtoms).
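
The hop-driven schedule described above can be pictured with a small ring buffer. This is only a sketch under stated assumptions: RollingBuffer, Push, and CopyWindow are hypothetical names, the sizes are template parameters chosen for illustration, and the real buffer in DeepVocoder.hpp may be organized differently.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the rolling input buffer (not the project's type).
// Size is the analysis window length, Hop plays the role of x_H.
template <std::size_t Size, std::size_t Hop>
struct RollingBuffer
{
    std::array<float, Size> m_samples{};
    std::size_t m_writeIndex = 0;
    std::size_t m_sinceLastHop = 0;

    // Stores one sample. Returns true when a full hop has accumulated,
    // which is the moment an analysis pass would be due.
    bool Push(float sample)
    {
        m_samples[m_writeIndex] = sample;
        m_writeIndex = (m_writeIndex + 1) % Size;
        ++m_sinceLastHop;
        if (m_sinceLastHop == Hop)
        {
            m_sinceLastHop = 0;
            return true;
        }
        return false;
    }

    // Copies the most recent Size samples into out, oldest first.
    void CopyWindow(std::array<float, Size>& out) const
    {
        for (std::size_t i = 0; i < Size; ++i)
        {
            out[i] = m_samples[(m_writeIndex + i) % Size];
        }
    }
};
```

In this sketch the caller would push every incoming sample and, whenever Push returns true, copy the window out and hand it to the analysis step.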

It extracts a set of “atoms” (spectral partials/peaks) representing the prominent frequencies and magnitudes present in the input signal at that moment.
There is no attempt to perform traditional pitch tracking or identify a single fundamental frequency. Instead, any sufficiently prominent partial is considered a valid target.
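
To illustrate how partials can be collected without any fundamental tracking, here is a deliberately simplified peak picker. It is not the SMS analysis performed by ExtractAtoms; the Atom struct, the PickPeaks function, and the magnitudeFloor parameter are assumptions made for this example only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical atom representation: one spectral peak.
struct Atom
{
    float m_frequency; // Hz
    float m_magnitude; // linear magnitude
};

// Collects every local maximum above magnitudeFloor in a magnitude spectrum.
// binHz is the spacing between bins (sample rate divided by FFT size).
// No peak is singled out as "the" fundamental; all of them are kept.
std::vector<Atom> PickPeaks(const std::vector<float>& magnitudes,
                            float binHz,
                            float magnitudeFloor)
{
    std::vector<Atom> atoms;
    for (std::size_t k = 1; k + 1 < magnitudes.size(); ++k)
    {
        bool isPeak = magnitudes[k] > magnitudes[k - 1]
                   && magnitudes[k] >= magnitudes[k + 1];
        if (isPeak && magnitudes[k] > magnitudeFloor)
        {
            atoms.push_back({static_cast<float>(k) * binHz, magnitudes[k]});
        }
    }
    return atoms;
}
```

A production analyzer would normally refine each peak's frequency beyond the bin grid (for example by interpolation); that step is omitted here.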

V-Shaped Ranking and Quantization

When a voice in the synthesizer is triggered by the Nonagon, its intended fundamental pitch (m_pitchCenter) is passed to the Deep Vocoder (TransformNote).

The vocoder then searches the currently active spectral atoms to find the “best” match. It does this using a V-shaped thresholding function (MagnitudeThreshold):

For a given target frequency and a candidate atom frequency, it calculates an amplitude threshold.
The threshold is lowest exactly at the target frequency (determined by m_gainThreshold).
As the candidate frequency moves further away from the target frequency (either up or down), the required threshold increases exponentially.
The steepness of this V-shape is controlled independently for frequencies above the target (m_slopeUp) and below the target (m_slopeDown).
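
One possible shape for such a threshold is sketched below. The text does not give the formula used by MagnitudeThreshold, so two things here are assumptions: distance is measured in octaves, and the exponential growth uses base 2. ThresholdParams and MagnitudeThresholdSketch are hypothetical names; only m_gainThreshold, m_slopeUp, and m_slopeDown come from the description above.

```cpp
#include <cmath>

// Hypothetical parameter bundle mirroring the members named in the text.
struct ThresholdParams
{
    float m_gainThreshold; // threshold exactly at the target frequency
    float m_slopeUp;       // steepness for candidates above the target
    float m_slopeDown;     // steepness for candidates below the target
};

// Assumed formula: threshold = gain * 2^(slope * |distance in octaves|).
// Both frequencies must be positive.
float MagnitudeThresholdSketch(const ThresholdParams& p,
                               float targetHz,
                               float candidateHz)
{
    float octaves = std::log2(candidateHz / targetHz);
    float slope = octaves >= 0.0f ? p.m_slopeUp : p.m_slopeDown;
    return p.m_gainThreshold * std::exp2(slope * std::fabs(octaves));
}
```

With m_gainThreshold = 0.1, m_slopeUp = 2, and m_slopeDown = 1, this sketch requires a magnitude of 0.1 at the target, 0.4 one octave above it, and 0.2 one octave below it, which gives the asymmetric V.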

The vocoder evaluates all current atoms:

If an atom’s magnitude is below the threshold for its distance from the target frequency, it is ignored.
For each atom that clears its threshold, it calculates a score: magnitude / threshold. The score measures how far the atom exceeds the level required at its frequency.
The atom with the highest score wins.
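
The evaluation loop above can be sketched as follows. FindBestAtom and the Atom struct are hypothetical, the threshold is passed in as a callable so the example stays independent of any particular V-shape formula, and treating a magnitude exactly equal to the threshold as passing is an assumption (the text only says that atoms below it are ignored).

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical atom representation: one spectral peak.
struct Atom
{
    float m_frequency; // Hz
    float m_magnitude; // linear magnitude
};

// Returns the index of the highest-scoring atom, or -1 when no atom
// meets its threshold. thresholdAt maps a candidate frequency to the
// magnitude required at that frequency and must return a positive value.
int FindBestAtom(const std::vector<Atom>& atoms,
                 const std::function<float(float)>& thresholdAt)
{
    int best = -1;
    float bestScore = 0.0f;
    for (std::size_t i = 0; i < atoms.size(); ++i)
    {
        float threshold = thresholdAt(atoms[i].m_frequency);
        if (atoms[i].m_magnitude < threshold)
        {
            continue; // below the V: ignored
        }
        float score = atoms[i].m_magnitude / threshold;
        if (score > bestScore)
        {
            bestScore = score;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

Because the score is a ratio, a quiet partial close to the target can beat a loud partial far away from it, as long as both clear their thresholds.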

Trigger Cancellation

If no spectral atom meets the threshold criteria (e.g., if the input signal is silent, or if there are no partials remotely near the target pitch), the Deep Vocoder cancels the trigger (ahdControl->m_trig = false). The voice will not sound.

If a winning atom is found, the voice’s pitch is snapped (quantized) to the exact frequency of that partial, and the trigger proceeds normally.
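
The two outcomes can be summarized in a few lines. This is a sketch, not the body of TransformNote: VoiceControl and ApplyQuantization are hypothetical, m_trig stands in for ahdControl->m_trig, and using a negative frequency to signal "no winner" is a convention invented for this example.

```cpp
// Hypothetical per-voice state; m_trig stands in for ahdControl->m_trig.
struct VoiceControl
{
    bool m_trig;       // true when the sequencer has triggered this voice
    float m_frequency; // pitch the voice will play, in Hz
};

// winnerHz is the winning atom's frequency, or a negative value when
// no atom met the threshold (a convention assumed for this sketch).
void ApplyQuantization(VoiceControl& voice, float winnerHz)
{
    if (!voice.m_trig)
    {
        return; // nothing to transform
    }
    if (winnerHz < 0.0f)
    {
        voice.m_trig = false; // cancel: the voice will not sound
        return;
    }
    voice.m_frequency = winnerHz; // snap to the partial's exact frequency
}
```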

Together, quantization and trigger cancellation let an external audio source act as a dynamic, harmonically rich, real-time quantizer and gate for the sequenced melodies.
