Jingles

Jingles are short audio clips that play over regular audio with automatic volume ducking.

How It Works

When a jingle plays:

The target player or zone volume is reduced (ducked)
The jingle audio plays through the target
When the jingle ends, the original volume fades back up

Fade-in and fade-out transitions are configurable.

Ducking lowers each source relative to its current volume, so a source that is already muted stays silent for the duration of the jingle — playing a jingle never makes a muted player audible. Sources at full volume duck to the configured duck level as usual.
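The ducking and fade behaviour described above can be sketched roughly as follows. This is an illustration only, not MZAP's implementation: it assumes volumes and the duck level are fractions between 0.0 and 1.0, that ducking is a simple multiplication, and that the fade is a linear ramp. The names `ducked_volume` and `fade_steps` are hypothetical.

```python
from typing import List


def ducked_volume(current: float, duck_level: float) -> float:
    """Scale the current volume by the duck level (assumed model).

    A muted source (0.0) stays at 0.0; a source at full volume (1.0)
    ends up exactly at the configured duck level.
    """
    for name, value in (("current", current), ("duck_level", duck_level)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    return current * duck_level


def fade_steps(start: float, end: float, steps: int) -> List[float]:
    """Return a linear ramp of `steps` volume values ending at `end`."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [start + (end - start) * i / steps for i in range(1, steps + 1)]


# Duck a player at 80% volume, then fade it back up after the jingle.
original = 0.8
ducked = ducked_volume(original, 0.25)
ramp_back = fade_steps(ducked, original, steps=4)
```

Under this model a muted player never becomes audible, because multiplying zero by any duck level still gives zero.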

Targets

Jingles can target:

Specific players — Duck and overlay on selected players
Specific zones — Duck and overlay on selected zones

Text-to-speech (TTS) jingles

Instead of supplying a pre-recorded audio file, you can create a jingle by typing text. MZAP synthesizes the speech to an audio file, and from that point on the TTS jingle behaves exactly like a file-based jingle — it plays through the same ducking pipeline, can be scheduled by a JinglePlay action, and appears in the jingle list (with a speech-bubble icon and the spoken text in its subtitle).

Creating a TTS jingle

On the Jingles tab, click TTS.
In the modal, set a name, pick a provider and voice, type the text, and adjust the speaking-rate slider.
Click Preview to hear the voice before saving — the preview audio is never stored on the jingle.
Save. The synthesized audio is generated and the jingle is created.

Editing a TTS jingle’s text, voice, or rate re-synthesizes the audio automatically. Cosmetic edits (volume, duck level, fade) skip re-synthesis.
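One way to picture the distinction between speech edits and cosmetic edits is a comparison over only the fields that affect the synthesized audio. The sketch below is hypothetical: the `TtsJingle` record, its field names, and `needs_resynthesis` are illustrative assumptions, not MZAP's actual data model.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TtsJingle:
    # Hypothetical record; field names are assumptions for illustration.
    name: str
    text: str
    voice: str
    rate: float
    volume: float
    duck_level: float
    fade_ms: int


# Fields whose change alters the spoken audio itself.
SPEECH_FIELDS = ("text", "voice", "rate")


def needs_resynthesis(before: TtsJingle, after: TtsJingle) -> bool:
    """True if any speech-affecting field differs between the two versions."""
    return any(getattr(before, f) != getattr(after, f) for f in SPEECH_FIELDS)


jingle = TtsJingle("Closing", "We close in ten minutes.", "voice-a", 1.0, 0.9, 0.3, 500)
cosmetic_edit = replace(jingle, volume=0.7, fade_ms=800)
speech_edit = replace(jingle, text="We close in five minutes.")
```

Here `needs_resynthesis(jingle, cosmetic_edit)` is false and `needs_resynthesis(jingle, speech_edit)` is true, which mirrors the rule stated above.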

Text is capped at a 5000-character soft limit (about six minutes of speech) on every synthesis entry point.
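As a rough illustration of the limit, the check below compares the text length against 5000 characters and estimates the spoken duration from the ratio given above (5000 characters to about six minutes). The function name and the linear estimate are assumptions; real speech duration depends on the voice and the speaking rate.

```python
from typing import Tuple

MAX_TTS_CHARS = 5000               # limit stated in this section
APPROX_SECONDS_AT_LIMIT = 6 * 60   # "about six minutes" at the limit


def check_tts_text(text: str) -> Tuple[bool, float]:
    """Return (within_limit, estimated_seconds) for a piece of TTS text.

    The duration is a crude linear estimate, not a measurement.
    """
    within_limit = len(text) <= MAX_TTS_CHARS
    estimated_seconds = len(text) * APPROX_SECONDS_AT_LIMIT / MAX_TTS_CHARS
    return within_limit, estimated_seconds
```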

Providers

Provider	Type	Setup	License
Windows	Offline	None — uses the voices installed on the host machine	Included
Azure Speech	Cloud	Your own Azure key + region	Cloud TTS feature
ElevenLabs	Cloud	Your own ElevenLabs key (no region)	Cloud TTS feature

The Windows provider works offline at no extra cost, using the voices installed via Windows Settings → Time & language → Speech. The two cloud providers, Azure Speech (~140 neural voices across 90+ languages) and ElevenLabs (premium multilingual voices), require the Cloud TTS license feature and an API key you supply yourself. See Settings — Text-to-speech for the step-by-step key setup.

Cloud-provider voices are either multilingual or grouped by language. Failed synthesis (invalid key, region typo, quota error) is reported inline in the modal, and no jingle is created.
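The "no jingle on failure" behaviour amounts to synthesizing first and saving only on success. The sketch below shows that ordering under stated assumptions: `SynthesisError`, `create_tts_jingle`, and the list used as a jingle store are all hypothetical stand-ins, and the `synthesize` callable represents whichever provider is selected.

```python
from typing import Callable, List, Optional


class SynthesisError(Exception):
    """Hypothetical error raised when a provider cannot produce audio."""


def create_tts_jingle(
    store: List[dict],
    name: str,
    text: str,
    synthesize: Callable[[str], bytes],
) -> Optional[str]:
    """Synthesize first, store second.

    Returns None on success, or an error message to show inline.
    Nothing is added to the store when synthesis fails.
    """
    try:
        audio = synthesize(text)
    except SynthesisError as exc:
        return str(exc)
    store.append({"name": name, "text": text, "audio": audio})
    return None


def failing_provider(text: str) -> bytes:
    # Stand-in for a cloud call rejected because of a bad API key.
    raise SynthesisError("invalid key")


jingles: List[dict] = []
error = create_tts_jingle(jingles, "Welcome", "Welcome to the store.", failing_provider)
```

After this call `error` holds the message and `jingles` is still empty, so a failed attempt leaves nothing behind to clean up.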

Use Cases

Store announcements over background music
Time signals or alerts
Scheduled welcome messages
DJ drops and station IDs
Spoken closing-time or safety announcements via TTS