The quality of your Voice Clone depends almost entirely on the quality of the audio you submit. Deliah’s AI can only work with what you give it: clean, natural recordings produce a clone that sounds genuinely like you, while recordings with noise, distortion, or an unnatural delivery produce one that falls flat. This guide walks you through everything you need to record well.
What you need
You do not need professional studio equipment. A modern smartphone with a decent built-in microphone is sufficient if your recording environment is clean. That said, if you have access to a dedicated microphone, even an entry-level USB condenser mic, use it. Better equipment gives you more headroom. What matters most is not the hardware. It is the environment and your delivery.
The recording process
Choose a quiet room
Find a space where external sounds cannot reach the microphone. A bedroom with soft furnishings works well, because fabric absorbs echo and dampens ambient noise. Avoid kitchens, rooms with hard floors, or any space near a road, air conditioning unit, or appliance. Close windows and doors. Turn off fans, air conditioning, and any device that makes a continuous sound.
Background sources to eliminate before you start:
- Music or TV playing anywhere in the space
- Air conditioning or heating units
- Street noise or open windows
- Washing machines, dishwashers, or other appliances running nearby
- Other people talking
Set up your device
Place your phone or microphone at a consistent distance, roughly 15–30 cm (6–12 inches) from your mouth. Prop it on a stable surface rather than holding it in your hand, because movement and handling noise will be picked up. If you are using a phone, open your preferred voice memo or recording app and record a 5-second test clip. Play it back and listen critically for hum, echo, or background sounds before you begin the full recording.
Record your audio
Aim for at least 30 seconds of clean audio. For the best results, record 1–3 minutes. The more material you submit, the more your clone can learn: more natural pauses, more variation in pace and emphasis, a richer emotional range.
You can talk about anything: what you are doing today, something you are excited about, a story from your week. The content is secondary to the delivery. What matters is that your voice sounds like you at your most natural and comfortable.
Listen back before submitting
Before you upload, play the recording back in full. Listen for:
- Audible background noise at any point
- Muffling or distortion (usually caused by holding the mic too close or moving it)
- Long silences where nothing is happening
- Other voices or sounds overlapping with yours
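Part of the checklist above can be pre-screened automatically before you listen. The following is a minimal sketch, assuming the recording has been exported as a 16-bit PCM WAV file; the function name `inspect_wav` and the thresholds `quiet_peak` and `window_s` are illustrative choices, not values published by Deliah. It reports the duration, the share of samples at full scale (a common sign of distortion), and the longest quiet stretch. It cannot tell your voice apart from background noise or other speakers, so it supplements listening rather than replacing it.

```python
import array
import sys
import wave


def inspect_wav(path, quiet_peak=0.02, window_s=0.1):
    """Summarise a 16-bit PCM WAV file: duration, clipping, longest quiet stretch.

    quiet_peak and window_s are illustrative defaults, not Deliah thresholds.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())

    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    mono = samples[::channels]  # keep the first channel only

    if not mono:
        return {"duration_s": 0.0, "clipped_ratio": 0.0, "longest_quiet_s": 0.0}

    # Samples at full scale usually indicate distortion from clipping.
    clipped = sum(1 for s in mono if abs(s) >= 32767)

    # Walk fixed-size windows and track the longest run of quiet ones.
    # A partial final window is counted as a full one, so the result is approximate.
    window = max(1, int(rate * window_s))
    longest = current = 0
    for start in range(0, len(mono), window):
        peak = max(abs(s) for s in mono[start:start + window]) / 32768
        current = current + 1 if peak < quiet_peak else 0
        longest = max(longest, current)

    return {
        "duration_s": len(mono) / rate,
        "clipped_ratio": clipped / len(mono),
        "longest_quiet_s": longest * window / rate,
    }
```

Calling `inspect_wav("take1.wav")` on a hypothetical file returns a dictionary with the three measurements. A non-zero `clipped_ratio` or a `longest_quiet_s` of several seconds suggests the take deserves a closer listen or a re-record.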
Submit your recording
Upload your audio through the Deliah platform. You can submit multiple files. In fact, recording in several shorter sessions and submitting them all is a great approach if you find it hard to speak naturally for a full minute at once.
See the audio variations guide for how to record the three emotional variation types (Normal, Whisper, and Ecstasy) that give your clone its full range.
Minimum and recommended lengths
| Audio length | What it produces |
|---|---|
| Under 30 seconds | Not accepted — insufficient data for a usable clone |
| 30 seconds – 1 minute | Minimum viable clone; limited range and naturalness |
| 1–3 minutes | Recommended — produces a natural, versatile clone |
| 3+ minutes across multiple variations | Best possible results; maximum authenticity and emotional range |
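The tiers in the table above can be expressed as a small lookup, which may be handy if you are totalling the length of several files before uploading. This is a sketch only: the function name `classify_duration` and the returned labels are made up for illustration, and the boundary handling is an assumption, since the table does not say which tier exactly 1 minute or exactly 3 minutes falls into, or how 3+ minutes of a single variation is treated.

```python
def classify_duration(seconds, multiple_variations=False):
    """Map total clean audio length (in seconds) to a tier from the table.

    Assumptions not stated in the table: exactly 60 s counts as
    "recommended", and 3+ minutes without multiple variations is
    also treated as "recommended".
    """
    if seconds < 30:
        return "not accepted"
    if seconds < 60:
        return "minimum viable"
    if seconds >= 180 and multiple_variations:
        return "best"
    return "recommended"
```

For example, `classify_duration(45)` returns `"minimum viable"`, while `classify_duration(200, multiple_variations=True)` returns `"best"`.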
What counts as acceptable audio
You do not need to record everything fresh. You can also submit:
- Existing audio files where only your voice is present
- Existing video files (Deliah extracts the audio)
- Voice messages from real fan chats, provided the audio quality is good