Get accurate text from any MP3 audio file. Perfect for podcasts, interviews, lectures, and recorded meetings. Free to start.

1. Upload your MP3
Drop or browse for an MP3 file up to 2 GB. Any bitrate from 32 kbps mono up to 320 kbps stereo is accepted.
2. Choose language
Pick from 99+ languages or let the AI auto-detect the language. Enable speaker separation for multi-voice recordings.
3. Export transcript
Review, edit, and download as TXT, DOCX, or PDF, or with timecodes as SRT/VTT. Your MP3 is deleted within 24 hours.
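
To put the 2 GB upload limit in perspective, the rough sketch below estimates how many hours of constant-bitrate audio fit in a file of that size. It assumes "2 GB" means 2 × 1024³ bytes and ignores ID3 tags and VBR variation. The function and constant names are illustrative only and are not part of any product API.

```python
# Assumption: "2 GB" is read as 2 * 1024**3 bytes; the service may count differently.
MAX_UPLOAD_BYTES = 2 * 1024**3


def max_audio_hours(bitrate_kbps: int, size_bytes: int = MAX_UPLOAD_BYTES) -> float:
    """Approximate playing time, in hours, of a CBR MP3 of the given size."""
    bits = size_bytes * 8
    seconds = bits / (bitrate_kbps * 1000)  # MP3 bitrates use 1 kbps = 1000 bit/s
    return seconds / 3600


if __name__ == "__main__":
    for kbps in (32, 128, 320):
        print(f"{kbps} kbps: about {max_audio_hours(kbps):.1f} hours")
```

Even at 320 kbps the limit works out to roughly 15 hours of audio, so file size is unlikely to be the constraint for typical podcasts or lectures.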

All MP3 bitrates
32 kbps mono to 320 kbps stereo. CBR and VBR both work; ID3v1/v2 tags are skipped, so only the audio is transcribed.
Speaker separation
Labels different voices — essential for interviews, podcasts, and meetings.
99+ languages
Auto-detect or pick from every major world language.
Word-level timestamps
Jump from any phrase in the transcript directly to that moment in the audio.
Private and secure
Encrypted in transit and at rest. Files auto-deleted. Never used for AI training.
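
As an illustration of how word-level timestamps relate to the SRT export, here is a minimal sketch that groups timed words into numbered subtitle cues. The input shape (a list of `(text, start_seconds, end_seconds)` tuples) and the seven-words-per-cue rule are assumptions made for the example; the service's actual export logic and data format are not documented here.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timecode, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Group (text, start, end) word tuples into numbered SRT cues.

    Hypothetical input format: each word carries its own start and end
    time in seconds. A cue spans from its first word's start to its
    last word's end.
    """
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start, end = chunk[0][1], chunk[-1][2]
        text = " ".join(word for word, _, _ in chunk)
        cues.append(
            f"{len(cues) + 1}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text}\n"
        )
    return "\n".join(cues)


if __name__ == "__main__":
    sample = [("Hello", 0.0, 0.4), ("world", 0.5, 1.0)]
    print(words_to_srt(sample))
```

A real exporter would more likely break cues at pauses or punctuation than at a fixed word count; the fixed count just keeps the sketch short.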

Podcast show notes
Generate searchable transcripts and episode summaries for every podcast.
Interview journalism
Transcribe recorded interviews for quoting, fact-checking, and article writing.
Voice memos
iPhone voice memos default to M4A but convert easily to MP3. Transcribe for meeting notes or idea capture.
Lecture recordings
Turn recorded classes into study notes, searchable text, and accessible summaries.

Built for creators, journalists, students, and researchers worldwide.

What MP3 qualities are supported?
Any bitrate (32-320 kbps), any sample rate (8-48 kHz), mono or stereo, CBR or VBR. ID3v1/v2 tags are ignored; only the audio content is transcribed.
How long does transcription take?
Typically 1/10 to 1/5 of the audio duration. A 60-minute podcast usually finishes in 6-12 minutes on our servers.
How accurate is the AI?
85-99% for clear speech. Background music, reverb, crosstalk, and heavy accents reduce accuracy.
Is speaker separation reliable with 2-3 speakers?
Very reliable for 2-3 distinct voices. Quality drops with 4+ simultaneous speakers or frequent interruptions.
Can I get a summary after transcription?
Coming soon with Pro: a one-click AI summary plus key-topic extraction from the transcript.
Is this really free?
Yes, within limits. Free tier: 5 minutes per file, 10 minutes per day for anonymous users; 30 minutes per month for signed-in users. Pro plan: 10 hours per month.
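
The turnaround figure quoted in the FAQ (between 1/10 and 1/5 of the audio duration) can be worked through as simple arithmetic. The helper below is only a back-of-the-envelope sketch with a made-up function name; actual processing time depends on server load and audio quality.

```python
def estimate_processing_minutes(audio_minutes: float) -> tuple[float, float]:
    """Return (fastest, slowest) expected processing time in minutes.

    Uses the stated range of 1/10 to 1/5 of the audio duration.
    """
    fastest = audio_minutes / 10
    slowest = audio_minutes / 5
    return fastest, slowest


if __name__ == "__main__":
    low, high = estimate_processing_minutes(60)
    print(f"60 min of audio: roughly {low:.0f}-{high:.0f} min")
```

For a 60-minute recording this reproduces the 6-12 minute range given above.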