Tues Feb 17, 2026 6:53pm PST
Ask HN: Best multi-lingual text-to-speech system
I'm looking for a way to bulk-generate audio from text files. Ideally, it would be a system that I can run locally (M3 Mac, 24GB RAM) and that supports at least 10 languages natively.

I have tried a few systems (eSpeak, Piper, Qwen), and none of them has given satisfactory results. Hugging Face doesn't seem to have any particularly acclaimed text-to-speech models, either. I have been using OpenAI's gpt-4o-mini model, but that seems to be approaching end-of-life.

Is there an LLM (or non-LLM) system that you would recommend?
