Currently, voice frameworks are either fully hosted with limited customizability (e.g. Vapi, Retell), or fully customizable while requiring you to host and scale the agent yourself (Livekit, Pipecat). We weren’t satisfied with these options, so we built Jay.
Jay makes it easy for you to add custom logic such as a RAG pipeline, an arbitrary LLM provider, or anything else that controls the LLM’s response. It’s built on top of the standard STT → LLM → TTS pipeline, and handles things like voice interruptions automatically. It also supports function calling (i.e. tool calls).
You can deploy your first agent to production in just a few minutes.
Try it out here, and let us know what you think! https://jay.so/