Mon Mar 31, 2025 2:07pm PST
Show HN: Speak, capture, type - my voice assistant works anywhere on your system
I built Vibevoice, a tool that lets you dictate text and run voice commands with screen context anywhere on your system. It works like this:

For regular dictation, hold the right Ctrl key while speaking, then release to have your words typed automatically wherever your cursor is - perfect for coding, emails, or chat apps.
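For anyone curious how the hold-to-talk part could be structured, here is a minimal sketch. It is not Vibevoice's actual code: the `PushToTalk` class, its method names, and the `"ctrl_r"` key label are all hypothetical, and the keyboard hook and microphone capture (which would need a third-party library) are left out so that only the press/record/release logic is shown.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PushToTalk:
    """Buffer audio chunks only while the trigger key is held (illustrative only)."""
    trigger: str                            # hypothetical key label, e.g. "ctrl_r"
    on_utterance: Callable[[bytes], None]   # called with the buffered audio on release
    _held: bool = False
    _chunks: List[bytes] = field(default_factory=list)

    def key_down(self, key: str) -> None:
        # Start a fresh recording; ignore other keys and OS key-repeat events.
        if key == self.trigger and not self._held:
            self._held = True
            self._chunks = []

    def feed(self, chunk: bytes) -> None:
        # Audio arriving while the key is not held is dropped.
        if self._held:
            self._chunks.append(chunk)

    def key_up(self, key: str) -> None:
        # On release, hand the whole utterance to the next stage
        # (in a real tool: transcription, then typing the text).
        if key == self.trigger and self._held:
            self._held = False
            self.on_utterance(b"".join(self._chunks))


# Simulated session: one hold of the trigger key with a key-repeat in the middle.
captured: List[bytes] = []
ptt = PushToTalk("ctrl_r", captured.append)
ptt.feed(b"dropped")        # key not held yet
ptt.key_down("ctrl_r")
ptt.feed(b"hello ")
ptt.key_down("ctrl_r")      # key repeat, must not reset the buffer
ptt.feed(b"world")
ptt.key_up("ctrl_r")
```

In a real tool, `key_down` and `key_up` would be driven by a global keyboard listener and `feed` by the microphone stream; the callback is where transcription would start.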

The more interesting feature is the AI command mode: hold the Scroll Lock key, speak a prompt, and a local LLM responds based on both your words AND a screenshot of what you're looking at. The AI's response gets typed directly into your application as if you typed it yourself.

Everything runs locally using Whisper for transcription and Ollama for the LLM (I recommend gemma3:27b for best results). No cloud services or API costs.
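To give a rough idea of what the command mode's LLM call could look like, here is a sketch that sends a spoken prompt plus a screenshot to a local Ollama server over its HTTP API. It assumes Ollama is listening on its default port (11434) and that the chosen model accepts images; the function names `build_request` and `ask` are hypothetical, and the screenshot capture and transcription steps are omitted. This is an illustration, not the code Vibevoice ships.

```python
import base64
import json
import urllib.request

# Ollama's default local endpoint; adjust if your server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, screenshot_png: bytes, model: str = "gemma3:27b") -> dict:
    """Build the JSON body for a single, non-streaming generate call."""
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(screenshot_png).decode("ascii")],
        "stream": False,
    }


def ask(prompt: str, screenshot_png: bytes, model: str = "gemma3:27b",
        url: str = OLLAMA_URL) -> str:
    """Send the prompt and screenshot to the local server and return the reply text."""
    body = json.dumps(build_request(prompt, screenshot_png, model)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask("Reply to this email politely", png_bytes)` would return the model's text, which the tool could then type at the cursor. Nothing leaves the machine, since the request only goes to localhost.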

The tool was inspired by Karpathy's "vibe coding" and builds on Vlad's whisper-keyboard project. I extended it to work with local models and added the screenshot context feature, which makes the AI much more useful for everyday tasks like writing emails.

Let me know what you think!
