- I often still prefer Google, because I feel like I can get an answer quicker
- I'd rather ask a smaller LLM a few questions than GPT-4 just one
- The latency of LLMs is often enough to lose your momentum or abort the generation
So I asked myself: how could I build the fastest LLM prompt for the CLI? My best guess is to use the fastest language (Rust) and the fastest LLM (Mixtral, powered by https://groq.com)
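The core of the setup can be sketched with a single request against Groq's OpenAI-compatible chat endpoint. This is a minimal sketch, not my actual tool: the model name `mixtral-8x7b-32768` and the `GROQ_API_KEY` environment variable are assumptions you'd check against Groq's docs.

```shell
# Hedged sketch of a one-shot CLI prompt against Groq's
# OpenAI-compatible chat completions endpoint.
# Assumes GROQ_API_KEY is set and "mixtral-8x7b-32768" is a valid model name.
q="how do I list only directories with ls?"

curl -s https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"model\": \"mixtral-8x7b-32768\", \"messages\": [{\"role\": \"user\", \"content\": \"$q\"}]}"
```

The Rust binary essentially does the same thing, just with the JSON parsing and argument handling compiled in, so startup cost is near zero.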
And it's a game changer for me! At this speed it can replace most Googling, reading man pages, looking stuff up, … I can't wait to extend it with more features! =)
Do you have any ideas how to get it even faster?