1 year ago
Thurs May 2, 2024 3:05pm PST
Show HN: Analyzing GPT-4 Tokens with Llama3
Inspired by Andrej Karpathy's excellent YouTube video on tokenizers, I used Llama3 to analyze all 100,000 GPT-4 tokens. The results were somewhat expected: a strong focus on English and code. Interestingly, only 124 tokens were dedicated to my native Dutch, which might explain why GPT-4 underperforms in that language.
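For readers curious what such an analysis might look like, here is a minimal sketch of the tallying step only. It is not the code behind the post: `tally_token_languages`, `toy_classifier`, and the sample tokens are hypothetical names and data made up for illustration, and the toy classifier merely stands in for a real Llama3 call.

```python
from collections import Counter
from typing import Callable, Iterable


def tally_token_languages(
    tokens: Iterable[bytes],
    classify: Callable[[str], str],
) -> Counter:
    """Count how many tokens fall under each label returned by `classify`."""
    counts: Counter = Counter()
    for raw in tokens:
        # BPE tokens are byte sequences and may not be valid UTF-8 on their own,
        # so undecodable bytes are replaced rather than raising an error.
        text = raw.decode("utf-8", errors="replace")
        counts[classify(text)] += 1
    return counts


def toy_classifier(text: str) -> str:
    """Hypothetical placeholder for an LLM call; far too crude for real use."""
    word = text.strip()
    if word in {"het", "een", "niet"}:
        return "dutch"
    if word.isascii() and word.isalpha():
        return "english"
    return "other"


# Made-up sample; a real run would iterate over the full vocabulary.
sample = [b" the", b" het", b"niet", b"():", b" token"]
counts = tally_token_languages(sample, toy_classifier)
```

In a real run, the token bytes could presumably come from a tokenizer library such as tiktoken (its `cl100k_base` encoding), and `classify` would wrap a prompt to the model; both are assumptions here, not details confirmed by the post.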