hckrnws
ModelForge
Wed Dec 18, 2024 1:36pm PST
Karma:
332
submitted
Wed Dec 31, 2025 3:40pm PST
The State of LLMs 2025: Progress, Problems, and Predictions
@ModelForge
3
Tue Nov 4, 2025 3:00pm PST
A Researcher's Field Guide to Non-Standard LLM Architectures
@ModelForge
2
Mon Nov 3, 2025 4:59pm PST
Explanation of Gated DeltaNet (Qwen3-Next and Kimi Linear)
@ModelForge
3
Mon Oct 27, 2025 3:40pm PST
The Core Components of Modern LLMs and the Models Beyond Transformers [video]
@ModelForge
3
Wed Oct 15, 2025 2:17pm PST
Popular Attention Alternatives: GQA, MLA, SWA
@ModelForge
4
Mon Oct 13, 2025 6:24pm PST
Multi-Head Latent Attention
@ModelForge
4
Sat Oct 11, 2025 7:57pm PST
Thinking Machines Lab Co-Founder Departs for Meta
@ModelForge
7
Fri Oct 10, 2025 8:41pm PST
OpenAI's internal Slack messages could cost it billions in copyright suit
@ModelForge
1
1
8
Sun Oct 5, 2025 3:55pm PST
LLM Evaluation from Scratch: Multiple Choice, Verifiers, Leaderboards, LLM Judge
@ModelForge
4
Wed Aug 20, 2025 2:01pm PST
Gemma 3 270M re-implemented in pure PyTorch for local tinkering
@ModelForge
14
57
417
Sun Aug 10, 2025 3:06pm PST
GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2
@ModelForge
16
97
490
Wed Dec 18, 2024 2:17pm PST
LLM Research Papers: The 2024 List
@ModelForge
5
Wed Dec 18, 2024 1:37pm PST
Scaling Test-Time Compute with Open LLM Models
@ModelForge
3