gpjt
Mon Jan 12, 2009 10:44am PST
Karma: 1711
about
https://www.gilesthomas.com/
submitted
Wed Apr 29, 2026 1:15pm PST
10Gb/s Ethernet: what I did to get it working in my home
@gpjt
30
154
213
Tues Apr 28, 2026 5:55pm PST
10Gb Ethernet: what I had to (re)learn
@gpjt
1
1
1
Wed Apr 22, 2026 4:38pm PST
LLM from scratch, part 33 – what I learned from the appendices
@gpjt
5
Tues Apr 21, 2026 12:14am PST
LLM from scratch (32l) – Interventions: updated instruction fine-tuning results
@gpjt
1
Fri Apr 17, 2026 10:45pm PST
How an LLM becomes more coherent as we train it
@gpjt
3
Wed Apr 15, 2026 8:32pm PST
LLM from scratch, part 32k – Interventions: gradient accumulation
@gpjt
2
Fri Apr 10, 2026 2:06pm PST
Provision: LLM-powered server setup from Markdown
@gpjt
2
Thurs Apr 9, 2026 7:08pm PST
LLM from scratch, part 32j – trying to train a better model in the cloud
@gpjt
2
Tues Apr 7, 2026 7:59pm PST
Writing an LLM from scratch, part 32i – Interventions: what is in the noise?
@gpjt
1
Fri Apr 3, 2026 11:08pm PST
Writing an LLM from scratch, part 32h – Interventions: full fat float32
@gpjt
7
Tues Mar 24, 2026 7:53pm PST
Writing an LLM from scratch, part 32g – Interventions: weight tying
@gpjt
2
Tues Mar 24, 2026 12:05am PST
Writing an LLM from scratch, part 32f – Interventions: weight decay
@gpjt
6
Tues Mar 10, 2026 5:04pm PST
Writing an LLM from scratch, part 32e – Interventions: the learning rate
@gpjt
3
Sat Feb 7, 2026 12:12am PST
Writing an LLM from scratch, part 32d – Interventions: adding attention bias
@gpjt
6
Thurs Feb 5, 2026 11:39pm PST
Writing an LLM from scratch, part 32c – Interventions: removing dropout
@gpjt
1
Thurs Feb 5, 2026 1:22am PST
Writing an LLM from scratch, part 32b – Interventions: gradient clipping
@gpjt
2
Wed Feb 4, 2026 2:09am PST
Writing an LLM from scratch, part 32a – Interventions: training a baseline model
@gpjt
1
Wed Jan 28, 2026 11:00pm PST
Getting a Custom PyTorch LLM onto the Hugging Face Hub
@gpjt
1
Sat Jan 17, 2026 7:58pm PST
Writing an LLM from scratch, part 31 – the models are now on Hugging Face
@gpjt
2
Fri Jan 9, 2026 1:17am PST
Writing an LLM from scratch, part 30 – digging into the LLM-as-a-judge results
@gpjt
1
Wed Jan 7, 2026 8:45pm PST
LLM from scratch, part 29 – using DDP to train a base model in the cloud
@gpjt
2
Tues Dec 2, 2025 6:17pm PST
LLM from scratch, part 28 – training a base model from scratch on an RTX 3090
@gpjt
22
121
540
Tues Nov 4, 2025 12:51am PST
Writing an LLM from scratch, part 27 – what's left, and what's next?
@gpjt
1
Mon Nov 3, 2025 7:41pm PST
Writing an LLM from scratch, part 26 – evaluating the fine-tuned model
@gpjt
4
Wed Oct 29, 2025 9:05pm PST
Writing an LLM from scratch, part 25 – instruction fine-tuning
@gpjt
2
Tues Oct 28, 2025 8:17pm PST
Writing an LLM from scratch, part 24 – the transcript hack
@gpjt
1
Fri Oct 24, 2025 6:55pm PST
Retro Language Models: Rebuilding Karpathy's RNN in PyTorch
@gpjt
3
Wed Oct 22, 2025 11:04pm PST
Writing an LLM from scratch, part 23 – fine-tuning for classification
@gpjt
1
Wed Oct 15, 2025 11:42pm PST
Writing an LLM from scratch, part 22 – training our LLM
@gpjt
5
10
254
Sat Oct 11, 2025 1:02am PST
Revisiting Karpathy's 'Unreasonable Effectiveness of Recurrent Neural Networks'
@gpjt
2
Tues Oct 7, 2025 7:05pm PST
Writing an LLM from scratch, part 21 – perplexed by perplexity
@gpjt
1
Thurs Oct 2, 2025 9:14pm PST
Writing an LLM from scratch, part 20 – starting training, and cross entropy loss
@gpjt
2
3
41
Wed Sep 17, 2025 2:31pm PST
How Do LLMs Work?
@gpjt
1
1
2
Tues Sep 2, 2025 11:10pm PST
The maths you need to start understanding LLMs
@gpjt
31
120
616
Fri Aug 29, 2025 7:06pm PST
What AI chatbots are doing under the hood
@gpjt
2
Mon Aug 18, 2025 7:25pm PST
LLM from scratch, part 18 – residuals, shortcut connections, and the Talmud
@gpjt
2
Thurs Aug 14, 2025 10:46pm PST
The fixed length bottleneck and the feed forward network
@gpjt
1
Tues Aug 12, 2025 10:08pm PST
Writing an LLM from scratch, part 17 – the feed-forward network
@gpjt
8
Tues Jul 8, 2025 7:17pm PST
Writing an LLM from scratch, part 16 – layer normalisation
@gpjt
1
Thurs Jun 5, 2025 6:15pm PST
Leaving PythonAnywhere
@gpjt
3
Sat May 31, 2025 11:25pm PST
Writing an LLM from scratch, part 15 – from context vectors to logits
@gpjt
1
7
Wed May 14, 2025 8:28pm PST
Writing an LLM from scratch, part 14 – the complexity of self-attention at scale
@gpjt
1
1
Thurs May 8, 2025 9:06pm PST
Writing an LLM from scratch, part 13 – attention heads are dumb
@gpjt
10
67
351
Mon Apr 21, 2025 10:55pm PST
Writing an LLM from scratch, part 12 – multi-head attention
@gpjt
3
Sat Apr 19, 2025 10:10pm PST
Writing an LLM from scratch, part 11 – batches
@gpjt
2
Mon Apr 14, 2025 9:47pm PST
The Business of the AI Labs
@gpjt
1
3
19
Thurs Mar 20, 2025 1:25am PST
Writing an LLM from scratch, part 10 – dropout
@gpjt
4
8
90
Tues Mar 18, 2025 10:36pm PST
Adding /llms.txt
@gpjt
1
Mon Mar 10, 2025 1:46am PST
Writing an LLM from scratch, part 9 – causal attention
@gpjt
4
Wed Mar 5, 2025 1:41am PST
Writing an LLM from scratch, part 8 – trainable self-attention
@gpjt
6
31
380