gpjt

Mon Jan 12, 2009 10:44am PST

Karma:

1711

about

https://www.gilesthomas.com/

submitted

Wed Apr 29, 2026 1:15pm PST

10Gb/s Ethernet: what I did to get it working in my home

@gpjt

30

154

213

Tues Apr 28, 2026 5:55pm PST

10Gb Ethernet: what I had to (re)learn

@gpjt

1

1

1

Wed Apr 22, 2026 4:38pm PST

LLM from scratch, part 33 – what I learned from the appendices

@gpjt

5

Tues Apr 21, 2026 12:14am PST

LLM from scratch (32l) – Interventions: updated instruction fine-tuning results

@gpjt

1

Fri Apr 17, 2026 10:45pm PST

How an LLM becomes more coherent as we train it

@gpjt

3

Wed Apr 15, 2026 8:32pm PST

LLM from scratch, part 32k – Interventions: gradient accumulation

@gpjt

2

Fri Apr 10, 2026 2:06pm PST

Provision: LLM-powered server setup from Markdown

@gpjt

2

Thurs Apr 9, 2026 7:08pm PST

LLM from scratch, part 32j – trying to train a better model in the cloud

@gpjt

2

Tues Apr 7, 2026 7:59pm PST

Writing an LLM from scratch, part 32i – Interventions: what is in the noise?

@gpjt

1

Fri Apr 3, 2026 11:08pm PST

Writing an LLM from scratch, part 32h – Interventions: full fat float32

@gpjt

7

Tues Mar 24, 2026 7:53pm PST

Writing an LLM from scratch, part 32g – Interventions: weight tying

@gpjt

2

Tues Mar 24, 2026 12:05am PST

Writing an LLM from scratch, part 32f – Interventions: weight decay

@gpjt

6

Tues Mar 10, 2026 5:04pm PST

Writing an LLM from scratch, part 32e – Interventions: the learning rate

@gpjt

3

Sat Feb 7, 2026 12:12am PST

Writing an LLM from scratch, part 32d – Interventions: adding attention bias

@gpjt

6

Thurs Feb 5, 2026 11:39pm PST

Writing an LLM from scratch, part 32c – Interventions: removing dropout

@gpjt

1

Thurs Feb 5, 2026 1:22am PST

Writing an LLM from scratch, part 32b – Interventions: gradient clipping

@gpjt

2

Wed Feb 4, 2026 2:09am PST

Writing an LLM from scratch, part 32a – Interventions: training a baseline model

@gpjt

1

Wed Jan 28, 2026 11:00pm PST

Getting a Custom PyTorch LLM onto the Hugging Face Hub

@gpjt

1

Sat Jan 17, 2026 7:58pm PST

Writing an LLM from scratch, part 31 – the models are now on Hugging Face

@gpjt

2

Fri Jan 9, 2026 1:17am PST

Writing an LLM from scratch, part 30 – digging into the LLM-as-a-judge results

@gpjt

1

Wed Jan 7, 2026 8:45pm PST

LLM from scratch, part 29 – using DDP to train a base model in the cloud

@gpjt

2

Tues Dec 2, 2025 6:17pm PST

LLM from scratch, part 28 – training a base model from scratch on an RTX 3090

@gpjt

22

121

540

Tues Nov 4, 2025 12:51am PST

Writing an LLM from scratch, part 27 – what's left, and what's next?

@gpjt

1

Mon Nov 3, 2025 7:41pm PST

Writing an LLM from scratch, part 26 – evaluating the fine-tuned model

@gpjt

4

Wed Oct 29, 2025 9:05pm PST

Writing an LLM from scratch, part 25 – instruction fine-tuning

@gpjt

2

Tues Oct 28, 2025 8:17pm PST

Writing an LLM from scratch, part 24 – the transcript hack

@gpjt

1

Fri Oct 24, 2025 6:55pm PST

Retro Language Models: Rebuilding Karpathy's RNN in PyTorch

@gpjt

3

Wed Oct 22, 2025 11:04pm PST

Writing an LLM from scratch, part 23 – fine-tuning for classification

@gpjt

1

Wed Oct 15, 2025 11:42pm PST

Writing an LLM from scratch, part 22 – training our LLM

@gpjt

5

10

254

Sat Oct 11, 2025 1:02am PST

Revisiting Karpathy's 'Unreasonable Effectiveness of Recurrent Neural Networks'

@gpjt

2

Tues Oct 7, 2025 7:05pm PST

Writing an LLM from scratch, part 21 – perplexed by perplexity

@gpjt

1

Thurs Oct 2, 2025 9:14pm PST

Writing an LLM from scratch, part 20 – starting training, and cross entropy loss

@gpjt

2

3

41

Wed Sep 17, 2025 2:31pm PST

How Do LLMs Work?

@gpjt

1

1

2

Tues Sep 2, 2025 11:10pm PST

The maths you need to start understanding LLMs

@gpjt

31

120

616

Fri Aug 29, 2025 7:06pm PST

What AI chatbots are doing under the hood

@gpjt

2

Mon Aug 18, 2025 7:25pm PST

LLM from scratch, part 18 – residuals, shortcut connections, and the Talmud

@gpjt

2

Thurs Aug 14, 2025 10:46pm PST

The fixed length bottleneck and the feed forward network

@gpjt

1

Tues Aug 12, 2025 10:08pm PST

Writing an LLM from scratch, part 17 – the feed-forward network

@gpjt

8

Tues Jul 8, 2025 7:17pm PST

Writing an LLM from scratch, part 16 – layer normalisation

@gpjt

1

Thurs Jun 5, 2025 6:15pm PST

Leaving PythonAnywhere

@gpjt

3

Sat May 31, 2025 11:25pm PST

Writing an LLM from scratch, part 15 – from context vectors to logits

@gpjt

1

7

Wed May 14, 2025 8:28pm PST

Writing an LLM from scratch, part 14 – the complexity of self-attention at scale

@gpjt

1

1

Thurs May 8, 2025 9:06pm PST

Writing an LLM from scratch, part 13 – attention heads are dumb

@gpjt

10

67

351

Mon Apr 21, 2025 10:55pm PST

Writing an LLM from scratch, part 12 – multi-head attention

@gpjt

3

Sat Apr 19, 2025 10:10pm PST

Writing an LLM from scratch, part 11 – batches

@gpjt

2

Mon Apr 14, 2025 9:47pm PST

The Business of the AI Labs

@gpjt

1

3

19

Thurs Mar 20, 2025 1:25am PST

Writing an LLM from scratch, part 10 – dropout

@gpjt

4

8

90

Tues Mar 18, 2025 10:36pm PST

Adding /llms.txt

@gpjt

1

Mon Mar 10, 2025 1:46am PST

Writing an LLM from scratch, part 9 – causal attention

@gpjt

4

Wed Mar 5, 2025 1:41am PST

Writing an LLM from scratch, part 8 – trainable self-attention

@gpjt

6

31

380