Hacker News: LLM from scratch, part 28 – training a base model from scratch on an RTX 3090

489 points • gpjt • 7 days ago • 101 comments