Hacker News

LLM from scratch, part 28 – training a base model from scratch on an RTX 3090