llm.nim

This is a port of Andrej Karpathy’s llm.c project (the CPU version). I toyed around with it for a couple of hours on the day he released the initial version, but only continued with it on a train / plane trip last weekend (<2024-04-20 Sat>). The port started from commit a22c22b, which means the tokenizer is not there at the moment. I might add it one of these days.

Performance is ~comparable to the C version.

Note: the port was done in a bit of a hurry, so who knows what bugs lurk compared to the original! :)

Differences to the C version

We have a very shallow abstraction of the raw pointer buffer interface from C in the form of an MView[T] type (which is just a ptr UncheckedArray[T] in Nim plus a few goodies, notably a {} accessor that does pointer arithmetic for more ‘natural’ access to another buffer).
We use Nim’s compile-time (CT) features to automatically assign the correct buffer views for the *Tensor fields based on fieldPairs, their order in the object and the params/act_sizes input.
Generally less pointer handling.
We use destructors for the GPT2 and DataLoader objects so that we don’t have to free manually (and copying these is disallowed).
Instead of relying on OpenMP’s collapse clause to fuse multiple nested loops, we use a custom compile-time loop fusion macro, see ./fuse_loops.nim. The issue is that Nim converts for loops into while statements, which doesn’t play nice with nested loops for OpenMP. :) So I wrote a macro that wraps around for loops and performs the loop fusion manually. Note that it only works for loops of the form for i in 0 ..< X (i.e. lower index 0 and using ..<).
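The MView[T] idea from the list above can be sketched roughly as follows. This is only a minimal illustration, not the type from train_gpt2.nim: the object layout, the size field and the toMView helper are assumptions made up for this example. Only the ptr UncheckedArray[T] storage and the {} accessor are taken from the description.

```nim
type
  MView[T] = object
    data: ptr UncheckedArray[T]  # raw, unchecked buffer (as in the C code)
    size: int                    # hypothetical field: elements left in the view

proc toMView[T](s: var seq[T]): MView[T] =
  ## Hypothetical helper: a view over the memory of an existing, non-empty seq.
  MView[T](data: cast[ptr UncheckedArray[T]](addr s[0]), size: s.len)

proc `[]`[T](v: MView[T], i: int): T = v.data[i]
proc `[]=`[T](v: MView[T], i: int, val: T) = v.data[i] = val

proc `{}`[T](v: MView[T], offset: int): MView[T] =
  ## Pointer arithmetic: a new view starting `offset` elements further in.
  MView[T](data: cast[ptr UncheckedArray[T]](addr v.data[offset]),
           size: v.size - offset)

# Fill a small buffer with 0, 1, 2, ... and take views of it.
var buf = newSeq[float32](6)
for i in 0 ..< buf.len:
  buf[i] = float32(i)

let v = buf.toMView()
let sub = v{2}          # roughly `ptr + 2` in C
sub[0] = 42.0'f32       # writes through to buf[2]
```

The view does not own its memory and performs no bounds checks, so buf has to outlive every view taken from it.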
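To make the loop fusion point concrete, here is a hand-written sketch of what fusing two nested 0 ..< X loops amounts to. It is not the macro from ./fuse_loops.nim and makes no claim about the code that macro generates. The names B, T, nested and fused are hypothetical and chosen only for this example.

```nim
# Hypothetical loop bounds, chosen only for illustration.
const B = 2
const T = 3

proc nested(): seq[(int, int)] =
  ## The original form: two nested loops, both starting at 0 and using ..<.
  for b in 0 ..< B:
    for t in 0 ..< T:
      result.add((b, t))

proc fused(): seq[(int, int)] =
  ## The fused form: one flat loop over B*T iterations. The original
  ## indices are recovered with integer division and modulo.
  for idx in 0 ..< B * T:
    let b = idx div T
    let t = idx mod T
    result.add((b, t))
```

Both procs visit the same (b, t) pairs in the same order. The flat form leaves a single loop to parallelize, which is the effect one would otherwise get from collapsing the nest.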

Notes about Nim

We have to compile with --exceptions:quirky, because otherwise the error checks Nim inserts break the OpenMP compilation. We could disable checks locally in the code instead, but for this project the global flag is fine.

Compilation

The important compilation arguments are defined in a local nim.cfg and at the top of the train_gpt2.nim file (fast-math and OpenMP related).

With those in place, just compile with:

nim c -d:danger -d:openmp -d:lto --passC:"-march=native" train_gpt2.nim

For everything else, follow the CPU instructions from the original repo to get started: https://github.com/karpathy/llm.c?tab=readme-ov-file#quick-start-cpu

Why?

Similar to my Nim port of his llama2.c, I had time to kill on a trip! And doing such ‘dumb’ ports is kind of meditative… lol
