Mechanistic Interpretability: A Meta repository

Iteration Heads

In a first paper, we study chain-of-thought reasoning in a controlled setting, illustrating the usefulness of iteration heads.

Installation

The code requires Python 3.10+ (for structural pattern matching with match/case). Installation steps:

  • Install miniconda.
  • Install Python in a new conda environment: be mindful to install a version of Python that is compatible with PyTorch 2 (e.g., to use torch.compile, PyTorch 2.0.1 requires Python 3.10 or earlier, and PyTorch 2.1 requires Python 3.11 or earlier).
conda create -n llm
conda activate llm
conda install pip
  • Install PyTorch and check CUDA support: be mindful to install a build that is compatible with your CUDA driver (use nvidia-smi to check your CUDA driver). The example below uses the CUDA 11.8 wheels:
pip install torch --index-url https://download.pytorch.org/whl/cu118
python -c "import torch; print(torch.cuda.is_available())"
True
  • Install this repo
git clone <repo url>
cd <repo path>
pip install -e .
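
The installation steps above can be double-checked with a short script. The following is a minimal sketch, not part of the repository: the function names python_ok, cuda_status, and report are hypothetical, and the only facts it relies on are the Python 3.10+ requirement and the torch.cuda.is_available() check already shown above.

```python
import importlib.util
import sys

# Minimum version stated in this README (needed for match/case).
MIN_PYTHON = (3, 10)


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


def cuda_status():
    """Describe whether PyTorch is installed and whether it sees a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch  # imported lazily so the script still runs without torch

    if torch.cuda.is_available():
        return f"torch {torch.__version__}: CUDA available"
    return f"torch {torch.__version__}: CUDA not available (CPU only)"


def report():
    """Collect both checks into a list of human-readable lines."""
    python_line = "Python version OK" if python_ok() else "Python 3.10+ is required"
    return [python_line, cuda_status()]
```

Calling print("\n".join(report())) inside the activated llm environment prints one line per check. If CUDA is reported as unavailable, the installed wheel may not match the driver version reported by nvidia-smi.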

Development

For formatting, I recommend using black, flake8, and isort. Consider enabling automatic formatting on save (easy to set up in VSCode; ask ChatGPT for help if you are not comfortable with VSCode configuration).
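
If you prefer running the three tools by hand rather than on save, they can be chained from a small helper. This is only a sketch under assumptions: the helper names formatter_commands and run_all are hypothetical, the default targets src and tests are taken from the folder layout described in this README, and the isort "black" profile and the 88-character flake8 limit are choices made here so that the three tools agree with each other, not settings confirmed by the repository.

```python
import subprocess
import sys


def formatter_commands(paths=("src", "tests")):
    """Return one command (as an argument list) per tool, in run order."""
    targets = list(paths)
    return [
        # isort first, using its black-compatible profile (assumed choice).
        [sys.executable, "-m", "isort", "--profile", "black", *targets],
        # black rewrites the files in place.
        [sys.executable, "-m", "black", *targets],
        # flake8 only reports; 88 matches black's default line length.
        [sys.executable, "-m", "flake8", "--max-line-length", "88", *targets],
    ]


def run_all(paths=("src", "tests")):
    """Run each tool in turn and return how many exited with an error."""
    failures = 0
    for cmd in formatter_commands(paths):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            failures += 1
    return failures
```

Calling run_all() from the repository root runs isort, black, and flake8 in sequence and returns the number of tools that reported a problem. All three tools must be installed in the active environment.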

Organization

The main code is in the src folder. Other folders include:

  • data: contains data used in the experiments.
  • launchers: contains bash scripts to launch experiments.
  • models: stores model weights.
  • notebooks: used for exploration and visualization.
  • scripts: contains python scripts to run experiments.
  • tests: contains tests for the code.
  • tutorial: contains tutorial notebooks to get started with LLM training.
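
As a quick sanity check after cloning, the layout listed above can be compared against what is on disk. This is a hypothetical helper, not something shipped in the repository; the folder names come from the list above, and a missing folder is not necessarily an error (for instance, a folder may only appear once experiments have been run).

```python
from pathlib import Path

# Folder names as listed in the Organization section of this README.
EXPECTED_DIRS = (
    "src",
    "data",
    "launchers",
    "models",
    "notebooks",
    "scripts",
    "tests",
    "tutorial",
)


def missing_dirs(root="."):
    """Return the expected folder names that are not directories under root."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Running missing_dirs() from the repository root returns an empty list when every folder is present.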