This repository contains the code for multiple Preparedness evals that use nanoeval and alcatraz.
- Python 3.11 (3.12 is untested; 3.13 will break chz)

Install the prerequisites in editable mode:

`for proj in nanoeval alcatraz nanoeval_alcatraz; do pip install -e project/"$proj"; done`

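Because the supported interpreter range is narrow, it can help to check the Python version before installing. The following is only a sketch: `is_supported_python` is a hypothetical helper, not something this repository provides, and it assumes that only the 3.11 series should be accepted.

```shell
# Hypothetical helper (not part of this repository).
# Succeeds only when the given version string belongs to the Python 3.11 series.
is_supported_python() {
    case "$1" in
        3.11|3.11.*) return 0 ;;  # e.g. "3.11" or "3.11.9"
        *) return 1 ;;            # anything else, including 3.12 and 3.13
    esac
}
```

One way to use it, assuming `python` on your `PATH` is the interpreter that `pip` installs into: run `is_supported_python "$(python -c 'import sys; print("%d.%d" % sys.version_info[:2])')" || echo "Python 3.11 is required" >&2` before the install loop.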
- PaperBench
- SWELancer (Forthcoming)
- MLE-bench (Forthcoming)