USAAIO Prep — Barry's AI Olympiad Workbench

New — past-problem archive. 4 real past problems (IOAI 2024 NLP, IOAI 2025 CV, NOAI China 2024 tabular ML, USAAIO 2025 Round 1 theory) with full walkthroughs: official statement, baseline code, the techniques that historically lifted scores, and a follow-up drill. Start here after the stratum pages.

What this site covers

Each module is a self-contained page. The study plan stitches them into a weekly progression from math foundations through transformer fine-tuning.

Orient

About the contest

Format, eligibility, the three-stage structure (Round 1 → Round 2 → USAAIO Camp), Team USA selection for IOAI / IAIO.

Read briefing →

Foundations

Math you need

Linear algebra, probability and statistics, multivariable calculus, convex optimization — only the parts that actually show up in ML.

Review math →

Tooling

Python data stack

NumPy, pandas, matplotlib, seaborn, scikit-learn — environment setup, idioms, common pitfalls.

Open toolkit →

Learn

Classical ML

Regression, classification, ensembles, cross-validation, clustering, dimensionality reduction — the scikit-learn surface.

Browse models →

Deep Learning

PyTorch & neural nets

Tensors, autograd, MLPs, the standard layers, forward pass and backpropagation, training loops, regularization.

Open notebook →

Modern AI

Attention & transformers

Tokenization, embeddings, self-attention, transformer blocks, pre-training and fine-tuning, NLP and vision applications.

See architecture →

Drill

Round 2 theory drills

Thirty short-answer problems with full collapsible solutions — softmax Jacobians, Hoeffding bounds, receptive fields, KV-cache math, the DDPM forward process.

Open drill bank →

Vision

U-Net & segmentation

Encoder-decoder CNN with skip connections — biomedical and semantic segmentation, and the denoiser inside every diffusion model.

Open U-Net →

Generative

Variational autoencoders

Probabilistic autoencoder with a Gaussian latent. ELBO, reparameterisation, KL divergence — and the latent compressor inside Stable Diffusion.

Open VAE →

Generative

Diffusion models (DDPM)

Forward noising, reverse denoising, the simplified epsilon-prediction loss, and how Stable Diffusion extends DDPM with latent space + cross-attention.

Open DDPM →

Graphs

Graph neural networks

Message passing, GCN normalised adjacency, GAT attention, readout pooling — molecule property prediction and citation classification.

Open GNN →

Sequential decisions

Reinforcement learning

MDPs, Bellman, Q-learning, REINFORCE, actor-critic, PPO clipped objective. Tabular gridworld + DQN sketch + CartPole policy gradient in PyTorch.

Open RL →

Transformers

Attention variants

MQA, GQA, FlashAttention, linear & sparse attention, sliding-window, RoPE — cost-cutting toolbox for long-context and fast-inference transformers.

Open variants →

Adapt

Fine-tuning · LoRA & QLoRA

Adapt pretrained models on a Colab budget — linear probing, adapters, LoRA, QLoRA, SFT, and DPO without a reward model.

Open fine-tuning →

Practical

Data augmentation

Image, text, tabular & audio aug — RandAugment, MixUp/CutMix, EDA, back-translation, SMOTE, SpecAugment, and TTA at inference.

Open augmentation →

Validate

Model evaluation

Metrics (F1, AUC, MAE, BLEU, FID), validation strategies (stratified / group / rolling-origin / nested CV), bias-variance, leaderboard tactics — probing, ensembling, calibration, threshold tuning.

Open evaluation →

Explain

Interpretability

SHAP, LIME, Grad-CAM, Integrated Gradients, attention rollout. The Round-2 toolkit for explaining what your model learned.

Open interpretability →

MLOps

MLOps & submission packaging

Seeds, env pinning, atomic checkpoints, inference scripts, submission-CSV gotchas (BOM, line endings, sort order), Dockerfile basics, 10-min pre-submit checklist.

Ship clean submissions →

Notebooks

End-to-end Colab notebooks

Four runnable pipelines: Titanic tabular ML, CIFAR-10 PyTorch CNN, DistilBERT IMDB fine-tune, Bayesian A/B test.

Open notebooks →

Mocks

Mock contests

Two full USAAIO-style mocks: a 90-min tabular ML problem (Round-1 style) and a 180-min small-vision problem (Round-2 style). Each ships with a constraints panel, scoring rubric, and a separate reference-solution page.

Sit a mock →

Past contest walkthroughs

Four real past problems from IOAI, NOAI China, and USAAIO (NLP, CV, tabular, theory) with baseline code, improvement playbooks, and source links.

Open archive →

Set · USAAIO 2026

USAAIO 2026 Round 1

Just-released 9-problem set. Forum link-only for now; walkthroughs to be written as Barry sits each problem.

Open forum index →

Set · USAAIO 2025

USAAIO 2025 Round 2

3 problems: 12-part theory, 14-part pipeline, and the 20-part CLIP / flickr30k multimodal mega-task.

Open forum index →

Set · IOAI 2025

IOAI 2025 Individual Contest

Beijing 2025 individual tasks: chicken counting (CV), radar (ML), concepts (NLP). 3 of 6 featured.

Open set →

Set · USAAIO 2025

USAAIO 2025 Round 1

Online qualifier, 5 problems: recurrence, affine NN, EDA/ML pipeline, CNN, transformer attention.

Open set →

Set · IAIO 2024

IAIO 2024 (Russia-led)

6 questions, 22 sub-question forum threads. Distinct from UNESCO-backed IOAI; useful tabular/CV reps.

Open forum index →

Set · IOAI 2024

IOAI 2024 Scientific Round

At-home + on-site: ML matrix feature gen, NLP ciphered language, CV zebra/giraffe weight swap, on-site cow+hydrant. 4 task walkthroughs.

Open set →

Set · IOAI 2024

IOAI 2024 Practical Round

4-hour team challenge: album cover + music video from one song. Brief, cover, video walkthroughs.

Open set →

Set · USAAIO 2024

USAAIO 2024 Round 1 [reconstructed]

2024 PDF not yet archived; syllabus-driven reconstruction: probability, regression, trees, MLP, embeddings.

Open set →

Survival

Engineering survival

Fourteen common ML/DL pitfalls (data leakage, train/inference mismatches, seeding) plus a Colab/Kaggle playbook — runtimes, checkpoints, profiling, submission template.

Avoid the bugs →

Reference

Contest cheatsheets

One dense page: NumPy/pandas/matplotlib/sklearn idioms, PyTorch shape ops and training loop boilerplate, layer shape rules, Hugging Face Trainer, and a complexity table.

Open cheatsheets →

Plan

Six-month plan

Week-by-week schedule from math review through deep learning fluency, calibrated for a Grade 9 ramp.

Open calendar →

Library

Resources

Textbooks, courses, datasets, paper recommendations, and competitive practice grounds (Kaggle, AIcrowd).

Browse links →

Why an AI olympiad is different

Theory and code are both graded. Pure math kids who can't ship a notebook fail. Pure coders who can't reason about gradients fail. You need both.
Datasets are part of the problem. Loading, cleaning, splitting, and feature engineering carry as many points as model choice.
Compute is constrained. Problems often run on a fixed compute budget, and Round 1 is CPU-only, so a 100M-parameter transformer is rarely the right answer; a 200K-parameter MLP often is.
Reproducibility matters. Random seeds, deterministic training, and version pinning are part of the deliverable.
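
To make the compute and reproducibility points concrete, here is a minimal sketch in plain Python. The layer widths (784, 256, 10), the helper names `count_mlp_params` and `seeded_split`, and the seed value are my own illustrative assumptions, not anything taken from a contest statement. The first helper shows how quickly a small MLP reaches the 200K-parameter range; the second shows a train/validation split that comes out identical on every run for a given seed.

```python
import random


def count_mlp_params(layer_sizes):
    """Count weights and biases of a fully connected MLP.

    layer_sizes lists the width of every layer, input first, output last.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        # One weight matrix (fan_in x fan_out) plus one bias vector (fan_out).
        total += fan_in * fan_out + fan_out
    return total


def seeded_split(n_rows, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx); the same seed always gives the same split."""
    rng = random.Random(seed)  # local generator, global random state untouched
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_val = int(n_rows * val_fraction)
    return indices[n_val:], indices[:n_val]


# Hypothetical architecture: 784 inputs, one hidden layer of 256 units, 10 classes.
print(count_mlp_params([784, 256, 10]))  # 203530

# Two calls with the same seed produce the same split.
train_a, val_a = seeded_split(100, seed=42)
train_b, val_b = seeded_split(100, seed=42)
print(train_a == train_b and val_a == val_b)  # True
```

In a real notebook the same idea extends to seeding NumPy and PyTorch as well; this sketch sticks to the standard library so it runs anywhere.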

Big-picture reminders

Three stages. Round 1 (online qualifier, 300+ participants) → Round 2 (in-person, threshold-based, ~19% advance; 76 finalists in 2025) → USAAIO Camp (June, at MIT in 2026; Harvard hosted in prior years). All answers are submitted through Google Colab. Round 1 is CPU-only; Round 2 may use L4 GPUs.

Language. Python is the de facto language. The required libraries — NumPy, pandas, matplotlib, scikit-learn, PyTorch — are all standard, but you must be comfortable enough to debug them under contest pressure.

The international path. Top scorers in Round 2 are invited to the USAAIO Camp (held in June, at MIT in 2026). Camp team-selection tests, not Round 2 itself, pick Team USA for IOAI and IAIO. The camp is the real gate.

My notebook for the USA AI Olympiad.
