What this site covers
Each module is a self-contained page. The study plan stitches them into a weekly progression from math foundations through transformer fine-tuning.
About the contest
Format, eligibility, the three-stage structure (Round 1 → Round 2 → USAAIO Camp), Team USA selection for IOAI / IAIO.
Math you need
Linear algebra, probability and statistics, multivariable calculus, convex optimization — only the parts that actually show up in ML.
Python data stack
NumPy, pandas, matplotlib, seaborn, scikit-learn — environment setup, idioms, common pitfalls.
Classical ML
Regression, classification, ensembles, cross-validation, clustering, dimensionality reduction — the scikit-learn surface.
PyTorch & neural nets
Tensors, autograd, MLPs, the standard layers, the forward pass and backpropagation, training loops, regularization.
Attention & transformers
Tokenization, embeddings, self-attention, transformer blocks, pre-training and fine-tuning, NLP and vision applications.
Round 2 theory drills
Thirty short-answer problems with full collapsible solutions — softmax Jacobians, Hoeffding bounds, receptive fields, KV-cache math, the DDPM forward process.
U-Net & segmentation
Encoder-decoder CNN with skip connections: biomedical and semantic segmentation, and the denoiser inside many diffusion models.
Variational autoencoders
Probabilistic autoencoder with a Gaussian latent. ELBO, reparameterisation, KL divergence — and the latent compressor inside Stable Diffusion.
Diffusion models (DDPM)
Forward noising, reverse denoising, the simplified epsilon-prediction loss, and how Stable Diffusion extends DDPM with latent space + cross-attention.
Graph neural networks
Message passing, GCN normalised adjacency, GAT attention, readout pooling — molecule property prediction and citation classification.
Reinforcement learning
MDPs, Bellman, Q-learning, REINFORCE, actor-critic, PPO clipped objective. Tabular gridworld + DQN sketch + CartPole policy gradient in PyTorch.
Attention variants
MQA, GQA, FlashAttention, linear & sparse attention, sliding-window, RoPE — cost-cutting toolbox for long-context and fast-inference transformers.
Fine-tuning · LoRA & QLoRA
Adapt pretrained models on a Colab budget — linear probing, adapters, LoRA, QLoRA, SFT, and DPO without a reward model.
Data augmentation
Image, text, tabular & audio augmentation: RandAugment, MixUp/CutMix, EDA, back-translation, SMOTE, SpecAugment, and TTA at inference.
Model evaluation
Metrics (F1, AUC, MAE, BLEU, FID), validation strategies (stratified / group / rolling-origin / nested CV), bias-variance, leaderboard tactics — probing, ensembling, calibration, threshold tuning.
Interpretability
SHAP, LIME, Grad-CAM, Integrated Gradients, attention rollout. The Round-2 toolkit for explaining what your model learned.
MLOps & submission packaging
Seeds, env pinning, atomic checkpoints, inference scripts, submission-CSV gotchas (BOM, line endings, sort order), Dockerfile basics, 10-min pre-submit checklist.
End-to-end Colab notebooks
Four runnable pipelines: Titanic tabular ML, CIFAR-10 PyTorch CNN, DistilBERT IMDB fine-tune, Bayesian A/B test.
Mock contests
Two full USAAIO-style mocks: a 90-min tabular ML problem (Round-1 style) and a 180-min small-vision problem (Round-2 style). Each ships with a constraints panel, scoring rubric, and a separate reference-solution page.
Past contest walkthroughs
Four real IOAI / USAAIO past problems (NLP, CV, tabular, theory) with baseline code, improvement playbooks, and source links.
USAAIO 2026 Round 1
Just-released 9-problem set. Forum link-only for now; walkthroughs to be written as Barry sits each problem.
USAAIO 2025 Round 2
3 problems: 12-part theory, 14-part pipeline, and the 20-part CLIP / flickr30k multimodal mega-task.
IOAI 2025 Individual Contest
Beijing 2025 individual tasks: chicken counting (CV), radar (ML), concepts (NLP). 3 of 6 featured.
USAAIO 2025 Round 1
Online qualifier, 5 problems: recurrence, affine NN, EDA/ML pipeline, CNN, transformer attention.
IAIO 2024 (Russia-led)
6 questions, 22 sub-question forum threads. Distinct from UNESCO-backed IOAI; useful tabular/CV reps.
IOAI 2024 Scientific Round
At-home + on-site: ML matrix feature gen, NLP ciphered language, CV zebra/giraffe weight swap, on-site cow+hydrant. 4 task walkthroughs.
IOAI 2024 Practical Round
4-hour team challenge: album cover + music video from one song. Brief, cover, video walkthroughs.
USAAIO 2024 Round 1 [reconstructed]
2024 PDF not yet archived; syllabus-driven reconstruction: probability, regression, trees, MLP, embeddings.
Engineering survival
Fourteen common ML/DL pitfalls (data leakage, train/inference mismatches, seeding) plus a Colab/Kaggle playbook — runtimes, checkpoints, profiling, submission template.
Contest cheatsheets
One dense page: NumPy/pandas/matplotlib/sklearn idioms, PyTorch shape ops and training loop boilerplate, layer shape rules, Hugging Face Trainer, and a complexity table.
Six-month plan
Week-by-week schedule from math review through deep learning fluency, calibrated for a Grade 9 ramp.
Resources
Textbooks, courses, datasets, paper recommendations, and competitive practice grounds (Kaggle, AIcrowd).
Suggested reading order
- Get oriented. Read About the contest to understand the three-stage format (Round 1 → Round 2 → USAAIO Camp) and how Team USA is actually picked.
- Lock in the math. Work through the math review. You don't need all of multivariable calculus — just the slice that powers gradient descent and PCA.
- Get fluent in the Python data stack. Work through the Python toolkit page (NumPy, pandas, matplotlib) until you can manipulate arrays without thinking.
- Sweep classical ML. The classical ML page covers the main scikit-learn families in one sitting: linear models, trees, ensembles, clustering.
- Build a neural net from scratch. The deep learning page walks through a manual MLP in NumPy, then the same thing in PyTorch.
- Understand attention. The transformers page goes from tokenizer to scaled dot-product attention to a working transformer block.
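As a taste of where the reading order ends up, here is a minimal NumPy sketch of single-head scaled dot-product attention, softmax(QKᵀ / √d_k)V, with no masking. The function name, variable names, and toy shapes are illustrative assumptions, not code taken from the transformers page.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Single-head attention without masking: softmax(q @ k.T / sqrt(d_k)) @ v."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                   # (n_queries, n_keys)
    scores = scores - scores.max(axis=-1, keepdims=True)  # subtract row max for a stable softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # each row now sums to 1
    return weights @ v, weights

# Toy shapes (assumed for illustration): 4 queries, 6 keys/values, d_k = 8, d_v = 16.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(6, 8))
v = rng.normal(size=(6, 16))

out, weights = scaled_dot_product_attention(q, k, v)
print(out.shape)             # one output row per query
print(weights.sum(axis=-1))  # attention weights per query
```

Each output row is a weighted average of the value rows, so the output has one row per query and the width of `v`.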
Why an AI olympiad is different
- Theory and code are both graded. Pure math kids who can't ship a notebook fail. Pure coders who can't reason about gradients fail. You need both.
- Datasets are part of the problem. Loading, cleaning, splitting, and feature engineering carry as many points as model choice.
- Compute is constrained. Final-round problems often run on a fixed CPU budget — a 100M-parameter transformer isn't the right answer; a 200K-parameter MLP often is.
- Reproducibility matters. Random seeds, deterministic training, and version pinning are part of the deliverable.
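Two of the points above can be sketched in a few lines: counting the parameters of a small MLP to check it fits a compute budget, and seeding the random number generators so a run can be repeated. The helper names and the 784-256-10 layer sizes are hypothetical choices for illustration. Seeding alone does not guarantee bit-identical results on every backend, so treat this as a starting point rather than a complete recipe.

```python
import random

def seed_everything(seed: int = 0) -> None:
    """Seed whichever RNGs are installed (hypothetical helper, not a library API)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)      # legacy global NumPy RNG
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)   # PyTorch default generator
    except ImportError:
        pass

def mlp_param_count(layer_sizes):
    """Weights plus biases of a fully connected net with the given layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Same seed, same draws.
seed_everything(42)
first = [random.random() for _ in range(3)]
seed_everything(42)
second = [random.random() for _ in range(3)]
print(first == second)

# An assumed 784 -> 256 -> 10 MLP lands at roughly 200K parameters.
print(mlp_param_count([784, 256, 10]))
```

The count works out to 784·256 + 256 + 256·10 + 10 = 203,530, which is the scale of model the compute bullet has in mind.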