Rubric self-check

Grade your own draft the way a COMAP judge would. Each row scores 0–5 with explicit anchors at 0, 2, and 5 drawn from COMAP's published judges' commentaries on Outstanding and Finalist papers. The total updates as you type; the band on the right gives a rough mapping to the COMAP recognition tiers.

How this maps to real judging. COMAP judges read in two passes: a fast triage pass (does the summary make sense? are the assumptions sane? is there a model?) and a slow content pass on the papers that survive triage. The first three rows below are the triage signals. If those are weak, the rest does not get read carefully. Score yourself honestly on those three first.

I · Triage signals (judge's first 5 minutes)

1. Summary Sheet quality. A judge should be able to read only the one-page summary and know what you did, what you found, and what you recommend.
  • 0: Restates the prompt; no results; no recommendation.
  • 2: Describes the method but not the answer.
  • 5: Restates problem, names approach, lists concrete findings with numbers, and gives 3–4 bullet recommendations.
2. Problem restatement. In your own words — not paraphrased from the prompt.
  • 0: Missing, or copy-pasted from the COMAP PDF.
  • 2: Lightly paraphrased.
  • 5: Restated with the team's own framing, scope explicitly bounded, sub-questions enumerated.
3. Assumptions, each with a justification. Numbered list; every assumption explains why and where it's used.
  • 0: No assumptions section, or an unjustified bullet list.
  • 2: Assumptions present but generic ("we assume the model is linear").
  • 5: 6–12 numbered assumptions, each with a "why" and a forward reference to the equation that uses it.

II · Modelling content

4. Model validity. Is the math right and appropriate for the problem?
  • 0: Wrong technique, or right technique used incorrectly.
  • 2: Reasonable technique but undefended choices.
  • 5: Technique justified against alternatives, derivations shown, units consistent, no algebra errors.
5. Variables and notation table.
  • 0: Missing.
  • 2: Present, but incomplete or no units.
  • 5: Every symbol used in equations is defined, with units and a one-line meaning.
6. Numbered equations with derivations.
  • 0: Equations dropped in without explanation.
  • 2: Equations named but not derived.
  • 5: Each key equation is numbered, derived from stated assumptions, and referenced by number later.
7. Application to required scenarios. Does the model answer every part of the prompt?
  • 0: Only the easiest sub-question addressed.
  • 2: Most sub-questions touched but some dropped.
  • 5: Every requirement in the prompt addressed, in order, with explicit cross-references.
8. Results presentation (tables and figures).
  • 0: Walls of text, no figures, raw output dumps.
  • 2: Figures present but no captions or unclear axes.
  • 5: Every figure has a number and a caption and is referenced from the body; every table is summarized in one sentence.

III · Robustness

9. Sensitivity analysis. What happens when you perturb the inputs?
  • 0: Missing.
  • 2: One parameter swept, no interpretation.
  • 5: Multiple parameters perturbed (Monte Carlo or grid), results tabulated, rank stability or output stability quantified, conclusions revised where the perturbations warrant it.
10. Strengths and weaknesses, both honest.
  • 0: Only strengths listed, or only weaknesses.
  • 2: Both present but weaknesses are throwaway ("we could use more data").
  • 5: Specific named weaknesses with explanation of how they affect results; specific strengths tied to the modelling choices.
11. Verification / validation. Does the model produce sensible answers on known cases?
  • 0: No verification.
  • 2: Sanity check mentioned but not executed.
  • 5: Concrete known-answer cases run through the model and shown to match.
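For row 9, the "rank stability quantified" anchor can be made concrete with a small Monte Carlo loop. The sketch below is illustrative only: `toy_model`, its three options, the parameter names `w_benefit` and `w_cost`, and the noise levels are all hypothetical stand-ins, and a real paper would plug in its own model and parameter ranges.

```python
import random
from typing import Callable, Dict, List

Params = Dict[str, float]


def rank(scores: Dict[str, float]) -> List[str]:
    """Order option names from best (highest score) to worst."""
    return sorted(scores, key=lambda name: scores[name], reverse=True)


def toy_model(params: Params) -> Dict[str, float]:
    """Hypothetical stand-in for a team's model: scores three options."""
    features = {
        "A": {"cost": 0.9, "benefit": 0.4},
        "B": {"cost": 0.5, "benefit": 0.7},
        "C": {"cost": 0.2, "benefit": 0.5},
    }
    return {
        name: params["w_benefit"] * f["benefit"] - params["w_cost"] * f["cost"]
        for name, f in features.items()
    }


def rank_stability(
    model: Callable[[Params], Dict[str, float]],
    baseline: Params,
    rel_noise: float = 0.10,
    trials: int = 1000,
    seed: int = 0,
) -> float:
    """Fraction of trials whose full ranking matches the baseline ranking.

    Every parameter is scaled independently by a factor drawn uniformly
    from [1 - rel_noise, 1 + rel_noise].
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    base_rank = rank(model(baseline))
    unchanged = 0
    for _ in range(trials):
        perturbed = {
            k: v * (1.0 + rng.uniform(-rel_noise, rel_noise))
            for k, v in baseline.items()
        }
        if rank(model(perturbed)) == base_rank:
            unchanged += 1
    return unchanged / trials


baseline = {"w_benefit": 1.0, "w_cost": 0.5}  # hypothetical baseline weights
for noise in (0.10, 0.20):
    share = rank_stability(toy_model, baseline, rel_noise=noise)
    print(f"noise ±{noise:.0%}: ranking unchanged in {share:.1%} of trials")
```

A share close to 100% suggests the ranking is robust at that noise level. A lower share is worth tabulating and discussing in the paper instead of hiding.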

IV · Communication

12. Non-technical letter / article / blog. Different voice, different audience.
  • 0: Missing.
  • 2: Copy-paste of the executive summary.
  • 5: Written for the named stakeholder, jargon stripped, conclusions stated plainly, length 1–2 pages.
13. Prose quality and structure.
  • 0: Disorganised; section headings missing or inconsistent.
  • 2: Standard sections present but uneven prose.
  • 5: Clear narrative thread; sections connect; figures referenced naturally; comfortably under the 25-page limit.
14. References and citations.
  • 0: No references, or fabricated ones.
  • 2: Some references but inconsistent format and uncited in body.
  • 5: Real, verifiable sources, cited at point of use, formatted consistently.

V · Compliance

15. AI usage report. Required by current COMAP rules; sits outside the 25-page count.
  • 0: Missing, despite AI having been used.
  • 2: Present but vague ("we used ChatGPT for editing").
  • 5: Tool, prompt, where used, and verification step listed for every use; full logs in an appendix.
16. Anonymity and formatting. Team number on every page, no names anywhere, page limit respected.
  • 0: Names or school visible; over 25 pages; missing team number.
  • 2: One slip (e.g., a stray name in metadata).
  • 5: Fully anonymous; team-number/page-N-of-25 header on every page; PDF properties scrubbed.

Weighted total

Each criterion carries a weight of 1–3. Modelling, sensitivity, and "did you answer the question" are weighted highest. The maximum possible total is 5 times the sum of the weights.
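The arithmetic behind the weighted total can be sketched as follows. The weights in this example are hypothetical placeholders that only respect the stated 1–3 range, since the per-row weights are not listed in the text, and `weighted_total` is an illustrative name.

```python
from typing import Dict, Tuple

MAX_SCORE = 5  # each rubric row is scored 0-5


def weighted_total(
    scores: Dict[int, int], weights: Dict[int, int]
) -> Tuple[int, int, float]:
    """Return (total, max_total, percent) for rows keyed by row number."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same rows")
    for row, score in scores.items():
        if not 0 <= score <= MAX_SCORE:
            raise ValueError(f"row {row}: score {score} is outside 0-{MAX_SCORE}")
    total = sum(scores[row] * weights[row] for row in scores)
    max_total = MAX_SCORE * sum(weights.values())
    return total, max_total, 100.0 * total / max_total


# Hypothetical weights for three rows only (NOT the rubric's real values).
example_weights = {4: 3, 9: 3, 16: 1}
example_scores = {4: 4, 9: 2, 16: 5}

total, max_total, pct = weighted_total(example_scores, example_weights)
print(f"{total} / {max_total} = {pct:.1f}%")  # prints "23 / 35 = 65.7%"
```

Reporting the percentage and not the raw total keeps the result comparable if the weights change.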

What the bands mean

The bands below are calibrated against the rough recognition rates COMAP publishes each year. They are approximate — judges read in panels and a high total does not guarantee Outstanding — but they are a useful self-check.

Score (% of max) | Likely tier | What's typically true at this level
≥ 90% | Outstanding territory | All four sections strong; sensitivity and verification done; non-technical letter is real.
75–89% | Finalist range | Model solid and sensitivity present, but communication or robustness has gaps.
60–74% | Meritorious range | Model works, addresses prompt, but missing one of: sensitivity, verification, polished letter.
45–59% | Honorable Mention range | Effort visible, but two or more of the triage signals are weak.
< 45% | Successful Participant | Submitted but with major gaps.
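The band table amounts to a threshold lookup, sketched below. The function name `likely_tier` is illustrative, and one assumption goes beyond the table: fractional percentages that fall between two listed bands (for example 89.5%) are assigned to the lower band.

```python
def likely_tier(percent: float) -> str:
    """Map a percent-of-max score to the rough tier from the band table."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    # (lower bound, tier), checked from the highest band down.
    bands = [
        (90.0, "Outstanding territory"),
        (75.0, "Finalist range"),
        (60.0, "Meritorious range"),
        (45.0, "Honorable Mention range"),
    ]
    for floor, tier in bands:
        if percent >= floor:
            return tier
    return "Successful Participant"


for pct in (92.0, 65.7, 30.0):
    print(f"{pct:.1f}% -> {likely_tier(pct)}")
```

As the text notes, the bands are approximate, so the returned label is a self-check and not a prediction.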

How to use this rubric during the contest

  1. Day 10 of 14: First self-grade. You should be at 60% or more. If not, the model itself is the problem — stop polishing and re-do the model.
  2. Day 12 of 14: Second self-grade. Rows 1, 7, 9, and 12 are the easiest wins at this stage — they cost hours, not days, and they swing the score most.
  3. Day 14 morning: Final self-grade, plus the formatting/anonymity row (16). Print to PDF, check the PDF properties, then submit.
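For the "check the PDF properties" step, a crude first pass can be scripted with the standard library alone. This is a minimal sketch under loud assumptions: it only scans raw bytes for literal-string metadata entries and for names you supply, so it cannot see text inside compressed streams, hex-encoded strings, or values containing parentheses. A clean result is therefore not proof of anonymity, and the document properties dialog of a PDF viewer should still be checked. The function name `scan_pdf_bytes` and the sample bytes are hypothetical.

```python
import re
from typing import List

# Information-dictionary keys that commonly carry identifying text.
INFO_KEYS = (b"/Author", b"/Title", b"/Subject", b"/Keywords")


def scan_pdf_bytes(data: bytes, forbidden: List[str]) -> List[str]:
    """Return findings from a crude scan of raw, uncompressed PDF bytes."""
    findings = []
    for key in INFO_KEYS:
        # Matches the simple literal form "/Author (Jane Doe)" only;
        # nested or escaped parentheses are not handled.
        pattern = re.escape(key) + rb"\s*\(([^)]*)\)"
        for match in re.finditer(pattern, data):
            value = match.group(1).decode("latin-1").strip()
            if value:
                findings.append(f"{key.decode()} = {value!r}")
    lowered = data.lower()  # bytes.lower() lowercases ASCII letters only
    for text in forbidden:
        if not text:
            continue
        needle = text.lower()
        # Try a single-byte-style encoding and UTF-16BE, both used in PDFs.
        if any(needle.encode(enc) in lowered for enc in ("utf-8", "utf-16-be")):
            findings.append(f"forbidden text found: {text!r}")
    return findings


# Hypothetical sample bytes standing in for a real file's contents.
sample = b"%PDF-1.4\n1 0 obj\n<< /Author (Jane Doe) /Title () >>\nendobj\n"
for finding in scan_pdf_bytes(sample, ["Jane Doe", "Example High School"]):
    print(finding)
```

On a real submission, the bytes would come from something like `Path("paper.pdf").read_bytes()` (a hypothetical file name), with team members' names and the school name passed as the `forbidden` list.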