← 2022 Problem B outline · All past problems

Worked sample paper · HiMCM 2022 Problem B

A complete, judge-style reference paper for CO₂ and Global Warming. This is not an official COMAP solution — it is a learning artifact written so a student can see what every section of a HiMCM paper should actually contain. Read it after attempting the problem yourself.

How to use this document. Try the problem cold first using the Mauna Loa CO₂ record and NASA GISTEMP land-ocean series — both are public and downloadable in under five minutes. Then read this paper alongside your draft. Every assumption, equation, table, and sensitivity result is labelled so you can map it back to the anatomy-of-a-paper skeleton. Numbers that could not be re-derived inside a 14-day contest window from the public data are flagged [illustrative] — they are plausible placeholders, not authoritative findings.

Regression Time-series forecasting ARIMA Energy-balance model

Summary Sheet

Problem restated. The Mauna Loa Observatory has measured atmospheric CO₂ since 1958, and NASA's GISTEMP record has tracked land-ocean temperature anomalies on a roughly overlapping window. The problem asks us to (i) fit multiple mathematical models to the CO₂ record, evaluate the 2004 "largest 10-year increase ever" claim, and predict CO₂ in 2100; (ii) model the relationship between CO₂ and global temperature, and predict when the global anomaly will reach +1.25 °C, +1.50 °C, and +2.00 °C above the pre-industrial baseline; and (iii) communicate the finding to a general readership.

Approach. We fit five candidate models to annual Mauna Loa CO₂ from 1959–2021: linear, quadratic, cubic, exponential-excess-over-pre-industrial, and logistic. We add an ARIMA(1,2,1) baseline because the CO₂ series is non-stationary and twice differencing renders it approximately white-noise residual. We then couple CO₂ to temperature via a physically motivated logarithmic forcing law, T = T₀ + λ · log₂(C/C₀), where the climate sensitivity λ is calibrated jointly on the GISTEMP record and on IPCC AR6's assessed range. The sensitivity analysis sweeps the IPCC equilibrium-climate-sensitivity range of 1.5 K to 4.5 K per CO₂ doubling and tests how the choice of baseline year shifts the temperature crossings.

Key findings.

The 2004 "largest 10-year increase" claim was true at the time of the problem brief but has since been overtaken: the rolling 10-year ΔC peaked at 24.6 ppm for the decade ending 2017 [illustrative], and the decade ending 2004 ranked roughly 6th by the time we re-ran the analysis through 2021.
Among smooth deterministic models the quadratic in time fits best on AIC (ΔAIC ≈ 12 below the exponential-excess model) while staying physically defensible through mid-century. It predicts ~480 ppm in 2050 and ~605 ppm in 2100 [illustrative].
The ARIMA(1,2,1) forecast for 2050 is statistically consistent with the quadratic (~478 ppm point estimate) but the 95% prediction interval at 2100 fans out to [540, 720] ppm — useful as an uncertainty band, not as a single number.
Using λ = 3.0 K per doubling (the IPCC AR6 central estimate), the model predicts the world crosses +1.5 °C around 2031 and +2.0 °C around 2055 [illustrative]. Across the IPCC sensitivity range 1.5–4.5 K, the +1.5 °C crossing varies by ±10 years.

Recommendations to the Scientific Today readership.

Cite an uncertainty interval, not a single year, when reporting Paris threshold crossings. The dominant source of uncertainty is climate sensitivity, not the CO₂ trajectory itself.
Treat the quadratic CO₂ extrapolation as a baseline-with-no-mitigation scenario. Any policy intervention bends the curve below it.
Do not interpret a single decade's record (e.g., 2004) as a perpetual maximum — the rolling 10-year increase has continued to climb through the 2010s.

1. Introduction and Background

Charles David Keeling began continuous CO₂ monitoring at Mauna Loa Observatory, Hawaiʻi, in March 1958 (Keeling, 1960). The resulting record — the Keeling Curve — is the longest direct measurement of atmospheric carbon dioxide and is the single most-cited dataset in modern climate science. Annual-mean values rise from ~316 ppm in 1959 to ~417 ppm in 2021, with a clear acceleration after 1970 (NOAA Global Monitoring Laboratory, 2022).

NASA's Goddard Institute for Space Studies maintains GISTEMP v4, a global land-ocean surface temperature anomaly series referenced to a 1951–1980 mean. By convention the "pre-industrial" baseline is the late 19th century (1880–1900 or 1850–1900 depending on source), and 2021 stands at roughly +0.85 °C above the 1951–1980 mean, or about +1.1 °C above pre-industrial (Lenssen et al., 2019; IPCC AR6 WG1).

Physically, atmospheric CO₂ acts as a long-lived greenhouse gas whose radiative forcing is logarithmic in concentration: doubling CO₂ contributes a forcing of ~3.7 W/m² (Myhre et al., 1998), which the climate system translates into a temperature rise via the equilibrium climate sensitivity. The IPCC AR6 assessed the likely sensitivity range as 2.5–4.0 K per doubling, with a very likely range of 2.0–5.0 K (Forster et al., 2021; Sherwood et al., 2020). For modelling purposes we use the conventional 1.5–4.5 K span that has appeared since FAR (IPCC, 1990) and remains the standard sensitivity-analysis envelope.

The HiMCM 2022 brief asks for two related modelling exercises. The first is a pure curve-fit to CO₂(t): a familiar regression problem with a non-stationary target. The second is a joint CO₂–temperature model, which can be tackled either by regressing T on t directly, regressing T on C, or invoking the logarithmic-forcing physics. Judges reward the third approach when it is presented alongside the empirical ones, because it ties the statistical fit back to first principles.

2. Assumptions and Justifications

Every assumption below is used somewhere in Section 4 — we cite the equation where it enters.

Annual-mean Mauna Loa CO₂ is a faithful proxy for global background CO₂. Why: Mauna Loa is far from continental sources and sits above the marine boundary layer, so its de-seasonalised annual mean tracks the global average to within ~0.5 ppm (NOAA GML, 2022). Using a single station avoids merging artefacts. (Used in Eq. 1–4.)
CO₂ growth is smooth on the decadal scale. Why: Year-to-year noise is small (σ < 1 ppm after de-trending) relative to the secular rise. Deterministic smooth models therefore capture > 99% of variance and judges will not penalise the missing high-frequency fluctuations. (Used in Eq. 1–4.)
Pre-industrial CO₂ is C₀ = 280 ppm. Why: Antarctic ice-core reconstructions (Lüthi et al., 2008) place the late-Holocene baseline at 275–285 ppm; 280 ppm is the conventional value used by IPCC. The exponential-excess model uses C(t) − 280. (Used in Eq. 3 and Eq. 7.)
The temperature record is a stationary noisy realisation of an underlying smooth warming signal. Why: GISTEMP year-to-year variability includes ENSO and volcanic terms (Pinatubo 1991, El Niño 1998/2016). We treat these as zero-mean noise on top of the secular trend; the residual SD is ~0.10 °C. (Used in Eq. 6–7.)
Radiative forcing of CO₂ is logarithmic in concentration. Why: This is a textbook result of line-by-line radiative-transfer calculations (Myhre et al., 1998) and underlies every IPCC energy-balance estimate. It motivates the T = T₀ + λ · log₂(C/C₀) form rather than a linear T-on-C regression. (Used in Eq. 7–8.)
Equilibrium and transient sensitivities are conflated for the HiMCM-scale model. Why: A true energy-balance model separates equilibrium climate sensitivity (ECS) from transient climate response (TCR ≈ 0.6 · ECS). On a 30–80-year forecast horizon, the system has not yet equilibrated, so the effective sensitivity is closer to TCR. We absorb this into a single calibrated λ rather than running a two-box model. (Used in Eq. 7; revisited in Section 7.)
Other greenhouse gases (CH₄, N₂O, halocarbons) are not modelled explicitly. Why: They contribute roughly 30% of total anthropogenic forcing (IPCC AR6 WG1, Ch. 7) but co-vary strongly with CO₂ at decadal scale, so a CO₂-only model with a slightly inflated λ captures their net effect for forecasting purposes. (Used in Eq. 7.)
The forecast horizon is 2100; structural change beyond that is out of scope. Why: Beyond 2100 ocean heat-content equilibration, carbon-cycle feedbacks, and permafrost emissions dominate, and a curve-fit ceases to be defensible. We bound every model to 2100. (Used throughout.)
The 2004 claim is evaluated on the rolling 10-year ΔC₁₀(t) = C(t) − C(t − 10). Why: This is the natural interpretation of "10-year increase ending in year t". A centred-difference or moving-average variant gives the same ranking on this record. (Used in Eq. 9.)
The 14-day modelling window allows ARIMA fitting and a sensitivity sweep, but not a full GCM emulator. Why: Statistical models (regression, ARIMA) plus the logarithmic-forcing physics are within reach; coupling a carbon-cycle box model and an ocean-heat box would over-engineer the answer. (Used in Section 5.)

3. Variables and Notation

Symbol	Meaning	Units
t	Calendar year	year
τ	Years since reference year (τ = t − 1959)	year
C(t)	Annual-mean Mauna Loa CO₂ concentration	ppm
C₀	Pre-industrial CO₂ baseline (= 280 ppm)	ppm
K	Logistic carrying capacity for CO₂	ppm
r	Intrinsic growth rate (exponential / logistic)	1/year
T(t)	GISTEMP land-ocean anomaly	°C
T₀	Reference temperature at C = C₀	°C
λ	Effective climate sensitivity (per CO₂ doubling)	K
F	Radiative forcing from CO₂	W/m²
α	Forcing coefficient (= 5.35 W/m² per natural-log doubling)	W/m²
Δ₁₀(t)	Rolling 10-year increase, C(t) − C(t − 10)	ppm
ε_t	ARIMA innovation, ε_t ~ N(0, σ²)	ppm
AIC	Akaike Information Criterion for model selection	—
RMSE	Root-mean-square error of fit residuals	ppm or °C

4. Model Formulation

4.1 Deterministic CO₂ models

We fit four nested smooth models to the annual Mauna Loa series, all parameterised on τ = t − 1959.

Equation (1) — linear:

C(t) = a + b · τ

Equation (2) — quadratic (acceleration term):

C(t) = a + b · τ + c · τ²

Equation (3) — exponential excess over pre-industrial:

C(t) − C₀ = A · exp(r · τ),    C₀ = 280 ppm

Equation (4) — logistic with carrying capacity K:

C(t) = C₀ + (K − C₀) / [1 + exp(−r · (τ − τ_m))]

Eq. 1 is a sanity baseline that we expect to underfit. Eq. 2 is the simplest model that captures the visually obvious post-1970 acceleration. Eq. 3 takes the physically motivated stance that the excess CO₂ over the pre-industrial background — not absolute CO₂ — is the quantity that should grow exponentially under sustained fossil-fuel emissions. Eq. 4 imposes a finite carrying capacity K which we treat as a free parameter; if the fit pulls K toward implausible values (e.g., K > 1500 ppm) we treat that as the model itself signalling that the data does not yet constrain saturation.

4.2 ARIMA baseline

The first difference ∇C_t = C_t − C_t−1 is still trending; the second difference is approximately stationary, with autocorrelation at lag 1 and rapid decay thereafter. This suggests an ARIMA(1,2,1) model.

Equation (5) — ARIMA(1,2,1):

(1 − φ₁ B)(1 − B)² C_t  =  (1 + θ₁ B) ε_t,    ε_t ~ N(0, σ²)

where B is the backshift operator. The fit returns φ₁, θ₁, σ² by maximum likelihood (statsmodels.tsa.arima_model). We report the point forecast and the 95% prediction interval at each horizon.

4.3 Temperature trend in time

As a purely empirical baseline we regress GISTEMP on time.

Equation (6) — polynomial temperature trend:

T(t) = p₀ + p₁ · τ + p₂ · τ²

The quadratic term is statistically significant (p < 0.001) and reflects the post-1975 acceleration of warming.

4.4 Physical CO₂ → temperature law

Radiative forcing of CO₂ relative to the pre-industrial baseline is, to high accuracy (Myhre et al., 1998),

Equation (7) — radiative forcing:

F(C) = α · ln(C / C₀),    α ≈ 5.35 W/m²

The equilibrium temperature change associated with forcing F is ΔT = F / γ, where γ is the climate feedback parameter (W/m²/K). Equivalently, defining λ ≡ α · ln 2 / γ as the climate sensitivity per CO₂ doubling, we obtain the working law used throughout:

Equation (8) — logarithmic CO₂–temperature relationship:

T(t) = T₀ + λ · log₂( C(t) / C₀ )

We calibrate (T₀, λ) by ordinary least squares against GISTEMP using the fitted C(t) from Eq. 2 as the regressor. The IPCC AR6 assessed range λ ∈ [2.0, 5.0] K is used as an external prior; if the data-only fit lies inside that range, we accept it, otherwise we re-fit with λ constrained to the AR6 central estimate λ = 3.0 K.

4.5 Rolling-decade increase

Equation (9) — rolling 10-year increase:

Δ₁₀(t) = C(t) − C(t − 10)

We compute Δ₁₀(t) for every t with t − 10 in range and rank the values to evaluate the 2004 claim.

4.6 Time-to-threshold inversion

Given a fitted CO₂ model C(t) and a fitted (T₀, λ), inverting Eq. 8 for T = T* yields the crossing year.

Equation (10) — temperature-threshold crossing:

t*(T*) = inf { t :  T₀ + λ · log₂( C(t) / C₀ )  ≥  T* }

We solve this numerically by evaluating the fitted C(t) on a yearly grid through 2150 and taking the first crossing. For the ARIMA forecast we report the median crossing year plus a 95% prediction-interval band.

5. Solution and Computational Approach

The pipeline is a single Python module of about 250 lines. The sketch below contains the data loaders for Mauna Loa CO₂ and GISTEMP, the five model fits (closed-form for Eq. 1–2, nonlinear for Eq. 3–4 via scipy.optimize.curve_fit, ARIMA for Eq. 5 via statsmodels), and the temperature-threshold solver. It runs end-to-end on the two public CSVs in < 10 s on a 2024 laptop.

"""himcm_2022b.py — CO2 and Global Warming, HiMCM 2022 Problem B."""
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from statsmodels.tsa.arima.model import ARIMA

C0 = 280.0                              # pre-industrial CO2, ppm
REF_YEAR = 1959                         # tau = t - REF_YEAR

# ---------- Data loaders ----------
def load_co2(path="co2_annmean_mlo.csv"):
    df = pd.read_csv(path, comment="#")          # NOAA GML columns: year, mean, unc
    df = df.rename(columns={"mean": "co2"})[["year", "co2"]]
    return df[df.year >= 1959].reset_index(drop=True)

def load_gistemp(path="GLB.Ts+dSST.csv"):
    df = pd.read_csv(path, skiprows=1)           # NASA GISS, "J-D" = annual mean
    df = df[["Year", "J-D"]].rename(columns={"Year": "year", "J-D": "anom"})
    df["anom"] = pd.to_numeric(df["anom"], errors="coerce")
    return df.dropna().reset_index(drop=True)

# ---------- Deterministic models ----------
def linear(tau, a, b):           return a + b * tau
def quad(tau, a, b, c):          return a + b * tau + c * tau**2
def exp_excess(tau, A, r):       return C0 + A * np.exp(r * tau)
def logistic(tau, K, r, tau_m):  return C0 + (K - C0) / (1 + np.exp(-r * (tau - tau_m)))

def fit_all(co2):
    tau = (co2.year - REF_YEAR).values.astype(float)
    y   = co2.co2.values
    fits = {}
    fits["linear"]      = curve_fit(linear,      tau, y, p0=[315, 1.5])[0]
    fits["quadratic"]   = curve_fit(quad,        tau, y, p0=[315, 1.0, 0.01])[0]
    fits["exp_excess"]  = curve_fit(exp_excess,  tau, y, p0=[35, 0.02])[0]
    fits["logistic"]    = curve_fit(logistic,    tau, y, p0=[800, 0.03, 120],
                                    maxfev=10000)[0]
    return fits

def aic(y, yhat, k):
    n = len(y)
    rss = np.sum((y - yhat)**2)
    return n * np.log(rss / n) + 2 * k

# ---------- ARIMA(1,2,1) ----------
def fit_arima(co2, horizon=80):
    model = ARIMA(co2.co2.values, order=(1, 2, 1)).fit()
    fc = model.get_forecast(steps=horizon)
    return fc.predicted_mean, fc.conf_int(alpha=0.05)

# ---------- Rolling 10-year delta ----------
def delta10(co2):
    c = co2.set_index("year").co2
    return (c - c.shift(10)).dropna()

# ---------- Temperature law ----------
def temp_law(C, T0, lam, C0=C0):
    return T0 + lam * np.log2(C / C0)

def fit_lambda(co2, gistemp, c_of_t):
    """Joint fit of T0, lambda using fitted C(t) as regressor for T(t)."""
    m = gistemp.merge(co2[["year"]], on="year")
    C_fit = c_of_t(m.year.values - REF_YEAR)
    (T0, lam), _ = curve_fit(temp_law, C_fit, m.anom.values, p0=[-0.3, 3.0])
    return T0, lam

def crossing_year(c_of_t, T0, lam, T_star, t_grid=np.arange(1959, 2151)):
    C = c_of_t(t_grid - REF_YEAR)
    T = temp_law(C, T0, lam)
    above = np.where(T >= T_star)[0]
    return int(t_grid[above[0]]) if len(above) else None

# ---------- Driver ----------
if __name__ == "__main__":
    co2, gistemp = load_co2(), load_gistemp()
    fits = fit_all(co2)
    tau = (co2.year - REF_YEAR).values.astype(float)

    yhat_q = quad(tau, *fits["quadratic"])
    print(f"Quadratic RMSE = {np.sqrt(np.mean((co2.co2 - yhat_q)**2)):.2f} ppm")
    print(f"Quadratic AIC  = {aic(co2.co2.values, yhat_q, 3):.1f}")
    print(f"2050 quad pred = {quad(2050 - REF_YEAR, *fits['quadratic']):.1f} ppm")
    print(f"2100 quad pred = {quad(2100 - REF_YEAR, *fits['quadratic']):.1f} ppm")

    fc_mean, fc_ci = fit_arima(co2, horizon=80)
    print(f"ARIMA 2100 mean = {fc_mean[-1]:.1f} ppm  CI = {fc_ci[-1]}")

    d10 = delta10(co2)
    print("Top 5 rolling 10-year increases:")
    print(d10.sort_values(ascending=False).head())

    c_of_t_quad = lambda tau: quad(tau, *fits["quadratic"])
    T0, lam = fit_lambda(co2, gistemp, c_of_t_quad)
    print(f"Fitted T0 = {T0:.3f} C   lambda = {lam:.2f} K/doubling")
    for Tstar in (1.25, 1.50, 2.00):
        yr = crossing_year(c_of_t_quad, T0, lam, Tstar)
        print(f"  +{Tstar:.2f} C crossing: {yr}")

The full version (~250 lines) adds Figure 1 (CO₂ data + all five fitted curves), Figure 2 (residual diagnostics for the quadratic and ARIMA), Figure 3 (rolling Δ₁₀ bar chart), Figure 4 (CO₂ → T scatter with the fitted logarithmic law), and Figure 5 (temperature trajectory to 2100 with the IPCC sensitivity band shaded).

6. Results

6.1 CO₂ model comparison

Model	RMSE (ppm)	AIC	2050 pred. (ppm)	2100 pred. (ppm)
Linear (Eq. 1)	5.78	251	441	513
Quadratic (Eq. 2)	0.92	−12	478	605
Cubic	0.89	−9	481	618
Exp-excess (Eq. 3)	1.41	34	506	771
Logistic (Eq. 4)	1.05	2	485	662
ARIMA(1,2,1) (Eq. 5)	0.41^†	—	478	631 [540, 720]

All values [illustrative]. ^† ARIMA RMSE is in-sample one-step-ahead, not directly comparable to the deterministic fits. The quadratic wins on AIC among smooth deterministic models; the cubic adds one parameter with negligible AIC gain (Occam's razor favours quadratic). The exponential-excess model overshoots at 2100 because its instantaneous growth rate is itself accelerating in the late-record years.

[Figure 1: Mauna Loa CO₂ 1959–2021 with all five fitted curves overlaid. Quadratic, logistic, and ARIMA-mean are visually indistinguishable through 2030; they fan out by 2080.]

6.2 Was 2004 the largest 10-year increase?

Decade ending	Δ₁₀ (ppm)	Rank
2017	24.6	1
2018	24.4	2
2019	24.2	3
2016	23.8	4
2020	23.5	5
2004	20.0	≈ 13

All values [illustrative]. The original brief's "largest 10-year increase ending in 2004" was likely correct at the time of writing — Δ₁₀(2004) ≈ 20.0 ppm did exceed every prior decade. But by 2014 the rolling increase had already overtaken 2004, and by 2021 the top five decades all end in the 2016–2020 window.

[Figure 3: Bar chart of Δ₁₀(t) from 1968 to 2021. Monotone rising envelope with a mild dip during the 2008–2010 global recession.]

6.3 Temperature–CO₂ relationship

Fitting Eq. 8 by OLS yields T₀ = −0.32 °C and λ = 3.02 K per doubling [illustrative], sitting squarely inside the IPCC AR6 likely range. The Pearson correlation between T(t) and log₂(C(t)/C₀) is r = 0.96.

Threshold	Crossing year (λ = 3.0 K)	Range across λ ∈ [1.5, 4.5] K
+1.25 °C	2025	2019 – 2034
+1.50 °C	2031	2024 – 2042
+2.00 °C	2055	2042 – 2076

All values [illustrative]. The +1.5 °C crossing under the AR6 central estimate falls within the 2030s — consistent with the IPCC AR6 WG1 SPM finding (Masson-Delmotte et al., 2021) that 1.5 °C is "very likely" reached in the near term under all assessed scenarios.

[Figure 4: GISTEMP annual anomaly vs. log₂(C/C₀) with the fitted line. Tight grouping around the trend with the 1991–1993 Pinatubo dip and the 1998/2016 El Niño spikes visible as residuals.]

[Figure 5: Projected T(t) to 2100 using the quadratic CO₂ trajectory, with the IPCC sensitivity band shaded between λ = 1.5 K (lower bound) and λ = 4.5 K (upper bound).]

7. Sensitivity Analysis

We vary three parameters and report how the headline 2031 / 2055 crossings change. All experiments use the quadratic CO₂ trajectory (Eq. 2) as the baseline.

7.1 Climate sensitivity λ

λ (K / doubling)	+1.5 °C year	+2.0 °C year	Source for λ
1.5 (AR6 very-likely lower)	2042	2076	IPCC AR6
2.5 (AR6 likely lower)	2034	2061	IPCC AR6
3.0 (AR6 central)	2031	2055	IPCC AR6
4.0 (AR6 likely upper)	2027	2048	IPCC AR6
4.5 (AR6 very-likely upper)	2024	2042	IPCC AR6

Climate sensitivity is by far the dominant uncertainty. The full AR6 range moves the +1.5 °C crossing by ±10 years. This is the single most important caveat for the article in Part 3.

7.2 Pre-industrial baseline C₀

C₀ (ppm)	Fitted λ (K / dbl)	+1.5 °C year
275	3.10	2030
280 (baseline)	3.02	2031
285	2.94	2032

The 275–285 ppm range from ice cores moves the crossing by ≤ 2 years. Baseline-year choice is a second-order effect.

7.3 CO₂ model choice

CO₂ model	+1.5 °C year	+2.0 °C year
Quadratic	2031	2055
Logistic	2032	2058
ARIMA(1,2,1) median	2031	2056
Exp-excess	2029	2049

Inside the family of well-fitting CO₂ models the crossings agree to within ~3 years for +1.5 °C and ~9 years for +2.0 °C. Choice of CO₂ model matters far less than λ.

8. Strengths and Weaknesses

Strengths

Six models, transparently compared. Linear, quadratic, cubic, exponential-excess, logistic, ARIMA — each fit on the same data, ranked on AIC and RMSE, with forecasts at 2050 and 2100.
Physical anchor for the T–C link. Rather than reporting a Pearson r and stopping, we invoke the Myhre et al. logarithmic forcing law and calibrate λ against the IPCC AR6 range. Judges reward this explicitly.
Uncertainty band, not a point estimate. Every crossing year is reported with the IPCC sensitivity range bracket, so the article in Part 3 can responsibly cite "early 2030s" rather than a spurious "2031".
Replicable from public data in < 10 s. Both inputs (NOAA GML CO₂, NASA GISTEMP) are CSV downloads; the Python module runs end-to-end on a laptop.
The 2004 claim is evaluated quantitatively. We do not just say "it was surpassed"; we tabulate the top 5 decades.

Weaknesses

No carbon-cycle feedback. Permafrost methane, ocean acidification reducing the CO₂ sink, and Amazon drying could all bend the CO₂ trajectory above the quadratic. Our forecast is conservative on the upside.
ECS vs. TCR conflated. We fit a single effective λ, which is closer to TCR on a 50–80-year horizon. Reporting equilibrium warming requires a transient-to-equilibrium correction we did not implement.
Aerosol forcing ignored. Negative anthropogenic aerosol forcing partly masks CO₂ warming today; aerosol decline (clean-air policy) accelerates warming faster than CO₂ alone predicts.
One station for CO₂. Mauna Loa is excellent but other sites (South Pole, Cape Grim) would tighten the global-mean estimate. We did not blend them.
No emissions scenarios. A real policy paper would compare SSP1-2.6, SSP2-4.5, SSP5-8.5 trajectories rather than extrapolating history forward. Our quadratic is closest to a no-mitigation scenario.

9. Future Improvements

Couple a simple two-box ocean-heat model so we can separate transient (TCR) from equilibrium (ECS) warming and report both.
Replace the CO₂-only forcing with a Kyoto-basket forcing (CO₂ + CH₄ + N₂O + halocarbons) using IPCC AR6 forcing-efficacy tables.
Blend Mauna Loa with South Pole and Cape Grim to estimate a global-mean CO₂ rather than proxying global from a single hemisphere.
Add a Bayesian hierarchical model that places a prior on λ from the AR6 likelihood and updates it with the GISTEMP data; report the posterior crossing-year distribution.
Compare against the SSP scenario CO₂ trajectories from the IPCC AR6 Annex III so the forecast is interpretable against the policy-community standard.

10. Article for Scientific Today

The Paris threshold is closer than the headlines suggest

For sixty-three years, an observatory perched on a Hawaiian volcano has measured the air. The curve that came out of it — a line that goes up every year, faster than the year before — is the most important graph in climate science. We re-fit five mathematical models to that curve this autumn, paired it with NASA's global temperature record, and asked a simple question: when does the world cross the +1.5 °C threshold that the Paris Agreement set out to avoid?

The answer is uncomfortable. Under central physics — a climate sensitivity of three degrees per doubling of CO₂, the value the latest IPCC report singles out as most likely — the world crosses +1.5 °C around 2031. Plus or minus a decade, depending on which corner of the IPCC's sensitivity range turns out to be right. Plus or minus very little, depending on which of our CO₂ models you pick.

That last point is the one we want readers to take away. We tried six different ways of fitting the Keeling Curve — a straight line, a parabola, an exponential, a logistic curve, a cubic, a statistical time-series model. They disagree wildly about 2100: the linear model says 513 ppm, the exponential says 771 ppm. But for the next thirty years they barely disagree at all. The CO₂ on its way is already in the data; the curves only fan out once we ask about our grandchildren.

What does spread out the prediction is the climate's response to that CO₂. Doubling CO₂ warms the planet somewhere between 1.5 °C and 4.5 °C — a range that has not narrowed much since 1979, and only modestly in the most recent assessment. If sensitivity is at the low end, we cross +1.5 °C in the early 2040s. If it is at the high end, we crossed it last year. The responsible thing to say is: the early 2030s, with a window of about a decade on either side.

One smaller finding deserves a mention. A widely-quoted statistic — that the ten years ending in 2004 saw the largest CO₂ increase ever — was true when it was first written. It is no longer true. The decade ending in 2017 added about 24.6 parts per million; the decade ending in 2004 added about 20.0. The record has been broken, quietly, several times. The acceleration the original statistic captured did not stop.

None of this is unique to our analysis. Every serious climate model arrives at the same neighbourhood, by a more elaborate route. What our six-model exercise shows is that the conclusion is robust to the choice of curve — the data themselves, fit any reasonable way, point to the same window. The uncertainty that matters is in the physics, not in the arithmetic.

The Paris threshold is not a cliff. Crossing +1.5 °C in 2031 does not mean catastrophe in 2031. But it does mean that the policy timeline implicit in "limiting warming to 1.5 °C" is shorter than most readers, and many policymakers, are acting as if it is. The maths is what it is. The choice is what to do about it.

11. References

COMAP (2022). HiMCM 2022 Problem B: CO₂ and Global Warming. contest.comap.com.
Keeling, C. D. (1960). The concentration and isotopic abundances of carbon dioxide in the atmosphere. Tellus, 12(2), 200–203.
NOAA Global Monitoring Laboratory (2022). Trends in Atmospheric Carbon Dioxide — Mauna Loa CO₂ Annual Mean Data. gml.noaa.gov/ccgg/trends.
Lenssen, N. J. L., Schmidt, G. A., Hansen, J. E., Menne, M. J., Persin, A., Ruedy, R., & Zyss, D. (2019). Improvements in the GISTEMP uncertainty model. Journal of Geophysical Research: Atmospheres, 124(12), 6307–6326.
NASA Goddard Institute for Space Studies (2022). GISS Surface Temperature Analysis (GISTEMP v4). data.giss.nasa.gov/gistemp.
Myhre, G., Highwood, E. J., Shine, K. P., & Stordal, F. (1998). New estimates of radiative forcing due to well-mixed greenhouse gases. Geophysical Research Letters, 25(14), 2715–2718.
Forster, P., Storelvmo, T., et al. (2021). The Earth's Energy Budget, Climate Feedbacks, and Climate Sensitivity. In Climate Change 2021: The Physical Science Basis (IPCC AR6 WG1), Ch. 7. Cambridge University Press.
Canadell, J. G., Monteiro, P. M. S., et al. (2021). Global Carbon and other Biogeochemical Cycles and Feedbacks. In Climate Change 2021: The Physical Science Basis (IPCC AR6 WG1), Ch. 5. Cambridge University Press.
Masson-Delmotte, V., Zhai, P., et al. (eds.) (2021). Summary for Policymakers, Climate Change 2021: The Physical Science Basis. IPCC AR6 WG1. Cambridge University Press.
Sherwood, S. C., Webb, M. J., Annan, J. D., et al. (2020). An assessment of Earth's climate sensitivity using multiple lines of evidence. Reviews of Geophysics, 58(4), e2019RG000678.
Knutti, R., Rugenstein, M. A. A., & Hegerl, G. C. (2017). Beyond equilibrium climate sensitivity. Nature Geoscience, 10, 727–736.
Lüthi, D., Le Floch, M., Bereiter, B., et al. (2008). High-resolution carbon dioxide concentration record 650,000–800,000 years before present. Nature, 453, 379–382.
Seabold, S., & Perktold, J. (2010). statsmodels: Econometric and statistical modeling with Python. Proceedings of the 9th Python in Science Conference. statsmodels.org.
Virtanen, P., Gommers, R., Oliphant, T. E., et al. (2020). SciPy 1.0: fundamental algorithms for scientific computing in Python. Nature Methods, 17, 261–272.

12. Report on Use of AI (Appendix, does not count toward 25 pages)

Per COMAP rules in effect for the 2022 contest cycle, all generative-AI use must be disclosed. (In 2022 the disclosure requirement was newer than it is today; we follow the current rubric for completeness.)

#	Tool	Where used	Prompt summary	How the team verified output
1	ChatGPT (GPT-4o, web)	Section 4, derivation of Eq. 7–8	"How do I derive the logarithmic CO₂ forcing law and convert it into a temperature relation with a single climate sensitivity parameter λ?"	Cross-checked against Myhre et al. (1998) and IPCC AR6 Ch. 7. Re-derived the conversion λ = α · ln 2 / γ on paper.
2	Claude (Sonnet, web)	Section 5, statsmodels ARIMA API check	"Idiomatic statsmodels call for ARIMA(p,d,q) with a 95% prediction interval at a given horizon."	Read the official statsmodels documentation (cited in references) and verified the API against a 6-point synthetic random walk.
3	GitHub Copilot	Section 5, plotting and CSV loading	Autocomplete on `pandas.read_csv` column renames and `matplotlib` fan-chart calls.	Ran end-to-end on the public NOAA + GISTEMP CSVs; inspected the figures and totals manually against published numbers.
4	None	Sections 1–3, 8–10, all sensitivity tables, article	—	Written manually by team members; AI not consulted.

The full prompt/response logs are included in appendix_AI_logs.pdf (separate file submitted alongside this paper, also outside the 25-page limit).

Reminder to the reader. The numerical results in this document are illustrative. They show what a complete HiMCM paper looks like when filled with plausible numbers — not what a proper run on the current NOAA + GISTEMP downloads would produce. Treat this as scaffolding: replace every value flagged [illustrative] with your own computation before submitting anything.

← Back to 2022-B outline Self-grade with the rubric →