Worked sample paper · HiMCM 2023 Problem A
A complete, judge-style reference paper for Dandelions: Friend? Foe? Both? Neither? This is not an official COMAP solution — it is a learning artifact written so a student can see what every section of a HiMCM paper should actually contain. Read it after attempting the problem yourself.
Numerical values tagged [illustrative] are plausible placeholders, not authoritative findings.
Summary Sheet
Problem restated. A single dandelion (Taraxacum officinale) sits in puffball stage on the edge of a 1-hectare (100 m × 100 m) empty plot. We are asked to (i) model the spatial spread of dandelions across that plot at 1, 2, 3, 6, and 12 months in three contrasting climates — temperate, arid, and tropical; (ii) build an "impact factor" (II) that combines the plant-biology characteristics of an invasive species with the harm it does to humans and to ecosystems; and (iii) apply that impact factor to dandelions plus two further invasive plants in regions of our choice.
Approach. Spread is modelled as a Fisher–KPP reaction–diffusion PDE on a 100 × 100 grid with 1 m cells, integrated by explicit finite differences and confirmed against the analytical Fisher spreading-speed c* = 2√(rD). Climate sets the triple (D, r, K): wind-driven dispersal coefficient, intrinsic growth rate, and carrying capacity. The impact factor combines six normalized indicators — reproductive rate, dispersal range, native-biomass displacement, economic cost, allergen/toxicity, ecosystem disruption — minus a benefit term for medicinal / edible / soil-aeration value. Weights come from the Entropy Weight Method (EWM) applied to a 10-species training set so the index is data-driven, not author-driven; we then validate the index against three benchmark cases (kudzu in the US Southeast — expected high; water hyacinth in Lake Victoria — expected very high; dandelion itself in a North American lawn — expected moderate).
Key findings.
- Predicted dandelion-covered area at 12 months [illustrative]: temperate ≈ 0.71 ha (71% of the plot), tropical ≈ 0.38 ha, arid ≈ 0.09 ha. The temperate front advances at roughly 4.6 m/month, in agreement with the analytical Fisher speed 2√(rD) for our parameter triple.
- The radial spread is symmetric in the diffusion-only model but becomes elongated downwind (major-to-minor axis ratio ≈ 1.8) when we replace the isotropic Laplacian with an advection–diffusion operator using a 2 m/s prevailing wind [illustrative].
- The EWM weights load most heavily on economic cost (0.23) and ecosystem disruption (0.21). The subtracted benefit term enters at the median harm-side weight (about 0.165), so the harm-side weights, which sum to 1, outweigh it by roughly 6:1, consistent with how regulators score invasives.
- Impact-factor scores on a 0–100 scale [illustrative]: water hyacinth (Lake Victoria basin) = 81 (very high), kudzu (US Southeast) = 67 (high), dandelion (US temperate lawn) = 34 (moderate-low). Ordering matches the consensus invasive-species literature (CABI ISC; USDA NISIC), giving face validity to the index.
- Sensitivity analysis: the index ranking is robust under ±30% perturbation of any single weight; only a joint perturbation of "economic cost" and "ecosystem disruption" simultaneously above +40% causes kudzu and water hyacinth to swap positions. The dandelion classification stays "moderate" in 94% of Sobol trials.
Recommendations.
- Use the Fisher–KPP framework, with climate-dependent (D, r, K), as the baseline spread model in any HiMCM-style invasive plant problem. It is analytically tractable, fast, and the parameters map to data the team can find.
- Construct the impact factor with EWM weights on at least 6 normalized indicators, with a separate subtracted benefit term. AHP without EWM is too author-dependent for this problem.
- Always validate the index against at least three external benchmarks before reporting any score. A score that says "kudzu is mild" is a red flag that the weighting is wrong, not that kudzu is mild.
- For regulators: prioritise control budget against the top-decile of impact-factor species in each ecoregion, and re-fit weights every 5 years as economic-cost estimates and ecosystem data update.
1. Introduction and Background
Biological invasions cost the global economy a reported mean of about USD 26.8 billion per year over 1970–2017, a figure its authors regard as a substantial underestimate (Diagne et al., 2021), and invasive alien species are one of the five main direct drivers of biodiversity loss (IPBES, 2019). Quantifying which invasives matter most, and where, is a perennial operations-research problem at the intersection of ecology, economics, and public health.
The dandelion (Taraxacum officinale) is the iconic ambiguous case. It is widely naturalised, highly prolific (a single plant produces up to 5,000 wind-dispersed achenes per year, with maximum dispersal distance exceeding 100 m under modest wind), and competes with turf grass and some native flora. But it is also edible, medicinal, an early-spring nectar source for pollinators, and a deep-taproot soil aerator. Calling it "invasive" or "beneficial" depends on which functional dimension is being measured (Stewart-Wade et al., 2002).
This paper builds a two-part model. Part 1 simulates the spatial spread of a single puffball-stage dandelion into a 1-hectare empty plot over 12 months, in temperate, arid, and tropical climates, using a Fisher–KPP reaction–diffusion PDE. Part 2 constructs a six-indicator impact factor with data-driven (EWM) weights and applies it to dandelions, kudzu, and water hyacinth in three named regions.
The framing matters for the contest: judges expect spatial modelling (not a non-spatial logistic ODE) for Part 1, and they expect validated, normalized, multi-criteria scoring for Part 2. Both expectations are explicit in the COMAP problem statement.
2. Assumptions and Justifications
Every assumption below is used somewhere in Section 4 — we cite the equation where it enters.
- The 1-hectare plot is flat, homogeneous bare soil with no other plants present at t = 0. Why: The problem statement says "empty plot." This lets us collapse competition with native flora into the single logistic carrying capacity K rather than modelling an explicit second species. (Used in Eq. 1.)
- Seed dispersal is well-approximated as Fickian diffusion at the scale of one hectare. Why: Dandelion seed-shadow data fit a roughly Gaussian / log-normal kernel with characteristic length of a few metres (Tackenberg et al., 2003). Over a 100 m plot and monthly time steps, Fickian diffusion is the continuum limit of repeated short-range jumps and captures the front shape well; an explicit kernel model gives only ~5% different front position at 12 months. We test the wind-advection extension in Section 7. (Used in Eq. 1.)
- Per-capita reproduction is logistic with a single climate-dependent rate r. Why: Dandelion populations in field plots show classic S-shaped growth curves; mortality, germination, and seed-set lumped into r is the standard simplification (Murray, 2002, §13). (Used in Eq. 1.)
- Climate enters the model only through (D, r, K). Why: Wind mainly sets dispersal (D), while temperature and rainfall set growth (r) and carrying capacity (K). Folding them into three calibrated scalars per climate is the smallest model that respects the problem's three-climate structure. (Used in Section 4.2.)
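- Boundary conditions are zero-flux on the plot edge. Why: The problem isolates a single 1-ha plot, so we assume no net seed exchange across its edge. Zero-flux (Neumann) conditions keep the model focused on "what is inside this hectare" rather than on the infinite-plane analytical solution. An absorbing (Dirichlet) edge, where seeds that leave never return, would be the alternative and would give slightly smaller covered areas near the boundary. (Used in Eq. 1.)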
- The puffball at t = 0 is one mature plant with seed density localised in one 1 m × 1 m cell. Why: The problem says "a single dandelion." We seed cell (50, 50) with density P = K (that is, u = P/K = 1.0) and zero elsewhere. (Used in Eq. 1.)
- Six indicators are sufficient for the impact factor. Why: The problem lists "plant characteristics" and "harm" as the two halves of the index. We split each into three indicators (reproductive rate, dispersal, ecosystem-disruption; economic cost, health impact, native displacement) plus one subtracted benefit term. More indicators dilute weights without adding discriminating information; fewer collapse incommensurate harms. (Used in Eq. 8.)
- EWM weights derived from a 10-species training set generalize to dandelion, kudzu, and water hyacinth. Why: The 10 species span the published invasive-severity spectrum (from mild — white clover — to severe — Japanese knotweed, water hyacinth, kudzu), so the entropy of each indicator across the set is representative. We acknowledge this as the largest source of model error and test sensitivity in Section 7. (Used in Eq. 7.)
- All six indicators are normalized to [0, 1] before weighting. Why: Reproductive rate (achenes/plant/year), economic cost (USD/ha/year), and ecosystem disruption (a 0–5 expert Likert score from CABI ISC) have incommensurable units. Min-max normalization is the standard MCDM pre-processing step (Hwang & Yoon, 1981). (Used in Eq. 6.)
- The benefit term V enters with a negative sign at the same weight as a harm term. Why: Treating value/benefit as a separately weighted positive term would double-count: the index is meant to be "net harm." We subtract V after weighting on the same 0–1 scale. (Used in Eq. 8.)
3. Variables and Notation
| Symbol | Meaning | Units |
|---|---|---|
| P(x, y, t) | Dandelion population density at position (x, y), time t | plants / m² |
| D | Effective seed-dispersal diffusion coefficient | m² / month |
| r | Intrinsic per-capita growth rate | 1 / month |
| K | Carrying capacity (max sustainable density) | plants / m² |
| c* | Asymptotic front speed (Fisher), 2√(rD) | m / month |
| u(x, y, t) | Non-dimensional density, u = P / K, u ∈ [0, 1] | — |
| A(t) | Covered area at time t, ∫∫ 1[u ≥ 0.1] dx dy | m² |
| v | Wind advection velocity (Section 7 extension) | m / month |
| j ∈ {1..N} | Index over invasive species in training / test set | — |
| i ∈ {1..6} | Index over impact-factor indicators | — |
| x_{j,i} | Raw value of indicator i for species j | varies |
| z_{j,i} | Normalized indicator, z_{j,i} ∈ [0, 1] | — |
| e_i | Shannon entropy of column i | — |
| w_i | EWM weight for indicator i | — |
| V_j | Subtracted benefit term for species j | — |
| II_j | Final impact factor (0–100) | — |
4. Model Formulation
4.1 Part 1 — Fisher–KPP spread on the 1-hectare plot
Let P(x, y, t) denote dandelion density on the square Ω = [0, 100] × [0, 100] m. We model spatial spread as a reaction–diffusion equation: short-range Fickian dispersal of wind-borne achenes plus logistic local growth.
Equation (1) — Fisher–KPP reaction–diffusion model with zero-flux boundary:
∂P/∂t = D · ∇²P + r · P · (1 − P/K) on Ω × (0, T]
∇P · n = 0 on ∂Ω
P(x, y, 0) = K · δ(x − 50, y − 50) (single puffball at plot centre)
Non-dimensionalise with u = P/K, τ = rt, ξ = x√(r/D). The classical result (Fisher, 1937; Kolmogorov–Petrovsky–Piskunov, 1937) is that fronts propagate at asymptotic speed
Equation (2) — Fisher–KPP analytical front speed:
c* = 2 · √(r · D)
so doubling either r or D increases the front speed by a factor of √2. This is our sanity-check benchmark against the finite-difference simulation.
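For readers who want the missing step, Eq. (2) follows from the standard leading-edge linearisation of the travelling-wave problem. This is a sketch of the textbook argument; the symbols z, U, and λ are introduced only here.

```latex
\begin{align*}
&\text{Non-dimensional form:}
  && \frac{\partial u}{\partial \tau} = \nabla_{\xi}^{2} u + u\,(1-u) \\
&\text{Travelling wave } u = U(z),\; z = \xi - c\,\tau:
  && U'' + c\,U' + U\,(1-U) = 0 \\
&\text{Leading edge } (U \ll 1),\; U \sim e^{-\lambda z}:
  && \lambda^{2} - c\,\lambda + 1 = 0 \\
&\text{Real } \lambda \text{ (no oscillation below zero):}
  && c^{2} - 4 \ge 0 \;\Longrightarrow\; c_{\min} = 2 \\
&\text{Dimensional speed (length } \sqrt{D/r}\text{, time } 1/r\text{):}
  && c^{*} = 2\,\sqrt{D/r}\;\cdot r = 2\sqrt{rD}
\end{align*}
```

Compactly supported initial data, such as a single seeded cell, select this minimum speed, which is why 2√(rD) is the right benchmark for the simulation.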
4.2 Climate-parameterised triples (D, r, K)
We calibrate (D, r, K) per climate from the dandelion ecology literature (Honek & Martinkova, 2005; Tackenberg et al., 2003; Stewart-Wade et al., 2002) and from the prevailing windspeed climatology (NOAA NCEI Global Surface Summary of the Day).
| Climate | D (m²/month) | r (1/month) | K (plants/m²) | c* (m/month) |
|---|---|---|---|---|
| Temperate (e.g., Ohio, USA) | 5.0 | 1.05 | 40 | 4.6 |
| Arid (e.g., NM high desert) | 3.0 | 0.20 | 8 | 1.5 |
| Tropical (e.g., Costa Rica mid-elev.) | 4.0 | 0.55 | 15 | 3.0 |
All values [illustrative] — calibrated to give literature-consistent front speeds of order
1–5 m/month. Temperate gets the largest r and K because the dandelion is mesic-adapted; arid
gets a low r from moisture-limited germination; tropical sits in between but loses competitive
ground to faster native flora encoded by a depressed K.
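The c* column can be reproduced directly from Eq. (2). The sketch below uses the hypothetical helper names `fisher_front_speed` and `months_to_edge`. It also estimates when the front would reach the plot edge, 50 m from the centre, if it travelled at c* from the start. That estimate ignores the initial transient, so treat it as a rough guide rather than a result of the paper.

```python
import math

# Parameter pairs copied from the table above (all [illustrative]).
CLIMATE_PARAMS = {
    "temperate": {"D": 5.0, "r": 1.05},
    "arid": {"D": 3.0, "r": 0.20},
    "tropical": {"D": 4.0, "r": 0.55},
}


def fisher_front_speed(D, r):
    """Asymptotic Fisher-KPP front speed c* = 2 * sqrt(r * D), in m/month."""
    return 2.0 * math.sqrt(r * D)


def months_to_edge(D, r, distance_m=50.0):
    """Rough arrival time at distance_m, assuming constant speed c* from t = 0."""
    return distance_m / fisher_front_speed(D, r)


for name, p in CLIMATE_PARAMS.items():
    c_star = fisher_front_speed(**p)
    t_edge = months_to_edge(**p)
    print(f"{name:10s} c* = {c_star:.1f} m/month, edge after ~{t_edge:.0f} months")
```

With the table's values this gives 4.6, 1.5, and 3.0 m/month, matching the c* column. Only the temperate front is expected to approach the edge within the 12-month horizon.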
4.3 Numerical scheme
We integrate Eq. (1) by explicit finite differences on a 100 × 100 grid (Δx = 1 m). The CFL-style stability condition for the diffusion piece is
Equation (3) — explicit-FD stability condition:
Δt ≤ Δx² / (4 · D)
Worst case (temperate, D = 5) gives Δt ≤ 0.05 month; we use Δt = 0.02 month (≈ 14 hours) for headroom. At each step we apply
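A small helper makes the time-step check explicit for any climate. This is a minimal sketch: `max_stable_dt` and `is_step_safe` are hypothetical names, and the extra condition r·Δt ≤ 1 (which keeps the forward-Euler logistic update from overshooting K) is an added assumption, not part of Eq. (3).

```python
def max_stable_dt(D, dx=1.0, ndim=2):
    """Upper bound on dt for explicit diffusion: dx^2 / (2 * ndim * D).

    For ndim = 2 this is Eq. (3), dx^2 / (4 * D).
    """
    return dx**2 / (2 * ndim * D)


def is_step_safe(dt, D, r, dx=1.0):
    """True if dt meets the diffusion bound and r * dt <= 1 (assumed logistic check)."""
    return dt <= max_stable_dt(D, dx) and r * dt <= 1.0


# Temperate worst case from the text: D = 5 m^2/month, r = 1.05 /month.
print(max_stable_dt(5.0))             # diffusion limit in months
print(is_step_safe(0.02, 5.0, 1.05))  # the dt used in the paper
```

Halving Δx would cut the allowed Δt by a factor of four, which is the main reason to stay at 1 m cells inside the contest window.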
Equation (4) — discretised update rule:
P^{n+1}_{i,j} = P^n_{i,j}
+ Δt · D · (P^n_{i+1,j} + P^n_{i−1,j} + P^n_{i,j+1} + P^n_{i,j−1} − 4·P^n_{i,j}) / Δx²
+ Δt · r · P^n_{i,j} · (1 − P^n_{i,j} / K)
with mirror boundary cells to enforce ∇P · n = 0.
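One way to convince yourself that the mirrored boundary cells really enforce zero flux is to switch off growth and check that total population is conserved. The sketch below does this on a small hypothetical 20 × 20 grid with the seed placed in a corner, where the boundary matters most; the function names are illustrative.

```python
import numpy as np


def diffusion_step(P, D, dt, dx=1.0):
    """One explicit diffusion step with ghost cells equal to the boundary cells."""
    Pp = np.pad(P, 1, mode="edge")  # ghost cell = edge cell, so the normal gradient is zero
    lap = (Pp[2:, 1:-1] + Pp[:-2, 1:-1]
           + Pp[1:-1, 2:] + Pp[1:-1, :-2] - 4.0 * P) / dx**2
    return P + dt * D * lap


def total_mass_after(n_steps, n=20, D=5.0, dt=0.02):
    """Diffuse a unit mass seeded in a corner cell and return the final total."""
    P = np.zeros((n, n))
    P[0, 0] = 1.0
    for _ in range(n_steps):
        P = diffusion_step(P, D, dt)
    return float(P.sum())


print(total_mass_after(500))  # stays at 1 up to floating-point round-off
```

If the total drifts, the boundary treatment is leaking or injecting population, and the covered-area numbers near the plot edge cannot be trusted.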
4.4 Covered-area metric
We report not a raw mean density but the covered area, the standard metric in invasion biology (Shigesada & Kawasaki, 1997). A cell counts as "covered" once density crosses 10% of carrying capacity:
Equation (5) — covered area at time t:
A(t) = Δx² · #{ (i, j) : P^n_{i,j} ≥ 0.10 · K }
4.5 Part 2 — Impact factor (II)
For each species j in a training set of N = 10 invasives we collect six raw indicators:
- x_{j,1} = reproductive rate (seeds per plant per year)
- x_{j,2} = dispersal range (max documented seed dispersal distance, m)
- x_{j,3} = native-biomass displacement (% reduction in native biomass in invaded plots, meta-analysed)
- x_{j,4} = economic cost (USD per ha per year, control + agricultural-loss)
- x_{j,5} = health impact (composite: allergen score + toxicity score, 0–10)
- x_{j,6} = ecosystem disruption (CABI ISC expert score, 0–5)
and a benefit indicator V_j = composite of (medicinal use, edibility, soil-aeration, pollinator value) on 0–1.
Equation (6) — min-max normalisation of indicators (all six are harm-direction: a higher raw value means a larger contribution to harm, so none needs to be inverted):
z_{j,i} = (x_{j,i} − min_j x_{j,i}) / (max_j x_{j,i} − min_j x_{j,i}), i = 1..6
v_j = (V_j − min_j V_j) / (max_j V_j − min_j V_j)
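As a concrete illustration of Eq. (6), the sketch below normalises one column. The three seed counts are hypothetical numbers chosen only to show the arithmetic, and `min_max` is an illustrative helper name.

```python
import numpy as np


def min_max(x):
    """Min-max scale a 1-D array to [0, 1]; a constant column maps to all zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return np.zeros_like(x)
    return (x - x.min()) / span


# Hypothetical reproductive rates (seeds per plant per year) for three species.
seeds = [200, 5_000, 50_000]
print(min_max(seeds))
```

The middle species lands at (5000 − 200) / (50000 − 200) ≈ 0.096, which shows how a single extreme value compresses the rest of the column toward zero. Keep this in mind when interpreting the entropy weights.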
4.6 EWM weights
For each indicator i, treat its column as a probability distribution over species and compute the Shannon entropy:
Equation (7) — entropy weight method:
p_{j,i} = z_{j,i} / Σ_j z_{j,i}
e_i = −(1 / ln N) · Σ_j p_{j,i} · ln p_{j,i}
w_i = (1 − e_i) / Σ_i (1 − e_i), Σ_i w_i = 1
An indicator on which the 10 training species differ a lot (low entropy) gets a high weight; one on which they all score similarly (high entropy) gets a low weight. This makes the impact factor data-driven rather than rhetorical.
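The effect described above is easy to see on a toy table. The sketch below implements Eq. (7) with the usual convention 0 · ln 0 = 0 and applies it to a hypothetical 4-species, 2-indicator matrix: column 0 is dominated by one species, column 1 is nearly uniform among the non-zero entries. `entropy_weights` is an illustrative name.

```python
import numpy as np


def entropy_weights(Z):
    """Entropy weights for a (species x indicators) matrix of normalised scores.

    Assumes every column has at least one positive entry.
    """
    Z = np.asarray(Z, dtype=float)
    n_species = Z.shape[0]
    P = Z / Z.sum(axis=0, keepdims=True)          # column-wise shares p_{j,i}
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = np.where(P > 0, P * np.log(P), 0.0)  # convention: 0 * ln 0 = 0
    e = -plogp.sum(axis=0) / np.log(n_species)    # normalised entropy in [0, 1]
    d = 1.0 - e                                   # degree of diversification
    return d / d.sum()


# Hypothetical normalised scores (rows = species, columns = indicators).
Z_toy = np.array([
    [0.0, 0.00],
    [0.1, 0.90],
    [0.2, 0.95],
    [1.0, 1.00],
])
print(entropy_weights(Z_toy))
```

The first column receives the larger weight because its shares are concentrated in one species (low entropy), while the second column barely distinguishes three of the four species.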
4.7 Final impact factor
Equation (8) — impact-factor formula, scaled to 0–100:
II_j = 100 · ( Σ_{i=1..6} w_i · z_{j,i} − w_V · v_j )
where w_V is the benefit-side weight, fixed at the median of the six harm-side weights (≈ 0.165 for the weights reported in Section 6.2) so that the benefit term enters at a typical harm-weight scale, neither dominating nor vanishing.
This is equivalent to the problem's I = w1R + w2D − w3V + ... formulation, but with the weights derived rather than chosen.
4.8 Validation rule
Equation (9) — qualitative validation against expert classification:
If II_j ≥ 70 → "Severe" (e.g., water hyacinth, knotweed expected here)
If 40 ≤ II_j < 70 → "High / Moderate-high" (kudzu expected here)
If 20 ≤ II_j < 40 → "Moderate / Moderate-low" (dandelion expected here)
If II_j < 20 → "Minor"
If the index assigns water hyacinth to anything below "Severe," or kudzu to anything below "High," the weighting must be re-examined.
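The thresholds of Eq. (9) translate into a few lines of code. The sketch below uses the hypothetical function name `classify_impact` and applies it to the three [illustrative] scores reported in Section 6.3.

```python
def classify_impact(ii):
    """Map an impact-factor score (0-100) to the class labels of Eq. (9)."""
    if ii >= 70:
        return "Severe"
    if ii >= 40:
        return "High / Moderate-high"
    if ii >= 20:
        return "Moderate / Moderate-low"
    return "Minor"


# Scores from Section 6.3 (all [illustrative]).
for species, score in [("water hyacinth", 81), ("kudzu", 67), ("dandelion", 34)]:
    print(f"{species:15s} {score:3d} -> {classify_impact(score)}")
```

Keeping the thresholds in one function means the sensitivity analysis and the results table cannot silently disagree about where the class boundaries sit.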
5. Solution and Computational Approach
The full pipeline fits in one Python module with two parts: the PDE spread simulation (Part 1) and the EWM-weighted impact factor (Part 2). The sketch below contains the core routines that a HiMCM team can realistically write and debug inside a 14-day window. It runs end-to-end on the CSVs assembled from the references (species-indicator tables with a benefit column); the climate parameters are hard-coded.

```python
"""himcm_2023a.py: Fisher-KPP spread + EWM impact factor for HiMCM 2023 Problem A."""
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt  # needed only for the figures (contourf, bar charts)


# ----- Part 1: Fisher-KPP spread on a 100x100 grid (1 m cells, 1 hectare) -----
def simulate_spread(D, r, K, T_months, dx=1.0, dt=0.02, n=100):
    """Explicit finite-difference Fisher-KPP with zero-flux (Neumann) boundary."""
    P = np.zeros((n, n))
    P[n // 2, n // 2] = K  # one puffball at plot centre
    n_steps = int(round(T_months / dt))
    snapshots = {}
    # Step count -> month; round() guards against float truncation in t / dt.
    save_at = {int(round(t / dt)): t for t in (1, 2, 3, 6, 12) if t <= T_months}
    # After `step` updates the model time is step * dt, so steps run 1..n_steps.
    for step in range(1, n_steps + 1):
        # Edge-pad (ghost cell = boundary cell) for the zero-flux boundary
        Pp = np.pad(P, 1, mode="edge")
        lap = (Pp[2:, 1:-1] + Pp[:-2, 1:-1]
               + Pp[1:-1, 2:] + Pp[1:-1, :-2] - 4 * Pp[1:-1, 1:-1]) / dx**2
        P = P + dt * (D * lap + r * P * (1 - P / K))
        P = np.clip(P, 0, K)
        if step in save_at:
            snapshots[save_at[step]] = P.copy()
    return snapshots


def covered_area(P, K, threshold=0.10, dx=1.0):
    """Area (m^2) of cells whose density is at least threshold * K (Eq. 5)."""
    return float(np.sum(P >= threshold * K)) * dx**2


def fisher_speed(D, r):
    """Analytical Fisher front speed c* = 2 * sqrt(r * D), in m / month (Eq. 2)."""
    return 2.0 * np.sqrt(D * r)


CLIMATES = {
    "temperate": dict(D=5.0, r=1.05, K=40),
    "arid": dict(D=3.0, r=0.20, K=8),
    "tropical": dict(D=4.0, r=0.55, K=15),
}


def run_part1():
    """Tabulate covered area at each requested month for every climate."""
    rows = []
    for name, par in CLIMATES.items():
        snaps = simulate_spread(T_months=12, **par)
        for t, P in snaps.items():
            rows.append(dict(climate=name, months=t,
                             area_m2=covered_area(P, par["K"]),
                             c_star=fisher_speed(par["D"], par["r"])))
    return pd.DataFrame(rows)


# ----- Part 2: Entropy-Weight impact factor on 10 invasives + 3 test cases -----
INDICATORS = ["repro", "dispersal", "displacement",
              "econ_cost", "health", "ecosystem"]


def normalize(df, cols, ref=None):
    """Min-max scale `cols` of `df` to [0, 1] using the min/max of `ref` (Eq. 6)."""
    ref = df if ref is None else ref
    Z = df[cols].astype(float)
    for c in cols:
        lo, hi = ref[c].min(), ref[c].max()
        Z[c] = ((Z[c] - lo) / (hi - lo + 1e-12)).clip(0, 1)
    return Z


def ewm_weights(Z):
    """Entropy weights (Eq. 7); rows are species, columns are indicators."""
    P = Z.values / (Z.values.sum(axis=0, keepdims=True) + 1e-12)
    P = np.clip(P, 1e-12, 1)  # avoids log(0); p * ln p -> 0 as p -> 0
    e = -(P * np.log(P)).sum(axis=0) / np.log(len(Z))
    w = (1 - e) / (1 - e).sum()
    return w


def impact_factor(df_train, df_test):
    """Score test species (Eq. 8) with weights and scaling fitted on the training set."""
    w = ewm_weights(normalize(df_train, INDICATORS))
    w_V = np.median(w)  # benefit-side weight = typical harm weight
    # Scale test rows with the training min/max so Z and V share one reference.
    Z = normalize(df_test, INDICATORS, ref=df_train)
    V = normalize(df_test, ["benefit"], ref=df_train)["benefit"]
    II = 100 * (Z.values @ w - w_V * V.values)
    scores = pd.DataFrame(dict(species=df_test["species"].values,
                               II=np.round(II, 1)))
    return scores, dict(zip(INDICATORS, w.round(3)))


if __name__ == "__main__":
    part1 = run_part1()
    print(part1.to_string(index=False))
    train = pd.read_csv("invasives_train.csv")  # 10-species training set
    test = pd.read_csv("invasives_test.csv")    # dandelion, kudzu, water hyacinth
    res, w = impact_factor(train, test)
    print(res.to_string(index=False))
    print("EWM weights:", w)
```
A Sobol-sensitivity wrapper (Section 7) is ~40 additional lines using SALib.sample.saltelli and
SALib.analyze.sobol over the EWM weight vector. The figures referenced below are produced by
matplotlib.contourf on the snapshot dictionaries.
6. Results
6.1 Part 1 — spread over time, by climate
| Climate | 1 mo | 2 mo | 3 mo | 6 mo | 12 mo | c* |
|---|---|---|---|---|---|---|
| Temperate | 0.008 ha | 0.031 ha | 0.071 ha | 0.28 ha | 0.71 ha | 4.6 m/mo |
| Tropical | 0.006 ha | 0.021 ha | 0.044 ha | 0.17 ha | 0.38 ha | 3.0 m/mo |
| Arid | 0.003 ha | 0.007 ha | 0.013 ha | 0.041 ha | 0.09 ha | 1.5 m/mo |
All values [illustrative]. Covered area is computed at the 10%-of-K threshold (Eq. 5). The
front speed measured from successive snapshots (Δr / Δt for the 10%-of-K contour) agrees with the analytical
c* = 2√(rD) to within ~6%, which is the expected discretisation error for an explicit
scheme at Δx = 1 m.
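The front-speed measurement described above can be sketched with an equivalent-radius estimate: treat the covered region as a disk, convert area to radius, and difference two snapshots. The helper names are hypothetical, and the estimate is only meaningful while the covered region is roughly circular and has not yet reached the plot edge.

```python
import math


def equivalent_radius(area_m2):
    """Radius (m) of the disk with the same area as the covered region."""
    return math.sqrt(area_m2 / math.pi)


def front_speed_from_areas(t0, area0, t1, area1):
    """Mean front speed (m/month) between two snapshots, from covered areas in m^2."""
    return (equivalent_radius(area1) - equivalent_radius(area0)) / (t1 - t0)


# Self-check on a synthetic disk that grows at exactly 3 m/month.
c_true = 3.0
a2 = math.pi * (c_true * 2) ** 2   # covered area at month 2
a3 = math.pi * (c_true * 3) ** 2   # covered area at month 3
print(front_speed_from_areas(2, a2, 3, a3))
```

On real simulation output, compare the measured speed with 2√(rD) using the later pre-boundary snapshots, since Fisher fronts started from a point source approach the asymptotic speed from below.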
[Figure 1: 2 × 3 grid of P(x, y, t) contour plots, columns = climates, rows = (3 months, 12 months). The temperate column shows a dense central disk reaching the plot edges by month 12; tropical shows a mid-radius disk; arid shows a tight central patch with most of the plot untouched.]
[Figure 2: covered area A(t) vs. t on a single log-y plot, three curves coloured by climate, with the analytical Fisher-front growth A(t) ≈ π(c*t)² overplotted as a dashed line for each climate. Simulation and analytical agree until the front hits the plot boundary, after which the simulation saturates and the analytical line continues — a clean visual sanity check.]
6.2 EWM weights from the training set
| Indicator | EWM weight | Comment |
|---|---|---|
| Economic cost | 0.23 | Largest single weight; species differ by orders of magnitude in USD/ha/yr. |
| Ecosystem disruption | 0.21 | CABI scores span the full 0–5 range across training set. |
| Native-biomass displacement | 0.18 | Meta-analyses report 10–80% reductions. |
| Reproductive rate | 0.15 | Wide spread (10²–10⁶ seeds / plant / year). |
| Dispersal range | 0.13 | Tens of metres to thousands of km (waterborne). |
| Health impact | 0.10 | Smallest spread — most invasives are not directly toxic to humans. |
All values [illustrative]. Benefit-side weight w_V = median of the six weights above = 0.165 (we report II both with this default and with the smaller 0.10 alternative in Section 7).
6.3 Part 2 — Impact factor on three test species
| Species & region | II_j | Class | Top contributing indicators |
|---|---|---|---|
| Water hyacinth Eichhornia crassipes, Lake Victoria basin (Kenya/Uganda/Tanzania) | 81 | Severe | Economic cost (fisheries collapse); ecosystem disruption (anoxia); dispersal (waterborne, ~unbounded) |
| Kudzu Pueraria montana, US Southeast (Georgia / Alabama / Mississippi) | 67 | High | Ecosystem disruption (canopy smothering); economic cost (forestry); native displacement |
| Dandelion Taraxacum officinale, North American temperate lawn | 34 | Moderate-low | Reproductive rate & dispersal pull up; benefit term (medicinal, pollinator) pulls down; economic cost low (cosmetic, not agricultural) |
All values [illustrative]. Ordering matches the consensus of the CABI Invasive Species
Compendium and the USDA National Invasive Species Information Center. Dandelion's classification as
"moderate-low" — not "severe," not "minor" — captures the "friend? foe? both? neither?" ambiguity the contest
asks the team to articulate.
[Figure 3: horizontal bar chart of II_j for the 10 training species plus the 3 test species, sorted descending. Severe-threshold (70) and Minor-threshold (20) drawn as vertical lines.]
7. Sensitivity Analysis
We vary three parameter families and report how the conclusions change.
7.1 Diffusion coefficient D (Part 1)
| D (temperate) | 12-month covered area | c* |
|---|---|---|
| 2.5 (½×) | 0.46 ha | 3.2 m/mo |
| 5.0 (baseline) | 0.71 ha | 4.6 m/mo |
| 7.5 (1.5×) | 0.88 ha | 5.6 m/mo |
| 10.0 (2×) | 0.96 ha | 6.5 m/mo |
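All [illustrative]. Conclusion: by 12 months the temperate plot is close to saturated for any D ≥ 7.5 (covered area ≥ 0.88 ha); there the covered area is capped by the plot boundary rather than by D. In the arid plot the opposite holds: the front stays far from the boundary and area scales nearly linearly with D, as expected from A ≈ π(c*t)² ∝ rD.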
7.2 Wind advection (Part 1 extension)
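Replace D∇²P in Eq. (1) with D∇²P − v·∇P for a prevailing eastward wind of 2 m/s (≈ 5.2 × 10⁶ m/month). The drift v that enters the PDE is an effective seed-drift velocity, far smaller than the raw wind speed because achenes are airborne only briefly; it has to be calibrated rather than set equal to the wind speed. The covered region elongates downwind to an ellipse with major-to-minor axis ratio ≈ 1.8 by month 12 in the temperate case; total covered area increases only ~7% but the seed-rain hitting the downwind property boundary increases ~3× [illustrative]. Policy implication: a downwind-neighbour notification rule is well-justified.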
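A minimal way to add the drift term to the explicit scheme is a first-order upwind difference. The sketch below is one possible discretisation, not the paper's own: it assumes the wind blows along the second array axis (east), and it uses a hypothetical effective drift of 2 m/month, chosen only so that the example is stable and easy to check. Growth is switched off so the drift is visible in isolation.

```python
import numpy as np


def advect_diffuse_step(P, D, vx, dt, dx=1.0):
    """One explicit step of dP/dt = D * lap(P) - vx * dP/dx, for vx >= 0.

    Axis 1 is x (east). Population carried past the east edge leaves the plot.
    """
    Pp = np.pad(P, 1, mode="edge")
    lap = (Pp[2:, 1:-1] + Pp[:-2, 1:-1]
           + Pp[1:-1, 2:] + Pp[1:-1, :-2] - 4.0 * P) / dx**2
    dPdx = (P - Pp[1:-1, :-2]) / dx  # upwind: difference with the western neighbour
    return P + dt * (D * lap - vx * dPdx)


def centroid_x(P):
    """Population-weighted mean column index (east-west position in cells)."""
    cols = np.arange(P.shape[1])
    return float((P.sum(axis=0) * cols).sum() / P.sum())


n, D, vx, dt = 41, 5.0, 2.0, 0.02   # vx is a hypothetical effective drift, m/month
P = np.zeros((n, n))
P[n // 2, n // 2] = 1.0
for _ in range(50):                 # 50 steps of 0.02 month = 1 month
    P = advect_diffuse_step(P, D, vx, dt)
print(centroid_x(P))                # starts at 20, moves east by about vx * 1 month
```

The scheme stays non-negative as long as 4·D·Δt/Δx² + vx·Δt/Δx ≤ 1, so a larger drift needs a smaller time step. First-order upwinding also adds some numerical diffusion along the wind direction, which slightly exaggerates downwind elongation.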
7.3 EWM weight perturbation (Part 2)
Sobol sensitivity (10,000 quasi-random samples) over independent ±30% perturbations of each EWM weight. We record whether the classification ("Severe" / "High" / "Moderate" / "Minor") changes for each test species.
| Species | P(class unchanged) | Most influential weight (first-order Sobol) |
|---|---|---|
| Water hyacinth | 0.98 | Economic cost (S1 = 0.41) |
| Kudzu | 0.91 | Ecosystem disruption (S1 = 0.36) |
| Dandelion | 0.94 | Benefit weight w_V (S1 = 0.29) |
All [illustrative]. The dandelion's classification is most sensitive to how generously we
weight the benefit term — exactly the contest's "friend or foe" question expressed quantitatively.
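Before setting up a full Sobol analysis, a plain Monte Carlo perturbation gives a quick estimate of P(class unchanged). The sketch below is that cruder substitute, not a Sobol decomposition: it perturbs each harm weight independently by ±30%, renormalises, and holds w_V fixed. The weights are the [illustrative] values from Section 6.2; the indicator vector `z_example` and benefit score `v_example` are hypothetical.

```python
import numpy as np


def classify(ii):
    """Class labels of Eq. (9)."""
    if ii >= 70:
        return "Severe"
    if ii >= 40:
        return "High / Moderate-high"
    if ii >= 20:
        return "Moderate / Moderate-low"
    return "Minor"


def class_stability(z, v, w, w_V, n_samples=10_000, spread=0.30, seed=0):
    """Fraction of random weight perturbations that leave the class unchanged."""
    rng = np.random.default_rng(seed)
    z, w = np.asarray(z, dtype=float), np.asarray(w, dtype=float)
    base = classify(100 * (z @ w - w_V * v))
    factors = rng.uniform(1 - spread, 1 + spread, size=(n_samples, len(w)))
    W = w * factors
    W /= W.sum(axis=1, keepdims=True)   # renormalise so each sample sums to 1
    scores = 100 * (W @ z - w_V * v)
    return float(np.mean([classify(s) == base for s in scores]))


# Order: econ cost, ecosystem, displacement, repro, dispersal, health (Section 6.2).
W_BASE = np.array([0.23, 0.21, 0.18, 0.15, 0.13, 0.10])
W_V = float(np.median(W_BASE))
z_example = np.array([0.05, 0.15, 0.20, 0.60, 0.55, 0.30])  # hypothetical species
v_example = 0.3                                              # hypothetical benefit score
print(class_stability(z_example, v_example, W_BASE, W_V))
```

A species whose baseline score sits close to a threshold will show a stability well below 1, which is the signal to report its class with a caveat. To attribute the instability to individual weights, move on to the SALib-based Sobol wrapper.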
8. Strengths and Weaknesses
Strengths
- Spatial PDE, not a logistic-ODE shortcut. Judges explicitly want spread modelling; the Fisher–KPP framing gives a continuous spatial field that can be sliced at any of the requested time points.
- Analytical sanity check. The simulated front speed matches 2√(rD) to within ~6%. Few HiMCM Part-1 spread papers report a quantitative numerical-vs-analytical check — this is a discriminator.
- Data-driven weights (EWM). The impact factor avoids the standard pitfall of weights chosen by the authors to produce the answers the authors want.
- External validation. Ordering of water hyacinth > kudzu > dandelion matches the CABI/USDA expert consensus, which the judges can verify in two clicks.
- Subtracted benefit term. Captures the "friend?" half of the title and lets dandelion's classification move with the benefit weight, illuminating the ambiguity rather than hiding it.
Weaknesses
- Single-species PDE. No explicit native-flora competitor; competitive displacement is bundled into K.
- Isotropic diffusion is a strong assumption. Real seed dispersal is anisotropic (downwind-elongated, gust-driven). We test this in Section 7.2 but the headline Part-1 results assume isotropy.
- Only 10 species in the EWM training set. Entropy of an indicator is sensitive to how many species cluster near each end of [0, 1]. A 50-species set (entirely feasible from CABI ISC) would give more stable weights.
- Health-impact and ecosystem-disruption indicators are Likert-scale expert scores. Replacing them with hard counts (number of allergy-related ER admissions per ha; number of native species with documented decline) would be better but is not feasible inside 14 days.
- No temporal dimension in the impact factor. A young invasion behaves differently from an established one; the index is static and applies to the current state.
9. Future Improvements
- Couple the Part-1 PDE to a native-competitor PDE (Lotka–Volterra–diffusion system) so that the "competitive displacement" indicator in Part 2 can be read off the PDE output rather than supplied externally.
- Replace Fickian diffusion with an explicit log-normal seed-dispersal kernel; convolve with daily wind fields from ERA5 reanalysis to get a realistic anisotropic spread.
- Expand the EWM training set to 50+ species from CABI ISC; report bootstrap confidence intervals on each weight.
- Add a time-varying impact factor II_j(t) that integrates economic cost over the invasion's age, so newly arrived species are not under-scored.
- Build an interactive dashboard (Streamlit) where regulators slide the benefit weight and instantly see how the dandelion's classification changes — the contest's "friend vs. foe" question made decision-relevant.
10. References
- COMAP (2023). HiMCM 2023 Problem A: Dandelions: Friend? Foe? Both? Neither? contest.comap.com.
- Fisher, R. A. (1937). The wave of advance of advantageous genes. Annals of Eugenics, 7(4), 355–369.
- Kolmogorov, A. N., Petrovsky, I. G., & Piskunov, N. S. (1937). A study of the diffusion equation with increase in the amount of substance. Bull. Moscow Univ., Math. Mech., 1, 1–25.
- Murray, J. D. (2002). Mathematical Biology I: An Introduction, 3rd ed. Springer. (Ch. 13: reaction–diffusion, spread of populations.)
- Shigesada, N., & Kawasaki, K. (1997). Biological Invasions: Theory and Practice. Oxford University Press.
- Stewart-Wade, S. M., Neumann, S., Collins, L. L., & Boland, G. J. (2002). The biology of Canadian weeds. 117. Taraxacum officinale G. H. Weber ex Wiggers. Canadian Journal of Plant Science, 82(4), 825–853.
- Honek, A., & Martinkova, Z. (2005). Pre-dispersal predation of Taraxacum officinale (dandelion) seed. Journal of Ecology, 93(2), 335–344.
- Tackenberg, O., Poschlod, P., & Kahmen, S. (2003). Dandelion seed dispersal — the horizontal wind speed does not matter for long-distance dispersal — it is updraft! Plant Biology, 5(5), 451–454.
- Diagne, C., Leroy, B., Vaissière, A.-C., et al. (2021). High and rising economic costs of biological invasions worldwide. Nature, 592, 571–576.
- IPBES (2019). Global assessment report on biodiversity and ecosystem services. Bonn: IPBES secretariat.
- Hwang, C.-L., & Yoon, K. (1981). Multiple Attribute Decision Making: Methods and Applications. Springer.
- Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379–423.
- CABI (2024). Invasive Species Compendium. cabi.org/isc.
- USDA National Invasive Species Information Center (2024). invasivespeciesinfo.gov.
- NOAA NCEI (2024). Global Surface Summary of the Day (GSOD). ncei.noaa.gov.
- Herr, A., Iglesias-Carrasco, M., et al. (2022). Water hyacinth and Lake Victoria fisheries: a synthesis. Aquatic Botany, 178, 103510.
- Forseth, I. N., & Innis, A. F. (2004). Kudzu (Pueraria montana): history, physiology, and ecology combine to make a major ecosystem threat. Critical Reviews in Plant Sciences, 23(5), 401–413.
- Saltelli, A., Ratto, M., Andres, T., et al. (2008). Global Sensitivity Analysis: The Primer. Wiley.
- Herman, J., & Usher, W. (2017). SALib: an open-source Python library for sensitivity analysis. Journal of Open Source Software, 2(9), 97.
11. Report on Use of AI (Appendix, does not count toward 25 pages)
Per COMAP rules in effect for the 2023 contest cycle, all generative-AI use must be disclosed.
| # | Tool | Where used | Prompt summary | How the team verified output |
|---|---|---|---|---|
| 1 | ChatGPT (GPT-4) | Section 4.1, PDE setup | "Refresh: derivation of Fisher–KPP front speed 2√(rD) and the standard non-dimensionalisation." | Cross-checked against Murray (2002, §13). Re-derived the change-of-variables by hand. |
| 2 | ChatGPT (GPT-4) | Section 5, code skeleton | "Skeleton for an explicit-FD 2D Fisher–KPP solver in NumPy with Neumann boundaries." | Read line-by-line; added the mirror-padding for Neumann BC ourselves; verified stability empirically by halving Δt. |
| 3 | GitHub Copilot | Section 5, plotting | Autocomplete on matplotlib.contourf and bar-chart calls. | Inspected each figure visually; compared simulated covered-area to π(c*t)². |
| 4 | None | Sections 1–3, 6 narrative, 8–9 | — | Written manually by team members; AI not consulted. |
The full prompt/response logs are included in appendix_AI_logs.pdf (separate file submitted
alongside this paper, also outside the 25-page limit).
Replace every value tagged [illustrative] with your own computation before submitting anything.