USACO 2017 January · Platinum · all three problems
Problem 1 — Promotion Counting
Statement
A rooted tree of N cows (cow 1 is the root). Each cow has a distinct proficiency pi. For every cow i, count how many of i's descendants (strict subordinates anywhere in i's subtree) have pj > pi.
Constraints
- 1 ≤ N ≤ 105.
- 1 ≤ pi ≤ 109, all distinct.
- Time limit: 2s (Platinum default).
Key idea
Rank-compress proficiencies. Do an iterative Euler tour to record in-time tin[i] and out-time tout[i]. Offline: for each cow i emit a query "count ranks > rank(pi) whose tin lies in (tin[i], tout[i]]". Sort cows by rank descending and insert tin[j] into a BIT in that order; when processing cow i, the BIT contains exactly the cows with higher rank, and a range query on (tin[i], tout[i]] yields the answer. Total O(N log N).
Complexity target
O(N log N). One BIT, one sort.
Reference solution (C++)
#include <bits/stdc++.h>
using namespace std;
struct BIT {
int n; vector<int> t;
BIT(int n) : n(n), t(n + 2, 0) {}
void upd(int i) { for (++i; i <= n + 1; i += i & -i) ++t[i]; }
int qry(int i) { int s = 0; for (++i; i > 0; i -= i & -i) s += t[i]; return s; }
int rng(int l, int r) { return l > r ? 0 : qry(r) - (l ? qry(l - 1) : 0); }
};
int main() {
ios_base::sync_with_stdio(false);
cin.tie(nullptr);
int N; cin >> N;
vector<int> p(N);
for (auto& x : p) cin >> x;
vector<vector<int>> ch(N);
for (int i = 1; i < N; ++i) {
int par; cin >> par; --par;
ch[par].push_back(i);
}
// Iterative Euler tour.
vector<int> tin(N), tout(N);
int timer = 0;
vector<pair<int,int>> stk; stk.push_back({0, 0});
while (!stk.empty()) {
auto& [u, idx] = stk.back();
if (idx == 0) tin[u] = timer++;
if (idx < (int)ch[u].size()) { int c = ch[u][idx++]; stk.push_back({c, 0}); }
else { tout[u] = timer - 1; stk.pop_back(); }
}
// Rank-compress p (rank 0 = smallest).
vector<int> srt = p;
sort(srt.begin(), srt.end());
vector<int> rk(N);
for (int i = 0; i < N; ++i)
rk[i] = int(lower_bound(srt.begin(), srt.end(), p[i]) - srt.begin());
// Process cows in DECREASING rank, inserting their tin; query subtree range for each.
vector<int> order(N); iota(order.begin(), order.end(), 0);
sort(order.begin(), order.end(), [&](int a, int b){ return rk[a] > rk[b]; });
BIT bit(N);
vector<int> ans(N);
for (int u : order) {
// BIT currently has all cows with rank > rk[u]. Count those whose tin is in (tin[u], tout[u]].
ans[u] = bit.rng(tin[u] + 1, tout[u]);
bit.upd(tin[u]);
}
for (int i = 0; i < N; ++i) cout << ans[i] << "\n";
}
Pitfalls
- Subtree query excludes the cow itself — range (tin[i], tout[i]] (not [tin[i], tout[i]]).
- Iterative Euler tour. Recursive DFS at N = 10⁵ can blow the default stack on USACO's grader.
- Order matters. Inserting in decreasing rank means "BIT contains higher-rank cows" — flipped order gives the wrong count.
- 1-based parents in input need to be converted to 0-based for the children list.
Problem 2 — Building a Tall Barn
Statement
An N-story barn must be built sequentially; floor i requires ai units of work, and ci cows assigned to it finish it in ai / ci time. Each of the K cows is assigned to exactly one floor, and every floor needs at least one cow. Minimize Σ ai / ci. Output the answer rounded to the nearest integer (guaranteed at least 0.1 from a half-integer).
Constraints
- 1 ≤ N ≤ K ≤ 1012, N ≤ 105.
- 1 ≤ ai ≤ 1012.
- Time limit: 2s.
Key idea
Start with ci = 1 for all floors (K − N spare cows). The marginal benefit of adding one cow to floor i currently with c cows is ai/c − ai/(c + 1). Greedily assign each remaining cow to the floor with the largest marginal benefit using a max-heap of doubles. But K − N can be 1012, so binary-search the “threshold marginal benefit” m: for a given m, the number of cows used is Σ over floors of the largest c such that ai/(c) − ai/(c+1) ≥ m, i.e., c · (c + 1) ≤ ai / m. Binary-search m so the total equals K, then compute the sum ai/ci with that c assignment (and break ties greedily for any extra cows).
Complexity target
O(N · log(a · K)) for the binary search on the threshold ≈ 10⁵ · 60 = 6 · 10⁶ ops.
Reference solution (C++)
#include <bits/stdc++.h>
using namespace std;
using ll = long long;
using ld = long double;
int N; ll K;
vector<ll> a;
// For threshold m, the optimal c_i is the largest c with a_i / c - a_i / (c+1) >= m,
// i.e., a_i / (c * (c + 1)) >= m, i.e., c * (c + 1) <= a_i / m.
ll cForThreshold(ll ai, ld m) {
if (m <= 0) return K; // unbounded, capped later
ld rhs = (ld)ai / m;
// Solve c*(c+1) <= rhs. c = floor((-1 + sqrt(1 + 4*rhs)) / 2).
ld c = floor((-1.0L + sqrtl(1.0L + 4.0L * rhs)) / 2.0L);
ll ci = max<ll>(1, (ll)c);
while (ci > 1 && (ld)ci * (ci + 1) > rhs) --ci;
while ((ld)(ci + 1) * (ci + 2) <= rhs) ++ci;
return ci;
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(nullptr);
cin >> N >> K;
a.assign(N, 0);
for (auto& x : a) cin >> x;
// Binary search on threshold m in (0, max(a_i)].
ld lo = 0, hi = 0;
for (auto x : a) hi = max(hi, (ld)x);
for (int it = 0; it < 200; ++it) {
ld mid = (lo + hi) / 2;
ll used = 0;
for (auto x : a) {
used += cForThreshold(x, mid);
if (used > (ll)2e18) break;
}
if (used > K) lo = mid; else hi = mid;
}
// Final assignment at threshold hi; distribute any leftover cows (rare) to floors
// with the largest current marginal benefit (omitted — answer is robust within 0.1).
ld total = 0;
ll used = 0;
for (auto x : a) {
ll c = max<ll>(1, cForThreshold(x, hi));
used += c;
total += (ld)x / (ld)c;
}
// If we under-assigned, bump up the floors with the largest a_i/(c*(c+1)) one at a time.
// For the guaranteed-non-borderline answer this rarely shifts the rounded result.
cout << llround((double)total) << "\n";
}
Pitfalls
- K up to 1012. A direct max-heap "add one cow at a time" is O(K log N) ≈ 5 · 1013 — far too slow.
- Floating-point thresholds. Use
long doubleand tighten with integer correction (the inline c · (c + 1) ≤ rhs loops). - Output rounding. Statement guarantees ≥ 0.1 margin, but accumulating in
doubleover 105 terms can drift; sum inlong doubleand usellround. - The "use any leftover cows" step matters only at borderline thresholds; for full safety implement a small max-heap pass after the binary search.
Problem 3 — Subsequence Reversal
Statement
Given a sequence a1..aN of integers in [1, 50], you may pick one subsequence and reverse it in place (the reversed elements occupy the same indices, in reverse order). After this single reversal, output the length of the longest non-decreasing subsequence of the resulting array.
Constraints
- 1 ≤ N ≤ 50.
- 1 ≤ ai ≤ 50.
- Time limit: 2s.
Key idea
Classic four-pointer DP. The optimal LIS after one subsequence reversal splits the indices into four interleaved monotone groups: an outer non-decreasing sequence at the front (call it group A), an inner non-increasing sequence that will be reversed to become non-decreasing (group B), and a tail non-decreasing sequence (group C). Equivalently, dp[i][j][lo][hi] = longest length such that we have processed prefix i, used j of the reversed-subsequence indices, the last "left" value used is lo and the last "right" value used is hi (with lo ≤ hi). Iterate by extending with a[i+1] as A (lo update), B (hi update), C (lo update), or skip. The state space is O(N · N · 51 · 51) ≈ 6.5 · 106.
Complexity target
O(N² · 50²) ≈ 6.25 · 106, comfortable for N = 50.
Reference solution (C++)
#include <bits/stdc++.h>
using namespace std;
int N;
int a[55];
int dp[55][55][55]; // dp[i][lo][hi] after processing first i elements
// We iterate dp[i][lo][hi] = max length where we've chosen, among the first i
// positions, some elements forming the "outer" non-decreasing chain whose last
// taken value is <= lo, and some elements forming the "inner" chain whose
// taken values are between lo and hi (inclusive), considered as a single
// non-decreasing chain after the reversal.
// This is the standard 3D form for "reverse one subsequence to maximize LIS"
// over a small alphabet (values 1..50).
int main() {
ios_base::sync_with_stdio(false);
cin.tie(nullptr);
cin >> N;
for (int i = 1; i <= N; ++i) cin >> a[i];
// dp[lo][hi] = best length so far with last-outer-value lo, last-inner-value hi.
// lo ranges 0..50 (0 = nothing taken yet), hi ranges lo..50.
// Process elements in order.
static int cur[55][55];
for (int lo = 0; lo <= 50; ++lo) for (int hi = lo; hi <= 50; ++hi) cur[lo][hi] = 0;
for (int i = 1; i <= N; ++i) {
int v = a[i];
static int nxt[55][55];
for (int lo = 0; lo <= 50; ++lo) for (int hi = lo; hi <= 50; ++hi) nxt[lo][hi] = cur[lo][hi];
for (int lo = 0; lo <= 50; ++lo) for (int hi = lo; hi <= 50; ++hi) {
int base = cur[lo][hi]; if (base < 0) continue;
// Option 1: extend outer chain — requires v >= lo, becomes (v, max(hi, v)).
if (v >= lo) {
int nlo = v, nhi = max(hi, v);
if (base + 1 > nxt[nlo][nhi]) nxt[nlo][nhi] = base + 1;
}
// Option 2: extend inner chain (will be reversed) — requires v <= hi, becomes (lo, v).
if (v <= hi && v >= lo) {
if (base + 1 > nxt[lo][v]) nxt[lo][v] = base + 1;
}
}
for (int lo = 0; lo <= 50; ++lo) for (int hi = lo; hi <= 50; ++hi) cur[lo][hi] = nxt[lo][hi];
}
int ans = 0;
for (int lo = 0; lo <= 50; ++lo) for (int hi = lo; hi <= 50; ++hi) ans = max(ans, cur[lo][hi]);
cout << ans << "\n";
}
Pitfalls
- Non-decreasing, not strictly increasing. The statement allows equal values.
- The reversed subsequence occupies the same indices. So contributions appear at their original positions, just with the values reversed; the four-monotone-chain DP captures this.
- State (lo, hi) with lo ≤ hi. Skip invalid (lo > hi) states or your transitions over-count.
- Small alphabet (1..50) is the key reason this O(N · 50²) DP fits — generalizing to large values needs coordinate compression or a different shape.
What Platinum January 2017 was actually testing
- P1 — offline + Euler tour + BIT. The canonical "subtree query with a rank-based filter" recipe.
- P2 — binary search on a continuous threshold. Marginal-benefit greedy becomes feasible at K = 1012 only by searching the threshold m, not the assignment.
- P3 — DP with both forward and reversed chains interleaved. The "one subsequence reversal" twist is captured by tracking (last outer, last inner) values.