2013 · Problem B — Bank Service Problem

Queueing theory Discrete-event simulation Empirical distributions Staffing optimization

The prompt, restated

A bank branch has collected one shift's worth of data on customer arrivals and service times. The data are given as histograms — empirical probability distributions of inter-arrival times and per-customer service durations. Branch management has stated two service-level targets:

Average customer wait time < 2 minutes;
Average queue length ≤ 2 persons.

Teams are asked: (1) Build a model of the branch's queueing system, treating arrivals and service times as draws from the empirical distributions. (2) Estimate the current average wait and queue length under the existing teller staffing and decide whether the targets are met. (3) If the targets are not met, find the minimum staffing change (add tellers, change shift schedule, or both) that meets both targets simultaneously. (4) Write a letter to bank management with the recommendation, the expected wait/queue numbers under the proposed plan, and a brief discussion of cost trade-offs (tellers cost money, customers cost loyalty).

Key modeling idea

This is a classical M/G/c (or G/G/c) queueing problem with a single shared queue feeding $c$ identical servers — the canonical bank-teller pattern. The arrival process is approximately Poisson (memorylessness checked empirically), service times are general (not exponential — they show a clear right tail). Because $c$ is small and the service distribution is non-exponential, closed-form steady-state formulas (Erlang-C, Pollaczek–Khinchine) are good back-of-envelope checks but the headline numbers should come from a discrete-event simulation sampling directly from the empirical histograms.

The optimization is one-dimensional: increment $c$ from the current staffing until both KPIs (mean wait, mean queue) clear their thresholds. The smallest such $c$ is the recommendation.

Suggested approach

Step 1 — Characterize the inputs. Compute mean and variance of the empirical inter-arrival and service-time histograms. Test whether arrivals look Poisson via a chi-square fit of exponential inter-arrivals (technique 4 — regression / GoF).
Step 2 — Closed-form sanity check. Compute traffic intensity $\rho = \lambda / (c\mu)$. Apply Erlang-C (M/M/c) and Pollaczek–Khinchine (M/G/1) bounds for average wait $W_q$ and queue length $L_q = \lambda W_q$ (Little's law). These give you a target ballpark before you trust the simulation.
Step 3 — Discrete-event simulation. Sample inter-arrivals and service times from the empirical distributions, run a long shift (≥ 10,000 customers) to wash out warm-up effects, and record per-customer wait and time-average queue length (technique 10 — Monte Carlo, technique 9 — queueing).
Step 4 — Find the minimum $c$. Sweep $c$ from the current staffing upward; report the smallest $c$ where both $W_q < 2$ min and $L_q \le 2$ hold across ≥ 30 replications (so the conclusion isn't a noisy single-run artifact).
Step 5 — Consider time-varying staffing. Real bank arrivals have a midday spike; a flat $c$ that handles peak hour overstaffs the rest of the day. Propose a two- or three-shift schedule and report the lower total teller-hours (technique 11 — optimization).

Data sources to consider

Source	What you get
Kendall (1953) & Erlang's original 1909 paper	Foundational M/M/c and Erlang-C formulas
Pollaczek–Khinchine formula	Closed-form mean wait for M/G/1 — handy when service variance is large
Brown et al. (2005), Statistical Analysis of a Telephone Call Center	Empirical evidence that real arrival processes are inhomogeneous Poisson with weekday/hourly seasonality
Federal Reserve / FDIC branch-traffic survey reports	Order-of-magnitude transactions-per-teller-hour benchmarks for U.S. retail branches
SimPy documentation	Battle-tested discrete-event framework for the simulation step

Common pitfalls

Forcing exponential service times. Bank service is bimodal — fast deposits and slow loan inquiries. M/M/c will understate the wait. Use the empirical distribution directly or fit a hyperexponential / lognormal.
Single-run conclusion. One simulation replicate with 200 customers has enormous variance. Run ≥ 30 replications and report a 95% confidence interval around mean wait.
Ignoring warm-up. The first ~500 customers are biased by the empty start. Discard or use a steady-state initialization.
Adding tellers without checking saturation. If $\rho \ge 1$ at the current $c$, the queue is unstable — wait grows without bound and no amount of running longer "averages out". Diagnose $\rho$ first.
Single-queue vs. multi-queue confusion. Banks almost always use one shared queue feeding all tellers. Modeling $c$ independent M/M/1 queues makes the wait look much worse — state your queueing discipline clearly.
No cost trade-off in the letter. Management wants a dollar figure. Multiply marginal teller-hours by a fully-loaded wage (~\$30/hr [illustrative]) and compare to the loyalty-loss estimate.

Python sketch

Discrete-event M/G/c simulation drawing from empirical histograms; sweep $c$ to find the minimum staffing meeting both KPIs.

import numpy as np
import heapq

rng = np.random.default_rng(0)

# --- Empirical histograms from the prompt [illustrative] ---
# inter-arrival times (minutes) and their probabilities
IA_VALUES = np.array([0.5, 1.0, 1.5, 2.0, 3.0, 5.0])
IA_PROBS  = np.array([0.20, 0.30, 0.25, 0.15, 0.07, 0.03])
SV_VALUES = np.array([1.0, 2.0, 3.0, 4.0, 6.0, 10.0])
SV_PROBS  = np.array([0.10, 0.30, 0.30, 0.15, 0.10, 0.05])

def draw(values, probs, n):
    return rng.choice(values, size=n, p=probs)

def simulate(c, n_cust=20_000, warmup=500):
    ia = draw(IA_VALUES, IA_PROBS, n_cust)
    sv = draw(SV_VALUES, SV_PROBS, n_cust)
    arrival = np.cumsum(ia)
    teller_free = [0.0] * c           # min-heap of next-free times
    heapq.heapify(teller_free)
    waits = np.zeros(n_cust)
    for i in range(n_cust):
        earliest = heapq.heappop(teller_free)
        start = max(arrival[i], earliest)
        waits[i] = start - arrival[i]
        heapq.heappush(teller_free, start + sv[i])
    w = waits[warmup:]
    # time-average queue length via Little's law: L_q = lambda * W_q
    lam = 1.0 / IA_VALUES.dot(IA_PROBS)
    return w.mean(), lam * w.mean()

# --- Step 4: sweep c until both targets met ---
for c in range(1, 8):
    Wq, Lq = simulate(c)
    ok = (Wq < 2.0) and (Lq <= 2.0)
    print(f"c={c}: mean wait = {Wq:5.2f} min, mean queue = {Lq:4.2f}   {'OK' if ok else ''}")
    if ok:
        print(f"--> recommend c = {c}")
        break

Sensitivity & validation checklist

Compare simulated $W_q$ at the current $c$ against the Pollaczek–Khinchine value for a single-server bound — they should agree within ~10% at low $\rho$.
Vary $\lambda$ by ±20% (e.g., a busier or slower branch) — does the recommended $c$ change by more than one teller? Report the breakpoint.
Inflate service-time variance by 2× (e.g., more loan customers) and recheck — variance, not mean, is what makes the M/G/c queue blow up.
Run 50 independent simulations at the recommended $c$; the 95% CI on mean wait should sit entirely below 2 minutes [illustrative ~1.2–1.7 min].
Validate Little's law: independently measure $L_q$ by integrating queue length over time, then check that $L_q \approx \lambda W_q$ — within 5%.
Stress-test a midday spike: double $\lambda$ for one hour and confirm the steady recommendation doesn't catastrophically degrade.

← 2013 Problem A All past problems