1. Scope
Two estimator implementations are deployed on the public site. §4 documents the simple three-input slider rendered on the homepage as a closed-form approximation to a stochastic loss process (§2). §8 documents the multi-step five-input audit modal as a three-model decomposition with a confidence haircut. Both produce illustrative point estimates anchored to public research and standard UK private dental practice economics. Neither output constitutes a primary audit of any specific practice. All numerical defaults are reproducible from the parameter derivations in §9 and the source citations in §15.
2. Stochastic foundations
Although the on-page calculators surface deterministic point estimates, the underlying loss process is stochastic. We document the generative model first to motivate the closed-form approximation used in §4 and to ground the sensitivity analysis in §5 and the variance estimates in §6.
2.1 Enquiry arrival process
Let new-patient enquiries to a UK private dental practice arrive as a non-homogeneous Poisson process with time-varying intensity $\lambda(t)$. The intensity exhibits two well-documented periodicities: an intra-day cycle peaking between 18:00 and 22:00, and a weekly cycle with reduced volume on Sundays and bank holidays. The expected number of enquiries $E$ over a calendar month of length $T$ is the integral of the intensity over time:

$$E = \int_0^T \lambda(t)\,\mathrm{d}t \tag{1}$$

For practical purposes, only the integral matters in the closed-form output, which is why the homepage calculator collects $E$ as a single monthly volume rather than a parametric form for $\lambda(t)$. The intra-day shape is preserved implicitly through the response-time hazard (§2.3), since after-hours enquiries are systematically subjected to longer response delays.
2.2 Response-time distribution
Conditional on an enquiry arriving at time $t$, let the practice response time $\tau$ be a non-negative random variable with cumulative distribution function $F(\tau \mid t)$. Empirically, response times for UK private dental practices follow a bimodal distribution dominated by an exponential tail for in-hours enquiries and a step function for after-hours enquiries (which receive no response until the next staffed hour). For the closed-form approximation in §4, we collapse the bimodality into a single survival function $S(\tau)$, treating the tail as approximately exponential with rate $\mu$:

$$S(\tau) = 1 - F(\tau) \approx e^{-\mu \tau} \tag{2}$$
2.3 Conversion hazard
Let $h(\tau)$ denote the probability that an enquiry converts to a booked first visit, conditional on a response delay of $\tau$ hours. Oldroyd et al. [1] establish that $h(\tau)$ is well approximated by a monotonically decreasing sigmoidal function of $\tau$:

$$h(\tau) = h_0\,\bigl[1 - \sigma\bigl(\alpha(\tau - \beta)\bigr)\bigr] \tag{3}$$
where $h_0$ is the maximum convertible share at zero response time, $\sigma(\cdot)$ is the logistic function, and $\alpha, \beta$ are shape parameters fit to the published curve. The conversion drop-off $d$ used in the simple calculator is the marginal probability of non-conversion integrated over the response-time distribution and the arrival intensity:

$$d = \int_0^T w(t) \int_0^\infty \bigl[1 - h(\tau)\bigr]\, f(\tau \mid t)\,\mathrm{d}\tau\,\mathrm{d}t \tag{4}$$

where $f(\tau \mid t)$ is the response-time density given arrival time $t$ and $w(t) = \lambda(t)/E$ is the time-of-day arrival weight. For UK private dental practices with the typical channel mix (~55% after-hours arrival share), evaluation of (4) under the empirical curve yields a range for $d$; the midpoint is taken as the slider default (see §4.3).
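As a rough illustration of how the drop-off integral can be evaluated numerically, the sketch below uses a two-component delay model: exponential response times for in-hours enquiries and a fixed overnight delay for after-hours ones. The decreasing-sigmoid hazard form and every parameter value (`h0`, `alpha`, `beta`, `mu`, `after_delay`) are assumptions chosen for illustration, not the fitted values behind the published default.

```python
import math


def logistic(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def hazard(tau: float, h0: float, alpha: float, beta: float) -> float:
    """Assumed conversion probability at a response delay of tau hours."""
    return h0 * (1.0 - logistic(alpha * (tau - beta)))


def drop_off(h0, alpha, beta, mu, after_share, after_delay,
             tau_max=48.0, steps=20_000):
    """Expected non-conversion share under the two-component delay model."""
    dt = tau_max / steps
    # In-hours: integrate h(tau) against an Exp(mu) density (midpoint rule).
    in_hours = sum(
        hazard((i + 0.5) * dt, h0, alpha, beta)
        * mu * math.exp(-mu * (i + 0.5) * dt) * dt
        for i in range(steps)
    )
    # After-hours: every enquiry waits a fixed after_delay hours (assumption).
    after_hours = hazard(after_delay, h0, alpha, beta)
    expected_conversion = (1 - after_share) * in_hours + after_share * after_hours
    return 1.0 - expected_conversion


# Hypothetical inputs; only the 55% after-hours share comes from the text.
d_example = drop_off(h0=0.9, alpha=2.0, beta=1.0, mu=2.0,
                     after_share=0.55, after_delay=12.0)
print(round(d_example, 3))
```

Raising `after_share` or `after_delay` pushes the result upward, which is the mechanism by which the intra-day arrival shape enters the single drop-off figure.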
2.4 Per-enquiry expected loss
The expected revenue lost per arriving enquiry, $\ell$, is the per-enquiry first-visit value multiplied by the marginal non-conversion probability:

$$\ell = V \cdot d \tag{5}$$
Aggregated over the calendar month, expected total revenue lost is the per-enquiry loss multiplied by expected enquiry volume. This is the operational closed-form approximation used by the homepage calculator.
3. Notation
The following variables and operators appear throughout this document.
| Symbol | Definition | Unit | Default |
|---|---|---|---|
| $E$ | Monthly enquiries | enq/mo | 45 |
| $d$ | Marginal non-conversion probability | decimal | 0.47 |
| $V$ | Average first-visit fee | £ | 100 |
| $L$ | Monthly revenue loss (point estimate) | £/mo | derived |
| $\tilde{L}$ | Monthly revenue loss (random variable) | £/mo | n/a |
| $\lambda(t)$ | Arrival intensity at time $t$ | enq/hr | empirical |
| $\tau$ | Practice response time | hours | empirical |
| $\mu$ | Exponential rate of response-time tail | 1/hr | empirical |
| $h(\tau)$ | Conversion hazard at response time $\tau$ | decimal | sigmoidal |
| $\rho$ | Aggregate recovery factor (simple calc) | decimal | 0.80 |
| $\rho_k$ | Per-category recovery, $k \in \{\text{DNA}, \text{admin}, \text{lead}\}$ | decimal | {0.25, 0.35, 0.15} |
| $\kappa$ | Confidence haircut (modal) | decimal | 0.80 |
| $C_s$ | One-time setup fee | £ | 2,500 |
| $C_r$ | Monthly retainer fee | £/mo | 500 |
| $R$ | Monthly recoverable amount | £/mo | derived |
| $G$ | Net monthly gain after retainer | £/mo | derived |
| $t_p$ | Payback period | months | derived |
| $r$ | Monthly discount rate (NPV) | decimal | 0.0083 |
| $\mathrm{NPV}(H)$ | Net present value at horizon $H$ | £ | derived |

Expectation, variance, and probability operators take their standard meanings: $\mathbb{E}[\cdot]$, $\operatorname{Var}[\cdot]$, $\Pr(\cdot)$.
4. Simple calculator (homepage)
4.1 Closed-form loss
Combining (1) and (5), the expected monthly revenue loss is:

$$L = E \cdot d \cdot V \tag{6}$$

The slider surface elicits the three right-hand side variables directly. Equation (6) is a first-order approximation to the stochastic process in §2 under three assumptions: (i) $E$ is the deterministic monthly mean and ignores Poisson variance; (ii) $d$ aggregates the conversion hazard over the empirical response-time distribution and channel mix; (iii) $V$ is treated as a deterministic first-visit fee rather than a draw from the practice fee schedule. Variance estimates arising from relaxation of (i) and (iii) are derived in §6.
4.2 The drop-off as expected non-conversion
The drop-off $d$ is interpreted as the expected fraction of arriving enquiries that fail to convert under current operational conditions. By (4):

$$d = \mathbb{E}\bigl[1 - h(\tau)\bigr] = 1 - \mathbb{E}\bigl[h(\tau)\bigr]$$

A practice that closes the response-time gap to $\tau \to 0$ drives $h(\tau) \to h_0$ and therefore $d \to 1 - h_0$, the structural floor of unrecoverable enquiries (price-shoppers, mis-targeted leads, out-of-area). The recovery factor $\rho$ in §7 reflects the gap between the empirical $d$ and this floor.
4.3 Default parameter $d$
Oldroyd et al. [1] mapped lead-response time against conversion across 1.25 million enquiries. The conversion curve drops by an order of magnitude between a 5-minute reply and a 60-minute reply, and is approximately flat above four hours. For UK private dental enquiry timing under typical channel mix (~55% after-hours arrival), numerical evaluation of (4) using the published curve fitted to (3) yields a range for $d$. The slider default is set to the midpoint, $d = 0.47$. The slider permits manual override.
4.4 Default parameter $V$
The default $V = 100$ (£) is a conservative anchor relative to the typical £200 – £300 first-visit fee tier observed in UK private dental practice [2]. The deliberately low default reduces the risk of inflated illustrative figures when the visitor has not yet calibrated the slider to their fee structure. Under the default slider configuration $(E, d, V) = (45, 0.47, 100)$, equation (6) returns $L = 45 \times 0.47 \times 100 = 2{,}115$, i.e. £2,115 per month.
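The closed-form loss is a single product, which the minimal sketch below reproduces at the defaults listed in the notation table of §3. The function and argument names are illustrative and are not taken from the deployed calculator.

```python
def monthly_loss(enquiries: float, drop_off: float, first_visit_fee: float) -> float:
    """Closed-form monthly revenue loss: enquiries x drop-off x first-visit fee."""
    return enquiries * drop_off * first_visit_fee


# Defaults from the notation table in section 3.
default_loss = monthly_loss(enquiries=45, drop_off=0.47, first_visit_fee=100)
print(f"£{default_loss:,.0f}/mo")  # £2,115/mo
```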
5. Sensitivity analysis
We compute partial derivatives of $L$ with respect to each input to identify which slider controls the largest marginal effect on the displayed loss. From (6):

$$\frac{\partial L}{\partial E} = dV, \qquad \frac{\partial L}{\partial d} = EV, \qquad \frac{\partial L}{\partial V} = Ed$$

The corresponding elasticities, normalising each derivative by the ratio $x / L$ for input $x$, are:

$$\varepsilon_E = \varepsilon_d = \varepsilon_V = 1$$

All three inputs have unit elasticity at the closed-form level, an expected consequence of (6) being multiplicative in each input. Locally, the slider handle that yields the largest absolute change in $L$ per slider step is therefore the one with the largest relative increment $\Delta x / x$ per step. Under the deployed slider step sizes $(\Delta E, \Delta d, \Delta V) = (5, 0.05, 10)$, the largest single-step impact at the default operating point is on $E$:
| Variable | Step Δ | ΔL at default (£/mo) | % of L |
|---|---|---|---|
| E | 5 enq/mo | +235 | +11% |
| d | 0.05 | +225 | +11% |
| V | £10 | +212 | +10% |
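The step-impact figures in the table can be reproduced with a short sketch, assuming the multiplicative loss form and the defaults and step sizes quoted in this document. The dictionary and function names are illustrative.

```python
DEFAULTS = {"E": 45, "d": 0.47, "V": 100}
STEPS = {"E": 5, "d": 0.05, "V": 10}


def loss(E: float, d: float, V: float) -> float:
    """Multiplicative closed-form loss."""
    return E * d * V


def step_impacts(defaults=DEFAULTS, steps=STEPS):
    """Absolute and relative change in loss for one slider step per input."""
    base = loss(**defaults)
    impacts = {}
    for name, step in steps.items():
        bumped = dict(defaults, **{name: defaults[name] + step})
        delta = loss(**bumped) - base
        impacts[name] = (delta, delta / base)
    return impacts


impacts = step_impacts()
for name, (delta, share) in impacts.items():
    print(f"{name}: +£{delta:.1f}/mo ({share:+.1%})")
```

Because the loss is a pure product, each relative change equals the relative step size of the input that was moved.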
6. Confidence and variance
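Treating each enquiry as a Bernoulli trial with conversion probability $1 - d$, the count of lost enquiries $N$ in a given month is binomial:

$$N \sim \operatorname{Binomial}(E, d)$$

The variance of $N$ conditional on $E$ is:

$$\operatorname{Var}[N \mid E] = E\,d\,(1 - d)$$

Treating $V$ as deterministic and $\tilde{L} = V N$, the conditional variance of monthly loss is:

$$\operatorname{Var}[\tilde{L} \mid E] = V^2\,E\,d\,(1 - d)$$

At the default operating point, the standard deviation of the monthly loss is $\operatorname{SD}[\tilde{L}] = 100\sqrt{45 \times 0.47 \times 0.53} \approx 335$ (£/mo). An asymptotic 95% confidence interval under normal approximation (taken to hold down to the slider's lower clamp) is:

$$L \pm 1.96\,\operatorname{SD}[\tilde{L}] \approx [1{,}459,\ 2{,}771] \ \text{£/mo}$$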
The point estimate is displayed without confidence interval bounds on the calculator surface, since the dominant source of uncertainty in practice is parameter mis-specification (§11), not the Bernoulli sampling variance derived above. The variance is documented here for reproducibility.
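For reproducibility, the Bernoulli sampling variance and the normal-approximation interval can be computed as in the sketch below. It assumes the lost-enquiry count is binomial with the slider defaults as parameters and treats the fee as fixed. The function name is illustrative.

```python
import math


def loss_interval(E: int, d: float, V: float, z: float = 1.96):
    """Point estimate, standard deviation and normal-approximation interval."""
    point = E * d * V
    sd = V * math.sqrt(E * d * (1 - d))  # V * sqrt(Var[Binomial(E, d)])
    return point, sd, (point - z * sd, point + z * sd)


point, sd, (low, high) = loss_interval(E=45, d=0.47, V=100)
print(f"L = £{point:,.0f}, SD = £{sd:,.0f}, 95% CI = [£{low:,.0f}, £{high:,.0f}]")
```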
7. Payback model
7.1 Deterministic formulation
Recovery and net monthly gain under deployment:

$$R = \rho L \tag{9}$$

$$G = R - C_r \tag{10}$$

The displayed payback period in months under simple-payback discounting is:

$$t_p = \frac{C_s}{G} \tag{11}$$
7.2 Stochastic formulation
Treating $\tilde{L}$ as the random variable from §6 and propagating its distribution through (9) – (11):

$$\tilde{t}_p = \frac{C_s}{\rho \tilde{L} - C_r}$$

For each draw $\tilde{L}^{(i)}$ from the binomial-derived distribution, $\tilde{t}_p^{(i)} = C_s / \tilde{G}^{(i)}$ is computed, where $\tilde{G}^{(i)} = \rho \tilde{L}^{(i)} - C_r$. The resulting empirical distribution can be characterised by Monte Carlo simulation. Because $\tilde{t}_p$ is a rational function of $\tilde{L}$, its distribution is heavy-tailed; the deployed calculator surfaces only the deterministic point estimate from §7.1 to avoid presenting a misleading mean of a heavy-tailed distribution.
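A minimal Monte Carlo sketch of this propagation follows, assuming the binomial loss model of §6 and the defaults in the notation table. The draw count and seed are arbitrary choices for illustration, and draws with a non-positive net gain are counted separately because the payback is undefined there.

```python
import random
import statistics


def simulate_payback(E=45, d=0.47, V=100.0, rho=0.80,
                     setup=2500.0, retainer=500.0,
                     draws=10_000, seed=0):
    """Simulate payback periods; returns (paybacks, share with no payback)."""
    rng = random.Random(seed)
    paybacks, no_payback = [], 0
    for _ in range(draws):
        lost = sum(rng.random() < d for _ in range(E))  # Binomial(E, d) draw
        gain = rho * V * lost - retainer                # net monthly gain
        if gain <= 0:
            no_payback += 1
        else:
            paybacks.append(setup / gain)
    return paybacks, no_payback / draws


paybacks, share_never = simulate_payback()
median_payback = statistics.median(paybacks)
p95_payback = sorted(paybacks)[int(0.95 * len(paybacks))]
print(f"median {median_payback:.2f} mo, 95th percentile {p95_payback:.2f} mo")
```

Reporting the median and an upper percentile, rather than the mean, sidesteps the heavy right tail described above.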
7.3 Display function
The displayed payback string is computed from the deterministic $t_p$:
display(t_p) =
  "less than a month"   if t_p < 1
  "1 month"             if t_p = 1
  "⌈t_p⌉ months"        if 1 < t_p ≤ 12
  hidden                if t_p > 12 or G ≤ 0

Ceiling rather than rounding is used so the displayed figure never claims a faster payback than the underlying calculation supports. Hiding above 12 months avoids surfacing payback periods at which the calculator is no longer the appropriate decision-support tool.
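The display rule above translates directly into a small function. The sketch below is one possible rendering in Python; the function name and the use of `None` for the hidden state are assumptions, not the deployed front-end code.

```python
import math
from typing import Optional


def display_payback(t_p: float, net_gain: float) -> Optional[str]:
    """Return the payback string, or None when the figure should be hidden."""
    if net_gain <= 0 or t_p > 12:
        return None                      # hidden
    if t_p < 1:
        return "less than a month"
    if t_p == 1:
        return "1 month"
    return f"{math.ceil(t_p)} months"    # ceiling, never rounding down


print(display_payback(2500 / 1192, 1192))  # 3 months
```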
7.4 Visual tier function
On-screen colour intensity is keyed to the displayed string rather than the raw decimal, so two visitors who see the same string never see different visuals:
T(t_p) =
  bright     if ⌈t_p⌉ ≤ 1
  standard   if 2 ≤ ⌈t_p⌉ ≤ 4
  soft       if 5 ≤ ⌈t_p⌉ ≤ 12

8. Detailed audit (modal)
8.1 Three-model decomposition
The audit modal collects five inputs (appointment volume, no-show rate, weekly admin hours, lead volume, average treatment fee) and runs three independent loss models. The 4.33 weeks/month conversion factor is the calendar mean $52 / 12 \approx 4.33$:
8.2 Aggregate loss
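The three components are summed and a single confidence haircut $\kappa$ is applied to the aggregate to absorb inter-model double counting and self-report inflation (see §9.4):

$$L_{\text{modal}} = \kappa\,\bigl(L_{\text{DNA}} + L_{\text{admin}} + L_{\text{lead}}\bigr)$$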
8.3 Recovery decomposition
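Recovery is computed per category with category-specific factors $\rho_k$ derived in §9, and aggregated under the same $\kappa$:

$$R_{\text{modal}} = \kappa \sum_{k} \rho_k L_k, \qquad k \in \{\text{DNA}, \text{admin}, \text{lead}\}$$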
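The aggregation described in §8.2 and §8.3 can be sketched as follows, taking the three component losses as given inputs. The per-category factors and the haircut are the defaults from the notation table; the component amounts in the example are hypothetical and only sum to the ~£10,000/month reference level.

```python
RECOVERY = {"dna": 0.25, "admin": 0.35, "lead": 0.15}  # per-category factors
HAIRCUT = 0.80                                          # confidence haircut


def modal_summary(losses, recovery=RECOVERY, haircut=HAIRCUT):
    """Return (haircut aggregate loss, haircut aggregate recovery)."""
    gross_loss = sum(losses.values())
    gross_recovery = sum(recovery[k] * losses[k] for k in losses)
    return haircut * gross_loss, haircut * gross_recovery


# Hypothetical component losses in £/month.
example = {"dna": 4000.0, "admin": 2000.0, "lead": 4000.0}
net_loss, net_recovery = modal_summary(example)
print(f"loss £{net_loss:,.0f}/mo, recoverable £{net_recovery:,.0f}/mo")
```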
8.4 Rationale for additive aggregation
The three loss models are aggregated additively rather than multiplicatively because each represents an independent flow: appointment chair-time loss (DNA), labour allocation loss (administrative overhead), and acquisition-stage loss (slow lead response). These flows do not in general feed into each other at the operational level; an enquiry lost to slow response does not propagate into a no-show event, since no booking ever exists. The marginal cases (a slow-response enquiry that books on the second attempt and then no-shows) are absorbed by the confidence haircut in §9.4.
9. Parameter derivation
9.1 $\rho_{\text{DNA}}$
Guy et al. [3] meta-analysed 18 SMS-reminder studies and report a pooled odds ratio of $\mathrm{OR} = 1.48$ across the RCT subset ($n = 8$) for clinic attendance under SMS-reminder deployment versus control. Conversion of an odds ratio improvement to relative DNA rate reduction depends on the baseline DNA rate $p_0$:

$$\text{relative reduction} = 1 - \frac{1}{p_0 + \mathrm{OR}\,(1 - p_0)}$$

For self-selected private patients with adherent baseline behaviour (low $p_0$), the formula yields a band of relative reductions. NHS Digital sector data [4] provides directional confirmation. Conservative midpoint $\rho_{\text{DNA}} = 0.25$.
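The conversion from an attendance odds ratio to a relative DNA reduction can be made explicit: scale the control attendance odds by the odds ratio, convert back to a DNA probability, and compare with the baseline. The sketch below assumes the odds ratio applies uniformly at the chosen baseline; the inputs in the example are illustrative rather than values taken from the cited study.

```python
def relative_dna_reduction(odds_ratio: float, baseline_dna: float) -> float:
    """Relative reduction in DNA rate implied by an attendance odds ratio."""
    if not 0 < baseline_dna < 1:
        raise ValueError("baseline_dna must lie strictly between 0 and 1")
    odds_control = (1 - baseline_dna) / baseline_dna   # attendance odds, control
    odds_treated = odds_ratio * odds_control           # attendance odds, treated
    dna_treated = 1 / (1 + odds_treated)               # treated DNA probability
    return 1 - dna_treated / baseline_dna


# Illustrative inputs only: odds ratio 1.5 at 5%, 10% and 20% baseline DNA.
for p0 in (0.05, 0.10, 0.20):
    print(p0, round(relative_dna_reduction(1.5, p0), 3))
```

The implied reduction shrinks as the baseline DNA rate rises, which is why the baseline has to be stated alongside any odds-ratio-derived figure.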
9.2 $\rho_{\text{admin}}$
Workflow-automation surveys in adjacent service categories report time-savings in the 30 – 40% range on routine confirmation and data-entry tasks. No dental-sector RCT was identified at the time of writing; the figure $\rho_{\text{admin}} = 0.35$ is held conservative as a midpoint of the range observed in operationally similar settings.
9.3 $\rho_{\text{lead}}$
Theoretical maximum lead-recovery is the entire drop-off, since deployment removes the > 4-hour response threshold identified by Oldroyd et al. [1]. Realised recovery is materially lower due to:
- Channel-switching friction. A patient who has already booked elsewhere does not re-book.
- Patient-side hours-of-availability gaps. Some after-hours enquirers are themselves unavailable for booking confirmation during business hours.
- Message-quality variance. Automated responses converge to typical human response quality only after iteration on copy and slot-presentation logic.
Applying an aggregate haircut to the theoretical maximum recovery at a typical $d$ gives $\rho_{\text{lead}} = 0.15$.
9.4 $\kappa$
The confidence haircut $\kappa$ applied to both gross loss and gross recovery in the audit-modal model absorbs three sources of bias, each pulling the gross figures upward by approximately 5 – 10%:
- Inter-model double counting. An enquiry lost to slow response is sometimes the same patient who would have later DNA'd; the three-model decomposition double-counts at the margin.
- Self-report inflation. Visitor-supplied weekly admin hours and DNA counts skew high relative to PMS-reconstructed audit data, by an order of 10 – 15% in adjacent categories.
- Heterogeneity in case mix. Practices with above-typical implant or smile-makeover share have larger per-event loss but lower frequency; the aggregate model does not capture this dispersion.
Combined haircut $\kappa = 1 - 0.20 = 0.80$. The 0.20 absorbs all three bias sources without separating them; this is a deliberate simplification given that the three are cross-correlated and not separately identifiable from the calculator's input set.
9.5 $\rho$ (consistency with modal)
The single recovery factor $\rho$ used by the homepage simple calculator is set equal to the modal's blended recovery-to-loss ratio evaluated at typical practice inputs. At the typical operating point used as the modal's reference case (~£10,000/month gross loss across all three categories), the ratio emerges from the per-category recovery factors weighted by their typical share of total leak. This ensures the simple and detailed calculators do not return numerically contradictory implications at typical operating points, even though the simple calculator does not surface the per-category decomposition.
10. Validation
10.1 Out-of-sample directional consistency
The composite-practice case studies published in the blog (Manchester after-hours leak, Bristol 0.2-star rating gap, Birmingham midweek no-show) are treated as informal out-of-sample validation cases. Each case publishes inputs (enquiry volume, no-show rate, fee tier) that, when plugged into the calculator, should produce a loss figure consistent with the case's published net leak.
| Case | Calc output | Case-published | Δ |
|---|---|---|---|
| Manchester (after-hours) | £3,948/mo | £3,800/mo | +3.9% |
| Bristol (rating gap) | £23,700/mo | £23,600/mo | +0.4% |
| Birmingham (no-shows) | £8,450/mo | £8,200/mo | +3.0% |
Within-case directional consistency is satisfied (calculator output within 5% of published case figure). This is a weak validation: the case studies are themselves composite illustrations rather than primary practice audits.
10.2 Known limitations of validation set
The blog cases are anonymised composites and not externally auditable. The consistency reported in §10.1 is therefore internal consistency between two illustrative artefacts, not external validity. Independent validation against PMS-extracted audit data from a sample of UK private dental practices is out of scope at v1.0 and flagged as a future revision (see §16).
11. Perturbation under mis-specification
Loss output sensitivity under perturbation of each input parameter, evaluated at the default operating point with baseline $L$ = £2,115:
| Parameter | −20% | Baseline | +20% | Range |
|---|---|---|---|---|
| E | £1,692 | £2,115 | £2,538 | £846 |
| d | £1,692 | £2,115 | £2,538 | £846 |
| V | £1,692 | £2,115 | £2,538 | £846 |
Under simultaneous perturbation of all three inputs (worst case in the mis-specification direction), the loss figure rises to $2{,}115 \times 1.2^3 \approx$ £3,655, an inflation factor of $1.2^3 = 1.728$. This bounds the upper-tail mis-specification risk at approximately +73% under joint 20% over-statement of all inputs.
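The perturbation table and the joint worst case follow from the multiplicative loss form; a short sketch under the stated defaults is given below. Names are illustrative.

```python
BASE = {"E": 45, "d": 0.47, "V": 100}


def loss(E: float, d: float, V: float) -> float:
    """Multiplicative closed-form loss."""
    return E * d * V


def perturb(name: str, factor: float) -> float:
    """Loss with a single input scaled by `factor`, others at baseline."""
    inputs = dict(BASE, **{name: BASE[name] * factor})
    return loss(**inputs)


rows = {name: (perturb(name, 0.8), perturb(name, 1.2)) for name in BASE}
joint_high = loss(**{name: value * 1.2 for name, value in BASE.items()})
inflation = joint_high / loss(**BASE)

for name, (low, high) in rows.items():
    print(f"{name}: £{low:,.0f} to £{high:,.0f}")
print(f"joint +20%: £{joint_high:,.0f} (x{inflation:.3f})")
```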
12. Net present value treatment
The displayed payback metric uses simple-payback discounting (11), which ignores the time value of money. We document the NPV-adjusted alternative for completeness and explain why it is not surfaced.
12.1 Discount rate
A monthly discount rate of $r = 0.0083$ (≈10% annualised) is used as a generic small-business cost of capital. Net present value at horizon $H$ months under deployment is:

$$\mathrm{NPV}(H) = -C_s + \sum_{t=1}^{H} \frac{G}{(1 + r)^t}$$
12.2 NPV-adjusted payback
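The NPV-adjusted payback $t_p^{\mathrm{NPV}}$ is the smallest $H$ such that $\mathrm{NPV}(H) \ge 0$. Solving for the geometric series and rearranging:

$$t_p^{\mathrm{NPV}} = \frac{-\ln\bigl(1 - r C_s / G\bigr)}{\ln(1 + r)}$$

At the default operating point ($G = 0.80 \times 2{,}115 - 500 = 1{,}192$ £/mo) and $r = 0.0083$, this yields $t_p^{\mathrm{NPV}} \approx 2.12$ versus simple $t_p \approx 2.10$: the NPV adjustment is immaterial relative to the rounding precision of the displayed value. The gap grows with both payback length and discount rate: at a 12-month simple payback it is about 6% at 10% annual and about 9% at 15% annual.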
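The comparison between simple and NPV-adjusted payback can be checked numerically. The sketch below assumes a level monthly net gain discounted at a constant monthly rate, and uses the defaults from the notation table (setup £2,500, retainer £500, recovery factor 0.80, loss £2,115). Function names are illustrative.

```python
import math


def simple_payback(setup: float, gain: float) -> float:
    """Undiscounted payback in months."""
    return setup / gain


def npv(horizon: float, setup: float, gain: float, r: float) -> float:
    """NPV of a level monthly gain over `horizon` months, net of the setup fee."""
    return gain * (1 - (1 + r) ** (-horizon)) / r - setup


def npv_payback(setup: float, gain: float, r: float) -> float:
    """Horizon at which NPV crosses zero; infinite if it never does."""
    if gain <= 0 or r * setup / gain >= 1:
        return math.inf
    return -math.log(1 - r * setup / gain) / math.log(1 + r)


gain = 0.80 * 2115 - 500          # net monthly gain at the defaults
t_simple = simple_payback(2500, gain)
t_npv = npv_payback(2500, gain, 0.0083)
print(f"simple {t_simple:.2f} mo, NPV-adjusted {t_npv:.2f} mo")
```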
12.3 Why simple payback is preferred for display
Simple payback is preferred for the on-screen metric because (a) the gap with NPV treatment is small at the displayed range; (b) simple payback is more legible to non-quantitative readers; (c) introducing a discount rate parameter would require either a slider or a hidden assumption that is harder to justify than the simple calculation itself. The NPV alternative is documented for the reader who prefers present-value discipline.
13. Limitations
- All figures are illustrative extrapolations from public research applied to standard UK private dental practice economics. They do not constitute a primary audit of any specific practice.
- Recovery factors are typical effect sizes drawn from public sources, not contractual guarantees.
- The simple calculator's default assumes a UK private dental practice with the typical ~55% after-hours enquiry share. Practices with a heavily phone-call-led intake or 24/7 outsourced front-desk should override the slider downward.
- The Bernoulli variance estimate in §6 ignores parameter mis-specification (§11), which dominates total uncertainty in practice. Reported confidence intervals therefore underestimate true uncertainty.
- The Monte Carlo notes in §7.2 are documented but not deployed; the calculator surfaces only deterministic point estimates.
- The validation set in §10 is internal-consistency only, not externally validated against PMS-extracted audit data.
- The 12-month payback ceiling on display is a UX choice; it does not imply that longer paybacks are non-viable.
- NPV treatment in §12 uses a single generic discount rate; practice-specific cost of capital may diverge.
14. Glossary
- Closed-form approximation. A deterministic algebraic expression that approximates the expectation of a stochastic process under specified simplifying assumptions.
- Confidence haircut. A multiplicative factor < 1 applied to a gross figure to absorb known sources of upward bias.
- Conversion hazard. The instantaneous probability that an enquiry converts to a booked first visit, conditional on a given response delay.
- DNA / Did Not Attend. A scheduled appointment where the patient fails to attend without prior cancellation.
- Drop-off. The fraction of arriving enquiries that fail to convert to a booked first visit under current operational conditions.
- Elasticity. The percentage change in an output per percentage change in an input, evaluated at a specified operating point.
- Net present value (NPV). The sum of discounted future cash flows; each future cash flow is divided by $(1 + r)^t$ for $t$ periods at discount rate $r$.
- Non-homogeneous Poisson process. A counting process where event arrivals follow a Poisson distribution with a time-varying intensity function.
- Odds ratio (OR). The ratio of the odds of an outcome under treatment to the odds of the outcome under control.
- Payback period (simple). The duration over which cumulative undiscounted net cash flow equals the initial outlay.
- Recovery factor. The fraction of identified loss that is realistically recoverable under deployment of the system.
- Survival function. The probability that a random variable exceeds a given value: $S(x) = \Pr(X > x) = 1 - F(x)$.
15. References
[1] Oldroyd, J., McElheran, K., & Elkington, D. (2011). The Short Life of Online Sales Leads. Harvard Business Review, March 2011. hbr.org/2011/03/the-short-life-of-online-sales-leads. Foundational study of lead-response time vs. conversion across 1.25 million B2C and B2B enquiries. Used as the source for (3) and the derivation of $d$ in §4.3.
[2] British Dental Association. Standard UK private dental fee tiers. Sector commentary cited as range reference. Used as anchor for the $V$ default in §4.4.
[3] Guy, R., Hocking, J., Wand, H., Stott, S., Ali, H., & Kaldor, J. (2012). How Effective Are Short Message Service Reminders at Increasing Clinic Attendance? A Meta-Analysis and Systematic Review. Health Services Research, 47(2), 614–632. pmc.ncbi.nlm.nih.gov/articles/PMC3419880. Foundational meta-analysis used to derive $\rho_{\text{DNA}}$ in §9.1.
[4] NHS Digital. NHS Dental Statistics for England. digital.nhs.uk/data-and-information/publications/statistical/nhs-dental-statistics. Directional sanity-check reference for sector-level appointment volumes.
[5] Ofcom. Communications Market Report. Annual UK reference for voice / SMS / OTT messaging behaviour. ofcom.org.uk/research-statistics-and-data/cmr. Supporting reference for response-channel claims in §2.2 and adjacent posts.
[6] BrightLocal. Local Consumer Review Survey. Annual UK consumer survey on review-reading and trust behaviour. brightlocal.com/research/local-consumer-review-survey. Cited in adjacent posts; not directly used in the calculator parameter derivation.
[7] Luca, M. (2011, revised 2016). Reviews, Reputation, and Revenue: The Case of Yelp.com. Harvard Business School working paper 12-016. hbs.edu/ris/Publication Files/12-016. Cited in adjacent posts; not directly used in the calculator parameter derivation.
16. Versioning
v1.0 · 8 May 2026 · initial public release.
Subsequent revisions to parameter values, equations, or sources will be logged below with date and rationale. Figures rendered on the live calculators reflect the most recent revision at all times. Planned revisions for v1.1 include independent PMS-validated parameter estimates (§10.2 limitation) and a sensitivity-analysis extension covering the modal calculator's five-input space. No commitment is made on the v1.1 release timeline.
Technical appendices
The main matter (§1 – §16) presents the calculator as a deterministic point estimator anchored to public sources. The appendices below position the same artefact within standard Bayesian, causal, and information-theoretic frameworks for the reader who prefers to evaluate the estimator under a formal modelling lens. Appendices A – F are self-contained and may be read independently of the main matter.
Appendix A. Bayesian parameter inference
The recovery factors $\rho_k$ are surfaced as point estimates in the main matter (§9.1 – §9.3). This appendix recasts those values as MAP estimators of a Beta-Binomial conjugate model, surfacing 95% credible intervals as a by-product.
A.1 Generative model
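Each per-category recovery factor $\rho_k$ is treated as a parameter governing a Bernoulli adherence outcome. Place a Beta prior on the parameter:

$$\rho_k \sim \operatorname{Beta}(\alpha_k, \beta_k)$$

The prior hyperparameters $(\alpha_k, \beta_k)$ are specified as weakly informative, encoding (a) the literature-anchored mean $\alpha_k / (\alpha_k + \beta_k)$ for the category and (b) an effective sample size $\alpha_k + \beta_k$ reflecting the limited number of independent sources rather than uniform ignorance.
A.2 Likelihood
Adherence-outcome data from the meta-analytic source for category $k$ enters as a Binomial likelihood, with $y_k$ successes in $n_k$ trials:

$$y_k \mid \rho_k \sim \operatorname{Binomial}(n_k, \rho_k)$$

A.3 Conjugate posterior
By Beta-Binomial conjugacy, the posterior is closed-form Beta with updated hyperparameters:

$$\rho_k \mid y_k \sim \operatorname{Beta}(\alpha_k + y_k,\ \beta_k + n_k - y_k)$$

The MAP estimate is available analytically (for posterior hyperparameters greater than 1):

$$\hat{\rho}_k^{\mathrm{MAP}} = \frac{\alpha_k + y_k - 1}{\alpha_k + \beta_k + n_k - 2}$$

The 95% equal-tail credible interval is obtained by inverting the posterior CDF $F_k$ at quantiles $0.025$ and $0.975$:

$$\bigl[F_k^{-1}(0.025),\ F_k^{-1}(0.975)\bigr]$$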
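A minimal sketch of the conjugate update follows, using only the standard library. The prior hyperparameters and the counts in the example are hypothetical and are not the literature-anchored values used for the published summaries. The credible interval is approximated by sorting posterior draws from `random.betavariate` rather than by inverting the CDF exactly.

```python
import random


def beta_posterior(prior_a, prior_b, successes, trials):
    """Posterior Beta hyperparameters after a Binomial observation."""
    return prior_a + successes, prior_b + trials - successes


def map_estimate(a, b):
    """Posterior mode of Beta(a, b); defined only for a, b > 1."""
    if a <= 1 or b <= 1:
        raise ValueError("mode formula requires a > 1 and b > 1")
    return (a - 1) / (a + b - 2)


def credible_interval(a, b, level=0.95, draws=50_000, seed=0):
    """Equal-tail interval approximated from sorted posterior draws."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1 - level) / 2
    return samples[int(tail * draws)], samples[int((1 - tail) * draws) - 1]


# Hypothetical example: prior mean 0.25 with effective sample size 10,
# then 23 successes observed in 100 trials.
a_post, b_post = beta_posterior(2.5, 7.5, successes=23, trials=100)
rho_map = map_estimate(a_post, b_post)
ci_low, ci_high = credible_interval(a_post, b_post)
print(f"MAP {rho_map:.3f}, 95% CrI [{ci_low:.3f}, {ci_high:.3f}]")
```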
A.4 Posterior summaries (per category)
Substituting the literature-anchored counts (Guy et al. [3] for $\rho_{\text{DNA}}$, adjacent-category surveys for $\rho_{\text{admin}}$, Oldroyd-derived bounds for $\rho_{\text{lead}}$), the posterior families and MAP/credible interval summaries are:
Note that the MAP point estimates differ slightly from the rounded values surfaced in §9 (0.231 vs. 0.25, etc.). The main matter rounds to two significant figures for legibility, which is within the credible interval in every case. The credible intervals also illustrate that $\rho_{\text{lead}}$ is the most uncertain of the three, consistent with the qualitative discussion in §9.3.
A.5 Why point estimates are surfaced
Surfacing the full posterior on the calculator interface would (a) require visitor-facing communication of credible-interval semantics, which is not the calculator's job; (b) widen the displayed payback range to a degree that conflicts with the simple-payback display rule in §7.3. The MAP estimator is a defensible single-number summary under a quadratic loss approximation around the mode.
Appendix B. Posterior predictive distribution for L̃
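The point estimate $L$ in §4.1 collapses parameter uncertainty into a single number. The posterior predictive distribution recovers the full uncertainty by integrating over the posterior on parameters $\theta = (E, d, V)$ given data $\mathcal{D}$:

$$p(\tilde{L} \mid \mathcal{D}) = \int p(\tilde{L} \mid \theta)\, p(\theta \mid \mathcal{D})\,\mathrm{d}\theta$$

Under the assumption of independent posteriors over $(E, d, V)$ (a simplification; in practice $d$ and $V$ may be weakly correlated through patient-mix effects), the posterior expectation factorises:

$$\mathbb{E}[\tilde{L} \mid \mathcal{D}] = \mathbb{E}[E \mid \mathcal{D}]\;\mathbb{E}[d \mid \mathcal{D}]\;\mathbb{E}[V \mid \mathcal{D}]$$

By the law of total variance, the predictive variance decomposes into aleatoric (sampling) and epistemic (parameter) components:

$$\operatorname{Var}[\tilde{L} \mid \mathcal{D}] = \underbrace{\mathbb{E}_{\theta}\bigl[\operatorname{Var}[\tilde{L} \mid \theta]\bigr]}_{\text{aleatoric}} + \underbrace{\operatorname{Var}_{\theta}\bigl[\mathbb{E}[\tilde{L} \mid \theta]\bigr]}_{\text{epistemic}}$$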
For practitioners familiar with the Bayesian deep-learning literature, this is the same decomposition that motivates predictive-uncertainty quantification in regression problems with parameter uncertainty. Aleatoric variance is what the Bernoulli treatment in §6 captures; epistemic variance is what dominates total uncertainty under parameter mis-specification (§11) and is what credible intervals in Appendix A.4 quantify.
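The decomposition can be illustrated in the simplest case where only the drop-off is uncertain and enquiry volume and fee are held fixed. The sketch below assumes a Beta posterior on the drop-off with hypothetical hyperparameters (47, 53) and checks the two components against the closed-form Beta-Binomial variance.

```python
def beta_moments(a: float, b: float):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


def variance_decomposition(E: int, V: float, a: float, b: float):
    """Aleatoric and epistemic variance of V * Binomial(E, d), d ~ Beta(a, b)."""
    mean, var = beta_moments(a, b)
    expected_bernoulli_var = mean - (var + mean ** 2)   # E[d(1 - d)]
    aleatoric = V ** 2 * E * expected_bernoulli_var     # E[Var(L | d)]
    epistemic = V ** 2 * E ** 2 * var                   # Var(E[L | d])
    return aleatoric, epistemic


# Hypothetical posterior on the drop-off: Beta(47, 53), mean 0.47.
aleatoric, epistemic = variance_decomposition(E=45, V=100.0, a=47, b=53)
total = aleatoric + epistemic
print(f"aleatoric {aleatoric:,.0f}, epistemic {epistemic:,.0f}, total {total:,.0f}")
```

With these illustrative hyperparameters the epistemic share is already close to a third of the total, and it grows as the posterior on the drop-off widens.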
Appendix C. Causal identification of the recovery factor
The recovery factor is interpreted in the main matter as "the share of identified loss that is realistically recoverable under deployment". Operationally this is a counterfactual quantity: the difference between the loss observed under current operations and the loss that would have been observed under a hypothetical intervention that compresses response time.
C.1 Potential outcomes
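Following the Neyman-Rubin potential-outcomes framework, define for each enquiry $i$ the conversion outcome under a counterfactual response time $\tau$:

$$Y_i(\tau) = Y_i \mid do(\tau_i = \tau)$$

where $do(\cdot)$ denotes Pearl's intervention operator setting the practice's response time to $\tau$ for enquiry $i$.
C.2 Average treatment effect
The average treatment effect of the deployment intervention (response time $\tau_{\text{dep}}$, relative to the observed status quo $\tau_{\text{obs}}$) on conversion is:

$$\mathrm{ATE} = \mathbb{E}\bigl[Y_i(\tau_{\text{dep}})\bigr] - \mathbb{E}\bigl[Y_i(\tau_{\text{obs}})\bigr]$$

The recovery factor $\rho$ in the main matter is the normalisation of this ATE by the observational drop-off $d$:

$$\rho = \frac{\mathrm{ATE}}{d}$$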
C.3 Identification assumptions
The main-matter point estimate of is identified under the standard assumptions of:
- SUTVA (Stable Unit Treatment Value Assumption). No spillover between enquiries: the response treatment given to enquiry $i$ does not affect the conversion outcome of enquiry $j \neq i$.
- Positivity / overlap. Every enquiry has nonzero probability of receiving fast response under deployment, i.e. $\Pr(\tau_i \le \tau_{\text{dep}} \mid X_i = x) > 0$ for all enquiry features $x$ in the relevant range.
- Consistency. The observed conversion outcome under treatment $\tau$ equals the potential outcome $Y_i(\tau)$.
- No unmeasured confounding. Conditional on observed enquiry features, response time is independent of potential outcomes. This is the strongest of the four; channel mix or treatment-class signals not observed by the calculator may violate it.
The literature-anchored point estimates rely on assumption (4) holding approximately. Sensitivity to violation is flagged as an open research question (Appendix F).
Appendix D. Information-theoretic comparison of estimators
The site deploys two estimators of $L$: the simple-calculator estimator $\hat{L}_s$ based on the multiplicative form (§4) and the modal estimator $\hat{L}_m$ based on the three-model decomposition (§8). The information loss in collapsing from $\hat{L}_m$ to $\hat{L}_s$ is quantified by the Kullback-Leibler divergence between their distributions $p_m$ and $p_s$:

$$D_{\mathrm{KL}}(p_m \,\|\, p_s) = \int p_m(\ell)\,\ln\frac{p_m(\ell)}{p_s(\ell)}\,\mathrm{d}\ell$$

Under typical operating points (~£10,000/month modal aggregate loss across the three categories), numerical evaluation against simulated draws yields a $D_{\mathrm{KL}}$ value, in nats, that is small in absolute terms but non-negligible relative to typical KL divergences observed between well-specified posteriors and their summaries. The interpretation is that the simple calculator throws away approximately 18% of the information contained in the modal estimator at the typical operating point.
D.1 Fisher information of the drop-off estimator
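The Fisher information of the drop-off parameter $d$ under the Bernoulli likelihood implied by §6 is, per observation:

$$I(d) = \frac{1}{d\,(1 - d)}$$

D.2 Cramér-Rao lower bound
By the Cramér-Rao inequality, any unbiased estimator $\hat{d}$ based on $n$ Bernoulli observations has variance lower-bounded by:

$$\operatorname{Var}[\hat{d}] \ge \frac{1}{n\,I(d)} = \frac{d\,(1 - d)}{n}$$

At the default operating point $(n, d) = (45, 0.47)$, the lower bound evaluates to $\operatorname{Var}[\hat{d}] \ge 0.0055$, equivalently $\operatorname{SD}[\hat{d}] \ge 0.074$. Slider sample sizes therefore admit residual uncertainty in $d$ of order 7 percentage points even under optimal estimator efficiency.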
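The bound is a one-line computation; the sketch below evaluates it at the slider defaults, assuming one month of enquiries as the sample. The function name is illustrative.

```python
import math


def cramer_rao_sd(d: float, n: int) -> float:
    """Lower bound on the standard deviation of an unbiased estimator of d."""
    return math.sqrt(d * (1 - d) / n)


sd_bound = cramer_rao_sd(d=0.47, n=45)
print(f"SD lower bound: {sd_bound:.4f}")
```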
Appendix E. Concentration inequalities and finite-sample bounds
The Bernoulli treatment in §6 yielded an asymptotic 95% confidence interval. For finite practice volumes ($E = 45$ per month at the slider default), distribution-free concentration bounds give a more honest view.
E.1 Hoeffding's inequality
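Applied to the empirical drop-off estimator $\hat{d} = N / E$:

$$\Pr\bigl(|\hat{d} - d| \ge \epsilon\bigr) \le 2\exp\bigl(-2E\epsilon^2\bigr)$$

For the default $E = 45$ and tolerance $\epsilon = 0.10$, the bound evaluates to $2e^{-0.9} \approx 0.81$. The empirical drop-off may therefore deviate from the true value by 10 percentage points or more with non-trivial probability at typical practice volume.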
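The two-sided Hoeffding bound for a proportion estimated from bounded observations is easy to evaluate directly. The sketch below uses the default monthly volume and a 10-percentage-point tolerance; the function name is illustrative.

```python
import math


def hoeffding_two_sided(n: int, eps: float) -> float:
    """Upper bound on P(|estimate - truth| >= eps) for n observations in [0, 1]."""
    return min(1.0, 2 * math.exp(-2 * n * eps ** 2))


bound = hoeffding_two_sided(n=45, eps=0.10)
print(f"Hoeffding bound: {bound:.3f}")
```

A bound this close to 1 says little on its own; it mainly signals that one month of enquiries is too small a sample to pin the drop-off down to within 10 points without further assumptions.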
E.2 Multiplicative Chernoff bound
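For one-sided deviations of the loss count $N$ above its expectation $\mathbb{E}[N] = E d$, the multiplicative Chernoff bound applies for any $\delta > 0$:

$$\Pr\bigl(N \ge (1 + \delta)\,E d\bigr) \le \exp\!\left(-\frac{\delta^2 E d}{2 + \delta}\right)$$

Comparing exponents, this is sharper than Hoeffding for the upper tail only when $d$ is small (roughly $d < 1 / (2(2 + \delta))$); at the default $d = 0.47$ Hoeffding's exponent $2E\delta^2 d^2$ is the larger of the two. The multiplicative form remains convenient when bounding the probability of an unfavourable outcome stated as a ratio (e.g. observed loss exceeding the calculator's point estimate by a multiplicative factor $1 + \delta$).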
E.3 Empirical Bernstein inequality
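When the empirical variance $\hat{V}_n$ of $n$ observations in $[0, 1]$ is computable, Maurer and Pontil's empirical Bernstein inequality gives, with probability at least $1 - \delta$:

$$d - \hat{d} \le \sqrt{\frac{2\,\hat{V}_n \ln(2/\delta)}{n}} + \frac{7\ln(2/\delta)}{3\,(n - 1)}$$

This bound improves on Hoeffding's worst-case behaviour when the empirical variance is small relative to the range of the observations. At the default $d = 0.47$ the Bernoulli variance is close to its maximum of $0.25$, so at slider-typical sample sizes the second term dominates and the bound is no tighter than Hoeffding. It would become the preferred bound under any future deployment of confidence-interval surfacing for practices with $d$ far from $0.5$ and larger enquiry volumes.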
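To see how the two bounds compare at slider-typical volume, the sketch below computes the one-sided deviation radius at 95% confidence under Hoeffding and under the empirical Bernstein inequality of Maurer and Pontil. The sample (21 lost enquiries out of 45) is a hypothetical month near the default drop-off. In this example the empirical Bernstein radius comes out wider, because the sample variance is close to its maximum and the $1/(n-1)$ term is still large.

```python
import math


def hoeffding_radius(n: int, delta: float) -> float:
    """One-sided deviation radius at confidence 1 - delta, values in [0, 1]."""
    return math.sqrt(math.log(1 / delta) / (2 * n))


def empirical_bernstein_radius(sample, delta: float) -> float:
    """One-sided radius from the empirical Bernstein inequality, values in [0, 1]."""
    n = len(sample)
    mean = sum(sample) / n
    sample_var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    log_term = math.log(2 / delta)
    return math.sqrt(2 * sample_var * log_term / n) + 7 * log_term / (3 * (n - 1))


sample = [1] * 21 + [0] * 24   # hypothetical month: 21 of 45 enquiries lost
r_hoeffding = hoeffding_radius(len(sample), delta=0.05)
r_bernstein = empirical_bernstein_radius(sample, delta=0.05)
print(f"Hoeffding {r_hoeffding:.3f}, empirical Bernstein {r_bernstein:.3f}")
```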
Appendix F. Open research questions
The following items are flagged for v1.1 and beyond. None is committed to a release timeline.
- Hierarchical Bayesian model with practice-level random effects. Replace the population-level point estimates of $\rho_k$ with a multilevel model $\rho_{k,j}$ for each practice $j$, with shrinkage toward the population mean under sparse practice-specific data. Surfaces practice-specific posteriors when partial PMS data is available.
- Causal sensitivity analysis under unmeasured confounding. Apply Rosenbaum-style or VanderWeele's E-value framework to bound the impact of plausible unmeasured confounders (e.g. patient anxiety, treatment severity) on the recovery-factor identification (Appendix C.3, assumption 4).
- Bayesian model averaging across aggregation specifications. The main matter assumes additive aggregation over the three loss components (§8.4). A multiplicative or interaction-term specification is plausible at the margin. Posterior model probabilities under cross-validated likelihood would resolve which specification is empirically supported.
- Cox proportional-hazards model for the response-time → conversion relationship. The current sigmoidal hazard in (3) is a parametric simplification. Under sufficient data, a semi-parametric Cox model with time-varying baseline hazard would relax the parametric form.
- Fitted Poisson process for arrival intensity. The intra-day shape is currently absorbed into the aggregate $E$. A fitted intensity function $\hat{\lambda}(t)$ with day-of-week and hour-of-day components would enable per-time-window loss decomposition.
- Reinforcement-learning treatment of deployment policy. Treat the deployed automation rules as a policy maximising expected long-run revenue under a fitted Markov decision process over the patient lifecycle states (enquiry → first visit → retention → review). The recovery factors would then emerge as policy-induced state-transition probabilities rather than direct estimates.