This chapter provides operational instructions for analysts executing the Hospital-TTD-Mod pipeline. The codebase is designed to be highly modular and controlled via central configuration files, ensuring that analysts do not need to alter the core epidemiological or economic scripts to evaluate different service configurations.
9.1 Environment configuration and data security
Before executing the model, analysts must configure the master environment file (00_config.R). This file controls the geographic scope of the simulation and, critically, enforces data security protocols.
To prevent the accidental processing or compilation of restricted Hospital Episode Statistics (HES) data in unauthorised environments (e.g., local laptops or continuous integration servers), the model utilises a strict data routing toggle:
Code
# Extract from 00_config.R

# SECURITY TOGGLE:
# Set to FALSE ONLY when executing on the secure heta_study VM
USE_DUMMY_DATA <- TRUE

# GEOGRAPHIC SCOPE:
# Define the target geography (e.g., "england" or "south_yorkshire")
TARGET_GEOGRAPHY <- "south_yorkshire"
When USE_DUMMY_DATA is set to TRUE, the pipeline routes all inputs through the synthetic data generation modules (scripts/02_data_prep_dummy/). Analysts must explicitly set this to FALSE when operating on the secure heta_study virtual machine to evaluate real patient data.
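The routing described above can be pictured with a minimal sketch. It is not the pipeline's actual code: the helper `resolve_data_prep_dir()`, the `on_secure_vm` flag, and the placeholder secure path are all hypothetical. Only `USE_DUMMY_DATA` and `scripts/02_data_prep_dummy/` come from this chapter.

```r
# Hypothetical helper: pick the data-prep directory from the security toggle.
# The secure path and the on_secure_vm flag are illustrative assumptions.
resolve_data_prep_dir <- function(use_dummy_data,
                                  on_secure_vm = FALSE,
                                  secure_dir = "path/to/secure_data_prep/") {
  stopifnot(is.logical(use_dummy_data),
            length(use_dummy_data) == 1L,
            !is.na(use_dummy_data))
  if (use_dummy_data) {
    # Synthetic-data route: safe on laptops and CI servers.
    return("scripts/02_data_prep_dummy/")
  }
  if (!on_secure_vm) {
    # Fail fast rather than risk touching restricted data elsewhere.
    stop("USE_DUMMY_DATA is FALSE but this is not flagged as the secure VM.")
  }
  secure_dir
}

USE_DUMMY_DATA <- TRUE
data_prep_dir <- resolve_data_prep_dir(USE_DUMMY_DATA)
```

The design point is that the unsafe combination (real data requested outside the secure environment) stops the run instead of falling through silently.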
9.2 Scenario definition
The engine evaluates a basecase scenario against alternative service improvement configurations. Rather than hard-coding these parameters, analysts define scenarios using the external user_inputs/scenarios_control_panel.csv file.
This control panel allows analysts to adjust parameters such as:
Clinical pathways: Screening rates, specialist assessment capacities, and prescribing proportions.
Economic inputs: Unit costs for different forms of nicotine replacement therapy (NRT), and the proportion of readmission savings deemed cashable.
The scripts/00_build_scenarios.R script ingests this CSV and translates it into a list of configuration objects, which are then passed sequentially through the simulation engine.
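As a rough illustration of "a list of configuration objects", the sketch below turns a toy control panel into one named list per scenario and passes each through a stand-in engine. The parameter names, values, the `Alt_Service` scenario, and `run_engine()` are invented for the example; the real script's internals are not shown in this chapter.

```r
# Toy stand-in for user_inputs/scenarios_control_panel.csv.
# Parameter names and values are invented for illustration.
panel_csv <- "Parameter_Name,Basecase,Alt_Service
screening_rate,0.60,0.80
nrt_unit_cost,25,25
cashable_share,0.30,0.30"

panel <- read.csv(text = panel_csv, stringsAsFactors = FALSE)
scenario_cols <- setdiff(names(panel), "Parameter_Name")

# One configuration object (a named list of parameter values) per scenario.
configs <- lapply(scenario_cols, function(s) {
  as.list(setNames(panel[[s]], panel$Parameter_Name))
})
names(configs) <- scenario_cols

# Placeholder engine: returns one parameter so the loop has something to show.
run_engine <- function(cfg) cfg$screening_rate

# Pass each configuration through the engine in turn.
results <- vapply(configs, run_engine, numeric(1))
```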
9.3 Defining rollout and coverage scenarios (the macro-trends sheet)
The file that answers “what happens to costs and benefits if the service scales up faster, slower, or contracts?” is user_inputs/macro_trends_panel.csv. This section explains what each column means, how the engine uses it, and how to build alternative rollout scenarios. It is written so that analysts and service colleagues can agree, in plain English, what is being modelled before any numbers change.
9.3.1 The two coverage levers
Service coverage is described by two independent dimensions, deliberately separated because they answer different questions and carry different costs:
Trust_Coverage_Pct – the extensive margin: the fraction of in-scope acute trusts that have a live tobacco-dependence service in a given year. This is “how many front doors are open.”
Within_Trust_Coverage_Pct – the intensive margin: among trusts that have a service, the average fraction of eligible inpatients the service actually reaches. This is “how deep the service goes behind each open door.”
A worked intuition: if 80% of trusts have a service (Trust_Coverage_Pct = 0.80) and, within those trusts, the service reaches 50% of eligible inpatients (Within_Trust_Coverage_Pct = 0.50), the service is reaching 0.80 x 0.50 = 40% of the national eligible inpatient population.
9.3.2 How the engine uses them
That product is the quantity that drives the model. The engine derives it internally:
Code
# Derived inside the engine - do not supply this as an input column:
Rollout_Pct <- Trust_Coverage_Pct * Within_Trust_Coverage_Pct
Activity – and therefore health benefits and all activity-driven costs – scales with the product Trust_Coverage_Pct x Within_Trust_Coverage_Pct. Patients screened, assessed and treated, quits generated, and admissions prevented all move with this combined reach.
The fixed per-trust running cost (the band-8a senior-management overhead a trust incurs simply by operating a service, regardless of how many patients it reaches) scales with Trust_Coverage_Pct alone. Opening a service in more trusts adds fixed cost even if within-trust reach is still low.
This split matters for cost-effectiveness. A scenario that opens services in many trusts but keeps within-trust reach low carries a heavy fixed-cost burden for relatively little activity; a scenario that drives depth within a smaller set of trusts is more efficient per quit. Separating the two levers is what lets the model see that difference.
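The efficiency argument can be checked with a small numerical sketch. The trust count, fixed cost per trust, and quits at full reach are made-up figures, and `summarise_rollout()` is a hypothetical helper. Only the two scaling rules (activity with the product, fixed cost with trust coverage alone) come from the text.

```r
# Illustrative figures only: none of these values come from the model.
n_trusts_in_scope    <- 100     # hypothetical number of in-scope trusts
fixed_cost_per_trust <- 60000   # hypothetical annual overhead per live trust
quits_at_full_reach  <- 10000   # hypothetical quits if reach were 100%

summarise_rollout <- function(trust_cov, within_cov) {
  rollout    <- trust_cov * within_cov                    # drives activity
  fixed_cost <- n_trusts_in_scope * trust_cov * fixed_cost_per_trust
  quits      <- quits_at_full_reach * rollout
  data.frame(trust_cov, within_cov, rollout, fixed_cost, quits,
             fixed_cost_per_quit = fixed_cost / quits)
}

# Same effective reach (0.40), split two different ways.
wide_shallow <- summarise_rollout(trust_cov = 0.80, within_cov = 0.50)
narrow_deep  <- summarise_rollout(trust_cov = 0.50, within_cov = 0.80)
comparison   <- rbind(wide_shallow, narrow_deep)
```

Under these assumed figures both rows generate the same number of quits, but the wide-and-shallow split carries the higher fixed cost per quit. A single activity-only rollout number would hide that difference.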
9.3.3 History is fixed; the future is what you vary
The panel spans calendar years 2019–2045 (plus the post-service follow-up tail). The Phase column marks two eras:
A_historical (2019–2025) – the observed period. Here the product Trust_Coverage_Pct x Within_Trust_Coverage_Pct is calibrated so that modelled specialist assessments reproduce the actual national referral data. This period is therefore identical in every scenario: activity that has already happened is not a modelling choice. Changing the historical numbers cannot change historical benefits – it can only re-allocate historical fixed cost (see the subtlety below).
B_forward (2026 onward) – the projection. This is where scenarios genuinely diverge. From the current point forward you assert different coverage trajectories; their product changes activity (hence benefits), and Trust_Coverage_Pct changes the fixed-cost base.
This is the model’s intended use going forward: fix the past, vary the future, and read off the difference. A rollout scenario is simply a forward (2026+) trajectory for the two levers; comparing two scenarios isolates the effect of that forward choice on prevented admissions, costs, the short-term budget impact, and the ICER.
Note: The historical-split subtlety
Because only the product is pinned by the referral data, the division of that product into “how many trusts” versus “how deep within each” is an assumption, not an observation. You may legitimately say “trusts reached full coverage by 2024” or “trusts were capped at 95%.” Doing so changes Trust_Coverage_Pct in the historical rows, and Within_Trust_Coverage_Pct must move the opposite way to keep the product equal to observed activity. Benefits are unchanged; the only thing that moves is the band-8a fixed cost (more trusts live earlier means slightly more fixed cost booked earlier). So “100% reached in 2024 vs 95% capped” is a cost-timing question in the historical period – and a benefits-and-cost question only from 2026 onward.
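A short sketch of that constraint, using an invented calibrated product for a single historical year: whichever trust coverage is asserted, within-trust coverage is back-solved so that the product stays equal to the calibrated value.

```r
# Calibrated product for one historical year (illustrative, not model output).
calibrated_rollout <- 0.38

# Two alternative assertions about how many trusts were live that year.
trust_cov_options <- c(full = 1.00, capped = 0.95)

# Within-trust coverage moves the opposite way to preserve the product.
within_cov <- calibrated_rollout / trust_cov_options

# A split is only admissible if the implied depth is still a valid fraction.
stopifnot(all(within_cov >= 0 & within_cov <= 1))

# The product, and therefore modelled activity, is identical under both splits.
implied_rollout <- trust_cov_options * within_cov
```

Lower trust coverage implies higher within-trust coverage. If the implied depth exceeded 1, the asserted trust coverage would be too low to be consistent with observed activity.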
9.3.4 Building a forward scenario, step by step
Start from the calibrated history (2019–2025) and keep it identical across scenarios.
Choose, for 2026 onward, a trajectory for Trust_Coverage_Pct (how many trusts end up live, and when) and Within_Trust_Coverage_Pct (how deep, and when). Each is a decimal fraction in [0, 1].
Leave Rollout_Pct out – the engine derives it.
Set the remaining assumption columns (below), usually identical across rollout scenarios so that rollout is the only thing that differs.
Name the scenario in the Scenario column and add a matching column of the same name to scenarios_control_panel.csv and equity_multipliers_panel.csv (see section 9.3.8 on the three files and their layouts).
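The steps above can be sketched as a small constructor that builds the forward row-block and rejects invalid inputs. `build_forward_block()`, the scenario name `My_Rollout`, and the 2027 value are hypothetical. The `Year` and `Phase` rules follow the column descriptions in this chapter (model year 1 = 2019; `B_forward` from 2026). The remaining assumption columns are omitted for brevity.

```r
# Hypothetical helper: build the forward (2026+) rows for one scenario.
build_forward_block <- function(scenario, calendar_year, trust_cov, within_cov) {
  stopifnot(length(trust_cov) == length(calendar_year),
            length(within_cov) == length(calendar_year),
            all(calendar_year >= 2026),                 # history is not edited
            all(trust_cov >= 0 & trust_cov <= 1),       # decimal fractions only
            all(within_cov >= 0 & within_cov <= 1))
  data.frame(
    Scenario = scenario,
    Calendar_Year = calendar_year,
    Year = calendar_year - 2018L,                       # model year 1 = 2019
    Phase = "B_forward",
    Trust_Coverage_Pct = trust_cov,
    Within_Trust_Coverage_Pct = within_cov,
    stringsAsFactors = FALSE
  )                                                     # no Rollout_Pct column
}

block <- build_forward_block(
  scenario      = "My_Rollout",
  calendar_year = 2026:2028,
  trust_cov     = c(0.95, 0.95, 0.95),
  within_cov    = c(0.42, 0.49, 0.56)
)
```

Entering a percentage such as 95 instead of 0.95 fails the range check immediately rather than silently inflating activity.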
9.3.5 Worked examples
These three forward trajectories implement the options discussed with the service team and ship as an editable starting point in rollout_scenarios_template.csv. History (2019–2025) is the shared calibrated base in all three; only 2026 onward differs.
| Scenario | Lever | 2026 | 2028 | 2030 | 2035+ |
|---|---|---|---|---|---|
| Full ambition (the “100%”) | Trust coverage | 1.00 | 1.00 | 1.00 | 1.00 |
| | Within-trust | 0.55 | 0.95 | 0.95 | 0.95 |
| | Effective reach | 0.55 | 0.95 | 0.95 | 0.95 |
| Constrained (the “60%”) | Trust coverage | 0.95 | 0.95 | 0.95 | 0.95 |
| | Within-trust | 0.42 | 0.56 | 0.60 | 0.60 |
| | Effective reach | 0.40 | 0.53 | 0.57 | 0.57 |
| Contraction (downside) | Trust coverage | 0.92 | 0.84 | 0.77 | 0.75 |
| | Within-trust | 0.38 | 0.34 | 0.30 | 0.30 |
| | Effective reach | 0.35 | 0.29 | 0.23 | 0.23 |
Full ambition reflects “with full funding, 95% within-trust coverage would have been reached sooner”: the within-trust ceiling is brought forward to 2028 (year 10) rather than 2030 (year 12), on full (100%) trust coverage.
Constrained reflects “trusts only expanded to about 60% within, and a few never came on board”: within-trust caps at 0.60 and trust coverage holds at 0.95, never reaching 100%.
Contraction is a downside in which funding gaps decommission services over time (trust coverage drifts from 0.92 to 0.75) and within-trust reach erodes. It answers “what do we lose if the service shrinks from here?”
Edit the numbers to whatever the team agrees; the structure is the point.
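The table gives values only for selected years, so a full annual trajectory needs the gaps filled. The sketch below does this for the Constrained example using base R's `approx()`. Linear interpolation between the listed years is an assumption of this sketch, not something the template is known to do, and the `Scenario` label is illustrative.

```r
# Anchor values for the "Constrained" example, taken from the table above.
anchor_years  <- c(2026, 2028, 2030, 2035)
trust_anchor  <- c(0.95, 0.95, 0.95, 0.95)
within_anchor <- c(0.42, 0.56, 0.60, 0.60)

forward_years <- 2026:2045

# Linear interpolation between anchors is an assumption of this sketch;
# rule = 2 carries the last anchor value forward beyond 2035.
interp <- function(y) {
  approx(x = anchor_years, y = y, xout = forward_years, rule = 2)$y
}

constrained <- data.frame(
  Scenario = "Constrained",                 # illustrative label
  Calendar_Year = forward_years,
  Trust_Coverage_Pct = interp(trust_anchor),
  Within_Trust_Coverage_Pct = interp(within_anchor)
)

# Effective reach, computed only to check against the table; not an input.
effective_reach <- round(constrained$Trust_Coverage_Pct *
                           constrained$Within_Trust_Coverage_Pct, 2)
```

If the team prefers a stepped rather than a gradual ramp, the same call accepts `method = "constant"`.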
9.3.6 Modelling service contraction
Contraction is not a special mode – it is a forward trajectory in which one or both levers decline. To model decommissioning, reduce Trust_Coverage_Pct in the relevant years (each closed service removes its fixed cost and its share of activity). To model services pared back without closing, reduce Within_Trust_Coverage_Pct. Because the fixed per-trust cost follows Trust_Coverage_Pct, a decommissioning scenario correctly shows fixed costs falling as trusts close – something a single activity-only rollout number would miss.
9.3.7 The remaining columns
| Column | Meaning | Typical handling |
|---|---|---|
| Adms_Growth_Pct | annual growth in hospital admissions (0.015 = +1.5%/yr) | hold constant across rollout scenarios |
| Prev_Change_Pct | annual population smoking-prevalence change (-0.005 = -0.5%/yr) | hold constant across rollout scenarios |
| Relapse_Rate | annual relapse to smoking among those quit >12 months | hold constant across rollout scenarios |
| Year | model-year index (1 = 2019); the engine’s join key | do not edit – derived from Calendar_Year |
| Phase | A_historical (<=2025) or B_forward (>=2026) | label only |
Holding the three assumption columns identical across rollout scenarios ensures any difference in results is attributable to rollout alone – which is the comparison you want.
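One way to confirm this before a run is to count, for each year, how many distinct values each assumption column takes across scenarios. The sketch below uses a toy long-format panel with a deliberate mismatch. The scenario name `My_Rollout` and the `Relapse_Rate` values are invented.

```r
# Toy long-format panel; Relapse_Rate differs between scenarios in 2027.
panel <- data.frame(
  Scenario        = rep(c("Basecase", "My_Rollout"), each = 2),
  Calendar_Year   = rep(c(2026, 2027), times = 2),
  Adms_Growth_Pct = 0.015,
  Prev_Change_Pct = -0.005,
  Relapse_Rate    = c(0.02, 0.02, 0.02, 0.03),
  stringsAsFactors = FALSE
)
assumption_cols <- c("Adms_Growth_Pct", "Prev_Change_Pct", "Relapse_Rate")

# For each year and column, count distinct values across scenarios.
n_distinct <- aggregate(
  panel[assumption_cols],
  by  = list(Calendar_Year = panel$Calendar_Year),
  FUN = function(v) length(unique(v))
)

# Any count above 1 means scenarios disagree on that assumption in that year.
mismatch <- n_distinct[rowSums(n_distinct[assumption_cols] > 1) > 0, ]
```

An empty `mismatch` means rollout is the only thing that differs between the scenarios. Here it flags the 2027 row.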
9.3.8 Adding a scenario: the three files and their layouts
A scenario is defined across three input files, and – this is the single most common source of confusion – they do not share a layout:
macro_trends_panel.csv is scenario-as-rows (long). Each scenario is a block of one row per calendar year, tagged by the Scenario column. This is where the rollout (the two coverage levers) lives.
scenarios_control_panel.csv is scenario-as-columns (wide). Each scenario is a column; each row is a clinical, quit-success, or cost parameter. This is where screening rates, assessment probabilities, prescribing, unit costs, and n_trusts_modelled live.
equity_multipliers_panel.csv is also scenario-as-columns (wide). Each scenario is a column of demographic (age/IMD) multipliers applied on top of the control-panel values.
The set of scenarios the model runs is discovered automatically from the columns of scenarios_control_panel.csv: every column that is not Category, Parameter_Name, Arm, Definition, or Source is treated as a scenario. scripts/00_build_scenarios.R melts the wide panels into the long matrices the engine consumes, and main.R then loops over unique(Scenario).
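The discovery rule and the wide-to-long melt can be sketched in base R as follows. The toy CSV content and the `Value` column name are invented, and the real script may use a different reshaping approach. The five metadata column names are those listed above.

```r
# Toy wide control panel; rows and values are invented for illustration.
control_csv <- "Category,Parameter_Name,Arm,Definition,Source,Basecase,My_Rollout
Clinical,screening_rate,Intervention,Share screened,Assumed,0.60,0.60
Economic,nrt_unit_cost,Intervention,Cost per course,Assumed,25,25"
control <- read.csv(text = control_csv, stringsAsFactors = FALSE)

# Every column that is not metadata is treated as a scenario.
meta_cols <- c("Category", "Parameter_Name", "Arm", "Definition", "Source")
scenarios <- setdiff(names(control), meta_cols)

# Melt wide to long: one row per scenario-parameter pair.
long <- do.call(rbind, lapply(scenarios, function(s) {
  data.frame(Scenario = s,
             control[meta_cols],
             Value = control[[s]],
             stringsAsFactors = FALSE)
}))

# The run loop then iterates over the discovered scenario names.
scenarios_to_run <- unique(long$Scenario)
```

Because discovery is by exclusion, a stray extra column (for example a notes column) would be picked up as a scenario, so any additional metadata needs adding to the exclusion list.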
To add a new scenario – say a rollout scenario that differs from Basecase only in coverage – you therefore touch all three files:
macro_trends_panel.csv – add a row-block under the new Scenario name with the coverage trajectory you want (2026 onward; history shared).
scenarios_control_panel.csv – add a column with the new name. To vary only rollout, copy the Basecase column verbatim. (To explore rollout under an improved clinical pathway instead, copy a ..._BEST column.)
equity_multipliers_panel.csv – add a column with the new name, again copying Basecase. This step is easy to forget but matters: a scenario with no equity column is not given Basecase’s demographic gradient – its multipliers default to a neutral 1.0, which would make the scenario demographically flat and confound a rollout-only comparison. (This is why MVS, which has no equity column, is deliberately flat.)
Re-run scripts/00_build_scenarios.R (or the whole main.R) so the long matrices regenerate and the new scenario is picked up.
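Because a missing equity column fails silently (the multipliers fall back to 1.0), a pre-run check that the scenario name appears in all three inputs may be worthwhile. The sketch below is one possible form. `check_scenario_inputs()`, the toy data frames, and the `Group` column are hypothetical.

```r
# Hypothetical pre-run check: is the scenario present in all three inputs?
check_scenario_inputs <- function(scenario, macro, control, equity) {
  c(macro_trends  = scenario %in% macro$Scenario,   # long: a row-block
    control_panel = scenario %in% names(control),   # wide: a column
    equity_panel  = scenario %in% names(equity))    # wide: a column
}

# Toy inputs in which the equity column for My_Rollout has been forgotten.
macro   <- data.frame(Scenario = c("Basecase", "My_Rollout"),
                      Calendar_Year = c(2026, 2026))
control <- data.frame(Parameter_Name = "screening_rate",
                      Basecase = 0.60, My_Rollout = 0.60)
equity  <- data.frame(Group = "example_group", Basecase = 1.2)

status  <- check_scenario_inputs("My_Rollout", macro, control, equity)
missing <- names(status)[!status]
```

Here `missing` names the equity panel, which is the step that is easy to forget.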