Default Model Example
Preprocess the dataset, then build and run the default model with a custom infection-to-fatality delay.
[1]:
from epimodel.preprocessing.data_preprocessor import preprocess_data
from epimodel.pymc3_models.models import DefaultModel
from epimodel.pymc3_models.epi_params import EpidemiologicalParameters, bootstrapped_negbinom_values
import pymc3 as pm
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
Set Theano Environmental Variables for Parallelisation
Load Data
[2]:
data = preprocess_data('../notebooks/double-entry-data/double_entry_final.csv')
Dropping NPI Travel Screen/Quarantine
Dropping NPI Travel Bans
Dropping NPI Public Transport Limited
Dropping NPI Internal Movement Limited
Dropping NPI Public Information Campaigns
Dropping NPI Symptomatic Testing
Masking invalid values
Create a custom infection-to-fatality delay using the default incubation period
[3]:
example_symptom_to_fatality_delay = {
    'mean_mean': 18,
    'mean_sd': 1,
    'disp_mean': 10,
    'disp_sd': 3,
    'source': 'made up',
    'dist': 'negbinom',
    'notes': 'For example purposes only'
}
[4]:
ep = EpidemiologicalParameters()
[6]:
infection_to_fatality_delay = bootstrapped_negbinom_values(
    [ep.incubation_period, example_symptom_to_fatality_delay]
)
100%|██████████| 250/250 [05:54<00:00, 1.42s/it]
[7]:
ep.infection_to_fatality_delay = infection_to_fatality_delay[0]
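To give some intuition for what combining two delays means, the sketch below draws samples from two negative binomial distributions and adds them, which approximates the distribution of the total infection-to-fatality delay. This is not the implementation of `bootstrapped_negbinom_values`: it ignores the uncertainty expressed by `mean_sd` and `disp_sd`, the incubation period values are hypothetical placeholders, and the helper `sample_negbinom` is introduced here only for illustration. It also assumes that the dispersion follows the convention variance = mean + mean² / dispersion.

```python
import numpy as np


def sample_negbinom(rng, mean, disp, size):
    """Draw from a negative binomial given its mean and dispersion.

    Assumes variance = mean + mean**2 / disp (an assumption about the
    parameterisation, not taken from the epimodel source).
    """
    # NumPy parameterises by (n, p); n = disp and p = disp / (disp + mean)
    # give the requested mean and variance.
    p = disp / (disp + mean)
    return rng.negative_binomial(disp, p, size=size)


rng = np.random.default_rng(0)
n_draws = 200_000

# Hypothetical incubation period values, chosen only for illustration.
incubation = sample_negbinom(rng, mean=5.0, disp=5.0, size=n_draws)

# Point values taken from example_symptom_to_fatality_delay above.
symptom_to_fatality = sample_negbinom(rng, mean=18.0, disp=10.0, size=n_draws)

# The total delay is the sum of the two independent stages.
infection_to_fatality = incubation + symptom_to_fatality

combined_mean = infection_to_fatality.mean()
combined_var = infection_to_fatality.var()
print(f"mean: {combined_mean:.2f}, variance: {combined_var:.2f}")
```

Because the two stages are treated as independent, the mean and variance of the total delay are the sums of the stage means and variances. The sum of two negative binomials with different parameters is not itself exactly negative binomial, so a fitted distribution would be an approximation.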
Initialise model with epidemiological parameters
[8]:
with DefaultModel(data) as model:
    model.build_model(**ep.get_model_build_dict())
Run model
This notebook uses a small number of samples so that the documentation compiles quickly. With so few draws, convergence warnings like those in the output of the next cell are to be expected. For a serious run, use at least 1000 samples and 500 tuning steps.
[9]:
with model.model:
    model.trace = pm.sample(100, tune=100, cores=4, chains=4, max_treedepth=12, target_accept=0.95)
Only 100 samples in chain.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [DeathsDelayDisp, DeathsDelayMean, InitialSizeDeaths_log, CasesDelayDisp, CasesDelayMean, InitialSizeCases_log, PsiDeaths, PsiCases, GrowthDeathsNoise, GrowthCasesNoise, GI_sd, GI_mean, RegionLogR_noise, HyperRVar, CM_Alpha]
100.00% [800/800 13:21<00:00 Sampling 4 chains, 0 divergences]
Sampling 4 chains for 100 tune and 100 draw iterations (400 + 400 draws total) took 803 seconds.
The rhat statistic is larger than 1.4 for some parameters. The sampler did not converge.
The number of effective samples is smaller than 10% for some parameters.
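The rhat warning in the output above compares the variance between chains with the variance within chains; values well above 1 indicate that the chains have not mixed. As a rough illustration, the sketch below computes the classic (non-split) Gelman-Rubin statistic with NumPy on synthetic chains. It is not the diagnostic code used by PyMC3, which may use a refined variant, and the function name `gelman_rubin` and the synthetic data are assumptions made for this example.

```python
import numpy as np


def gelman_rubin(chains):
    """Classic Gelman-Rubin R-hat for an array of shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    n_draws = chains.shape[1]
    # W: average of the within-chain variances.
    within = chains.var(axis=1, ddof=1).mean()
    # B: between-chain variance, scaled by the number of draws.
    between = n_draws * chains.mean(axis=1).var(ddof=1)
    # Pooled estimate of the posterior variance.
    pooled = (n_draws - 1) / n_draws * within + between / n_draws
    return np.sqrt(pooled / within)


rng = np.random.default_rng(1)

# Four chains that all sample the same distribution: R-hat close to 1.
mixed = rng.normal(0.0, 1.0, size=(4, 100))

# Four chains stuck around different values: R-hat well above 1.
offsets = np.array([0.0, 0.0, 3.0, 3.0])[:, None]
stuck = rng.normal(0.0, 1.0, size=(4, 100)) + offsets

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
print(f"mixed chains: {rhat_mixed:.2f}, stuck chains: {rhat_stuck:.2f}")
```

With only 100 draws per chain, as in this notebook, chains that start in different regions have little time to overlap, which is one way the statistic can end up above the 1.4 threshold mentioned in the warning.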
Trace variables
[11]:
model.trace.varnames
[11]:
['CM_Alpha',
'HyperRVar_log__',
'RegionLogR_noise',
'GI_mean',
'GI_sd',
'GrowthCasesNoise',
'GrowthDeathsNoise',
'PsiCases_log__',
'PsiDeaths_log__',
'InitialSizeCases_log',
'CasesDelayMean',
'CasesDelayDisp',
'InitialSizeDeaths_log',
'DeathsDelayMean',
'DeathsDelayDisp',
'CMReduction',
'HyperRVar',
'RegionR',
'PsiCases',
'PsiDeaths',
'InfectedCases',
'ExpectedCases',
'InfectedDeaths',
'ExpectedDeaths']