Simulate the branching process of flipping until failure for K clusters

simulate_bp(
  K,
  inf_params,
  sample_covariates_df,
  covariate_names,
  covariate_weights = NULL,
  max_size = 50
)

Arguments

K

total number of clusters to simulate

inf_params

vector with beta coefficients to use in logistic function for probability of transmission

sample_covariates_df

Data frame of covariates to sample from

covariate_names

names of the covariates. The number of names must equal length(inf_params) - 1.

covariate_weights

default is NULL, which draws rows uniformly at random with replacement from sample_covariates_df. Otherwise, rows are drawn using the supplied weights.

max_size

maximum size a cluster can be
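
As a rough illustration of how inf_params, covariate_names, and covariate_weights might fit together, the sketch below draws covariate rows and converts them to transmission probabilities. It assumes the first element of inf_params is an intercept (which would be consistent with the length requirement on covariate_names) and relies only on base R's sample() and plogis(). The helper draw_covariates is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): draw n covariate rows,
# uniformly when weights is NULL, otherwise proportional to weights.
draw_covariates <- function(n, sample_covariates_df, weights = NULL) {
  idx <- sample(seq_len(nrow(sample_covariates_df)), size = n,
                replace = TRUE, prob = weights)
  sample_covariates_df[idx, , drop = FALSE]
}

set.seed(1)
inf_params <- c("beta_0" = -2, "beta_1" = 2)
covariate_names <- "x"
df <- data.frame(x = c(0, 1))

# The documented length requirement on covariate_names
stopifnot(length(covariate_names) == length(inf_params) - 1)

# Draw 5 rows, favouring the first row of df (x = 0)
drawn <- draw_covariates(5, df, weights = c(0.8, 0.2))

# Assumption: the first coefficient is an intercept, so prepend a column of 1s
X <- cbind(1, as.matrix(drawn[, covariate_names, drop = FALSE]))

# Inverse logit of the linear predictor gives one probability per drawn row
p <- plogis(as.vector(X %*% inf_params))
```

With these coefficients, rows with x = 0 map to plogis(-2) and rows with x = 1 map to plogis(0) = 0.5.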

Value

data frame with the following columns

cluster_id

unique cluster ID

person_id

order of infection in the cluster

gen

generation number (>=1; the root infector is generation 1)

inf_id

ID of the infector

n_inf

number of people infected by person

censored

whether the cluster end was censored or not

cluster_size

size of the cluster

covariates

covariates of the individuals
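
Because inf_id links each individual to its infector, n_inf can be cross-checked by counting how often each person_id appears as an inf_id. The sketch below does this on the first six rows of the example output shown on this page, retyped by hand. Treating "no recorded infections" as NA mirrors what that example output displays and is an observation, not a documented guarantee.

```r
# First six rows of the example output, retyped by hand
out <- data.frame(
  person_id = c("C1-G1-N1", "C1-G2-N1", "C2-G1-N1",
                "C3-G1-N1", "C3-G2-N1", "C3-G3-N1"),
  inf_id    = c(NA, "C1-G1-N1", NA, NA, "C3-G1-N1", "C3-G2-N1"),
  n_inf     = c(1, NA, NA, 1, 1, NA),
  stringsAsFactors = FALSE
)

# Count how many times each person appears as an infector;
# table() drops the NA infectors of the root cases by default
counts <- table(factor(out$inf_id, levels = out$person_id))
offspring <- as.vector(counts)

# In the example output, individuals who infected nobody show n_inf = NA
offspring_na <- ifelse(offspring == 0, NA, offspring)

# Positions of NA and the non-missing counts should both agree with n_inf
same_na <- identical(is.na(offspring_na), is.na(out$n_inf))
same_counts <- all(offspring_na[!is.na(offspring_na)] ==
                     out$n_inf[!is.na(out$n_inf)])
```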

Details

Generate a branching process as follows. First, covariates $X$ are drawn for a root infector from some distribution $F$ (given by the set of covariates in sample_covariates_df), and the root's probability of transmission is given by an inverse-logit function of these covariates. The number of infections produced by the root node, $N_{(1,1)}$, is a geometric random variable with probability $p_{(1,1)}$, where the indexing represents ($g =$ generation, $i =$ index). If $N_{(1,1)} > 0$, then the $N_{(1,1)}$ infections are added to the cluster and assigned to generation $g = 2$ with indices $i = 1, \dots, N_{(1,1)}$, and covariates are drawn for these new infections. The infection process continues with individuals $(2, 1)$ through $(2, N_{(1,1)})$, whose new infections are added, in order, to the subsequent generation. The process terminates when either there are no new infections or the maximum number of infections specified in max_size is reached.

$$X_{(g,i)} \sim F$$
$$p_{(g,i)} = \mathrm{logit}^{-1}\left( X_{(g,i)} \beta \right)$$
$$N_{(g,i)} \sim \mathrm{Geometric}(p_{(g,i)})$$
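
The process described above can be sketched for a single cluster in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the function simulate_one_cluster and its output columns are hypothetical, the first element of inf_params is assumed to be an intercept, covariates are drawn uniformly, and the geometric variable is taken to count transmissions before the first failure ("flipping until failure"), which corresponds to rgeom() with probability 1 - p.

```r
# Hypothetical sketch of one cluster; not the package's implementation.
simulate_one_cluster <- function(inf_params, sample_covariates_df,
                                 covariate_names, max_size = 50) {
  # Draw X ~ F by picking one row of the covariate data frame uniformly
  draw_x <- function() {
    i <- sample.int(nrow(sample_covariates_df), 1)
    unlist(sample_covariates_df[i, covariate_names, drop = FALSE],
           use.names = FALSE)
  }
  x <- list(draw_x())       # covariates of the root, individual (1, 1)
  gen <- 1L                 # generation of each individual
  inf_idx <- NA_integer_    # row index of each individual's infector
  n_inf <- numeric(0)       # infections caused by each individual
  censored <- FALSE
  i <- 1L
  # Visit individuals in order of infection (generation by generation)
  while (i <= length(gen)) {
    # p = logit^{-1}(X beta); assumes inf_params[1] is an intercept
    p <- plogis(sum(c(1, x[[i]]) * inf_params))
    # Assumption: N counts transmissions before the first failure
    n_new <- rgeom(1, prob = 1 - p)
    # Assumption: offspring beyond max_size are dropped and the
    # cluster is flagged as censored
    room <- max_size - length(gen)
    if (n_new > room) {
      n_new <- room
      censored <- TRUE
    }
    n_inf[i] <- n_new
    if (n_new > 0) {
      for (j in seq_len(n_new)) x[[length(x) + 1]] <- draw_x()
      gen <- c(gen, rep(gen[i] + 1L, n_new))
      inf_idx <- c(inf_idx, rep(i, n_new))
    }
    i <- i + 1L
  }
  data.frame(gen = gen, inf_idx = inf_idx, n_inf = n_inf,
             censored = censored, cluster_size = length(gen))
}

set.seed(2020)
cluster <- simulate_one_cluster(c("beta_0" = -2, "beta_1" = 2),
                                data.frame(x = c(0, 1)), "x")
```

Processing individuals in a single queue keeps the "in order" bookkeeping simple: every new infection is appended to the end, so generation g + 1 is only visited after generation g is exhausted.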

Examples

set.seed(2020)
inf_params <- c("beta_0" = -2, "beta_1" = 2)
df <- data.frame(x = c(0, 1))
branching_processes <- simulate_bp(K = 10,
                                   inf_params = inf_params,
                                   covariate_names = "x",
                                   sample_covariates_df = df)
head(branching_processes)
#>   cluster_id person_id gen   inf_id n_inf cluster_size x censored
#> 1          1  C1-G1-N1   1     <NA>     1            2 1    FALSE
#> 2          1  C1-G2-N1   2 C1-G1-N1    NA            2 1    FALSE
#> 3          2  C2-G1-N1   1     <NA>    NA            1 0    FALSE
#> 4          3  C3-G1-N1   1     <NA>     1            3 1    FALSE
#> 5          3  C3-G2-N1   2 C3-G1-N1     1            3 1    FALSE
#> 6          3  C3-G3-N1   3 C3-G2-N1    NA            3 1    FALSE
table(branching_processes$cluster_size) / sort(unique(branching_processes$cluster_size))
#>
#>  1  2  3  4 11
#>  6  1  1  1  1
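
The division in the last call works because a cluster of size s contributes s rows, so the row count for each size is s times the number of clusters of that size. A small self-contained check of this, using a hand-made data frame (hypothetical data, not simulator output) that mimics the cluster_id and cluster_size columns:

```r
# Toy data: four clusters of sizes 2, 1, 3 and 1, one row per individual
toy <- data.frame(
  cluster_id   = c(1, 1, 2, 3, 3, 3, 4),
  cluster_size = c(2, 2, 1, 3, 3, 3, 1)
)

# Row counts per size divided by the size give clusters per size
by_rows <- table(toy$cluster_size) / sort(unique(toy$cluster_size))

# Equivalent: keep one row per cluster, then tabulate the sizes
by_cluster <- table(toy$cluster_size[!duplicated(toy$cluster_id)])
```

Both approaches should report two clusters of size 1 and one cluster each of sizes 2 and 3; the second form avoids the division at the cost of an explicit de-duplication step.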