Summarize binary covariate clusters

summarize_binary_clusters(df, covariate_name = "x")

Arguments

df	data frame with one row per individual and the following columns: cluster_id, a unique cluster ID; and a column whose name is given by covariate_name, holding the binary (0/1) covariate to summarize over
covariate_name	name of the column in df that holds the single binary covariate (default "x")

Value

a data frame with the following columns

freq

number of clusters that have the cluster_size, x_pos, and x_neg given in the same row

cluster_size

total size of the cluster

x_pos

number of individuals in the cluster with the feature of interest = 1

x_neg

number of individuals in the cluster with the feature of interest = 0

Details

Condense a data frame with one row per individual into a summary of the number of individuals in each cluster who have (1) or do not have (0) a particular binary covariate feature. This assumes the trees are in order by generation.
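The condensing step can be illustrated with a minimal base R sketch. This is not the package's implementation: the name sketch_summarize_binary_clusters and the toy data are hypothetical, and the sketch assumes that freq counts clusters sharing the same cluster_size, x_pos, and x_neg.

```r
# Hypothetical re-implementation for illustration only; the package's
# actual code may differ (for example, it returns a tibble).
sketch_summarize_binary_clusters <- function(df, covariate_name = "x") {
  x <- df[[covariate_name]]
  # One row per cluster: its size and the number of individuals with x = 1.
  per_cluster <- data.frame(
    cluster_size = as.integer(tapply(x, df$cluster_id, length)),
    x_pos = as.integer(tapply(x, df$cluster_id, sum))
  )
  per_cluster$x_neg <- per_cluster$cluster_size - per_cluster$x_pos
  # Count how many clusters share each (cluster_size, x_pos, x_neg) combination.
  per_cluster$freq <- 1L
  out <- aggregate(freq ~ cluster_size + x_pos + x_neg,
                   data = per_cluster, FUN = sum)
  out <- out[order(out$cluster_size, out$x_pos),
             c("freq", "cluster_size", "x_pos", "x_neg")]
  rownames(out) <- NULL
  out
}

# Toy data (assumed): clusters "a" and "b" have the same composition,
# so they collapse into a single row with freq = 2.
toy <- data.frame(cluster_id = c("a", "a", "b", "b", "c"),
                  x = c(1, 0, 0, 1, 1))
res <- sketch_summarize_binary_clusters(toy)
print(res)
```

In this sketch x_pos + x_neg always equals cluster_size, and the sum of freq equals the number of distinct cluster IDs in the input.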

Examples

example_cluster <- data.frame(cluster_id = c(1, 1, 1,
                                             2, 2,
                                             3, 3, 3, 3,
                                             4),
                              x = c(0, 1, 1,
                                    0, 0,
                                    1, 0, 1, 1,
                                    0))
summarize_binary_clusters(example_cluster)
#> # A tibble: 4 x 4
#>    freq cluster_size x_pos x_neg
#>   <int>        <int> <int> <int>
#> 1     1            1     0     1
#> 2     1            2     0     2
#> 3     1            3     2     1
#> 4     1            4     3     1