Typically one will want to stratify by chain by calling cluster_test_by, as this will calculate the number of cell "trials" separately depending on the chain recovered.
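To see why stratifying changes the denominators, the toy sketch below counts cell "trials" with and without splitting by chain. The data frame toy_contigs and the rule "a trial is a distinct cell with at least one contig of that chain" are assumptions made for illustration; the package's own counting may differ in detail.

```r
# Hypothetical toy contig table: one row per recovered contig.
toy_contigs <- data.frame(
  cell  = c("c1", "c1", "c2", "c3", "c3", "c4"),
  chain = c("TRA", "TRB", "TRB", "TRA", "TRB", "TRB"),
  stringsAsFactors = FALSE
)

# Unstratified: every cell counts as a trial, whichever chains it yielded.
n_all <- length(unique(toy_contigs$cell))

# Stratified: only cells in which the given chain was recovered count.
n_by_chain <- tapply(
  toy_contigs$cell, toy_contigs$chain,
  function(x) length(unique(x))
)

print(n_all)
print(n_by_chain)
```

In this toy table a TRA cluster would be tested against fewer trials than a TRB cluster, which is the distinction that stratifying by chain preserves.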
cluster_test_by(ccdb, fields = "chain", tbl = "cluster_tbl", ...)
cluster_logistic_test(
formula,
ccdb,
filterset = cluster_filterset(),
contig_filter_args = TRUE,
tie_break_keys = c("umis", "reads"),
add_cluster_tbl = FALSE,
keep_fit = FALSE,
fitter = glm_glmer,
silent = FALSE
)
fields: character naming fields in tbl.
tbl: one of contig_tbl, cell_tbl or cluster_tbl.
...: passed to cluster_logistic_test.
formula: the right-hand side of a glmer or glm-style formula.
filterset: a call to cluster_filterset() that will be used to subset clusters.
contig_filter_args: an expression passed to dplyr::filter(). Unlike filter, multiple criteria must be combined with &, rather than separated by commas. These act on ccdb$contig_tbl.
tie_break_keys: (optional) character naming fields in contig_tbl that are used to sort the contig table in descending order. Used to break ties if contig_filter_args does not return a unique contig for each cluster.
add_cluster_tbl: logical. Should the output be joined to the cluster_tbl?
keep_fit: logical as to whether the fit objects should be returned as a list column.
fitter: a function taking arguments formula, data, is_mixed and keep_fit that is run on each cluster. Should return a tibble or data.frame.
silent: logical. Should warnings from fitting functions be suppressed?
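As an illustration of the fitter argument, here is a minimal sketch of a function with the documented signature (formula, data, is_mixed, keep_fit) that returns a data.frame with one row per term. The name simple_glm_fitter is hypothetical, and the sketch assumes that the formula it receives already includes a response and that data is an ordinary data.frame; how cluster_logistic_test actually builds the response and the data it passes to the default glm_glmer is not stated here, so check the package source before swapping in something like this.

```r
# Hypothetical fitter, for illustration only (not the package's glm_glmer).
# Assumption: `formula` is complete (has a response) and `data` is a data.frame.
simple_glm_fitter <- function(formula, data, is_mixed = FALSE, keep_fit = FALSE) {
  if (is_mixed) stop("this sketch only handles fixed-effect formulas")
  fit <- stats::glm(formula, data = data, family = stats::binomial())
  coefs <- stats::coef(summary(fit))
  out <- data.frame(
    term      = rownames(coefs),
    estimate  = coefs[, "Estimate"],
    std.error = coefs[, "Std. Error"],
    statistic = coefs[, "z value"],
    p.value   = coefs[, "Pr(>|z|)"],
    row.names = NULL,
    stringsAsFactors = FALSE
  )
  # Optionally carry the fitted model along as a list column.
  if (keep_fit) out$fit <- rep(list(fit), nrow(out))
  out
}

# Toy data (simulated, not from ccdb_ex) to exercise the function directly.
set.seed(1)
toy <- data.frame(pop = rep(c("b6", "balbc"), each = 50))
toy$y <- rbinom(100, 1, ifelse(toy$pop == "b6", 0.3, 0.6))
res <- simple_glm_fitter(y ~ pop, toy)
print(res)
```

Returning plain columns named term, estimate, std.error, statistic and p.value mirrors the columns shown in the example output on this page, which is a design choice of the sketch rather than a documented requirement.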
table with one row per cluster/term.
cluster_test_by: split ccdb and conduct tests within strata
library(dplyr)
data(ccdb_ex)
ccdb_ex = cluster_germline(ccdb_ex)
trav1 = filter(ccdb_ex$cluster_tbl, v_gene == 'TRAV1')
cluster_logistic_test(~ pop + (1 | sample), ccdb_ex,
  filterset = cluster_filterset(white_list = trav1))
#> Fitting mixed logistic models to 2 clusters.
#> Loading required namespace: broom
#> Loading required namespace: lme4
#> Loading required namespace: broom.mixed
#> boundary (singular) fit: see help('isSingular')
#> Warning: variance-covariance matrix computed from finite-difference Hessian is
#> not positive definite or contains NA values: falling back to var-cov estimated from RX
#> Warning: variance-covariance matrix computed from finite-difference Hessian is
#> not positive definite or contains NA values: falling back to var-cov estimated from RX
#> boundary (singular) fit: see help('isSingular')
#> # A tibble: 4 × 7
#> effect term estimate std.error statistic p.value cluster_idx
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <int>
#> 1 fixed (Intercept) -6.05 1.00 -6.04 0.00000000149 1
#> 2 fixed popbalbc -30.0 3298970. -0.00000909 1.00 1
#> 3 fixed (Intercept) -38.9 543. -0.0716 0.943 2
#> 4 fixed popbalbc 32.9 543. 0.0606 0.952 2
# Fixed effect analysis of each cluster, by chain
prev4 = ccdb_ex$contig_tbl %>% group_by(cluster_idx) %>%
  summarize(n()) %>% filter(`n()` >= 4)
cluster_test_by(ccdb = ccdb_ex, fields = 'chain', tbl = 'cluster_tbl',
  formula = ~ pop, filterset = cluster_filterset(white_list = prev4))
#> Fitting fixed logistic models to 7 clusters.
#> Fitting fixed logistic models to 90 clusters.
#> # A tibble: 194 × 7
#> chain term estimate std.error statistic p.value cluster_idx
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <int>
#> 1 TRA (Intercept) -3.10 0.273 -11.3 8.64e-30 26
#> 2 TRA popbalbc -0.601 0.450 -1.33 1.82e- 1 26
#> 3 TRA (Intercept) -4.67 0.580 -8.06 7.88e-16 213
#> 4 TRA popbalbc -1.13 1.16 -0.973 3.30e- 1 213
#> 5 TRA (Intercept) -22.6 2678. -0.00843 9.93e- 1 295
#> 6 TRA popbalbc 18.2 2678. 0.00678 9.95e- 1 295
#> 7 TRA (Intercept) -5.08 0.709 -7.16 7.84e-13 442
#> 8 TRA popbalbc -0.0215 1.00 -0.0214 9.83e- 1 442
#> 9 TRA (Intercept) -22.6 2678. -0.00843 9.93e- 1 463
#> 10 TRA popbalbc 18.2 2678. 0.00678 9.95e- 1 463
#> # … with 184 more rows
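The contig_filter_args and tie_break_keys arguments describe a two-step selection: filter contigs with a single expression joined by &, then sort in descending order on the tie-break keys so that one contig per cluster remains. The sketch below reproduces that idea on a toy table with plain dplyr. The table contigs and its columns is_medoid and high_confidence are hypothetical, and the code illustrates the documented behaviour rather than the package's internal implementation.

```r
library(dplyr)

# Hypothetical toy contig table (values invented for illustration).
contigs <- tibble(
  cluster_idx     = c(1, 1, 1, 2, 2),
  contig_id       = c("a", "b", "c", "d", "e"),
  is_medoid       = c(TRUE, TRUE, FALSE, TRUE, FALSE),
  high_confidence = c(TRUE, TRUE, TRUE, TRUE, FALSE),
  umis            = c(5, 9, 20, 3, 8),
  reads           = c(100, 80, 300, 40, 90)
)

representative <- contigs %>%
  # One expression, criteria joined with & (as contig_filter_args requires).
  filter(is_medoid & high_confidence) %>%
  # Break remaining ties: highest umis first, then highest reads.
  arrange(desc(umis), desc(reads)) %>%
  group_by(cluster_idx) %>%
  slice(1) %>%
  ungroup()

print(representative)
```

Cluster 1 has two contigs passing the filter, so the tie-break keys decide between them; cluster 2 already has a unique contig and the sort has no effect.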