qdiv.model.rcq

qdiv.model.rcq(obj, *, constrain_by=None, randomization='frequency', iterations=999, div_type='naive', distmat=None, q=1.0, use_tqdm=True, random_state=None, **kwargs)

Raup–Crick-style null comparisons for beta-diversity.

Randomizes the abundance table while preserving each sample’s richness and total reads, then contrasts the observed beta-diversity matrix against a null distribution built via randomization.

Parameters:
  • obj (MicrobiomeData | dict | Any) – Input with at least an abundance table under key ‘tab’. Optionally may include ‘meta’ (sample metadata) and ‘tree’ (for phylogenetic measures).

  • constrain_by (str, optional) – Column in metadata to constrain randomization within categories; if None, randomize across all samples.

  • randomization ({"frequency", "abundance"}, default="frequency") –

    Randomization strategy for selecting the set of taxa per randomized sample:
    • “abundance”: probabilities proportional to group-level summed abundances

    • “frequency”: probabilities proportional to group-level presence frequency

    Within the selected set, additional reads are allocated in proportion to the selected taxa’s group-level abundances until the randomized sample matches the original sample’s total reads.

  • iterations (int, default=999) – Number of randomization iterations used to build the null distribution.

  • div_type ({"Jaccard", "Bray", "naive", "phyl", "func"}, default="naive") –

    Dissimilarity index to compute for observed and null tables:
    • “Jaccard”, “Bray”: classic indices on the (randomized) count table

    • “naive”: Hill-number-based (requires q)

    • “phyl”: phylogenetic beta diversity (requires ‘tree’ in obj)

    • “func”: functional beta diversity (requires distmat)

  • distmat (pandas.DataFrame, optional) – Square functional distance matrix (features × features); required if div_type=“func”.

  • q (float, default=1.0) – Diversity order for Hill-number-based indices (used by “naive”, “phyl”, “func”).

  • use_tqdm (bool, default=True) – Use tqdm for progress bars.

  • random_state (int | numpy.random.Generator, optional) – Random seed or Generator for reproducibility.
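The randomization described for the randomization parameter can be sketched for a single sample as follows. This is a minimal illustration, not qdiv's implementation: the function name randomize_sample, its arguments, and the taxa × samples array layout are assumptions made for the example, as is the choice to give each selected taxon one read before distributing the rest.

```python
import numpy as np

def randomize_sample(counts, group_tab, randomization="frequency", rng=None):
    """Randomize one sample, keeping its richness and total reads (hypothetical helper).

    counts:    1-D integer array of read counts per taxon for the sample.
    group_tab: 2-D integer array (taxa x samples) for the sample's group;
               assumed to include the sample itself.
    """
    rng = np.random.default_rng(rng)  # accepts None, an int seed, or a Generator
    counts = np.asarray(counts)
    group_tab = np.asarray(group_tab)

    richness = int((counts > 0).sum())
    total = int(counts.sum())
    out = np.zeros_like(counts)
    if richness == 0:
        return out

    group_abund = group_tab.sum(axis=1).astype(float)
    if randomization == "abundance":
        weights = group_abund
    elif randomization == "frequency":
        weights = (group_tab > 0).sum(axis=1).astype(float)
    else:
        raise ValueError("randomization must be 'frequency' or 'abundance'")

    # Step 1: draw `richness` distinct taxa with the chosen weights.
    chosen = rng.choice(len(counts), size=richness, replace=False,
                        p=weights / weights.sum())

    # Step 2 (assumption): one read per chosen taxon, so richness is preserved.
    out[chosen] = 1

    # Step 3: allocate the remaining reads in proportion to group-level abundance.
    extra = total - richness
    p_extra = group_abund[chosen] / group_abund[chosen].sum()
    out[chosen] += rng.multinomial(extra, p_extra)
    return out
```

Passing an int or a numpy.random.Generator as rng mirrors the two forms that random_state accepts, so repeated calls with the same seed give the same randomized sample.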

Returns:

{
  “div_type”: str,
  “obs_d”: DataFrame (S × S), observed beta-diversity,
  “p”: DataFrame (S × S), Raup–Crick probability P(null < obs) + 0.5·P(null == obs),
  “null_mean”: DataFrame (S × S), mean of null,
  “null_std”: DataFrame (S × S), std of null,
  “ses”: DataFrame (S × S), (null_mean - obs) / null_std
}

Return type:

dict
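As a rough illustration of how the returned statistics relate to each other, the sketch below derives p, null_mean, null_std and ses from an observed matrix and a stack of null matrices using the formulas listed above. The function name rc_summary and the (iterations, S, S) array layout are assumptions for this example. The use of the population standard deviation (ddof=0) and exact equality for ties are also assumptions; the library may differ on both.

```python
import numpy as np

def rc_summary(obs, null):
    """Summarize a null distribution against observed dissimilarities (hypothetical helper).

    obs:  (S, S) array of observed pairwise dissimilarities.
    null: (iterations, S, S) array, one dissimilarity matrix per randomization.
    """
    obs = np.asarray(obs, dtype=float)
    null = np.asarray(null, dtype=float)

    # Fraction of iterations where the null is strictly lower, plus half the ties.
    lower = (null < obs).mean(axis=0)
    ties = (null == obs).mean(axis=0)
    p = lower + 0.5 * ties

    null_mean = null.mean(axis=0)
    null_std = null.std(axis=0)  # ddof=0 is an assumption

    # Pairs with zero null variance (e.g. the diagonal) yield nan or inf.
    with np.errstate(divide="ignore", invalid="ignore"):
        ses = (null_mean - obs) / null_std

    return {"p": p, "null_mean": null_mean, "null_std": null_std, "ses": ses}
```

With this sign convention, a pair whose observed dissimilarity sits below the null mean gets a positive ses, which matches the definition (null_mean - obs) / null_std.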

Notes

  • Group constraints: if constrain_by is given, randomization is performed independently within each metadata category, which preserves between-category structure. Otherwise, all samples are randomized together.

  • Richness & read preservation: for each sample, we draw a set of taxa matching the original richness, then allocate extra reads to match the original total reads.

  • Raup–Crick p-index: the fraction of iterations in which the null dissimilarity is strictly lower than the observed value, with each tie counted as 0.5.

  • A p value close to zero means observed dissimilarity is lower than the null expectation.

  • A p value close to one means observed dissimilarity is higher than the null expectation.

  • A positive ses means observed dissimilarity is lower than the null expectation.

  • A negative ses means observed dissimilarity is higher than the null expectation.