qdiv.model.null module

qdiv.model.null.rcq(obj, *, constrain_by=None, randomization='frequency', iterations=999, div_type='naive', distmat=None, q=1.0, use_tqdm=True, random_state=None, **kwargs)[source]

Raup–Crick-style null comparisons for beta-diversity.

Randomizes the abundance table while preserving each sample’s richness and total reads, then contrasts the observed beta-diversity matrix against a null distribution built via randomization.

Parameters:
  • obj (MicrobiomeData | dict | Any) – Input with at least an abundance table under key ‘tab’. Optionally may include ‘meta’ (sample metadata) and ‘tree’ (for phylogenetic measures).

  • constrain_by (str, optional) – Column in metadata to constrain randomization within categories; if None, randomize across all samples.

  • randomization ({"frequency", "abundance"}, default="frequency") –

    Randomization strategy for selecting the set of taxa per randomized sample:
    • “abundance”: probabilities proportional to group-level summed abundances

    • “frequency”: probabilities proportional to group-level presence frequency

    Within the selected set, additional reads are allocated proportional to the selected taxa’s group-level abundances to match each sample’s total reads.

  • iterations (int, default=999) – Number of randomization iterations used to build the null distribution.

  • div_type ({"Jaccard", "Bray", "naive", "phyl", "func"}, default="naive") –

    Dissimilarity index to compute for observed and null tables:
    • “Jaccard”, “Bray”: classic indices on the (randomized) count table

    • “naive”: Hill-number-based (requires q)

    • “phyl”: phylogenetic beta diversity (requires ‘tree’ in obj)

    • “func”: functional beta diversity (requires distmat)

  • distmat (pandas.DataFrame, optional) – Square functional distance matrix (features × features); required if div_type="func".

  • q (float, default=1.0) – Diversity order for Hill-number-based indices (used by “naive”, “phyl”, “func”).

  • use_tqdm (bool, default=True) – Use tqdm for progress bars.

  • random_state (int | numpy.random.Generator, optional) – Random seed or Generator for reproducibility.

Returns:

Dictionary with keys:
  • “div_type”: str

  • “obs_d”: DataFrame (S × S), observed beta-diversity

  • “p”: DataFrame (S × S), Raup–Crick probability P(null < obs) + 0.5·P(null == obs)

  • “null_mean”: DataFrame (S × S), mean of null

  • “null_std”: DataFrame (S × S), std of null

  • “ses”: DataFrame (S × S), (null_mean - obs) / null_std

Return type:

dict
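As a rough illustration of how the “p” and “ses” entries relate to a null distribution, the sketch below computes both for a single sample pair in plain Python. The helper name p_index_and_ses is hypothetical and not part of qdiv. The population standard deviation is an assumption (the documentation does not say which estimator is used), and ties are detected by exact equality, which may differ from what the library does.

```python
import statistics


def p_index_and_ses(observed, null_values):
    """Tie-aware p-index and standardized effect size for one value.

    Hypothetical helper for illustration; not part of qdiv.
    """
    n = len(null_values)
    lower = sum(v < observed for v in null_values)  # null strictly below observed
    ties = sum(v == observed for v in null_values)  # each exact tie counts one half
    p = (lower + 0.5 * ties) / n
    null_mean = statistics.fmean(null_values)
    null_std = statistics.pstdev(null_values)       # assumption: population std
    ses = (null_mean - observed) / null_std if null_std > 0 else float("nan")
    return p, null_mean, null_std, ses


p, null_mean, null_std, ses = p_index_and_ses(0.4, [0.2, 0.4, 0.6, 0.8])
print(p, round(ses, 3))
```

In this toy case one of the four null values lies below the observed value and one ties with it, so p = (1 + 0.5) / 4 = 0.375, and ses is positive because the observed value is below the null mean.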

Notes

  • Per-sample constraints: if constrain_by is given, randomization is performed within each metadata category independently to preserve structure. Otherwise, all samples are randomized together.

  • Richness & read preservation: for each sample, we draw a set of taxa matching the original richness, then allocate extra reads to match the original total reads.

  • Raup–Crick p-index: counts how often the null dissimilarity is strictly lower than the observed value, with each tie contributing 0.5, and divides the total by the number of iterations.

  • A p value close to zero means observed dissimilarity is lower than the null expectation.

  • A p value close to one means observed dissimilarity is higher than the null expectation.

  • A positive ses means observed dissimilarity is lower than the null expectation.

  • A negative ses means observed dissimilarity is higher than the null expectation.
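The richness- and read-preserving draw described in these notes can be sketched for one sample as follows. This is a minimal standard-library sketch, not the library's implementation: randomize_sample is a hypothetical helper, the sequential weighted draw is one possible way to sample taxa without replacement, and giving each selected taxon one read before allocating the extra reads is an assumption made so that richness is preserved.

```python
import random


def randomize_sample(rng, richness, total_reads, select_weights, abundance_weights):
    """Draw one randomized sample as a {taxon: reads} dict.

    Hypothetical helper for illustration; not part of qdiv.
    """
    pool = list(select_weights)
    weights = [select_weights[t] for t in pool]
    chosen = []
    # Weighted draws without replacement until the original richness is reached.
    for _ in range(richness):
        i = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(i))
        weights.pop(i)
    # Assumption: every selected taxon starts with one read, so richness is preserved.
    counts = {t: 1 for t in chosen}
    extra = total_reads - richness
    if extra > 0:
        # Extra reads are allocated proportional to group-level abundances.
        top_up = rng.choices(chosen, weights=[abundance_weights[t] for t in chosen], k=extra)
        for t in top_up:
            counts[t] += 1
    return counts


# Toy group of three samples (keys are taxa, values are reads per sample).
tab = {"t1": [10, 0, 5], "t2": [0, 3, 2], "t3": [1, 1, 0], "t4": [0, 0, 4]}
frequency = {t: sum(v > 0 for v in row) for t, row in tab.items()}  # presence frequency
abundance = {t: sum(row) for t, row in tab.items()}                 # summed abundance

rng = random.Random(42)
first_sample = [row[0] for row in tab.values()]
richness = sum(v > 0 for v in first_sample)
total_reads = sum(first_sample)
randomized = randomize_sample(rng, richness, total_reads, frequency, abundance)
```

Passing frequency as the selection weights mirrors randomization="frequency"; passing abundance instead mirrors randomization="abundance". With constrain_by, the weights would be computed separately within each metadata category.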

qdiv.model.null.nriq(obj, distmat, *, q=1.0, iterations=999, randomization='features', use_tqdm=True, random_state=None, **kwargs)[source]

Net Relatedness Index (NRI) with q-weighting of relative abundances. Accepts either a MicrobiomeData object or a dict with at least a ‘tab’ DataFrame.

Parameters:
  • obj (MicrobiomeData, dict, or compatible object) – Input data. Must provide at least an abundance table (‘tab’).

  • distmat (pd.DataFrame) – Square distance matrix indexed/columned by feature ids.

  • q (float, default=1.0) – Order of diversity weighting applied to relative abundances.

  • iterations (int, default=999) – Number of randomization iterations used to build the null distribution.

  • randomization ({'features', 'abundances'}, default='features') – Randomization strategy: 'features' shuffles the feature labels (the tips of the phylogenetic tree), whereas 'abundances' shuffles the relative abundance values within each sample.

  • use_tqdm (bool, default=True) – Use tqdm for progress bars.

  • random_state (int or np.random.Generator, optional) – Random seed or generator for reproducibility.

Returns:

Indexed by sample names with columns:
  • ‘MPDq’ : observed MPD_q

  • ‘null_mean’ : mean of null

  • ‘null_std’ : std of null

  • ‘p’ : (count(null < observed) + 0.5 * ties) / iterations

  • ‘ses’ : (null_mean - observed) / null_std

Return type:

pandas.DataFrame

Notes

  • A p value close to zero means that the observed MPD is lower than the null expectation

  • A p value close to one means that the observed MPD is higher than the null expectation

  • A positive ses means that the observed MPD is lower than the null expectation

  • A negative ses means that the observed MPD is higher than the null expectation
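To make the sign conventions above concrete, here is a simplified sketch that compares an unweighted MPD with a label-permutation null. It is not the q-weighted MPD_q computed by nriq: the abundance weighting is omitted, the helper names mpd and mpd_null_test are hypothetical, and the population standard deviation is an assumption.

```python
import itertools
import random
import statistics


def mpd(present, dist):
    """Unweighted mean pairwise distance among the features present in a sample."""
    pairs = list(itertools.combinations(sorted(present), 2))
    return sum(dist[a][b] for a, b in pairs) / len(pairs)


def mpd_null_test(present, dist, iterations=199, seed=0):
    """Compare observed MPD with a label-permutation null (hypothetical helper)."""
    rng = random.Random(seed)
    labels = sorted(dist)
    observed = mpd(present, dist)
    null = []
    for _ in range(iterations):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(labels, shuffled))  # permute feature labels
        null.append(mpd({relabel[f] for f in present}, dist))
    lower = sum(v < observed for v in null)
    ties = sum(v == observed for v in null)
    p = (lower + 0.5 * ties) / iterations
    null_mean = statistics.fmean(null)
    null_std = statistics.pstdev(null)  # assumption: population std
    ses = (null_mean - observed) / null_std if null_std > 0 else float("nan")
    return observed, p, ses


# Two tight clusters: t1-t3 and t4-t5; distances are 0.1 within and 1.0 between.
cluster = {"t1": "A", "t2": "A", "t3": "A", "t4": "B", "t5": "B"}
dist = {
    a: {b: 0.0 if a == b else (0.1 if cluster[a] == cluster[b] else 1.0) for b in cluster}
    for a in cluster
}
observed, p, ses = mpd_null_test({"t1", "t2", "t3"}, dist)
```

The sample holds three closely related features, so its observed MPD sits at the bottom of the null distribution: p is small and ses is positive.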

References

Webb et al. (2002) American Naturalist.

qdiv.model.null.ntiq(obj, distmat, *, q=1.0, iterations=999, randomization='features', use_tqdm=True, random_state=None, **kwargs)[source]

Nearest Taxon Index (NTI) with q-weighting of relative abundances. Computes MNTD_q (mean nearest-taxon distance with q-weighted abundances), then compares to a null obtained by either permuting feature labels (“features”) or shuffling abundances within each sample (“abundances”).

Parameters:
  • obj (MicrobiomeData, dict, or compatible object) – Input data. Must provide at least an abundance table (‘tab’).

  • distmat (pd.DataFrame) – Square distance matrix indexed/columned by feature ids.

  • q (float, default=1.0) – Order of diversity weighting applied to relative abundances.

  • iterations (int, default=999) – Number of randomization iterations used to build the null distribution.

  • randomization ({'features', 'abundances'}, default='features') – Randomization strategy: 'features' shuffles the feature labels (the tips of the phylogenetic tree), whereas 'abundances' shuffles the relative abundance values within each sample.

  • use_tqdm (bool, default=True) – Use tqdm for progress bars.

  • random_state (int or np.random.Generator, optional) – Random seed or generator for reproducibility.

Returns:

Indexed by sample names with columns:
  • ‘MNTDq’ : observed MNTD_q

  • ‘null_mean’ : mean of null

  • ‘null_std’ : std of null

  • ‘p’ : (count(null < observed) + 0.5 * ties) / iterations

  • ‘ses’ : (null_mean - observed) / null_std

Return type:

pandas.DataFrame

Notes

  • A p value close to zero means that the observed MNTD is lower than the null expectation

  • A p value close to one means that the observed MNTD is higher than the null expectation

  • A positive ses means that the observed MNTD is lower than the null expectation

  • A negative ses means that the observed MNTD is higher than the null expectation
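The two randomization strategies can be pictured on a small abundance table. The sketch below is a plain-Python illustration under stated assumptions: the table is a dict mapping each feature to its per-sample values, and permute_features and shuffle_abundances are hypothetical helpers rather than qdiv functions.

```python
import random


def permute_features(rng, tab):
    """'features': relabel whole rows, identically for all samples."""
    labels = list(tab)
    shuffled = labels[:]
    rng.shuffle(shuffled)
    # The row originally stored under `old` is now stored under `new`.
    return {new: tab[old][:] for old, new in zip(labels, shuffled)}


def shuffle_abundances(rng, tab):
    """'abundances': shuffle the values within each sample (column) independently."""
    labels = list(tab)
    n_samples = len(tab[labels[0]])
    out = {f: [0] * n_samples for f in labels}
    for j in range(n_samples):
        column = [tab[f][j] for f in labels]
        rng.shuffle(column)
        for f, value in zip(labels, column):
            out[f][j] = value
    return out


tab = {"t1": [10, 0, 5], "t2": [0, 3, 2], "t3": [1, 1, 0], "t4": [0, 0, 4]}
rng = random.Random(0)
by_features = permute_features(rng, tab)
by_abundances = shuffle_abundances(rng, tab)
```

Both strategies keep each sample's set of abundance values intact. Permuting features also keeps each row intact (only its label changes), while shuffling abundances breaks the association between samples.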

qdiv.model.null.beta_nriq(obj, distmat, *, q=1.0, iterations=999, randomization='features', use_tqdm=True, random_state=None, **kwargs)[source]

Computes beta-MPD_q for all sample pairs, then contrasts against a null generated by (a) feature label permutations (“features”) or (b) within-sample abundance shuffles (“abundances”).

Parameters:
  • obj (MicrobiomeData, dict, or compatible object) – Input data. Must provide at least an abundance table (‘tab’).

  • distmat (pd.DataFrame) – Square distance matrix indexed/columned by feature ids.

  • q (float, default=1.0) – Order of diversity weighting applied to relative abundances.

  • iterations (int, default=999) – Number of randomization iterations used to build the null distribution.

  • randomization ({'features', 'abundances'}, default='features') – Randomization strategy: 'features' shuffles the feature labels (the tips of the phylogenetic tree), whereas 'abundances' shuffles the relative abundance values within each sample.

  • use_tqdm (bool, default=True) – Use tqdm for progress bars.

  • random_state (int or np.random.Generator, optional) – Random seed or generator for reproducibility.

Returns:

Full (samples × samples) matrices:
  • ‘beta_MPDq’ : observed beta-MPD_q

  • ‘null_mean’ : mean of null beta-MPD_q

  • ‘null_std’ : std of null beta-MPD_q

  • ‘p’ : (count(null < obs) + 0.5 * ties) / iterations

  • ‘ses’ : (null_mean - obs) / null_std

Return type:

dict of pandas.DataFrame (S x S)

Notes

  • If iterations=0, only a DataFrame with the observed beta_MPDq is returned; otherwise a dictionary is returned

  • A p value close to zero means that the observed MPD between samples is lower than the null expectation

  • A p value close to one means that the observed MPD between samples is higher than the null expectation

  • A positive ses means that the observed MPD between samples is lower than the null expectation

  • A negative ses means that the observed MPD between samples is higher than the null expectation

qdiv.model.null.beta_ntiq(obj, distmat, *, q=1.0, iterations=999, include_conspecifics=False, randomization='features', use_tqdm=True, random_state=None, **kwargs)[source]

Computes beta-MNTD_q (mean nearest-taxon distance with q-weighted abundances) for all sample pairs, then contrasts the observed matrix against a null distribution generated by randomization:

  • randomization="features": permute feature identities (rows) identically across samples

  • randomization="abundances": shuffle abundances within each sample (column-wise)

The null distribution is aggregated online using Welford updates, yielding per-pair null mean, null std, tie-aware p-index, and standardized effect size.
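A Welford update keeps a running mean and sum of squared deviations, so the null values never need to be stored. The sketch below shows the idea for a single sample pair; the class name RunningNull is hypothetical, the population standard deviation is an assumption, and the library presumably applies the same kind of update to whole matrices.

```python
import math


class RunningNull:
    """Online accumulator for one sample pair (hypothetical; not part of qdiv)."""

    def __init__(self, observed):
        self.observed = observed
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0   # running sum of squared deviations from the mean
        self.lower = 0  # null values strictly below observed
        self.ties = 0   # null values equal to observed

    def update(self, value):
        # Welford's update: adjust the mean first, then accumulate m2.
        self.n += 1
        delta = value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (value - self.mean)
        self.lower += value < self.observed
        self.ties += value == self.observed

    def summary(self):
        std = math.sqrt(self.m2 / self.n)  # assumption: population std
        p = (self.lower + 0.5 * self.ties) / self.n
        ses = (self.mean - self.observed) / std if std > 0 else float("nan")
        return {"null_mean": self.mean, "null_std": std, "p": p, "ses": ses}


acc = RunningNull(observed=0.4)
for value in [0.2, 0.4, 0.6, 0.8]:
    acc.update(value)
stats = acc.summary()
```

Memory use stays constant in the number of iterations, which matters when every iteration produces a full samples × samples matrix.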

Parameters:
  • obj (MicrobiomeData | dict | Any) – Input with at least an abundance table under key ‘tab’.

  • distmat (pandas.DataFrame) – Square distance matrix (features × features) whose index/columns include tab.index.

  • q (float, default=1.0) – Diversity order used to weight relative abundances (applied only to strictly positive entries).

  • iterations (int, default=999) – Number of randomization iterations used to build the null distribution.

  • include_conspecifics (bool, default=False) – Determines whether conspecifics (identical features shared between samples) are allowed to contribute zero-distance matches in the nearest-taxon calculation.

  • randomization ({"features", "abundances"}, default="features") –

    Randomization strategy for the null model:
    • “features”: permute feature identities identically for all samples (tip-label permutation).

    • “abundances”: shuffle abundances within each sample (column-wise permutation).

  • use_tqdm (bool, default=True) – Use tqdm for progress bars (a lightweight stub is used if tqdm is unavailable).

  • random_state (int | numpy.random.Generator, optional) – Random seed or Generator for reproducibility.

Returns:

Full (samples × samples) matrices:
  • ‘beta_MNTDq’ : observed beta-MNTD_q

  • ‘null_mean’ : mean of null beta-MNTD_q

  • ‘null_std’ : std of null beta-MNTD_q

  • ‘p’ : (count(null < observed) + 0.5 * ties) / iterations

  • ‘ses’ : (null_mean - observed) / null_std

Diagonal entries are set to NaN.

Return type:

dict of pandas.DataFrame

Notes

  • If iterations=0, only a DataFrame with the observed beta_MNTDq is returned; otherwise a dictionary is returned

  • A p value close to zero means that the observed MNTD between samples is lower than the null expectation

  • A p value close to one means that the observed MNTD between samples is higher than the null expectation

  • A positive ses means that the observed MNTD between samples is lower than the null expectation

  • A negative ses means that the observed MNTD between samples is higher than the null expectation
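The effect of include_conspecifics can be seen in a simplified, presence-only version of the nearest-taxon calculation. This sketch omits the q-weighting of abundances and is only one plausible reading of the parameter: the helper beta_mntd is hypothetical, and averaging the two directions with equal weight is an assumption.

```python
def beta_mntd(sample_a, sample_b, dist, include_conspecifics=False):
    """Presence-only mean nearest-taxon distance between two samples.

    Hypothetical helper for illustration; not part of qdiv.
    """

    def mean_nearest(src, dst):
        nearest = []
        for f in src:
            # When conspecifics are excluded, a feature cannot match itself.
            candidates = [dist[f][g] for g in dst if include_conspecifics or g != f]
            if candidates:
                nearest.append(min(candidates))
        return sum(nearest) / len(nearest) if nearest else float("nan")

    # Assumption: both directions are averaged with equal weight.
    return 0.5 * (mean_nearest(sample_a, sample_b) + mean_nearest(sample_b, sample_a))


dist = {
    "t1": {"t1": 0.0, "t2": 0.2, "t3": 0.8},
    "t2": {"t1": 0.2, "t2": 0.0, "t3": 0.6},
    "t3": {"t1": 0.8, "t2": 0.6, "t3": 0.0},
}
a, b = ["t1", "t2"], ["t2", "t3"]
with_shared = beta_mntd(a, b, dist, include_conspecifics=True)
without_shared = beta_mntd(a, b, dist, include_conspecifics=False)
```

With this toy data, the shared feature t2 contributes zero-distance matches and pulls the value down to 0.2 when conspecifics are included, compared with 0.4 when they are excluded.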

References

Webb et al. (2002) American Naturalist.

Stegen et al. (2013) ISME Journal.