qdiv.diversity.beta_div module

qdiv.diversity.beta_div.ra_to_branches(ra, tree_df)

Return a branches × samples relative-abundance table (tree2).

Parameters:

ra (DataFrame)
tree_df (DataFrame)

Return type:

DataFrame
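The idea of a branch-level table can be illustrated with a minimal sketch. It is not the qdiv implementation: the helper name branches_from_ra is hypothetical, and it assumes tree_df has a ‘leaves’ column listing the feature IDs under each branch, as described for the tree elsewhere in this module.

```python
import pandas as pd

def branches_from_ra(ra: pd.DataFrame, tree_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: sum leaf relative abundances under each branch."""
    rows = {}
    for branch, leaves in tree_df["leaves"].items():
        # Ignore leaves that do not occur in the abundance table.
        present = [leaf for leaf in leaves if leaf in ra.index]
        rows[branch] = ra.loc[present].sum(axis=0)
    # The dict of Series gives samples × branches; transpose to branches × samples.
    return pd.DataFrame(rows).T

# Toy data (made up for illustration): three features, two samples.
ra = pd.DataFrame(
    {"S1": [0.5, 0.3, 0.2], "S2": [0.1, 0.1, 0.8]},
    index=["a", "b", "c"],
)
tree_df = pd.DataFrame(
    {
        "leaves": [["a"], ["b"], ["c"], ["a", "b"]],
        "branchL": [1.0, 1.0, 2.0, 1.0],
    },
    index=["b_a", "b_b", "b_c", "b_ab"],
)

branch_ra = branches_from_ra(ra, tree_df)
print(branch_ra)
```

The internal branch b_ab carries the summed abundance of features a and b in each sample, while terminal branches simply repeat the feature's own relative abundance.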

qdiv.diversity.beta_div.compute_Tmean(tree_df, abund)

Compute sample-specific mean tree height T (Chao et al. 2010).

T_j = sum_b L_b * a_{b,j}

Parameters:

tree_df (pd.DataFrame) – Tree dataframe with column ‘branchL’
abund (pd.DataFrame) – Branch × sample matrix of descendant relative abundances (a_{b,j})

Returns:

One T value per sample (indexed by sample name)

Return type:

pd.Series
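The formula T_j = sum_b L_b * a_{b,j} is a weighted column sum. The sketch below evaluates it directly with pandas on made-up toy data; the variable names are illustrative and nothing here calls qdiv itself.

```python
import pandas as pd

# Branch lengths L_b (toy values for an ultrametric tree of height 2).
branch_len = pd.Series(
    [1.0, 1.0, 2.0, 1.0], index=["b_a", "b_b", "b_c", "b_ab"]
)

# Branch × sample matrix a_{b,j} of descendant relative abundances.
abund = pd.DataFrame(
    {"S1": [0.5, 0.3, 0.2, 0.8], "S2": [0.1, 0.1, 0.8, 0.2]},
    index=branch_len.index,
)

# Multiply every row b by L_b, then sum over branches for each sample j.
t_mean = abund.mul(branch_len, axis=0).sum(axis=0)
print(t_mean)
```

Because the toy tree is ultrametric, both samples get the same value, which equals the tree height of 2. With a non-ultrametric tree, T would differ between samples depending on their composition.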

qdiv.diversity.beta_div.naive_beta(tab, *, q=1, dis=True, viewpoint='regional', use_values_in_tab=False)

Compute naive (taxonomic) pairwise beta diversity of order q.

Implements the two‑community Hill‑number beta diversity framework described in Chao et al. (2014), using only species abundances (no phylogenetic or functional information).

For two samples A and B:

α_q = Hill number of the average of A and B
γ_q = Hill number of the pooled community
β_q = γ_q / α_q

Special case q = 1 uses the Shannon limit:

α₁ = exp( -½ Σ pᵢ ln pᵢ - ½ Σ qᵢ ln qᵢ )
γ₁ = exp( -Σ mᵢ ln mᵢ )

Parameters:

tab (DataFrame | MicrobiomeData-like | dict) – Abundance table (features x samples) or convertible structure.
q (float, default=1) – Diversity order.
dis (bool, default=True) – If True, convert β to a dissimilarity using beta2dist. If False, return raw β values.
viewpoint ({'local', 'regional'}, default='regional') – Viewpoint for converting β to dissimilarity.
use_values_in_tab (bool, default=False) – If False, convert abundances to relative abundances. If True, assume tab already contains relative abundances.

Returns:

Pairwise β-diversity (or dissimilarity) matrix.

Return type:

pandas.DataFrame

Notes

Requires beta2dist() to be defined elsewhere.
Only works for ≥ 2 samples.
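The q = 1 case can be worked through by hand with a short sketch. It is an illustration of the formulas above, not the library code: the function names are hypothetical, and it assumes that pᵢ and qᵢ are the relative abundances in samples A and B and that the pooled abundances are mᵢ = (pᵢ + qᵢ) / 2 (equal sample weights).

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy -sum p ln p, skipping zero entries (0 ln 0 = 0)."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum())

def naive_beta_q1(p_a, p_b):
    """Two-sample beta diversity of order q = 1 (illustrative sketch)."""
    p_a = np.asarray(p_a, dtype=float)
    p_b = np.asarray(p_b, dtype=float)
    m = (p_a + p_b) / 2.0  # assumed pooling: equal weight per sample
    alpha = np.exp(0.5 * shannon_entropy(p_a) + 0.5 * shannon_entropy(p_b))
    gamma = np.exp(shannon_entropy(m))
    return float(gamma / alpha)

same = naive_beta_q1([0.5, 0.5, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0])
disjoint = naive_beta_q1([0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5])
print(same, disjoint)
```

Under these assumptions β ranges from 1 (identical samples) to 2 (samples with no shared features), which is the range that the conversion to a dissimilarity rescales.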

qdiv.diversity.beta_div.phyl_beta(obj, *, q=1, dis=True, viewpoint='regional', use_values_in_tab=False)

Compute phylogenetic pairwise beta diversity of order q.

Implements the two‑community phylogenetic Hill‑number beta framework described in Chao et al. (2014), where branch lengths are weighted by the relative abundances of all features descending from each branch.

For two samples A and B:

α_q = phylogenetic Hill number of the average of A and B
γ_q = phylogenetic Hill number of the pooled community
β_q = γ_q / α_q

Special case q = 1 uses the Shannon limit.

Parameters:

obj (MicrobiomeData-like | dict) –
Must provide:
- ‘tab’: feature × sample abundance DataFrame
- ‘tree’: DataFrame with one row per branch and the columns:
  
  ‘leaves’ : iterable/list of leaf IDs under each branch
  
  ‘branchL’: branch length (float)
q (float, default=1) – Diversity order.
dis (bool, default=True) – If True, convert β to a dissimilarity using beta2dist.
viewpoint ({'local', 'regional'}, default='regional') – Viewpoint for converting β to dissimilarity.
use_values_in_tab (bool, default=False) – If False, convert abundances to relative abundances. If True, assume tab already contains relative abundances.

Returns:

Pairwise phylogenetic β-diversity (or dissimilarity) matrix.

Return type:

pandas.DataFrame

Notes

Requires beta2dist() to be defined elsewhere.
Only works for ≥ 2 samples.

qdiv.diversity.beta_div.func_beta(tab, distmat, *, q=1, dis=True, viewpoint='regional', use_values_in_tab=False, use_tqdm=True)

Compute functional pairwise beta diversity of order q.

Implements the two‑community functional Hill‑number beta framework based on local functional overlaps as in Chao et al. (2014). Functional diversity is derived from pairwise trait distances between ASVs and their abundances.

For each pair of samples (A, B), the method computes:

Dg : functional Hill number for the pooled community (gamma)

Da : functional Hill number for the “average” community (alpha)

beta = Dg / Da

For q = 1, the Shannon-type limit is used; for q ≠ 1, the general Hill-number form is used.

Parameters:

tab (DataFrame | MicrobiomeData-like | dict) – Abundance table (features x samples) or convertible structure.
distmat (pandas.DataFrame) – Functional distance matrix (ASVs × ASVs), symmetric and indexed by the same ASVs as tab.
q (float, default=1) – Diversity order.
dis (bool, default=True) – If True, convert β to a dissimilarity using beta2dist.
viewpoint ({'local', 'regional'}, default='regional') – Viewpoint for converting β to dissimilarity.
use_values_in_tab (bool, default=False) – If False, convert abundances to relative abundances. If True, assume tab already contains relative abundances.
use_tqdm (bool, default=True) – Use tqdm for progress bars.

Returns:

Pairwise functional dissimilarity matrix (if dis=True) or squared functional beta (β²) matrix (if dis=False).

Return type:

pandas.DataFrame

Notes

Only works for ≥ 2 samples.

qdiv.diversity.beta_div.bray(tab, *, use_values_in_tab=False)

Compute the Bray–Curtis dissimilarity matrix between all samples.

Bray–Curtis dissimilarity between two samples A and B is:

BC(A, B) = 1 − Σ_i min(p_iA, p_iB)

where p_iA and p_iB are relative abundances of feature i in samples A and B.

Parameters:

tab (DataFrame | MicrobiomeData-like | dict) – Abundance table (features x samples) or convertible structure.
use_values_in_tab (bool, default=False) – If False, convert abundances to relative abundances. If True, assume tab already contains relative abundances.

Returns:

Symmetric Bray–Curtis dissimilarity matrix.

Return type:

pandas.DataFrame

Notes

Requires at least two samples.
Zero-sum samples are not allowed unless use_values_in_tab=True.
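The formula BC(A, B) = 1 − Σ_i min(p_iA, p_iB) can be evaluated for all sample pairs at once with NumPy broadcasting. The sketch below is a standalone illustration with a hypothetical function name and made-up counts; it is not the qdiv implementation.

```python
import numpy as np
import pandas as pd

def bray_sketch(tab: pd.DataFrame) -> pd.DataFrame:
    """Bray-Curtis on relative abundances (features × samples input)."""
    # Convert counts to relative abundances per sample (column).
    ra = tab / tab.sum(axis=0)
    x = ra.to_numpy().T  # samples × features
    # Pairwise element-wise minima, summed over features.
    shared = np.minimum(x[:, None, :], x[None, :, :]).sum(axis=2)
    return pd.DataFrame(1.0 - shared, index=tab.columns, columns=tab.columns)

# Toy count table: three features, three samples.
tab = pd.DataFrame(
    {"S1": [10, 0, 10], "S2": [0, 20, 0], "S3": [5, 5, 10]},
    index=["a", "b", "c"],
)
bc = bray_sketch(tab)
print(bc)
```

A column that sums to zero would produce NaN in the division step, which is why zero-sum samples have to be excluded before relative abundances are computed.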

qdiv.diversity.beta_div.jaccard(tab, *, use_values_in_tab=False)

Compute the Jaccard dissimilarity matrix between all samples.

Jaccard dissimilarity between two samples A and B is:

J(A, B) = 1 − ( |A ∩ B| / |A ∪ B| )

where presence/absence is determined by whether abundance > 0.

Parameters:

tab (DataFrame | MicrobiomeData-like | dict) – Abundance table (features x samples) or convertible structure.
use_values_in_tab (bool, default=False) – Ignored for Jaccard (presence/absence only), included for API symmetry.

Returns:

Symmetric Jaccard dissimilarity matrix.

Return type:

pandas.DataFrame

Notes

Requires at least two samples.
Abundances are converted to binary presence/absence.
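The presence/absence calculation can be sketched in a few lines. This is an illustration of the formula J(A, B) = 1 − |A ∩ B| / |A ∪ B| with a hypothetical function name and made-up counts, not the library code.

```python
import numpy as np
import pandas as pd

def jaccard_sketch(tab: pd.DataFrame) -> pd.DataFrame:
    """Jaccard dissimilarity from presence/absence (features × samples input)."""
    # A feature is present in a sample when its abundance is > 0.
    present = (tab.to_numpy() > 0).T  # samples × features, boolean
    inter = (present[:, None, :] & present[None, :, :]).sum(axis=2)
    union = (present[:, None, :] | present[None, :, :]).sum(axis=2)
    # Note: union is 0 only if both samples are empty, which is not handled here.
    return pd.DataFrame(1.0 - inter / union, index=tab.columns, columns=tab.columns)

# Toy count table: three features, three samples.
tab = pd.DataFrame(
    {"S1": [10, 0, 10], "S2": [0, 20, 0], "S3": [5, 5, 10]},
    index=["a", "b", "c"],
)
jac = jaccard_sketch(tab)
print(jac)
```

Only the pattern of zeros matters here, so rescaling the counts in any sample leaves the result unchanged.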

qdiv.diversity.beta_div.naive_multi_beta(obj, *, by=None, q=1)

Compute naive (taxonomic) multi‑sample beta diversity for groups of samples.

This implements the multi‑sample Hill‑number beta framework:

β_q = γ_q / ( α_q / N )

where:

γ_q is the Hill number of the pooled community
α_q is the mean within‑sample Hill number
N is the number of samples in the group

Parameters:

obj (MicrobiomeData-like | dict) –
Must contain:
- ‘meta’ : pandas.DataFrame with sample metadata
- ‘tab’ : pandas.DataFrame with feature counts (features × samples)
by (str or None, default=None) – Column in metadata defining sample groups. If None, all samples are treated as one group.
q (float, default=1) – Diversity order.

Returns:

Index = categories in by (or ‘all’ if by=None) Columns:

N : number of samples in group

beta : multi‑sample beta diversity

local_dis : local‑viewpoint dissimilarity

regional_dis : regional‑viewpoint dissimilarity

Return type:

pandas.DataFrame

Notes

Groups with <2 samples return NaN.

qdiv.diversity.beta_div.phyl_multi_beta(obj, *, by=None, q=1)

Compute phylogenetic multi‑sample beta diversity for groups of samples.

Implements the multi‑sample phylogenetic Hill‑number beta framework described in Chao et al. (2014), where branch lengths are weighted by the relative abundances of all ASVs descending from each branch.

For each group of samples:

β_q = γ_q / ( α_q / N )

where:

γ_q is the phylogenetic Hill number of the pooled community
α_q is the mean within‑sample phylogenetic Hill number
N is the number of samples in the group

Parameters:

obj (MicrobiomeData-like | dict) –
Must contain:
- ‘meta’ : pandas.DataFrame with sample metadata
- ‘tab’ : pandas.DataFrame with ASV counts (ASVs × samples)
- ‘tree’ : pandas.DataFrame with:
  
  ‘leaves’ : list of features under each branch
  
  ‘branchL’ : branch length
by (str or None, default=None) – Metadata column defining sample groups. If None, all samples are treated as one group.
q (float, default=1) – Diversity order.

Returns:

Index = categories in by (or ‘all’ if by=None) Columns:

N : number of samples in group

beta : multi‑sample phylogenetic beta diversity

local_dis : local‑viewpoint dissimilarity

regional_dis : regional‑viewpoint dissimilarity

Return type:

pandas.DataFrame

Notes

Only works for ≥ 2 samples per group.

qdiv.diversity.beta_div.func_multi_beta(obj, distmat, *, by=None, q=1)

Compute functional multi‑sample beta diversity for groups of samples.

Implements the multi‑sample functional Hill‑number beta framework described in Chiu et al. (2014), where functional diversity is derived from pairwise trait distances and species abundances.

For each group of samples:

β_q = D_gamma / D_alpha

where:

D_gamma is the functional Hill number of the pooled community
D_alpha is the mean functional Hill number across all sample pairs
N is the number of samples in the group
NxN = N² (number of ordered sample pairs)

Parameters:

obj (MicrobiomeData-like | dict) –
Must contain:
- ‘meta’ : pandas.DataFrame with sample metadata
- ‘tab’ : pandas.DataFrame (features × samples)
distmat (pandas.DataFrame) – Functional distance matrix (features × features).
by (str or None, default=None) – Metadata column defining sample groups. If None, all samples are treated as one group.
q (float, default=1) – Diversity order.

Returns:

Index = categories in by (or ‘all’ if by=None) Columns:

NxN : N² (number of ordered sample pairs)

beta : functional multi‑sample beta diversity

local_dis : local‑viewpoint dissimilarity

regional_dis : regional‑viewpoint dissimilarity

Return type:

pandas.DataFrame

Notes

Only works for ≥ 2 samples per group.

qdiv.diversity.beta_div.evenness(obj, distmat=None, *, q=1, div_type='naive', index='pielou', perspective='samples', use_values_in_tab=False)

Compute evenness measures from Chao & Ricotta (2019, Ecology 100:e02852), with optional support for Pielou’s classical evenness index.

Supports:

naive (taxonomic) evenness
phylogenetic evenness
functional evenness

Supported evenness indices:

CR1 (regional evenness)
CR2 (local evenness)
CR3
CR4
CR5
pielou (Pielou’s J; defined only for q = 1)

Parameters:

obj (DataFrame | MicrobiomeData-like | dict) – Input providing the abundance table (features × samples) and, optionally, a tree (pandas.DataFrame; required if div_type='phyl').
distmat (pandas.DataFrame, optional) – Functional distance matrix. Required if div_type='func'.
q (float, default=1) – Diversity order.
div_type ({'naive', 'phyl', 'func'}) – Type of diversity measure used to compute D.
index ({'CR1','CR2','CR3','CR4','CR5','local','regional','pielou'}) – Evenness index to compute.
perspective ({'samples','taxa'}) – Whether to compute evenness across samples (columns) or across taxa/branches (rows).
use_values_in_tab (bool, default=False) – If False, convert abundances to relative abundances.

Returns:

Evenness values indexed by sample or taxon.

Return type:

pandas.Series

Notes

CR1 = regional evenness
CR2 = local evenness
CR3–CR5 are alternative evenness formulations from Chao & Ricotta (2019)
Pielou’s index is included for convenience and corresponds to:
J = H′ / ln(S) = ln(D₁) / ln(S)

where D₁ is the Hill number of order q = 1 and S is the number of features (richness).
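The relation J = ln(D₁) / ln(S) can be checked numerically with a short sketch. It is a standalone illustration with a hypothetical function name, not the qdiv implementation, and it assumes S counts the features with non-zero abundance in the sample.

```python
import numpy as np

def pielou_j(abundances):
    """Pielou's J = ln(D1) / ln(S) for one sample (illustrative sketch)."""
    p = np.asarray(abundances, dtype=float)
    p = p[p > 0]          # assumption: S counts non-zero features only
    p = p / p.sum()       # convert to relative abundances
    s = p.size
    if s < 2:
        return float("nan")  # ln(1) = 0, so J is undefined for one feature
    d1 = np.exp(-(p * np.log(p)).sum())  # Hill number of order q = 1
    return float(np.log(d1) / np.log(s))

even = pielou_j([25, 25, 25, 25])
skewed = pielou_j([97, 1, 1, 1])
print(even, skewed)
```

A perfectly even sample gives J = 1 because D₁ then equals S, and J approaches 0 as a single feature dominates.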

qdiv.diversity.beta_div.dissimilarity_by_feature(obj, *, by=None, q=1, div_type='naive', index='regional', use_values_in_tab=False)

Compute the contribution of individual taxa (or phylogenetic nodes) to the overall dissimilarity between multiple samples, following Chao & Ricotta (2019, Ecology 100:e02852).

Supports:

naive (taxonomic) dissimilarity
phylogenetic dissimilarity

Parameters:

obj (DataFrame | MicrobiomeData-like | dict) –
Must contain:
- ‘tab’ : abundance table (features × samples)
- ‘meta’ : metadata table (optional if by=None)
- ‘tree’ : phylogenetic tree (required if div_type='phyl')
by (str or None, default=None) – Metadata column defining sample groups. If None, all samples are treated as one group.
q (float, default=1) – Diversity order.
div_type ({'naive','phyl'}, default='naive') – Type of dissimilarity measure.
index ({'local','regional','CR1','CR2'}, default='regional') – Evenness/dissimilarity index.
use_values_in_tab (bool, default=False) – If False, convert abundances to relative abundances.

Returns:

Rows:

’dis’ : total dissimilarity
’N’ : number of samples in group
one row per taxon (naive) or per node (phylogenetic)

Columns:

one column per category in by

Return type:

pandas.DataFrame

qdiv.diversity.beta_div.beta_mpdq(obj, distmat, *, q=1.0)

Compute beta-MPD_q for all sample pairs.

Parameters:

obj (MicrobiomeData, dict, or compatible object) – Input data. Must provide at least an abundance table (‘tab’).
distmat (pd.DataFrame) – Square distance matrix indexed/columned by feature ids.
q (float, default=1.0) – Order of diversity weighting applied to relative abundances.

Return type:

pandas.DataFrame (S x S)

qdiv.diversity.beta_div.beta_mntdq(obj, distmat, *, q=1.0, include_conspecifics=False)

Compute beta-MNTD_q for all sample pairs.

Parameters:

obj (MicrobiomeData, dict, or compatible object) – Input data. Must provide at least an abundance table (‘tab’).
distmat (pd.DataFrame) – Square distance matrix indexed/columned by feature ids.
q (float, default=1.0) – Order of diversity weighting applied to relative abundances.
include_conspecifics (bool, default=False) – Determines whether conspecifics (identical features shared between samples) are allowed to contribute zero-distance matches in the nearest-taxon calculation.

Return type:

pandas.DataFrame (S x S)