qdiv.diversity.alpha_div module

qdiv.diversity.alpha_div.compute_Tmean(tree_df, abund)[source]

Compute sample-specific mean tree height T (Chao et al. 2010).

T_j = sum_b L_b * a_{b,j}

Parameters:
  • tree_df (pd.DataFrame) – Tree dataframe with column ‘branchL’

  • abund (pd.DataFrame) – Branch × sample matrix of descendant relative abundances (a_{b,j})

Returns:

One T value per sample (indexed by sample name)

Return type:

pd.Series
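The formula above can be sketched with plain pandas. This is a minimal illustration, not the library's implementation; the toy tree, the branch labels (b1 to b3), and the sample names are hypothetical.

```python
import pandas as pd

# Hypothetical toy tree: two tip branches (length 1.0) and one root branch (length 0.5).
tree_df = pd.DataFrame({"branchL": [1.0, 1.0, 0.5]}, index=["b1", "b2", "b3"])

# Branch x sample matrix of descendant relative abundances a_{b,j}.
abund = pd.DataFrame(
    {"S1": [0.5, 0.5, 1.0], "S2": [0.25, 0.75, 1.0]},
    index=["b1", "b2", "b3"],
)

# T_j = sum_b L_b * a_{b,j}: scale each branch row by its length, then sum over branches.
T = abund.mul(tree_df["branchL"], axis=0).sum(axis=0)
```

Because this toy tree is ultrametric, both samples get the same T (the tree height, 1.5) even though their abundances differ.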

qdiv.diversity.alpha_div.ra_to_branches(ra, tree_df)[source]

Return tree2, the (branches × samples) table of relative abundances, where each branch carries the total relative abundance of its descendant leaves.

Parameters:
  • ra (DataFrame)

  • tree_df (DataFrame)

Return type:

DataFrame
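One way such a branch table could be built is sketched below. It assumes the tree dataframe has a 'leaves' column listing the descendant leaves of each branch; the toy data and the name branch_ra are hypothetical, and the library's own code may differ.

```python
import pandas as pd

# Feature x sample relative abundances (each column sums to 1).
ra = pd.DataFrame(
    {"S1": [0.5, 0.3, 0.2], "S2": [0.1, 0.1, 0.8]},
    index=["f1", "f2", "f3"],
)

# Hypothetical tree dataframe: one row per branch, with its descendant leaves.
tree_df = pd.DataFrame(
    {
        "leaves": [["f1"], ["f2"], ["f3"], ["f1", "f2"]],
        "branchL": [1.0, 1.0, 2.0, 1.0],
    },
    index=["b1", "b2", "b3", "b4"],
)

# For each branch, sum the relative abundances of its descendant leaves,
# then transpose so that rows are branches and columns are samples.
branch_ra = pd.DataFrame(
    {b: ra.loc[leaves].sum(axis=0) for b, leaves in tree_df["leaves"].items()}
).T
```

Tip branches simply copy the abundance of their single leaf, while the internal branch b4 carries the combined abundance of f1 and f2.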

qdiv.diversity.alpha_div.naive_alpha(tab, *, q=1, use_values_in_tab=False)[source]

Compute naive alpha diversity of order q for all samples.

Accepts:
  • DataFrame: features x samples

  • MicrobiomeData-like object: must expose a DataFrame in .tab / .table / .counts / .abundance

  • dict-of-dicts: either {feature: {sample: count}} or {sample: {feature: count}}

Parameters:
  • tab (DataFrame | MicrobiomeData-like | dict) – Abundance table (features x samples) or convertible structure.

  • q (float, default=1) –

    Diversity order:

    • q = 0 : species richness

    • q = 1 : exponential of Shannon entropy

    • q = 2 : inverse Simpson

    • general q : Hill number of order q

  • use_values_in_tab (bool, default=False) – If False (default), values are converted to relative abundances. If True, values in tab are assumed to already be relative abundances.

Returns:

Hill numbers for each sample. If input has one sample/column, returns a float.

Return type:

pandas.Series or float

Notes

  • For q = 1, the limit definition is used:

    H₁ = exp( - Σ pᵢ ln pᵢ )

  • For q ≠ 1:

    H_q = ( Σ pᵢ^q )^( 1 / (1 - q) )

  • Zero abundances are ignored safely.
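The two formulas in these notes can be checked with a small NumPy sketch for a single sample. The helper name hill_number is hypothetical and is not part of qdiv.

```python
import numpy as np

def hill_number(counts, q=1.0):
    """Hill number of order q for one sample (illustrative helper only)."""
    p = np.asarray(counts, dtype=float)
    p = p[p > 0]        # drop zeros so 0 * ln(0) and 0 ** 0 never occur
    p = p / p.sum()     # convert to relative abundances
    if np.isclose(q, 1.0):
        # Limit definition for q = 1: exponential of Shannon entropy.
        return float(np.exp(-np.sum(p * np.log(p))))
    return float(np.sum(p ** q) ** (1.0 / (1.0 - q)))

# Four equally abundant features and one absent feature.
sample = [10, 10, 10, 10, 0]
values = {q: hill_number(sample, q) for q in (0, 1, 2)}
```

For a perfectly even sample, every order q gives the number of features present (here 4); with uneven abundances the value decreases as q grows.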

qdiv.diversity.alpha_div.phyl_alpha(obj, *, q=1, index='D', use_values_in_tab=False)[source]

Compute phylogenetic alpha diversity based on Hill numbers.

This function implements the abundance-weighted phylogenetic diversity framework of Chao et al. (2010, Phil. Trans. R. Soc. B). Diversity is computed at the level of tree branches, where each branch is weighted by its length and by the total relative abundance of all descendant features.

The primary quantity returned is the mean phylogenetic diversity D̄_q(T), which is a true Hill number (dimensionless, continuous, and monotone in q). A branch-length–scaled quantity (phylogenetic diversity, PD_q) can optionally be returned as a derived measure.

Parameters:
  • obj (MicrobiomeData-like | dict) –

    Object containing an abundance table (tab) and a tree dataframe (tree). The tree dataframe must include:

    • ‘leaves’ : list of descendant leaves for each branch

    • ‘branchL’ : branch length

  • q (float, default=1) –

    Diversity order:

    • q = 0 : presence/absence weighting (Faith’s PD when index=‘PD’)

    • q = 1 : exponential phylogenetic Shannon diversity

    • q = 2 : phylogenetic inverse Simpson diversity

    • general q : phylogenetic Hill number

  • index ({'D', 'PD', 'H'}, default='D') –

    Quantity to return:

    • ‘D’ : mean phylogenetic diversity D̄_q(T) (dimensionless; Hill number)

    • ‘PD’ : branch diversity PD_q(T) = T · D̄_q(T)

    • ‘H’ : entropy-like intermediate quantity:

      • q = 1 : phylogenetic entropy divided by T

      • q ≠ 1 : power-sum moment Σ_b (L_b/T) a_b^q

  • use_values_in_tab (bool, default=False) – If False, abundances are converted to relative abundances per sample. If True, the abundance table is assumed to already contain relative abundances.

Returns:

A vector of diversity values, one per sample.

Return type:

pandas.Series

Notes

For each sample j, the mean tree height is computed as:

T_j = Σ_b L_b · a_{b,j}

Mean phylogenetic diversity is defined as:

D̄_q(T) = ( Σ_b (L_b / T_j) · a_{b,j}^q )^(1 / (1 − q)), q ≠ 1

D̄_1(T) = exp( − Σ_b (L_b / T_j) · a_{b,j} · log a_{b,j} ), q = 1

where a_{b,j} is the total relative abundance descending from branch b.

The branch diversity PD_q(T) = T_j · D̄_q(T) has units of branch length (or evolutionary time) and represents effective evolutionary work. Unlike D̄_q(T), PD_q(T) is not a Hill number for q ≠ 0, 1 and is not guaranteed to be monotone in q.
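The definitions in these notes can be sketched for a single sample as follows. The helper phyl_hill and the star-shaped toy tree are hypothetical; the sketch only mirrors the formulas above and is not the library's code.

```python
import numpy as np

def phyl_hill(branch_lengths, branch_abund, q=1.0):
    """Return (T, D, PD) for one sample (illustrative helper only)."""
    L = np.asarray(branch_lengths, dtype=float)
    a = np.asarray(branch_abund, dtype=float)
    keep = a > 0                      # branches without descendants present drop out
    L, a = L[keep], a[keep]
    T = np.sum(L * a)                 # mean tree height T_j
    w = L / T                         # branch weights L_b / T_j
    if np.isclose(q, 1.0):
        D = np.exp(-np.sum(w * a * np.log(a)))
    else:
        D = np.sum(w * a ** q) ** (1.0 / (1.0 - q))
    return float(T), float(D), float(T * D)

# Star tree: four tips of length 2.0, each holding a quarter of the abundance.
lengths = [2.0, 2.0, 2.0, 2.0]
abund = [0.25, 0.25, 0.25, 0.25]
results = {q: phyl_hill(lengths, abund, q) for q in (0, 1, 2)}
```

In this fully even star tree, D̄_q(T) equals 4 for every q, and PD_q(T) = T · D̄_q(T) equals 8, the total branch length (Faith's PD at q = 0).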

qdiv.diversity.alpha_div.func_alpha(tab, distmat, *, q=1, index='FD', use_values_in_tab=False)[source]

Compute functional alpha diversity (Hill numbers) of order q.

Implements the framework of Chiu et al. (2014, PLoS ONE), where functional diversity is derived from pairwise trait distances and species abundances.

For each sample, functional diversity is computed from:

Q = Σᵢ Σⱼ pᵢ pⱼ dᵢⱼ (Rao’s quadratic entropy)

and the functional Hill number of order q:

q = 1:

D₁ = exp( -½ Σᵢ Σⱼ (pᵢ pⱼ ln(pᵢ pⱼ)) dᵢⱼ / Q )

q ≠ 1:

D_q = ( Σᵢ Σⱼ (pᵢ pⱼ)^q dᵢⱼ / Q )^( 1 / (2(1−q)) )

Parameters:
  • tab (DataFrame | MicrobiomeData-like | dict) – Abundance table (features x samples) or convertible structure.

  • distmat (pandas.DataFrame) – Functional distance matrix (features × features).

  • q (float, default=1) – Diversity order.

  • index ({'FD', 'D', 'MD'}, default='FD') –

    Output type:

    • ‘D’ : functional Hill number

    • ‘MD’ : mean functional diversity (D × Q)

    • ‘FD’ : functional diversity (D × MD)

  • use_values_in_tab (bool, default=False) – If False, convert abundances to relative abundances. If True, assume tab already contains relative abundances.

Returns:

Functional diversity values for each sample.

Return type:

pandas.Series

Notes

  • Uses Rao’s Q as implemented in the rao() function.

  • Zero abundances are handled safely.
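The relationship between Rao's Q, the functional Hill number, and the 'MD' and 'FD' outputs can be sketched for one sample as below. The helper func_hill is hypothetical and assumes Q > 0 (at least two present features at non-zero distance); it is not the library's implementation.

```python
import numpy as np

def func_hill(abund, dist, q=1.0):
    """Return (Q, D, MD, FD) for one sample (illustrative helper only)."""
    p = np.asarray(abund, dtype=float)
    d = np.asarray(dist, dtype=float)
    keep = p > 0                       # ignore absent features
    p = p[keep] / p[keep].sum()        # relative abundances of present features
    d = d[np.ix_(keep, keep)]          # matching rows and columns of the distances
    pp = np.outer(p, p)                # all products p_i * p_j
    Q = np.sum(pp * d)                 # Rao's quadratic entropy
    if np.isclose(q, 1.0):
        D = np.exp(-0.5 * np.sum(pp * np.log(pp) * d) / Q)
    else:
        D = (np.sum(pp ** q * d) / Q) ** (1.0 / (2.0 * (1.0 - q)))
    MD = D * Q                         # mean functional diversity
    FD = D * MD                        # functional diversity
    return float(Q), float(D), float(MD), float(FD)

# Three equally abundant features, all at distance 1 from each other.
abund = [5, 5, 5]
dist = np.ones((3, 3)) - np.eye(3)
results = {q: func_hill(abund, dist, q) for q in (0, 1, 2)}
```

With equal abundances and equal pairwise distances, the functional Hill number equals the number of features (3) for every q, so MD = 3 × Q = 2 and FD = 3 × MD = 6.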

qdiv.diversity.alpha_div.mpdq(obj, distmat, *, q=1.0)[source]

Mean phylogenetic distance (MPD) with q-weighting of relative abundances. Accepts either a MicrobiomeData object or a dict with at least a ‘tab’ DataFrame.

Parameters:
  • obj (MicrobiomeData, dict, or compatible object) – Input data. Must provide at least an abundance table (‘tab’).

  • distmat (pd.DataFrame) – Square distance matrix indexed/columned by feature ids.

  • q (float, default=1.0) – Order of diversity weighting applied to relative abundances.

Return type:

pandas.DataFrame

References

Webb et al. (2002) American Naturalist.

qdiv.diversity.alpha_div.mntdq(obj, distmat, *, q=1.0)[source]

Mean nearest taxon distance (MNTD) with q-weighting of relative abundances.

Parameters:
  • obj (MicrobiomeData, dict, or compatible object) – Input data. Must provide at least an abundance table (‘tab’).

  • distmat (pd.DataFrame) – Square distance matrix indexed/columned by feature ids.

  • q (float, default=1.0) – Order of diversity weighting applied to relative abundances.

Return type:

pandas.DataFrame