qdiv.plot.diversity_plots module
- qdiv.plot.diversity_plots.naive_alpha(tab, *, q=1, use_values_in_tab=False)[source]
Compute naive alpha diversity of order q for all samples.
- Accepts:
DataFrame: features x samples
MicrobiomeData-like object: must expose a DataFrame in .tab / .table / .counts / .abundance
dict-of-dicts: either {feature: {sample: count}} or {sample: {feature: count}}
- Parameters:
tab (DataFrame | MicrobiomeData-like | dict) – Abundance table (features x samples) or convertible structure.
q (float, default=1) – Diversity order:
q = 0 : species richness
q = 1 : exponential of Shannon entropy
q = 2 : inverse Simpson
general q : Hill number of order q
use_values_in_tab (bool, default=False) – If False (default), values are converted to relative abundances. If True, values in tab are assumed to already be relative abundances.
- Returns:
Hill numbers for each sample. If input has one sample/column, returns a float.
- Return type:
pandas.Series or float
Notes
- For q = 1, the limit definition is used:
H₁ = exp( - Σ pᵢ ln pᵢ )
- For q ≠ 1:
H_q = ( Σ pᵢ^q )^( 1 / (1 - q) )
Zero abundances are ignored safely.
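The two formulas in the Notes above can be sketched in plain Python. This is a minimal illustration for a single sample, not the library's implementation; the helper name hill_number and the example counts are assumptions made for this sketch.

```python
import math

def hill_number(values, q=1.0):
    """Naive Hill number of order q for one sample (hypothetical helper)."""
    total = sum(values)
    # Convert to relative abundances and drop zeros, so that neither
    # 0**0 nor log(0) is ever evaluated.
    p = [v / total for v in values if v > 0]
    if math.isclose(q, 1.0):
        # Limit definition for q = 1: exponential of Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    # General case: (sum of p_i^q) raised to 1 / (1 - q).
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))

counts = [3, 1, 0]                   # made-up counts for three features
richness = hill_number(counts, q=0)  # number of non-zero features
shannon = hill_number(counts, q=1)   # exp(Shannon entropy)
simpson = hill_number(counts, q=2)   # inverse Simpson
```

Because Hill numbers do not increase with q, richness, shannon and simpson come out in non-increasing order for any sample.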
- qdiv.plot.diversity_plots.phyl_alpha(obj, *, q=1, index='D', use_values_in_tab=False)[source]
Compute phylogenetic alpha diversity based on Hill numbers.
This function implements the abundance-weighted phylogenetic diversity framework of Chao et al. (2010, Phil. Trans. R. Soc. B). Diversity is computed at the level of tree branches, where each branch is weighted by its length and by the total relative abundance of all descendant features.
The primary quantity returned is the mean phylogenetic diversity D̄_q(T), which is a true Hill number (dimensionless, continuous, and monotone in q). A branch-length–scaled quantity (phylogenetic diversity, PD_q) can optionally be returned as a derived measure.
- Parameters:
obj (MicrobiomeData-like | dict) –
Object containing an abundance table (tab) and a tree dataframe (tree). The tree dataframe must include:
’leaves’ : list of descendant leaves for each branch
’branchL’ : branch length
q (float, default=1) – Diversity order:
q = 0 : presence/absence weighting (Faith’s PD when index=’PD’)
q = 1 : exponential phylogenetic Shannon diversity
q = 2 : phylogenetic inverse Simpson diversity
general q : phylogenetic Hill number
index ({'D', 'PD', 'H'}, default='D') –
Quantity to return:
‘D’ : mean phylogenetic diversity D̄_q(T) (dimensionless; Hill number)
‘PD’ : branch diversity PD_q(T) = T · D̄_q(T)
‘H’ : entropy-like intermediate quantity:
q = 1 : phylogenetic entropy divided by T
q ≠ 1 : power-sum moment Σ_b (L_b/T) a_b^q
use_values_in_tab (bool, default=False) – If False, abundances are converted to relative abundances per sample. If True, the abundance table is assumed to already contain relative abundances.
- Returns:
A vector of diversity values, one per sample.
- Return type:
pandas.Series
Notes
- For each sample j, the mean tree height is computed as:
T_j = Σ_b L_b · a_{b,j}
- Mean phylogenetic diversity is defined as:
D̄_q(T) = ( Σ_b (L_b / T_j) · a_{b,j}^q )^(1 / (1 − q)), q ≠ 1
D̄_1(T) = exp( − Σ_b (L_b / T_j) · a_{b,j} · log a_{b,j} ), q = 1
where a_{b,j} is the total relative abundance descending from branch b.
The branch diversity PD_q(T) = T_j · D̄_q(T) has units of branch length (or evolutionary time) and represents effective evolutionary work. Unlike D̄_q(T), PD_q(T) is not a Hill number for q ≠ 0, 1 and is not guaranteed to be monotone in q.
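The definitions of T_j, D̄_q(T) and PD_q(T) above can be checked on a toy tree. The sketch below is a hedged illustration only: the function name phyl_alpha_sketch, the (leaves, branch length) tuple layout and the three-branch tree are assumptions made here, not the library's data structures.

```python
import math

def phyl_alpha_sketch(abund, branches, q=1.0, index="D"):
    """Mean phylogenetic diversity (or PD) for one sample (hypothetical helper).

    abund    : dict mapping leaf name -> abundance
    branches : list of (list_of_descendant_leaves, branch_length) tuples
    """
    total = sum(abund.values())
    rel = {leaf: v / total for leaf, v in abund.items()}
    # a_b = summed relative abundance below each branch; branches with
    # a_b = 0 are skipped so they never contribute a_b**0 or log(0).
    weighted = []
    for leaves, length in branches:
        a_b = sum(rel.get(leaf, 0.0) for leaf in leaves)
        if a_b > 0:
            weighted.append((length, a_b))
    T = sum(L * a for L, a in weighted)  # mean tree height T_j
    if math.isclose(q, 1.0):
        D = math.exp(-sum((L / T) * a * math.log(a) for L, a in weighted))
    else:
        D = sum((L / T) * a ** q for L, a in weighted) ** (1.0 / (1.0 - q))
    return T * D if index == "PD" else D

# Toy tree: two tips (A, B) plus one shared internal branch, all of length 1.
branches = [(["A"], 1.0), (["B"], 1.0), (["A", "B"], 1.0)]
abund = {"A": 5, "B": 5}
pd0 = phyl_alpha_sketch(abund, branches, q=0, index="PD")
d1 = phyl_alpha_sketch(abund, branches, q=1, index="D")
```

On this toy input, PD at q = 0 equals the total branch length of the tree (3), which is the Faith's PD case mentioned in the parameter description.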
- qdiv.plot.diversity_plots.func_alpha(tab, distmat, *, q=1, index='FD', use_values_in_tab=False)[source]
Compute functional alpha diversity (Hill numbers) of order q.
Implements the framework of Chiu et al. (2014, PLoS ONE), where functional diversity is derived from pairwise trait distances and species abundances.
For each sample, functional diversity is computed from:
Q = Σᵢ Σⱼ pᵢ pⱼ dᵢⱼ (Rao’s quadratic entropy)
and the functional Hill number of order q:
- q = 1:
FD₁ = exp( -½ Σᵢ Σⱼ (pᵢ pⱼ ln(pᵢ pⱼ)) dᵢⱼ / Q )
- q ≠ 1:
FD_q = ( Σᵢ Σⱼ (pᵢ pⱼ)^q dᵢⱼ / Q )^( 1 / (2(1−q)) )
- Parameters:
tab (DataFrame | MicrobiomeData-like | dict) – Abundance table (features x samples) or convertible structure.
distmat (pandas.DataFrame) – Functional distance matrix (features × features).
q (float, default=1) – Diversity order.
index ({'FD', 'D', 'MD'}, default='FD') – Output type:
‘D’ : functional Hill number
‘MD’ : mean functional diversity (D × Q)
‘FD’ : functional diversity (D × MD)
use_values_in_tab (bool, default=False) – If False, convert abundances to relative abundances. If True, assume tab already contains relative abundances.
- Returns:
Functional diversity values for each sample.
- Return type:
pandas.Series
Notes
Rao’s Q is computed by the rao() function.
Zero abundances are handled safely.
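The formulas for Q and the functional Hill number, together with the 'D', 'MD' and 'FD' outputs, can be sketched for one sample as follows. This is an illustrative sketch under stated assumptions: the name func_alpha_sketch, the list-of-lists distance matrix and the choice to raise an error when Q = 0 are all made up here and may differ from the library's behavior.

```python
import math

def func_alpha_sketch(p, d, q=1.0, index="FD"):
    """Functional diversity of order q for one sample (hypothetical helper).

    p : list of relative abundances (summing to 1)
    d : square list-of-lists of pairwise functional distances
    """
    n = len(p)
    # Keep only feature pairs in which both features are present.
    pairs = [(p[i] * p[j], d[i][j])
             for i in range(n) for j in range(n) if p[i] * p[j] > 0]
    Q = sum(w * dij for w, dij in pairs)  # Rao's quadratic entropy
    if Q <= 0:
        # Assumption of this sketch: the formulas are undefined when Q = 0.
        raise ValueError("Q must be positive")
    if math.isclose(q, 1.0):
        D = math.exp(-0.5 * sum(w * math.log(w) * dij for w, dij in pairs) / Q)
    else:
        D = (sum(w ** q * dij for w, dij in pairs) / Q) ** (1.0 / (2.0 * (1.0 - q)))
    MD = D * Q  # mean functional diversity
    return {"D": D, "MD": MD, "FD": D * MD}[index]

p = [0.5, 0.5]
d = [[0.0, 1.0], [1.0, 0.0]]
d_values = [func_alpha_sketch(p, d, q=q, index="D") for q in (0, 1, 2)]
fd1 = func_alpha_sketch(p, d, q=1, index="FD")
```

With two equally abundant features at distance 1, the functional Hill number is 2 at every order, Q is 0.5, MD is 1 and FD is 2.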
- qdiv.plot.diversity_plots.naive_multi_beta(obj, *, by=None, q=1)[source]
Compute naive (taxonomic) multi‑sample beta diversity for groups of samples.
This implements the multi‑sample Hill‑number beta framework:
β_q = γ_q / ( α_q / N )
- where:
γ_q is the Hill number of the pooled community
α_q is the mean within‑sample Hill number
N is the number of samples in the group
- Parameters:
obj (MicrobiomeData-like | dict) –
- Must contain:
’meta’ : pandas.DataFrame with sample metadata
’tab’ : pandas.DataFrame with feature counts (features × samples)
by (str or None, default=None) – Column in metadata defining sample groups. If None, all samples are treated as one group.
q (float, default=1) – Diversity order.
- Returns:
Index = categories in by (or ‘all’ if by=None) Columns:
N : number of samples in group
beta : multi‑sample beta diversity
local_dis : local‑viewpoint dissimilarity
regional_dis : regional‑viewpoint dissimilarity
- Return type:
pandas.DataFrame
Notes
Groups with <2 samples return NaN.
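The ratio β_q = γ_q / (α_q / N) can be sketched for one group of samples. The sketch below makes an assumption that the source does not confirm: it follows the equal-weight decomposition of Chao and co-authors, where γ_q is the Hill number of the averaged relative abundances and the α term is the Hill number of the joint table (every relative abundance divided by N) before division by N. The names hill and multi_beta_sketch are hypothetical.

```python
import math

def hill(p, q):
    """Hill number of a vector of proportions, ignoring zeros."""
    p = [x for x in p if x > 0]
    if math.isclose(q, 1.0):
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))

def multi_beta_sketch(samples, q=1.0):
    """Multi-sample beta for one group (hypothetical helper).

    samples : list of count lists, all aligned on the same feature order
    """
    N = len(samples)
    if N < 2:
        return float("nan")  # mirrors the documented NaN for small groups
    rel = [[v / sum(s) for v in s] for s in samples]
    pooled = [sum(col) / N for col in zip(*rel)]  # equal-weight pooled community
    joint = [x / N for r in rel for x in r]       # assumption: joint table for alpha
    gamma = hill(pooled, q)
    alpha_joint = hill(joint, q)
    return gamma / (alpha_joint / N)

identical = [[1, 1, 0, 0], [1, 1, 0, 0]]
disjoint = [[1, 1, 0, 0], [0, 0, 1, 1]]
beta_identical = [multi_beta_sketch(identical, q) for q in (0, 1, 2)]
beta_disjoint = [multi_beta_sketch(disjoint, q) for q in (0, 1, 2)]
```

Under these assumptions β ranges from 1 (all samples identical) to N (no shared features), which is the range that the two dissimilarity columns rescale.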
- qdiv.plot.diversity_plots.phyl_multi_beta(obj, *, by=None, q=1)[source]
Compute phylogenetic multi‑sample beta diversity for groups of samples.
Implements the multi‑sample phylogenetic Hill‑number beta framework described in Chao et al. (2014), where branch lengths are weighted by the relative abundances of all ASVs descending from each branch.
For each group of samples:
β_q = γ_q / ( α_q / N )
- where:
γ_q is the phylogenetic Hill number of the pooled community
α_q is the mean within‑sample phylogenetic Hill number
N is the number of samples in the group
- Parameters:
obj (MicrobiomeData-like | dict) –
- Must contain:
’meta’ : pandas.DataFrame with sample metadata
’tab’ : pandas.DataFrame with ASV counts (ASVs × samples)
- ’tree’ : pandas.DataFrame with:
’leaves’ : list of features under each branch
’branchL’ : branch length
by (str or None, default=None) – Metadata column defining sample groups. If None, all samples are treated as one group.
q (float, default=1) – Diversity order.
- Returns:
Index = categories in by (or ‘all’ if by=None) Columns:
N : number of samples in group
beta : multi‑sample phylogenetic beta diversity
local_dis : local‑viewpoint dissimilarity
regional_dis : regional‑viewpoint dissimilarity
- Return type:
pandas.DataFrame
Notes
Only works for ≥ 2 samples per group.
- qdiv.plot.diversity_plots.func_multi_beta(obj, distmat, *, by=None, q=1)[source]
Compute functional multi‑sample beta diversity for groups of samples.
Implements the multi‑sample functional Hill‑number beta framework described in Chiu et al. (2014), where functional diversity is derived from pairwise trait distances and species abundances.
For each group of samples:
β_q = D_gamma / D_alpha
- where:
D_gamma is the functional Hill number of the pooled community
D_alpha is the mean functional Hill number across all sample pairs
N is the number of samples in the group
NxN = N² (number of ordered sample pairs)
- Parameters:
obj (MicrobiomeData-like | dict) –
- Must contain:
’meta’ : pandas.DataFrame with sample metadata
’tab’ : pandas.DataFrame (features × samples)
distmat (pandas.DataFrame) – Functional distance matrix (features × features).
by (str or None, default=None) – Metadata column defining sample groups. If None, all samples are treated as one group.
q (float, default=1) – Diversity order.
- Returns:
Index = categories in by (or ‘all’ if by=None) Columns:
NxN : N² (number of ordered sample pairs)
beta : functional multi‑sample beta diversity
local_dis : local‑viewpoint dissimilarity
regional_dis : regional‑viewpoint dissimilarity
- Return type:
pandas.DataFrame
Notes
Only works for ≥ 2 samples per group.
- qdiv.plot.diversity_plots.get_colors_markers(get_type='colors', plot=False)[source]
Return predefined color or marker lists, or optionally plot them.
- Parameters:
get_type ({'colors', 'markers'}, default='colors') – Whether to return color names or marker symbols.
plot (bool, default=False) – If True, display a figure showing all available options. If False, return a list of colors or markers.
- Returns:
If plot=False: a list of color names or marker symbols.
If plot=True: displays a figure and returns None.
- Return type:
list of str or None
Notes
Colors are sorted by HSV (hue, saturation, value) for visual coherence.
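Sorting by HSV can be illustrated with the standard-library colorsys module. This is a small sketch of the idea only: the helper names and the four-color palette are made up for the example, and the library's own color list and sorting code may differ.

```python
import colorsys

def hex_to_rgb(hex_color):
    """Convert '#rrggbb' to an (r, g, b) tuple of floats in [0, 1]."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) / 255 for i in (0, 2, 4))

def sort_by_hsv(colors):
    """Return color names ordered by (hue, saturation, value)."""
    return sorted(colors, key=lambda name: colorsys.rgb_to_hsv(*hex_to_rgb(colors[name])))

# Hypothetical palette: name -> hex code.
palette = {"blue": "#0000ff", "red": "#ff0000", "green": "#008000", "black": "#000000"}
ordered = sort_by_hsv(palette)
```

Here black sorts first (hue, saturation and value are all 0), followed by red, green and blue in order of increasing hue.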
- qdiv.plot.diversity_plots.dissimilarity_contributions(obj, *, by=None, q=1.0, div_type='naive', index='local', n=20, levels=None, from_file=None, figsize=(7.086614173228346, 5.511811023622047), fontsize=10, savename=None)[source]
Plot contributions of taxa to observed dissimilarity within categories.
This function visualizes how individual taxa contribute to dissimilarity (e.g., Bray-Curtis or Hill-based) across sample groups defined by metadata.
- Parameters:
obj (dict or MicrobiomeData) –
- Input data containing at least:
- ’tab’: pandas.DataFrame
Abundance table (features x samples).
- ’tax’: pandas.DataFrame
Taxonomy table (features x taxonomic levels).
- ’meta’: pandas.DataFrame
Metadata table.
by (str, optional) – Metadata column used to categorize samples. Dissimilarity is calculated within each category.
q (float, default=1.0) – Diversity order (Hill number).
div_type ({'naive', 'phyl'}, default='naive') – Diversity type:
‘naive’ : taxonomic dissimilarity.
‘phyl’ : phylogenetic dissimilarity.
index ({'local', 'regional'}, default='local') – Index type for dissimilarity calculation.
n (int, default=20) – Number of top taxa to include in the plot.
levels (list of str, optional) – Taxonomic levels to include in y-axis labels (e.g., [‘Genus’]).
from_file (str, optional) – Path to a CSV file with precomputed dissimilarity contributions. If None, contributions are computed from obj.
figsize (tuple of float, default=(18/2.54, 14/2.54)) – Figure size in inches.
fontsize (int, default=10) – Font size for plot text.
savename (str, optional) – If provided, save the figure to this path and also as PDF.
- Returns:
fig (matplotlib.figure.Figure)
df (pandas.DataFrame) – DataFrame of contributions for plotted taxa and categories. Returns None if computation or plotting fails.
- Return type:
Tuple[plt.figure.Figure, pd.DataFrame]
Notes
If from_file is provided, the function reads contributions from that file.
If levels is provided and div_type=’naive’, taxonomy names are appended to feature IDs.
For phylogenetic diversity, node names or feature sets are used for labeling.
Examples
>>> fig, df = dissimilarity_contributions(obj, by='Treatment', q=1, div_type='naive', levels=['Genus'])
>>> print(df.head())
- qdiv.plot.diversity_plots.phyl_tree(obj, *, width=12, name_internal_nodes=False, abundance_info=None, xlog=False, savename=None)[source]
Plot a phylogram from a tree DataFrame with optional abundance bars.
- Parameters:
obj (dict or MicrobiomeData) – Input object.
- Required key:
tree (pandas.DataFrame): tree structure with columns [‘nodes’, ‘leaves’, ‘branchL’].
- Optional keys:
tab (pandas.DataFrame): abundance table (features x samples).
meta (pandas.DataFrame): metadata table for sample grouping.
width (float, default=12) – Width of the plot in centimeters. Height is set automatically based on the number of ASVs.
name_internal_nodes (bool, default=False) – If True, labels are added to internal nodes.
abundance_info ({'index'} or str, optional) – If ‘index’, plot relative abundance bars for each ASV. If a metadata column name, plot grouped abundance bars for each category.
xlog (bool, default=False) – If True, abundance bars use a logarithmic x-axis.
savename (str, optional) – If provided, save the figure to this path and also as PDF.
- Returns:
fig (matplotlib.figure.Figure)
df_endN (pandas.DataFrame) – DataFrame of end nodes with positions and optional abundance info.
- Return type:
Tuple[plt.figure.Figure, pd.DataFrame]
Notes
The tree DataFrame must contain columns: ‘nodes’, ‘leaves’, ‘branchL’.
If abundance_info is provided, relative abundances are computed per leaf or category.
Bars are plotted to the right of the tree when abundance_info is not None.
Examples
>>> phyl_tree(obj, width=15, name_internal_nodes=True, abundance_info='Treatment', xlog=True, savename='phylogram')
- qdiv.plot.diversity_plots.harvey_balls(meta, columns_by=None, *, rows_by='index', row_colors=None, column_colors=None, row_label_width=4, figsize=(7.086614173228346, 5.511811023622047), fontsize=10, savename=None)[source]
Plot Harvey balls (fraction-of-circle indicators) for percentage columns in metadata.
- Parameters:
meta (DataFrame | MicrobiomeData-like | dict) – Object with metadata table. Must contain the columns_by fields and optionally a rows_by field used to derive row labels.
columns_by (list of str) – List of metadata column names containing percentages (0–100) to visualize as Harvey balls across rows.
rows_by (str, default='index') – Name of the metadata column used as row labels. If ‘index’, the DataFrame index is used as row labels.
row_colors (str, optional) – Name of metadata column containing per-row text colors (e.g., ‘red’, ‘#333’). If None, all row labels are drawn in black.
column_colors (list of str, optional) – Colors for the column headers (one per columns_by). If None, defaults to black for all headers; if provided but shorter than columns_by, the list is padded with black.
row_label_width (int, default=4) – Number of GridSpec columns reserved for the row label area (left-hand text).
figsize (tuple of float, default=(18/2.54, 14/2.54)) – Figure size in inches.
fontsize (int, default=10) – Base font size for the figure.
savename (str, optional) – If provided, saves the figure (PNG) to this path and also as PDF (savename + ‘.pdf’).
- Returns:
fig (matplotlib.figure.Figure)
plot_data (pandas.DataFrame) – A DataFrame containing the row labels and selected percentage values used for plotting: columns [‘__label__’, *columns_by]. Returns None if validation fails.
- Return type:
Tuple[plt.figure.Figure, pd.DataFrame]
Notes
Harvey balls are drawn using pie charts where the black wedge represents the percentage, and the white wedge represents the complement to 100%.
All values in columns_by must be numeric (0–100). Non-numeric rows are coerced if possible; rows with missing values will still be plotted (missing values treated as 0).
If rows_by=’index’, row labels are taken from meta.index; otherwise, from meta[rows_by].
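The wedge logic described in the Notes above (filled wedge = percentage, empty wedge = complement to 100, missing values treated as 0) can be sketched as a small helper. The name harvey_wedges is hypothetical, and clamping to the 0 to 100 range is an assumption of this sketch rather than documented behavior.

```python
import math

def harvey_wedges(value):
    """Return (filled, empty) wedge sizes in percent for one Harvey ball."""
    try:
        pct = float(value)  # coerce numeric strings such as "25"
    except (TypeError, ValueError):
        pct = 0.0           # non-numeric or None: treated as 0
    if math.isnan(pct):
        pct = 0.0           # missing value: treated as 0
    pct = min(max(pct, 0.0), 100.0)  # assumption: clamp to the valid range
    return pct, 100.0 - pct

quarter = harvey_wedges("25")
missing = harvey_wedges(float("nan"))
```

The resulting pair can be passed as the wedge sizes to Matplotlib's Axes.pie with colors such as ['black', 'white'] to draw one ball.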
Examples
>>> fig, plot_data = harvey_balls(
...     meta,
...     rows_by='Treatment',
...     columns_by=['PFAS_%', 'DOC_%'],
...     row_colors='TreatmentColor',
...     column_colors=['#1f77b4', '#ff7f0e'],
...     savename='harvey_balls'
... )
>>> print(plot_data.head())
- qdiv.plot.diversity_plots.alpha_diversity_profile(obj, *, q_range=(0.0, 2.0), q_step=0.05, distmat=None, div_type='naive', color_by=None, order=None, ylog=False, figsize=(7.086614173228346, 5.511811023622047), fontsize=10, colorlist=None, use_values_in_tab=False, savename=None)[source]
Plot alpha diversity vs diversity order across samples.
This function computes alpha diversity (Hill numbers) for a range of diversity orders q and plots the curves per sample. It supports taxonomic, phylogenetic, and functional diversity depending on div_type.
- Parameters:
obj (dict or MicrobiomeData) –
- Input data structure containing at least:
tab (pandas.DataFrame): abundance table (features x samples).
meta (pandas.DataFrame): sample metadata.
- If div_type=’phyl’, must also contain:
tree (pandas.DataFrame or compatible structure): phylogenetic tree info.
q_range (tuple of float, default=(0.0, 2.0)) – Inclusive range (start, end) of diversity orders to evaluate.
q_step (float, default=0.05) – Step size between q values. Must be positive.
distmat (pandas.DataFrame, optional) – Functional distance matrix (features x features). Required when div_type=’func’.
div_type ({'naive', 'phyl', 'func'}, default='naive') – Diversity type:
‘naive’ : taxonomic alpha diversity.
‘phyl’ : phylogenetic alpha diversity (requires tree in obj).
‘func’ : functional alpha diversity (requires distmat).
color_by (str, optional) – Metadata column name used to group legend colors. If None, each sample is labeled individually.
order (str, optional) – Metadata column name used to sort samples before plotting.
ylog (bool, default=False) – If True, plot alpha diversity on a logarithmic y-scale.
figsize (tuple of float, default=(18/2.54, 14/2.54)) – Figure size in inches.
fontsize (int, default=10) – Font size for plot text.
colorlist (list of str, optional) – List of colors to use for groups or samples. If None, uses package defaults via get_colors_markers('colors') or Matplotlib’s cycle.
use_values_in_tab (bool, default=False) – Pass-through flag to alpha diversity backends (e.g., whether tab is already normalized).
savename (str, optional) – If provided, saves the figure to this path and also as PDF (i.e., savename and savename + ‘.pdf’).
- Returns:
fig (matplotlib.figure.Figure) – The created figure.
ax (matplotlib.axes.Axes) – The matplotlib Axes object for the figure.
df (pandas.DataFrame) – DataFrame with rows = q-values and columns = samples, containing computed alpha diversity values.
- Return type:
Tuple[Figure, Axes, DataFrame]
Notes
For div_type=’phyl’, get_df(obj, ‘tree’) must exist.
For div_type=’func’, distmat must be provided and compatible with tab.
The legend groups are deduplicated using the values of color_by. Only the first occurrence of each group is shown in the legend.
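The inclusive q grid described by q_range and q_step can be sketched as below. This is a hypothetical helper (q_grid is not a documented function); the rounding and the small tolerance are choices made here to keep floating-point error from dropping the end point or missing q = 1 exactly.

```python
import math

def q_grid(q_range=(0.0, 2.0), q_step=0.05):
    """Inclusive list of diversity orders from start to end (hypothetical helper)."""
    start, end = q_range
    if q_step <= 0:
        raise ValueError("q_step must be positive")
    # Number of whole steps that fit in the range; the tolerance guards
    # against results such as 39.999999999 instead of 40.
    n_steps = math.floor((end - start) / q_step + 1e-9)
    # Rounding removes accumulated error so that q = 1 is hit exactly.
    return [round(start + i * q_step, 10) for i in range(n_steps + 1)]

default_grid = q_grid()
coarse_grid = q_grid((0, 2), 0.5)
```

With the default arguments this yields 41 values from 0.0 to 2.0, and the resulting list would form the row index of the returned DataFrame.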
Examples
>>> fig, ax, df = alpha_diversity_profile(obj, q_range=(0, 2), q_step=0.1,
...                                       div_type='naive', color_by='Treatment')
>>> df.head()
- qdiv.plot.diversity_plots.beta_diversity_profile(obj, *, q_range=(0.0, 2.0), q_step=0.05, distmat=None, group_by=None, order=None, dis=True, viewpoint='regional', div_type='naive', ylog=False, figsize=(7.086614173228346, 5.511811023622047), fontsize=10, colorlist=None, savename=None, drop_na_groups=True)[source]
Plot multi-sample β-diversity (or its dissimilarity transform) vs diversity order q.
This function evaluates β_q across a grid of q-values using naive_multi_beta and plots a curve per group (or a single curve labeled ‘all’ if group_by=None). It can optionally convert β to dissimilarity using the “local” or “regional” viewpoints (as returned by naive_multi_beta).
- Parameters:
obj (dict or MicrobiomeData-like) – Must support get_df(obj, ‘tab’) -> DataFrame (features × samples) and get_df(obj, ‘meta’) -> DataFrame (sample metadata).
q_range ((float, float), default=(0.0, 2.0)) – Inclusive (start, end) range of q-values.
q_step (float, default=0.05) – Step size between consecutive q values. Must be positive.
distmat (pandas.DataFrame, optional) – Functional distance matrix (features x features). Required when div_type=’func’.
group_by (str or None, default=None) – Metadata column defining groups of samples. If None, treats all samples as one group.
order (str or None, default=None) – Metadata column used to sort samples before computing group order. The order of first appearance of groups in the (optionally) sorted metadata determines the plotting order.
dis (bool, default=True) – If True, plot dissimilarity instead of raw β. Uses the viewpoint column (‘local_dis’ or ‘regional_dis’) returned by naive_multi_beta.
viewpoint ({'local', 'regional'}, default='regional') – Which dissimilarity column to use when dis=True.
div_type ({'naive', 'phyl', 'func'}, default='naive') – Diversity type:
‘naive’ : taxonomic beta diversity.
‘phyl’ : phylogenetic beta diversity (requires tree in obj).
‘func’ : functional beta diversity (requires distmat).
ylog (bool, default=False) – If True, use a logarithmic y-scale. Note that dissimilarities may include zeros, which cannot be shown on a log scale; such points will be omitted.
figsize ((float, float), default=(18/2.54, 14/2.54)) – Figure size in inches.
fontsize (int, default=10) – Base font size.
colorlist (list of str or None, default=None) – Colors for groups. If None, uses Matplotlib’s default color cycle.
savename (str or None, default=None) – If provided, saves the figure as savename (raster) and savename + ‘.pdf’.
drop_na_groups (bool, default=True) – If True, drops groups that are entirely NaN across all q (e.g., groups with <2 samples).
- Returns:
fig (matplotlib.figure.Figure) – The created figure.
ax (matplotlib.axes.Axes) – The matplotlib Axes object for the figure.
df (pandas.DataFrame) – DataFrame with rows = q-values and columns = groups. Contains β (if dis=False) or dissimilarity (if dis=True) for each group at each q.
- Return type:
Tuple[Figure, Axes, DataFrame]
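The source does not state how β is converted to the local and regional dissimilarities. The sketch below assumes they correspond to the Sørensen-type (1 − C_qN) and Jaccard-type (1 − U_qN) transforms from the Hill-number framework of Chao and co-authors; that mapping, and the function name beta_to_dissimilarity, are assumptions of this example rather than confirmed library behavior.

```python
import math

def beta_to_dissimilarity(beta, N, q=1.0, viewpoint="regional"):
    """Rescale a multi-sample beta in [1, N] to a dissimilarity in [0, 1]."""
    if N < 2:
        return float("nan")
    if math.isclose(q, 1.0):
        # Both viewpoints share the same limit at q = 1.
        return math.log(beta) / math.log(N)
    # Assumed exponents: q - 1 for the local (Sorensen-type) viewpoint,
    # 1 - q for the regional (Jaccard-type) viewpoint.
    expo = (q - 1.0) if viewpoint == "local" else (1.0 - q)
    overlap = ((1.0 / beta) ** expo - (1.0 / N) ** expo) / (1.0 - (1.0 / N) ** expo)
    return 1.0 - overlap

# Two samples with two features each, one of them shared: beta_0 = 3 / 2.
local_q0 = beta_to_dissimilarity(1.5, N=2, q=0, viewpoint="local")
regional_q0 = beta_to_dissimilarity(1.5, N=2, q=0, viewpoint="regional")
```

For this two-sample case at q = 0 the local value matches the classical Sørensen dissimilarity (0.5) and the regional value matches the Jaccard dissimilarity (2/3), which is a quick sanity check on the assumed formulas.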