qdiv.plot.relative_abundance_plots module

qdiv.plot.relative_abundance_plots.heatmap(obj, *, group_by=None, value_aggregation='sum', order=None, levels=None, include_index=False, levels_shown=None, subset_levels=None, subset_patterns=None, n=20, featurelist=None, method='max', sorting='abundance', use_values_in_tab=False, italics=False, figsize=(14, 10), fontsize=15, sep_col=None, sep_line=None, labels=True, labelsize=10, color_threshold=8.0, cmap='Reds', gamma=0.5, colorbar_ticks=None, vmin=None, vmax=None, dpi=240, savename=None)[source]

Plot a heatmap of taxa abundances.

Parameters:
  • obj (dict or MicrobiomeData) –

    Input data containing at least:
    • ‘tab’: pandas.DataFrame

      Abundance table (features x samples).

    • ‘tax’: pandas.DataFrame

      Taxonomy table (features x taxonomic levels).

  • group_by (str or list, optional) – Metadata column(s) used to merge samples.

  • value_aggregation ({'sum', 'mean'}, default = 'sum')

  • order (str, optional) – Metadata column used to order samples along the x-axis.

  • levels (list of str, optional) – Taxonomic levels used for y-axis grouping.

  • include_index (bool, default=False) – Whether to include the feature index in labels.

  • levels_shown ({'number', None}, optional) – If ‘number’, show numeric labels instead of taxonomic names.

  • subset_levels (str or list of str, optional) – Taxonomic levels to filter by.

  • subset_patterns (str or list of str, optional) – Text patterns to filter taxa.

  • n (int, default=20) – Number of top taxa to plot (ignored if featurelist is provided).

  • featurelist (list of str, optional) – Specific features to plot.

  • method ({'max', 'mean'}, default = 'max')

  • sorting ({'abundance', 'alphabetical'}, default = 'abundance')

  • italics (bool, default=False) – If True, italicize taxonomic names where appropriate.

  • figsize (tuple of float, default=(14, 10)) – Figure size in inches.

  • fontsize (int, default=15) – Font size for axis labels.

  • sep_col (list of int, optional) – Column indices where separators are inserted.

  • sep_line (list of int, optional) – Column indices where vertical lines are drawn.

  • labels (bool, default=True) – Whether to show abundance values in cells.

  • labelsize (int, default=10) – Font size of cell labels.

  • color_threshold (float, default=8.0) – Threshold for switching label color (black/white).

  • cmap (str, default='Reds') – Colormap for heatmap.

  • gamma (float, default=0.5) – Gamma for PowerNorm scaling.

  • colorbar_ticks (list of float, optional) – Tick marks for colorbar.

  • vmin (float, optional) – Minimum value for color normalization (passed to PowerNorm).

  • vmax (float, optional) – Maximum value for color normalization (passed to PowerNorm).

  • dpi (int, default=240) – Resolution of saved figure.

  • savename (str, optional) – Filename to save figure (PNG and PDF). If None, figure is not saved.

  • use_values_in_tab (bool, default = False)

Returns:

  • fig (matplotlib.figure.Figure) – The created figure.

  • ax (matplotlib.axes.Axes) – The matplotlib Axes object for the figure.

  • table (pandas.DataFrame) – The final abundance table (after grouping, filtering, and sorting) that was plotted.

Return type:

Tuple[Figure, Axes, DataFrame]

Examples

>>> heatmap(obj, group_by='Treatment', levels=['Genus'], n=30, savename='heatmap.png')
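
The gamma, vmin, and vmax parameters of heatmap are passed to matplotlib's PowerNorm, which rescales values to the range [0, 1] with a power law before the colormap is applied. The sketch below reproduces that scaling in plain NumPy to show why gamma=0.5 gives low-abundance taxa more visible color than a linear scale. The helper name power_scale and the sample values are made up for illustration; this is not qdiv's own code.

```python
import numpy as np

def power_scale(values, gamma=0.5, vmin=0.0, vmax=100.0):
    """Hypothetical helper: ((x - vmin) / (vmax - vmin)) ** gamma, as in PowerNorm."""
    values = np.asarray(values, dtype=float)
    return ((values - vmin) / (vmax - vmin)) ** gamma

# Illustrative relative abundances in percent.
abundances = np.array([0.0, 1.0, 25.0, 100.0])

scaled = power_scale(abundances)             # gamma=0.5 (the heatmap default)
linear = power_scale(abundances, gamma=1.0)  # plain linear scaling for comparison

for a, s, l in zip(abundances, scaled, linear):
    print(f"{a:6.1f}%  gamma=0.5: {s:.2f}  linear: {l:.2f}")
```

With gamma=0.5, a taxon at 1% sits at 0.10 of the color range instead of 0.01, so it remains distinguishable from zero.
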
qdiv.plot.relative_abundance_plots.rarefactioncurve(obj, distmat=None, *, step='flexible', div_type='naive', q=0.0, figsize=(14, 10), fontsize=18, color_by=None, order=None, tag=None, colorlist=None, only_return_data=False, only_plot_data=None, savename=None)[source]

Calculate and plot rarefaction curves for alpha diversity (Hill numbers).

The function subsamples (without replacement) individual reads within each sample to compute the rarefaction curve for a chosen diversity type, then plots per-sample curves. If only_return_data=True, it returns the computed curves instead of plotting them. You can also supply precomputed curves via only_plot_data to plot without recomputation.

Parameters:
  • obj (dict or MicrobiomeData) –

    Input data containing at least:
    • ‘tab’: pandas.DataFrame

      Abundance table (features x samples).

    • ‘meta’: pandas.DataFrame

      Metadata with sample IDs as index, matching the columns of ‘tab’.

    Optional keys, depending on div_type:
    • ‘tree’: phylogenetic tree object (required if div_type='phyl').

  • distmat (str or pandas.DataFrame or None, optional) – Distance matrix required when div_type='func'. Can be a preloaded DataFrame or a path-like string handled by your func_alpha implementation.

  • step ({'flexible'} or int, default='flexible') – Subsampling step size (depth increment, in reads). If ‘flexible’, each sample's step is its total read count divided by 20 (minimum 1). If an integer, it must be positive.

  • div_type ({'naive', 'phyl', 'func'}, default='naive') – Diversity measure to compute:
    • ‘naive’: taxonomic (plain) diversity via naive_alpha.
    • ‘phyl’: phylogenetic diversity via phyl_alpha (requires tree).
    • ‘func’: functional diversity via func_alpha (requires distmat).

  • q (float, default=0.0) – Order of diversity (Hill number).

  • figsize (tuple of float, default=(14, 10)) – Figure size (width, height) in inches.

  • fontsize (int, default=18) – Base font size for the plot.

  • color_by (str, optional) – Metadata column used to color-code lines (group legend).

  • order (str, optional) – Metadata column used to order samples in the legend or to group them visually in the plot.

  • tag ({'index'} or str, optional) – If ‘index’, annotate curve endpoints with sample IDs. If a metadata column name, annotate with that column’s values.

  • colorlist (list of str, optional) – Colors used for plotting. If not provided, colors are drawn from get_colors_markers('colors'). Ensure the list is long enough for all groups/samples.

  • only_return_data (bool, default=False) – If True, return the computed data dictionary and do not plot.

  • only_plot_data (dict, optional) – Precomputed data dictionary to plot (skips computation). The format is: {sample_id: (xvals: np.ndarray, yvals: np.ndarray)}.

  • savename (str, optional) – If provided, save the plot to savename and also to a PDF file savename + '.pdf' (unless savename already ends with .pdf).

Returns:

A dictionary with two keys: ‘meta’, which holds the metadata DataFrame, and ‘samples’, a dictionary mapping sample IDs to (x, y) arrays for the rarefaction curves.

Return type:

dict

Notes

  • The function shuffles individual reads per sample using numpy.random.shuffle. For reproducibility, set the global NumPy random seed before calling.

  • Helper functions naive_alpha, phyl_alpha, and func_alpha are assumed to be available in the current namespace.

  • The count table obj['tab'] must contain non-negative integers; zero-count features are ignored per sample during accumulation.
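
The notes above can be illustrated with a minimal sketch of rarefaction by shuffling individual reads. It assumes q=0 (observed richness) and that the ‘flexible’ step uses floor division by 20; the function name rarefaction_richness and the example counts are hypothetical, and the code is not qdiv's implementation. It also shows how seeding the global NumPy generator makes the curve reproducible.

```python
import numpy as np

def rarefaction_richness(counts, step="flexible"):
    """Hypothetical sketch: observed richness (q=0) at increasing read depths."""
    counts = np.asarray(counts, dtype=int)
    # One entry per read, labelled by feature index; zero-count features drop out.
    reads = np.repeat(np.arange(counts.size), counts)
    if step == "flexible":
        # Assumption: total reads divided by 20 with floor division, minimum 1.
        step = max(reads.size // 20, 1)
    # In-place shuffle driven by the global NumPy random state.
    np.random.shuffle(reads)
    depths = np.arange(step, reads.size + 1, step)
    # Make sure the full sample depth is the last point on the curve.
    if depths.size == 0 or depths[-1] != reads.size:
        depths = np.append(depths, reads.size)
    # Taking the first d shuffled reads is a subsample without replacement.
    richness = np.array([np.unique(reads[:d]).size for d in depths])
    return depths, richness

sample_counts = [50, 30, 15, 4, 1, 0]  # made-up counts for one sample

np.random.seed(42)  # seed the global generator before the call
x1, y1 = rarefaction_richness(sample_counts)

np.random.seed(42)  # same seed, same shuffle, same curve
x2, y2 = rarefaction_richness(sample_counts)

print(np.array_equal(y1, y2))
```

Because each depth reuses a prefix of the same shuffled read vector, the richness curve never decreases as depth grows.
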

Examples

Compute and plot, coloring by a metadata column:

>>> data = rarefactioncurve(
...     obj,
...     step='flexible',
...     div_type='naive',
...     q=0,
...     color_by='Treatment',
...     savename='rarefaction.png'
... )

Compute the curves without plotting:

>>> rd = rarefactioncurve(obj, step=500, only_return_data=True)

Plot from precomputed data:

>>> _ = rarefactioncurve(obj, only_plot_data=rd)  # uses obj['meta'] for annotations
qdiv.plot.relative_abundance_plots.octave(obj, *, group_by=None, values=None, nrows=2, ncols=2, fontsize=11, figsize=(10, 6), xlabels=True, ylabels=True, title=True, color='blue', savename=None)[source]

Plot octave distributions of ASV abundances according to Edgar & Flyvbjerg (DOI: 10.1101/38983).

This function bins feature counts into logarithmic intervals (powers of 2) and plots histograms for each sample or merged group of samples. Useful for visualizing abundance distributions across samples.

Parameters:
  • obj (dict or MicrobiomeData) –

    Input data containing at least:
    • ‘tab’: pandas.DataFrame. Abundance table (features x samples).

    Optional key:
    • ‘meta’: pandas.DataFrame. Metadata table for sample grouping.

  • group_by (str, optional) – Metadata column name used to merge samples by category. If None, each sample is plotted individually.

  • values (list of str, optional) – Subset of sample names or metadata values to include. If None, all samples or all categories in group_by are used.

  • nrows (int, default=2) – Number of rows in the subplot grid.

  • ncols (int, default=2) – Number of columns in the subplot grid. nrows * ncols must be >= number of panels.

  • fontsize (int, default=11) – Font size for plot text.

  • figsize (tuple of float, default=(10, 6)) – Figure size in inches.

  • xlabels (bool, default=True) – Whether to show x-axis labels (k bins).

  • ylabels (bool, default=True) – Whether to show y-axis labels (ASV counts).

  • title (bool, default=True) – Whether to display sample name or group name as subplot title.

  • color (str, default='blue') – Color of the bars in the histograms.

  • savename (str, optional) – If provided, save the figure to this path and also as PDF. Additionally, export the bin counts as a CSV file (savename + '.csv').

Returns:

  • fig (matplotlib.figure.Figure)

  • df (pandas.DataFrame) – DataFrame containing bin definitions and counts per sample/group. Columns: [‘k’, ‘min_count’, ‘max_count’, sample1, sample2, …]. Returns None if plotting fails due to insufficient panels.

Return type:

Tuple[plt.figure.Figure, pd.DataFrame]

Notes

  • Bins are defined as intervals [2^k, 2^(k+1)).

  • If the number of samples exceeds nrows * ncols, the function prints a warning and returns None without plotting.
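
The binning rule in the first note can be sketched as follows: a feature with count c falls into bin k when 2^k <= c < 2^(k+1). The function name octave_counts and the example counts are hypothetical; this illustrates the binning idea only and is not qdiv's code.

```python
def octave_counts(counts):
    """Hypothetical sketch: number of features per octave bin [2^k, 2^(k+1))."""
    # For a positive integer c, bit_length() - 1 equals floor(log2(c)) exactly.
    ks = [int(c).bit_length() - 1 for c in counts if c > 0]
    n_bins = max(ks) + 1 if ks else 0
    hist = [0] * n_bins
    for k in ks:
        hist[k] += 1
    return hist

feature_counts = [1, 1, 2, 3, 4, 7, 8, 100, 0]  # made-up read counts per feature
hist = octave_counts(feature_counts)

for k, n_features in enumerate(hist):
    print(f"k={k}  min_count={2 ** k}  features={n_features}")
```

Using the integer bit length avoids floating-point rounding at exact powers of two, which a log2-based computation would need to handle with care.
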

Examples

>>> fig, df = octave(obj, group_by='Treatment', nrows=2, ncols=3, color='green', savename='octave_plot')
>>> print(df.head())
qdiv.plot.relative_abundance_plots.pie(obj, *, group_by=None, value_aggregation='sum', order=None, levels=None, include_index=False, levels_shown=None, subset_levels=None, subset_patterns=None, n=6, featurelist=None, method='max', sorting='abundance', use_values_in_tab=False, nrows=1, ncols=1, figsize=(7.086614173228346, 3.937007874015748), fontsize=10, colorlist=None, other_color='grey', legend_columns=1, show_legend=True, savename=None)[source]

Plot pie charts of taxonomic composition for samples or merged groups.

Parameters:
  • obj (dict or MicrobiomeData) –

    Input data containing at least:
    • ‘tab’: pandas.DataFrame

      Abundance table (features x samples).

    • ‘tax’: pandas.DataFrame

      Taxonomy table (features x taxonomic levels).

  • group_by (str, optional) – Metadata column used to merge samples.

  • value_aggregation ({'sum', 'mean'}, default = 'sum')

  • order (str, optional) – Metadata column used to order samples.

  • levels (list of str, optional) – Taxonomic levels used for grouping.

  • include_index (bool, default=False) – Whether to include the feature index in labels.

  • levels_shown ({'number', None}, optional) – If ‘number’, show numeric labels instead of taxonomic names.

  • subset_levels (str or list of str, optional) – Taxonomic levels to filter by.

  • subset_patterns (str or list of str, optional) – Text patterns to filter taxa.

  • n (int, default=6) – Number of top taxa to plot (ignored if featurelist is provided).

  • featurelist (list of str, optional) – Specific features to plot.

  • method ({'max', 'mean'}, default = 'max')

  • sorting ({'abundance', 'alphabetical'}, default = 'abundance')

  • use_values_in_tab (bool, default=False)

  • nrows (int, default=1) – Number of rows in the subplot grid.

  • ncols (int, default=1) – Number of columns in the subplot grid.

  • figsize (tuple of float, default=(18/2.54, 10/2.54)) – Figure size in inches.

  • fontsize (int, default=10) – Font size for titles and legend.

  • colorlist (list of str, optional) – Colors for taxa slices. If None, defaults from get_colors_markers(‘colors’) are used.

  • other_color (str, default='grey') – Color for the ‘Other’ slice.

  • legend_columns (int, default=1) – Number of columns in the legend.

  • show_legend (bool, default=True) – Whether to show the legend.

  • savename (str, optional)

Returns:

  • fig (matplotlib.figure.Figure)

  • table (pandas.DataFrame) – DataFrame of relative abundances for plotted taxa and samples. Returns None if required keys are missing.

Return type:

Tuple[plt.figure.Figure, pd.DataFrame]

Notes

  • Taxa are grouped by the specified level using groupbytaxa.

  • Remaining taxa beyond n are aggregated into ‘Other’.

  • If order is provided, samples are sorted by that metadata column.
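
The selection of the top n taxa and the aggregation of the rest into ‘Other’ can be sketched as below. The sketch assumes that taxa are ranked by their maximum relative abundance in any sample, which is one plausible reading of method='max'; the function name top_n_with_other and the example table are hypothetical, and this is not qdiv's implementation.

```python
import numpy as np

def top_n_with_other(rel_abund, names, n):
    """Hypothetical sketch: keep the n top taxa and sum the rest into 'Other'.

    rel_abund is a taxa x samples array of relative abundances in percent.
    """
    rel_abund = np.asarray(rel_abund, dtype=float)
    # Assumption: rank taxa by their maximum abundance across samples.
    order = np.argsort(-rel_abund.max(axis=1), kind="stable")
    keep, rest = order[:n], order[n:]
    # Everything outside the top n is collapsed into a single 'Other' row.
    other = rel_abund[rest].sum(axis=0)
    table = np.vstack([rel_abund[keep], other])
    labels = [names[i] for i in keep] + ["Other"]
    return labels, table

names = ["A", "B", "C", "D"]  # made-up taxa
rel_abund = [
    [50.0, 10.0],
    [30.0, 20.0],
    [15.0, 60.0],
    [5.0, 10.0],
]  # made-up percentages; each sample (column) sums to 100

labels, table = top_n_with_other(rel_abund, names, n=2)
print(labels)
print(table)
```

Each column of the result still sums to 100, so every pie represents the whole sample.
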

Examples

>>> fig, df = pie(obj, group_by='Treatment', levels=['Genus'], n=8, savename='pie_chart')
>>> print(df.head())