Plotting

All plot functions return plotnine.ggplot objects, which can be composed with + before rendering. Call .draw() to display or .save("output.pdf") to write to disk.

Two cases return something else: make_network returns a networkx.Graph, and plot_ordination returns a DataFrame when called with just_df=True.

from pyloseq import (
    plot_bar, plot_richness, plot_ordination,
    plot_heatmap, plot_tree, make_network, plot_network,
)

plot_bar

Stacked bar chart of OTU abundances. Internally calls psmelt to convert to long format before plotting.

p = plot_bar(ps, fill="Phylum", facet_grid="~ SampleType")
p.draw()

Stack order is deterministic: sorted first by the x column, then by the fill column, matching R phyloseq's rendering.
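
The ordering rule can be illustrated without pyloseq. The sketch below sorts a few made-up long-format records by the x column and then by the fill column. The field names mirror the Sample, Phylum, and Abundance columns used on this page, but the helper stack_order is hypothetical and is not pyloseq's internal code.

```python
# Hypothetical long-format records, similar in shape to melted abundance data.
records = [
    {"Sample": "S2", "Phylum": "Firmicutes", "Abundance": 10},
    {"Sample": "S1", "Phylum": "Proteobacteria", "Abundance": 4},
    {"Sample": "S2", "Phylum": "Bacteroidetes", "Abundance": 7},
    {"Sample": "S1", "Phylum": "Bacteroidetes", "Abundance": 3},
]


def stack_order(rows, x="Sample", fill="Phylum"):
    """Return rows sorted by the x column, then by the fill column."""
    return sorted(rows, key=lambda r: (r[x], r[fill]))


ordered = stack_order(records)
for row in ordered:
    print(row["Sample"], row["Phylum"], row["Abundance"])
```

Because the sort key does not depend on input order, the same fill segment lands in the same position within every bar.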

# Group by sample type on x-axis, fill by phylum
p = plot_bar(ps, x="SampleType", fill="Phylum")

# Facet by sample type, filling by genus
p = plot_bar(ps, fill="Genus", facet_grid="~ SampleType")

# Compose with a plotnine theme
from plotnine import theme_bw
p = plot_bar(ps, fill="Phylum") + theme_bw()

pyloseq.plot_bar

plot_bar(
    ps: Phyloseq,
    x: str = "Sample",
    y: str = "Abundance",
    fill: str | None = None,
    facet_grid: str | None = None,
    title: str | None = None,
) -> Any

Stacked bar chart of OTU/feature abundances.

Bars are ordered by x and stacked deterministically by fill so that fill segments line up consistently across samples (matching R phyloseq's behaviour).

R reference: plot_bar(physeq, x, y, fill, facet_grid, title)

Parameters:

Name	Type	Description	Default
`ps`	`Phyloseq`	`Phyloseq` object.	required
`x`	`str`	Column in the melted data to use as x-axis (default `"Sample"`).	`'Sample'`
`y`	`str`	Column for bar height (default `"Abundance"`).	`'Abundance'`
`fill`	`str \| None`	Column for bar fill colour (e.g. `"Phylum"`).	`None`
`facet_grid`	`str \| None`	Facet formula string (e.g. `"~ SampleType"`).	`None`
`title`	`str \| None`	Plot title.	`None`

Returns:

Type	Description
`ggplot`

plot_richness

Alpha diversity plot faceted by measure. By default each panel shows a box-and-whisker summary; set boxplot=False for a points-only view (useful when x groups only a few samples and boxes add noise):

p = plot_richness(ps, x="SampleType", color="SampleType",
                  measures=["Shannon", "Simpson"])
p.draw()

# Points only — no boxes
p = plot_richness(ps, x="SampleType", color="SampleType", boxplot=False)

Standard-error whiskers are drawn automatically for measures that have a corresponding se.* column in the estimate_richness output (currently only se.chao1).
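
One way to picture the matching rule: a measure gets whiskers only if the richness table has a column named se. followed by the measure name. The sketch below assumes the match ignores letter case, so that Chao1 pairs with se.chao1. That assumption, the helper name measures_with_se, and the column list are illustrative and are not taken from pyloseq's source.

```python
def measures_with_se(columns):
    """Map each measure column to its standard-error column, if one exists."""
    lowered = {c.lower(): c for c in columns}
    pairs = {}
    for col in columns:
        if col.lower().startswith("se."):
            continue  # skip the standard-error columns themselves
        se_col = lowered.get("se." + col.lower())
        if se_col is not None:
            pairs[col] = se_col
    return pairs


# Hypothetical column names from a richness table.
cols = ["Observed", "Chao1", "se.chao1", "Shannon", "Simpson"]
pairs = measures_with_se(cols)
print(pairs)
```

Measures without a matching column, such as Shannon here, would be drawn without whiskers.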

Setting x="samples" (the default) puts individual sample names on the x-axis. Set x to any column in sample_data to group or aggregate:

# Individual samples, coloured by environment type
p = plot_richness(ps, x="samples", color="SampleType")

# Boxplots by environment type
p = plot_richness(ps, x="SampleType", color="SampleType")

pyloseq.plot_richness

plot_richness(
    ps: Phyloseq,
    x: str = "samples",
    color: str | None = None,
    measures: list[str] | None = None,
    title: str | None = None,
    boxplot: bool = True,
) -> Any

Alpha diversity box-and-point plots, faceted by measure.

Standard-error whiskers are drawn for any measure that has a matching se.<measure> column from estimate_richness (e.g. se.chao1, se.ACE), matching R phyloseq.

R reference: plot_richness(physeq, x, color, measures, title)

Parameters:

Name	Type	Description	Default
`ps`	`Phyloseq`	`Phyloseq` object.	required
`x`	`str`	Sample metadata column for x-axis. Use `"samples"` (default) to put individual samples on x.	`'samples'`
`color`	`str \| None`	Sample metadata column for point colour.	`None`
`measures`	`list[str] \| None`	Diversity measures to include; default is all.	`None`
`title`	`str \| None`	Plot title.	`None`
`boxplot`	`bool`	If `True` (default), layer a box-and-whisker summary under the points. Set `False` for points only (e.g. when `x` groups few samples and the boxes add noise).	`True`

Returns:

Type	Description
`ggplot`

plot_rarefaction_curve

Rarefaction curves showing observed richness as a function of sequencing depth. For each sample the OTU counts are subsampled (without replacement) at n_steps evenly-spaced depths between step and the minimum sample depth, and the number of distinct observed taxa is counted at each depth:

from pyloseq import plot_rarefaction_curve

p = plot_rarefaction_curve(ps, step=500, n_steps=30)
p.draw()

# Colour curves by a metadata variable
p = plot_rarefaction_curve(ps, step=200, n_steps=40, color="SampleType")

All curves end at the same rightmost depth, the minimum sample depth across all samples. Curves are therefore compared over a common depth range, and any reads a sample has beyond that depth do not appear in the plot.

The returned ggplot's data attribute contains columns Sample, Depth, and Observed (plus the color column if one was requested), enabling further customisation:

from plotnine import labs, theme_bw
p = plot_rarefaction_curve(ps, step=500, n_steps=20, rng_seed=42)
p = p + theme_bw() + labs(title="Rarefaction curves")

Note

Each depth point uses an independent random draw, not a cumulative one. Each point is therefore a single-draw estimate of the expected richness at that depth, and a curve can dip slightly between neighbouring depths. Use a fixed rng_seed for reproducible figures, and increase n_steps for a finer depth grid.
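
The procedure described in this section can be sketched for a single sample using only the standard library. The function name rarefaction_curve, the integer rounding of the depth grid, and the example counts are assumptions made for illustration; pyloseq's own implementation may differ in these details.

```python
import random


def rarefaction_curve(counts, step, n_steps, max_depth, seed=None):
    """Observed richness at evenly spaced depths for one sample.

    counts    : per-taxon integer read counts for the sample
    max_depth : right-hand end of the grid (minimum depth across samples)
    """
    rng = random.Random(seed)
    # Expand the counts into one entry per read, labelled by taxon index.
    reads = [taxon for taxon, n in enumerate(counts) for _ in range(n)]
    # Evenly spaced depths from `step` to `max_depth`, rounded to integers.
    span = max(n_steps - 1, 1)
    depths = [round(step + i * (max_depth - step) / span) for i in range(n_steps)]
    curve = []
    for depth in depths:
        # Independent draw without replacement at every depth.
        drawn = rng.sample(reads, depth)
        curve.append((depth, len(set(drawn))))
    return curve


counts = [50, 30, 15, 4, 1]  # hypothetical sample with 100 reads
curve = rarefaction_curve(counts, step=10, n_steps=10, max_depth=100, seed=42)
for depth, observed in curve:
    print(depth, observed)
```

At the final depth every read is drawn, so the observed richness equals the number of taxa with non-zero counts.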

pyloseq.plot_rarefaction_curve

plot_rarefaction_curve(
    ps: Phyloseq,
    step: int = 500,
    n_steps: int = 30,
    color: str | None = None,
    rng_seed: int | None = None,
    title: str | None = None,
) -> Any

Rarefaction curves showing observed richness vs. sequencing depth.

For each sample a curve is drawn by randomly subsampling (without replacement) the reads at n_steps depths between step and the minimum sample depth, then counting the number of distinct observed taxa at each depth. The minimum depth sets the right-hand end of all curves so that every sample reaches the same maximum depth point.

R reference: vegan::rarecurve(t(otu_table(physeq)), step=...) or microbiome::plot_richness_estimates (depth-based variant)

Parameters:

Name	Type	Description	Default
`ps`	`Phyloseq`	`Phyloseq` object. OTU table values must be integer counts.	required
`step`	`int`	Starting depth for the rarefaction grid (first subsampling depth).	`500`
`n_steps`	`int`	Number of evenly-spaced depth points from `step` to the minimum sample depth.	`30`
`color`	`str \| None`	Column in `sample_data` to colour the curves by. If `None`, all curves use the default colour.	`None`
`rng_seed`	`int \| None`	Seed for the subsampling RNG. Pass `None` for non-reproducible draws.	`None`
`title`	`str \| None`	Plot title.	`None`

Returns:

Type	Description
`ggplot`	The underlying `data` attribute contains columns `Sample`, `Depth`, `Observed`, and any column named by `color`.

plot_ordination

Scatter plot of ordination results. The kind parameter controls what is plotted:

`kind`	What is plotted
`"samples"`	Sample coordinates (default)
`"taxa"`	Feature/taxa scores (requires ordination with feature scores, e.g. CA, RDA)
`"biplot"`	Samples and taxa together, taxa scores rescaled to sample axis range
`"split"`	Samples and taxa side-by-side in facets
`"scree"`	Proportion of variance explained per axis

from pyloseq import ordinate, plot_ordination

ord_result = ordinate(ps, method="PCoA", distance="bray")

# Basic sample scatter
p = plot_ordination(ps, ord_result, color="SampleType")
p.draw()

# With convex hulls per colour group
p = plot_ordination(ps, ord_result, color="SampleType", show_hull=True)

# Scree plot
p = plot_ordination(ps, ord_result, kind="scree")

# Return the DataFrame instead of a plot (for custom plotting)
df = plot_ordination(ps, ord_result, just_df=True)
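
For the scree view, the quantity plotted per axis is that axis's share of the total variance. A minimal sketch with made-up eigenvalues follows. It assumes all eigenvalues are non-negative (PCoA on non-Euclidean distances can produce negative ones, which need separate handling), and the helper proportion_explained is hypothetical rather than part of pyloseq.

```python
def proportion_explained(eigenvalues):
    """Share of total variance carried by each ordination axis."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


eigvals = [4.0, 2.0, 1.5, 0.5]  # hypothetical eigenvalues, largest first
props = proportion_explained(eigvals)
for axis, share in enumerate(props, start=1):
    print(f"Axis {axis}: {share:.4f}")
```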

Note

kind="taxa" and kind="biplot" require an ordination that produces feature scores, such as CA, CCA, or RDA. PCoA and NMDS do not produce feature scores; using those with kind="taxa" raises pyloseqValidationError.

pyloseq.plot_ordination

plot_ordination(
    ps: Phyloseq,
    ord: Any,
    kind: str = "samples",
    color: str | None = None,
    shape: str | None = None,
    label: str | None = None,
    title: str | None = None,
    show_hull: bool = False,
    just_df: bool = False,
    **kwargs: Any,
) -> Any

Scatter plot of ordination results.

R reference: plot_ordination(physeq, ordination, type, color, shape, label, title, justDF)

Parameters:

Name	Type	Description	Default
`ps`	`Phyloseq`	`Phyloseq` object.	required
`ord`	`Any`	`skbio.stats.ordination.OrdinationResults` from `ordinate`.	required
`kind`	`str`	One of `"samples"`, `"taxa"`, `"biplot"`, `"split"`, `"scree"`.	`'samples'`
`color`	`str \| None`	Sample/taxa metadata column for point colour.	`None`
`shape`	`str \| None`	Sample metadata column for point shape.	`None`
`label`	`str \| None`	Column to annotate points with text labels.	`None`
`title`	`str \| None`	Plot title.	`None`
`show_hull`	`bool`	If `True`, shade a convex hull behind each colour group (samples and split kinds only). Off by default; this is not phyloseq behaviour but is offered as a convenience.	`False`
`just_df`	`bool`	If `True`, return the assembled plotting `DataFrame` instead of a ggplot (mirrors R phyloseq's `justDF=TRUE`).	`False`

Returns:

Type	Description
`ggplot or DataFrame`

plot_heatmap

Heatmap of OTU abundances across samples. By default, rows and columns are reordered by an ordination to group similar taxa and samples together:

p = plot_heatmap(ps, method="PCoA", distance="bray", trans="log4")
p.draw()

Pass method=None to skip ordination entirely and preserve the original sample and taxa order from the Phyloseq object:

p = plot_heatmap(ps, method=None)

The trans parameter applies a transformation before plotting. Zero values become NaN and render using na_value rather than the low-end gradient colour:

`trans`	Transformation
`None`	Raw values
`"log4"`	log₄(x), zeros → `na_value`
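
The "log4" behaviour can be sketched in a few lines. The helper name log4_or_nan is hypothetical; the sketch only illustrates the rule stated above, that zeros become NaN so they can be drawn with na_value.

```python
import math


def log4_or_nan(x):
    """Base-4 logarithm of x; zero maps to NaN so it can take the NA colour."""
    if x == 0:
        return math.nan
    # log4(x) = log2(x) / log2(4) = log2(x) / 2
    return math.log2(x) / 2


values = [0, 1, 4, 16, 256]
transformed = [log4_or_nan(v) for v in values]
print(transformed)
```

Mapping zeros to NaN rather than to a very small number keeps absent taxa visually distinct from taxa that are present at low abundance.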

Use label to replace x-axis sample-name tick labels with a sample_data variable, and taxa_label to replace y-axis taxa-name tick labels with a taxonomic rank. Ordering (from ordination or original order) is unaffected — only the tick text changes. These mirror R phyloseq's sample.label and taxa.label:

# Label x-axis ticks by treatment group, y-axis ticks by phylum
p = plot_heatmap(ps, method="PCoA", label="TreatmentGroup", taxa_label="Phylum")

pyloseq.plot_heatmap

plot_heatmap(
    ps: Phyloseq,
    method: str | None = "NMDS",
    distance: str = "bray",
    trans: str | None = "log4",
    low: str = "#000033",
    high: str = "#66CCFF",
    na_value: str = "black",
    title: str | None = None,
    label: str | None = None,
    taxa_label: str | None = None,
) -> Any

Abundance heatmap with samples and taxa reordered by ordination.

Both axes are reordered using the ordination result (samples along x, taxa/OTUs along y), matching R phyloseq. Zero/NA abundances are mapped to na_value rather than the gradient's low colour.

R reference: plot_heatmap(physeq, method, distance, trans, low, high, na.value, title, sample.label, taxa.label)

Parameters:

Name	Type	Description	Default
`ps`	`Phyloseq`	`Phyloseq` object.	required
`method`	`str \| None`	Ordination method used to reorder samples and taxa. Pass `None` to skip ordination and preserve the original sample/taxa order.	`'NMDS'`
`distance`	`str`	Distance metric for ordination.	`'bray'`
`trans`	`str \| None`	Transformation applied to abundances before plotting. `"log4"` (default) computes `log4(x)` with zeros kept as missing (so they map to `na_value`); `None` uses raw counts.	`'log4'`
`low`	`str`	Colour for the low end of the gradient.	`'#000033'`
`high`	`str`	Colour for the high end of the gradient.	`'#66CCFF'`
`na_value`	`str`	Colour for zero/NA cells.	`'black'`
`title`	`str \| None`	Plot title.	`None`
`label`	`str \| None`	Column in `sample_data` whose values label the x-axis ticks instead of sample names. Samples are still ordered by ordination (or original order when `method=None`); only the tick text changes. A warning is emitted if the column is not found. Mirrors R phyloseq's `sample.label`.	`None`
`taxa_label`	`str \| None`	Taxonomic rank (e.g. `"Class"`) whose values label the y-axis ticks instead of OTU/taxa names. Taxa are still ordered by ordination (or original order when `method=None`); only the tick text changes. A warning is emitted if the rank is not found. Mirrors R phyloseq's `taxa.label`.	`None`

Returns:

Type	Description
`ggplot`

plot_tree

Phylogenetic tree visualization. Requires phy_tree:

p = plot_tree(ps, color="SampleType", label_tips="Phylum")
p.draw()

The method parameter selects the drawing mode (default "sampledodge"):

"treeonly" — draw the tree structure only
"sampledodge" — dodge sample points along the tips by metadata

# Tree with phylum labels at tips
p = plot_tree(ps, method="treeonly", label_tips="Phylum", ladderize=True)

# Sample-dodge mode: show samples by environment type at tips
p = plot_tree(ps, method="sampledodge", color="SampleType", size="Abundance")

pyloseq.plot_tree

plot_tree(
    ps: Phyloseq,
    method: str = "sampledodge",
    color: str | None = None,
    shape: str | None = None,
    size: str | None = None,
    label_tips: str | None = None,
    text_size: float | None = None,
    sizebase: float = 5.0,
    base_spacing: float = 0.02,
    min_abundance: float = float("inf"),
    ladderize: bool | str = False,
    justify: str = "jagged",
    plot_margin: float = 0.2,
    figure_size: tuple[float, float] | None = None,
    title: str | None = None,
) -> Any

Phylogenetic tree with per-sample points at the tips.

R reference: plot_tree(physeq, method, color, shape, size, label.tips, text.size, sizebase, base.spacing, min.abundance, ladderize, justify, plot.margin, title)

Parameters:

Name	Type	Description	Default
`ps`	`Phyloseq`	`Phyloseq` object; must include a phylogenetic tree (`phy_tree`).	required
`method`	`str`	`"sampledodge"` (default) draws one point per sample at each tip, offset rightward along x. `"treeonly"` draws the tree alone.	`'sampledodge'`
`color`	`str \| None`	Sample metadata or `tax_table` column for point colour.	`None`
`shape`	`str \| None`	Sample metadata column for point shape.	`None`
`size`	`str \| None`	`"Abundance"` to scale point size by log-transformed abundance, any numeric metadata column, or `None` for fixed size.	`None`
`label_tips`	`str \| None`	`tax_table` column whose values label each tip (e.g. `"Genus"`).	`None`
`text_size`	`float \| None`	Font size for tip labels. Auto-scaled from tip count when `None`.	`None`
`sizebase`	`float`	Log base for the abundance → size transform.	`5.0`
`base_spacing`	`float`	Fractional x-step between dodged sample points, as a proportion of the maximum tip x value.	`0.02`
`min_abundance`	`float`	Abundance threshold for printing per-point text labels. Default `inf` suppresses all labels (matching R phyloseq). Points themselves are always shown for `Abundance > 0`.	`float('inf')`
`ladderize`	`bool \| str`	`False`, `True` / `"right"` (most-speciose clade at top), or `"left"` (most-speciose clade at bottom).	`False`
`justify`	`str`	`"jagged"` (default) starts each tip's dodge column from its own x position. `"left"` aligns all dodge columns at the rightmost tip.	`'jagged'`
`plot_margin`	`float`	Fractional right-margin added beyond the last dodged point so that tip labels are not clipped.	`0.2`
`figure_size`	`tuple[float, float] \| None`	`(width, height)` in inches. When `None` (default), height is auto-scaled from the tip count (`0.2 * n_tips`, min 6) and width is fixed at 12.	`None`
`title`	`str \| None`	Plot title.	`None`
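
The interaction of base_spacing and justify can be pictured with a small sketch. It assumes the step between points is base_spacing times the largest tip x value and that the first dodged point sits one step to the right of its starting position. The function name dodge_positions and the offset convention are illustrative and are not taken from pyloseq's source.

```python
def dodge_positions(tip_x, n_points, max_tip_x, base_spacing=0.02, justify="jagged"):
    """x positions for n_points sample points dodged to the right of one tip."""
    step = base_spacing * max_tip_x
    # "jagged": start at this tip's own x; "left": start at the rightmost tip.
    start = tip_x if justify == "jagged" else max_tip_x
    return [start + (k + 1) * step for k in range(n_points)]


# Hypothetical tip at x=4.0 in a tree whose rightmost tip is at x=8.0.
jagged = dodge_positions(4.0, 3, max_tip_x=8.0, base_spacing=0.25)
aligned = dodge_positions(4.0, 3, max_tip_x=8.0, base_spacing=0.25, justify="left")
print(jagged)
print(aligned)
```

With "left", every tip's points start from the same x, so the dodge columns line up vertically regardless of branch length.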

Returns:

Type	Description
`ggplot`

make_network / plot_network

make_network builds a networkx.Graph where nodes are samples and edges connect samples whose distance is below max_dist:

from pyloseq import make_network, plot_network

g = make_network(ps, max_dist=0.4, distance="bray")
p = plot_network(g, ps, color="SampleType", label="SampleID")
p.draw()

The distance parameter accepts either a metric name string (see distance) or a precomputed skbio.stats.distance.DistanceMatrix — mirroring R's plot_net(distance=as.dist(...)) pattern. This makes it straightforward to use phylogenetic distances from gunifrac:

from pyloseq import gunifrac, make_network, plot_network

results = gunifrac(ps)
g = make_network(ps, distance=results["d_0.5"], max_dist=0.5)
p = plot_network(g, ps, color="SampleType")
p.draw()

Edge width is scaled inversely by distance: edges between more-similar samples appear thicker, matching R's plot_net.

If the shape aesthetic maps to more than 6 unique values, the shape legend is suppressed automatically and a warning is issued, because plotnine's default shape palette has only 6 entries. Shapes are still distinct in the plot.

Node attributes from sample_data are attached to each node automatically, making the graph available for further networkx analysis.

pyloseq.make_network

make_network(
    ps: Phyloseq,
    kind: str = "samples",
    distance: str | Any = "jaccard",
    max_dist: float = 0.4,
    keep_isolates: bool = False,
    **kwargs: Any,
) -> Any

Build a sample (or taxa) network based on a distance threshold.

Edges are drawn between nodes whose distance is strictly less than max_dist (matching R phyloseq's max.dist semantics).
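
The thresholding rule can be sketched with plain Python data structures. The helper threshold_edges and the distance values are hypothetical; the sketch only illustrates the strict less-than comparison and the effect of keep_isolates described on this page, not how pyloseq builds its networkx.Graph.

```python
from itertools import combinations


def threshold_edges(names, dist, max_dist=0.4, keep_isolates=False):
    """Return (nodes, edges); an edge joins i and j only if dist[i][j] < max_dist."""
    edges = [
        (names[i], names[j])
        for i, j in combinations(range(len(names)), 2)
        if dist[i][j] < max_dist  # strict comparison: equality does not count
    ]
    if keep_isolates:
        nodes = list(names)
    else:
        connected = {n for edge in edges for n in edge}
        nodes = [n for n in names if n in connected]
    return nodes, edges


names = ["A", "B", "C", "D"]
dist = [  # hypothetical symmetric distance matrix
    [0.0, 0.2, 0.4, 0.9],
    [0.2, 0.0, 0.35, 0.8],
    [0.4, 0.35, 0.0, 0.7],
    [0.9, 0.8, 0.7, 0.0],
]
nodes, edges = threshold_edges(names, dist)
print(nodes)
print(edges)
```

Here A and C are exactly 0.4 apart, so no edge joins them, and D is dropped because it has no edges.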

R reference: make_network(physeq, type, distance, max.dist, keep.isolates)

R reference (precomputed distance matrix): plot_net(physeq, distance=as.dist(...), ...)

Parameters:

Name	Type	Description	Default
`ps`	`Phyloseq`	`Phyloseq` object.	required
`kind`	`str`	`"samples"` (default) or `"taxa"`.	`'samples'`
`distance`	`str \| Any`	Distance metric string (see `pyloseq.distance`) or a precomputed `skbio.stats.distance.DistanceMatrix`. Passing a `DistanceMatrix` mirrors R's `plot_net(distance=as.dist(...))` pattern (e.g. from `pyloseq.gunifrac`).	`'jaccard'`
`max_dist`	`float`	Maximum distance for an edge to be drawn (strict `<`).	`0.4`
`keep_isolates`	`bool`	If `False` (default), remove nodes with no edges.	`False`

Returns:

Type	Description
`Graph`

pyloseq.plot_network

plot_network(
    g: Any,
    ps: Phyloseq,
    color: str | None = None,
    shape: str | None = None,
    line_weight: float = 0.5,
    line_color: str = "grey",
    line_alpha: float = 0.4,
    point_size: float = 5.0,
    label: str | None = None,
    layout: str = "fruchterman_reingold",
    title: str | None = None,
) -> Any

Plot a network graph as a ggplot scatter.

R reference: plot_network(ig, physeq, color, shape, line_weight, line_color, line_alpha, point_size, label, layout, title)

Parameters:

Name	Type	Description	Default
`g`	`Any`	`networkx.Graph` from `make_network`.	required
`ps`	`Phyloseq`	`Phyloseq` object (used for metadata annotations).	required
`color`	`str \| None`	Node attribute column for colour.	`None`
`shape`	`str \| None`	Node attribute column for point shape.	`None`
`line_weight`	`float`	Edge line width.	`0.5`
`line_color`	`str`	Edge colour.	`'grey'`
`line_alpha`	`float`	Edge opacity.	`0.4`
`point_size`	`float`	Node point size.	`5.0`
`label`	`str \| None`	Node attribute column (or `"node"` for the node id) to draw as text labels next to each node.	`None`
`layout`	`str`	NetworkX layout algorithm name (e.g. `"fruchterman_reingold"`).	`'fruchterman_reingold'`
`title`	`str \| None`	Plot title.	`None`

Returns:

Type	Description
`ggplot`