ema switch geneview
ema switch geneview renders gene-track panels showing per-cluster PAS
coverage and proportions for a set of genes. For each gene, it builds a panel
with depth-normalised per-cluster PAS read bars, optional isoform structure
(from a GTF), and optional PDUI / proportion / entropy score overlays (from a
length TSV). The output is one figure file per gene, a meta.json manifest,
and a figures_INDEX.md index.
Genes are selected from the union of:
- Explicit --gene-id arguments (normalised to uppercase).
- Top-N genes auto-ranked from --diff-tsv results by volcano score.
At least one of --diff-tsv or --gene-id must be provided.
When --output is left at its default and the --h5ad file comes from a
recognisable peakatail_runs/<run>/ directory, output is routed inside the
originating run directory as peakatail_runs/<run>/switch_geneview_<timestamp>/.
Source: ema/cli/common.py::resolve_subcommand_output_dir.
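The routing rule above can be sketched as follows. This is a minimal illustration, not the real resolve_subcommand_output_dir: the function name sketch_output_dir, the timestamp format, and the fallback for paths outside peakatail_runs/ are all assumptions made for the example.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

RUNS_DIRNAME = "peakatail_runs"


def sketch_output_dir(h5ad: Path, base: str = "switch_geneview",
                      stamp: Optional[str] = None) -> Path:
    """Illustrative routing rule; NOT the real resolve_subcommand_output_dir."""
    # Assumed timestamp format; the real suffix may look different.
    stamp = stamp or datetime.now().strftime("%Y%m%d_%H%M%S")
    parts = h5ad.parts
    if RUNS_DIRNAME in parts:
        idx = parts.index(RUNS_DIRNAME)
        # Require a <run> directory between peakatail_runs/ and the file itself.
        if idx + 2 < len(parts):
            return Path(*parts[: idx + 2]) / f"{base}_{stamp}"
    # Assumed fallback: a timestamped directory relative to the working directory.
    return Path(f"{base}_{stamp}")


routed = sketch_output_dir(
    Path("peakatail_runs/run_a/per_dataset/sample1/clusters.h5ad"), stamp="TS")
print(routed.as_posix())  # peakatail_runs/run_a/switch_geneview_TS
```

Passing an explicit stamp keeps the sketch deterministic; in real use the suffix is added automatically by the command.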
When to use it
- You have run ema switch diff and want to visualise the top differentially used genes as per-cluster PAS track plots.
- You have a specific gene of interest and want to see how PAS usage differs across all clusters.
- You want to overlay PDUI scores from ema switch length on the per-cluster coverage bars to connect quantitative scores to the raw data.
When NOT to use it
- You have not provided either --diff-tsv or --gene-id. The command raises a UsageError and exits without writing any files.
- The gene identifier you supply does not appear in adata.var or pasbed.bed. The command logs a warning and skips that gene (no figure is written).
- The --h5ad file has no gene_id column in var. Gene-to-PAS mapping will fail and no panels will be rendered.
Quick example
# Auto-pick top 10 genes from a diff result
uv run ema switch geneview \
  --diff-tsv peakatail_runs/emaout_.../switch_diff_<ts>/differential/fisher_0_vs_1.tsv \
  --pasbed peakatail_runs/emaout_.../per_dataset/sample1/pasbed.bed \
  --h5ad peakatail_runs/emaout_.../per_dataset/sample1/clusters.h5ad \
  --top-genes 10

# Explicit genes with isoform overlay and PDUI score
uv run ema switch geneview \
  --gene-id ACTB \
  --gene-id MYC \
  --diff-tsv peakatail_runs/emaout_.../switch_diff_<ts>/differential/fisher_0_vs_1.tsv \
  --length-tsv peakatail_runs/emaout_.../switch_length_<ts>/pdui_classic.tsv \
  --pasbed peakatail_runs/emaout_.../per_dataset/sample1/pasbed.bed \
  --h5ad peakatail_runs/emaout_.../per_dataset/sample1/clusters.h5ad \
  --gtf /path/to/gencode.v44.annotation.gtf \
  --plot-engine plotly
What lands on disk after the first command:
- peakatail_runs/emaout_.../switch_geneview_<ts>/figures/gene_<GENE_ID>.png: one PNG per rendered gene.
- peakatail_runs/emaout_.../switch_geneview_<ts>/figures/meta.json: manifest listing all rendered figures.
- peakatail_runs/emaout_.../switch_geneview_<ts>/figures/figures_INDEX.md: Markdown index of all figure files.
- peakatail_runs/emaout_.../switch_geneview_<ts>/peakatail_<ts>.log: run log.
Full --help output
Usage: ema switch geneview [OPTIONS]

  Gene-track visualisation: per-cluster PAS coverage and proportions.

  Renders one panel per gene showing:
    - Per-cluster PAS read coverage (depth-normalised)
    - Per-cluster within-gene PAS proportions
    - Gene isoform structure (when --gtf is supplied)

  Genes are selected from the union of --gene-id and the top-N genes
  ranked by volcano score from --diff-tsv results.

Options:
  --threads INTEGER     Max parallel workers (auto-detected if not
                        set). Respected by ResourceManager as an
                        absolute ceiling.
  -v, --verbose         Increase verbosity. -v = DEBUG for ema.*;
                        -vv = DEBUG everywhere.
  -q, --quiet           WARNING and up only. Overrides --verbose.
  --log-level TEXT      Explicit logger level
                        (DEBUG/INFO/WARNING/ERROR) or
                        `logger.name=LEVEL` (repeatable:
                        comma-separated).
  --no-log-file         Don't write peakatail_<ts>.log next to the
                        outputs.
  --no-progress         Suppress Rich progress bars.
  -c, --config PATH     YAML config; CLI flags override individual
                        keys.
  -o, --output PATH     Output directory (timestamp suffix added
                        automatically).
  --plot-engine TEXT    Engines: 'matplotlib' (default), 'plotly',
                        'both', 'none', or comma list.
  --plot-format TEXT    Restrict output formats. Default 'all' =
                        png+svg+html as appropriate.
  --no-plots            Disable all plotting (alias for
                        --plot-engine none).
  --diff-tsv PATH       Differential TSV from `ema switch diff`
                        (repeatable). Used to auto-select top-N genes
                        by volcano score. Optional when --gene-id is
                        supplied.
  --length-tsv PATH     PDUI / proportion / entropy TSV from `ema
                        switch length`. When present, overlays
                        strategy scores on per-cluster bars.
  --pasbed PATH         pasbed.bed — PAS coordinates (BED6 format).
                        [required]
  -i, --h5ad PATH       clusters.h5ad with leiden cluster labels and
                        gene_id in var. [required]
  --gtf PATH            Ensembl GTF for isoform structure (optional).
  --top-genes INTEGER   Number of top genes to auto-pick from
                        --diff-tsv results. [default: 10]
  --gene-id TEXT        Explicit gene ID(s) to render (repeatable).
                        Union-ed with --top-genes list.
  --cluster-key TEXT    obs column carrying cluster labels.
                        [default: leiden]
  --help                Show this message and exit.
Flags
Inputs
| Flag | Type | Default | Description |
|---|---|---|---|
| --h5ad / -i | PATH | - | Single clusters.h5ad file from ema run. Must have gene_id in adata.var (written when --gtf was provided to ema run). The Leiden cluster labels (adata.obs[cluster_key]) drive the per-cluster display. Required. |
| --pasbed | PATH | - | PAS BED file in BED6 format (chrom, start, end, pas_id, score, strand). Loaded to retrieve genomic coordinates for PAS track positioning. Required. |
| --diff-tsv | PATH (repeatable) | - | One or more differential TSV files from ema switch diff. Each TSV is parsed for cluster1, cluster2, gene_id, and volcano columns. Multiple TSVs for the same cluster pair are concatenated. Optional when --gene-id is provided; required otherwise. |
| --length-tsv | PATH | - | PDUI, proportion, or entropy TSV from ema switch length. When provided, score values are overlaid on per-cluster PAS bars. Optional. |
| --gtf | PATH | - | Ensembl/GENCODE GTF file. When provided, isoform structure is loaded via ema.viz._gene_track_helpers.load_isoforms_for_gene and drawn as a transcript model track below the PAS bars. Optional; failures are logged as warnings and do not abort the run. |
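To make the BED6 layout expected by --pasbed concrete, here is a small standalone reader. It is only a sketch for inspecting a file by hand; PasRecord and read_bed6 are hypothetical names and do not reflect how ema itself loads the file. The score column is kept as text because its content is not specified here.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class PasRecord(NamedTuple):
    chrom: str
    start: int
    end: int
    pas_id: str
    score: str
    strand: str


def read_bed6(lines: Iterable[str]) -> List[PasRecord]:
    """Parse tab-separated BED6 lines into PasRecord tuples."""
    records = []
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines and the comment/header lines BED files may carry.
        if not row or row[0].startswith(("#", "track", "browser")):
            continue
        if len(row) < 6:
            raise ValueError(f"expected 6 columns, got {len(row)}: {row}")
        chrom, start, end, pas_id, score, strand = row[:6]
        records.append(PasRecord(chrom, int(start), int(end), pas_id, score, strand))
    return records


# Made-up coordinates, for illustration only.
sample = io.StringIO("chr1\t100\t101\tpas_1\t0\t+\nchr1\t250\t251\tpas_2\t0\t-\n")
records = read_bed6(sample)
print(records[0].pas_id, records[0].start, records[1].strand)
```

For a real file, pass an open file handle instead of the StringIO object.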
Gene selection
| Flag | Type | Default | Description |
|---|---|---|---|
| --top-genes | INT | 10 | Number of top genes to auto-select from --diff-tsv results, ranked by volcano score via ema.viz._gene_track_helpers.rank_top_genes. Only used when --diff-tsv is provided. |
| --gene-id | TEXT (repeatable) | - | Explicit gene ID(s) to render regardless of differential ranking. Normalised to uppercase and stripped of whitespace. Repeat the flag to specify multiple genes: --gene-id ACTB --gene-id MYC. Added to the gene list before auto-ranked genes; the union is deduplicated while preserving order. |
Display
| Flag | Type | Default | Description |
|---|---|---|---|
| --cluster-key | TEXT | leiden | adata.obs column carrying cluster labels. Change this if ema run was called with --external-clusters or a custom labelling. |
Output
| Flag | Type | Default | Description |
|---|---|---|---|
| --output / -o | PATH | switch_geneview | Output directory base name. Auto-routed inside the originating run dir when --h5ad is from a peakatail_runs/ path. |
| --plot-engine | TEXT | matplotlib | Rendering engine. matplotlib writes PNG (and SVG with --plot-format svg). plotly writes interactive HTML. both writes all formats. The renderer auto-discovers engines via ema.viz.render_all. |
| --plot-format | TEXT | all | Restrict output formats. png, svg, html, or comma list. |
| --no-plots | FLAG | off | Disable all figure output. The gene list and manifest are still computed, but no figure files are written. |
Gene selection logic
The command implements the following selection pipeline (source:
ema/cli/switch_geneview.py lines 161–218):
- Parse --gene-id arguments: normalise to uppercase, strip whitespace.
- If --diff-tsv is provided, call rank_top_genes(pair_results, n=top_genes) on the accumulated diff DataFrames. This ranks genes by volcano score across all pairs.
- Build the final gene list as explicit_genes + auto_genes, deduplicated while preserving order (explicit genes appear first).
- For each gene, call build_gene_panel and render_all. Genes not found in the AnnData or pasbed are skipped with a warning.
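The first three steps of the pipeline can be sketched in plain Python. This is an illustration under stated assumptions, not the code in ema/cli/switch_geneview.py: rank_top_genes_sketch is a hypothetical stand-in for rank_top_genes that assumes a higher volcano value ranks first and keeps each gene's best score across pairs. The real scoring may differ, and the rows here are made up.

```python
from typing import Dict, Iterable, List, Optional


def normalise_ids(raw_ids: Iterable[str]) -> List[str]:
    """Uppercase and strip explicit gene IDs, dropping empty entries."""
    return [g.strip().upper() for g in raw_ids if g.strip()]


def rank_top_genes_sketch(rows: Iterable[Dict[str, object]], n: int) -> List[str]:
    """Hypothetical stand-in for rank_top_genes (assumes higher volcano = better)."""
    best: Dict[str, float] = {}
    for row in rows:
        gene = str(row["gene_id"])
        score = float(row["volcano"])
        # Keep the best score seen for each gene across all cluster pairs.
        if gene not in best or score > best[gene]:
            best[gene] = score
    return sorted(best, key=lambda g: best[g], reverse=True)[:n]


def select_genes(explicit: Iterable[str],
                 diff_rows: Optional[Iterable[Dict[str, object]]] = None,
                 top_genes: int = 10) -> List[str]:
    explicit_genes = normalise_ids(explicit)
    if not explicit_genes and diff_rows is None:
        # Mirrors the documented rule that one of the two inputs is required.
        raise ValueError("provide at least one of --diff-tsv or --gene-id")
    auto_genes = rank_top_genes_sketch(diff_rows, top_genes) if diff_rows is not None else []
    # dict.fromkeys keeps the first occurrence, so explicit genes stay in front.
    return list(dict.fromkeys(explicit_genes + auto_genes))


rows = [
    {"cluster1": "0", "cluster2": "1", "gene_id": "MYC", "volcano": 5.0},
    {"cluster1": "0", "cluster2": "1", "gene_id": "GAPDH", "volcano": 9.0},
    {"cluster1": "0", "cluster2": "2", "gene_id": "ACTB", "volcano": 7.5},
    {"cluster1": "0", "cluster2": "2", "gene_id": "MYC", "volcano": 2.0},
]
selected = select_genes([" actb "], rows, top_genes=2)
print(selected)  # ['ACTB', 'GAPDH']
```

ACTB appears once even though it is both explicit and auto-ranked, and it keeps its position at the front of the list.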
Output files
All figure files are written to <out_dir>/figures/.
figures/gene_<GENE_ID>.<ext>
One file per gene per engine/format combination. The base name is always
gene_<GENE_ID> where <GENE_ID> is the gene identifier as normalised (uppercase).
Extensions: .png (matplotlib), .svg (matplotlib with svg format), .html
(plotly). Source: ema/cli/switch_geneview.py line 269.
figures/meta.json
Written by ema.viz._meta.write_figures_index. JSON manifest of all figure
files in the directory with their paths and the originating command
("ema switch geneview"). Failures to write this file are caught and logged
as warnings (the figures are not affected).
figures/figures_INDEX.md
Markdown index of figure files in the figures/ directory, generated alongside
meta.json. Suitable for inclusion in a report or MkDocs documentation site.
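Because figure files follow the gene_<GENE_ID>.<ext> naming described above, a short script can summarise which formats were written per gene. The helper below is a hypothetical convenience sketch; it relies only on the file naming, not on the contents of meta.json, whose schema is not described here. The temporary directory stands in for a real <out_dir>/figures/.

```python
import tempfile
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

FIGURE_EXTS = {".png", ".svg", ".html"}


def figures_by_gene(figures_dir: Path) -> Dict[str, List[str]]:
    """Map each gene ID to the sorted list of figure extensions found for it."""
    grouped = defaultdict(list)
    for path in sorted(figures_dir.glob("gene_*")):
        if path.suffix in FIGURE_EXTS:
            # Strip the fixed "gene_" prefix; the rest of the stem is the gene ID.
            grouped[path.stem[len("gene_"):]].append(path.suffix)
    return {gene: sorted(exts) for gene, exts in grouped.items()}


# Demonstration on placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    figures = Path(tmp)
    for name in ("gene_ACTB.png", "gene_ACTB.html", "gene_MYC.png", "meta.json"):
        (figures / name).touch()
    grouped = figures_by_gene(figures)

print(grouped)  # {'ACTB': ['.html', '.png'], 'MYC': ['.png']}
```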
How it relates to other commands
- ema run: produces clusters.h5ad and pasbed.bed. Gene annotation in adata.var["gene_id"] is required.
- ema switch diff: produces the differential/*.tsv files consumed via --diff-tsv for auto gene ranking.
- ema switch length: produces the PDUI / proportion / entropy TSV consumed via --length-tsv for score overlay.
See also
- Strategy pages in ../strategies/: details on the volcano ranking and gene-track rendering.
- Tutorial in ../tutorials/: step-by-step gene-track visualisation guide.