Core Capabilities

Virus-Mapper is more than an annotator. It's a pangenome interrogation engine designed to deliver precise, actionable insights from your variant data.

Pangenome-Powered Classification

This is the central function of Virus-Mapper. Rather than inferring whether a variant is real, the tool checks directly whether the alternate allele is present in the pangenome graph. The result is written to the INFO field of your VCF as one of two values of the G_ALLELE_STATE tag:

  • KNOWN: The exact alternate allele exists in the graph.
  • NOVEL: The alternate allele is not found in the graph.

# Your input VCF record
chr1    12345   .   A   G   .   .   ...

# The annotated output from Virus-Mapper
chr1    12345   .   A   G   .   .   G_ALLELE_STATE=KNOWN;...
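
Downstream scripts can read the classification straight from the INFO column. The following is a minimal Python sketch, assuming the tag is written as a `G_ALLELE_STATE=<value>` pair as in the record above. The helper names `parse_info` and `allele_state`, and the fully spelled-out example record, are illustrative and not part of Virus-Mapper.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO string into a dict; flag-style keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def allele_state(vcf_line: str):
    """Return the G_ALLELE_STATE value of one tab-separated VCF record, or None."""
    columns = vcf_line.rstrip("\n").split("\t")
    info = columns[7]  # INFO is the eighth column of a VCF record
    return parse_info(info).get("G_ALLELE_STATE")


# Hypothetical, fully spelled-out version of the annotated record above.
record = "chr1\t12345\t.\tA\tG\t.\t.\tG_ALLELE_STATE=KNOWN"
print(allele_state(record))
```

Records without the tag yield `None`, which makes it easy to spot lines that were not annotated.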

Haplotype & Sample Resolution

Knowing a variant is present in the pangenome is only half the story. Virus-Mapper tells you exactly which samples or reference paths in the graph carry that variant.

This is critical for tracking variant inheritance, identifying cohort-specific alleles, or confirming a variant's presence on an emerging viral lineage.

# An output record showing the allele is present in two haplotypes
... G_ALLELE_STATE=KNOWN;G_HAPLOTYPES=HG002,NA19240
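
As a sketch of how this field might be used downstream, the snippet below counts how many variants each haplotype carries. It assumes `G_HAPLOTYPES` is a comma-separated list, as in the record above; the `haplotypes` helper and the sample INFO strings are hypothetical.

```python
from collections import Counter


def haplotypes(info: str) -> list:
    """Return the names listed in G_HAPLOTYPES, or an empty list if absent."""
    for item in info.split(";"):
        key, _, value = item.partition("=")
        if key == "G_HAPLOTYPES" and value:
            return value.split(",")
    return []


# Hypothetical INFO strings in the format shown above.
infos = [
    "G_ALLELE_STATE=KNOWN;G_HAPLOTYPES=HG002,NA19240",
    "G_ALLELE_STATE=KNOWN;G_HAPLOTYPES=HG002",
    "G_ALLELE_STATE=NOVEL",
]

# Count, per haplotype, the number of records whose allele it carries.
carriers = Counter(name for info in infos for name in haplotypes(info))
print(carriers)
```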

Powerful & Intuitive Filtering

Zero in on the variants that matter. Virus-Mapper includes a point-and-click filter builder that lets you construct complex queries without needing to memorize syntax.

Power users can instead write filter expressions directly (similar to those used in `bcftools`) to filter by quality, INFO fields, or other VCF fields before the pangenome analysis begins.

# Build complex filters with a simple UI
# Example: Find high-quality variants with sufficient read depth
QUAL >= 50 && INFO.DP > 20
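
To make the semantics concrete, the filter above keeps a record only when both conditions hold. The Python sketch below expresses the same predicate for illustration only. `passes_filter` is a hypothetical helper and says nothing about how Virus-Mapper evaluates expressions internally; treating missing values as failing the filter is an assumption.

```python
def passes_filter(qual, info: dict, min_qual: float = 50.0, min_depth: int = 20) -> bool:
    """Mirror `QUAL >= 50 && INFO.DP > 20`; missing values fail (an assumption)."""
    if qual in (None, ".") or "DP" not in info:
        return False
    return float(qual) >= min_qual and int(info["DP"]) > min_depth


print(passes_filter("60.5", {"DP": "35"}))  # both conditions hold
print(passes_filter("60.5", {"DP": "20"}))  # depth is not strictly greater than 20
print(passes_filter(".", {"DP": "35"}))     # QUAL is missing
```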

Integrated Viral Gene Annotation

When working with viral genomes, context is everything. With the annotation feature enabled, Virus-Mapper automatically cross-references variant positions against the NCBI RefSeq viral database.

You can see at a glance whether a variant falls within a critical gene (such as Spike) or another annotated feature, which adds biological context to your results.

# A variant falling within the Spike (S) gene's coding sequence (CDS)
... G_ALLELE_STATE=KNOWN;V_GENE=S;V_FEAT=CDS
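
One way to use these tags downstream is to group allele states by gene and feature. The sketch below assumes the `V_GENE` and `V_FEAT` tags appear as shown above; the `info_dict` helper and the sample INFO strings (including the `N` gene entry) are hypothetical.

```python
from collections import defaultdict


def info_dict(info: str) -> dict:
    """Parse key=value pairs from a VCF INFO string (flags are ignored)."""
    return dict(item.split("=", 1) for item in info.split(";") if "=" in item)


# Hypothetical INFO strings; only V_GENE=S and V_FEAT=CDS appear in the example above.
infos = [
    "G_ALLELE_STATE=KNOWN;V_GENE=S;V_FEAT=CDS",
    "G_ALLELE_STATE=NOVEL;V_GENE=S;V_FEAT=CDS",
    "G_ALLELE_STATE=KNOWN;V_GENE=N;V_FEAT=CDS",
    "G_ALLELE_STATE=NOVEL",
]

# Collect allele states per (gene, feature) pair; unannotated records are skipped.
by_gene = defaultdict(list)
for info in infos:
    fields = info_dict(info)
    if "V_GENE" in fields:
        by_gene[(fields["V_GENE"], fields.get("V_FEAT"))].append(fields["G_ALLELE_STATE"])

print(dict(by_gene))
```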

Privacy by Design: Your Data Stays Yours

In an era of cloud computing, Virus-Mapper takes a different approach. All analysis, from file parsing to pangenome interrogation, happens entirely on your local machine, inside the Docker container.

Your sensitive genomic data is never uploaded, transmitted, or shared. This local-first architecture guarantees complete confidentiality and control, making it suitable for clinical, proprietary, or any other sensitive research.

[ Your Computer ]
   |
   +-- [ Docker Container ]
   |      |
   |      +-- [ Virus-Mapper App ] <--> [ Your Local Files ]
   |
   (No data ever leaves)
   x---------------------> [ The Internet ]

Engineered for Speed & Scale 🚀

Built in Rust, Virus-Mapper is designed to maximize performance by using all available CPU cores and keeping the pangenome graph in memory.

  • Parallel Processing: The analysis of VCF records is automatically distributed across all your computer's CPU cores, dramatically reducing analysis time.
  • In-Memory Graph: The entire pangenome graph (GFA/ODGI) is loaded into RAM for the fastest possible access. Multi-gigabyte graphs are fully supported, provided you have enough available RAM (typically 1.5x-2x the ODGI file size).

Memory Usage Note: While the graph is the main user of RAM, the number of variants in your VCF file also impacts memory. All results are collected in memory before the final output file is written. Based on testing, a 50 MB VCF file can cause peak memory usage to climb by over 1.5 GB during the results collection phase. Please be mindful of this when processing very large VCF files on systems with limited RAM.

[ VCF File ] ----> | CPU 1: Process chunk 1 | ----> [ Results ]
                 | CPU 2: Process chunk 2 |         (in RAM)
                 | CPU 3: Process chunk 3 |
                 | ...                    |
                 | CPU N: Process chunk N |
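
The memory guidance above can be turned into a rough back-of-the-envelope estimate before a run. The sketch below is not a measurement. It takes the 1.5x-2x graph factor from the text and assumes, purely for illustration, that results memory scales linearly from the single reported data point (about 1.5 GB per 50 MB of VCF). The function name `estimate_peak_ram_gb` is hypothetical.

```python
def estimate_peak_ram_gb(odgi_size_gb: float, vcf_size_mb: float) -> tuple:
    """Return a rough (low, high) peak RAM estimate in GB.

    Graph: 1.5x-2x the ODGI file size (from the guidance above).
    Results: 1.5 GB per 50 MB of VCF, scaled linearly (an assumption).
    """
    results_gb = 1.5 * (vcf_size_mb / 50.0)
    return (1.5 * odgi_size_gb + results_gb, 2.0 * odgi_size_gb + results_gb)


# Example: a 4 GB ODGI graph and a 100 MB VCF.
low, high = estimate_peak_ram_gb(odgi_size_gb=4.0, vcf_size_mb=100.0)
print(f"Estimated peak RAM: {low:.1f}-{high:.1f} GB")
```

Since the reported figure is "over 1.5 GB", treat the result as a lower bound and leave headroom.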

Powered by a High-Performance Core

Virus-Mapper's speed and pangenome capabilities are made possible by the state-of-the-art ODGI library. To bring this power to a modern web application, we developed odgi-ffi.

This new Rust crate serves as a safe, high-performance bridge between the Rust backend and the C++ ODGI library. This architecture gives you the best of both worlds: the memory safety and concurrency of Rust with the battle-tested performance of ODGI's pangenome algorithms.

Learn more about the crate: odgi-ffi on crates.io.

[ Virus-Mapper Web App (Rust) ]
           ↓
[ odgi-ffi (Safe Rust Bridge) ]
           ↓
[ ODGI Core Engine (C++) ]

Built for Modern Research

Feature | Description
Robust SV Support | Accurately classifies DEL, INS, and DUP variants alongside SNPs and indels by searching for their exact sequence in the graph. Note: INV support is currently limited.
Advanced Variant Filtering | Pre-filter variants using an intuitive UI or standard VCF expressions before running the main analysis.
User-Friendly Web UI | No command-line expertise needed. All analysis is managed through an intuitive local web interface.
100% Private & Secure | All processing is performed locally on your machine. Your data is never uploaded, ensuring complete confidentiality.
Pre-flight Validation | Automatically checks for common errors like chromosome name mismatches and unsorted VCFs before analysis begins.
High-Performance Backend | Powered by Rust and the ODGI library for fast, memory-efficient graph operations.
Batch Processing | Analyze multiple VCF files in a single run, with results delivered as a convenient ZIP archive.
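
To illustrate the kind of problems pre-flight validation looks for, here is a minimal Python sketch of two such checks. It assumes that a "chromosome name mismatch" means a VCF contig name with no matching path name in the graph, and that a sorted VCF keeps each chromosome's records contiguous with non-decreasing positions. The `preflight` function is hypothetical and is not Virus-Mapper's actual validator.

```python
def preflight(records, graph_paths):
    """Return a list of problems found in (chrom, pos) pairs given in file order.

    graph_paths: set of path names present in the graph (assumed to be known).
    """
    problems = []
    unknown = set()    # unmatched chromosome names already reported
    finished = set()   # chromosomes whose block of records has already ended
    current, last_pos = None, 0
    for chrom, pos in records:
        if chrom not in graph_paths and chrom not in unknown:
            unknown.add(chrom)
            problems.append(f"chromosome '{chrom}' not found in graph")
        if chrom != current:
            if chrom in finished:
                problems.append(f"records for '{chrom}' are not contiguous")
            if current is not None:
                finished.add(current)
            current, last_pos = chrom, 0
        if pos < last_pos:
            problems.append(f"{chrom}:{pos} appears after position {last_pos}")
        last_pos = pos
    return problems


# Hypothetical input: one out-of-order record and one name the graph lacks.
issues = preflight([("chr1", 100), ("chr1", 50), ("1", 10)], {"chr1"})
for issue in issues:
    print(issue)
```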