tools to support genome and metagenome analysis

genome-grist - map Illumina metagenomes to GenBank genomes

License: 3-Clause BSD

  1. download a metagenome
  2. process it into trimmed reads, and make a sourmash signature
  3. search the sourmash signature with 'gather' against sourmash databases, e.g. all of GenBank
  4. download the matching genomes from GenBank
  5. map all metagenome reads to the matching genomes using minimap (map_reads and extract_mapped_reads)
  6. extract matching reads iteratively in gather order, at each step eliminating reads that already matched an earlier gather match (extract_gather)
  7. map the "leftover" reads at each step to the genomes (map_gather)
  8. summarize all mapping results for comparison and graphing (summarize_gather)
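
The iterative extraction in step 6 can be illustrated with a minimal sketch. This is not genome-grist's actual code: the function name, the genome and read identifiers, and the in-memory dictionary of mapping hits are all hypothetical, chosen only to show how each read ends up assigned to the first genome (in gather rank order) that it maps to.

```python
def assign_reads_iteratively(gather_order, read_hits):
    """Assign each read to the first genome, in gather order, that it maps to.

    gather_order: list of genome IDs, best gather match first (hypothetical IDs).
    read_hits: dict of genome ID -> set of read names mapping to that genome.
    Returns a dict of genome ID -> set of reads newly claimed by that genome.
    """
    claimed = set()   # reads already taken by an earlier gather match
    assigned = {}
    for genome in gather_order:
        hits = read_hits.get(genome, set())
        new_reads = hits - claimed   # drop reads matched to earlier genomes
        assigned[genome] = new_reads
        claimed |= new_reads
    return assigned


# Toy data: r2 and r3 map to both genomeA and genomeB, r4 to genomeB and genomeC.
gather_order = ["genomeA", "genomeB", "genomeC"]
read_hits = {
    "genomeA": {"r1", "r2", "r3"},
    "genomeB": {"r2", "r3", "r4"},
    "genomeC": {"r4", "r5"},
}

assigned = assign_reads_iteratively(gather_order, read_hits)
for genome in gather_order:
    print(genome, sorted(assigned[genome]))
```

In this toy case genomeA keeps r1, r2 and r3, so genomeB is left with only r4 and genomeC with only r5. Comparing these "leftover" counts with the full per-genome mapping from step 5 is the kind of comparison the summary step is meant to support.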

Why the name grist?

It follows the sourmash family of names (sourmash, wort, distillerycats, etc.).

NOT: https://en.wikipedia.org/wiki/Grist_(computing)

THIS: https://en.wikipedia.org/wiki/Grist

Leftover text

podar ref genomes

Snakefile based on @luizirber code

Genome URL generation code

download SRA code
