Getting started


Latch’s Snakemake integration lets developers build graphical interfaces that expose their workflows to wet lab teams. It also provides managed cloud infrastructure for executing the workflow’s jobs.

A primary design goal for the Snakemake integration is to allow developers to register existing projects with minimal added boilerplate and modifications to code. Here, we outline these changes and why they are needed.

How to Upload a Snakemake Workflow

Recall that a Snakemake project consists of a Snakefile, which describes workflow rules in an “extension” of Python, and associated Python code imported and called by these rules. To make this project compatible with Latch, we need to do the following:

  1. Identify and construct explicit parameters for each file dependency in a metadata file

  2. Build a container with all runtime dependencies

  3. Ensure your Snakefile is compatible with cloud execution

In this guide, we will walk through how you can upload a simple Snakemake workflow to Latch.

The example being used here comes from the short tutorial in Snakemake’s documentation.


Before you begin, install the Latch SDK with Snakemake support:

pip install "latch[snakemake]"

Step 1: Initialize an example workflow

First, initialize an example Snakemake workflow:

latch init snakemake-wf --template snakemake

The generated workflow contains what is typically seen in a Snakemake project, such as Python scripts and a Snakefile.

├── Dockerfile # Latch specific
├── Snakefile
├── data
│   ├── genome.fa
│   ├── genome.fa.amb
│   ├── genome.fa.ann
│   ├── genome.fa.bwt
│   ├── genome.fa.fai
│   ├── genome.fa.pac
│   ├──
│   └── samples
│       ├── A.fastq
│       ├── B.fastq
│       └── C.fastq
├── environment.yaml
├── scripts
│   ├── __pycache__
│   │   └── plot-quals.cpython-39.pyc
│   └──
├── version
└── wf

To make the workflow executable on Latch, two additional files are needed:

  • A Dockerfile to specify the dependencies the workflow needs to run

  • A metadata file to specify the workflow parameters to expose on the user interface

In this tutorial, we will walk through how these two files can be constructed.
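Before going further, it can help to confirm that the project directory contains the files this guide refers to. The following is a minimal sketch using only the Python standard library; the helper name `missing_files` is hypothetical, and the metadata file name is left to the caller through `extra` because this sketch does not assume it.

```python
from pathlib import Path

# File names taken from the directory listing in this guide.
REQUIRED = ("Snakefile", "Dockerfile", "environment.yaml")


def missing_files(project_dir, extra=()):
    """Return the expected file names that are absent from project_dir.

    `extra` lets the caller add further names to check, for example
    the metadata file used by the project.
    """
    root = Path(project_dir)
    return [name for name in (*REQUIRED, *extra) if not (root / name).is_file()]
```

Calling `missing_files("snakemake-wf")` right after initialization should return an empty list if the template produced all three files.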

Step 2: Add a metadata file

The metadata file specifies the input parameters that the Snakemake workflow needs to run.

For example, by examining the Snakefile, we determine there are two parameters that the workflow needs: a reference genome and a list of samples to be aligned against the reference genome.
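To see what the "list of samples" looks like in practice, the sample names can be derived from the file names in the samples directory. This is only an illustrative sketch: the helper `sample_names` is hypothetical, and it assumes each sample is a single file ending in `.fastq`, as in the example project's `data/samples` folder.

```python
from pathlib import Path


def sample_names(samples_dir, suffix=".fastq"):
    """Return the sorted sample names found in samples_dir.

    A sample name is the file name with `suffix` removed, so
    `A.fastq` yields `A`. Files with other suffixes are ignored.
    """
    return sorted(
        p.name[: -len(suffix)]
        for p in Path(samples_dir).iterdir()
        if p.is_file() and p.name.endswith(suffix)
    )
```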

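from pathlib import Path
from latch.types.directory import LatchDir
from latch.types.metadata import LatchAuthor, SnakemakeFileParameter, SnakemakeMetadata
# The display name, author, and `path` values are illustrative placeholders; match `path` to your Snakefile inputs.
SnakemakeMetadata(
    display_name="snakemake_tutorial_workflow",
    author=LatchAuthor(name="Your Name"),
    parameters={
        "samples": SnakemakeFileParameter(
            display_name="Sample Input Directory",
            description="A directory full of FastQ files",
            type=LatchDir,
            path=Path("data/samples"),
        ),
        "ref_genome": SnakemakeFileParameter(
            display_name="Indexed Reference Genome",
            description="A directory with a reference Fasta file and the 6 index files produced from `bwa index`",
            type=LatchDir,
            path=Path("genome"),
        ),
    },
)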

For each LatchFile/LatchDir parameter, the path keyword specifies the path where files will be copied before the Snakemake workflow is run and should match the paths of the inputs for each rule in the Snakefile.
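One way to sanity-check this requirement is to confirm that every rule input in the Snakefile falls under the `path` of some parameter. The sketch below does this with plain path comparison and does not use the Latch SDK; the helper name `owning_parameter` and the example paths in the usage note are hypothetical.

```python
from pathlib import PurePosixPath


def owning_parameter(rule_input, param_paths):
    """Return the name of the parameter whose path covers rule_input.

    `param_paths` maps parameter names to their `path` values. A path
    covers a rule input if it equals the input or is one of its parent
    directories. Returns None when no parameter covers the input.
    """
    target = PurePosixPath(rule_input)
    for name, base in param_paths.items():
        base_path = PurePosixPath(base)
        if target == base_path or base_path in target.parents:
            return name
    return None
```

For example, with `{"samples": "data/samples"}`, the rule input `data/samples/A.fastq` is covered by `samples`, while an input under an unrelated directory returns `None` and signals a path that would not be copied in before the run.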

If your Snakemake project has an existing config.yaml file, you can automatically generate the metadata file by typing:

latch generate-metadata <path_to_config.yaml>

Step 3: Add dependencies

Next, create an environment.yaml file to specify the dependencies that the Snakefile needs to run successfully:

# environment.yaml
channels:
  - bioconda
  - conda-forge
dependencies:
  - snakemake=7.25.0
  - jinja2
  - matplotlib
  - graphviz
  - bcftools =1.15
  - samtools =1.15
  - bwa =0.7.17
  - pysam =0.19
  - pygments
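Because the workflow image is built from this file, unpinned packages can resolve to different versions on later builds. The following sketch lists which dependency entries carry no version; the helpers `split_pin` and `unpinned` are hypothetical and only handle the simple `name=version` and `name =version` forms used above, not full conda match specifications.

```python
def split_pin(spec):
    """Split a spec such as 'bwa =0.7.17' into (name, version).

    The version is None when the spec has no '=' in it.
    """
    name, sep, version = spec.partition("=")
    if not sep:
        return name.strip(), None
    # lstrip("=") also tolerates the 'name==version' spelling.
    return name.strip(), version.lstrip("=").strip()


def unpinned(specs):
    """Return the names of the specs that do not pin a version."""
    return [name for name, version in map(split_pin, specs) if version is None]
```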

A Dockerfile can be automatically generated by typing:

latch dockerfile snakemake-wf --snakemake

Step 4: Upload the workflow to Latch

Finally, run the following commands to register the workflow with Latch:

cd snakemake-wf && \
latch register . --snakefile Snakefile

During registration, a workflow image is built from the dependencies specified in the environment.yaml file. Once registration finishes, stdout provides a link to your workflow on Latch.

Snakemake workflow interface on Latch

Step 5: Run the workflow

Snakemake support currently uses JIT (Just-In-Time) registration. This means that running the workflow produced by latch register registers a second workflow, which runs the actual Snakemake jobs.

Once the workflow finishes running, results will be deposited to Latch Data under the Snakemake Outputs folder.
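The command-line steps in this guide can be collected into a small script for repeatability. This is a sketch, not an official tool: the helper names `guide_commands` and `run_all` are hypothetical, the commands and flags are copied from the steps above, and the default `dry_run=True` only prints each command instead of executing it.

```python
import shlex
import subprocess


def guide_commands(project="snakemake-wf", snakefile="Snakefile"):
    """Return (argv, cwd) pairs for the CLI steps in this guide, in order."""
    return [
        (["latch", "init", project, "--template", "snakemake"], None),
        (["latch", "dockerfile", project, "--snakemake"], None),
        # Registration is run from inside the project directory.
        (["latch", "register", ".", "--snakefile", snakefile], project),
    ]


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for argv, cwd in commands:
        print(f"[{cwd or '.'}] {shlex.join(argv)}")
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(argv, cwd=cwd, check=True)
```

Running `run_all(guide_commands())` prints the three commands without touching your account; pass `dry_run=False` only once the printed commands look right.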

Next Steps

  • Learn more about the lifecycle of a Snakemake workflow on Latch by reading our manual.

  • Learn about how to modify Snakemake workflows to be cloud-compatible here.

  • Visit troubleshooting to diagnose and find solutions to common issues.

  • Visit the repository of public examples of Snakemake workflows on Latch.