Workflow Environment#

Workflow code is rarely free of dependencies. It may require python or system packages or make use of environment variables. For example, a task that downloads compressed reference data from AWS S3 will need the aws-cli and unzip APT packages, then use the pyyaml python package to read the included metadata.

The workflow environment is encapsulated in a Docker container, which is created from a recipe defined in a text document named Dockerfile.. Latch provides four baseline environments which each latch workflow inherits from. In most cases, modifying the Dockefile manually is unnecessary, so Latch will automatically generate one using conventional dependency lists and heuristics. To use a handwritten Dockerfile, run the eject command.

Automatic Dockerfile Generation#

Below is the list of files used when auto-generating Dockerfiles.

If auto-generation does not cover your use case, please open a suggestion on GitHub.

Python: requirements.txt#

Dependencies from a requirements.txt file will be automatically installed using pip install --requirement.

Example File

Generated Docker Commands
copy requirements.txt /opt/latch/requirements.txt
run pip install --requirement /opt/latch/requirements.txt

Python:, PEP-621 pyproject.toml#

Workflows with a package specification in a file or a PEP-621 pyproject.toml file will be automatically installed using pip install --editable

Poetry pyproject.toml files are not supported.

Example File
from setuptools import setup


Generated Dockerfile Commands
copy . /root/
run pip install --editable /root/

System/Python: Conda environment.yaml#

The Conda environment in an environment.yaml file will be automatically created using conda env create --file with latest miniconda. The environment will be activated by default.

Example File
name: workflow
  - conda-forge
  - defaults
  - python=3.7
  - bwakit=0.7.17
  reference: ~/covid19

Generated Dockerfile Commands
env CONDA_DIR /opt/conda

run apt-get update --yes && \
    apt-get install --yes curl && \
    curl --remote-name && \
    mkdir /root/.conda && \
    # docs for -b and -p flags:
    bash -b -p /opt/conda && \
    rm -f

copy environment.yaml /opt/latch/environment.yaml
run conda env create --file /opt/latch/environment.yaml --name workflow

shell ["conda", "run", "--name", "workflow", "/bin/bash", "-c"]
run pip install --upgrade latch

R: environment.R#

Any script in an environment.R file will be automatically executed when the workflow is built. This is intended for installing dependencies but there are no actual limits on what the script does.

Currently only R 4.0.0 is supported.

Note that some R packages may have system dependencies that need to be installed using APT or another method. These packages will list these dependencies in their documentation. Missing dependencies will cause crashes during workflow build or when using the packages.

Example File


Generated Dockerfile Commands
run apt-get update --yes && \
    apt-get install --yes software-properties-common && \
    add-apt-repository "deb buster-cran40/" && \
    apt-get install --yes r-base r-base-dev libxml2-dev libcurl4-openssl-dev libssl-dev wget

copy environment.R /opt/latch/environment.R
run Rscript /opt/latch/environment.R

System: APT#

Dependencies from a system-requirements.txt text document will be automatically installed using apt-get install --yes

Example File

Generated Dockerfile Commands
copy system-requirements.txt /opt/latch/system-requirements.txt
run apt-get update --yes && \
    xargs apt-get install --yes < /opt/latch/system-requirements.txt

Environment Variables#

Environment variables from an .env text document will be automatically set in the workflow environment.

Example File

Generated Dockerfile Commands
env BOWTIE2_INDEXES="reference"
env PATH="/root/bowtie2:$PATH"

Example of Auto-generated Dockerfile#

The following Dockerfile is generated in the subprocess template (using latch init --template subprocess --dockerfile example_workflow):

# latch base image + dependencies for latch SDK --- removing these will break the workflow
run pip install latch==2.12.1
run mkdir /opt/latch

# install system requirements
copy system-requirements.txt /opt/latch/system-requirements.txt
run apt-get update --yes && xargs apt-get install --yes </opt/latch/system-requirements.txt

# copy all code from package (use .dockerignore to skip files)
copy . /root/

# set environment variables
env BOWTIE2_INDEXES=reference

# latch internal tagging system + expected root directory --- changing these lines will break the workflow
arg tag
workdir /root

Note on Python Requirements#

The order of python requirement installation is as follows

  1. conda

  2. / pyproject.toml

  3. requirements.txt

Consequently, a package specified in the requirements.txt file will overwrite a previous install of the same packaged installed by the conda environment.

Ejecting Auto-generation#

The auto-generated Dockerfile can be saved to the workflow root using latch dockerfile <path to workflow root>. Subsequent latch register and latch develop commands will use the saved version. This also disables automatic generation so no dependency files will be used and changes in these files will not have any effect.

To start with a custom Dockerfile, the --dockerfile option for latch init can be used.

This can be used to switch to a more complicated handwritten Dockerfile or to debug any issues with auto-generation. Removing the Dockerfile will re-enable automatic generation.

If you use ejection because auto-generation does not cover your use case, please open a suggestion on GitHub.

Excluding Files#

By default, all files in the workflow root directory are included in the workflow build. Any unnecessary files will increase the resulting workflow container image size and increase registration and startup time proportional to their size.

To exclude files from the build use a .dockerignore. Files can be specified one at a time or using glob patterns.

The default .dockerignore includes files auto-generated by Latch.

GPU Task Limitations#

Commands that require certain kernel capabilities will fail with “Permission denied” in GPU tasks (small-gpu-task, large-gpu-task). This includes mount and chroot among others.