Workflow Engines Overview

Everyone's computing needs are different, so we ensured that quacc is interoperable with a variety of modern workflow management tools. There are 300+ workflow management tools out there, so we can't possibly support them all. Instead, we have focused on a select few that take a similar decorator-based approach to defining workflows and offer substantial support for HPC systems.
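To give a rough sense of what a decorator-based approach looks like, here is a minimal pure-Python sketch. It is an illustration only: `task`, `LazyCall`, `add`, and `multiply` are hypothetical names made up for this example, not the API of quacc or of any engine listed on this page. The idea is that decorating a function changes what calling it does: the call records the work and its dependencies, and nothing runs until a result is requested.

```python
from functools import wraps


class LazyCall:
    """A deferred function call; its arguments may be other LazyCall objects."""

    def __init__(self, func, args, kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def result(self):
        # Resolve upstream calls first, then run this one.
        args = [a.result() if isinstance(a, LazyCall) else a for a in self.args]
        kwargs = {
            k: v.result() if isinstance(v, LazyCall) else v
            for k, v in self.kwargs.items()
        }
        return self.func(*args, **kwargs)


def task(func):
    """Hypothetical decorator: calling the function records work instead of running it."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        return LazyCall(func, args, kwargs)

    return wrapper


@task
def add(a, b):
    return a + b


@task
def multiply(a, b):
    return a * b


def workflow(a, b, c):
    # Ordinary Python composition defines the dependency graph.
    return multiply(add(a, b), c)


pending = workflow(1, 2, 3)  # nothing has been computed yet
value = pending.result()     # runs add first, then multiply
```

A real engine would hand the recorded calls to a scheduler or executor rather than evaluating them in-process, but the user-facing pattern of decorating plain functions and composing them is the common thread.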

Note

We plan to further enhance support for Prefect and Jobflow starting in fall 2025.

Summary

Recommendations

Not sure which to choose? In general, we recommend starting with Parsl for most HPC users. For a more feature-rich workflow orchestration platform, we recommend trying Prefect or Jobflow depending on your needs. Some additional opinions on the matter:

  • Covalent: You want a visual dashboard and are prioritizing the use of distributed compute resources, especially cloud compute.
  • Dask: You are already familiar with the Dask ecosystem and are happy to stick with it.
  • Parsl: You want to run many workflows as fast as possible on one or more job scheduler-based HPC machines.
  • Prefect: You want a visual dashboard with a robust workflow management platform and are familiar with the basic concepts of workflow orchestration.
  • Redun: You are running calculations on AWS.
  • Jobflow: You are familiar with using MongoDB.

Covalent is a user-friendly workflow management solution from the company Agnostiq.

Pros:

  • Excellent visual dashboard for job monitoring
  • Easy to use in distributed, heterogeneous compute environments
  • Excellent documentation
  • Automatic and simple database integration
  • The compute nodes do not need to be able to connect to the internet

Cons:

  • Requires a centralized server to run continuously to manage the workflows, unless you use Covalent Cloud
  • Support for job scheduler HPC environments is available but not as robust or performant as other solutions
  • High-security HPC environments may be difficult to access via SSH with the centralized server approach
  • Not as widely used as other workflow management solutions

Dask is a popular parallel computing library for Python. We use Dask Delayed for lazy function execution, Dask Distributed for distributed compute, and (optionally) Dask-Jobqueue for orchestrating the execution on HPC machines.

Pros:

  • Extremely popular
  • Has native support for running on HPC resources
  • It does not involve a centralized server or network connectivity
  • Supports adaptive scaling of compute resources
  • The dashboard to monitor resource usage is very intuitive

Cons:

  • If the Dask cluster dies, there is no mechanism to gracefully recover the workflow history
  • Monitoring job progress is more challenging and less detailed than other solutions
  • The documentation, while comprehensive, can be difficult to follow given the various Dask components
  • Calculations cannot be submitted remotely or across disparate compute resources

Parsl is a workflow management solution out of Argonne National Laboratory, the University of Chicago, and the University of Illinois. It is well-adapted for running on virtually any HPC environment with a job scheduler.

Pros:

  • Extremely configurable and deployable for virtually any HPC environment
  • Quite simple to define the workflows and run them from a Jupyter Notebook
  • Thorough documentation and active user community across academia
  • Well-suited for pilot jobs and advanced queuing schemes
  • Does not rely on maintaining a centralized server

Cons:

  • The number of different terms can be slightly overwhelming to those less familiar with HPC
  • Monitoring job progress is more challenging and less detailed than other solutions
  • Debugging failed workflows can be difficult
  • The pilot job model is new to many HPC users and takes some time to understand
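For readers new to the pilot job model mentioned above, the following standard-library sketch may help as an analogy. It is not how Parsl is implemented, and `pilot`, `square`, and `n_workers` are hypothetical names chosen for illustration. The point it tries to convey is that a single long-lived allocation (here, a pool of threads standing in for one scheduler job) pulls many small tasks from a queue, so each task does not have to wait in the scheduler queue on its own.

```python
import queue
import threading


def pilot(task_queue, n_workers=4):
    """Stand-in for one scheduler job that holds its resources and drains many tasks."""
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                func, arg = task_queue.get_nowait()
            except queue.Empty:
                return  # no work left, so this worker exits
            value = func(arg)
            with lock:  # guard the shared results list
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def square(x):
    return x * x


# Queue up ten small tasks; a single "pilot" processes all of them.
tasks = queue.Queue()
for i in range(10):
    tasks.put((square, i))

results = pilot(tasks)
```

The tasks may finish in any order, so `results` should be treated as unordered. In a real deployment the workers would be processes started inside a job scheduler allocation rather than threads in one Python process.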

Prefect is a workflow orchestration tool that is popular in the data engineering community. It has an excellent dashboard for monitoring workflows.

Pros:

  • Quite popular among the data engineering community
  • Excellent web-based dashboard for monitoring workflow progress
  • The free version of Prefect Cloud is reasonably generous
  • Can use advanced queueing schemes to manage workflow execution
  • New features are being added regularly and rapidly

Cons:

  • For those who are less HPC-savvy, some of the concepts can be quite technical
  • If using Prefect Cloud, the compute nodes must have a network connection
  • The dashboard, while useful for monitoring successes and failures, is not ideal for analyzing results
  • The software is geared more towards data engineering than scientific computing, and that is reflected in the features and documentation

Redun is a flexible workflow management program developed by Insitro.

Pros:

  • Extremely simple syntax for defining workflows
  • Has strong support for task/result caching
  • Useful CLI-based monitoring system
  • Very strong AWS support

Cons:

  • Currently lacks support for typical HPC job schedulers
  • No user-friendly GUI for job monitoring
  • Does not have a particularly active user community
  • Not updated frequently

Jobflow is developed and maintained by the Materials Project team at Lawrence Berkeley National Laboratory and serves as a seamless interface to FireWorks or Jobflow Remote for dispatching and monitoring compute jobs.

Warning

Jobflow is not yet compatible with the @flow or @subflow decorators used in many quacc recipes and so should only be used if necessary. See this issue to track the progress of this enhancement.

Pros:

  • Native support for a variety of databases
  • Directly compatible with Atomate2
  • Designed with materials science workflows in mind
  • Actively supported by the Materials Project team

Cons:

  • Is not fully compatible with all the features of quacc
  • Parsing the output of a workflow is not as intuitive as other solutions
  • Defining dynamic workflows with Jobflow's Response object can be more complex than other solutions