Workflow Engine Setup
A workflow engine is a crucial component for scaling up quacc calculations in a high-throughput setting. We describe the necessary installation steps here for the workflow engine of your choosing.
Picking a Workflow Engine
If you don't want to use a workflow engine or are just starting out, you can simply skip this section.
For a comparison of the different compatible workflow engines, refer to the Workflow Engines Overview section.
Installation (Covalent)
To install Covalent, run the following:
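```bash
# a sketch; assumes quacc publishes a "covalent" extra (if not, install covalent directly)
pip install "quacc[covalent]"
```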
Starting the Server
Covalent uses a server to dispatch and store calculation details and results. To start the server, run `covalent start` in your terminal. It will return a URL (usually http://localhost:48008) that you can use to access the Covalent dashboard.
Tip
Once you start scaling up your calculations, we recommend hosting the Covalent server on a dedicated machine or using Covalent Cloud. Refer to the Covalent Deployment Guide for details.
Installation (Parsl)
To install Parsl, run the following:
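```bash
# a sketch; assumes quacc publishes a "parsl" extra (if not, install parsl directly)
pip install "quacc[parsl]"
```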
Parsl has many configuration options, which we will cover later in the documentation.
Installation (Prefect)
To install Prefect, run the following:
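```bash
# a sketch; assumes quacc publishes a "prefect" extra (if not, install prefect directly)
pip install "quacc[prefect]"
```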
To connect to Prefect Cloud, run the following as well:
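```bash
# authenticate this machine with Prefect Cloud (prompts for an API key or a browser login)
prefect cloud login
```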
Prefect has many configuration options. For instance, you can store the quacc logs in the UI as follows:
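One possibility, assuming you want Prefect to capture quacc's logger output, is to register quacc as an extra logger via Prefect's standard logging setting:

```bash
# PREFECT_LOGGING_EXTRA_LOGGERS forwards records from the named loggers to the Prefect UI
prefect config set PREFECT_LOGGING_EXTRA_LOGGERS=quacc
```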
Installation (Jobflow + FireWorks)
To install Jobflow with support for FireWorks, run the following:
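```bash
# a sketch; assumes quacc publishes a "jobflow" extra. FireWorks is listed
# explicitly in case the extra does not already pull it in.
pip install "quacc[jobflow]" fireworks
```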
MongoDB Setup
Jobflow and FireWorks both require the use of a database (most commonly a MongoDB instance) to store calculation results.
Tip
If it is not possible to use MongoDB, a variety of other data store options are available within the maggma package, including a `MontyStore` that relies solely on the local filesystem.
Jobflow DB Setup
If you plan to use Jobflow to write your workflows, you will need to make a `jobflow.yaml` file. This file will generally be formatted like the example below. Fill in the fields with the appropriate values for your MongoDB cluster, which is where all your calculation inputs and outputs will be stored.
```yaml
JOB_STORE:
  docs_store:
    type: MongoStore
    host: <host name>
    port: 27017
    username: <username>
    password: <password>
    database: <database name>
    collection_name: <collection name>
```
MongoDB Atlas
If you are using a URI (as is common with MongoDB Atlas), then you will instead have a `jobflow.yaml` file that looks like the example below. Here, you will put the full URI in the `host` field. The `username` and `password` are part of the URI and so should not be included elsewhere in the YAML file.
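A sketch of such a file (the `mongodb+srv://` URI format is typical for Atlas; adjust it to match your cluster's connection string):

```yaml
JOB_STORE:
  docs_store:
    type: MongoStore
    # the full connection URI, credentials included, goes in the host field
    host: mongodb+srv://<username>:<password>@<host name>
    port: 27017
    database: <database name>
    collection_name: <collection name>
```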
You will then need to define a `JOBFLOW_CONFIG_FILE` environment variable pointing to the file you made. For instance, in your `~/.bashrc` file, add the following line:
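```bash
# adjust the path to wherever you saved your jobflow.yaml
export JOBFLOW_CONFIG_FILE="/path/to/my/jobflow.yaml"
```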
FireWorks Setup
If you plan to use FireWorks to dispatch your Jobflow workflows, you will also need to make a few configuration files: FW_config.yaml
, my_fworker.yaml
, my_launchpad.yaml
, and my_qadapter.yaml
. To begin, make a directory called fw_config
where you will store the four files described in greater detail below. The directory structure will look like the following:
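```text
fw_config
├── FW_config.yaml
├── my_fworker.yaml
├── my_launchpad.yaml
└── my_qadapter.yaml
```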
FW Config File
For the `FW_config.yaml`, you can use the following template. Make sure to update the path to the `fw_config` folder where the file resides.
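A minimal sketch (the `QUEUE_UPDATE_INTERVAL` value is an optional tweak, not a requirement):

```yaml
# directory containing the FireWorks configuration files described here
CONFIG_FILE_DIR: /path/to/fw_config
# how often (in seconds) to poll the queue when launching jobs
QUEUE_UPDATE_INTERVAL: 2
```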
You will also need to define a `FW_CONFIG_FILE` environment variable pointing to the `FW_config.yaml` file you made. For instance, in your `~/.bashrc` file, add the following line:
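```bash
# adjust the path to match where you made the fw_config directory
export FW_CONFIG_FILE="/path/to/fw_config/FW_config.yaml"
```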
FWorker
For the `my_fworker.yaml`, you can use the following template. You do not need to make any modifications.
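A sketch (the worker name is a placeholder of our choosing; any name will do):

```yaml
name: quacc_fworker
category: ""
query: "{}"
```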
Launchpad
For the `my_launchpad.yaml`, you can use the following template. Replace the entries in `<>` with the appropriate values for your Mongo database.
```yaml
host: <host name>
port: 27017
name: <database name>
username: <username>
password: <password>
logdir: null
strm_lvl: DEBUG
user_indices: []
wf_user_indices: []
```
MongoDB Atlas
If you are accessing your MongoDB via a URI (e.g. as with MongoDB Atlas), then you will use the following `my_launchpad.yaml` template instead.
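A sketch using FireWorks' `uri_mode` option, where the full connection string goes in the `host` field (verify the exact fields against the FireWorks documentation):

```yaml
host: mongodb+srv://<username>:<password>@<host name>/<database name>
uri_mode: true
port: 27017
# with uri_mode enabled, the database name is read from the URI above
name: null
logdir: null
strm_lvl: DEBUG
user_indices: []
wf_user_indices: []
```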
QAdapter
Assuming you plan to use a queuing system for your compute jobs, you will need to make a `my_qadapter.yaml` file. For this, you will need to follow the instructions in the FireWorks documentation for your specific job scheduling system. An example `my_qadapter.yaml` file for Slurm is shown below.
```yaml
_fw_name: CommonAdapter
_fw_q_type: SLURM
rocket_launch: rlaunch -w </path/to/fw_config/my_fworker.yaml> singleshot
nodes: 1
walltime: 00:30:00
account: <account>
job_name: quacc_firework
qos: regular
pre_rocket: |
  conda activate MyEnv
  module load MyModuleName
  export MyEnvVar=MyEnvValue
```
In the above example, you would need to change the path in the `rocket_launch` field to the correct path to your `my_fworker.yaml`. The `nodes`, `walltime`, `account`, and `qos` fields are the corresponding parameters for your queuing system. Finally, anything in the `pre_rocket` field will be executed before the job begins running, making it a good place to load modules and set environment variables, as the example above illustrates.
Database Initialization
Danger
Running `lpad reset` will clear your FireWorks launchpad, so only use this command if you are a new user.
To check that FireWorks can connect to your database, run `lpad reset` if this is your first time using FireWorks.
Quacc Settings
Finally, since FireWorks will create unique folders of its own for each job, you can disable quacc's handling of directory management as follows:
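One way to do this, assuming the setting responsible for per-job directories is named `CREATE_UNIQUE_DIR` (check the quacc settings reference for the exact name), is via the quacc CLI:

```bash
# tell quacc not to make its own unique directory per job, since FireWorks already does
quacc set CREATE_UNIQUE_DIR false
```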