Snakemake configuration for Cannon partition selection

Cannon Snakemake configuration

Snakemake's SLURM executor plugin can perform automatic partition selection on SLURM clusters if provided with a properly formatted configuration file. We host and maintain a configuration file for Harvard's Cannon cluster here:

/n/holylfs05/LABS/informatics/Everyone/internal-share/cannon-snakemake-cfg/cannon-snakemake.yml

Usage

To use this configuration file, other resources (threads, mem_mb, runtime) must be defined for your rules, either with a resources: definition within the rules or with a workflow profile (provided with --workflow-profile). The plugin will analyze the requested resources for each rule along with the available resources per partition provided in this config file to determine the most suitable partition for the rule.
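To make the matching step concrete, here is a minimal Python sketch of how requested resources can be compared against per-partition limits. It is not the plugin's actual algorithm, which may score partitions differently. The function names and the "mylab" partition are hypothetical; the "shared" limits are taken from the example entry later on this page.

```python
# Illustrative only: the real plugin's selection logic may differ.
PARTITIONS = {
    # Limits for "shared" are copied from the example entry on this page.
    "shared": {"max_runtime": 4320, "max_mem_mb": 184000, "max_threads": 48},
    # "mylab" is a made-up partition used only for this sketch.
    "mylab": {"max_runtime": 20160, "max_mem_mb": 1000000, "max_threads": 64},
}

# Pairs of (resource requested by a rule, matching limit in the partition config).
CHECKS = (
    ("runtime", "max_runtime"),
    ("mem_mb", "max_mem_mb"),
    ("threads", "max_threads"),
)


def fitting_partitions(request, partitions):
    """Return the names of partitions whose limits cover every requested resource."""
    return [
        name
        for name, limits in partitions.items()
        if all(request.get(res, 0) <= limits[lim] for res, lim in CHECKS)
    ]


def pick_partition(request, partitions):
    """Pick the fitting partition with the smallest limits, or None if nothing fits."""
    fits = fitting_partitions(request, partitions)
    if not fits:
        return None
    return min(
        fits,
        key=lambda n: (
            partitions[n]["max_mem_mb"],
            partitions[n]["max_threads"],
            partitions[n]["max_runtime"],
        ),
    )


print(pick_partition({"threads": 8, "mem_mb": 32000, "runtime": 120}, PARTITIONS))
```

The sketch also shows why every rule needs its resources defined: without threads, mem_mb, and runtime values there is nothing to compare against the partition limits.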

Once the resources in your Snakemake workflow are set up, provide the path to the partition config file with the --slurm-partition-config option:

snakemake ... --slurm-partition-config /n/holylfs05/LABS/informatics/Everyone/internal-share/cannon-snakemake-cfg/cannon-snakemake.yml

It may be easiest to store this path in a variable so that you can reference it more easily:

SMK_PART_CFG=/n/holylfs05/LABS/informatics/Everyone/internal-share/cannon-snakemake-cfg/cannon-snakemake.yml
snakemake ... --slurm-partition-config $SMK_PART_CFG

To permanently store this environment variable, add it to your .bashrc file in your home directory by adding this line of text:

export SMK_PART_CFG=/n/holylfs05/LABS/informatics/Everyone/internal-share/cannon-snakemake-cfg/cannon-snakemake.yml

Now, the SMK_PART_CFG variable will be loaded automatically every time you log in. NOTE: To use the variable immediately after adding it to .bashrc, either close the terminal and reconnect, or reload .bashrc manually with source ~/.bashrc.

Confirm the SMK_PART_CFG environment variable is loaded

You can always confirm that the SMK_PART_CFG environment variable is loaded by typing echo $SMK_PART_CFG. It should print the path to the config file. If it prints nothing (or something else), the config won't work properly with that variable.
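If you launch Snakemake from a wrapper script, the same check can be done programmatically. The following is a minimal sketch, assuming only that the variable should hold the path of an existing file; the function name and the status strings are hypothetical.

```python
import os
from pathlib import Path


def partition_config_status(environ=None, var="SMK_PART_CFG"):
    """Report whether the variable is set and points at an existing file."""
    if environ is None:
        environ = os.environ
    value = environ.get(var, "")
    if not value:
        return "unset"  # variable missing or empty
    if not Path(value).is_file():
        return "missing-file"  # variable set, but nothing exists at that path
    return "ok"


print(partition_config_status())
```

A result other than "ok" means the path should be fixed before it is passed to --slurm-partition-config.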

Generating a custom configuration file

You may want to modify the configuration file in some cases, for instance to exclude certain partitions from consideration or to add lab-specific partitions.

We provide a script that parses the partitions available to the current user and automatically generates a config file:

/n/holylfs05/LABS/informatics/Everyone/internal-share/cannon-snakemake-cfg/generate_snakemake_partition_config.py

While you can run this script from our directory by passing its full path to Python (e.g. python /n/holylfs05/LABS/informatics/Everyone/internal-share/cannon-snakemake-cfg/generate_snakemake_partition_config.py), it may be easier to copy the script to your own directory:

cp /n/holylfs05/LABS/informatics/Everyone/internal-share/cannon-snakemake-cfg/generate_snakemake_partition_config.py generate_snakemake_partition_config.py

The examples below assume a copy of the script is in your current working directory.

First, get a list of all partitions you have access to:

python generate_snakemake_partition_config.py --list-partitions

The output table lists the partitions and their resources. The selected column shows whether each partition will be written to the config file. To include a partition that is currently not selected or that would be commented out, use the --include flag:

python generate_snakemake_partition_config.py --include part1,part2 --list-partitions

Likewise, to exclude a partition that would be included, use the --exclude flag:

python generate_snakemake_partition_config.py --exclude part3,part4,part5 --list-partitions

These options can be combined:

python generate_snakemake_partition_config.py --include part1,part2 --exclude part3,part4,part5 --list-partitions

Once the partition table includes all the partitions you want Snakemake to choose from, remove the --list-partitions option to generate the config file:

python generate_snakemake_partition_config.py --include part1,part2 --exclude part3,part4,part5

By default, this will write a file called cannon-snakemake.generated-<timestamp>.yml, but you can specify a file name with --output:

python generate_snakemake_partition_config.py --include part1,part2 --exclude part3,part4,part5 --output my-custom-config.yml

See the cannon-snakemake-cfg repo for more documentation related to the script.

Modifying the configuration file manually

If you don't wish to run the Python script above, you can always manually edit the configuration file.

In both cases (excluding partitions or including lab partitions), the first step will be to copy the configuration file:

cp /n/holylfs05/LABS/informatics/Everyone/internal-share/cannon-snakemake-cfg/cannon-snakemake.yml cannon-snakemake.cfg

This will make a copy of the file in your current directory. You could also rename the file by changing the second file name in the cp command, say to cannon-snakemake-<project>.cfg or cannon-snakemake-<lab name>.cfg, depending on the circumstances.

Adding lab-specific partitions

After you've made your copy of the config file, you can edit it in any text editor to add a partition. Add an entry under the partitions: key:

partitions:
  <PARTITION NAME>:
    max_runtime: <INT; MINUTES>
    max_mem_mb: <INT; MEGABYTES>
    max_cpus_per_task: <INT>
    max_nodes: <INT>
    max_threads: <INT; NUMBER OF CPUS ON A GIVEN NODE>

Replace <PARTITION NAME> with the real name of your lab's partition.

In most cases, max_cpus_per_task can equal max_threads and max_nodes will be 1.

For the other resources, you likely already have them documented internally. If not, you may be able to get max runtime with:

scontrol show partition <PARTITION NAME> | grep -o "MaxTime=[^ ]*"

and max CPUs and memory with:

sinfo -p <PARTITION NAME> -N -o "%c %m" | awk 'NR>1 { 
    if ($1 > max_cpus) max_cpus=$1; 
    if ($2 > max_mem)  max_mem=$2 
} END { 
    printf "MaxCPUsPerNode=%d  MaxMemPerNode=%dMB\n", max_cpus, max_mem 
}'

Again, replace <PARTITION NAME> with the real name of your lab's partition.
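The config file expects max_runtime in minutes, while scontrol reports MaxTime in a form such as 3-00:00:00. The sketch below converts that value and formats a partition entry from the numbers gathered above. It assumes MaxTime is printed as [days-]HH:MM:SS or UNLIMITED; the function names and the "mylab" values are hypothetical.

```python
def slurm_time_to_minutes(value):
    """Convert a MaxTime value such as '3-00:00:00' to whole minutes.

    Returns None for an unlimited partition. Seconds are dropped.
    """
    value = value.strip()
    if value.startswith("MaxTime="):  # accept the raw grep output too
        value = value[len("MaxTime="):]
    if value.upper() in ("UNLIMITED", "INFINITE"):
        return None
    days = 0
    if "-" in value:
        day_part, value = value.split("-", 1)
        days = int(day_part)
    hours, minutes, _seconds = (int(part) for part in value.split(":"))
    return days * 24 * 60 + hours * 60 + minutes


def partition_entry(name, max_runtime, max_mem_mb, cpus_per_node, max_nodes=1):
    """Format one entry, indented to sit under the top-level partitions: key."""
    lines = [
        f"  {name}:",
        f"    max_runtime: {int(max_runtime)}",
        f"    max_mem_mb: {int(max_mem_mb)}",
        # As noted above, max_cpus_per_task can usually equal max_threads.
        f"    max_cpus_per_task: {int(cpus_per_node)}",
        f"    max_nodes: {int(max_nodes)}",
        f"    max_threads: {int(cpus_per_node)}",
    ]
    return "\n".join(lines) + "\n"


# Made-up values for a hypothetical "mylab" partition.
runtime = slurm_time_to_minutes("MaxTime=14-00:00:00")
print(partition_entry("mylab", runtime, 512000, 64), end="")
```

For an unlimited partition the function returns None, so you would need to choose a finite max_runtime yourself before writing the entry.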

Exclude a partition from consideration

To exclude a partition from being considered for submission, simply comment that partition out in your copy of the config file by adding # symbols to the beginning of each line referencing that partition. For example, if one wanted to exclude the shared partition, they would edit that section in the config to look like this:

  # shared:
  #  max_runtime: 4320  # 3 days
  #  max_mem_mb: 184000  # 184 GB
  #  max_cpus_per_task: 48
  #  max_nodes: 1
  #  max_threads: 48
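
If you maintain several copies of the config, the same edit can be scripted. This is a minimal text-based sketch, assuming each partition is a block whose settings are indented more deeply than its name; the function name and the sample config (including the "mylab" partition) are hypothetical.

```python
def comment_out_partition(config_text, partition):
    """Return config_text with the named partition's block commented out."""
    out = []
    block_indent = None  # indent of the partition's name line while inside its block
    for line in config_text.splitlines():
        stripped = line.lstrip()
        indent = len(line) - len(stripped)
        # A non-blank line at the same or a lower indent ends the block.
        if block_indent is not None and stripped and indent <= block_indent:
            block_indent = None
        # Ignore any trailing comment when matching the partition's name line.
        if block_indent is None and stripped.split("#")[0].strip() == f"{partition}:":
            block_indent = indent
        if block_indent is not None and stripped:
            line = line[:block_indent] + "# " + line[block_indent:]
        out.append(line)
    return "\n".join(out) + "\n"


SAMPLE = """partitions:
  shared:
    max_runtime: 4320  # 3 days
    max_threads: 48
  mylab:
    max_runtime: 20160
"""

print(comment_out_partition(SAMPLE, "shared"), end="")
```

Only the lines belonging to the named partition are changed; other partitions and the partitions: key itself are left as they were.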

Page maintainer: Gregg Thomas