Snakemake configuration for Cannon partition selection
Snakemake's SLURM executor plugin can perform automatic partition selection on SLURM clusters if provided with a properly formatted configuration file. We host and maintain a configuration file for Harvard's Cannon cluster here:
Usage
To use this configuration file, other resources (threads, mem_mb, runtime) must be defined for your rules, either with a resources: definition within the rules or with a workflow profile (provided with --workflow-profile). The plugin will analyze the requested resources for each rule along with the available resources per partition provided in this config file to determine the most suitable partition for the rule.
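As a rough illustration of the matching described above, the following Python sketch filters a set of partitions down to those whose limits cover a rule's requested resources. It is not the plugin's actual selection code, which may also rank the eligible partitions. The partition names and limits are hypothetical; only the field names (max_threads, max_mem_mb, max_runtime) are taken from the config format used on this page.

```python
# Hypothetical partition limits, shaped like entries in the partition config file.
# Runtime is in minutes and memory in megabytes, as in the config.
PARTITIONS = {
    "small": {"max_runtime": 720, "max_mem_mb": 64000, "max_threads": 16},
    "large": {"max_runtime": 4320, "max_mem_mb": 184000, "max_threads": 48},
}


def eligible_partitions(threads, mem_mb, runtime, partitions=PARTITIONS):
    """Return the names of partitions whose limits cover the requested resources."""
    return [
        name
        for name, limits in partitions.items()
        if threads <= limits["max_threads"]
        and mem_mb <= limits["max_mem_mb"]
        and runtime <= limits["max_runtime"]
    ]


if __name__ == "__main__":
    # A rule asking for 32 threads, 100 GB of memory, and 1000 minutes.
    print(eligible_partitions(threads=32, mem_mb=100000, runtime=1000))
```

A rule that leaves one of these resources undefined gives the matching step nothing to compare against, which is why all three need to be set.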
Once the resources in your Snakemake workflow are set up, provide the path to the partition config file with the --slurm-partition-config option:
snakemake ... --slurm-partition-config /n/holylfs05/LABS/informatics/Everyone/internal-share/cannon-snakemake-cfg/cannon-snakemake.yml
It may be easiest to store this path in an environment variable so you can reference it more conveniently:
SMK_PART_CFG=/n/holylfs05/LABS/informatics/Everyone/internal-share/cannon-snakemake-cfg/cannon-snakemake.yml
snakemake ... --slurm-partition-config $SMK_PART_CFG
To permanently store this environment variable, add it to your .bashrc file in your home directory by adding this line of text:
export SMK_PART_CFG=/n/holylfs05/LABS/informatics/Everyone/internal-share/cannon-snakemake-cfg/cannon-snakemake.yml
Now, the SMK_PART_CFG variable will be loaded automatically every time you log in. NOTE: If you want to use the variable immediately after adding it to .bashrc, you will have to either close the terminal and reconnect, or manually reload .bashrc with source ~/.bashrc.
Confirm the SMK_PART_CFG environment variable is loaded
You can always confirm that the SMK_PART_CFG environment variable is loaded by typing echo $SMK_PART_CFG. It should print the path to the config file. If it prints nothing (or something else), the config won't work properly with that variable.
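If you launch Snakemake from a wrapper script, the same check can be done programmatically. The sketch below is one possible pre-flight check in Python; the function name partition_config_from_env is hypothetical and not part of Snakemake or the SLURM plugin.

```python
import os
from pathlib import Path


def partition_config_from_env(var="SMK_PART_CFG", environ=None):
    """Return the config path stored in an environment variable, or raise if unusable."""
    environ = os.environ if environ is None else environ
    value = environ.get(var, "").strip()
    if not value:
        # Same situation as `echo $SMK_PART_CFG` printing nothing.
        raise RuntimeError(f"{var} is not set; add the export line to ~/.bashrc")
    path = Path(value)
    if not path.is_file():
        # The variable holds something other than a path to an existing file.
        raise FileNotFoundError(f"{var} points to a missing file: {path}")
    return path
```

Calling partition_config_from_env() with no arguments reads the real environment; passing a dictionary as environ makes the check easy to try out without changing your shell.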
Generating a custom configuration file
You may want to modify the configuration file in some cases, for instance to exclude certain partitions from consideration or to add lab-specific partitions.
We provide a script that parses the partitions available to the current user and automatically generates a config file:
/n/holylfs05/LABS/informatics/Everyone/internal-share/cannon-snakemake-cfg/generate_snakemake_partition_config.py
While you can run this script from our directory by passing its full path to Python (e.g. python /n/holylfs05/LABS/informatics/Everyone/internal-share/cannon-snakemake-cfg/generate_snakemake_partition_config.py), it may be easier to copy the script to your own directory:
cp /n/holylfs05/LABS/informatics/Everyone/internal-share/cannon-snakemake-cfg/generate_snakemake_partition_config.py generate_snakemake_partition_config.py
First, get a list of all partitions you have access to:
python generate_snakemake_partition_config.py --list-partitions
The table output lists the partitions and their resources. Use the selected column to determine whether each partition will be written to the config file. To include a partition that is currently not selected or would be commented out, use the --include flag:
python generate_snakemake_partition_config.py --include part1,part2 --list-partitions
Likewise, to exclude a partition that would be included, use the --exclude flag:
python generate_snakemake_partition_config.py --exclude part3,part4,part5 --list-partitions
These options can be combined:
python generate_snakemake_partition_config.py --include part1,part2 --exclude part3,part4,part5 --list-partitions
Once the partition table includes all the partitions you want Snakemake to choose from, remove the --list-partitions option to generate the config file:
python generate_snakemake_partition_config.py --include part1,part2 --exclude part3,part4,part5
By default, this will write a file called cannon-snakemake.generated-<timestamp>.yml, but you can specify a file name with the --output option:
python generate_snakemake_partition_config.py --include part1,part2 --exclude part3,part4,part5 --output my-custom-config.yml
See the cannon-snakemake-cfg repo for more documentation related to the script.
Modifying the configuration file manually
If you don't wish to run the Python script above, you can always manually edit the configuration file.
In both cases (excluding partitions or including lab partitions), the first step will be to copy the configuration file:
cp /n/holylfs05/LABS/informatics/Everyone/internal-share/cannon-snakemake-cfg/cannon-snakemake.yml cannon-snakemake.cfg
This will make a copy of the file in your current directory. You could also rename the file by changing the second file name in the cp command, say to cannon-snakemake-<project>.cfg or cannon-snakemake-<lab name>.cfg, depending on the circumstances.
Adding lab-specific partitions
After you've made your copy of the config file, you can edit it in any text editor to add a partition. Simply add an entry under partitions::
partitions:
  <PARTITION NAME>:
    max_runtime: <INT; MINUTES>
    max_mem_mb: <INT; MEGABYTES>
    max_cpus_per_task: <INT>
    max_nodes: <INT>
    max_threads: <INT; NUMBER OF CPUS ON A GIVEN NODE>
Replace each <...> placeholder with the corresponding value for your partition.
In most cases, max_cpus_per_task can equal max_threads and max_nodes will be 1.
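To make those defaults concrete, here is a small Python sketch that renders an entry in the layout shown above, setting max_cpus_per_task to max_threads and max_nodes to 1 unless told otherwise. The function name partition_entry, the partition name, and the example values are hypothetical, and the two-space indentation is an assumption about how your copy of the config file is laid out.

```python
def partition_entry(name, max_runtime, max_mem_mb, max_threads,
                    max_cpus_per_task=None, max_nodes=1):
    """Render one partition entry as YAML text, indented to sit under `partitions:`."""
    if max_cpus_per_task is None:
        # In most cases max_cpus_per_task can equal max_threads.
        max_cpus_per_task = max_threads
    fields = {
        "max_runtime": max_runtime,        # minutes
        "max_mem_mb": max_mem_mb,          # megabytes
        "max_cpus_per_task": max_cpus_per_task,
        "max_nodes": max_nodes,
        "max_threads": max_threads,        # CPUs on a given node
    }
    lines = [f"  {name}:"] + [f"    {key}: {value}" for key, value in fields.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical lab partition: 7 days, 500 GB, 64 CPUs per node.
    print(partition_entry("mylab", max_runtime=10080, max_mem_mb=500000, max_threads=64))
```

The printed text can be pasted under partitions: in your copy of the config file.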
For the other resources, you likely already have them documented internally. If not, you may be able to get max runtime with:
and max CPUs and memory with:
sinfo -p <PARTITION NAME> -N -o "%c %m" | awk 'NR>1 {
    if ($1 > max_cpus) max_cpus=$1;
    if ($2 > max_mem) max_mem=$2
} END {
    printf "MaxCPUsPerNode=%d MaxMemPerNode=%dMB\n", max_cpus, max_mem
}'
Again, replace <PARTITION NAME> with the name of your partition.
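Note that max_runtime is given in minutes, while SLURM typically reports partition time limits in forms such as days-hours:minutes:seconds. The Python sketch below converts such a string to whole minutes; the function name slurm_time_to_minutes is hypothetical, and it deliberately rejects non-numeric limits such as "infinite" rather than guessing a value.

```python
def slurm_time_to_minutes(limit):
    """Convert a SLURM time string to whole minutes (seconds are rounded down).

    Handles: minutes, minutes:seconds, hours:minutes:seconds,
    days-hours, days-hours:minutes, days-hours:minutes:seconds.
    Raises ValueError for anything else (e.g. "infinite").
    """
    limit = limit.strip()
    if "-" in limit:
        day_part, rest = limit.split("-", 1)
        days = int(day_part)
        fields = [int(part) for part in rest.split(":")]
        if len(fields) > 3:
            raise ValueError(f"unrecognized time limit: {limit!r}")
        # After the day prefix, fields are hours[:minutes[:seconds]].
        hours, minutes, seconds = fields + [0] * (3 - len(fields))
    else:
        days = 0
        fields = [int(part) for part in limit.split(":")]
        if len(fields) == 1:
            hours, minutes, seconds = 0, fields[0], 0
        elif len(fields) == 2:
            hours, minutes, seconds = 0, fields[0], fields[1]
        elif len(fields) == 3:
            hours, minutes, seconds = fields
        else:
            raise ValueError(f"unrecognized time limit: {limit!r}")
    total_seconds = ((days * 24 + hours) * 60 + minutes) * 60 + seconds
    return total_seconds // 60
```

For example, a three-day limit written as 3-00:00:00 converts to 4320, which is the value used for max_runtime in the shared partition entry shown on this page.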
Exclude a partition from consideration
To exclude a partition from being considered for submission, simply comment that partition out in your copy of the config file by adding # symbols to the beginning of each line referencing that partition. For example, if one wanted to exclude the shared partition, they would edit that section in the config to look like this:
  # shared:
  #   max_runtime: 4320 # 3 days
  #   max_mem_mb: 184000 # 184 GB
  #   max_cpus_per_task: 48
  #   max_nodes: 1
  #   max_threads: 48
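If you need to exclude partitions repeatedly, the same edit can be scripted. The Python sketch below comments out one partition block in the text of a config file. It assumes an indentation-based layout in which each partition name sits on its own line and its fields are indented further; the function name comment_out_partition is hypothetical, and the result should be checked by eye before use.

```python
def comment_out_partition(config_text, name):
    """Return config_text with the block for partition `name` commented out."""
    out = []
    in_block = False
    block_indent = 0
    for line in config_text.splitlines():
        stripped = line.lstrip()
        indent = len(line) - len(stripped)
        # A non-empty line at or above the block's indentation ends the block.
        if in_block and stripped and indent <= block_indent:
            in_block = False
        # Match the partition header, ignoring any trailing comment on that line.
        if not in_block and stripped.split("#")[0].rstrip() == f"{name}:":
            in_block = True
            block_indent = indent
        if in_block and stripped:
            out.append(line[:indent] + "# " + stripped)
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

To apply it, read your copy of the config file into a string, pass it through the function with the partition name (for example "shared"), and write the result back out.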
Gregg Thomas