Making Slurm Life a Little Easier

If you work in research or high-performance computing, you’re probably familiar with the Slurm Workload Manager. It’s a powerful tool for managing and scheduling jobs on a cluster, but let’s be honest, it can sometimes be a bit… verbose. I often find myself typing the same long commands over and over again, or wrestling with sbatch scripts for simple tasks.

To make my own life easier, and hopefully yours too, I’ve put together a small collection of bash scripts and helper functions called slurm-utils. My goal was to create a set of simple, easy-to-use tools to handle the most common Slurm tasks.

What’s Inside?

The repository is a mix of standalone scripts and interactive helper functions that you can use directly from your terminal.

Scripts

The main script in the collection is launch_marimo_server. This is a wrapper script that submits a Slurm job to run a Marimo notebook. It will even request a GPU for you if you need one. The best part is that the script waits for the job to start and then prints out the exact SSH tunnel command you need to connect from your local machine.
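For readers curious about the "submit, then wait" part, here is a minimal sketch of how such a pattern can be written in bash. It is not the actual launch_marimo_server code: the function name wait_for_job and the 5-second polling interval are my own assumptions, and job.sbatch in the usage comment is a hypothetical batch script.

```shell
#!/usr/bin/env bash
# Sketch of a submit-then-wait pattern; NOT the actual launch_marimo_server code.
# wait_for_job is a hypothetical helper name.

# Poll squeue until the job is RUNNING; fail if it leaves the queue first.
wait_for_job() {
    local job_id="$1" state
    while true; do
        # -h: no header, -j: filter by job ID, -o %T: print only the job state
        state="$(squeue -h -j "$job_id" -o '%T' 2>/dev/null)"
        case "$state" in
            RUNNING) return 0 ;;
            "")      echo "Job $job_id is no longer in the queue" >&2; return 1 ;;
            *)       sleep 5 ;;   # still pending or configuring; check again shortly
        esac
    done
}

# Example usage (job.sbatch is a hypothetical batch script):
#   job_id="$(sbatch --parsable job.sbatch)"   # --parsable prints the job ID without extra text
#   wait_for_job "$job_id" && squeue -h -j "$job_id" -o '%N'   # %N: allocated node(s)
```

Polling squeue is the simplest approach and needs nothing beyond the standard Slurm client tools, at the cost of a few seconds of delay after the job starts.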

Here’s how you’d use it to launch a notebook on a CPU node:

launch_marimo_server notebooks/my_analysis.py

And if you need a GPU:

launch_marimo_server --gpu notebooks/deep_learning.py
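The exact tunnel command the script prints depends on your cluster, but it will generally follow the standard SSH local port forwarding shape. The sketch below only illustrates that shape. The helper name build_tunnel_cmd and every value passed to it (port, node name, login host) are hypothetical placeholders, not output from launch_marimo_server.

```shell
#!/usr/bin/env bash
# Sketch only: builds the kind of SSH tunnel command a launcher might print.
# All values below are hypothetical placeholders.

build_tunnel_cmd() {
    local port="$1"    # port the notebook server listens on
    local node="$2"    # compute node running the job
    local login="$3"   # user@host of the cluster login node
    # -N: run no remote command; -L: forward local port to node:port via the login node
    printf 'ssh -N -L %s:%s:%s %s\n' "$port" "$node" "$port" "$login"
}

build_tunnel_cmd 8888 compute-01 alice@login.cluster.example.edu
```

Running the printed command on your local machine and then opening localhost on the same port in a browser is the usual way to reach a server running on a compute node.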

Interactive Functions

I’ve also included a set of helper functions in the slurm_helpers.sh file that you can source in your .bashrc. These are great for interactive work.

  • quick_cpu: This function starts a 2-hour interactive CPU job. It’s perfect for when you need to quickly debug a script or do some development work on a compute node.
  • quick_gpu: Similar to quick_cpu, but this function starts a 2-hour interactive job with a GPU.
  • check_jobs: A simple shortcut for squeue -u $USER to see all of your current jobs.
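If you want a feel for what helpers like these look like, here is a rough sketch built from standard srun and squeue options. It is an assumption about the general shape, not a copy of slurm_helpers.sh: the real functions may read partitions and accounts from config.sh, and the GPU request syntax (--gres=gpu:1 here) varies between clusters.

```shell
# Hypothetical sketch of helpers like those described above; the real
# slurm_helpers.sh may use different options and read values from config.sh.

quick_cpu() {
    # 2-hour interactive shell on a compute node
    srun --time=02:00:00 --pty bash
}

quick_gpu() {
    # Same, plus one GPU; --gres=gpu:1 is a common form but is cluster-specific
    srun --time=02:00:00 --gres=gpu:1 --pty bash
}

check_jobs() {
    # List the current user's queued and running jobs
    squeue -u "$USER"
}
```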

Getting Started

Getting started with slurm-utils is easy.

  1. First, clone the repository (the SSH URL below assumes you have an SSH key registered with GitHub):
    git clone git@github.com:AADeLucia/slurm-utils.git
    
  2. Next, copy the example configuration file and edit it to match your cluster’s setup. This is where you’ll set things like your default partitions and account information.
    cd slurm-utils
    cp config.sh.example config.sh
    vim config.sh
    
  3. Finally, add the scripts to your PATH and source the helper functions in your ~/.bashrc file.
    # These commands must be run from inside the slurm-utils directory,
    # because $(pwd) is expanded when each line is written to ~/.bashrc
    echo "" >> ~/.bashrc
    echo "# Load Slurm utility scripts and functions" >> ~/.bashrc
    echo "export PATH=\"$(pwd):\$PATH\"" >> ~/.bashrc
    echo "source \"$(pwd)/slurm_helpers.sh\"" >> ~/.bashrc
    source ~/.bashrc
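One caveat with the echo approach: running it twice adds the same lines to ~/.bashrc twice. If that bothers you, a small guard avoids it. This is a sketch under my own assumptions: add_line_once is a hypothetical helper that is not part of slurm-utils, and RC_FILE is an optional override I added so the target file can be changed.

```shell
#!/usr/bin/env bash
# Sketch: append a line to a file only if it is not already there.
# add_line_once is a hypothetical helper, not part of slurm-utils.

add_line_once() {
    local line="$1" file="$2"
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Run from inside the slurm-utils directory.
# RC_FILE is an optional override (an assumption); it defaults to ~/.bashrc.
rc_file="${RC_FILE:-$HOME/.bashrc}"
repo_dir="$(pwd)"
add_line_once "export PATH=\"$repo_dir:\$PATH\"" "$rc_file"
add_line_once "source \"$repo_dir/slurm_helpers.sh\"" "$rc_file"
```

Because the check matches whole lines exactly, re-running the snippet from the same directory leaves the file unchanged.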
    

And that’s it! You’re ready to go.

I hope you find these utilities as helpful as I have. If you have any suggestions or contributions, feel free to open an issue or pull request on the GitHub repository.



