Genomics on the Command Line

Workshop Developers: Danielle Khost · Last updated: July 9, 2025

This is a four-part beginner workshop designed to help researchers build confidence in taking a practical first-pass look at genomics data on the command line. The goal is not to teach in-depth scripting in Bash, Python, or LLM-based workflows, but to help participants develop a hands-on feel for common bioinformatics file formats, the basic tools available for inspecting them, and the kinds of questions they can answer quickly when something looks off.

By the end of the workshop, students will:

be familiar with the “ingredients” (i.e., the command line tools for looking at one's data) available to them for getting a quick overview
know where to begin troubleshooting problems
start developing bioinformatics intuition about their data (e.g. "this file looks weird, I wonder if it's because of X?")

The class will be structured around a series of example file format issues that students are likely to encounter, and we will work through diagnosing and troubleshooting them together. We will use a combination of command line tools such as grep, awk, samtools, bedtools, and bcftools to explore the data and answer questions about it.
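To give a flavor of what a first-pass look can mean in practice, here is a minimal sketch using only grep and awk. The file toy.fa and its contents are made up for illustration and are not part of the workshop materials; the sketch assumes a standard FASTA layout in which each record starts with a ">" header line and the sequence may wrap across several lines.

```shell
# Build a tiny, made-up FASTA file so the example is self-contained.
printf '>seq1\nACGTACGT\nACGT\n>seq2\nGGCC\n' > toy.fa

# Count records: every FASTA record starts with a '>' header line.
n_records=$(grep -c '^>' toy.fa)
echo "records: $n_records"

# Report the length of each sequence, summing across wrapped lines.
lengths=$(awk '
  /^>/ { if (name != "") print name "\t" len   # emit the previous record
         name = substr($1, 2); len = 0; next } # start a new record
       { len += length($0) }                   # accumulate sequence length
  END  { if (name != "") print name "\t" len } # emit the last record
' toy.fa)
echo "$lengths"
```

On this toy file the commands report two records, with lengths 12 for seq1 and 4 for seq2. A record count or sequence length that does not match what you expected is often the first hint that a file is truncated or malformed.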

We will be working with common file types including sequence files (FASTA/FASTQ), alignment files (SAM/BAM), and tabular genomic data such as TSV, CSV, BED, GFF, and GTF.
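Tabular formats lend themselves to quick sanity checks. The sketch below is illustrative only: toy.bed is a made-up, BED-like file with two deliberate problems (one line with start greater than end, one line missing a column), and the checks assume tab-separated columns with chromosome, start, and end in the first three fields.

```shell
# Made-up BED-like file: line 3 has start > end, line 4 is missing a column.
printf 'chr1\t100\t200\tgeneA\nchr1\t300\t450\tgeneB\nchr2\t500\t400\tgeneC\nchr2\t600\t700\n' > toy.bed

# Tally how many lines have each number of tab-separated columns,
# printed as "columns:lines".
col_counts=$(awk -F'\t' '{ print NF }' toy.bed | sort -n | uniq -c | awk '{ print $2 ":" $1 }')
echo "$col_counts"

# Flag intervals whose start is not smaller than their end,
# prefixed with the line number.
bad_intervals=$(awk -F'\t' '$2 >= $3 { print NR ": " $0 }' toy.bed)
echo "$bad_intervals"
```

Here the column tally shows one line with 3 columns and three lines with 4, and the interval check flags line 3. Inconsistent column counts and inverted coordinates are the kind of issue that is easy to spot this way before handing a file to a downstream tool.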

This workshop assumes you have some basic knowledge of the Linux command line. If you know a few simple commands like ls, cd, cp, and mv, you should be OK. However, we will go over all the basics on the first day.

Setup instructions

These workshops are Jupyter notebooks, which are documents that contain interactive code blocks interleaved with formatted text. You can participate in this workshop by clicking the links below, which will open the respective notebooks in Google Colab. That's it! The notebook should load and be able to run without any additional setup.

We will primarily be using the terminal from within the Jupyter notebook to run the commands, but the explanatory text for the workshop is written in the notebook so you can follow along.