Training
We’ve aggregated some training materials on good practices for computational research. Please let us know what else would be useful.
-
Extracting data from PDFs with Tabula
Tutorial explaining how to extract data from tables in PDFs using Tabula, an open-source project. The tutorial uses Docker.
-
Avian Influenza Nextstrain Build
Step-by-step instructions for installing Nextstrain and building a pipeline with Snakemake, using an example dataset of publicly available nonhuman H3Nx data from GenBank.
-
Seasonal Influenza Nextstrain Quickstart Guide
How to build a Nextstrain analysis from GISAID data.
-
Git for Science
With the rise of digital methods in research, version control systems such as Git serve as modern lab notebooks. Version control platforms (GitHub, GitLab) also facilitate publishing and collaboration.
-
Code Structure: Part 1
Learn how to write simple, canonical code.
Part 1 covers the theoretical framework, functional design, the limits of functional programming, and some well-known software principles.
-
Code Structure: Part 2
Learn how to write simple, canonical code.
Part 2 covers interfaces and a realistic example from modeling populations of pathogens.
-
Pathogen evolution, selection, and immunity
Trevor Bedford and Sarah Cobey teach a 2.5-day module on pathogen evolution, selection, and immunity for SISMID each July, with an emphasis on modeling and statistics. Our slides and exercises are here. Registration for the course usually starts in January.
-
Digital Validation in Research
Repeatability, in science and computation, is conceptually very simple. Make conditions the same, as exactly as necessary, and the process will repeat. Drop an apple and it will fall. There are, of course, details that can be omitted. Knowing which details are indispensable is essential to ensuring repeatability.
-
Nextstrain in HPC Environments
If you don't have the option to install Nextstrain on a local machine, cloud host, or another dedicated environment, a traditional HPC environment will work just as well with some adjustments. Along the way, you will learn how to use Snakemake, a useful tool for orchestrating complex jobs.
Suggestions?
If you have feedback on training, ideas for new modules, or collaboration proposals, please let us know.