Snakemake + DataLad + Worktrees: Automated Pipelines with Provenance Tracking

Previously on git worktrees… In the previous post, I introduced a workflow for running datalad run commands in a dedicated git worktree on a different branch while continuing development in the main worktree. The batch-processing script was a plain bash loop: it got the job done, but it had no notion of what had already run, what was stale, or what depended on what. If the script failed halfway, a rerun after the fix would either repeat every job that had already succeeded or require me to manually comment out jobs. It also ran only one procedure across multiple subjects, not all procedures for one subject. ...

2026-04-18 · 14 min · 2909 words · Jiameng Wu