Metadata

Two years ago, my colleagues published FAIRly big: A framework for computationally reproducible processing of large-scale data. In this paper, they describe how to partition a large analysis (their example: processing anatomical images of 42,000 subjects from UK Biobank) and use DataLad to provision data and capture provenance, so that individual results can be reproduced on a laptop even though running the entire group analysis requires a cluster. The article is accompanied by a workflow template and a tutorial dataset. ...