Artwork © by Greg Findlay
Multicellular systems develop from single cells through a lineage, but current lineage tracing approaches scale poorly to whole organisms. Here we use genome editing to progressively introduce and accumulate diverse mutations in a DNA barcode over multiple rounds of cell division. The barcode, an array of CRISPR/Cas9 target sites, records lineage relationships in the patterns of mutations shared between cells. In cell culture and zebrafish, we show that rates and patterns of editing are tunable, and that thousands of lineage-informative barcode alleles can be generated. By sampling hundreds of thousands of cells from individual zebrafish, we find that most cells in adult zebrafish organs derive from relatively few embryonic progenitors. Genome editing of synthetic target arrays for lineage tracing (GESTALT) will help generate large-scale maps of cell lineage in multicellular systems.
The data for the paper is publicly available from NCBI's Gene Expression Omnibus under dataset identifier GSE81713 and from the Dryad data repository here. For each sample, both raw reads (in SRA format) and statistics (stats) files are included. Let us know if you see anything missing or incomplete and we'll get it fixed.
In addition, for aggregate lineage experiments, such as the individual adult fish or the cell culture lineage, trees are available as PHYLIP Mix Newick output files (*.newick.txt.gz) and in our custom JSON file type (*.json.gz), which we adapted for visualization using the Data-Driven Documents (D3) library.
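As a starting point for working with these tree files, here is a minimal Python sketch that opens the two gzip-compressed formats using only the standard library. The function names are hypothetical, and the sketch assumes that a Newick file may hold one or more trees terminated by semicolons and that leaf labels are unquoted and contain no commas or semicolons. The layout of the custom JSON type is not described here, so the code only parses it and makes no assumption about its keys.

```python
import gzip
import json
import tempfile
from pathlib import Path


def read_newick_trees(path):
    """Return the Newick trees in a gzip-compressed text file as a list of strings.

    Assumption: trees are separated by ';' and labels are unquoted, so all
    whitespace (including line wraps inside a tree) can be dropped safely.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        text = handle.read()
    trees = []
    for chunk in text.split(";"):
        compact = "".join(chunk.split())  # remove spaces and line breaks
        if compact:
            trees.append(compact + ";")
    return trees


def count_leaves(newick):
    """Count leaves in one Newick tree.

    Commas separate sibling nodes, so a tree with unquoted, comma-free labels
    has exactly (number of commas + 1) leaves.
    """
    return newick.count(",") + 1


def read_json_tree(path):
    """Parse a gzip-compressed JSON file into plain Python objects."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Toy files with made-up content, written only to exercise the readers;
    # they are not taken from the published dataset.
    workdir = Path(tempfile.mkdtemp())
    newick_path = workdir / "toy.newick.txt.gz"
    json_path = workdir / "toy.json.gz"

    with gzip.open(newick_path, "wt", encoding="utf-8") as handle:
        handle.write("(cellA,cellB,(cellC,cellD));\n")
    with gzip.open(json_path, "wt", encoding="utf-8") as handle:
        json.dump({"name": "root", "children": []}, handle)

    for tree in read_newick_trees(newick_path):
        print(count_leaves(tree), "leaves in", tree)
    print(type(read_json_tree(json_path)).__name__, "loaded from JSON")
```

To use it on the real data, pass the path of a downloaded *.newick.txt.gz or *.json.gz file to the matching reader. For anything beyond counting leaves, a dedicated phylogenetics library is likely a better choice than string handling.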
For convenience, we've also included aggregated allele files for each of the adult fish, the cell culture data, and the sets of embryos. These files include only UMI-tagged amplicons that passed filtering.
The code used in the paper is available on the Shendure lab GitHub site here. We've included all the code to process raw reads into event calls, as well as many of the visualization and analysis scripts used in the paper.
To help users dig into the data, we've created a (very) preliminary interactive tool for visualizing trees produced by GESTALT. It includes the adult trees highlighted in the paper, the cell culture lineage trees, and trees created from individual embryos, many of which didn't make the paper. You can play around with it here.