Earlham Institute



Who we are:

The Earlham Institute (EI) explores living systems by applying biotechnology and computational science to answer ambitious biological questions and generate enabling resources. EI’s researchers bring together a wealth of expertise in biosciences, genomics, bioinformatics, high-performance computing and statistics to understand complex biological systems and their interaction with the environment. We have access to cutting-edge, diverse genomic and computational technologies, coupled with a unique provision to efficiently produce and process biological data at large scale. We also host one of the largest computing hardware facilities dedicated to life science research in Europe.

What we’re doing in the Darwin Tree of Life Project:

The DToL team at EI is involved in two workstreams within the DToL grant. First, our eInfrastructure team is responsible for adapting our COPO platform to broker the metadata for all samples in the project. This involves close collaboration with the Genome Acquisition Labs (GALs) to develop and maintain a fit-for-purpose metadata collection manifest, and with the Sanger Sample Tracking System and the ENA to ensure that all metadata is correctly linked to each sample through every stage of the process. Second, we are developing a novel wet-lab and bioinformatic pipeline to generate high-quality annotated genomes from single-celled eukaryotic protists. In this second endeavour, we are working closely with the Oxford GAL, who are collecting and sorting environmental water samples into single-cell experimental units. After the samples arrive at EI, we are responsible for all stages from cell to genome, including DNA and RNA extractions, sequencing, assembly and annotation.

Why we’re invested in the Darwin Tree of Life Project:

Single-celled protists are the most biodiverse groups of organisms on the planet, harbouring novel traits, evolutionary paths and bioeconomic potential. Despite their clear scientific and economic interest, the vast majority of these organisms remain underexplored and underrepresented in sequencing databases. This is because they are by definition small, hard to identify visually and, in most cases, difficult to culture. We want to transform the current situation by removing the technological roadblocks that are hindering scientific investigation of these taxa. We aim to do this by providing a high-throughput, end-to-end pipeline that produces high-quality annotated genomes from individual cells. Our vision is for this pipeline to be used in the global Earth Biogenome Project (EBP) to enable a watershed movement in protist studies, ranging from discovery work to fundamental blue-skies research to studies with a translational focus.

Systematically collecting the metadata associated with every sample gathered for the DToL endeavour will increase the reach of the data, allowing it to be used in a greater breadth of studies. EI is passionate about correctly curating and maintaining metadata associated with all biological samples. The COPO data brokering platform, which was developed at EI, will play a crucial role in the DToL project, and we hope that it will next be adopted and scaled to the EBP.

Our people in DToL are: