Human cell atlas: mapping the building blocks of life

By Ellie Fung

In 2003, the completion of the Human Genome Project (HGP) marked a major milestone in biological research. The collaborative efforts of research groups worldwide, later boosted by the development of next-generation sequencing technologies, culminated in the first fully sequenced human reference genome.¹ Since then, the HGP has transformed research in human biology and disease. Large-scale projects, such as the HapMap Project and the 1000 Genomes Project, were soon launched to understand the genetic variation underlying human health and disease. The HGP laid the foundation for systems biology and omics, enabling novel biological insights to be drawn at several molecular levels.²

Analogous to the HGP, the Human Cell Atlas (HCA) is an ongoing international effort to generate an openly accessible, foundational reference map for all cell types in the human body. The completed atlas would include comprehensive molecular profiles for each cell type, their relative abundances, function, spatiotemporal localisation patterns and changes in cell state under different contexts. The relationships between cellular populations would also be mapped out.³

The HCA Initiative was launched in 2016, prompted by the emergence of single-cell technologies. Earlier sequencing approaches, performed on bulk tissue samples, yielded average genomic and transcriptomic measures across all constituent cells, and thus failed to capture a tissue’s underlying cellular heterogeneity. In contrast, single-cell genomic technologies profile the molecular properties of every individual cell within a tissue sample.³ These ultra-high-resolution technologies help elucidate the intricate molecular and cellular networks that make up human tissue.
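
The contrast between bulk and single-cell measurement can be illustrated with a toy example. The sketch below is purely illustrative: the cell names, the expression counts and the helper functions are hypothetical, and real sequencing data would need normalisation and quality control first.

```python
from statistics import mean

# Hypothetical counts for one gene in six cells of a tissue sample.
# Two cells belong to a rare population that expresses the gene strongly.
cells = {
    "cell_1": 0, "cell_2": 0, "cell_3": 0, "cell_4": 0,
    "cell_5": 30, "cell_6": 30,
}

def bulk_measurement(expression):
    """Mimic bulk sequencing: a single averaged value for the whole sample."""
    return mean(expression.values())

def expressing_cells(expression, threshold=1):
    """Mimic single-cell profiling: keep per-cell values and list expressing cells."""
    return sorted(name for name, count in expression.items() if count >= threshold)

bulk = bulk_measurement(cells)      # one number; the rare population is hidden
single = expressing_cells(cells)    # the two expressing cells are recovered

print(bulk)
print(single)
```

The averaged value alone cannot distinguish a few strongly expressing cells from uniform low expression across the sample, whereas the per-cell view can.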

Notably, single-cell RNA-seq (scRNA-seq) can profile the total mRNA expression of individual cells, revealing the heterogeneity and spatiotemporal dynamism of a given tissue at cellular resolution.⁴ Distinct cell populations and transient cell states can be identified based on similarities between the gene expression profiles of individual cells. This can reveal the complex cellular composition of tissues under certain conditions.⁴ Characterising the whole transcriptome of each unique cell population in a tissue may also allow for bioinformatic inference of interaction patterns between cell types and of gene regulatory networks. Furthermore, single-cell data gathered across different developmental time points can reveal the developmental trajectory of a specialised cell of interest, from the earliest multipotent stem cells to mature, terminally differentiated cells. The cellular and molecular factors that influence cell fate decisions, transitions between differentiation states and cell growth may be dissected as well.⁵
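
As a rough illustration of how cells might be grouped by the similarity of their expression profiles, the sketch below correlates hypothetical four-gene profiles and greedily assigns each cell to a group. This is a simplified stand-in, not the method used by the HCA: real scRNA-seq analyses typically involve normalisation, dimensionality reduction and dedicated clustering algorithms, and the cell names, values and the 0.8 threshold here are assumptions.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat profile carries no similarity information
    return cov / sqrt(var_a * var_b)

def group_cells(profiles, min_corr=0.8):
    """Greedy grouping: a cell joins the first group whose founding cell it resembles."""
    groups = []
    for name, profile in profiles.items():
        for group in groups:
            founder = group[0]
            if pearson(profile, profiles[founder]) >= min_corr:
                group.append(name)
                break
        else:
            groups.append([name])  # no similar group found, so start a new one
    return groups

# Hypothetical expression of four genes in four cells.
profiles = {
    "cell_a": [9, 8, 0, 1],
    "cell_b": [8, 9, 1, 0],
    "cell_c": [0, 1, 9, 8],
    "cell_d": [1, 0, 8, 9],
}

groups = group_cells(profiles)
print(groups)
```

In this toy data the first two cells share one expression pattern and the last two share another, so they fall into two putative populations.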

Since scRNA-seq requires tissue dissociation, positional information about cells in their broader endogenous context is lost. Yet a cell’s physical location is an important determinant of its molecular properties, as gene expression is strongly influenced by signals from the microenvironment and nearby cells.⁶ This limitation was overcome by the development of spatially resolved transcriptomics, which profiles the transcriptomes of individual cells while preserving their spatial localisation patterns. With these technologies, the complex cellular architecture and physical cell-cell relationships within tissues can finally be uncovered. Given its potential to reveal deeper biological insights, spatially resolved transcriptomics was even named Method of the Year 2020 by Nature Methods.⁷

Owing to single-cell technologies, the HCA now contains data for many tissues in various contexts. Although it is far from complete, the HCA has already permitted a better understanding of human biology and disease. Notably, the HCA community currently dedicates much effort to enriching knowledge of SARS-CoV-2 biology and pathology.⁸ Pooling scRNA-seq data across 25 tissues identified cell types that were susceptible to infection according to their expression of ACE2, the SARS-CoV-2 receptor. The abundance of ACE2 in type II alveolar lung cells and gastrointestinal enterocytes suggested that transmission occurs via the respiratory and faecal-oral routes.⁹ The analysis further indicated that cell types in diverse organs may be infected, consistent with the range of multi-organ symptoms observed in COVID-19 patients. As many of these cells were rare or exhibited low ACE2 expression, they could not have been identified by traditional bulk tissue sequencing.⁹ Integrating HCA data with epidemiological factors also identified at-risk populations.⁸ Ultimately, these insights may go on to inform clinical care and therapeutic development.

The HCA offers the opportunity to uncover non-conserved, human-specific aspects of development. The traditional use of model organisms in developmental biology has imparted key insights into how a fertilised egg develops into a fully formed foetus at birth, but not all identified principles may be directly applied to humans.¹⁰ The Human Developmental Cell Atlas (HDCA), a branch of the HCA, aims to produce a comprehensive reference map of cell types and cell states throughout human development.¹¹ Accordingly, these atlases may illuminate the poorly understood mechanisms underlying congenital disorders, which arise from perturbations in prenatal development.

Developmental atlases are also illuminating how dysregulated developmental pathways are involved in the pathogenesis of childhood- and adult-onset cancers.¹¹ The malignant pathology of several cancers often involves the re-emergence of developmental molecular programs. For instance, the transcriptomic profile of Wilms’ tumour cells resembled that of foetal cell states during kidney formation, suggesting that some renal tumours have developmental origins and establish malignancy by co-opting foetal programs.¹² Recently, scRNA-seq of foetal skin suggested that foetal genetic programs re-emerge in the pathogenesis of inflammatory skin diseases¹³, indicating that developmental pathways underlie a broader range of diseases than previously appreciated.

With the advent of cutting-edge technologies, significant headway has already been made on the HCA. However, many roadblocks still need to be overcome before the atlas can be completed. There is a wide range of scRNA-seq and spatially resolved transcriptomic approaches, each differing in throughput and resolution. Integrating their data computationally allows deeper, more comprehensive insights into tissues to be gathered. Yet this is highly challenging, as each approach generates an abundance of noisy, multiplexed data with various parameters. There is thus a need for novel algorithms to efficiently analyse such complex data and, more importantly, harmonise diverse datasets.⁴˒¹⁴ Furthermore, the robustness of these technologies, and their amenability to wider application outside the laboratories in which they were developed, are unknown.⁶

Once these challenges are resolved, the HCA would greatly accelerate biomedical research and clinical application. By providing a comprehensive reference of the cells in the human body in various spatiotemporal contexts, the atlas would allow gene expression in different physiological states to be determined. This would allow the effects of disease-causing mutations to be studied at the cellular level. Expression signatures derived for unique cell types in different states may then enable disease diagnosis and monitoring.³ These signatures may also be harnessed in experimental assays to isolate, label and track cells in disease models, or to assess the physiological impacts of small-molecule drugs on target cells.³ Furthermore, by revealing the differentiation trajectory of various cell types and the molecular factors required in the process, the HDCA may inform the development of regenerative medicines, such as stem cell therapies and tissue replacement.¹¹

Akin to the HGP, completion of the HCA would transform the way human biology and disease are studied and usher in a new era of biological research. Importantly, the HCA is yet another example of the scientific feats that can be accomplished by collaborative efforts across global scientific communities.


(1) International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature. 2004; 431 (7011): 931-945. 10.1038/nature03001. 

(2) Hood L, Rowen L. The Human Genome Project: big science transforms biology and medicine. Genome Medicine. 2013; 5 (9): 79. 10.1186/gm483. 

(3) Regev A, Teichmann SA, Lander ES, Amit I, Benoist C, Birney E, et al. Science Forum: The Human Cell Atlas. eLife. 2017; 6: e27041.

(4) Chen G, Ning B, Shi T. Single-Cell RNA-Seq Technologies and Related Computational Data Analysis. Frontiers in Genetics. 2019; 10: 317. 10.3389/fgene.2019.00317.

(5) Griffiths JA, Scialdone A, Marioni JC. Using single‐cell genomics to understand developmental processes and cell fate decisions. Molecular Systems Biology. 2018; 14 (4): e8046. 10.15252/msb.20178046.

(6) Asp M, Bergenstråhle J, Lundeberg J. Spatially Resolved Transcriptomes—Next Generation Tools for Tissue Exploration. BioEssays. 2020; 42 (10): e1900221. 10.1002/bies.201900221.

(7) Marx V. Method of the Year: spatially resolved transcriptomics. Nature Methods. 2021; 18 (1): 9-14. 10.1038/s41592-020-01033-y. 

(8) Teichmann S, Regev A. The network effect: studying COVID-19 pathology with the Human Cell Atlas. Nature Reviews Molecular Cell Biology. 2020; 21 (8): 415-416. 10.1038/s41580-020-0267-3. 

(9) Sungnak W, Huang N, Becavin C, Berg M, Queen R, Litvinukova M, et al. SARS-CoV-2 entry factors are highly expressed in nasal epithelial cells together with innate immune genes. Nature Medicine. 2020; 26 (5): 681-687. 10.1038/s41591-020-0868-6. 

(10) Müller WA. Model Organisms in Developmental Biology. In: Müller WA. (ed.) Developmental Biology. New York, NY: Springer New York; 1997. pp. 21-121.

(11) Haniffa M, Taylor D, Linnarsson S, Aronow BJ, Bader GD, Barker RA, et al. A roadmap for the Human Developmental Cell Atlas. Nature. 2021; 597 (7875): 196-205. 10.1038/s41586-021-03620-1. 

(12) Young MD, Mitchell TJ, Vieira Braga FA, Tran MGB, Stewart BJ, Ferdinand JR, et al. Single-cell transcriptomes from human kidneys reveal the cellular identity of renal tumors. Science. 2018; 361 (6402): 594-599. 10.1126/science.aat1699. 

(13) Reynolds G, Vegh P, Fletcher J, Poyner EFM, Stephenson E, Goh I, et al. Developmental cell programs are co-opted in inflammatory skin disease. Science. 2021; 371 (6527): 364. 10.1126/science.aba6500. 

(14) Larsson L, Frisén J, Lundeberg J. Spatially resolved transcriptomics adds a new dimension to genomics. Nature Methods. 2021; 18 (1): 15-18. 10.1038/s41592-020-01038-7. 
