Bioinformatics

A complete human genome, a complete view on humans

Complete Human Genome
Photo: All You Need is Biology (Wordpress)

Iris Mestres, a BDBI student, writes about the Human Genome Project and the long-awaited complete sequence of a human genome: what it is and why it matters to all of us. It is a tremendous step towards understanding both our development and the causes of some diseases.

The world of scientific research has been rocked in recent days by the news that we have, at last, the whole sequence of a human genome. Ever since the ‘Human Genome Project’ started in 1990, the aim has been to obtain the complete sequence. In April 2003 the project reached 99% of the gene-containing part of the human sequence (nearly 92% of the whole genome) and, 18 years later, 100% of the genome has been achieved. But what is a genome? Why has it been so important to have it all sequenced? What is going to come next?

Each living being is built from a set of instructions, a manual known as DNA. Human DNA is a molecule with two strands, each of which has about 3 billion letters. Not just any letter can appear there: it takes only 4 letters (A, C, G and T, which stand for the ‘nucleotides’ Adenine, Cytosine, Guanine and Thymine respectively) to write all the instructions that make us as unique as we are. Our particular characteristics are encoded in stretches of letters of different lengths, known as genes. All these genes are distributed among the 23 pairs of human chromosomes, one pair of which determines sex (XX for females and XY for males). Still, that’s not everything: not every group of letters in a chromosome codes for a gene, for a physical characteristic. Some parts, known as non-coding DNA sequences, have other functions, such as deciding whether a gene is expressed or not, and the function of others is still unknown.

It may seem quite messy, but it’s easier if the DNA is pictured as an encyclopedia. The whole set of volumes would be the DNA and each individual volume a chromosome. Every entry, every article, would be a gene and, finally, the letters that spell out each entry would be the bases composing the DNA. Non-coding sequences may be pictured as the spaces between lines or between entries.

Before going further, it is important to stress that although any two humans share 99.9% of the sequence, the remaining 0.1% is what makes each of us unique. This tiny portion may seem insignificant, but it is not: some of these differences fall inside genes or in the sequences that control them. Genes are responsible not only for giving you brown or red hair, freckles or any other physical trait, but also for how all your organs and cells function. A single-letter change inside a gene, called a genetic variant or mutation, can make your hair lighter or raise your risk of developing a disease such as type 2 diabetes.
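To get a feel for how big that 0.1% really is, here is a small back-of-the-envelope calculation in Python. It uses the round figures from this article (3 billion letters, 99.9% shared); the function name `differing_letters` is made up for this illustration and the result is only an order of magnitude, not a measured value.

```python
from fractions import Fraction

def differing_letters(genome_length, shared_fraction):
    """Rough count of letters that differ between two genomes.

    Both inputs are the article's round figures, not measured values.
    Fraction is used so the arithmetic stays exact (no rounding noise).
    """
    return int(genome_length * (1 - shared_fraction))

# About 3 billion letters, 99.9% of them shared between any two people.
n = differing_letters(3_000_000_000, Fraction(999, 1000))
print(f"{n:,} letters differ")  # 3,000,000 letters differ
```

So that "tiny" 0.1% still amounts to roughly three million letters.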

With this big picture, we can now define what the genome is. The genome is all the genetic material of an organism, which is normally DNA. It is the complete sequence of all the nucleotides that form all the chromosomes of an organism. The genome is the whole encyclopedia, the full set of instructions that build us.

In 2003, when almost all of the genome had been sequenced, scientists called their result ‘The human genome reference sequence’. Building this reference took samples from a number of anonymous volunteers, hundreds of researchers from different fields (from biologists to engineers) and billions of dollars. Having this reference genome was a huge step forward in genetics and genomics. By comparing the reference genome with any other genome and looking for differences, we were able to study which mutations within a gene lead to certain diseases, how these diseases are passed on to offspring and how genes are switched on or off. But that was not enough: the reference genome was not complete, and its gaps left us blind in some respects. To put it simply, imagine you want to bake a pie: you open the recipe book and find stains that make some of the instructions hard or impossible to read. You may have a general idea of how to do it, but you won’t know how to carry out some of the steps or which ingredients they need. So the research was not over and, through the years, the reference genome has been extended little by little. However, no one had ever claimed to have a whole human genome, at least not until now.
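The idea of comparing a genome with the reference and looking for differences can be sketched in a few lines of Python. This is only a toy under strong assumptions: the two sequences are invented, they have the same length, and only single-letter substitutions are considered. Real comparisons work on billions of letters, need an alignment step and must also handle inserted or missing letters. The function name is hypothetical.

```python
def find_single_letter_differences(reference, sample):
    """Return (position, reference_letter, sample_letter) for every mismatch.

    Positions are counted from 1. Toy version: both sequences must have
    the same length, so insertions and deletions are not handled.
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (position, ref_letter, sample_letter)
        for position, (ref_letter, sample_letter)
        in enumerate(zip(reference, sample), start=1)
        if ref_letter != sample_letter
    ]

# Two invented 12-letter sequences that differ at a single position.
reference = "ATGCCGTAAGCT"
sample = "ATGCCATAAGCT"
print(find_single_letter_differences(reference, sample))  # [(6, 'G', 'A')]
```

Here the sample carries an A where the reference has a G at position 6, which is the kind of single-letter variant described earlier.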

A huge team of researchers from different institutions, universities and countries, led by Karen Miga and Adam Phillippy, has announced that it has sequenced a whole human genome. This work has not yet been peer reviewed, so it might show some weaknesses in its assumptions, methods or conclusions; for now it is only published on bioRxiv, a preprint server where studies are uploaded before formal publication in a scholarly journal. Nevertheless, this first truly complete sequence of a human genome represents the largest improvement to the human reference genome since its initial release. This complete human genome, achieved by the Telomere-to-Telomere (T2T) consortium, includes the gapless sequence of all 22 autosomes and chromosome X, while correcting numerous errors. Chromosome Y remains to be sequenced, because the cells that were used carry an X but no Y chromosome, but the same consortium says it will be added by sequencing another genome. Overall, this report will help us better understand the role genes play in defining our unique characteristics and in the development of several diseases.

This feat is not the end of the road; there are still two issues to be faced. The first is that the human reference genome is built from people of the same ancestry group, so it represents only part of the human species. This may seem a negligible detail, but it matters: what is a good treatment for one ancestry group might not work as well for another. As a single reference genome cannot represent all of humanity, a whole set of reference genomes for populations of different ancestry is needed to capture human genomic diversity. To achieve this, the T2T consortium has started a project called The Human Pangenome. Last but not least, having the complete human genome, or even the human pangenome, is not all we need to understand the role of genes in giving us different traits or how diseases appear. In the last few years, it has been shown that environmental factors, from those we are exposed to before birth to those we face throughout our lives, may affect how genes work without changing a single letter in our genome. This is the field of ‘epigenetics’, and it also plays a big role in what we are like and how we change, from our physical appearance to the internal functioning of our body.

Because of all this, there is still a lot of work to be done, but the sequencing of a whole human genome is a tremendous step towards understanding both our development and the causes of some diseases.

For further and more detailed information, you can read this article published in ‘The Atlantic’.
