The need for diversity in genome sequencing

Among the various things that unite humans around the world, DNA ranks near the top: 99.9% of the human DNA sequence is identical from one person to the next.

Gregor Mendel, a monk and scientist whose 200th birthday falls this Wednesday (July 20), suggested that some “unseen factors” were responsible for the various traits we display. Today, we know that these factors are genes, which are made of DNA, or deoxyribonucleic acid.

This molecule carries the genetic instructions of living organisms. If humans share so much of the same DNA, why is diversity important in the context of DNA sequencing?

To understand this, we have to shift our focus to the 0.1% of the human DNA sequence that varies. That seemingly small share amounts to roughly 3 million of the nearly 3 billion bases (nitrogen-containing compounds) in our DNA.

All the differences that we know of between human beings, including hair color, eye color and height, are due to these variations.

However, over the years scientists have found that these differences can also give us vital information about an individual’s or population’s risk of developing a particular disease.

We can then use risk assessment from genetic data to design a health care strategy tailored to the individual.

Genetics and disease risk assessment

Many of us have experience filling out forms in the doctor’s office asking about the various illnesses our parents or relatives have suffered. You are warned to stay away from sweets and processed sugars if a parent has diabetes, for example.

While the transmission of heart disease, cancer and diabetes from one generation to the next is the most widely known, many other diseases can also be hereditary.

For example, we know that sickle cell anemia occurs when a person inherits two abnormal copies of the gene that makes hemoglobin, a protein in our red blood cells, one from each parent.

In recent decades, genetic research has advanced to the point where scientists can isolate the genes responsible for many of these diseases.

And herein lies the problem: We know these links between genes and disease for only a very limited part of the population.

Eurocentric data

Sarah Tishkoff, a geneticist and evolutionary biologist at the University of Pennsylvania in the US, is one of many in the scientific community calling for more diverse genomic data sets.

“Suppose a study focusing on people of European ancestry identifies genetic variants associated with risk of heart disease or diabetes, and uses this information to predict disease risk in patients not included in the original study,” Tishkoff said.

“We know from experience that this prediction of disease risk does not work well when applied to individuals of different ancestry, especially if they have African ancestry.”

Historically, the people who provided DNA for genomics research were mostly of European ancestry, “creating gaps in knowledge about the genome from people in the rest of the world,” according to the National Human Genome Research Institute (NHGRI) in the United States.

The institute states that 87% of all genome data come from individuals of European ancestry, followed by 10% from people of Asian ancestry and 2% from people of African ancestry.

As a result, the potential benefits of genetic research, which include the understanding, early diagnosis and treatment of various diseases, may not reach underrepresented populations.

Unfair treatment

The problem does not stop with assessing the risk of disease. Jan Witkowski, a professor in the Graduate School of Biological Sciences at Cold Spring Harbor Laboratory in the US state of New York, says it extends into the fairness of healthcare, too.

“Suppose you have two groups: Group A and Group B, which are completely different. The knowledge and information you learn about people in Group A may not apply to people in Group B. But imagine developing medical treatments for everyone based only on information from Group A,” he said, adding, “It won’t work for Group B.”

By including diverse populations in genomic studies, researchers can identify genetic variants associated with different health outcomes at the individual and population levels.

The NHGRI also states, however, that diversifying the pool of people involved in genomics research is costly and requires establishing long-term, respectful relationships of trust between communities and researchers.
