Pioneering research finds missing pieces in the genomic puzzle

Tech Science | 6 July 2018 | 2 min | Postdoc Jonas Andreas Sibbesen | Written by Morten Busch

Today, we can get our genome sequenced for less than DKK 4000 and find out how small changes in our genome might affect the risk of various diseases. The computer programs that compare genomes have primarily focused on these small changes, while major changes in the genome have often been overlooked. Now Danish researchers have developed a new algorithm that finds the pieces that are often overlooked in the enormous genomic puzzle. The new method is expected to have important applications in the personalized medicine of the future.


Jonas Andreas Sibbesen

Computational and RNA Biology, University of Copenhagen


Determining the sequence of a person’s genome is similar to assembling a jigsaw puzzle. Current technology cannot actually decode the entire genome in one piece. Instead, it produces a gigantic puzzle comprising billions of small pieces, and advanced algorithms must assemble them before the genetic profile can be decoded.

“Analysing genome sequencing data requires laying each individual piece on top of a set of known pieces, called the reference genome. New pieces, such as genomic insertions, are therefore easily overlooked because placing them correctly on the reference genome is difficult. We have developed a new computer algorithm that creates this genomic reference in 3D. This offers greater opportunities for discovering the complex and often overlooked genomic changes and thus provides a clearer image of the genomic landscape,” explains one of the lead authors, Jonas Andreas Sibbesen, Section for Computational and RNA Biology, Department of Biology, University of Copenhagen.

Hard to process the extra pieces

Genome sequencing has become affordable for almost anyone. For a few thousand Danish kroner, people can have their entire genome sequenced and thus obtain information on variants in their genome and how these might affect their risk of developing various diseases such as cancer and metabolic diseases.

“Providing these answers requires advanced computer algorithms that can assemble the genomes and compare them with a standard genome. Paradoxically, the algorithms used so far have primarily discovered the smaller genetic variants in the genome, but the major variants such as genomic insertions have remained a blind spot for researchers.”

One approach to assembling the genomic puzzle is to fit the pieces together from scratch, without knowing the picture the puzzle portrays. With billions of pieces, this task is incredibly time-consuming and laborious, which is why this assembly method is seldom used. Mapping is therefore often the preferred method: the tiny pieces are instead placed onto a reference genome, a known puzzle. This makes the analysis much easier. However, in regions where the sequenced individual’s genome and the reference genome differ greatly, this technique can result in variants being overlooked.

“For example, we know that there are many variants in the HLA region, which contains genes that play key roles in our immune system. The pieces there can differ so greatly from the reference genome that embedding them is almost impossible, resulting in many variants in this region not being visible.”
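The limitation of mapping to a single reference can be illustrated with a toy sketch. It is not the software used by the researchers: the sequences are made up, and `map_read` is a hypothetical helper that simply slides a read along the reference and accepts the best placement if it needs at most one mismatch. A read carrying a small change still finds its place, whereas a read from inside a large insertion has nowhere to go.

```python
def map_read(read, reference, max_mismatches=1):
    """Return the 0-based position where `read` fits `reference` best,
    or None if every placement needs more than `max_mismatches` mismatches."""
    best_pos, best_mismatches = None, max_mismatches + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches < best_mismatches:
            best_pos, best_mismatches = pos, mismatches
    return best_pos


# Made-up reference sequence (the "known puzzle").
reference = "ACGTTGCAAGGCTTAGCCATGA"

# Read with a single-letter change relative to reference[2:10] ("GTTGCAAG").
snp_read = "GTTACAAG"
# Read taken from inside a made-up insertion that the reference lacks.
insertion_read = "CACACACA"

snp_position = map_read(snp_read, reference)
insertion_position = map_read(insertion_read, reference)

print(snp_position)        # 2
print(insertion_position)  # None
```

Real read mappers are far more sophisticated than this brute-force scan, but the blind spot is the same in kind: a piece that has no counterpart in the reference cannot be placed on it.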

The researchers’ new algorithm uses a new approach: instead of working with a randomly selected reference genome, genetic variants from many individuals can be used simultaneously.

“This trick provides much greater opportunities to use genetic variants known from previous studies in analysing new individuals, which increases the sensitivity for more complex forms of genetic variation. You could say that, instead of embedding the pieces in a single individual, we embed them in thousands of individuals simultaneously.”
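The idea of comparing the pieces against known variants, rather than against one reference alone, can be sketched as follows. This is a simplified illustration under stated assumptions, not the published algorithm: the sequences and variants are invented, the function names are hypothetical, and counting short subsequences (k-mers) is used only as a convenient stand-in, since the article does not describe how the matching is done. Each known variant contributes an alternative path next to the reference path, and the reads are checked against both.

```python
def kmers(sequence, k):
    """Return the set of all length-k substrings of `sequence`."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def allele_kmers(reference, pos, ref_allele, alt_allele, k):
    """K-mers that distinguish the two alleles of one variant, using
    k - 1 letters of reference context on each side."""
    left = reference[max(0, pos - (k - 1)):pos]
    end = pos + len(ref_allele)
    right = reference[end:end + k - 1]
    ref_set = kmers(left + ref_allele + right, k)
    alt_set = kmers(left + alt_allele + right, k)
    return ref_set - alt_set, alt_set - ref_set


def allele_support(reads, reference, variants, k=5):
    """Count, per variant, how many allele-specific k-mers occur in the reads."""
    read_kmers = set()
    for read in reads:
        read_kmers |= kmers(read, k)
    support = {}
    for name, (pos, ref_allele, alt_allele) in variants.items():
        ref_only, alt_only = allele_kmers(reference, pos, ref_allele, alt_allele, k)
        support[name] = {
            "ref": len(ref_only & read_kmers),
            "alt": len(alt_only & read_kmers),
        }
    return support


# Made-up reference and made-up variants "known from previous studies":
# (0-based position, reference allele, alternative allele).
reference = "ACGTTGCAAGGCTTAGCCATGA"
known_variants = {
    "snp": (5, "G", "A"),
    "insertion": (12, "", "CACACACA"),
}

# Reads from a hypothetical individual carrying both alternative alleles.
reads = ["GTTACAAG", "AGGCCACA", "CACACACA", "CACATTAG"]

support = allele_support(reads, reference, known_variants)
print(support)
# {'snp': {'ref': 0, 'alt': 4}, 'insertion': {'ref': 0, 'alt': 10}}
```

In this toy setting the insertion is supported by every read that touches it, because the alternative path gives those reads somewhere to land. A real method must also deal with sequencing errors, repeated sequence and uncertainty in the counts, all of which this sketch ignores.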

Revealing the dark patches

Genome sequencing data have already revolutionized the opportunities for researchers and doctors to investigate the human genome, and this trend is set to continue. In Denmark, the GenomeDenmark project has mapped the Danish reference genome, and this work was the basis on which a research group from the Section for Computational and RNA Biology at the Department of Biology of the University of Copenhagen developed the new and pioneering algorithm.

“In the GenomeDenmark project, we used our algorithm to significantly enlarge the spectrum of genetic variants that can be identified from such data. This especially applied to the more complex variations such as large deletions and insertions in the genome, where we discovered many new and previously unseen variations.”

The ability to better visualize the previously dark patches on the genetic map is expected to have important applications in personalized medicine, in which charting an individual’s genetic profile will play a role in choosing treatment.

“As more and more countries launch these large-scale national genome projects, having algorithms that can give doctors a more complete genetic picture is increasingly essential. The goal is therefore to continually become better at discovering new variations in our genomes because this will probably help in providing more answers as to why we become ill and how we need to be treated.”

“Accurate genotyping across variant classes and lengths using variant graphs” has been published in Nature Genetics. Lasse Maretty and Anders Krogh from the Bioinformatics Centre, Department of Biology, University of Copenhagen are co-authors. The Novo Nordisk Foundation and Innovation Fund Denmark funded the project.




