In a massive international collaboration, researchers systematically analysed cancer genome sequences from 2,658 patients across 38 types of cancer. The sequences have been compiled into a database that researchers and clinicians can access to learn about cancer, diagnoses and treatments.
When 1,300 scientists from 37 countries join forces to map the human cancer genome, things really happen.
In a mammoth international collaboration, the researchers analysed whole cancer genome sequences from 2,658 patients across 38 types of cancer.
Based on this comprehensive data analysis, they have created a huge open-source catalogue of all the DNA mutations that either caused the cancer or were caused by the development of cancer.
Doctors and researchers worldwide can access the catalogue and use it as a reference in their research or in guiding treatment and diagnosis.
“Technology has evolved, and whole-genome sequencing has become much less expensive. This therefore enabled a comprehensive study of whole genomes from tumours and improved understanding of the genetic background of the development of cancer,” explains a Danish contributor to this enormous research project, Jakob Skou Pedersen, Professor, Department of Clinical Medicine, Aarhus University.
The research collaboration, Pan-Cancer Analysis of Whole Genomes, has lasted 7 years. Jakob Skou Pedersen, together with Gad Getz from the Broad Institute of MIT and Harvard, led the research group analysing point mutations.
The Pan-Cancer Analysis of Whole Genomes project led to this article in Nature and 22 other articles in Nature or other Nature journals.
Researcher involved in the project from the start
Another Danish contributor to the project is Joachim Weischenfeldt, Group Leader, Biotech Research & Innovation Centre, University of Copenhagen and Finsen Laboratory, Rigshospitalet, Copenhagen.
He has been involved since the project started and also helped to develop the analytical tools needed to handle the massive quantity of data.
Joachim Weischenfeldt, together with Rameen Beroukhim of the Dana-Farber Cancer Institute, Boston, United States, and Peter Campbell of the Wellcome Sanger Institute, Hinxton, United Kingdom, led the analysis of the type of mutations called rearrangements. Weischenfeldt says that the new database is unique because the researchers analysed the whole genome and not merely the 1% that comprises protein-coding genes.
“Considerable disease genetics lies outside the protein-coding element of the genome. Examining this region of the genome is therefore important and not just the 1% comprising protein-coding genes. An important goal therefore involved careful analysis and interpretation of this 99% of our DNA for aberrations in relation to cancer and thereby creating the largest whole-genome database ever created on cancer before treatment begins,” says Joachim Weischenfeldt.
Fewer new mutations than expected in the non-coding region of the genome
Cancer causes many mutations in the protein-coding part of the DNA, in which some building blocks of DNA replace others. The mutations that promote cancer development are called driver mutations. For example, mutations that affect genes that control cell division can cause uncontrolled cell division and cancer.
However, not all mutations lead to cancer.
Sun damage to the skin and smoking, for example, cause tens of thousands of mutations in DNA, but only a handful are genomic drivers in tumours. The other mutations, called passenger mutations, play no role in promoting cancer development.
Researchers searching for drivers in tumours have always focused on mutations in genes encoding key proteins in the cells. In the new research, however, Jakob Skou Pedersen and colleagues put great effort into finding significant mutations in the non-coding regions of the genome.
These regions of the genome can play a role in regulating the expression of genes.
The researchers found a few significant mutations in the non-coding region of the genome but far fewer than expected.
“Our analysis shows that only about 10% of the driver mutations lie outside the protein-coding regions of the genome. In this part, we found less than 10 gene regulatory regions that are clearly important in developing cancer, but we expected many more,” says Jakob Skou Pedersen.
Both confirms and refutes previous discoveries
Overall, the research confirms the importance of several drivers in both the protein-coding region and the non-protein-coding region of the genome.
The analysis also casts doubt on the importance of several mutations previously thought to be drivers.
In addition, the analysis identified new mutations that are important for developing various types of cancer.
“Determining whether a mutation is a driver for developing cancer requires gathering enough data on tumours to verify the mutation across multiple patients. But the data analysis tools must also be in place to be able to manage and analyse the vast amount of information available across the entire human genome of more than 2,600 cancer patients,” explains Jakob Skou Pedersen.
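The idea of verifying a mutation across multiple patients can be illustrated with a toy recurrence count. This is only a sketch under stated assumptions, not the project's method: the function names, region labels, patient identifiers and the threshold are hypothetical, and a real driver analysis requires statistical modelling far beyond simple counting.

```python
from collections import defaultdict

def recurrence(mutations):
    """Count distinct patients per region.

    `mutations` is an iterable of (patient_id, region) pairs.
    All names and data in this example are hypothetical.
    """
    patients_by_region = defaultdict(set)
    for patient_id, region in mutations:
        # A set ensures each patient is counted once per region.
        patients_by_region[region].add(patient_id)
    return {region: len(ids) for region, ids in patients_by_region.items()}

def recurrent_regions(mutations, min_patients=3):
    """Return regions mutated in at least `min_patients` distinct patients."""
    counts = recurrence(mutations)
    return sorted(r for r, n in counts.items() if n >= min_patients)

# Made-up mutation calls for illustration only.
calls = [
    ("P1", "region_A"), ("P2", "region_A"), ("P3", "region_A"),
    ("P1", "region_B"), ("P1", "region_B"),  # same patient twice
    ("P2", "region_C"),
]
print(recurrent_regions(calls, min_patients=3))
```

Counting distinct patients rather than raw mutation calls keeps a single heavily mutated tumour from looking like recurrence across the cohort.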
Rearrangements in the non-coding region of the genome often cause cancer
Joachim Weischenfeldt’s analysis of DNA rearrangements (also known as structural variants) in both the protein-coding and the non-protein-coding regions of the genome shows a different picture from that for point mutations.
DNA rearrangements, in which parts of the chromosome are cut and assembled incorrectly, often occur in the non-protein-coding region.
According to Joachim Weischenfeldt, this is because rearrangements can affect how cancer genes are regulated without directly mutating the cancer gene. In addition, rearrangements can affect many genes at the same time, thus pushing the expression of several cancer genes in an inappropriate direction.
“In the Pan-Cancer Analysis of Whole Genomes, we have identified important driver rearrangements of non-protein-coding regions of the genome,” says Joachim Weischenfeldt.
The same mutations cause different types of cancer
This large catalogue of driver mutations in known cancer genes that emerged from the research project may be a source of even more future discoveries related to cancer research, treatment and diagnosis.
For example, researchers can compare their own data with data from the catalogue and thereby identify whether their discovery is likely to play an important role in the development of cancer.
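Such a comparison can be pictured as a simple set lookup. The sketch below is an illustration under assumptions, not the catalogue's actual interface: the function name, the (chromosome, position, reference base, alternative base) representation and all coordinates are made up for the example.

```python
def match_catalogue(sample_mutations, catalogue):
    """Split a sample's mutations into catalogued and uncatalogued ones.

    Each mutation is a hypothetical (chromosome, position, ref, alt) tuple.
    Returns two sorted lists: (found in catalogue, not found in catalogue).
    """
    sample = set(sample_mutations)
    known = sample & set(catalogue)
    return sorted(known), sorted(sample - known)

# Made-up catalogue entries and sample calls, for illustration only.
catalogue = {
    ("chr1", 100, "A", "T"),
    ("chr2", 200, "G", "C"),
}
sample = [
    ("chr1", 100, "A", "T"),  # present in the toy catalogue
    ("chr3", 300, "C", "G"),  # not in the toy catalogue
]

known, unknown = match_catalogue(sample, catalogue)
print("In catalogue:", known)
print("Not in catalogue:", unknown)
```

A mutation that is absent from the catalogue is not necessarily harmless; in this sketch it only means the toy catalogue has no entry for it.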
But the idea is also that more information will be added to the database in the future so that researchers and doctors can also see how their colleagues around the world have approached treatment and diagnosis of people with different types of cancer that result from specific mutations.
Specific mutations often lead to different types of cancer in different tissues.
For example, one mutation can cause both breast cancer and pancreatic cancer.
Thus, even when two types of cancer arise in different tissues, it may be relevant to determine whether one type of treatment can work for both, since the same mutation causes them.
“Today, doctors can whole-genome sequence a tumour and examine whether it overlaps with the mutations that we and others have identified as pathogenic. This can be used clinically to make diagnosis more accurate and, in some cases, identify medicine that is effective when specific mutations are present,” says Jakob Skou Pedersen.
“Cancer is a genetic disease, and treatment based on genetics is often also important rather than just basing it on the tissue in which the tumour originated. This research supports the efforts to create personalized treatment based not only on the general pattern of the disease but also on the specific mutations the person has. Our mapping of whole cancer genomes provides an important foundation, but there is still a long way to go. Many people with cancer do not currently have any mutations that can be used for personalized treatment, so this requires focusing on translational research to link our catalogue with clinical treatment,” concludes Joachim Weischenfeldt.
“Analyses of non-coding somatic drivers in 2,658 cancer whole genomes” has been published in Nature. In 2018, the Novo Nordisk Foundation awarded a grant to Jakob Skou Pedersen for the project Discovery, Functional Characterization, and Cancer Prognostic Potential of Human Intronic RNAs Based on Analysis of Thousands of Total RNAseq Samples.