Using machine learning to discover potential drug candidates among the body’s peptides

Tech Science, 6 Dec 2022, 3 min. Written by Kristian Sjøgren

The body’s organs contain many natural substances that act like therapeutic drugs and keep the whole body healthy. Researchers have now developed a machine learning method that searches the body’s many thousands of peptides for these natural drugs, which can then be developed into actual drugs to combat disease.

Senior Scientist, Ph.D.

Christian Toft Madsen

Biomarker Identification, Global Translation, Novo Nordisk A/S

The cells in our body swim in a soup of peptides that result from many biological processes.

Some peptides are bioactive and essential for keeping the body healthy, such as insulin and the gut hormone GLP-1, both of which have been commercialised and are among the world’s most commonly used drugs.

However, the vast majority of peptides result from the degradation of proteins and are therefore unimportant.

When researchers search for new drugs similar to insulin or GLP-1, they face the needle-in-a-haystack problem of identifying the few peptides that are bioactive. The hundreds of thousands of peptides in the body comprise a very large haystack in which to search for very few bioactive peptides.

Researchers have now solved this problem by training a machine learning model to help identify the few potentially useful peptides in the peptide soup, so that they can be examined individually in the search for new wonder drugs.

“Our problem is that when we examine a biological sample, this is a snapshot of protein degradation. However, because the many thousands of peptide fragments create high background noise, identifying what is bioactive and what is not is very difficult. Our method finds the interesting signals in this background noise – like a magnet that can pull the needles out of the haystack,” explains a researcher involved in the study, Christian Toft Madsen, Senior Scientist, Novo Nordisk A/S.

The research has been published in Nature Communications.

Investigated 150,000 peptides

The researchers developed a machine learning model to identify the peptides in a biological sample that are probably bioactive and thus have potential as therapeutic drugs.

To show that the model works as intended, the researchers first purified peptides from 48 mice. The researchers individually examined peptides in the liver, muscles, brain, pancreas, intestine and two types of fat tissue.

To identify all the peptides, the researchers used mass spectrometry, which can map the peptide sequences based on the mass-to-charge ratios measured for the peptides. This resulted in data on 157,857 unique peptide sequences.
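To give a feel for what “mass and charge” means here, the following minimal sketch computes the mass-to-charge ratio (m/z) that a mass spectrometer would observe for a peptide of known sequence. It is an illustration only, not the researchers’ pipeline: the function names and the example sequence PEPTIDE are hypothetical, and the sketch assumes unmodified peptides ionised by added protons. The residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # one water per peptide (free N- and C-terminus)
PROTON = 1.007276  # mass of each charge-carrying proton


def peptide_mass(sequence: str) -> float:
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(sequence: str, charge: int) -> float:
    """Mass-to-charge ratio of the peptide carrying `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


if __name__ == "__main__":
    # "PEPTIDE" is a made-up example sequence, not one from the study.
    for z in (1, 2, 3):
        print(f"PEPTIDE, charge {z}+: m/z = {mz('PEPTIDE', z):.4f}")
```

The same peptide appears at a different m/z for each charge state, which is one reason both mass and charge are needed to work back to a sequence.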

The researchers then used machine learning to identify which of these 157,857 unique peptide sequences appeared to be bioactive.

The researchers had previously trained the machine learning model on known bioactive peptides, and the computer learned various features of bioactive peptides, including how the peptide is positioned in a pattern of other peptides and the contribution of a given peptide within a complex of other peptides.

Then the machine learning model could search for bioactive peptides in the soup of peptides in the sample and tell the researchers how close the individual peptides were to resembling bioactive peptides.

“We get a number from 0 to 1 for the individual peptide. The closer the score is to 1, the more likely the machine learning model considers the peptide to be bioactive,” says Christian Toft Madsen.
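A score between 0 and 1 can be illustrated with a small sketch. The article does not say how the researchers’ model arrives at its number, so the logistic function below is an assumption, chosen because it is one common way to squeeze a raw model output into the range 0 to 1. The peptide names and raw values are placeholders, not data from the study.

```python
import math


def sigmoid(x: float) -> float:
    """Map any raw model output to a score strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def rank_by_score(raw_outputs: dict[str, float]) -> list[tuple[str, float]]:
    """Convert raw outputs to scores and sort with the highest score first."""
    scored = {peptide: sigmoid(raw) for peptide, raw in raw_outputs.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Placeholder peptides with made-up raw outputs (hypothetical values).
raw = {"PEPTIDE_A": 2.2, "PEPTIDE_B": -3.0, "PEPTIDE_C": 0.0}
ranking = rank_by_score(raw)

for peptide, score in ranking:
    print(f"{peptide}: {score:.2f}")
```

Peptides at the top of such a ranking would be the ones worth examining individually in the laboratory.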

Discovered new peptide in insulin

Christian Toft Madsen says that 0.14% of the peptides in the investigated samples from mice appeared to have the most potential and were worth pursuing further in the studies of potential bioactivity.
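To put that percentage into absolute numbers, the short calculation below assumes that the 0.14% refers to the 157,857 unique peptide sequences mentioned earlier. The article does not state the denominator explicitly, so the result is a rough estimate rather than a figure from the study.

```python
TOTAL_UNIQUE_PEPTIDES = 157_857   # unique sequences reported in the article
TOP_FRACTION = 0.14 / 100         # 0.14%, assumed to apply to that total

# Estimated number of peptides flagged as most promising.
n_candidates = round(TOTAL_UNIQUE_PEPTIDES * TOP_FRACTION)
print(n_candidates)
```

Under that assumption, the haystack of almost 158,000 peptides shrinks to roughly 220 candidates.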

In particular, the researchers wanted to discover peptides that can lower blood glucose, and they found a very interesting new insulin peptide in the pancreas.

Normally, processing of the insulin precursor yields three peptide fragments, but the researchers identified a fourth that arises from alternative splicing of the RNA.

This peptide scored very highly in the machine learning analysis of potential bioactivity but did not have the effect that the researchers expected.

“Our model predicted that this peptide definitely has a bioactive role, but we tested it for the ability to lower blood glucose, and it does not. Therefore, it must play another bioactive role, and we are currently considering whether we should proceed with studying this peptide’s role. We will also investigate whether the peptide is only relevant for mice or whether it is also relevant for people,” explains Christian Toft Madsen.

Can identify health-promoting peptides through exercise

Using the machine learning model on peptides from mice is a proof of concept that the method works, and the researchers are now carrying out various studies on samples from people.

The researchers want to determine how the peptide landscape in the body changes during exercise and whether it leads to the secretion of health-promoting bioactive peptides that could be potential therapeutic drug candidates.

Since exercise increases insulin sensitivity in the body, machine learning may be able to pinpoint which peptides might contribute to this beneficial effect.

The researchers are also investigating this method in specific patient groups, including people with heart failure.

The idea is that the peptide landscape can be a biomarker for heart failure since it differs from that of healthy people. This means that when new drugs are developed for people with heart failure, researchers can easily see whether the drugs actually change the person’s unique peptide landscape.

“We also aim to investigate whether the situation is the same for other organisms, including whether mice or rats have the same heart failure peptide profile as people. This may indicate whether these organisms are good models of heart failure for people,” concludes Christian Toft Madsen.

“Combining mass spectrometry and machine learning to discover bioactive peptides” has been published in Nature Communications. The Novo Nordisk Foundation has supported research by co-author Jesper V. Olsen through the Novo Nordisk Foundation Center for Protein Research, University of Copenhagen.
