Openly available algorithms have great potential in the health sciences

Therapy Breakthroughs | 19 Nov 2021 | 4 min | Senior Data Scientist Adam Hulman | Written by Kristian Sjøgren

The internet is awash with algorithms designed to recognise many things, from plants and animals to the sounds of birds and music. With some adjustment, such algorithms can become important tools in clinical health research.

Artificial intelligence and machine learning are still often dismissed as buzzwords, but they are increasingly being integrated into research on health and disease.

Researchers can use artificial intelligence to extract meaningful insights from large data sets that previously could not be mined for them.

Although clinical researchers have begun to use machine learning, they are nowhere near exploiting the enormous potential of the many algorithms and data sets that are available online.

According to a new study, published as a preprint on medRxiv, clinical researchers should increasingly reuse algorithms that are already available instead of designing their own.

“Today, there is a chasm between clinical research and what data science can do. Transfer learning repurposes algorithms developed and trained for solving one problem using a huge volume of data, such as from the Internet, to solve a different problem, often in another domain. Clinical research should use this approach to a much greater extent than it does today,” explains a researcher behind the study, Adam Hulman, Senior Data Scientist, Steno Diabetes Center Aarhus.

The review article on medRxiv gives a broad overview of how transfer learning is currently used and where its use could be expanded.

Internet awash with useful algorithms

One example of transfer learning is an algorithm Google designed to identify the objects shown in images.

Data scientists at Google trained the algorithm to recognise various objects using millions of images. Since the algorithm had identified patterns in the data that were specific to images from different categories, it could then classify new images very accurately.

However, such algorithms can be used for much more than this. Instead of designing their own machine learning algorithms, clinical researchers can borrow algorithms that are available online or from each other and tailor them to their research.

For example, the algorithm designed and trained to classify everyday images can be adapted with small adjustments to recognise patterns in the eyes of people with diabetic eye disease and thereby determine the severity of the disease.

“The algorithms are designed and trained to recognize everyday things, but they can be quite easily adapted and reused in clinical research. This has also been done for many years in medical image analysis, but the potential is greater than that,” says Adam Hulman.
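The idea of reusing a trained algorithm can be illustrated with a minimal Python sketch. It is not the method used in any study mentioned here: the "pretrained" part is a hypothetical three-feature extractor with made-up weights (`FROZEN_WEIGHTS`), and the data are toy numbers rather than images. What it shows is one common variant of transfer learning, in which the reused part stays frozen and only a small new classification layer is trained on the new task.

```python
import math

# Hypothetical stand-in for a pretrained network: fixed ("frozen") weights
# that are never updated while the new task is being learned.
FROZEN_WEIGHTS = [[0.8, -0.3], [0.1, 0.9], [-0.5, 0.4]]


def frozen_features(x):
    """Map a raw 2-value input to 3 features using the frozen weights (ReLU)."""
    return [max(0.0, w[0] * x[0] + w[1] * x[1]) for w in FROZEN_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=200, lr=0.5):
    """Train only a new logistic-regression layer on top of the frozen features."""
    head = [0.0, 0.0, 0.0]
    bias = 0.0
    feats = [frozen_features(x) for x in samples]
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            p = sigmoid(sum(h * v for h, v in zip(head, f)) + bias)
            err = p - y  # gradient of the log loss with respect to the logit
            head = [h - lr * err * v for h, v in zip(head, f)]
            bias -= lr * err
    return head, bias


def predict(head, bias, x):
    """Return the predicted probability of class 1 for a new input."""
    f = frozen_features(x)
    return sigmoid(sum(h * v for h, v in zip(head, f)) + bias)


# Toy data for the "new" task (invented for illustration): label 1 versus label 0
samples = [(1.0, 0.0), (1.2, 0.1), (0.9, -0.1), (-1.0, 1.0), (-0.8, 1.1), (-1.1, 0.9)]
labels = [1, 1, 1, 0, 0, 0]

head, bias = train_head(samples, labels)
print(predict(head, bias, (1.1, 0.0)))   # probability of class 1 for a new input
print(predict(head, bias, (-0.9, 1.0)))
```

In practice, the frozen part would be a large network trained on millions of everyday images, and the new layer would be trained on labelled clinical data; the division of labour is the same.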

Algorithms can identify patterns in more than just photos

Clinical researchers often collect data in spreadsheets but may also work with time-series (such as ECG or continuous blood glucose measurements), sound recordings (such as heart sound) and text (such as electronic health records).

Adam Hulman and colleagues explored the use of transfer learning for non-image data in the clinical literature to find examples of algorithms available on the Internet that could boost the development of clinical prediction models.

One example is a study in which the authors used image recognition algorithms to classify heart sounds.

Adam Hulman says that the researchers behind that study analysed their heart sound recordings in two ways.

First, they took an algorithm designed to recognise 500 to 600 different sounds in YouTube videos and fine-tuned it to distinguish between heart sounds and thereby identify people with heart disease.

They then compared this fine-tuned algorithm with a second model that converted each heart sound recording into an image, so that an algorithm originally designed to classify everyday images could distinguish between the heart sound images of sick and healthy individuals.
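One common way of turning a sound into an image is a spectrogram, which shows how strongly each frequency is present over time. The article does not say which transformation the study used, so the following Python sketch is only an assumption about the general approach. The function names, the frame size and the 50 Hz test tone are all hypothetical, and real pipelines usually add windowing and rescaling steps that are omitted here.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Return DFT magnitudes for the non-negative frequency bins of one frame."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, frame_size=64, hop=32):
    """Slice a 1-D signal into overlapping frames and stack their spectra.

    The result is a 2-D grid (time x frequency) that can be treated like a
    greyscale image and passed to an image classifier.
    """
    frames = [
        signal[start:start + frame_size]
        for start in range(0, len(signal) - frame_size + 1, hop)
    ]
    return [magnitude_spectrum(f) for f in frames]


# Toy stand-in for a recording: a 50 Hz tone sampled at 800 Hz for 0.4 seconds
sample_rate = 800
tone = [math.sin(2 * math.pi * 50 * t / sample_rate) for t in range(320)]

image = spectrogram(tone)
print(len(image), "time steps x", len(image[0]), "frequency bins")
```

Each row of `image` is one moment in time and each column is one frequency band, so a steady tone appears as a single bright vertical stripe. A heart sound would produce a more complex pattern, and it is this pattern that an image recognition algorithm can learn to classify.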

“There are more and more examples of researchers taking available algorithms and using them in their clinical research. In our study, we reviewed thousands of abstracts and read hundreds of articles to gather the examples and then performed basic descriptive analysis to determine, for example, how widely Google’s image recognition algorithms are used to analyse data other than images,” explains Adam Hulman.

Important to bring scientific fields together

Adam Hulman and colleagues identified 83 peer-reviewed clinical studies that used transfer learning on human non-image data.

As many as 63% of the studies had been published within the previous 12 months, which Adam Hulman says indicates that the field is gaining momentum and that more and more clinical researchers are realising the potential.

Adam Hulman and colleagues also examined the author affiliations of the 83 articles and found that 60% had at least one author with a clinical affiliation and at least one with a technical affiliation. Studies whose authors all had technical affiliations (35%) were more common than studies whose authors all had clinical affiliations (5%).

“This suggests that, although many studies have both clinicians and technical researchers as authors, this gap still needs to be closed so that machine learning in clinical research becomes more than just a buzzword, something that clinicians also understand the benefits of,” says Adam Hulman.
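To give a sense of scale, the affiliation percentages above can be converted into approximate numbers of studies. This short Python calculation uses only the figures quoted in the article; because the reported percentages are rounded, the resulting counts are estimates and not figures taken from the review itself.

```python
TOTAL_STUDIES = 83  # number of studies identified in the review

# Reported shares by author affiliation, in percent
affiliation_share = {
    "clinical and technical": 60,
    "technical only": 35,
    "clinical only": 5,
}

# Approximate counts: the percentages are rounded, so these are estimates
approx_counts = {
    group: round(TOTAL_STUDIES * pct / 100)
    for group, pct in affiliation_share.items()
}

for group, count in approx_counts.items():
    print(f"{group}: about {count} of {TOTAL_STUDIES} studies")
```

On these estimates, only a handful of the studies were written by authors with solely clinical affiliations.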

Data and code should be shared

Another aspect of the review is the availability of data, especially clinical data sets.

In data science, researchers and algorithm developers often use openly available data sets to develop the algorithms of the future. If the available data sets happen to be images of dogs and cats, they use these to create their algorithms. However, they can just as easily use available patient data, and Adam Hulman says that this can benefit clinical researchers.

“If you are a data scientist working on a research project and the question is whether you will use an openly available data set or privately owned one, the answer is quite obvious. A culture with greater openness on clinical data is therefore also needed,” he explains.

A third aspect of the review was whether the researchers who reused algorithms from the Internet shared their own work in turn to benefit other researchers.

Only 27% chose to share the code of the algorithm they had created based on the work of other researchers.

“There is great potential in researchers sharing more data and algorithms. If I developed an algorithm for identifying people with diabetes, and another researcher wants to study athletes with diabetes, they may have great difficulty getting enough data for the analysis. Sharing algorithms enables clinical researchers to analyse subgroups of patients much more easily, for example, if they cannot collect enough data or develop the right algorithm from scratch,” says Adam Hulman.

He adds that three doctors, a mathematician and a statistician conducted the review and that involving researchers with different backgrounds was an important feature of the project.

“Transfer learning for non-image data in clinical research: a scoping review” has been published as a preprint on medRxiv. Steno Diabetes Center Aarhus is partially funded by the Novo Nordisk Foundation.

Senior Data Scientist

Adam Hulman

Steno Diabetes Center Aarhus

Adam Hulman is an applied mathematician by training, with 10 years of experience in diabetes epidemiology research. The Novo Nordisk Foundation awarde...
