Researchers develop a method to identify signs of manipulation in large data sets

Tech Science | 21 May 2024 | 3 min | Associate Professor Arijit Khan | Written by Kristian Sjøgren

Many things are at risk of manipulation, including social media and cryptocurrency. Researchers have developed a fully automated method to find signs of manipulation in large data sets in real time. The method can also explain its decisions to users.

Associate Professor

Arijit Khan

Department of Computer Science, Aalborg University, Denmark.

The world is becoming increasingly digitalised and thus also more complex.

One great challenge is the huge data sets that have become an integral part of daily life, such as all the data comprising social media and the Internet as a whole.

Data can be difficult to interpret and can also be manipulated. Examples include the manipulation of data from social media or of data involved in cryptocurrency transactions made through blockchain technology.

Manipulating data on social media can create a false reality, and manipulating cryptocurrency data can move money from your pocket to other pockets without anyone ever detecting it.

The risk of data manipulation is high and the chances of detecting it are slight, but researchers have developed a method to identify signs of manipulation in large data sets in real time.

The method should make it possible both to detect signs of manipulation in large data sets and to trace their source.

“The challenge is that many data sets are very large, anomalies cannot be identified manually, and understanding the data sets in general is very difficult for most people. Our method can identify the anomalies and enable the people working with data to better understand their data and the source of the anomalies,” explains a researcher behind the development of the method, Arijit Khan, Associate Professor, Department of Computer Science, Aalborg University, Denmark.

Nodes and edges

Arijit Khan and colleagues are working to improve insight into graph data, a way of representing how data interact in large data sets.

For example, data in social media can be stored as graph data. All people, images, comments, videos, links, groups, pages and events are nodes in the data, and these nodes are connected by edges, which represent interactions between the nodes.

For example, posting a photo on your profile creates an edge between the two nodes: the person and the photo.
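The person-and-photo example can be sketched in a few lines of Python. This is only an illustration of the idea of nodes and edges: the `Graph` class, the node names and the `posted` label are all made up for this sketch and say nothing about how any social media platform actually stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """A tiny illustrative graph: typed nodes plus labelled edges."""
    nodes: dict = field(default_factory=dict)  # node id -> node type
    edges: list = field(default_factory=list)  # (source id, label, target id)

    def add_node(self, node_id, node_type):
        self.nodes[node_id] = node_type

    def add_edge(self, source, label, target):
        # An edge may only connect nodes that already exist.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((source, label, target))

    def neighbours(self, node_id):
        """Return every node that shares an edge with node_id."""
        linked = set()
        for source, _, target in self.edges:
            if source == node_id:
                linked.add(target)
            elif target == node_id:
                linked.add(source)
        return linked


# Hypothetical data: one person posts one photo.
graph = Graph()
graph.add_node("alice", "person")
graph.add_node("photo_1", "photo")
graph.add_edge("alice", "posted", "photo_1")
```

Asking for `graph.neighbours("alice")` then returns the photo, and asking for the neighbours of the photo returns the person, which is the interaction the edge records.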

The whole of Facebook is a collection of nodes and edges, because Facebook stores data as graph data.

Blockchain technology works similarly for cryptocurrency transactions. The users are the nodes, and the transactions are the edges.
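A transaction network of this kind might be sketched as follows, assuming each transaction is simply a sender, a receiver and an amount. The wallet names, the amounts and both helper functions are hypothetical and are not taken from the researchers' software.

```python
from collections import defaultdict

# Hypothetical transactions: (sender, receiver, amount).
transactions = [
    ("wallet_a", "wallet_b", 5.0),
    ("wallet_b", "wallet_c", 2.5),
    ("wallet_a", "wallet_c", 1.0),
]


def build_transaction_graph(transactions):
    """Users are nodes; each transaction is a directed edge carrying an amount."""
    graph = defaultdict(lambda: defaultdict(list))
    for sender, receiver, amount in transactions:
        graph[sender][receiver].append(amount)
    return graph


def net_flow(transactions):
    """Sum what each user received minus what they sent."""
    balance = defaultdict(float)
    for sender, receiver, amount in transactions:
        balance[sender] -= amount
        balance[receiver] += amount
    return dict(balance)


graph = build_transaction_graph(transactions)
flows = net_flow(transactions)
```

Keeping the amounts on the edges means questions such as "who sent money to whom, and how much" can be answered directly from the graph.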

This also applies to data on the effects of drugs, with the drugs and the molecular structure being the nodes and the effects the edges.

Detecting signs of manipulation

Data risk being manipulated. Social media data can be manipulated or cryptocurrency can be stolen by forging transactions between accounts. This is very difficult to detect.

The methods that Arijit Khan and colleagues have developed can precisely identify anomalies as signs of manipulation in large graph data sets.

The software uses scalable algorithms and artificial intelligence to not only identify anomalies or signs of manipulation in large graph data sets but also to explain the signs of manipulation to the people working with the data.

“Artificial intelligence can find patterns and thereby anomalies in data sets, but the explanation is missing. You can therefore be told that something is problematic but not why. Our method not only identifies the anomalies but also explains the reason for the anomaly,” says Arijit Khan.
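The idea of pairing each flagged anomaly with a reason can be illustrated with a deliberately simple stand-in. The sketch below is not the researchers' method: it uses a common robust outlier rule (a modified z-score based on the median and the median absolute deviation) on hypothetical transaction amounts, and the function name, the threshold of 3.5 and the data are assumptions chosen for illustration.

```python
from statistics import median


def flag_anomalies(transactions, threshold=3.5):
    """Flag unusually large or small amounts and say why each was flagged.

    Stand-in rule, not the published method: modified z-score using the
    median and the median absolute deviation (MAD).
    """
    amounts = [amount for _, _, amount in transactions]
    centre = median(amounts)
    mad = median([abs(amount - centre) for amount in amounts])
    if mad == 0:
        return []  # no spread in the data, so nothing can stand out

    flagged = []
    for sender, receiver, amount in transactions:
        score = 0.6745 * (amount - centre) / mad
        if abs(score) > threshold:
            flagged.append({
                "sender": sender,
                "receiver": receiver,
                "amount": amount,
                "reason": (
                    f"amount {amount} deviates from the typical amount "
                    f"{centre} (modified z-score {score:.1f})"
                ),
            })
    return flagged


# Hypothetical data: five ordinary transfers and one very large one.
transactions = [
    ("wallet_a", "wallet_b", 10.0),
    ("wallet_b", "wallet_c", 12.0),
    ("wallet_c", "wallet_a", 11.0),
    ("wallet_a", "wallet_c", 9.0),
    ("wallet_b", "wallet_a", 10.0),
    ("wallet_x", "wallet_y", 500.0),
]
flagged = flag_anomalies(transactions)
```

Here only the transfer from `wallet_x` is flagged, and the `reason` field states what made it unusual. The methods described in the article work on the structure of large graphs rather than on single amounts, so this only shows the general shape of "detect, then explain".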

Identifying signs of tampering in real time

The researchers developed a method that can identify signs of manipulation in the blockchain.

In a study recently published in Frontiers in Blockchain, the researchers showed that the method could automatically and correctly analyse three verified cases of manipulation of blockchain technology in currency trading.

The method could identify both suspicious transactions and the actors behind them.

“Blockchain technology provides a constant flow of data that is impossible to monitor manually. You need automated methods to evaluate data in real time and identify anomalies. When the anomalies are identified, an explanation also needs to follow, and our method does this,” explains Arijit Khan.

In another study published in Proceedings of the ACM on Management of Data, the researchers have shown that they can identify drug properties that are associated with an increased risk of mutagenicity in the same way.

“We can not only show that the data indicate that some drug is associated with an increased risk of mutagenicity but also what in the molecular structure gives rise to this property,” says Arijit Khan.

He says that the methods the researchers develop can be adapted to different data sets with different characteristics.

In addition, the researchers have made the methods freely available so that others can use them and develop them further.

“Data depth and core-based trend detection on blockchain transaction networks” has been published in Frontiers in Blockchain. The research was supported by grants from the Natural Sciences and Engineering Research Council of Canada and the Novo Nordisk Foundation. “View-based explanations for graph neural networks” has been published in Proceedings of the ACM on Management of Data. The research was supported by grants from the National Natural Science Foundation of China, Ningbo Yongjiang Talent Introduction Programme, United States National Science Foundation and the Novo Nordisk Foundation.
