HSE Researchers Train Neural Network to Predict Protein–Protein Interactions More Accurately

Scientists at the AI and Digital Science Institute of the HSE Faculty of Computer Science have developed a model that predicts protein–protein interactions with 95.7% accuracy on the PINDER dataset. GSMFormer-PPI draws on three types of protein data, including protein surface properties, and analyses the relationships between them rather than simply concatenating the features, as many earlier models did. The solution could accelerate the discovery of the molecular mechanisms of disease, biomarkers, and potential therapeutic targets. The paper has been published in Scientific Reports.

Almost all cellular processes depend on interactions between proteins. Cells use these interactions to transmit signals, initiate and regulate chemical reactions, and form molecular complexes essential for proper functioning. When such interactions are disrupted, cellular processes can malfunction, potentially leading to disease.

Therefore, to study disease mechanisms and identify therapeutic targets, it is important for scientists to understand which proteins can interact and which cannot. Determining this experimentally is difficult: when dozens or hundreds of proteins are considered, the number of possible pairs becomes too large to test individually. As a result, biologists use machine learning methods to predict these interactions based on the structure and properties of molecules.
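
To give a sense of scale, the number of unordered pairs among n proteins is n(n-1)/2. The short sketch below is only an illustration of that arithmetic; the function name `candidate_pairs` and the `include_self` option are hypothetical and do not come from the paper.

```python
from math import comb


def candidate_pairs(n_proteins: int, include_self: bool = False) -> int:
    """Count unordered protein pairs among n_proteins.

    include_self is a hypothetical switch that also counts a protein
    paired with itself (homodimers).
    """
    pairs = comb(n_proteins, 2)  # n * (n - 1) / 2
    return pairs + n_proteins if include_self else pairs


for n in (10, 100, 1000):
    print(n, candidate_pairs(n))
```

The count grows quadratically, which is why testing every pair in the laboratory quickly becomes impractical.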

HSE researchers have developed the GSMFormer-PPI system, which takes into account three types of data for each protein in a candidate pair: the amino acid sequence, the three-dimensional structure, and the properties of the molecular surface. To process this information, the authors used existing models that convert this data into numerical representations. A protein language model analyses the amino acid sequence—the order of amino acids that make up the protein. The three-dimensional structure of the protein is represented as a graph, in which amino acids are treated as nodes and their spatial contacts as edges; this representation is processed by a graph neural network. In addition, a separate algorithm captures protein surface properties—the shape and physicochemical characteristics of the regions through which proteins recognise one another.
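
The graph representation described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes one 3D coordinate per amino acid and treats two residues as being in contact when they lie within a distance cutoff. The 8-angstrom default and the toy coordinates are assumptions chosen for the example; the article does not say how contacts are defined.

```python
from math import dist


def contact_graph(coords, cutoff=8.0):
    """Return edges (i, j) with i < j for residues within `cutoff` of each other.

    coords: one (x, y, z) tuple per amino acid (the graph nodes).
    cutoff: assumed contact threshold in angstroms, not taken from the paper.
    """
    edges = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if dist(coords[i], coords[j]) <= cutoff:
                edges.append((i, j))
    return edges


# Toy coordinates for four residues along a line (made up for illustration).
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
edges = contact_graph(toy_coords)
print(edges)  # [(0, 1), (0, 2), (1, 2)]
```

A graph neural network would then pass information along these edges; the fourth residue is too far from the others and remains an isolated node here.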

These numerical representations of proteins were then fed into a transformer module developed by the authors—a neural network that jointly analyses different types of protein data. In contrast to many previous approaches, where features were often simply concatenated into a single vector, this model does not combine them mechanically but instead captures the relationships between them.
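
The difference between concatenation and attention-based fusion can be illustrated with a toy example. The sketch below is not the GSMFormer-PPI architecture: it uses a single attention head with no learned weights, and the three four-dimensional embeddings are made-up values that stand in for representations already projected to a common dimension.

```python
from math import exp, sqrt


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def concatenate(tokens):
    """Baseline fusion: place the modality vectors side by side."""
    return [x for token in tokens for x in token]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Queries, keys and values are the tokens themselves (identity projections),
    an assumption made to keep the sketch free of learned parameters.
    """
    d = len(tokens[0])
    fused = []
    for query in tokens:
        weights = softmax([dot(query, key) / sqrt(d) for key in tokens])
        fused.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return fused


# Hypothetical embeddings for one protein, one per modality.
sequence_emb = [1.0, 0.0, 0.5, 0.0]
structure_emb = [0.9, 0.1, 0.4, 0.0]
surface_emb = [0.0, 1.0, 0.0, 0.5]
tokens = [sequence_emb, structure_emb, surface_emb]

flat = concatenate(tokens)      # one 12-dimensional vector, modalities untouched
fused = self_attention(tokens)  # three vectors, each a weighted mix of all modalities
```

In `flat`, each modality occupies its own slice and nothing relates one slice to another. In `fused`, every output vector depends on how similar its modality is to the other two, which is the kind of cross-modal relationship a transformer can learn to exploit.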

'When proteins interact, their surface is particularly important: it is through the surface that molecules recognise one another, and it is where the physicochemical properties that determine binding are concentrated. In our model, we sought to incorporate this information alongside the protein’s sequence and three-dimensional structure and not merely concatenate these features but enable the algorithm to analyse the relationships between them. This is what allowed us to predict protein–protein interactions more accurately,' comments one of the authors, Maria Poptsova, Director of the Centre for Biomedical Research and Technology at the HSE FCS AI and Digital Science Institute.

General schema of the proposed GSMFormer-PPI model. Panel A illustrates the different types of protein representations used by the model: structural, sequential, and surface-based. Panel B shows how these representations are projected to a common dimensional space, processed by a transformer, and then used to generate the final prediction of the interaction.
© Arteaga, D., Chervov, N. & Poptsova, M. Multimodal graph, surface, and language-based model for protein protein interaction prediction. Sci Rep 16, 4772 (2026).

The researchers tested the new model’s performance on the PINDER dataset, a large database of known protein interactions. In these experiments, GSMFormer-PPI achieved an accuracy of 95.7%, outperforming popular graph-based models such as GCN and GAT. The researchers also tested a simpler version of GSMFormer-PPI—without the module that analyses relationships between different types of data. This version performed worse, demonstrating that it is not only the protein data itself but also how the model integrates and compares it that drives its accuracy.

Additional tests showed that all three types of data—sequence, spatial structure, and surface properties—are essential for accurate predictions. When the researchers removed any one component, prediction accuracy declined. In other words, the model performs better precisely because it considers the protein on multiple levels simultaneously. In the future, such systems could help identify protein pairs more efficiently when studying disease mechanisms and searching for drug targets.
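
The leave-one-out design of such a test can be sketched as follows. This only enumerates the configurations; the modality labels and the function name `leave_one_out` are illustrative, and no training or scoring from the paper is reproduced.

```python
from itertools import combinations

MODALITIES = ("sequence", "structure", "surface")


def leave_one_out(modalities):
    """Yield (removed, kept) for every ablation that drops exactly one modality."""
    for kept in combinations(modalities, len(modalities) - 1):
        removed = next(m for m in modalities if m not in kept)
        yield removed, kept


ablations = list(leave_one_out(MODALITIES))
for removed, kept in ablations:
    print(f"without {removed}: retrain and evaluate using {kept}")
```

Each reduced configuration would be trained and scored in the same way as the full model, so any drop in accuracy can be attributed to the missing data type.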

The work was supported by a grant for research centres in AI provided by the Ministry of Economic Development of the Russian Federation and implemented at HSE University.
