How Neural Networks Detect and Interpret Wordplay: New Insights from HSE Researchers

An international team including researchers from the HSE Faculty of Computer Science has presented KoWit-24, an annotated dataset of 2,700 Russian-language Kommersant news headlines containing wordplay. The dataset makes it possible to assess how artificial intelligence detects and interprets wordplay. Experiments with five large language models show that even advanced systems still make mistakes, and that interpreting wordplay is more challenging for them than detecting it. The results were presented at the RANLP conference; the paper is available on arXiv.org, and the dataset and the code for reproducing the experiments are available on GitHub.

Wordplay refers to the deliberate use of language that violates linguistic norms in order to attract attention, entertain, or amuse the reader. It is common in Russian news headlines and can take various forms. For example, the headline ‘Osobo bumazhnye persony’ plays on the phrase ‘Osobo vazhnye persony’ (Russian for ‘very important persons’). The word vazhnye (‘important’) is replaced with bumazhnye (‘paper-related’), which rhymes with the original and shifts the meaning toward the topic of paper production. Another example is ‘Kod naklikal,’ the headline of an article about open-source code. It closely resembles ‘kot naplakal,’ an idiom meaning ‘very little,’ thereby creating a humorous ambiguity.

For human readers, such wordplay in headlines is immediately apparent and requires no explanation. However, large language models such as ChatGPT or GigaChat Max are often at a loss, struggling not only to detect the wordplay but even more so to explain the joke. One reason for this difficulty is the limited nature of the humour datasets on which LLMs are trained. In most cases, humour in these datasets is represented by canned internet jokes explicitly labelled as ‘jokes,’ which is insufficient for the models to learn why something is funny. In addition, such datasets contain almost no annotation: there are no machine- or human-readable layers of description indicating whether wordplay is present, what type of technique is used, what the headline refers to, and so on.

Researchers from the HSE Faculty of Computer Science, in collaboration with colleagues from IT:U—Interdisciplinary Transformation University Austria—and independent researchers, have created KoWit-24, a dataset dedicated to wordplay. It comprises 2,700 headlines from the Russian business daily Kommersant published between January 2021 and December 2023, along with contextual information: each headline is accompanied by a short description of the news story (the lead) and a summary. For each instance of wordplay, the authors manually annotated the type of technique, identified the anchors—the words that trigger the wordplay—and, where possible, linked the original expressions to relevant Wikipedia articles.
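As a rough illustration of the annotation layers described above, a single record could be modelled along the following lines. This is a minimal sketch only: the class names, the field names (`technique`, `anchors`, `original`, `wiki_url`) and the example values are assumptions made for illustration and are not taken from the published dataset, whose actual schema may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WordplayAnnotation:
    """One annotated instance of wordplay (hypothetical field names)."""
    technique: str                   # type of wordplay technique
    anchors: List[str]               # words that trigger the wordplay
    original: Optional[str] = None   # original expression, when there is one
    wiki_url: Optional[str] = None   # relevant Wikipedia article, where available


@dataclass
class HeadlineRecord:
    """A headline with its context and zero or more wordplay annotations."""
    headline: str
    lead: str       # short description of the news story
    summary: str
    annotations: List[WordplayAnnotation] = field(default_factory=list)

    @property
    def has_wordplay(self) -> bool:
        # A record counts as wordplay if at least one annotation is attached.
        return bool(self.annotations)


# Illustrative values only; not copied from the dataset.
record = HeadlineRecord(
    headline="Missiya sokratima",
    lead="(lead text omitted)",
    summary="(summary omitted)",
    annotations=[
        WordplayAnnotation(
            technique="modified well-known title",  # hypothetical label
            anchors=["sokratima"],                   # assumed anchor
            original="Missiya nevypolnima",
        )
    ],
)
```

Keeping the context (lead and summary) in the same record as the annotations means that a single object carries everything needed both to prompt a model and to check its answer.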

The authors adopted linguist Alan Scott Partington’s definition of wordplay, according to which wordplay occurs when the same expression can be interpreted in at least two ways and this effect is intentional. Wordplay can arise in several ways. One case involves ambiguity inherent in a word or its sound. For example, in the headline ‘Volgu ne mogut zastavit’ tech’ bystree,’ the word Volgu (Volga) refers both to the river and to a federal highway with the same name. Another case involves a slight modification of a well-known phrase or title, in which the author alters the wording while relying on the reader to recognise the original and complete the joke. For instance, ‘Missiya sokratima’ alludes to ‘Missiya nevypolnima,’ the Russian title of the film Mission: Impossible, while the headline itself suggests that a diplomatic mission can be downsized.

The researchers also distinguished ‘nonce words’ (coined for a single occasion) and oxymorons, which combine two contradictory meanings. This approach allowed them not only to collect and describe examples but also to compare the performance of different language models.

After annotation, the authors evaluated five LLMs on the dataset: GPT-4o, YandexGPT-4, GigaChat Lite, GigaChat Max, and Mistral NeMo. Each model was provided with a headline and the corresponding news lead and asked to perform two tasks: first, to determine whether the headline contained wordplay, and second, to interpret it by identifying the original phrase or reference. The researchers compared the effects of two types of prompts: a simple prompt asking whether the headline contained wordplay, and an extended prompt providing a definition along with examples of different wordplay types. The extended prompt improved performance on the detection task for three of the five models. GPT-4o demonstrated the strongest performance in both detection and interpretation. For all models, interpreting the source of the joke proved significantly more difficult than simply detecting the presence of wordplay.
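The detection experiment described above can be sketched roughly as follows. This is not the authors' code: the prompt wording, the function names `build_prompt` and `detection_scores`, and the choice of precision, recall and F1 as metrics are all assumptions made for illustration, and the call to an actual language model is deliberately left out.

```python
from typing import List, Sequence, Tuple

# Hypothetical prompt wording; the prompts used in the paper are not reproduced here.
SIMPLE_PROMPT = "Does the following headline contain wordplay? Answer yes or no."
DEFINITION = (
    "Wordplay occurs when the same expression can be interpreted "
    "in at least two ways and this effect is intentional."
)


def build_prompt(headline: str, lead: str, extended: bool = False,
                 examples: Sequence[str] = ()) -> str:
    """Assemble a simple prompt, or an extended one with a definition and examples."""
    parts: List[str] = []
    if extended:
        parts.append(DEFINITION)
        parts.extend(f"Example: {example}" for example in examples)
    parts.append(SIMPLE_PROMPT)
    parts.append(f"Headline: {headline}")
    parts.append(f"Lead: {lead}")
    return "\n".join(parts)


def detection_scores(gold: Sequence[bool],
                     predicted: Sequence[bool]) -> Tuple[float, float, float]:
    """Return precision, recall and F1 for the 'wordplay present' class."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy labels for illustration only; these are not results from the study.
gold_labels = [True, True, False, False]
model_answers = [True, False, True, False]
scores = detection_scores(gold_labels, model_answers)
```

Scoring interpretation would need a separate step, for instance comparing the phrase a model names against the annotated original expression, which is harder to automate than a yes/no check.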

‘KoWit-24 addresses two key limitations of earlier datasets: it provides context for each headline and includes multi-level annotation. This transforms a collection of examples into a full-fledged “testbed” for AI. It now allows for an objective comparison of models—whether a model can detect wordplay, identify the anchor, and correctly recall the original phrase or reference. Such verifiable metrics not only allow for a more accurate evaluation of current systems but also support their intentional improvement through selection of prompts, training examples, and fact-checking strategies. In the future, we plan to investigate whether this dataset can be used to enhance humour generation,’ says Pavel Braslavski, Associate Professor at the HSE Faculty of Computer Science and co-author of the paper.

In addition, the dataset establishes a common and transparent standard for evaluation, as researchers use the same data and experimental scripts. This reduces variability in the results and helps develop models that better understand natural language, rather than merely following the logical structure of the text.
