About the project

Objective
This project will explore the neural correlates of human-human and human-robot conversations, with the goal of creating adaptive social robots capable of fostering more meaningful interactions. Social robots can assist people in settings such as health care, elderly care, education, public spaces, and homes.

Our newly developed telepresence system for human-robot interaction allowed participants to situate themselves in natural conversations while physically inside a functional magnetic resonance imaging (fMRI) scanner. Each participant interacted directly with a human-like robot or a human actor while lying in the scanner. In our previous research pair project, we used this telepresence interface to create the pioneering NeuroEngage fMRI dataset.

This project aims to advance the understanding of conversational engagement by integrating neuroscience, human-robot interaction (HRI), and artificial intelligence. Engagement plays a crucial role in effective communication, yet its underlying brain mechanisms and real-time detection remain largely unexplored. We will use the NeuroEngage dataset and complement it with additional multimodal features like facial expressions, audio embeddings, and detailed annotations of engagement levels. By using multimodal machine learning (MML), this research will develop models capable of detecting and responding to engagement levels in social interactions.
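As a rough illustration of what such a multimodal model could look like, the sketch below fuses per-modality feature vectors with a simple late-fusion classifier in PyTorch. The modality set, feature dimensions, and number of engagement levels are assumptions made for illustration only, not the project's actual pipeline.

import torch
import torch.nn as nn

class EngagementFusionModel(nn.Module):
    """Toy late-fusion model: encode each modality, concatenate, classify."""

    def __init__(self, face_dim=128, audio_dim=256, fmri_dim=400,
                 hidden_dim=64, num_levels=3):
        super().__init__()
        # One small encoder per modality (dimensions are illustrative).
        self.face_enc = nn.Sequential(nn.Linear(face_dim, hidden_dim), nn.ReLU())
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, hidden_dim), nn.ReLU())
        self.fmri_enc = nn.Sequential(nn.Linear(fmri_dim, hidden_dim), nn.ReLU())
        # Late fusion: concatenate encoded modalities, then classify.
        self.classifier = nn.Linear(3 * hidden_dim, num_levels)

    def forward(self, face, audio, fmri):
        fused = torch.cat(
            [self.face_enc(face), self.audio_enc(audio), self.fmri_enc(fmri)],
            dim=-1,
        )
        return self.classifier(fused)  # logits over engagement levels

# Example with random tensors standing in for real features (batch of 8).
model = EngagementFusionModel()
logits = model(torch.randn(8, 128), torch.randn(8, 256), torch.randn(8, 400))
print(logits.shape)  # torch.Size([8, 3])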

Background
In everyday conversations, a speaker and a listener are involved in a common project that relies on close coordination, requiring each participant’s continuous attention and engagement. However, current engagement detection methods lack robustness and often rely on superficial behavioral cues, without considering the underlying neural mechanisms that drive engagement. Prior research has demonstrated the feasibility of engagement detection using multimodal signals, but most existing datasets are limited in scope and do not incorporate neuroimaging data.

In our previous work, by analyzing two different datasets, we have shown that listening to a robot recruits more activity in sensory regions, including auditory and visual areas. We have also observed strong indications that speaking to a human, compared to the robot, recruits more activity in frontal regions associated with socio-pragmatic processing, i.e., considering the other’s viewpoint and factoring in what to say next. Additional comparisons of this sort will be enabled by expanding our dataset and refining machine learning models for engagement prediction. As a result, this project will help enable AI-driven conversational adaptivity, advancing research in both HRI and neuroscience.

Crossdisciplinary collaboration
The researchers in the team represent the Department of Intelligent Systems, Division of Speech Music and Interaction, at KTH EECS, and the Psychology and Linguistics Departments at Stockholm University. This project integrates neuroscience, linguistics, social robotics, and AI to study how humans engage in conversations with both humans and robots.

About the project

Objective
Medical doctors often face difficulty in choosing a set of medicines for a patient from the many options available. Medication is expected to be disease-specific as well as person-specific. Individual patients may respond differently to the same medication, so the selection of medication should be personalized to each individual’s needs and medical history. In this project, we will explore how AI (artificial intelligence) can help doctors identify existing medications and/or therapies that can be repurposed for the treatment of dementia.

Dementia is a large-scale health care problem: around 10% of the population over 65 years of age suffers from it. Therefore, if AI can assist clinicians in medication selection for dementia patients, it would lead to a significant improvement in the efficiency of treatment. AI can also predict the decline (or worsening) of a patient’s health condition over time, giving clinicians and healthcare systems precious time to decide on life-saving interventions. This heralds the use of AI in medication decisions. The pressing questions are: can we trust AI systems, and in particular their machine learning (ML) core, to analyze patient data and deliver predictions to doctors? Can the ML algorithms explain their predictions to the doctors? In a joint collaboration between Karolinska University Hospital (KUH), Karolinska Institute (KI) and KTH, we will develop trustworthy ML algorithms with explainable results, and then use these algorithms to discover new uses for approved medications that were originally developed for different medical conditions.

Background
Explainable machine learning (XML) based medication repurposing for dementia (XMLD) refers to the development and application of XML algorithms to identify existing drugs or medications that can be repurposed for the treatment or management of dementia. The goal is to develop XML algorithms that discover new uses, in patients with dementia, for approved drugs or therapies that were originally developed for different medical conditions. For a patient and/or a class of patients, identifying potential drugs among the many existing ones is therefore a variable selection problem, where XML can help.
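To make the variable-selection framing concrete, the sketch below fits an L1-penalized logistic regression on entirely synthetic data: exposure to a few of many existing drugs drives a better outcome, and the sparse model surfaces those candidates. This is a generic baseline shown under stated assumptions, not the project's XML method, and no real patient data are involved.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_patients, n_drugs = 500, 40
drug_names = [f"drug_{i}" for i in range(n_drugs)]  # hypothetical drug labels

# Binary exposure matrix: which existing drugs each synthetic patient received.
X = rng.integers(0, 2, size=(n_patients, n_drugs)).astype(float)
# Synthetic outcome: improvement driven by a handful of "repurposable" drugs.
signal = X[:, 3] + X[:, 17] - X[:, 25]
y = (signal + rng.normal(scale=0.5, size=n_patients) > 0.5).astype(int)

# The L1 penalty shrinks most coefficients to exactly zero; the surviving
# coefficients point to candidate drugs worth closer clinical review.
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.3).fit(X, y)
candidates = [(name, coef) for name, coef in zip(drug_names, model.coef_[0])
              if abs(coef) > 1e-6]
print(sorted(candidates, key=lambda t: -abs(t[1]))[:5])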

The XML algorithms will be developed to analyze and identify patterns, relationships, and potential associations between drug characteristics, disease severity, and patient outcomes. Medication repurposing for dementia using XML has many advantages, such as cost and time savings, established safety profiles, a broad range of medication candidates, and improved treatment efficiency. Overall, it addresses a pressing healthcare problem with potentially widespread impact. While our focus in this project is dementia, the accumulated technological knowledge can be used for medication repurposing for many other health problems and diseases in the clinic. The proposed XMLD project will establish a strong cooperation between medical doctors and ML researchers in the clinical environment.

Partner Postdoc
Xinqi Bao

Main supervisor
Saikat Chatterjee

Co-supervisor(s)
Martina Scolamiero

About the project

Objective
The SENZ-Lab project develops and validates a cost-efficient, real-time, dynamic sparse sensing approach for urban traffic monitoring and environmental footprint assessment in Stockholm’s Environmental Zone Class 3. Using acoustic sensors and AI-driven modelling, it seeks to establish a 2D digital twin of the city’s traffic, enabling real-time monitoring of noise, air pollution, and vehicle-level activity. The goal is to enhance traffic management, reduce emissions, and support sustainable urban mobility.
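As one small building block of such acoustic monitoring, the sketch below computes the equivalent sound pressure level of a one-second microphone frame. It assumes a calibrated signal in pascals and uses synthetic noise, so it illustrates the principle rather than the project's actual sensing pipeline.

import numpy as np

P_REF = 20e-6  # reference sound pressure in air (20 micropascals)

def equivalent_spl_db(pressure: np.ndarray) -> float:
    """Equivalent continuous sound pressure level (Leq) of a frame, in dB."""
    rms = np.sqrt(np.mean(pressure ** 2))
    return 20.0 * np.log10(rms / P_REF)

# One second of synthetic broadband "traffic noise" at 48 kHz.
rng = np.random.default_rng(0)
frame = 0.05 * rng.standard_normal(48_000)  # roughly 68 dB re 20 uPa
print(f"Leq: {equivalent_spl_db(frame):.1f} dB")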

Background
As cities expand, noise and air pollution pose significant health risks. Traditional monitoring methods struggle with real-world complexity, requiring new solutions. Building on previous research in Stockholm’s Hornsgatan innovation zone, this project integrates IoT, AI, and real-time traffic simulations to improve monitoring accuracy and inform urban policy. The initiative aligns with Stockholm’s environmental goals and KTH’s strategic pillars of sustainability and digitalization.

Crossdisciplinary collaboration
The project brings together experts from multiple fields, including urban sensing, traffic modeling, AI, and GIS-based visualization. The consortium includes two research teams at KTH specialized in Acoustics and Geoinformatics, supported by the City of Stockholm, combining academic research with real-world urban planning needs. By integrating cutting-edge technology with policy-driven insights, the project provides practical solutions for creating quieter, healthier, and more sustainable cities.

About the project

Objective
We will develop a novel multimodal imaging database, PelvicMIM, by integrating next-generation digital diagnostic technologies to advance the evaluation of childbirth-related pelvic floor muscle injuries. This effort includes the development and validation of cutting-edge imaging modalities: Shear Wave Elastography (SWE), Magnetic Resonance Elastography (MRE), and Diffusion Tensor Imaging (DTI). These techniques will be applied in vivo to quantify the biomechanical and structural properties of pelvic floor muscles. A deep learning-based image processing framework will be designed for multimodal image registration, enabling the overlay of stiffness maps from MRE/SWE and fiber orientations from DTI onto MRI and ultrasound images. Our proposed approach facilitates cross-modality analysis, offering deeper insights into muscle function and injury mechanisms.
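As a simplified picture of the registration step, the sketch below aligns a stiffness volume to an MRI volume with a classical mutual-information rigid registration in SimpleITK. The file names are hypothetical, and the project's framework is deep learning based and likely deformable, so this is only a minimal baseline illustrating the idea of bringing modalities into a common frame.

import SimpleITK as sitk

# Hypothetical input volumes: an anatomical MRI and a stiffness map (e.g. MRE/SWE).
fixed = sitk.ReadImage("mri.nii.gz", sitk.sitkFloat32)
moving = sitk.ReadImage("stiffness_map.nii.gz", sitk.sitkFloat32)

reg = sitk.ImageRegistrationMethod()
reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)
reg.SetOptimizerAsRegularStepGradientDescent(
    learningRate=1.0, minStep=1e-4, numberOfIterations=200)
reg.SetInitialTransform(
    sitk.CenteredTransformInitializer(
        fixed, moving, sitk.Euler3DTransform(),
        sitk.CenteredTransformInitializerFilter.GEOMETRY))
reg.SetInterpolator(sitk.sitkLinear)

# Estimate a rigid transform, then resample the stiffness map onto the MRI grid
# so the two modalities can be overlaid voxel by voxel.
transform = reg.Execute(fixed, moving)
resampled = sitk.Resample(moving, fixed, transform, sitk.sitkLinear, 0.0,
                          moving.GetPixelID())
sitk.WriteImage(resampled, "stiffness_registered_to_mri.nii.gz")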

Background
One in two middle-aged women suffers from pelvic floor dysfunction, such as urinary and fecal incontinence or prolapse of the pelvic organs into the vagina, which profoundly impairs quality of life. Injuries to the pelvic floor muscles caused by childbirth are strongly associated with pelvic floor dysfunction later in life. Nevertheless, injuries to these muscles, which cannot be surgically repaired, have been largely ignored and poorly studied. The Swedish Agency for Health Technology Assessment (SBU) has identified birth-related injuries to the levator ani muscle (LAM), a group of the three largest muscles of the pelvis, as a priority area for research (April 2019). Although recent research also highlights the urgent need for quantitative assessment of LAM injuries, clinical practice still relies on conventional ultrasound, which cannot quantify the biomechanical or structural properties that are important indicators of soft tissue health. These properties are crucial for the assessment of the LAM, as it is a complex structure of three muscles working together in a sheet-like shape with different layers and fiber directions.

Crossdisciplinary collaboration
The team of researchers is composed of members from the KTH School of Engineering Sciences in Chemistry, Biotechnology and Health, Department of Biomedical Engineering and Health Systems, and the KTH School of Engineering Sciences, Department of Engineering Mechanics. The project is conducted in close collaboration with clinical partners at Karolinska University Hospital.

About the project

Objective
The project envisions a mobile cyber-physical system where people carrying mobile sensors (e.g., smartphones, smartcards) generate large amounts of trajectory data that is used to sense and monitor human interactions with physical and social environments. Building upon the static causal inference results of the cAIMBER project, the CIML4MOB project aims to build causally informed machine learning models for predicting the adoption time of individuals and subpopulations and their risks of attrition by given dates. Such dynamic causal models may then drive policy design strategies for lasting behavioral changes (the ultimate purpose of behavior interventions).

Background
The ever-changing mobility landscape and climate change continue to challenge existing operating models and the responsiveness of city planners, policymakers, and regulators. City authorities face growing investment needs that call for more focused operations and management strategies aligning mobility portfolios with societal goals. The project targets the root cause of traffic, namely human behavior, and proposes causally informed machine learning to learn and predict human mobility dynamics from pervasive mobile sensing data, helping cities meet sustainability challenges and improve urban resilience to disruptive events.

The human mobility dynamics problem is defined as predicting travel choice decisions given a set of factors, including, for example, individual traits, travel contexts, and interventions. The research pair project (cAIMBER, 2022-2024) developed a data-driven causal inference method to discover the static causal graph of behavioral responses to interventions in public transport. The cAIMBER causal model allows for analysis and prediction of human behavior based on population features, but without regard to when individuals or subpopulations will adopt the desired behavior of a certain incentivization program. From the perspective of city planning and utility costs, two fundamental questions are (1) how to incentivize early adoption of the desired behavioral shift (adoption time) and (2) given that an individual has shifted their behavior, how to prevent reversion to baseline behavior (attrition time). The research consolidator project, CIML4MOB, builds upon the cAIMBER results to develop causally informed machine learning models for predicting the adoption time of individuals and subpopulations and their risks of attrition by given dates.
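To illustrate the adoption-time question in code, the sketch below fits a standard Cox proportional-hazards model to synthetic data with the lifelines library. The covariates, censoring rule, and effect sizes are invented for illustration, and the model is a plain survival baseline rather than the causally informed approach the project will develop.

import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n = 1000
df = pd.DataFrame({
    "incentivized": rng.integers(0, 2, n),   # hypothetical intervention flag
    "commute_km": rng.gamma(2.0, 5.0, n),    # hypothetical individual trait
    "has_car": rng.integers(0, 2, n),
})
# Synthetic adoption times: incentivized travellers tend to adopt sooner.
baseline = rng.exponential(30, n)
df["adoption_weeks"] = baseline * np.where(df["incentivized"] == 1, 0.6, 1.0)
df["adopted"] = (df["adoption_weeks"] < 52).astype(int)   # censor after one year
df.loc[df["adopted"] == 0, "adoption_weeks"] = 52.0

cph = CoxPHFitter()
cph.fit(df, duration_col="adoption_weeks", event_col="adopted")
cph.print_summary()

# Per-individual probability of *not yet* having adopted by each week,
# from which adoption-by-date probabilities can be read off.
survival = cph.predict_survival_function(
    df.drop(columns=["adoption_weeks", "adopted"]))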

Crossdisciplinary collaboration
The project is a collaboration between researchers in transportation science and mathematics at KTH.

About the project

Objective
Large Multimodal Models (LμMs) have the potential to transform engineering education by supporting hands-on, experiential learning. LμMs can process images, audio, video, and other data types, making them well suited to supporting physical engineering design tasks. However, these tools must be carefully designed to align with educational theories and to support, rather than hinder, student learning. This project aims to develop and evaluate a pedagogically aligned virtual teaching assistant (μTA) powered by LμMs to support problem-solving with physical systems in real-world settings for engineering education. The project addresses the challenges students face when dealing with complex, ill-defined problems in engineering design courses and other experiential learning contexts, as well as the limitations of current AI tools in these settings.

Background
Generative AI tools, like large language models (LLMs), have revolutionized education but remain largely confined to screen-based, text-centric tasks such as programming and writing. Recent advancements in Large Multimodal Models (LμMs) enable processing of diverse inputs, such as text, images, and videos, offering opportunities to extend AI’s benefits to experiential learning environments like workshops and labs. While current research focuses on screen-based applications, little is known about how LμMs can support the hands-on, ill-defined problem-solving tasks central to engineering education. This project pioneers the integration of LμMs into these settings, co-designing tools with students and educators to foster skills critical for engineering innovation and for students’ success in their studies and future work.

Crossdisciplinary collaboration
The project is led by two principal investigators from the KTH Royal Institute of Technology: Associate Professor Olga Viberg (Human Centered Technology/EECS) and Assistant Professor Richard Lee Davis (Learning in Engineering Sciences/ITM). This cross-disciplinary collaboration integrates Viberg’s expertise in the design and evaluation of educational technologies—with a strong focus on AI adoption in STEM education and participatory design methods—with Davis’s experience in designing AI-driven tools for experiential learning, integrating multimodal systems, and advancing pedagogical alignment for generative AI technologies.

About the project

Objective
This project aims to develop a collaborative spatial perception framework that constructs various levels of abstract representations in a city-scale area, incorporating LiDAR point clouds, RGBD images, and remote sensing images collected by various agents in a collaborative autonomous system.

Background
The concept of digital twins, involving the creation of virtual representations or models that accurately mirror physical entities or systems, has garnered growing research attention in the realm of smart cities. However, a critical challenge in realizing digital twins lies in efficiently collecting data and recreating the real world, a task that typically demands substantial human effort. To address this gap, autonomous robots, originally designed to reduce human workload, hold immense potential in shaping the future of digital twinning. These robots can potentially assume a pivotal role in autonomously creating and updating the complete mirroring of the physical world, paving the way for the next generation of digital twinning.

About the Digital Futures Postdoc Fellow
Yixi Cai completed his PhD in Robotics at the Mechatronics and Robotic Systems (MaRS) Laboratory, Department of Mechanical Engineering, University of Hong Kong. His research focuses on efficient LiDAR-based mapping with applications in robotics. During his PhD, he explored the potential of LiDAR technology to enhance the autonomous capabilities of mobile robots, particularly unmanned aerial vehicles (UAVs). He developed ikd-Tree, FAST-LIO2, and D-Map, which have been widely used in the LiDAR community. He is deeply interested in exploring elegant representations of the world, which could unlock boundless possibilities in robotics.

You can find more information about him on his personal website: yixicai.com

Main supervisor
Patric Jensfelt, Professor, Head of the Division of Robotics, Perception, and Learning at KTH Royal Institute of Technology, Digital Futures Faculty

Co-supervisor
Olov Andersson, Assistant Professor at the Division of Robotics, Perception, and Learning at KTH Royal Institute of Technology, Digital Futures Faculty