Activities & Results



The main results achieved in the first half of the project, towards the three main objectives enumerated in the proposal, are summarised below:

1. Understanding and Design for the Smart Kitchen

A deeper understanding of the cooking process (Kuoppamäki et al., 2021) led to a number of studies on interaction in and around cooking, including list advancement (Jaber & McMillan, 2022), content navigation (Zhao et al., 2022), the design of conversational interaction (Kuoppamäki et al., 2023), the impact of contextual awareness on command construction and delivery (Jaber et al., in submission), and an ongoing study comparing proactive organisational support between young adults and adults over 65 (Kuoppamäki et al.).

2. Perception and Representation of Long Term Human Action

Our research showed that thermal imaging is a relevant modality for detecting frustration in human-robot interaction (Mohamed et al., 2022). Models predicting frustration were trained on a dataset of 18 participants interacting with a Nao social robot in our lab, using features from several modalities: thermal, RGB, electrodermal activity (EDA), and all three combined. The models reached an accuracy of 89% with RGB features alone, 87% with thermal features only, 84% with EDA, and 86% with all modalities combined. We are also investigating the accuracy of frustration prediction models using data collected at the KTH Library, where students ask a Furhat robot for directions.
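The per-modality comparison above can be illustrated with a minimal sketch. This is not the project's actual pipeline: the synthetic features, the random-forest classifier, and the cross-validation setup are all illustrative assumptions standing in for the real thermal, RGB, and EDA feature extraction.

```python
# Illustrative sketch: train one frustration classifier per feature
# modality and compare cross-validated accuracy. All data here is
# synthetic; real features would be extracted from thermal imaging,
# RGB video, and EDA sensor recordings.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 180  # synthetic stand-in for interaction windows from 18 participants
labels = rng.integers(0, 2, size=n)  # 0 = not frustrated, 1 = frustrated

# Per-modality feature matrices (dimensions are arbitrary here);
# a small label-dependent shift makes the classes separable.
features = {
    "thermal": rng.normal(size=(n, 8)) + labels[:, None] * 0.5,
    "rgb": rng.normal(size=(n, 16)) + labels[:, None] * 0.5,
    "eda": rng.normal(size=(n, 4)) + labels[:, None] * 0.5,
}
# "all" concatenates the three modalities into one feature vector.
features["all"] = np.hstack(
    [features["thermal"], features["rgb"], features["eda"]]
)

scores = {}
for name, X in features.items():
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    scores[name] = cross_val_score(clf, X, labels, cv=5).mean()

for name, acc in scores.items():
    print(f"{name}: {acc:.2f}")
```

The same pattern (one model per feature set, plus a fused set) is what allows the per-modality accuracies to be compared on equal footing.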

3. Input, Output, and Interaction for Smart Assistive Technologies

To fulfil the goal of adapting virtual agent and robot behaviour to different contexts, we have researched interlocutor-aware facial expressions in dyadic interaction (Jonell et al., 2020), as well as adaptive facial expressions in controllable generation of speech and gesture. We have developed techniques to generate conversational speech with control over speaking style, e.g. to signal certainty or uncertainty (Wang et al., 2022; Kirkland et al., 2022), as well as models that generate coherent speech and gesture from a common representation (Wang et al., 2021). Another direction concerns adaptation to spatial contexts and environments, where we have used imitation learning and physical simulation to produce referential gestures (Deichler et al., 2022).