Sim2Real Transfer of Visual Grounding for Natural Human-Robot Interaction

Tziafas, Georgios (2021) Sim2Real Transfer of Visual Grounding for Natural Human-Robot Interaction. Master's Thesis / Essay, Artificial Intelligence.

Abstract

Service robots need to interact naturally with non-expert human users, not only to help them with various tasks but also to receive verbal guidance that resolves ambiguities, e.g. through visual grounding of objects to natural language queries. Even though modern visual grounding methods can be applied in an open-ended fashion, they rely heavily on model size, and their transfer performance in robot-specific domains suffers from the high domain discrepancy between benchmark data and real-time sensory data. In this work, we seek to address these limitations by adopting the Sim2Real transfer methodology for visual grounding in tabletop RGB-D domains. Towards this goal, we develop a novel multi-modal deep learning model for visual grounding. To train the network, we design an algorithm that generates visual grounding queries for tabletop RGB-D scenes by extracting and parsing scene graph representations of the objects. We deal with the domain discrepancy problem by pre-segmenting objects and learning on the extracted RGB crops instead of entire scenes. We evaluate the proposed method using a subset of the Washington RGB-D scene dataset. Experimental results show that, with minimal fine-tuning, the model trained on synthetic data achieves transfer accuracy on par with a model trained on real data, and that it yields performance gains when used solely as a pre-training resource.

Item Type: Thesis (Master's Thesis / Essay)
Supervisor name: Mohades Kasaei, S.H. and Schomaker, L.R.B.
Degree programme: Artificial Intelligence
Thesis type: Master's Thesis / Essay
Language: English
Date Deposited: 04 Jan 2022 10:57
Last Modified: 05 Jan 2022 13:13
URI: https://fse.studenttheses.ub.rug.nl/id/eprint/26350
