A Conversational System for Interactive Image Editing

Seitaro Shinagawa

Ikoma : Nara Institute of Science and Technology, 2021.6

Lecture Archive
Contents Intro.

Systems with natural language interfaces, such as conversational interfaces, are useful for human users in human-system collaboration tasks. Interactive image editing is one such task, and it is a promising application for non-skilled users. If users want to create an image they have imagined, they can ask the system to create it, just as they would ask a skilled worker. This thesis presents an interactive image editing system based on neural network image generative models, which proactively communicates with users to create the desired image.

The interactive image editing task is challenging in two respects: 1) the system has to handle a variety of editing requests that users express in natural language, and 2) the system has to handle the uncertainty of the generated images that arises from the diversity of editing requests. For the first problem, we propose an interactive image editing framework based on neural network image generative models. The framework trains a model to automatically learn the relationship between changes in images and natural language editing requests. We demonstrate that our model can successfully edit a given image according to the editing requests.

For the second problem, a naive solution is to show the user multiple images generated by multiple editing models and ask the user, every time, to confirm which image best matches the editing request. However, this strategy makes the interactive process redundant. To solve this problem, we propose a proactive confirmation method in which the system asks the user for confirmation only when it is uncertain which image better matches the editing request. We define an uncertainty score based on the entropy of the generated image and use it to decide whether the system should confirm. We demonstrate that our method requires fewer confirmations from users while achieving better image quality over the course of the dialogue.
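The confirmation rule described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition of the uncertainty score, so everything below is an assumption made for illustration: each pixel of a generated image is assumed to carry a probability in [0, 1] (for example, a sigmoid output), the score is taken as the mean per-pixel binary entropy, and the names `binary_entropy`, `uncertainty_score`, `should_confirm`, and the threshold value 0.5 are hypothetical and not taken from the lecture.

```python
import math
from typing import Sequence


def binary_entropy(p: float) -> float:
    """Shannon entropy in bits of a Bernoulli variable with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def uncertainty_score(pixel_probs: Sequence[float]) -> float:
    """Mean per-pixel entropy: 0 means fully confident, 1 maximally uncertain.

    Assumption: this averaging is an illustrative choice, not the
    definition used in the lecture.
    """
    if not pixel_probs:
        raise ValueError("pixel_probs must not be empty")
    return sum(binary_entropy(p) for p in pixel_probs) / len(pixel_probs)


def should_confirm(pixel_probs: Sequence[float], threshold: float = 0.5) -> bool:
    """Ask the user only when the score exceeds a (hypothetical) threshold."""
    return uncertainty_score(pixel_probs) > threshold


# Toy inputs: a confidently generated image and an ambiguous one.
confident = [0.01, 0.99, 0.02, 0.98]
ambiguous = [0.5, 0.45, 0.55, 0.6]

print(should_confirm(confident))  # low entropy: no confirmation needed
print(should_confirm(ambiguous))  # high entropy: ask the user
```

Under these assumptions, the system would skip the confirmation turn for the first image and ask the user about the second, which is how a threshold on entropy can reduce the number of confirmations compared with asking every time.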

Volume No.

No.: 1

Call Number: LA-I-R [MPDASH] [Mobile]

Material ID: M019140

Details

Publication year

2021

Form

Digitized video material (38 min 08 sec)

Series title

Division of Information Science Colloquium ; Academic Year 2021

Note

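Speaker affiliation: Division of Information Science

Lecture date: June 7, 2021, 3rd period

Lecture venue: Information Science Building, Medium Lecture Room (L2)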

Country of publication

Japan

Title language

English (eng)

Language of texts

English (eng)

Author information

Shinagawa, Seitaro