Bayesian Process Policy Search with Latent Variables in Real-world Environments

Hikaru Sasaki

Ikoma : Nara Institute of Science and Technology, 2022.6

Lecture Archive

Volume/Issue Information

1 item in total
No. / Print year / Location / Call number / Material ID / Loan category / Status / Reservations

1

  • LA-I-R

M020770

Summary

Policy search reinforcement learning has drawn much attention as a method for learning robot control. In particular, policy search that uses Gaussian process regression as the policy model can learn optimal actions. However, naively applying Gaussian process policy search to real-world tasks is difficult, because real-world environments such as those in robotics often involve various uncertainties, and this uncertainty often requires a very complex state-to-action mapping to obtain high-performance actions. To overcome this complexity, this study focuses on latent variable modeling. It explores a latent variable modeling approach in Gaussian process policy search to capture the complexity of data sampled in uncertain environments. An algorithm is then derived that simultaneously performs latent variable inference and policy learning, with the aim of making policy search applicable to various real-world tasks with uncertainty. We focus on two complexities: 1) multiple optimal actions emerging from an ambiguously specified reward function, and 2) weak observations from the environment that contain little information about the state. For each complexity, we designed a policy model by introducing latent variables into the Gaussian process and derived policy update schemes based on variational Bayesian learning. The performance of the proposed policy search methods was verified in simulation and on a task using a robot manipulator. Finally, a policy learning framework based on Bayesian optimization with latent variables is proposed for application to actual heavy machinery, and its performance was verified using a real waste crane.

Details

Publication year

2022

Format

Digitized video material (min sec)

Series title

Division of Information Science Colloquium ; Academic Year 2022

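Notes

Speaker affiliation: Division of Information Science

Lecture date: June 24, 2022, 3rd period

Venue: Information Science Building, AI Large Lecture Hall (L1)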

Title language

English (eng)

Text language

English (eng)

Author information

Sasaki,Hikaru