The recent success of Reinforcement Learning and related methods can be attributed to several key factors. First, it is driven by reward signals obtained through the interaction with the environment. Second, it is closely related to the human learning behavior. Third, it has a solid mathematical foundation. Nonetheless, conventional Reinforcement Learning theory exhibits some shortcomings particularly in a continuous environment or in considering...