Free English article on sigmoid-weighted linear units for function approximation – Elsevier 2018

 

Article details
Translated title: Sigmoid-weighted linear units for neural network function approximation in reinforcement learning
English title: Sigmoid-weighted linear units for neural network function approximation in reinforcement learning
Publication year: 2018
Length of the English article: 26 pages
Download cost: the English article is free to download.
Publisher database: Elsevier
Article type: research article
Base (thesis) article: this is not a base article
Indexed in: MedLine, Scopus, Master Journal List, JCR
Article category: ISI
English article format: PDF
Impact factor (IF): 7.197 (2017)
H-index: 121 (2018)
SJR: 2.359 (2018)
Related fields: computer engineering, information technology
Related specializations: artificial intelligence, computer networks
Presentation type: journal
Journal: Neural Networks
Affiliation: Dept. of Brain Robot Interface – ATR Computational Neuroscience Laboratories – Japan
Keywords: reinforcement learning, sigmoid-weighted linear unit, function approximation, Tetris, Atari 2600, deep learning
DOI: https://doi.org/10.1016/j.neunet.2017.12.012
Product code: E10530
Translation status: a prepared Persian translation of this article is not available; it can be ordered.

 

Article table of contents:
Abstract

1- Introduction

2- Method

3- Experiments

4- Analysis

5- Conclusions

6- Acknowledgments

References

Excerpt from the article:

Abstract

In recent years, neural networks have enjoyed a renaissance as function approximators in reinforcement learning. Two decades after Tesauro’s TD-Gammon achieved near top-level human performance in backgammon, the deep reinforcement learning algorithm DQN achieved human-level performance in many Atari 2600 games. The purpose of this study is twofold. First, we propose two activation functions for neural network function approximation in reinforcement learning: the sigmoid-weighted linear unit (SiLU) and its derivative function (dSiLU). The activation of the SiLU is computed by the sigmoid function multiplied by its input. Second, we suggest that the more traditional approach of using on-policy learning with eligibility traces, instead of experience replay, and softmax action selection can be competitive with DQN, without the need for a separate target network. We validate our proposed approach by, first, achieving new state-of-the-art results in both stochastic SZ-Tetris and Tetris with a small 10×10 board, using TD(λ) learning and shallow dSiLU network agents, and, then, by outperforming DQN in the Atari 2600 domain by using a deep Sarsa(λ) agent with SiLU and dSiLU hidden units.
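The abstract above defines the SiLU as the input multiplied by the sigmoid of the input, and the dSiLU as its derivative. A minimal NumPy sketch of the two activations is given below; the function names silu and dsilu are chosen here for illustration and are not taken from the paper's code.

import numpy as np

def sigmoid(z):
    # Logistic sigmoid.
    return 1.0 / (1.0 + np.exp(-z))

def silu(z):
    # SiLU activation: the sigmoid of the input multiplied by the input, z * sigmoid(z).
    return z * sigmoid(z)

def dsilu(z):
    # dSiLU activation: the derivative of the SiLU with respect to z,
    # sigmoid(z) * (1 + z * (1 - sigmoid(z))).
    s = sigmoid(z)
    return s * (1.0 + z * (1.0 - s))

For large positive inputs the SiLU behaves like the identity (similar to a ReLU), while the dSiLU approaches 1 (similar to a sigmoid); at z = 0 the SiLU is 0 and the dSiLU is 0.5.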

Introduction

Neural networks have enjoyed a renaissance as function approximators in reinforcement learning (Sutton and Barto, 1998) in recent years. The DQN algorithm (Mnih et al., 2015), which combines Q-learning with a deep neural network, experience replay, and a separate target network, achieved human-level performance in many Atari 2600 games. Since the development of the DQN algorithm, there have been several proposed improvements, both to DQN specifically and to deep reinforcement learning in general. Van Hasselt et al. (2015) proposed double DQN to reduce overestimation of the action values in DQN, and Schaul et al. (2016) developed a framework for more efficient replay by prioritizing experiences of more important state transitions. Wang et al. (2016) proposed the dueling network architecture for more efficient learning of the action value function by separately estimating the state value function and the advantages of each action. Mnih et al. (2016) proposed a framework for asynchronous learning by multiple agents in parallel, both for value-based and actor-critic methods. To date, the most impressive application of deep reinforcement learning is AlphaGo (Silver et al., 2016, 2017), which has achieved superhuman performance in the ancient board game Go.

The purpose of this study is twofold. First, motivated by the high performance of the expected energy restricted Boltzmann machine (EE-RBM) in our earlier studies (Elfwing et al., 2015, 2016), we propose two activation functions for neural network function approximation in reinforcement learning: the sigmoid-weighted linear unit (SiLU) and its derivative function (dSiLU). The activation of the SiLU is computed by the sigmoid function multiplied by its input. After we first proposed the SiLU (Elfwing et al., 2017), Ramachandran et al. (2017) recently performed a comprehensive comparison between the SiLU, the rectifier linear unit (ReLU; Hahnloser et al., 2000), and 6 other activation functions in the supervised learning domain. They found that the SiLU consistently outperformed the other activation functions when tested in 3 deep architectures on CIFAR-10/100 (Krizhevsky, 2009), in 5 deep architectures on ImageNet (Deng et al., 2009), and on 4 test sets for English-to-German machine translation …
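As noted in the abstract, the agents in this study use on-policy learning with eligibility traces and softmax action selection rather than DQN's experience replay, target network, and ε-greedy exploration. A minimal sketch of softmax (Boltzmann) action selection over a vector of estimated action values follows; the temperature parameter tau and the function name softmax_action are assumptions for illustration and do not reproduce the paper's own temperature schedule.

import numpy as np

def softmax_action(q_values, tau=1.0, rng=None):
    # Sample an action index with probability proportional to exp(Q(s, a) / tau).
    # Lower tau approaches greedy selection; higher tau gives more exploration.
    # Subtracting the maximum preference keeps the exponentials numerically stable.
    rng = rng if rng is not None else np.random.default_rng()
    prefs = np.asarray(q_values, dtype=float) / tau
    prefs -= prefs.max()
    probs = np.exp(prefs)
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

# Example: with tau = 1.0 the highest-valued action is the most likely, but not certain, choice.
action = softmax_action([1.2, 0.3, -0.5], tau=1.0)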
