Free English paper on lifelong learning of human actions with deep neural networks – Elsevier 2017


 

Paper details
Translated title: Lifelong learning of human actions with self-organizing deep neural networks
English title: Lifelong learning of human actions with deep neural network self-organization
Publication year: 2017
Length of English paper: 13 pages
Download cost: the English paper is free
Publisher database: Elsevier
Article type: Research article
Base article: this paper is not a base (thesis-oriented) article
Indexing: Scopus – Master Journals – JCR – MedLine
Paper category: ISI
English paper format: PDF
Impact factor (IF): 7.197 in 2017
H-index: 121 in 2017
SJR: 2.359 in 2017
Related fields: Computer Engineering, Information Technology
Related specializations: Algorithms and Computation, Artificial Intelligence, Computer Networks
Presentation type: Journal
Journal / Conference: Neural Networks
Affiliation: Department of Informatics – University of Hamburg – Germany
Keywords: Lifelong learning, Action recognition, Unsupervised deep learning, Self-organizing neural networks
Digital object identifier (DOI): http://dx.doi.org/10.1016/j.neunet.2017.09.001
Product code: E10383
Translation status: a ready-made Persian translation of this paper is not available; you can order one via the button below
Free download: download the English paper for free
Order a translation of this paper

 

Table of contents:
Abstract
Keywords
1 Introduction
2 Related work
3 Proposed method
4 Experiments and results
5 Discussion
6 Conclusion
Acknowledgments
References

Excerpt from the paper:
Abstract

Lifelong learning is fundamental in autonomous robotics for the acquisition and fine-tuning of knowledge through experience. However, conventional deep neural models for action recognition from videos do not account for lifelong learning but rather learn a batch of training data with a predefined number of action classes and samples. Thus, there is the need to develop learning systems with the ability to incrementally process available perceptual cues and to adapt their responses over time. We propose a self-organizing neural architecture for incrementally learning to classify human actions from video sequences. The architecture comprises growing self-organizing networks equipped with recurrent neurons for processing time-varying patterns. We use a set of hierarchically arranged recurrent networks for the unsupervised learning of action representations with increasingly large spatiotemporal receptive fields. Lifelong learning is achieved in terms of prediction-driven neural dynamics in which the growth and the adaptation of the recurrent networks are driven by their capability to reconstruct temporally ordered input sequences. Experimental results on a classification task using two action benchmark datasets show that our model is competitive with state-of-the-art methods for batch learning even when a significant number of sample labels are missing or corrupted during training sessions. Additional experiments show the ability of our model to adapt to non-stationary input while avoiding catastrophic interference.
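The abstract describes the architecture only at a high level. The following is a minimal, illustrative sketch (not the authors' implementation) of a growing self-organizing network with a recurrent temporal context, in which a new neuron is inserted whenever the network matches the current input poorly. All class names, thresholds, and update rules below are simplifying assumptions.

# Minimal sketch of a growing recurrent self-organizing network.
# Assumptions: each neuron stores a weight vector and a context vector that
# summarizes the previous winner; a neuron is inserted when activation is low.
import numpy as np

class GrowingRecurrentSOM:
    def __init__(self, input_dim, insert_threshold=0.85, alpha=0.5,
                 lr_winner=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.alpha = alpha                      # weight vs. context trade-off
        self.insert_threshold = insert_threshold
        self.lr_winner = lr_winner
        # start with two random neurons (weights W and temporal contexts C)
        self.W = rng.normal(size=(2, input_dim))
        self.C = np.zeros((2, input_dim))
        self.global_context = np.zeros(input_dim)

    def _distances(self, x):
        # distance mixes the current input and the temporal context
        dw = np.linalg.norm(self.W - x, axis=1)
        dc = np.linalg.norm(self.C - self.global_context, axis=1)
        return self.alpha * dw + (1.0 - self.alpha) * dc

    def step(self, x):
        d = self._distances(x)
        b = int(np.argmin(d))                   # best-matching unit
        activation = np.exp(-d[b])              # how well the input is matched
        if activation < self.insert_threshold:
            # grow: insert a new neuron between the input and the winner
            self.W = np.vstack([self.W, 0.5 * (self.W[b] + x)])
            self.C = np.vstack([self.C, 0.5 * (self.C[b] + self.global_context)])
        else:
            # adapt the winner towards the input and the current context
            self.W[b] += self.lr_winner * (x - self.W[b])
            self.C[b] += self.lr_winner * (self.global_context - self.C[b])
        # update the temporal context from the winner for the next time step
        self.global_context = 0.5 * (self.W[b] + self.C[b])
        return b

# toy usage: feed a short sequence of pose-like feature vectors
net = GrowingRecurrentSOM(input_dim=8)
sequence = np.random.default_rng(1).normal(size=(20, 8))
winners = [net.step(frame) for frame in sequence]
print(len(net.W), "neurons after one sequence")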

Introduction

The robust recognition of other people’s actions represents a crucial component underlying social cognition. Neurophysiological studies have identified a specialized area for the visual coding of articulated motion in the mammalian brain (Perrett, Rolls, & Caan, 1982), comprising neurons selective to biological motion in terms of time-varying patterns of form and motion features with increasing complexity of representation (Giese & Rizzolatti, 2015). The hierarchical organization of the visual cortex has inspired computational models for action recognition from videos, with deep neural network architectures producing state-of-the-art results on a set of benchmark datasets (e.g. Baccouche, Mamalet, Wolf, Garcia, & Baskurt, 2011; Jain, Tompson, LeCun, & Bregler, 2015; Jung, Hwang, & Tani, 2015). Typically, visual models using deep learning comprise a set of convolution and pooling layers trained in a hierarchical fashion for yielding action feature representations with an increasing degree of abstraction (Guo, Liu, Oerlemans, Lao, Wu, & Lew, 2016). This processing scheme is in agreement with neurophysiological studies supporting the presence of functional hierarchies with increasingly large spatial and temporal receptive fields along cortical pathways (Hasson, Yang, Vallines, Heeger, & Rubin, 2008; Taylor, Hobbs, Burroni, & Siegelmann, 2015).

The training of deep learning models for action sequences has been proven to be computationally expensive and to require an adequately large number of training samples for the successful learning of spatiotemporal filters. The supervised training procedure comprises two stages: (i) a forward stage in which the input is represented by the current network parameters and the prediction error is used to compute the loss with respect to the ground-truth sample labels, and (ii) a backward stage which computes the gradients with respect to the parameters and updates them using back-propagation through time (BPTT, Mozer, 1995). Different regularization methods have been proposed to boost performance, such as parameter sharing and dropout. However, the training process requires samples to be (correctly) labeled in terms of input–output pairs. Consequently, standard deep learning models for action recognition do not account for learning scenarios in which the number of training samples is not sufficiently high and ground-truth labels may be occasionally missing or noisy.
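To make the conventional two-stage supervised scheme described above concrete, here is a minimal, hypothetical PyTorch sketch of a recurrent action classifier trained on labeled clips: the forward stage computes class scores and a loss from ground-truth labels, and the backward stage computes gradients via back-propagation through time. Model name, layer sizes, and data shapes are illustrative assumptions, not taken from the paper.

# Minimal sketch of the conventional supervised pipeline (not the paper's method):
# (i) forward stage + loss from ground-truth labels, (ii) backward stage with BPTT.
import torch
import torch.nn as nn

class RecurrentActionClassifier(nn.Module):
    def __init__(self, feat_dim=64, hidden=128, num_classes=6):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, clips):                 # clips: (batch, time, feat_dim)
        _, (h, _) = self.rnn(clips)           # h: (num_layers, batch, hidden)
        return self.head(h[-1])               # class scores per clip

model = RecurrentActionClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# dummy labeled batch: 8 clips of 16 frames with 64-dimensional features each
clips = torch.randn(8, 16, 64)
labels = torch.randint(0, 6, (8,))

logits = model(clips)                         # (i) forward stage
loss = criterion(logits, labels)              # loss from ground-truth labels
optimizer.zero_grad()
loss.backward()                               # (ii) backward stage: BPTT
optimizer.step()
print(float(loss))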
