Free English article on curriculum learning of visual attribute clusters – Elsevier 2018


Article details
Translated article title: Curriculum learning of visual attribute clusters for multi-task classification
English article title: Curriculum learning of visual attribute clusters for multi-task classification
Publication year: 2018
English article length: 15 pages
Download cost: The English article is free to download.
Journal database: Elsevier
Writing type: Research article
Base article: This is not a base article.
Index: Scopus – Master Journals – JCR
Article type: ISI
English article format: PDF
Impact Factor (IF): 3.962 (2017)
H-index: 168 (2018)
SJR: 1.065 (2018)
Related fields: Computer engineering, computer science
Related specializations: Artificial intelligence, machine vision
Presentation type: Journal
Journal/Conference: Pattern Recognition
Affiliation: Department of Computer Science – University of Houston – USA
Keywords: Curriculum learning, Multi-task classification, Visual attributes
DOI: https://doi.org/10.1016/j.patcog.2018.02.028
Product code: E9704
Translation status: No ready translation of this article is available; you can order one via the button below.
Free download: Download the free English article
Translation order: Order a translation of this article


Article table of contents:
Highlights
Abstract
Keywords
1 Introduction
2 Related work
3 Methodology
4 Experiments
5 Ablation studies and performance analysis
6 Conclusion
Acknowledgment
References
Vitae

Excerpt from the article:
Abstract

Visual attributes, from simple objects (e.g., backpacks, hats) to soft-biometrics (e.g., gender, height, clothing), have proven to be a powerful representational approach for many applications such as image description and human identification. In this paper, we introduce a novel method to combine the advantages of both multi-task and curriculum learning in a visual attribute classification framework. Individual tasks are grouped after performing hierarchical clustering based on their correlation. The clusters of tasks are learned in a curriculum learning setup by transferring knowledge between clusters. The learning process within each cluster is performed in a multi-task classification setup. By leveraging the acquired knowledge, we speed up the process and improve performance. We demonstrate the effectiveness of our method via ablation studies and a detailed analysis of the covariates on a variety of publicly available datasets of humans standing with their full body visible. Extensive experimentation has proven that the proposed approach boosts performance by 4%–10%.
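The grouping-then-ordering step described in the abstract can be sketched in a few lines of code. The snippet below is a minimal illustration, not the authors' implementation: it assumes a binary label matrix `Y` of shape (n_samples, n_tasks), clusters the tasks by the correlation of their label vectors using SciPy's agglomerative clustering, and returns the clusters ordered by mean within-cluster correlation. The distance, linkage, and number of clusters are placeholder choices that may differ from the paper's.

```python
# Minimal sketch: group binary attribute tasks by label correlation and
# order the groups for a curriculum. Y is assumed to be an (n_samples,
# n_tasks) 0/1 label matrix; distance/linkage choices are illustrative.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_tasks(Y, num_clusters=3):
    corr = np.corrcoef(Y.T)               # (n_tasks, n_tasks) task correlations
    dist = 1.0 - np.abs(corr)              # highly correlated tasks are "close"
    iu = np.triu_indices_from(dist, k=1)   # condensed form expected by linkage
    Z = linkage(dist[iu], method="average")
    labels = fcluster(Z, t=num_clusters, criterion="maxclust")
    clusters = [np.where(labels == c)[0] for c in np.unique(labels)]

    def within_corr(idx):                  # mean pairwise |corr| inside a cluster
        if len(idx) < 2:
            return 0.0
        sub = np.abs(corr[np.ix_(idx, idx)])
        return (sub.sum() - len(idx)) / (len(idx) * (len(idx) - 1))

    # Curriculum order: most strongly correlated cluster first.
    return sorted(clusters, key=within_corr, reverse=True)

# Example with synthetic labels: returns task-index arrays in learning order.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Y = (rng.random((500, 6)) > 0.5).astype(float)
    print(cluster_tasks(Y))
```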

Introduction

Vision as reception. Vision as reflection. Vision as projection. – Bill Viola, note 1986

When we are interested in providing a description of an object or a human, we tend to use visual attributes to accomplish this task. For example, a laptop can have a wide screen, a silver color, and a brand logo, whereas a human can be tall, female, wearing a blue t-shirt, and carrying a backpack. Visual attributes in computer vision are the equivalent of the adjectives in our speech. We rely on visual attributes since (i) they enhance our understanding by creating an image in our head of what this object or human looks like; (ii) they narrow down the possible related results when we want to search for a product online or provide a suspect description; (iii) they can be composed in different ways to create descriptions; (iv) they generalize well, as with some fine-tuning they can be applied to recognize objects for different tasks; and (v) they are a meaningful semantic representation of objects or humans that can be understood by both computers and humans. However, effectively predicting the corresponding visual attributes of a human given an image remains a challenging task [1]. In real-life scenarios, images might be of low resolution, humans might be partially occluded in cluttered scenes, or there might be significant pose variations.

Estimating the visual attributes of humans is an important computer vision problem with applications ranging from finding missing children to virtual reality. When a child goes missing or the police are looking for a suspect, a short description is usually provided that comprises such attributes (for example, a tall white male with a black shirt, wearing a hat and carrying a backpack). Thus, if we could efficiently identify which images or videos contain humans with such characteristics, we could dramatically reduce the labor and time required to identify them [2]. Another interesting application is the 3D reconstruction of the human body in virtual reality [3]. If we have such attribute information, we can facilitate the reconstruction by providing the necessary priors. For example, it is easier to accurately reconstruct the body shape of a human if we already know that it is a tall male with shorts and sunglasses than if no information is provided.

In this work, we introduce CILICIA (CurrIculum Learning multItask ClassIfication Attributes) to address the problem of visual attribute classification from images of standing humans. Instead of using low-level representations, which would require extracting hand-crafted features, we propose a deep learning method to solve multiple binary classification tasks. CILICIA differentiates itself from the literature in that: (i) it performs end-to-end learning by feeding a single ConvNet with the entire image of a human, without making any assumptions about predefined connections between body parts and image regions; and (ii) it exploits the advantages of both multi-task and curriculum learning. Tasks are split into groups based on the cross-correlation of their labels using hierarchical agglomerative clustering. The groups of tasks are learned in a curriculum learning scenario, starting with the group with the highest within-group cross-correlation and moving to the less correlated ones by transferring knowledge from the former to the latter. The tasks within each group are learned in a typical multi-task classification setup. Parts of this publication appear in our previous work [4].
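To make the training flow described above concrete, here is a rough sketch (illustrative only, not the paper's ConvNet or training code) of a shared trunk with one binary head per attribute, trained cluster by cluster in curriculum order. Each stage reuses the trunk weights learned in earlier stages, which stands in for the knowledge-transfer step; `MultiAttributeNet` and `train_curriculum` are hypothetical names, and the tiny trunk is a placeholder for the paper's ConvNet.

```python
# Sketch of per-cluster multi-task training with knowledge transfer:
# a shared trunk plus one binary logit head per attribute; clusters are
# visited in curriculum order, each stage continuing from the trunk
# weights learned so far. Architecture/hyperparameters are placeholders.
import torch
import torch.nn as nn

class MultiAttributeNet(nn.Module):
    def __init__(self, num_tasks, feat_dim=128):
        super().__init__()
        self.trunk = nn.Sequential(          # stand-in for the full ConvNet
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),
        )
        self.heads = nn.ModuleList(
            nn.Linear(feat_dim, 1) for _ in range(num_tasks)
        )

    def forward(self, x, task_ids):
        f = self.trunk(x)                    # shared features for the cluster
        return torch.cat([self.heads[t](f) for t in task_ids], dim=1)

def train_curriculum(model, loader, cluster_order, epochs=1, lr=1e-4):
    bce = nn.BCEWithLogitsLoss()
    for cluster in cluster_order:            # e.g. the output of cluster_tasks
        cluster = [int(t) for t in cluster]
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs):
            for images, labels in loader:    # labels: (batch, n_tasks) in {0,1}
                logits = model(images, cluster)
                loss = bce(logits, labels[:, cluster].float())
                opt.zero_grad()
                loss.backward()
                opt.step()
```

Note that the knowledge transfer between clusters in the paper involves more than simply continuing from the same trunk weights; this sketch only captures the overall curriculum flow.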
