Article Specifications
Translated Article Title | Superpixel tensor pooling for visual tracking using fusion of multiple midlevel visual cues
English Article Title | Superpixel Tensor Pooling for Visual Tracking Using Multiple Midlevel Visual Cues Fusion
Publication | 2019 article
Number of Pages (English Article) | 8 pages
Cost | The English article is free to download.
Database | IEEE
Writing Type | Research Article
Base Article | This is not a base article.
Index | Scopus – Master Journals List – JCR
Article Type | ISI
English Article Format |
Impact Factor (IF) | 4.641 in 2018
H-index | 56 in 2019
SJR | 0.609 in 2018
ISSN | 2169-3536
Quartile | Q2 in 2018
Conceptual Model | None
Questionnaire | None
Variables | None
References | Included
Related Fields | Computer Engineering
Related Specializations | Algorithms and Computation
Presentation Type | Journal
Journal / Conference | IEEE Access
University | Department of Electrical Engineering, City University of Hong Kong, Hong Kong
Keywords | Incremental positive and negative subspace learning, fusion of multiple midlevel visual cues, superpixel tensor pooling, visual tracking
English Keywords | Incremental positive and negative subspaces learning, multiple midlevel visual cues fusion, superpixel tensor pooling, visual tracking
DOI | https://doi.org/10.1109/ACCESS.2019.2946939
Product Code | E13857
Translation Status | A ready translation of this article is not available; you can order one using the button below.
Free Article Download | Free download of the English article
Order a Translation | Order a translation of this article
Article Table of Contents:
Abstract
I. Introduction
II. Superpixel Tensor Pooling Tracker
III. Experiments and Results
IV. Conclusion
Authors
Figures
References
Excerpt from the Article:
Abstract
In this paper, we propose a method called the superpixel tensor pooling tracker, which can fuse multiple midlevel cues captured by superpixels into sparse pooled tensor features. Our method first adopts the superpixel method to generate different patches (superpixels) from the target template or candidates. Then, for each superpixel, it encodes different midlevel cues, including HSI color, RGB color, and spatial coordinates, into a histogram matrix to construct a new feature space. Next, these matrices are stacked into a third-order tensor. After that, the tensor is pooled into a sparse representation. Then incremental positive and negative subspace learning is performed. Our method combines the good characteristics of midlevel cues and sparse representation, and hence is more robust to large appearance variations and can capture a compact and informative appearance of the target object. To validate the proposed method, we compare it with state-of-the-art methods on 24 sequences with multiple visual tracking challenges. Experimental results demonstrate that our method outperforms them significantly.

Introduction

The study of visual tracking has achieved great success in recent years. Visual tracking is the process of locating a moving object, or multiple objects, over time in a video stream or from a camera. It can be divided into three steps: (1) object detection; (2) location prediction; (3) data association. Before a tracking algorithm performs these steps, shot boundary detection needs to be carried out for each video application to extract the sequence [1]. However, because of heavy occlusion, drift, fast motion, severe scale variation, large shape deformation, and so on, visual tracking is still a challenge in computer vision [2]–[4]. Many advanced visual tracking methods have been developed to address these challenges, such as sparse representation based approaches, correlation filter (CF) based methods, and deep learning (DL) based methods.

Sparse representation has been introduced successfully into the construction of the appearance model in visual tracking [3]–[5]. It uses a sparse linear representation to represent the candidates [3], [5]. It can use very few but highly related target templates to reduce the impact of background noise [4]. Moreover, it can use local sparse codes to model the target appearance adaptively and exploit its discriminative nature [4].
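To make the pipeline described in the abstract more concrete, below is a minimal sketch of the feature-construction stage only: superpixels are generated, per-channel histograms of several midlevel cues are collected for each superpixel, and the resulting histogram matrices are stacked into a third-order tensor. This is not the authors' implementation; the use of SLIC superpixels, HSV as a stand-in for the paper's HSI cue, the bin counts, and the function names are all assumptions for illustration, and the paper's sparse pooling and incremental positive and negative subspace learning steps are not reproduced here.

```python
# Illustrative sketch only: builds a per-superpixel cue-histogram tensor similar in
# spirit to the feature construction described in the abstract. SLIC, HSV (instead
# of HSI), and all parameter values are assumptions, not the authors' implementation.
import numpy as np
from skimage.color import rgb2hsv
from skimage.segmentation import slic


def superpixel_cue_tensor(image_rgb, n_superpixels=50, n_bins=8):
    """Return a third-order tensor of shape (superpixels, cue channels, bins).

    image_rgb is expected to be an H x W x 3 uint8 image (a target template
    or a candidate patch).
    """
    h, w, _ = image_rgb.shape
    rgb = image_rgb.astype(np.float64) / 255.0        # RGB color cue, scaled to [0, 1]
    hsv = rgb2hsv(image_rgb)                          # stand-in for the HSI color cue
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys / h, xs / w], axis=-1)      # normalized spatial-coordinate cue

    labels = slic(image_rgb, n_segments=n_superpixels, compactness=10)
    cues = [rgb, hsv, coords]

    slices = []
    for sp in np.unique(labels):
        mask = labels == sp
        rows = []
        for cue in cues:
            for c in range(cue.shape[-1]):
                hist, _ = np.histogram(cue[..., c][mask], bins=n_bins, range=(0.0, 1.0))
                rows.append(hist / max(mask.sum(), 1))  # normalized histogram per channel
        slices.append(np.asarray(rows))

    # Stacking the per-superpixel histogram matrices yields a third-order tensor;
    # the paper then pools this tensor into a sparse representation (not shown here).
    return np.stack(slices, axis=0)
```

In the full method, such a tensor would be computed for the target template and for each candidate, pooled into a sparse representation, and scored under the incremental positive and negative subspace model described in the paper.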