مقاله انگلیسی رایگان در مورد یک روش سریع کاوش مجموعه اقلام پرتکرار مبتنی بر GPU – الزویر 2018

 

مشخصات مقاله
ترجمه عنوان مقاله : یک روش سریع کاوش مجموعه اقلام پرتکرار مبتنی بر GPU برای داده های در مقیاس بزرگ: GMiner
عنوان انگلیسی مقاله A fast GPU-based frequent itemset mining method for large-scale data: GMiner
انتشار مقاله سال 2018
تعداد صفحات مقاله انگلیسی 45 صفحه
هزینه دانلود مقاله انگلیسی رایگان میباشد.
پایگاه داده نشریه الزویر
نوع نگارش مقاله
مقاله پژوهشی (Research Article)
مقاله بیس این مقاله بیس میباشد
نمایه (index) Scopus – Master Journal List – JCR
نوع مقاله ISI
فرمت مقاله انگلیسی  PDF
ایمپکت فاکتور(IF)
5.080 در سال 2017
شاخص H_index 142 در سال 2019
شاخص SJR 1.635 در سال 2017
شناسه ISSN 0020-0255
شاخص Quartile (چارک) Q1 در سال 2017
رشته های مرتبط مهندسی کامپیوتر
گرایش های مرتبط مهندسی نرم افزار، مهندسی الگوریتم ها و محاسبات
نوع ارائه مقاله
ژورنال
مجله  مجله علوم اطلاعات – Information Sciences
دانشگاه  DGIST (Daegu Gyeongbuk Institute of Science and Technology), Daegu, Republic of Korea
کلمات کلیدی کاوش مجموعه اقلام پرتکرار، واحد پردازش گرافیکی، الگوریتم موازی، ریزش کاری
کلمات کلیدی انگلیسی frequent itemset mining، graphics processing unit، parallel algorithm، workload skewness
شناسه دیجیتال – doi
https://doi.org/10.1016/j.ins.2018.01.046
کد محصول  E10893
وضعیت ترجمه مقاله  ترجمه آماده این مقاله موجود نمیباشد. میتوانید از طریق دکمه پایین سفارش دهید.
دانلود رایگان مقاله دانلود رایگان مقاله انگلیسی
سفارش ترجمه این مقاله سفارش ترجمه این مقاله

 

فهرست مطالب مقاله:
Abstract

1- Introduction

2- Related work

3- TFL strategy

4- HIL Strategy

5- Multiple GPUs and cost model

6- Performance evaluation

7- Summary

References

بخشی از متن مقاله:

Abstract

Frequent itemset mining is widely used as a fundamental data mining technique. However, as the data size increases, the relatively slow performances of the existing methods hinder its applicability. Although many sequential frequent itemset mining methods have been proposed, there is a clear limit to the performance that can be achieved using a single thread. To overcome this limitation, various parallel methods using multi-core CPU, multiple machine, or many-core graphic processing unit (GPU) approaches have been proposed. However, these methods still have drawbacks, including relatively slow performance, data size limitations, and poor scalability due to workload skewness. In this paper, we propose a fast GPU-based frequent itemset mining method called GMiner for large-scale data. GMiner achieves very fast performance by fully exploiting the computational power of GPUs and is suitable for large-scale data. The method performs mining tasks in a counterintuitive way: it mines the patterns from the first level of the enumeration tree rather than storing and utilizing the patterns at the intermediate levels of the tree. This approach is quite effective in terms of both performance and memory use in the GPU architecture. In addition, GMiner solves the workload skewness problem from which the existing parallel methods suffer; as a result, its performance increases almost linearly as the number of GPUs increases. Through extensive experiments, we demonstrate that GMiner significantly outperforms other representative sequential and parallel methods in most cases, by orders of magnitude on the tested benchmarks.

Introduction

As a fundamental data mining technique, frequent itemset mining is widely used in a wide range of disciplines such as market basket analysis, web usage mining, social network analysis, intrusion detection, bioinformatics, and recommendation systems. However, the deluge of data generated by automated systems for diagnostic or analysis purposes makes it difficult or even impossible to apply mining techniques in many realworld applications. The existing methods often fail to find frequent itemsets in such big data within a reasonable amount of time. Thus, in terms of computational time, itemset mining is still a challenging problem that has not yet been completely solved. Many sequential frequent itemset mining methods such as Apriori [2], Eclat [35], FP-Growth [14], and LCM [30] use a single CPU thread. However, these singlethreaded applications all have a fundamental mining performance limit because CPU clock speed is generally no longer increasing. To overcome the single-thread performance limit, multiple parallel frequent itemset mining methods have been proposed. These methods can be categorized into three main groups: (1) (CPU-based) multithreaded methods, (2) distributed methods, and (3) graphic processing unit (GPU)- based methods. We omit the term ”multi-thread” from the GPU-based methods because they are obviously multi-threaded. The first group focuses on accelerating the performance of the single-threaded methods by exploiting multi-core CPUs [20, 24, 26, 27, 29], while the second group tries to accelerate the performance by exploiting multiple machines [12, 17, 19]. Details about these methods are available in recent survey studies.

دیدگاهتان را بنویسید

نشانی ایمیل شما منتشر نخواهد شد. بخش‌های موردنیاز علامت‌گذاری شده‌اند *

دکمه بازگشت به بالا