Free English article on an improved support vector machine algorithm for big data classification, IEEE 2018

 

Article details
Title (translated from the Persian listing): Research on an improved support vector machine (SVM) algorithm for big data classification
English title: Research on SVM improved algorithm for large data classification
Publication year: 2018
Length of the English article: 5 pages
Download cost of the English article: free
Publisher database: IEEE
Base article: this is not a base article
Format of the English article: PDF
Related fields: computer engineering, information technology
Related specializations: algorithms and computation, artificial intelligence, information systems management
Presentation type: conference
Journal / conference: IEEE 3rd International Conference on Big Data Analysis
University: Liaoning University of Science and Technology, Anshan, Liaoning
Keywords (translated from the Persian listing): support vector machine (SVM); big data; multi-classification; Euclidean distance; radial integral kernel function
English keywords: support vector machine (SVM); large data; multiclassification; Euclidean distance; radial integral kernel function
Digital identifier (DOI):
https://doi.org/10.1109/ICBDA.2018.8367673
Product code: E10335

 

Table of contents:
Abstract
I INTRODUCTION
II THE WEIGHTED EUCLIDEAN DISTANCE AND THE RADIAL PRODUCT KERNEL FUNCTION SVM
III CONCLUSION
References

 

Excerpt from the article:
Abstract

The SVM algorithm has two problems when processing large data: it cannot handle multi-class classification directly, and building the model takes a long time. To address both, the paper proposes an SVM with a weighted Euclidean distance and a radial integral kernel function, combined with a dimensionality reduction algorithm for large data packet classification. The improved algorithm reconstructs the data feature space, makes the boundary between different data samples clearer, shortens the modeling time, and improves classification accuracy. Experiments verify the feasibility and effectiveness of the proposed method. The experimental results show that the improved algorithm achieves better results in multi-class classification when the data contain many duplicated samples and the data volume is large.
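
The excerpt does not define the paper's weighting scheme or its radial integral kernel, so the following is only a minimal sketch of the general idea: a Euclidean distance in which each feature carries a weight, plugged into a standard Gaussian radial basis kernel as a stand-in. The function names, the weights `w`, and the parameter `gamma` are assumptions made for illustration and are not taken from the paper.

```python
import math

def weighted_euclidean(x, y, w):
    """Weighted Euclidean distance: sqrt(sum_i w_i * (x_i - y_i)^2).

    The weights w are hypothetical; the excerpt does not say how
    the paper chooses them.
    """
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    return math.sqrt(sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w)))

def weighted_rbf_kernel(x, y, w, gamma=1.0):
    """Gaussian RBF kernel evaluated on the weighted distance.

    This is a stand-in for the paper's radial kernel, whose exact
    form is not given in the excerpt.
    """
    d = weighted_euclidean(x, y, w)
    return math.exp(-gamma * d * d)
```

With all weights equal to 1 this reduces to the ordinary Euclidean distance and the ordinary Gaussian kernel. Setting a weight to 0 removes that feature's influence on the similarity, which is one simple way a weighting could sharpen the boundary between classes.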

INTRODUCTION

The rapid development of network technology generates a huge amount of data every day. Rapid and accurate classification of the vast amounts of data collected is necessary to extract comprehensible knowledge. According to a forecast by market research firm IDC, global data will exceed 40 ZB by 2020 [1]. Many industries have deployed storage systems with capacities ranging from tens of gigabytes to hundreds of terabytes, or even petabytes. But nearly 60% of the data is repeated, which not only increases data storage and processing time but also drives up the cost of data analysis and classification. Efficient and accurate classification algorithms are therefore one of the hot issues in current industry research. Common classification algorithms include K-Nearest Neighbor, Naive Bayes, Neural Net, Support Vector Machine, and Linear Least Squares Fit [2]. The support vector machine (SVM) algorithm is a machine learning method based on the VC dimension theory of statistical learning theory and on the structural risk minimization principle. It has excellent data classification and regression ability [3]. The support vector method was first proposed by Vapnik to solve pattern recognition problems. It selects a characteristic subset of the training samples such that classifying this subset is equivalent to dividing the whole dataset. The members of this characteristic subset are called support vectors (SVs). Owing to its excellent learning ability, SVM has a very wide range of applications, including intrusion detection, facial expression classification, time series prediction, speech recognition, signal processing, gene detection, text classification, font recognition, fault diagnosis, chemical analysis, and image recognition. The SVM algorithm has some obvious advantages in solving classification problems. It has a short prediction time.
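
To make the role of the support vectors concrete, here is a minimal sketch of the standard kernel SVM decision rule, f(x) = sum_i (alpha_i * y_i) * K(sv_i, x) + b, in which prediction touches only the support vectors and not the full training set. The kernel, the two support vectors, the coefficients, and the bias are hypothetical values chosen for illustration. They do not come from the paper and are not the result of training.

```python
import math

def rbf(x, z, gamma=0.5):
    """Gaussian RBF kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def decision_value(x, support_vectors, dual_coefs, bias, kernel=rbf):
    """Sum of (alpha_i * y_i) * K(sv_i, x) over the support vectors, plus bias."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs)) + bias

def predict(x, support_vectors, dual_coefs, bias, kernel=rbf):
    """Binary prediction in {-1, +1} from the sign of the decision value."""
    return 1 if decision_value(x, support_vectors, dual_coefs, bias, kernel) >= 0 else -1

# Hypothetical model: one support vector per class, coefficients alpha_i * y_i.
svs = [[0.0, 0.0], [2.0, 2.0]]
coefs = [-1.0, 1.0]
b = 0.0
print(predict([0.1, 0.1], svs, coefs, b))  # nearer the class -1 support vector
print(predict([1.9, 2.0], svs, coefs, b))  # nearer the class +1 support vector
```

The cost of one prediction is one kernel evaluation per support vector, which is why a model with fewer support vectors predicts faster.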
The global optimal solution guarantees the accuracy of the target detection classifier. But there are also disadvantages, such as the long time needed to establish the detection model. When processing large-scale data, time complexity and space complexity increase linearly with the amount of data. In the emerging fields of data mining, document classification, and multimedia indexing, the data objects are often large data sets. The number of attributes and the number of records are very large, resulting in poor performance of the processing algorithm [4]. A support vector machine classifier is determined only by its support vectors. The complexity of the classifier is not related to the number of training samples; it depends only on the number of support vectors [5]. This paper proposes a dimension-reducing packet support vector machine method that uses a weighted Euclidean distance and the radial product kernel function. The method reduces the data dimension and removes redundant feature attributes and duplicate data. A classification model with better generalization ability is obtained using fewer support vectors. This reduces the storage and processing resources required, speeds up the establishment of the classification model, and addresses the big data classification problem.
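
The excerpt does not describe how the paper removes duplicate data and redundant attributes, so the following is only a minimal sketch of the simplest version of that preprocessing: dropping exact duplicate records and dropping feature columns that never vary. The function names and the "constant column" criterion for redundancy are assumptions made for illustration. The paper's actual dimensionality reduction is likely more involved.

```python
def deduplicate(rows):
    """Drop exact duplicate (features, label) records, keeping the first occurrence."""
    seen = set()
    unique = []
    for features, label in rows:
        key = (tuple(features), label)
        if key not in seen:
            seen.add(key)
            unique.append((list(features), label))
    return unique

def drop_constant_features(rows):
    """Remove feature columns that take a single value across all records.

    Returns the reduced records and the indices of the columns kept.
    A constant column is only the most trivial kind of redundant attribute.
    """
    if not rows:
        return [], []
    n_features = len(rows[0][0])
    keep = [j for j in range(n_features) if len({f[j] for f, _ in rows}) > 1]
    reduced = [([f[j] for j in keep], label) for f, label in rows]
    return reduced, keep

# Hypothetical toy data: one repeated record and one constant column (index 1).
data = [([1.0, 0.0, 2.0], "a"), ([1.0, 0.0, 2.0], "a"), ([3.0, 0.0, 2.5], "b")]
unique = deduplicate(data)
reduced, kept = drop_constant_features(unique)
```

Running both steps before training shrinks the number of records and the number of attributes the SVM has to process, which is the effect the paragraph above attributes to the proposed method.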
