Free English paper on support vector machine clustering


Paper details
Paper title: Clustering categories in support vector machines
Title (Persian translation): دسته های خوشه بندی در ماشین های بردار پشتیبانی
Paper format: PDF
Paper type: ISI
Writing type: research article
Publication year: 2017
Page count: 10 pages
Related field: Computer Engineering
Related specialization: Artificial Intelligence
Journal: Omega
Affiliation: Department of Statistics and Operations Research, a university in Spain
Keywords: support vector machines, categorical features, sparse classification, clustering, constrained programming, 0-1 programming
Product code: E4431
Publisher: Elsevier
Paper link at the source site: this paper on ScienceDirect (Elsevier)
Translation status: a prepared Persian translation of this paper is not available; a translation can be ordered.

 

Excerpt from the paper:
1. Introduction

In supervised classification [2,18,37], we are given a set of objects $\Omega$ partitioned, in its simplest setting, into two classes, and the aim is to classify new objects. Given an object $i \in \Omega$, it is represented by a vector $(x_i, x'_i, y_i)$. The feature vector $x_i$ is associated with $J$ categorical features, that can be binarized by splitting each feature into a series of 0-1 dummy features, one for each category, and takes values on a set $X \subseteq \{0,1\}^{\sum_{j=1}^{J} K_j}$, where $K_j$ is the number of categories of feature $j$. Thus, $x_i = (x_{i,j,k})$, where $x_{i,j,k}$ is equal to 1 if the value of categorical feature $j$ in object $i$ is equal to category $k$ and 0 otherwise. The feature vector $x'_i$ is associated with $J'$ continuous features and takes values on a set $X' \subseteq \mathbb{R}^{J'}$. Finally, $y_i \in \{-1, +1\}$ is the class membership of object $i$. Information about objects is only available in the so-called training sample, with $n$ objects.
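The binarization described above is the familiar one-hot (dummy) encoding. A minimal sketch, using made-up toy data with two categorical features, illustrates how each feature $j$ with $K_j$ categories expands into $K_j$ 0-1 dummies:

```python
import numpy as np

# Toy data (illustrative only): two categorical features per object.
raw = [["red", "small"],
       ["blue", "large"],
       ["red", "large"]]

# Enumerate the K_j categories of each feature j.
categories = [sorted({row[j] for row in raw}) for j in range(len(raw[0]))]

def binarize(row):
    """Map one object to its 0-1 vector x_i = (x_{i,j,k}):
    x_{i,j,k} = 1 iff feature j of the object takes category k."""
    x = []
    for j, cats in enumerate(categories):
        x.extend(1 if row[j] == k else 0 for k in cats)
    return np.array(x)

X = np.array([binarize(row) for row in raw])
print(X)  # each row sums to J: exactly one dummy per feature is 1
```

Here the total vector length is $\sum_j K_j = 2 + 2 = 4$, matching the set $X \subseteq \{0,1\}^{\sum_{j=1}^{J} K_j}$ in the text.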

In many applications of supervised classification, datasets are composed of a large number of features and/or objects [26], making it hard both to build the classifier and to interpret the results. In this case, it is desirable to obtain a sparser classifier, which may make classification easier to handle and interpret, less prone to overfitting, and computationally cheaper when classifying new objects. The most popular strategy proposed in the literature to achieve this goal is feature selection [14,15,17,35], which aims at selecting the subset of features most relevant for classification while maintaining or improving accuracy and preventing the risk of overfitting. Feature selection reduces the number of features by means of an all-or-nothing procedure. For categorical features, binarized as explained above, it simply ignores some categories of some features, and does not give valuable insight into the relationship between feature categories. These issues may imply a significant loss of information.

A state-of-the-art method in supervised classification is the support vector machine (SVM). The SVM aims at separating both classes by means of a classifier, $\omega^\top x + (\omega')^\top x' + b = 0$, $(\omega, \omega')$ being the so-called score vector, where $\omega$ is associated with the categorical features and $\omega'$ is associated with the continuous features. Given an object $i$, it is classified in the positive or the negative class according to the sign of the score function, $\operatorname{sign}(\omega^\top x_i + (\omega')^\top x'_i + b)$, while in the case $\omega^\top x_i + (\omega')^\top x'_i + b = 0$, the object is classified randomly. See [5,11,17,24,29] for successful applications of the SVM and [10] for a recent review on Mathematical Optimization and the SVM.
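The decision rule above can be sketched directly. In this sketch the score vector $(\omega, \omega')$ and intercept $b$ are illustrative values chosen by hand, not weights fitted by an SVM solver:

```python
import numpy as np

# Illustrative (not fitted) parameters: w for the binarized categorical
# part x, w_cont for the continuous part x', and intercept b.
w = np.array([0.5, -0.5])
w_cont = np.array([1.0])
b = -0.25

def classify(x, x_cont, rng=np.random.default_rng(0)):
    """Predict +1 or -1 as sign(w^T x + w'^T x' + b); a score of
    exactly zero is broken randomly, as described in the text."""
    score = w @ x + w_cont @ x_cont + b
    if score == 0:
        return int(rng.choice([-1, 1]))
    return 1 if score > 0 else -1

print(classify(np.array([1, 0]), np.array([0.5])))  # score 0.75 -> +1
print(classify(np.array([0, 1]), np.array([0.1])))  # score -0.65 -> -1
```

In a fitted SVM, $(\omega, \omega', b)$ would instead come from solving the margin-maximization problem on the training sample.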
