Article details |
Article title | A novel ensemble method for k-nearest neighbor
Publication | 2019 article
English article length | 28 pages
Cost | The English article is free to download.
Database | Elsevier
Article type | Research article
Base article | This article is not a base article.
Index | Scopus – Master Journals – JCR
Article category | ISI
English article format |
Impact factor (IF) | 3.962 (2017)
H-index | 168 (2019)
SJR | 1.065 (2019)
Related disciplines | Computer engineering, information technology
Related specializations | Algorithms and computation, software engineering
Presentation type | Journal
Journal / conference | Pattern Recognition
University | Nanjing University of Science and Technology – PR China
Keywords | Distance metric; k-nearest neighbor; ensemble learning; random subspace; evidence theory
DOI | https://doi.org/10.1016/j.patcog.2018.08.003
Product code | E9446
Table of contents:
Abstract
1 Introduction
2 Preliminaries
3 Weighted heterogeneous distance metric for kNN algorithm
4 RRSB
5 Experimental results
6 Conclusions
References
Excerpt from the article:
Abstract
In this paper, to address the issue that ensembling k-nearest neighbor (kNN) classifiers with resampling approaches cannot generate component classifiers with large diversity, we consider ensembling kNN through a multimodal perturbation-based method. Since kNN is sensitive to the input attributes, we propose a weighted heterogeneous distance metric (WHDM). Using the WHDM and evidence theory, a progressive kNN classifier is developed. Based on the progressive kNN, the random subspace method, attribute reduction, and Bagging, a novel algorithm termed RRSB (reduced random subspace-based Bagging) is proposed for constructing an ensemble classifier, which can increase the diversity of the component classifiers without damaging their accuracy. In detail, RRSB adopts perturbation of the learning parameter with the weighted heterogeneous distance metric, perturbation of the input space with random subspace and attribute reduction, perturbation of the training data with Bagging, and perturbation of the output target of the k neighbors with evidence theory. In the experimental stage, the value of k, the different perturbations in RRSB, and the ensemble size are analyzed. In addition, RRSB is compared with other multimodal perturbation-based ensemble algorithms on multiple UCI data sets and a KDD data set. The experimental results demonstrate the effectiveness of RRSB for kNN ensembling.

Introduction

Ensemble learning has been a prominent topic in machine learning in recent years, and Dietterich [1] lists it first among four research directions in machine learning. To enhance the generalization performance of an ensemble, many approaches have been proposed for training accurate but diverse component classifiers. According to the mode of training the classifiers, typical ensemble approaches can be divided into three cases [2]:

- each component classifier is trained on a different attribute subspace;
- each component classifier is trained on different resampled training data;
- each component classifier is trained on the data set with different parameter settings.

An ensemble scheme may employ any of these three techniques. For example, in the random subspace method (RSM) [3, 4], each component classifier is trained on a randomly selected attribute space. Ho [3] first proposed RSM and applied it to a decision tree ensemble, and later investigated RSM in a kNN ensemble [4]. Gu et al. [5] proposed a random subspace-based sparse representation ensemble algorithm, where sparse representations in multiple subspaces are integrated into an ensemble sparse representation. Rotation forest (RoF) [6, 7] is an improved version of RSM. A genetic algorithm (GA) has also been used to select the best-fitting attribute space for each component classifier [8]. Bagging obtains different component classifiers by training on bootstrap-sampled data [9]. Boosting is another resampling-based ensemble method, which updates the weight probability distribution used for resampling at each trial [10]. Parameter perturbation is often applied in neural network ensembles; for instance, random initial weights are used to train each network [11]. Gabrys and Ruta [12] used a GA to select classifier prototypes, the attribute space, and combination rules simultaneously.
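As a concrete illustration of the random subspace idea described above, the following is a minimal sketch of an RSM-style kNN ensemble (assuming scikit-learn; the data set, subspace size, and majority-vote combination are illustrative choices, not the paper's RRSB algorithm):

```python
# Minimal sketch of the random subspace method (RSM) for kNN.
# Assumptions: scikit-learn is available; iris data, subspace size,
# and majority voting are illustrative, not from the paper.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = load_iris(return_X_y=True)

n_estimators, subspace_size = 10, 2  # illustrative values
subspaces, members = [], []
for _ in range(n_estimators):
    # each component classifier sees a randomly selected attribute subspace
    attrs = rng.choice(X.shape[1], size=subspace_size, replace=False)
    members.append(KNeighborsClassifier(n_neighbors=5).fit(X[:, attrs], y))
    subspaces.append(attrs)

def predict(X_new):
    # combine the component predictions by majority vote
    votes = np.stack([m.predict(X_new[:, a]) for m, a in zip(members, subspaces)])
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

print(predict(X[:5]))  # predictions for the first five samples
```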
Unlike neural network and decision tree classifiers, which have many parameters, the kNN classifier has only two: the distance measure used to compute the distance from a given test sample to the training samples, and the number of neighbors k. This makes constructing kNN ensembles challenging. Although Bagging has achieved great success on decision trees [13] and neural networks [14], it hardly works on the kNN classifier because kNN is a stable classifier. As Breiman [9] pointed out, Bagging uses bootstrap resampling to generate accurate but diverse component classifiers, which is effective on unstable methods such as decision trees and neural networks, but has little effect on kNN.
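Breiman's observation can be illustrated with a small, hedged experiment: bag a stable learner (kNN) and an unstable one (a decision tree) and compare single versus bagged cross-validated accuracy. The data set and parameter choices below are illustrative assumptions, not the paper's experimental setup:

```python
# Sketch contrasting Bagging on a stable (kNN) vs. an unstable
# (decision tree) base learner. Assumptions: scikit-learn is available;
# the breast cancer data and parameters are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
for base in (KNeighborsClassifier(n_neighbors=5),
             DecisionTreeClassifier(random_state=0)):
    single = cross_val_score(base, X, y, cv=5).mean()
    bagged = cross_val_score(
        BaggingClassifier(base, n_estimators=50, random_state=0), X, y, cv=5
    ).mean()
    # Bagging typically lifts the tree noticeably, while the change for
    # kNN is small, since bootstrap resampling barely perturbs a stable learner.
    print(f"{type(base).__name__}: single={single:.3f}, bagged={bagged:.3f}")
```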