Article details | |
Article title | A soft computing approach to big data summarization |
Publication year | 2018 |
Length of the English article | 17 pages |
Database | Elsevier |
Article type | Research article |
Base paper | This is not a base paper |
Indexing | Scopus – Master Journals – JCR |
Article category | ISI |
Impact factor (IF) | 2.675 in 2017 |
H-index | 144 in 2018 |
SJR | 1.138 in 2018 |
Related fields | Mathematics |
Related specializations | Soft computing |
Publication type | Journal |
Journal / Conference | Fuzzy Sets and Systems |
University | IRISA – University of Rennes – UMR – Lannion – France |
Keywords | Data personalisation; Linguistic summaries; Soft computing; Knowledge extraction; Visualization; Specificity measure |
DOI | https://doi.org/10.1016/j.fss.2018.02.017 |
Table of contents: |
Abstract; 1 Introduction; 2 Related work; 3 Preliminaries; 4 From the description space to the summarization space; 5 Representativity-driven summary visualization; 6 Experimentations; 7 Conclusion; References |
Excerpt from the article: |
Abstract
The added value of a dataset lies in the knowledge a domain expert can extract from it. Considering the continuously increasing volume and velocity of these datasets, efficient tools have to be defined to generate meaningful, condensed and human-interpretable representations of big datasets. In the proposed approach, soft computing techniques are used to define an interface between the numerical and categorical space of data definition and the linguistic space of human reasoning. Based on the expert’s own vocabulary about the data, a personal summary composed of linguistic terms is efficiently generated and graphically displayed as a term cloud offering a synthetic view of the data properties. Using dedicated indexing strategies linking data and their subjective linguistic rewritings, exploration functionalities are provided on top of the summary to let the user browse the data. Experiments confirm that the space change operates in linear time with respect to the size of the dataset, making the approach tractable on large-scale data.
© 2018 Elsevier B.V. All rights reserved.
Introduction
Data analysis is a crucial task at the center of many professional activities and now constitutes a support for decision making, communicating and reporting. Considering the continuously increasing volume and velocity of datasets, domain experts (such as insurers, data journalists, communication managers and decision makers), who are most often not data or computer scientists, need efficient tools that help them turn data into useful knowledge. This explains the recent growing interest in so-called Agile Business Intelligence (ABI) systems, which reconsider classical data integration processes to favor pragmatic approaches that make domain experts self-reliant in the analysis of raw data. A dataset generally consists of a large collection of items described by numerical and categorical attributes.
A way to assist experts in their tedious task of data-to-knowledge translation is to define efficient strategies that generate meaningful, condensed and human-interpretable representations of the data. To be truly useful, such representations should give an insight into the data properties and make it easy for the domain expert to identify the most representative properties of the dataset. In this sense, when a dataset is so large that it cannot be easily perused and analyzed by the user, data summarization is of particular interest for obtaining a big picture of the data distribution along the different dimensions. Such a summary should also offer exploration functionalities to let the expert interactively browse the dataset from its summary and discover interesting properties possessed by different data subsets. |
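To make the idea of rewriting numerical data into an expert's linguistic vocabulary more concrete, here is a minimal Python sketch. It is not the authors' implementation: the trapezoidal membership functions, the `VOCABULARY` terms and their breakpoints, and the `summarize` helper are all assumptions chosen for illustration. The sketch computes, in a single pass over the items, the average membership degree of each linguistic term, which is one simple way to obtain the weights a term cloud could display.

```python
from collections import defaultdict


def trapezoid(a, b, c, d):
    """Build a trapezoidal membership function with support [a, d] and core [b, c]."""
    def mu(x):
        if b <= x <= c:
            return 1.0                      # inside the core: full membership
        if x <= a or x >= d:
            return 0.0                      # outside the support: no membership
        if x < b:
            return (x - a) / (b - a)        # rising edge
        return (d - x) / (d - c)            # falling edge
    return mu


# Hypothetical expert vocabulary: terms and breakpoints are illustrative only.
VOCABULARY = {
    "age": {
        "young": trapezoid(0, 0, 25, 35),
        "middle-aged": trapezoid(25, 35, 50, 60),
        "old": trapezoid(50, 60, 120, 120),
    },
    "price": {
        "cheap": trapezoid(0, 0, 100, 200),
        "expensive": trapezoid(100, 200, 10_000, 10_000),
    },
}


def summarize(items, vocabulary):
    """Return {(attribute, term): average membership degree} in one pass over items."""
    totals = defaultdict(float)
    count = 0
    for item in items:
        count += 1
        for attribute, terms in vocabulary.items():
            if attribute not in item:
                continue                    # skip attributes missing from this item
            for term, mu in terms.items():
                totals[(attribute, term)] += mu(item[attribute])
    if count == 0:
        return {}
    return {key: total / count for key, total in totals.items()}


if __name__ == "__main__":
    data = [
        {"age": 20, "price": 50},
        {"age": 30, "price": 150},
        {"age": 70, "price": 500},
    ]
    for (attribute, term), weight in sorted(summarize(data, VOCABULARY).items()):
        print(f"{attribute}: {term} -> {weight:.3f}")
```

Each item is visited once and the vocabulary has a fixed size, so the cost of this sketch grows linearly with the number of items. Categorical attributes could be handled in the same loop by replacing the membership function with a lookup table from values to degrees, which is again an assumption rather than something the excerpt specifies.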