Article details | |
Title | Mining and visualising contradictory data |
Published | 2017 |
Number of pages (English article) | 11 pages |
Database | Springer |
Base paper | This is a base paper |
Index | DOAJ – Scopus |
Article type | ISI |
Related fields | Industrial engineering |
Related specialisations | Data mining |
Presentation type | Journal |
Journal | Journal of Big Data |
University | Computer Science Department – University of Nigeria – Abuja Building – Nigeria |
Keywords | ConTra, Comma separated values, Dataset, Contradictions, Contradictory data, Mutual exclusion values |
Digital identifier (DOI) | https://doi.org/10.1186/s40537-017-0100-9 |
Table of contents: |
Abstract
Introduction
Mining and visual analysis of contradictory data using ConTra
Dataset analysis and results
Performance evaluation of ConTra
Conclusion and the way forward
References |
Excerpt from the article: |
Abstract

Big datasets are often stored in flat files and can contain contradictory data. Contradictory data undermines the soundness of the information from a noisy dataset. Traditional tools such as pie chart and bar chart are overwhelmed when used to visually identify contradictory data in multidimensional attribute-values of a big dataset. This work explains the importance of identifying contradictions in a noisy dataset. It also examines how contradictory data in a large and noisy dataset can be mined and visually analysed. The authors developed ‘ConTra’, an open source application which applies mutual exclusion rule in identifying contradictory data, existing in comma separated values (CSV) dataset. ConTra’s capability to enable the identification of contradictory data in different sizes of datasets is examined. The results show that ConTra can process large dataset when hosted in servers with fast processors. It is also shown in this work that ConTra is 100% accurate in identifying contradictory data of objects whose attribute values do not conform to the mutual exclusion rule of a dataset in CSV format. Different approaches through which ConTra can mine and identify contradictory data are also presented.

Introduction

A noisy dataset can contain contradictory data. Contradictory data is synonymous to incorrect data and it is important that such data be investigated and evaluated when analysing a noisy dataset. Different approaches to dealing with contradictory data have been proposed by different researchers. For example [1, 2] proposed methods for identifying and removing contradictory data in noisy datasets. However, the removal of contradictory data from a noisy dataset will increase the incompleteness in the dataset thereby reducing the soundness of any information from such set of data. It is therefore important to identify and evaluate contradictory instances when analysing a large and noisy dataset. This will improve the soundness of the analysis from such a dataset. Evidently, the analysis of big data is identified as the next frontier for innovation and advancement of technology [3, 4]. There is therefore the need to identify appropriate approaches to dealing with contradictions in a large and noisy dataset. There are different forms of contradictions. For example, there are contradictions from the use of modal words, structural, subtle lexical contrasts, as well as world knowledge |
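
The abstract describes applying a mutual exclusion rule to records in a CSV dataset. The excerpt does not show how ConTra encodes its rules, so the following is only a minimal sketch under stated assumptions: a rule is taken to be a set of column/value pairs that must not all hold for the same record, and the column names, rule list, and sample data are hypothetical, not taken from the paper.

```python
import csv
import io

# Hypothetical rule format (an assumption, not ConTra's own):
# each rule maps column -> value, and a record is contradictory
# when it matches every pair in a rule at once.
RULES = [
    {"sex": "male", "pregnant": "yes"},
    {"status": "deceased", "employed": "yes"},
]

# Hypothetical sample data standing in for a flat CSV file.
SAMPLE_CSV = """id,sex,pregnant,status,employed
1,female,yes,alive,yes
2,male,yes,alive,no
3,male,no,deceased,yes
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def find_contradictions(rows, rules):
    """Return (row_number, rule) for each record that satisfies all pairs of a rule."""
    hits = []
    for row_number, row in enumerate(rows, start=1):
        for rule in rules:
            # Missing or empty cells are treated as non-matching.
            if all((row.get(col) or "").strip().lower() == val
                   for col, val in rule.items()):
                hits.append((row_number, rule))
    return hits


contradictions = find_contradictions(load_rows(SAMPLE_CSV), RULES)
for row_number, rule in contradictions:
    print(f"record {row_number} violates mutual exclusion rule {rule}")
```

Note that the sketch only flags contradictory records and leaves them in place, in line with the excerpt's argument that removing them increases the incompleteness of the dataset.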