Free English article on measuring discrimination information for human knowledge representation – Elsevier 2019

 

Article details
Article title: Power law based foundation for the measurement of discrimination information for human knowledge representation
Publication year: 2019
Number of pages (English article): 11
Download cost: The English article is free.
Publisher database: Elsevier
Article type: Research Article
Base article: This is not a base article.
Index: Scopus – Master Journals List – JCR
Article category: ISI
Format of the English article: PDF
Impact factor (IF): 7.007 in 2018
H-index: 93 in 2019
SJR: 0.835 in 2018
ISSN: 0167-739X
Quartile: Q1 in 2018
Related fields: Information Technology Engineering, Computer Engineering
Related specializations: Internet and Wide Area Networks
Presentation type: Journal
Journal / Conference: Future Generation Computer Systems
Affiliation: The Third Research Institute of the Ministry of Public Security, 201142, Shanghai, China
Keywords: Algorithms, Design, Theory, Discrimination information, Power law, Information theory
DOI: http://dx.doi.org/10.1016/j.future.2016.10.021
Product code: E12085

 

Table of contents:
Abstract
1. Introduction
2. Related work
3. The power law function of keywords
4. DI and power law function: theoretic analysis
5. Identifying the general keywords
6. Identifying the minimum rank keyword
7. Computing DI
8. Document clustering based on DI
9. Conclusion
References

 

Excerpt from the article:
Abstract

The discrimination information (DI) of keywords plays an important role in information retrieval and data mining. However, measuring DI remains a challenge because existing methods cannot reconcile the trade-off between accuracy and complexity. In this paper, a new model is proposed that does not need any prior knowledge and has a computing complexity of O(nm) for a collection of m documents with n keywords. Firstly, we define three types of keywords according to the document frequency spectrum, which divides the spectrum of keywords into two monotonic spectra that allow a qualitative analysis of DI. Secondly, in order to decrease the complexity, a power law function of the keywords' document frequencies is built. Thirdly, we propose an algorithm that classifies keywords by using the distances between adjacent points on the linear regression line. Finally, a piecewise function is used to compute DI according to the monotonic spectra, which transforms DI into a scalable value that can be used directly, thereby reducing the computing complexity of DI significantly. Moreover, a new DI-based weighting scheme for keywords is employed for document clustering, which shows that DI has good prospects in the area of information retrieval.
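The excerpt does not give the paper's exact formulation, so the following is only a minimal sketch of one common way to build a power law function of document frequencies: rank the keywords by descending document frequency and fit a straight line to the log-log points by least squares. The function names `document_frequencies` and `fit_power_law`, and the model form df = c * rank^(-alpha), are assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def document_frequencies(docs):
    """Count, for each keyword, how many documents contain it.

    `docs` is a list of keyword lists (hypothetical input format).
    One pass over all keyword occurrences is enough.
    """
    df = Counter()
    for keywords in docs:
        df.update(set(keywords))  # count each keyword once per document
    return df


def fit_power_law(frequencies):
    """Fit df = c * rank**(-alpha) by linear regression in log-log space.

    Returns (alpha, c). Assumes at least two positive frequencies.
    """
    ranked = sorted(frequencies, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(freq) for freq in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)


if __name__ == "__main__":
    docs = [["computer", "cpu"], ["computer", "gpu"], ["computer", "cpu"]]
    df = document_frequencies(docs)
    alpha, c = fit_power_law(df.values())
    print(dict(df), alpha, c)
```

Once the regression line is available, the spacing between adjacent ranked points along it can be inspected, which is the kind of quantity the abstract says the classification algorithm relies on; how the paper turns those distances into keyword classes is not stated in this excerpt.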

Introduction

It has been widely recognized that different keywords possess different amounts of discrimination information (DI) in a knowledge base system. For example, "Computer" possesses a lower DI than "CPU" in the computer field, and "Example Learning" possesses a higher DI than "Intelligence" in the area of artificial intelligence. In practice, DI has a wide range of applications, including semantic annotation of Web pages [1–3], discovery of semantic communities [4–6], document clustering/classification [7–9], e-learning technology [10,11,4], etc. In addition, DI is important for web search [12–15], where it can be used for query expansion to help users find more relevant information. Therefore, how to compute DI is a basic problem for information retrieval and data mining. In [16], Salton et al. regarded DI as a measure of the variation in the average similarity between documents in a collection. A good discriminator is a keyword whose assignment reduces the average similarity between documents. In contrast, a poor discriminator increases the inter-document similarity. Unfortunately, the computing complexity of this DI is proportional to O(nm^2) for a collection of m documents with n keywords, which makes it impractical to apply directly to large document collections. Cai [17] uses information theory to compute DI. In that work, the discrimination information of a keyword refers to the amount of information the keyword conveys in support of a certain category of documents and against the other categories. An informative keyword should have a high capability of categorizing documents.
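To make the similarity-based definition concrete, here is a brute-force sketch, assuming cosine similarity over term-weight dictionaries and defining the discrimination value of a keyword as the average pairwise similarity without the keyword minus the average with it. The function names and the input format are hypothetical; the sketch is meant only to show why the cost grows with the number of document pairs, not to reproduce the exact procedure in [16].

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def average_similarity(docs):
    """Mean similarity over all m*(m-1)/2 document pairs (needs m >= 2)."""
    pairs = list(combinations(docs, 2))
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


def discrimination_value(docs, keyword):
    """Average similarity without `keyword` minus average similarity with it.

    Positive: assigning the keyword spreads documents apart (good discriminator).
    Negative: assigning the keyword pulls documents together (poor discriminator).
    """
    without = [{t: w for t, w in d.items() if t != keyword} for d in docs]
    return average_similarity(without) - average_similarity(docs)


if __name__ == "__main__":
    docs = [
        {"computer": 1.0, "cpu": 1.0},
        {"computer": 1.0, "learning": 1.0},
        {"computer": 1.0, "network": 1.0},
    ]
    for kw in ("computer", "cpu"):
        print(kw, discrimination_value(docs, kw))
```

In this toy collection "computer" occurs in every document and gets a negative value, while "cpu" occurs in one document and gets a positive value, matching the "Computer" versus "CPU" example. Repeating the pairwise pass for each of the n keywords is what produces the O(nm^2) behaviour mentioned above.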
