Free English article on a study of the technology, algorithms and applications of Web mining (IEEE 2017)
Article details | |
Published | 2017 article |
Number of pages (English article) | 4 pages |
Cost | The English article is free to download. |
Published in | IEEE |
Article type | ISI |
Base article | This is not a base article |
English title of the article | Research on Technology, Algorithm and Application of Web Mining |
Translated title of the article (from the Persian) | A Study of the Technology, Algorithm and Application of Web Mining |
English article format | |
Related fields | Computer engineering |
Related specializations | Software engineering, algorithms and computation |
Presentation type | Conference |
Journal | International Conference on Computational Science and Engineering |
University | Inner Mongolia University of Finance and Economics – China |
Keywords (translated from the Persian) | Web mining technology; Web structure mining; Web content mining; Multimedia mining |
English keywords | Web mining technology; Web structure mining; Web content mining; Multi-media mining |
Digital identifier (DOI) | https://doi.org/10.1109/CSE-EUC.2017.152 |
Product code | E7076 |
Excerpt from the article: |
I. INTRODUCTION
What is Web mining? It is the process of discovering and extracting useful patterns and knowledge of interest to people from massive collections of Web documents and activities by means of data mining technology [1]. Compared with the well-known field of data mining, Web mining extends to deeper and broader areas, and the differences between the two are obvious: the object of data mining is data stored in databases, that is, structured data, whereas Web mining targets the content or structure of Web documents, which are widely distributed, dynamic and heterogeneous, and contain unstructured or semi-structured data. Based on the diversity of information on the Web, Web mining is divided into the following categories, as shown in figure 1: Web structure mining, Web content mining and Web usage mining [2]. These three mining methods differ in the main data they deal with, their processing methods and their application areas:
1) Web Structure Mining mainly deals with Web structure data; it can be divided into page structure mining and URL mining (hyperlink mining) [3].
2) Web Content Mining mainly deals with unstructured and semi-structured data; based on the content, it can be refined into Web text mining and Web multimedia mining, of which multimedia mining is currently a popular research topic [4].
3) Web Usage Mining analyzes Web site logs to find valuable knowledge; it can be divided into general access pattern analysis and Web site customization.
This paper analyzes the realization of Web content mining and Web structure mining, their basic algorithmic principles and their application areas.
II. WEB STRUCTURAL MINING
A. Introduction of Structural Mining
Massive numbers of Web sites constitute the entire Internet, and each page in these Web sites includes, to a greater or lesser extent, hyperlinks that contain a lot of potential information. The purpose of Web structure mining is to dig out this hidden knowledge so that it can be fully applied.
For example, if a paper is cited many times, this is taken as evidence that the paper is authoritative in its field of study. Similarly, if many Web pages point to the same page X, we consider page X to have higher authority. In the field of search engines, it is very important to place the most authoritative pages at the top of the search results, because users of a search engine want authoritative and publicly recognized results rather than incorrect result pages or even insignificant ad pages. Structural mining relies on this kind of judgment to obtain a lot of information and to help people with navigation and recommendation of authoritative Web pages. This article briefly introduces two well-known structure mining algorithms, PageRank and HITS.
B. PageRank Algorithm
The PageRank algorithm draws on traditional citation analysis: when page A has a link to page B, we consider that B receives a score contributed by A. This score depends on the importance of A; that is, the more important page A is, the higher the contribution score page B gets. Based on this premise, Google uses this algorithm to give each page a value (its PageRank) as a reference for page quality. PageRank values range from 0 to 10; the higher the value, the higher the quality and popularity of the page and the higher it appears in the corresponding search results. |
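The contribution idea described above (a page passes part of its own importance to the pages it links to) can be illustrated with a minimal sketch. This is not the paper's or Google's implementation: the function name `pagerank`, the toy graph `toy_links`, the damping factor of 0.85, the fixed iteration count and the even redistribution of rank from pages without out-links are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively estimate PageRank scores.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with the same score for every page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to,
                # so a more important page contributes more to each target.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no out-links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page Web: A, B and D all link to C; C links back to A.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy graph, page C is linked to by three pages and ends up with the highest score, while page D, which no page links to, ends up with the lowest. The scores returned here are fractions that sum to 1; the excerpt does not say how such scores are mapped onto the 0 to 10 scale it mentions, so that step is left out.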