Free English article on real-time spam tweet detection - Elsevier 2019

Article details
Article title	Unsupervised collective-based framework for dynamic retraining of supervised real-time spam tweets detection model
Publication	2019 article
Number of pages (English article)	24 pages
Cost	The English article is free to download.
Database	Elsevier
Article type	Research Article
Base paper	This article is not a base paper
Index	Scopus – Master Journals List – JCR
Article category	ISI
Format of the English article	PDF
Impact factor (IF)	5.891 (2018)
H-index	162 (2019)
SJR	1.190 (2018)
ISSN	0957-4174
Quartile	Q1 (2018)
Conceptual model	None
Questionnaire	None
Variables	None
References	Included
Related fields	Computer Engineering, Information Technology Engineering
Related specializations	Internet and Wide Area Networks
Presentation type	Journal
Journal / Conference	Expert Systems with Applications
University	IRIT, University of Toulouse, CNRS, INPT, UPS, UT1, UT2J, France
Keywords	Twitter, Real-time, Spam, Social spammers, Twitter stream
DOI	https://doi.org/10.1016/j.eswa.2019.05.052
Product code	E13559

Table of contents:

Abstract
1. Introduction
2. Related work
3. Problem definition and formalization
4. Dataset description and ground truth
5. Unsupervised collective-based and real-time spam filtering model
6. Experimental setup and results
7. Conclusion
CRediT authorship contribution statement
Appendix A. Advanced performance results
Conflict of interest
References

Excerpt from the article:

Abstract

Twitter is one of the most popular social platforms. It has changed how people communicate and disseminate information through its real-time messaging mechanism. Recently, it has been used by researchers and industries as a new source of data for various intelligent systems, such as tweet sentiment analysis and recommendation systems, which require high data quality. However, due to its flexibility and popularity, Twitter has become the main target for spamming activities such as phishing legitimate users or spreading malicious software, which introduces new security issues and wastes resources. Therefore, researchers have developed various machine-learning algorithms to reveal Twitter spam. However, as spammers have become smarter and more crafty, the characteristics of spam tweets vary over time, making these methods ineffective at detecting new spammers' tricks and strategies. In addition, some of the employed methods (e.g. blacklisting) or spammer features (e.g. graph-based features) are extremely time-consuming, which hinders the ability to detect spammer activities in real time. In this paper, we introduce a framework to deal with the volatility of spam content and new spamming patterns, called spam drift. The framework uses the strength of an unsupervised machine learning approach, which learns from unlabeled tweets, to retrain a real-time supervised tweet-level spam detection model in batch mode. A set of experiments on a large-scale data set shows the effectiveness of the proposed online unsupervised method in adaptively discovering and learning the patterns of new spam activities and achieving stable recall values of more than 95%. Although the average spam precision of our method is around 60%, the high spam recall values show the ability of our proposed method to reduce spam drift problems compared to traditional machine learning algorithms.
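To make the reported trade-off between recall (above 95%) and precision (around 60%) concrete, the following minimal Python sketch computes both metrics for the spam class from raw counts. The function name `spam_metrics` and the counts are hypothetical, chosen only so that the ratios roughly match the abstract's figures; they are not taken from the paper's data set.

```python
def spam_metrics(true_positives: int, false_positives: int, false_negatives: int) -> dict:
    """Return precision, recall and F1 for the spam class from raw counts."""
    flagged = true_positives + false_positives  # tweets the model marked as spam
    actual = true_positives + false_negatives   # tweets that really are spam
    precision = true_positives / flagged if flagged else 0.0
    recall = true_positives / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts (an assumption for illustration, not the paper's data):
# 1,000 spam tweets in a stream, 950 of them caught, and 633 legitimate
# tweets wrongly flagged as spam.
m = spam_metrics(true_positives=950, false_positives=633, false_negatives=50)
print({name: round(value, 3) for name, value in m.items()})
```

With these illustrative counts, recall is 0.95 and precision is about 0.60: nearly all spam is caught, but roughly two in five flagged tweets are legitimate.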

Introduction

The new characteristics of the Social Web, which involve users as information producers, have exposed various information quality (IQ) problems (Agarwal & Yiliyasi, 2010). For example, Twitter, one of the most popular microblogging sites, has a real-time messaging mechanism that makes it more popular and suitable for handling real-time public events and updates. In addition, due to its popularity, social-based researchers adopt it as a main source of information for their experiments in related research areas (Abascal-Mena, Lema, & Sèdes, 2015; Hoang & Mothe, 2016; Mezghani et al., 2015; Mezghani, Zayani, Amous, Péninou, & Sèdes, 2014; Zubiaga, Spina, Amigó, & Gonzalo, 2012; Zubiaga, Spina, Fresno, & Martínez-Unanue, 2011). However, the simplicity and flexibility of using these sites, and the absence of any effective restrictions on content posting, might be viewed as additional factors that give rise to IQ issues. Indeed, social spam content, which is published by a well-known kind of ill-intentioned users, so-called social spammers, is one of the most common forms of noise appearing on online social media (OSM) sites and is categorized under IQ problems. Social spammers intensively post nonsensical content such as advertisements, porn materials, viruses, malware, and phishing websites in different contexts (e.g., topics) and in an automated and systematic way (Benevenuto, Magno, Rodrigues, & Almeida, 2010; Washha, Qaroush, & Sèdes, 2016). Moreover, social spammers exploit trending topics and available services or APIs to launch their spammy content in short periods (e.g., one day) to maximize their monetary profits and speed up their spamming behavior.
