Article details | |
Article title | Statistics in the big data era: Failures of the machine |
Publication | 2018 article |
Length of the English article | 11 pages |
Database | Elsevier |
Manuscript type | Short communication |
Base paper | This article is not a base paper |
Indexed in | Scopus – Master Journals – JCR |
Article category | ISI |
Impact factor (IF) | 0.533 in 2017 |
H-index | 54 in 2018 |
SJR | 0.58 in 2018 |
Related fields | Information Technology Engineering |
Related specializations | Information Systems Management |
Presentation type | Journal |
Journal / conference | Statistics and Probability Letters |
University | Department of Statistical Science – Duke University – United States |
Keywords | Deep learning; High-dimensional data; Large p, small n; Machine learning; Scientific inference; Selection bias; Uncertainty quantification |
DOI | https://doi.org/10.1016/j.spl.2018.02.028 |
Table of contents: |
Abstract; Keywords; 1 Introduction; 2 Case studies; 3 Uncertainty quantification in scientific inferences; 4 Issues with sampling, selection bias and measurement error; 5 Discussion; Acknowledgments |
Excerpt from the article: |
1. Introduction
1.1 Different cultures
The culture and ways in which the statistical community thinks of analyzing and interpreting data have been rapidly evolving in recent years, with the machine learning and signal processing communities having a fundamental impact on the rate and direction of this evolution. To set the stage for this discussion article, it is helpful to first comment on the culture and background of the machine learning and statistical communities. These comments are meant to give a “cartoon” of a complex reality, with this cartoon helpful as a starting point for discussion.
Machine learning (ML) community: tends to have its roots in engineering, computer science, and to a certain extent neuroscience – growing out of artificial intelligence (AI). The main publication outlets tend to be peer-reviewed conference proceedings, such as Neural Information Processing Systems (NIPS), and the style of research is very fast-paced, trendy, and driven by performance metrics in prediction and related tasks. One measure of “trendiness” is the fact that there is a strong auto-correlation in the main focus areas that are represented in the papers accepted to NIPS and other top conferences. For example, in the past several years much of the focus has been on deep neural network methods.
The ML community also has a tendency towards marketing and salesmanship, posting talks and papers on social media and attempting to sell their ideas to the broader public. This feature of the research seems to reflect a desire or tendency to want to monetize the algorithms in the near term, perhaps leading to a focus on industry problems over scientific problems, where the road to monetization is often much longer and less assured. ML marketing has been quite successful in recent years, and there is abundant interest and discussion in the general public about ML/AI, along with increasing success in start-ups and high-paying industrial-sector jobs, partly fueled by the hype.
Statistical (Stats) community: made up predominantly of researchers who received their initial degree(s) in mathematics followed by graduate training in statistics. The main publication outlets are peer-reviewed journals, most of which have a long, drawn-out review process, and the style of research tends to be careful, slower-paced, intellectual as opposed to primarily performance-driven, emphasizing theoretical support (e.g., through asymptotic properties), understated, and conservative. Statisticians tend to be reluctant to market their research, and their training tends to differ dramatically from that for most ML researchers. Statisticians usually have a mathematics base including multivariate calculus, linear algebra, differential equations, and real analysis. They then take several years of probability and statistics, including coverage of asymptotic theory, statistical sampling theory, hypothesis testing, experimental design, and many other areas. ML researchers coming out of Computer Science and Engineering have much less background in many of these areas, but have a stronger background in signal processing, computing (including not just programming but also an understanding of computer engineering and hardware), optimization, and computational complexity.