Article details | |
Published | 2018 |
Page count (English article) | 30 pages |
Published in | Elsevier journal |
Article type | ISI |
English title | A course on big data analytics |
Persian title | دوره آنالیز کلان داده |
Related fields | Computer engineering |
Related specializations | Cloud computing |
Journal | Journal of Parallel and Distributed Computing |
University | Stetson University – DeLand – Florida |
Persian keywords | برنامه درسی، تحصیلات کارشناسی، داده های بزرگ، محاسبات ابری |
English keywords | curriculum, undergraduate education, big data, cloud computing |
DOI | https://doi.org/10.1016/j.jpdc.2018.02.019 |
Excerpt from the article text:
1. Introduction
In 2015, Stetson University introduced a Data Analytics interdisciplinary minor for undergraduate students. A new four credit hour course focused on big data analytics was created to serve as an elective for this minor as well as an upper-level elective for computer science majors. This report documents the curriculum, infrastructure, and outcomes of our big data analytics course.
The landscape of big data tools, techniques, and application areas is vast [1]. Popular tools include the software frameworks Hadoop, Spark, and Hive, as well as cloud-based services like Google’s BigQuery. Popular techniques include MapReduce, relational and non-relational data stores, and computations represented as directed acyclic graphs of functions. Sometimes, these tools and techniques include processing overhead that results in slower processing on small data than traditional approaches such as a single-threaded application. Thus, it is important that students understand not only how but also when and when not to use big data technology.
Our course is designed to give students realistic, hands-on practice with big data. The course is project-focused and the projects are organized so that later projects require more sophisticated data processing techniques. Our projects and the parallel and distributed computing topics that they address are listed in Table 1. Each project is described in more detail in the corresponding sections of this report. Our projects may change in the future as the popularity and appropriateness of various technologies change over time.
Students are given access to a “virtual cluster” running on a moderately-sized server as well as cloud computing resources. Each project employs a publicly-available dataset and challenges students to answer simply-stated queries about the data. While the queries may be simple, the steps to extract the answers from the data may be considerably more complex.
We emphasize to students that the data processing steps are a means to an end and the ultimate goal of each project is to provide clear, insightful answers to the project’s queries. Students are required to produce a short report documenting these answers with supporting statistical analysis and plots as appropriate. They are expected to explain their findings with language intended for non-experts and to hide all implementation details in an appendix separate from the report.
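As a rough illustration of MapReduce, one of the techniques named in this introduction, the sketch below counts words in three phases (map, shuffle, reduce) and compares the result with a plain single-pass count. It is not taken from the course materials: the function names and the sample input are hypothetical, and everything runs in one process, so it shows only the shape of the computation and not the behavior of Hadoop or Spark.

```python
from collections import Counter, defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def mapreduce_word_count(lines):
    """Run the three phases in sequence (hypothetical, in-process driver)."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


def single_threaded_word_count(lines):
    """Traditional approach: one pass, no intermediate pairs or grouping."""
    return dict(Counter(word.lower() for line in lines for word in line.split()))


if __name__ == "__main__":
    sample = ["big data tools", "Big data techniques", "data"]  # made-up input
    print(mapreduce_word_count(sample))
    print(single_threaded_word_count(sample))
```

Both functions return the same counts. The MapReduce version builds intermediate pairs and groups them before reducing, which is the kind of extra work that pays off only when the phases can be spread across many machines; on an input this small, the single-pass version does strictly less work.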