Article Specifications
Article title | Geometrical and topological approaches to Big Data
Publication year | 2017
English article length | 11 pages
Cost | The English article is free to download.
Database | Elsevier journal
Article type | Research Article
Base article | This is a base article.
Index | Scopus – Master Journal List – JCR
Article category | ISI
Impact Factor (IF) | 5.341 in 2017
H-index | 85 in 2019
SJR | 0.844 in 2017
ISSN | 0167-739X
Quartile | Q1 in 2017
Related disciplines | Information Technology Engineering
Related specializations | Information Systems Management
Presentation type | Journal
Journal | Future Generation Computer Systems
University | Department of Computer Science – Faculty of Electrical Engineering and Computer Science – Technical University of Ostrava – Czech Republic
Keywords | Big Data, Industry 4.0, Topological data analysis, Persistent homology, Dimensionality reduction, Big Data visualization
DOI | https://doi.org/10.1016/j.future.2016.06.005
Product code | E10675
Translation status | A ready-made translation of this article is not available; you can order one via the button below.
Free download | Download the English article for free
Order a translation | Order a translation of this article
Article table of contents:
Abstract
1- Introduction
2- Big Data technologies during time
3- Motivation examples
4- Mathematical background
5- Topological data analysis
6- Application of computational geometry and topology
7- Big Data visualization
8- Big Data challenges
9- Conclusion
References
Excerpt from the article:
Abstract

Modern data science uses topological methods to find the structural features of data sets before further supervised or unsupervised analysis. Geometry and topology are very natural tools for analysing massive amounts of data, since geometry can be regarded as the study of distance functions. The mathematical formalism developed for incorporating geometric and topological techniques deals with point cloud data sets, i.e. finite sets of points, and adapts tools from the various branches of geometry and topology to their study. The point clouds are finite samples taken from a geometric object, perhaps with noise. Topology provides a formal language for qualitative mathematics, whereas geometry is mainly quantitative. Thus, in topology, we study relationships of proximity or nearness without using distances; a map between topological spaces is called continuous if it preserves these nearness structures. Geometrical and topological methods allow us to analyse highly complex data: they create a summary or compressed representation of all of the data's features, helping to rapidly uncover particular patterns and relationships in the data. The idea of constructing summaries of entire domains of attributes involves understanding the relationship between the topological and geometric objects constructed from data using various features. A common thread in approaches to noise removal, model reduction, feasibility reconstruction, and blind source separation is to replace the original data with a lower-dimensional approximate representation obtained via a matrix or multi-directional array factorization or decomposition. Besides these transformations, the significant challenge of feature summarization and subset selection for Big Data will be considered, with a focus on scalable feature selection. Lower-dimensional approximate representations are also used for Big Data visualization. The cross-field between topology and Big Data will bring huge opportunities, as well as challenges, to Big Data communities. This survey aims at bringing together state-of-the-art research results on geometrical and topological methods for Big Data.
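The point-cloud pipeline the abstract describes can be made concrete with persistent homology. The sketch below is a minimal illustration, not code from the paper: it assumes the open-source GUDHI library (installable as `gudhi`) and uses a synthetic noisy circle as the point cloud, whose single loop the dimension-1 persistence diagram should recover.

```python
# Minimal persistent-homology sketch on a point cloud (assumes GUDHI).
import numpy as np
import gudhi

rng = np.random.default_rng(0)

# A finite, noisy sample from a geometric object (a circle): exactly
# the "point cloud data set" the abstract describes.
theta = rng.uniform(0.0, 2.0 * np.pi, size=200)
points = np.column_stack([np.cos(theta), np.sin(theta)])
points += rng.normal(scale=0.05, size=points.shape)

# Vietoris-Rips complex: simplices span points whose pairwise
# distances fall below a growing scale parameter.
rips = gudhi.RipsComplex(points=points, max_edge_length=2.0)
simplex_tree = rips.create_simplex_tree(max_dimension=2)

# Persistent homology tracks when connected components (H0) and
# loops (H1) are born and die as the scale grows.
diagram = simplex_tree.persistence()

# Long-lived H1 pairs indicate genuine loops rather than noise;
# we expect one dominant (birth, death) pair for the circle.
h1 = [(birth, death) for dim, (birth, death) in diagram if dim == 1]
print(sorted(h1, key=lambda bd: bd[1] - bd[0], reverse=True)[:3])
```

Features that persist across a wide range of scales are read as real structure, while short-lived ones are attributed to sampling noise; the persistence diagram is one form of the "summary of data features" the abstract refers to.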
Introduction

Big Data is everywhere, as high volumes of varied, valuable, precise and uncertain data can easily be collected or generated at high velocity in many real-life applications. The explosive growth in web-based storage, management, processing, and accessibility of social, medical, scientific and engineering data has been driven by our need for a fundamental understanding of the processes which produce this data. It is predicted that the volume of produced data could reach 44 zettabytes by 2020 [1]. The enormous volume and complexity of this data propel technological advancements, realized as exponential increases in storage capability, processing power, bandwidth capacity and transfer velocity. This is partly because of new experimental methods, and partly because of the increasing availability of high-powered computing technology. Massive amounts of data (Big Data) are too complex to be managed by traditional processing applications. Nowadays, Big Data includes the huge, complex, and abundant structured and unstructured data that is generated and gathered from numerous fields and resources. The challenges of managing it include extracting, analysing, visualizing, sharing, storing, transferring and searching such data.

Currently, traditional data processing tools and their applications are not capable of managing Big Data. Therefore, there is a critical need to develop effective and efficient Big Data processing techniques. Big Data has five characteristics: volume, velocity, variety, veracity and value [2]. Volume refers to the size of the data for processing and analysis. Velocity relates to the rate of data growth and usage. Variety means the different types and formats of the data used for processing and analysis. Veracity concerns the accuracy of results and of the data analysis. Value is the added value and contribution offered by data processing and analysis.
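The abstract's common thread, replacing the original data with a lower-dimensional approximate representation obtained via matrix factorization, can likewise be sketched in a few lines. The example below uses a truncated SVD in plain NumPy on synthetic data; the rank and matrix sizes are assumptions chosen purely for illustration.

```python
# Minimal low-rank approximation sketch: truncated SVD as the
# "lower dimensional approximate representation" of a data matrix.
import numpy as np

rng = np.random.default_rng(1)

# Synthetic high-dimensional data with low intrinsic rank plus noise.
n_samples, n_features, k = 1000, 50, 2
X = rng.normal(size=(n_samples, k)) @ rng.normal(size=(k, n_features))
X += rng.normal(scale=0.1, size=X.shape)

# Center, then factor: keeping the top-k singular triplets retains the
# directions of largest variance (model reduction / noise removal).
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
coords = U[:, :k] * s[:k]      # k-dimensional embedding of each sample
X_approx = coords @ Vt[:k]     # rank-k approximation of the centered data

rel_err = np.linalg.norm(Xc - X_approx) / np.linalg.norm(Xc)
print(f"rank-{k} relative reconstruction error: {rel_err:.3f}")

# The (n_samples x 2) 'coords' array is what a Big Data visualization
# would scatter-plot: the compressed representation stands in for X.
```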