Article details | |
Article title | Matminer: An open source toolkit for materials data mining |
Published | 2018 |
Length of English article | 10 pages |
Publisher | Elsevier |
Article type | Research article |
Base paper | This is not a base paper |
Indexing | Scopus – Master Journals – JCR |
Paper category | ISI |
English article format | |
Impact factor (IF) | 2.530 in 2017 |
H-index | 91 in 2018 |
SJR | 1.766 in 2018 |
Related fields | Industrial engineering, computer engineering |
Related specializations | Data mining, artificial intelligence |
Presentation type | Journal |
Journal / conference | Computational Materials Science |
University | Computation Institute – University of Chicago – Chicago – United States |
Keywords | Data mining, Open source software, Machine learning, Materials informatics |
DOI | https://doi.org/10.1016/j.commatsci.2018.05.018 |
Table of contents: |
Abstract; Keywords; 1 Introduction; 2 Software architecture and design principles; 3 Components of matminer; 4 Examples of using matminer; 5 Conclusion; Acknowledgements; Appendix A. Supplementary material; References |
Excerpt from the article: |
ABSTRACT
As materials data sets grow in size and scope, the role of data mining and statistical learning methods to analyze these materials data sets and build predictive models is becoming more important. This manuscript introduces matminer, an open-source, Python-based software platform to facilitate data-driven methods of analyzing and predicting materials properties. Matminer provides modules for retrieving large data sets from external databases such as the Materials Project, Citrination, Materials Data Facility, and Materials Platform for Data Science. It also provides implementations for an extensive library of feature extraction routines developed by the materials community, with 47 featurization classes that can generate thousands of individual descriptors and combine them into mathematical functions. Finally, matminer provides a visualization module for producing interactive, shareable plots. These functions are designed in a way that integrates closely with machine learning and data analysis packages already developed and in use by the Python data science community. We explain the structure and logic of matminer, provide a description of its various modules, and showcase several examples of how matminer can be used to collect data, reproduce data mining studies reported in the literature, and test new methodologies.
INTRODUCTION
Recently, the materials community has placed a renewed emphasis on collecting and organizing large data sets for research, materials design, and the eventual application of statistical or “machine learning” techniques. For example, the mining of databases composed of density functional theory (DFT) calculations has been used to identify materials for batteries [1,2], to aid the design of metal alloys [3,4], and for many other applications [5].
Importantly, such data sets present new opportunities to develop predictive models through machine learning techniques: rather than designing and programming such models manually, such techniques produce predictive models by learning from a body of examples. Machine learning models have been demonstrated to predict properties of crystalline materials much faster than DFT [6–9], estimate properties that are difficult to access via other computational tools [10,11], and guide the search for new materials [12–16]. With the continued development of general-purpose data mining methods for many types of materials data [17–19] and the proliferation of material property databases [20], this emerging field of “materials informatics” is positioned to have a continued impact on materials design.
In this paper, we describe a new software library, “matminer”, for applying data-driven techniques to the materials domain. The main roles of matminer are depicted in Fig. 1: matminer assists the user in retrieving large data sets from common databases, extracts features to transform the raw data into representations suitable for machine learning, and produces interactive visualizations of the data for exploratory analysis. We note that matminer does not itself implement common machine learning algorithms; industry-standard tools (e.g., scikit-learn or Keras) are already developed and maintained by the larger data science community for this purpose. Instead, matminer’s role is to connect these advanced machine learning tools to the materials domain. |
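To make the feature-extraction role described above more concrete, the following is a minimal, standard-library-only sketch of what a composition-based featurizer might do: it turns a chemical formula into a few numerical descriptors that a machine learning tool could consume. It does not use or reproduce matminer's API. The function names (`parse_formula`, `featurize`, `build_feature_matrix`), the choice of descriptors, and the four-element electronegativity table are assumptions made purely for illustration.

```python
import re

# Pauling electronegativities for a handful of elements.
# Illustrative table only; a real featurizer would cover the periodic table.
ELECTRONEGATIVITY = {"Na": 0.93, "Cl": 3.16, "Fe": 1.83, "O": 3.44}

# An element symbol followed by an optional (possibly fractional) amount.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def parse_formula(formula):
    """Return {element: atomic fraction} for a simple formula such as 'Fe2O3'.

    Hypothetical helper: parentheses, charges, and hydrates are not handled.
    """
    counts = {}
    for element, amount in _TOKEN.findall(formula):
        counts[element] = counts.get(element, 0.0) + (float(amount) if amount else 1.0)
    total = sum(counts.values())
    return {element: n / total for element, n in counts.items()}


def featurize(formula, table=ELECTRONEGATIVITY):
    """Compute a few descriptors from one elemental property (assumed choice)."""
    fractions = parse_formula(formula)
    values = [table[element] for element in fractions]
    return {
        # Mean of the property weighted by atomic fraction.
        "mean_en": sum(frac * table[el] for el, frac in fractions.items()),
        "min_en": min(values),
        "max_en": max(values),
        "range_en": max(values) - min(values),
    }


def build_feature_matrix(formulas):
    """Return (column names, rows), one row of descriptors per formula."""
    names = ["mean_en", "min_en", "max_en", "range_en"]
    rows = []
    for formula in formulas:
        features = featurize(formula)
        rows.append([features[name] for name in names])
    return names, rows


if __name__ == "__main__":
    names, rows = build_feature_matrix(["NaCl", "Fe2O3"])
    print(names)
    for row in rows:
        print(row)
```

The rows returned by `build_feature_matrix` form a plain numeric table, which is the kind of representation that general-purpose tools such as scikit-learn accept as input; that hand-off to an external learning library is the division of labour the excerpt describes.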