Article specifications
Article title | Learning Bayesian networks with local structure, mixed variables, and exact algorithms
Publication year | 2019
Number of pages (English article) | 27
Cost | The English article is free to download.
Publisher database | Elsevier
Article type | Research Article
Base article | Yes
Indexing | Scopus – Master Journals List – JCR
Article category | ISI
Impact Factor (IF) | 2.899 (2019)
H-index | 85 (2020)
SJR | 0.606 (2019)
ISSN | 0888-613X
Quartile | Q2 (2019)
Conceptual model | None
Questionnaire | None
Variables | Yes
References | Yes
Related disciplines | Computer engineering
Related specializations | Artificial intelligence; algorithms and computation
Presentation type | Journal
Journal | International Journal of Approximate Reasoning
Affiliation | Department of Computer Science, University of Helsinki, Finland
Keywords | Bayesian networks, Decision trees, Exact algorithms, Structure learning
DOI | https://doi.org/10.1016/j.ijar.2019.09.002
Product code | E15003
Table of contents:
Abstract
1. Introduction
2. Preliminaries
3. Partition–dyadic CART
4. Algorithms
5. Empirical studies
6. Discussion
Declaration of Competing Interest
Acknowledgements
Appendix A. Full Intersection-Validation results
Appendix B. Overview of used UCI data sets including full prediction results
Appendix C. Remaining hyperparameter sensitivity plots
Appendix D. Additional results for studies with ordinal variables
References
Excerpt from the article:
Abstract
Modern exact algorithms for structure learning in Bayesian networks first compute an exact local score for every candidate parent set, and then find a network structure by combinatorial optimization so as to maximize the global score. This approach assumes that each local score can be computed quickly, which becomes problematic when the scarcity of the data calls for structured local models, or when there are both continuous and discrete variables, since these cases have lacked local scores that are efficient to compute. To address this challenge, we introduce a local score that is based on a class of classification and regression trees. We show that, under modest restrictions on the possible branchings in the tree structure, it is feasible to find a structure that maximizes a Bayes score in a range of moderate-size problem instances. In particular, this enables global optimization of the Bayesian network structure, including the local structure. In addition, we introduce a related model class that extends ordinary conditional probability tables to continuous variables by employing an adaptive discretization approach. The two model classes are compared empirically by learning Bayesian networks from benchmark real-world and synthetic data sets. We discuss the relative strengths of the model classes in terms of their structure-learning capability, predictive performance, and running time.

Introduction

Bayesian networks (BNs) compose a multivariate distribution as a product of univariate conditional probability distributions (CPDs). The potential complexity of the local CPDs, together with the global acyclicity constraint of the BN structure, makes the task of learning BNs from data challenging in many ways. Even if we assume that the available data contain neither unobserved variables nor missing entries, as we will do throughout this article, there often remain the following three concerns:

(i) Statistical efficiency: the data may be scarce, rendering simple tabular parameterizations of the CPDs, such as conditional probability tables (CPTs), statistically inefficient.

(ii) Heterogeneous variables: so-called hybrid BNs contain both discrete and continuous variables, again ruling out the most convenient and standard parameterizations of CPDs.

(iii) Computational efficiency: the most natural formulations of the learning problem are computationally hard. This limits both the dimensions of the data and the complexity of the CPDs in relation to what can be solved under useful quality guarantees.

Each of these issues (i–iii) has been addressed in the literature, as we will describe in the next paragraphs. The present authors are, however, not aware of any prior work that simultaneously addresses all three concerns.
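For reference, the factorization the introduction alludes to can be written in standard BN notation (this equation is textbook background, not quoted from the paper):

P(x_1, \dots, x_n) = \prod_{i=1}^{n} P\bigl(x_i \mid x_{\mathrm{Pa}(i)}\bigr)

where \mathrm{Pa}(i) denotes the parent set of variable i in the directed acyclic graph. Structure learning then amounts to choosing the parent sets so that a decomposable score of the resulting DAG, a sum of local scores, is maximized subject to acyclicity.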
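The two-phase approach described in the abstract (precompute a local score for every candidate parent set, then solve a combinatorial optimization problem) can be illustrated with a minimal Python sketch of the classic dynamic-programming algorithm over variable subsets, in the style of Silander and Myllymäki. This is illustrative background rather than the authors' implementation: the dictionary local_score, mapping each variable and candidate parent set to a precomputed score (for instance, the paper's CART-based Bayes score), is an assumed input.

from itertools import combinations

def exact_structure_search(local_score, n):
    # local_score[v][S]: precomputed score of variable v with parent set S
    # (S a frozenset). Runs in time exponential in n; feasible for moderate n.
    variables = list(range(n))

    # Phase 1: for each v and each candidate set S of other variables,
    # find the best-scoring parent set contained in S.
    best = {v: {} for v in variables}
    for v in variables:
        others = [u for u in variables if u != v]
        for r in range(len(others) + 1):
            for S in map(frozenset, combinations(others, r)):
                cand = [(local_score[v][S], S)]
                for u in S:
                    cand.append(best[v][S - {u}])
                best[v][S] = max(cand)

    # Phase 2: dynamic programming over subsets W. The best network on W
    # extends the best network on W - {v} by a "sink" v whose parents
    # are chosen optimally from W - {v}.
    dp = {frozenset(): (0.0, None, None)}
    for r in range(1, n + 1):
        for W in map(frozenset, combinations(variables, r)):
            dp[W] = max(
                (dp[W - {v}][0] + best[v][W - {v}][0], v, best[v][W - {v}][1])
                for v in W
            )

    # Backtrack: peel off sinks to recover the optimal DAG as parent sets.
    dag, W = {}, frozenset(variables)
    while W:
        _, v, parents = dp[W]
        dag[v] = parents
        W = W - {v}
    return dag

Given local_score filled in for all candidate parent sets of each variable, exact_structure_search(local_score, n) returns a dictionary mapping each variable to its optimal parent set. The sketch also makes the abstract's premise concrete: the optimization queries local scores many times, so each one must be cheap to evaluate.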
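The abstract's restricted tree branchings and adaptive discretization can likewise be made concrete with a small, hypothetical sketch: scoring one continuous predictor on [0, 1) by recursive dyadic splits, with a Beta-Bernoulli marginal likelihood at each leaf for a binary child. The paper's partition-dyadic CART model and its priors are richer than this, so treat the function names, the binary-child assumption, and the flat Beta(1, 1) prior as illustrative choices only.

import math

def log_beta_marginal(k, n, a=1.0, b=1.0):
    # Log marginal likelihood of an observed binary sequence with k ones
    # among n trials, under a Beta(a, b) prior on the Bernoulli parameter.
    return (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
            + math.lgamma(a + k) + math.lgamma(b + n - k)
            - math.lgamma(a + b + n))

def dyadic_score(points, lo=0.0, hi=1.0, depth=0, max_depth=4):
    # points: list of (x, y) pairs with x in [lo, hi) and binary y.
    # Returns (log score, tree), where tree is ('leaf', lo, hi) or
    # ('split', mid, left_tree, right_tree).
    k = sum(y for _, y in points)
    leaf_score = log_beta_marginal(k, len(points))
    if depth == max_depth:
        return leaf_score, ('leaf', lo, hi)
    mid = (lo + hi) / 2.0  # dyadic split point: always the midpoint
    ls, lt = dyadic_score([p for p in points if p[0] < mid],
                          lo, mid, depth + 1, max_depth)
    rs, rt = dyadic_score([p for p in points if p[0] >= mid],
                          mid, hi, depth + 1, max_depth)
    # A structure prior would normally penalize each split; omitted here
    # so the comparison is on marginal likelihood alone.
    if ls + rs > leaf_score:
        return ls + rs, ('split', mid, lt, rt)
    return leaf_score, ('leaf', lo, hi)

Because the branch points are restricted to dyadic midpoints, the number of candidate partitions stays manageable, which is what makes optimizing over local structures tractable; this mirrors, in simplified form, the modest restrictions on branchings that the abstract refers to.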