Article details | |
Persian title (translated) | An integrative gene expression analysis system using self-paced learning and SCAD-Net (smoothly clipped absolute deviation network) |
English title | An integrative analysis system of gene expression using self-paced learning and SCAD-Net |
Publication | 2019 article |
Number of pages (English article) | 11 pages |
Cost | The English article is free to download. |
Database | Elsevier |
Article type | Research article |
Base paper | This paper qualifies as a base paper |
Index | Scopus – Master Journals List – JCR |
Article category | ISI |
English article format | |
Impact factor (IF) | 5.891 in 2018 |
H-index | 162 in 2019 |
SJR | 1.190 in 2018 |
ISSN | 0957-4174 |
Quartile | Q1 in 2018 |
Conceptual model | None |
Questionnaire | None |
Variables | Yes |
References | Yes |
Related fields | Biology |
Related specializations | Genetics |
Presentation type | Journal |
Journal / Conference | Expert Systems with Applications |
University | School of Information Science and Engineering & Provincial Demonstration Software Institute, Shaoguan University, Shaoguan, China |
Keywords | Integrative analysis system, Meta-analysis, Regularization, Variable selection, Gene expression |
Digital identifier (DOI) | https://doi.org/10.1016/j.eswa.2019.06.016 |
Product code | E13557 |
Table of contents: |
Abstract Abbreviations 1. Introduction 2. Method 3. Calculation 4. Results 5. Discussion and conclusion Acknowledgments Funding Conflicts of interest Authors’ Contributions Appendix. Supplementary materials References |
Excerpt from the article: |
Abstract
Background: Few proposed gene biomarkers have proved satisfactory in clinical applications, mainly because of the small sample sizes of the underlying studies. Because of the batch effect, different gene-expression studies cannot be merged directly. Many integrative methods have attempted to combine various datasets while eliminating the batch effect and keeping the biological information intact. However, due to its complexity, the batch effect cannot be eliminated, and these methods may even introduce new systematic errors, further complicating the integrated data. Direct analysis of the merged data may therefore be problematic. In this paper, we propose a novel integrative analysis framework for merged gene-expression data. The framework adopts self-paced learning, which allows samples to be added to the training process automatically, from simple to complex, in a purely self-paced way. Moreover, the framework includes a new feature selection method, SCAD-Net regularization, which combines SCAD and network-based penalties to integrate biological network knowledge. Simulations show that the proposed method outperforms the benchmark, identifying markers more accurately. An analysis of seven large non-small cell lung cancer (NSCLC) gene expression datasets shows that the proposed method not only achieves higher accuracy but also identifies potential therapeutic markers and pathways in NSCLC. In conclusion, we provide a new and efficient integrative analysis system for gene expression to support the search for new, reliable diagnostic or targeted-therapy biomarkers.
Introduction
To date, numerous gene biomarker studies have been completed (Dang et al., 2018; Reis-Filho & Pusztai, 2011). Unfortunately, few of the proposed gene biomarkers have proved satisfactory in clinical applications, mainly because of small study sample sizes (Ali et al., 2014; Hay, Thomas, Craighead, Economides, & Rosenthal, 2014).
Small sample sizes reduce statistical power, which can lead to false conclusions. A sufficient number of samples is required for effective statistical analysis and valid conclusions. The increasing number and availability of large gene expression studies motivate the development of integrative analysis, which combines multiple datasets or relevant results. However, although some gene expression studies share the same goal, their constituent datasets have typically been generated using diverse processing facilities and different data platforms, and they report expression values on different numerical scales (often called the batch effect). Therefore, merging information from different gene expression studies poses a statistical challenge. Extensive efforts have been made to address this challenge; they fall into two distinct approaches: meta-analysis and integrative analysis via data merging (Ma, 2009). The first approach, meta-analysis, uses statistical methods that combine results from different studies. However, meta-analysis is not trivial: several conditions are critical for viable results, and small violations of those conditions can lead to misleading results (Walker, Hernandez, & Kattan, 2008). The second approach, integrative analysis, merges diverse datasets into a single unified dataset and performs the analysis on this newly integrated dataset. Its main advantage over meta-analysis is the higher statistical significance of its results, owing to the larger dataset (Lazar et al., 2013). |
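The abstract names two building blocks, the SCAD penalty and self-paced learning, without giving formulas in this excerpt. The sketch below is only an illustration under stated assumptions: it uses the standard textbook form of the SCAD penalty with the conventional default a = 3.7, and the simplest hard-threshold variant of self-paced sample weighting. The function names, the toy loss values, and the choice of the hard-threshold variant are assumptions, not taken from the paper. The network-based term of SCAD-Net is not shown because the excerpt does not specify its form.

```python
import numpy as np


def scad_penalty(beta, lam, a=3.7):
    """Standard SCAD penalty, applied element-wise (requires a > 2).

    Hypothetical helper for illustration; a=3.7 is the conventional default,
    not a value taken from the paper.
    """
    b = np.abs(np.asarray(beta, dtype=float))
    linear = lam * b                                   # region |b| <= lam
    quadratic = (2 * a * lam * b - b**2 - lam**2) / (2 * (a - 1))  # lam < |b| <= a*lam
    constant = lam**2 * (a + 1) / 2                    # region |b| > a*lam
    return np.where(b <= lam, linear,
                    np.where(b <= a * lam, quadratic, constant))


def self_paced_weights(losses, age):
    """Hard self-paced weights: 1.0 where the sample loss is below `age`, else 0.0.

    Assumption: the hard-threshold variant; the paper may use a different scheme.
    """
    return (np.asarray(losses, dtype=float) < age).astype(float)


# Toy per-sample losses (made-up numbers, not data from the paper).
losses = np.array([0.1, 0.4, 0.9, 2.5])

# Raising the age threshold admits progressively harder samples into training.
for age in (0.5, 1.0, 3.0):
    print(age, self_paced_weights(losses, age))

# Small coefficients are penalized like the lasso; large ones hit a flat cap.
print(scad_penalty([0.5, 2.0, 10.0], lam=1.0))
```

In a full training loop, the weights would typically be recomputed after each model update while the threshold grows, so that easy samples shape the model first and harder (possibly batch-affected) samples enter later.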