Biochemistry, Department of

Department of Biochemistry: Faculty Publications

CancerDiscover: an integrative pipeline for cancer biomarker and cancer class prediction from high-throughput sequencing data

Akram Mohammed, University of Nebraska-Lincoln
Greyson Biegert, University of Nebraska-Lincoln
Jiri Adamec, University of Nebraska-Lincoln
Tomáš Helikar, University of Nebraska-Lincoln

Document Type

Article

Date of this Version

2018

Citation

Oncotarget, 2018, Vol. 9, (No. 2), pp: 2565-2573.

Comments

Mohammed et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0.

Abstract

Accurate identification of cancer biomarkers and classification of cancer type and subtype from high-throughput sequencing (HTS) data is a challenging problem because it requires manual processing of raw HTS data from various sequencing platforms, quality control, and normalization, all of which are tedious and time-consuming. Machine learning techniques for cancer class prediction and biomarker discovery can hasten cancer detection and significantly improve prognosis. To date, considerable research effort has gone into cancer biomarker identification and cancer class prediction. However, currently available tools and pipelines lack flexibility in data preprocessing and in running multiple feature selection methods and learning algorithms; researchers therefore need a freely available, easy-to-use program. Here, we propose CancerDiscover, an integrative open-source software pipeline that allows users to automatically and efficiently process large raw high-throughput datasets, normalize them, and select the best-performing features from multiple feature selection algorithms. Additionally, the integrative pipeline lets users apply different feature thresholds to identify cancer biomarkers and build various training models to distinguish different types and subtypes of cancer. The open-source software is available at https://github.com/HelikarLab/CancerDiscover and is free for use under the GPL3 license.

Included in

Biochemistry Commons, Biotechnology Commons, Other Biochemistry, Biophysics, and Structural Biology Commons
