JOURNAL

Cancer Informatics


Improved Sparse Multi-Class SVM and Its Application for Gene Selection in Cancer Classification


Publication Date: 04 Aug 2013

Type: Original Research

Journal: Cancer Informatics

Citation: Cancer Informatics 2013:12 143-153

doi: 10.4137/CIN.S10212

Abstract

Background: Microarray techniques provide promising tools for cancer diagnosis using gene expression profiles. However, molecular diagnosis based on high-throughput platforms presents great challenges due to the overwhelming number of variables versus the small sample size and the complex nature of multi-type tumors. Support vector machines (SVMs) have shown superior performance in cancer classification due to their ability to handle high dimensional low sample size data. The multi-class SVM algorithm of Crammer and Singer provides a natural framework for multi-class learning. Despite its effective performance, the procedure utilizes all variables without selection. In this paper, we propose to improve the procedure by imposing shrinkage penalties in learning to enforce solution sparsity.
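For reference, the Crammer and Singer formulation mentioned above can be written as an unconstrained problem. The notation here is introduced for illustration and is not taken from the paper: training pairs $(x_i, y_i)$ with $x_i \in \mathbb{R}^p$ and $y_i \in \{1, \dots, K\}$, one weight vector $w_k$ per class, and tuning parameters $\beta$ and $\lambda$. The second expression shows one possible shrinkage penalty (an L1 penalty, assumed here only as an example); the specific penalties proposed by the authors may differ.

```latex
% Crammer-Singer multi-class SVM (ridge-type penalty; uses all p variables).
% Prediction rule: \hat{y}(x) = \arg\max_k w_k^\top x.
\min_{w_1,\dots,w_K}\;
  \sum_{i=1}^{n} \Bigl[ \max_{k} \bigl( w_k^\top x_i + 1 - \delta_{y_i,k} \bigr)
                        - w_{y_i}^\top x_i \Bigr]
  + \frac{\beta}{2} \sum_{k=1}^{K} \lVert w_k \rVert_2^2

% Illustrative sparse variant (assumption): same loss, L1 shrinkage penalty,
% which can set individual coefficients w_{kj} exactly to zero.
\min_{w_1,\dots,w_K}\;
  \sum_{i=1}^{n} \Bigl[ \max_{k} \bigl( w_k^\top x_i + 1 - \delta_{y_i,k} \bigr)
                        - w_{y_i}^\top x_i \Bigr]
  + \lambda \sum_{k=1}^{K} \sum_{j=1}^{p} \lvert w_{kj} \rvert
```

Here $\delta_{y_i,k}$ equals 1 when $k = y_i$ and 0 otherwise, so the loss for sample $i$ is zero only when the true class score exceeds every other class score by at least 1.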

Results: The original multi-class SVM of Crammer and Singer is effective for multi-class classification but does not conduct variable selection. We improved the method by introducing soft-thresholding type penalties to incorporate variable selection into multi-class classification for high dimensional data. The new methods were applied to simulated data and two cancer gene expression data sets. The results demonstrate that the new methods can select a small number of genes for building accurate multi-class classification rules. Furthermore, the important genes selected by the methods overlap significantly, suggesting general agreement among different variable selection schemes.
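To illustrate how a soft-thresholding operation can turn coefficient shrinkage into gene selection, the following is a minimal sketch in plain Python. It is not the authors' MATLAB implementation: the function names, the threshold `t`, and the toy weight matrix are all hypothetical, and the rule "keep a gene if any class retains a nonzero coefficient" is an assumption made for the example.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def selected_genes(weights, t):
    """Apply soft-thresholding and list genes that survive in any class.

    weights[k][j] is the (hypothetical) coefficient of gene j for class k.
    Returns the shrunken matrix and the indices of the retained genes.
    """
    shrunk = [[soft_threshold(w, t) for w in row] for row in weights]
    n_genes = len(weights[0])
    keep = [j for j in range(n_genes) if any(row[j] != 0.0 for row in shrunk)]
    return shrunk, keep


# Toy coefficients for 3 classes and 4 genes (made-up numbers).
W = [
    [0.9, -0.05, 0.0, 0.2],
    [-0.4, 0.02, 0.1, -0.15],
    [-0.5, 0.03, -0.1, -0.05],
]
shrunk, keep = selected_genes(W, t=0.25)
print(keep)
```

In this toy case only the first gene has coefficients large enough to survive the threshold, so the resulting classifier would use a single variable. A larger `t` gives a sparser model, and `t = 0` keeps every gene with any nonzero coefficient.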

Conclusions: High accuracy and sparsity make the new methods attractive for cancer diagnostics with gene expression data and defining targets of therapeutic intervention.

Availability: The MATLAB source code is available from http://math.arizona.edu/~hzhang/software.html.







