Select features of a gene expression array via cross-validation

In a classification problem we usually want to use as few variables as possible, because high-dimensional data bring difficulties of their own. This page shows the process of finding the minimum number of features for a predictive model.

Animation

This animation illustrates how k-fold cross-validation is used to find the optimum number of variables in a linear discriminant analysis (LDA).
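
The procedure behind the animation can be sketched in plain R. This is a minimal sketch under stated assumptions, not the code of cv.nfeaturesLDA() itself: the data are simulated (two shifted features among ten), features are ranked by a one-way ANOVA F statistic computed on the training folds only, and the classifier is lda() from the MASS package. The function name cv_nfeatures_sketch and its parameters are hypothetical, chosen for illustration.

```r
library(MASS)  # provides lda()

set.seed(1)
n_per_class <- 20
n_features  <- 10
cl <- gl(3, n_per_class)  # 3 classes, 20 samples each
x  <- matrix(rnorm(3 * n_per_class * n_features), ncol = n_features)
# Simulated signal (an assumption, not real expression data):
# shift the class means of the first two features
x[, 1] <- x[, 1] + 2 * as.integer(cl)
x[, 2] <- x[, 2] - 1.5 * as.integer(cl)

# Hypothetical helper: k-fold CV accuracy of LDA for 1..max_features features
cv_nfeatures_sketch <- function(x, cl, k = 5, max_features = ncol(x)) {
  n    <- nrow(x)
  fold <- sample(rep(seq_len(k), length.out = n))  # random fold labels
  acc  <- matrix(NA_real_, nrow = k, ncol = max_features)
  for (i in seq_len(k)) {
    test <- fold == i
    # Rank features using the training folds only, to avoid selection bias
    f_stat <- apply(x[!test, , drop = FALSE], 2, function(v) {
      summary(aov(v ~ cl[!test]))[[1]][["F value"]][1]
    })
    ranked <- order(f_stat, decreasing = TRUE)
    for (j in seq_len(max_features)) {
      keep <- ranked[seq_len(j)]  # top j features
      fit  <- lda(x[!test, keep, drop = FALSE], grouping = cl[!test])
      pred <- predict(fit, x[test, keep, drop = FALSE])$class
      acc[i, j] <- mean(pred == cl[test])  # accuracy on the held-out fold
    }
  }
  mean_acc <- colMeans(acc)
  list(accuracy = acc, mean_accuracy = mean_acc, optimum = which.max(mean_acc))
}

res <- cv_nfeatures_sketch(x, cl)
print(round(res$mean_accuracy, 2))
print(res$optimum)
```

Row i, column j of res$accuracy holds the rate of correct predictions on fold i for a model built with j features; res$optimum is the number of features with the highest mean accuracy across folds. Ranking the features inside each fold, rather than once on the full data, keeps the held-out samples out of the selection step.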

R code

library(animation)
# Render the cross-validation animation to an HTML page in the working directory
saveHTML({
    # set animation options: nmax = 10, 0.5 s between frames
    ani.options(nmax = 10, interval = 0.5)
    par(mar = c(3, 3, 1, 0.5), mgp = c(1.5, 0.5, 0), tcl = -0.3, pch = 19, cex = 1.5)
    cv.nfeaturesLDA()
}, img.name = "cv_nfeaturesLDA", htmlfile = "cv_nfeaturesLDA.html",
    # extra arguments are passed on to ani.options() by saveHTML()
    ani.height = 500, ani.width = 600, outdir = getwd(),
    title = "Cross-validation to find the optimum number of features in LDA",
    description = c("This animation illustrates how k-fold cross-validation",
        "is used to find the optimum number of variables in a",
        "linear discriminant analysis (LDA)."))

da/biostat/select_features_via_cv.txt · Last modified: 2011/01/16 07:57 by oyster8
 