RaSEn - Random Subspace Ensemble Classification and Variable Screening
We propose a general ensemble classification framework, the
RaSE algorithm, for sparse classification problems. In the RaSE
algorithm, for each weak learner, a collection of random
subspaces is generated, and the optimal one is chosen according
to a specified criterion and used to train the model. To suit
the sparse classification setting, a
novel criterion, the ratio information criterion (RIC), is
proposed based on the Kullback-Leibler divergence. Besides
minimizing RIC, other criteria can be applied, for instance,
minimizing the extended Bayesian information criterion (eBIC),
the training error, the validation error, the cross-validation
error, or the leave-one-out error. There
are various choices of base classifier, for instance, linear
discriminant analysis, quadratic discriminant analysis,
k-nearest neighbour, logistic regression, decision trees,
random forest, and support vector machines. The RaSE algorithm
can also be used for feature ranking, measuring the importance
of each feature by the percentage of selected subspaces in
which it appears. The RaSE framework can be extended to a general
prediction framework, including both classification and
regression. The selected percentages of variables can also be
used for variable screening. The latest version adds a variable
screening function for both regression and classification
problems.
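
The procedure described above can be illustrated with a minimal sketch. This is
not the package's implementation or API: it is written in plain Python, and it
assumes a nearest-centroid base classifier, the training error as the selection
criterion (instead of RIC), subspace sizes drawn uniformly from 1 to max_size,
and a simple majority vote. All function names, parameters, and the toy data
are hypothetical and chosen only for illustration.

```python
import random

def centroid_fit(X, y, feats):
    """Fit class centroids (labels 0/1) using only the features in `feats`."""
    cents = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(r[j] for r in rows) / len(rows) for j in feats]
    return cents

def centroid_predict(cents, feats, x):
    """Predict the label whose centroid is closest within the subspace."""
    def dist(label):
        return sum((x[j] - c) ** 2 for j, c in zip(feats, cents[label]))
    return 0 if dist(0) <= dist(1) else 1

def training_error(X, y, feats):
    """Assumed criterion: misclassification rate on the training data."""
    cents = centroid_fit(X, y, feats)
    wrong = sum(centroid_predict(cents, feats, x) != t for x, t in zip(X, y))
    return wrong / len(y)

def rase_sketch(X, y, n_learners=20, n_subspaces=50, max_size=3, seed=0):
    """For each weak learner, draw random subspaces and keep the best one."""
    rng = random.Random(seed)
    p = len(X[0])
    learners = []
    for _ in range(n_learners):
        best_feats, best_err = None, float("inf")
        for _ in range(n_subspaces):
            size = rng.randint(1, max_size)
            feats = sorted(rng.sample(range(p), size))
            err = training_error(X, y, feats)
            if err < best_err:
                best_feats, best_err = feats, err
        learners.append((best_feats, centroid_fit(X, y, best_feats)))
    # Selected percentage: share of chosen subspaces that contain each feature.
    selected_pct = [
        sum(j in feats for feats, _ in learners) / n_learners for j in range(p)
    ]
    return learners, selected_pct

def rase_predict(learners, x):
    """Aggregate the weak learners by simple majority vote."""
    votes = sum(centroid_predict(cents, feats, x) for feats, cents in learners)
    return 1 if votes > len(learners) / 2 else 0

# Toy sparse problem: 10 features, only feature 0 carries signal.
data_rng = random.Random(1)
y = [i % 2 for i in range(60)]
X = [
    [3 * t + data_rng.uniform(-1, 1)] + [data_rng.uniform(-1, 1) for _ in range(9)]
    for t in y
]
learners, selected_pct = rase_sketch(X, y)
print(selected_pct)
```

On this toy data the selected percentage of feature 0 should dominate those of
the noise features, which is the quantity the description proposes for feature
ranking and variable screening.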