Gene expression datasets obtained from high-throughput experiments are usually characterised by a number of features n that far exceeds the number of samples m (n >> m). Feature profiles typically comprise thousands of molecular markers, whereas sample sets seldom contain more than a hundred samples. Coping with this high-dimensional setting may require sparse classification models or the acquisition of additional data resources.
In the first part of this talk, Dr Szekely will discuss two strategies for integrating feature selection into multi-class classifier systems. Building on these results, the second part will present an exploratory approach that utilises feature signatures originally designed to discriminate between a pair of foreign diagnostic classes. This approach will then be extended by an indirect feature selection strategy that operates between a pair of original classes and one foreign class.