Estimating classifier performance with Genetic Programming

Published in Conference Papers
  1. Leonardo Trujillo, Yuliana Martínez, Edgar Galván-López and Pierrick Legrand. Predicting Problem Difficulty for Genetic Programming Applied to Data Classification. In Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation. 2011, 1355–1362.

    	author = "Trujillo, Leonardo and Mart\'{\i}nez, Yuliana and Galv\'{a}n-L\'{o}pez, Edgar and Legrand, Pierrick",
    	title = "Predicting Problem Difficulty for Genetic Programming Applied to Data Classification",
    	booktitle = "Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation",
    	series = "GECCO '11",
    	year = 2011,
    	isbn = "978-1-4503-0557-0",
    	location = "Dublin, Ireland",
    	pages = "1355--1362",
    	numpages = 8,
    	url = "",
    	doi = "10.1145/2001576.2001759",
    	acmid = 2001759,
    	publisher = "ACM",
    	address = "New York, NY, USA",
    	keywords = "classification, genetic programming, performance prediction"

A fundamental task that must be addressed before classifying a set of data is choosing the proper classification method. In other words, a researcher must infer which classifier will achieve the best performance on the classification problem in order to make a reasoned choice. This task is not trivial, and it is mostly resolved based on personal experience and individual preferences. This paper presents a methodological approach to produce estimators of classifier performance, based on descriptive measures of the problem data. The proposal is to use Genetic Programming (GP) to evolve mathematical operators that take as input descriptors of the problem data and output the expected error that a particular classifier might achieve if it is used to classify the data. The approach is evaluated on a large set of 500 two-class problems of multimodal data, using a neural network for classification, and the experimental tests show that GP can produce accurate estimators of classifier performance. The results suggest that the GP approach could provide a tool that helps researchers make a reasoned decision regarding the applicability of a classifier to a particular problem.
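
The idea in the abstract, evolving an expression that maps problem descriptors to an expected classifier error, can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified, mutation-only GP loop with tournament selection and elitism, written with the Python standard library only. The two descriptors per problem, the function set, all parameter values and the synthetic target errors are assumptions made purely for illustration, and none of the numbers are results from the paper.

```python
import math
import operator
import random


def protected_div(a, b):
    """Division that returns 1.0 when the denominator is close to zero."""
    return a / b if abs(b) > 1e-9 else 1.0


# Assumed function set: (callable, arity). The paper's actual set may differ.
FUNCTIONS = [(operator.add, 2), (operator.sub, 2),
             (operator.mul, 2), (protected_div, 2)]


def random_tree(n_features, depth, rng):
    """Grow a random expression tree over descriptor indices and constants."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return ("x", rng.randrange(n_features))
        return ("const", rng.uniform(-1.0, 1.0))
    func, arity = rng.choice(FUNCTIONS)
    children = [random_tree(n_features, depth - 1, rng) for _ in range(arity)]
    return ("op", func, children)


def evaluate(tree, x):
    """Evaluate an expression tree on one descriptor vector x."""
    kind = tree[0]
    if kind == "x":
        return x[tree[1]]
    if kind == "const":
        return tree[1]
    return tree[1](*(evaluate(child, x) for child in tree[2]))


def fitness(tree, descriptors, errors):
    """RMSE between predicted and observed classifier error (lower is better)."""
    total = 0.0
    for x, y in zip(descriptors, errors):
        diff = evaluate(tree, x) - y
        total += diff * diff
    rmse = math.sqrt(total / len(errors))
    return rmse if math.isfinite(rmse) else float("inf")


def mutate(tree, n_features, rng):
    """Subtree mutation: replace a randomly reached node with a fresh subtree."""
    if tree[0] != "op" or rng.random() < 0.2:
        return random_tree(n_features, 2, rng)
    children = list(tree[2])
    i = rng.randrange(len(children))
    children[i] = mutate(children[i], n_features, rng)
    return ("op", tree[1], children)


def evolve(descriptors, errors, pop_size=60, generations=30, seed=0):
    """Mutation-only GP with size-3 tournaments and two elites per generation."""
    rng = random.Random(seed)
    n_features = len(descriptors[0])
    score = lambda t: fitness(t, descriptors, errors)
    population = [random_tree(n_features, 3, rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=score)
        next_gen = ranked[:2]  # elitism: the best trees survive unchanged
        while len(next_gen) < pop_size:
            parent = min(rng.sample(ranked, 3), key=score)
            next_gen.append(mutate(parent, n_features, rng))
        population = next_gen
    return min(population, key=score)


# Synthetic placeholder data: 40 "problems", each with two hypothetical
# descriptors, and a made-up error that depends on them.
data_rng = random.Random(1)
descriptors = [(data_rng.random(), data_rng.random()) for _ in range(40)]
errors = [0.5 * d0 + 0.1 * d1 for d0, d1 in descriptors]

best = evolve(descriptors, errors)
print("RMSE of best evolved estimator:", fitness(best, descriptors, errors))
```

In a real setting, each row of `descriptors` would hold measures computed from one classification problem, and the matching entry of `errors` would be the test error a chosen classifier obtained on that problem. The evolved tree could then be applied to the descriptors of an unseen problem to estimate the error before training the classifier.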

Published in
Proceedings of the 14th European Conference on Genetic Programming
Volume 6621
Pages 274-285
Date of conference
27-29 April 2011