Estimating classifier performance with Genetic Programming

Published in Conference Papers
  1. Leonardo Trujillo, Yuliana Martínez and Patricia Melin. Estimating Classifier Performance with Genetic Programming. In EuroGP. 2011, 274-285. BibTeX

    	author = "Leonardo Trujillo and Yuliana Mart\'{\i}nez and Patricia Melin",
    	title = "Estimating Classifier Performance with Genetic Programming",
    	booktitle = "EuroGP",
    	year = 2011,
    	pages = "274-285",
    	ee = "",
    	crossref = "DBLP:conf/eurogp/2011",
    	bibsource = "DBLP,"
  2. Sara Silva, James A. Foster, Miguel Nicolau, Penousal Machado and Mario Giacobini (eds.). Genetic Programming - 14th European Conference, EuroGP 2011, Torino, Italy, April 27-29, 2011. Proceedings. Lecture Notes in Computer Science, vol. 6621. Springer, 2011. BibTeX

    	editor = "Sara Silva and James A. Foster and Miguel Nicolau and Penousal Machado and Mario Giacobini",
    	title = "Genetic Programming - 14th European Conference, EuroGP 2011, Torino, Italy, April 27-29, 2011. Proceedings",
    	booktitle = "EuroGP",
    	publisher = "Springer",
    	series = "Lecture Notes in Computer Science",
    	volume = 6621,
    	year = 2011,
    	isbn = "978-3-642-20406-7",
    	ee = "",
    	bibsource = "DBLP,"

A fundamental task that must be addressed before classifying a set of data is choosing the proper classification method. In other words, a researcher must infer which classifier will achieve the best performance on the classification problem in order to make a reasoned choice. This task is not trivial, and it is mostly resolved based on personal experience and individual preferences. This paper presents a methodological approach to produce estimators of classifier performance, based on descriptive measures of the problem data. The proposal is to use Genetic Programming (GP) to evolve mathematical operators that take descriptors of the problem data as input and output the expected error that a particular classifier might achieve when used to classify the data. The approach is evaluated on a large set of 500 two-class problems of multimodal data, using a neural network for classification, and the experimental tests show that GP can produce accurate estimators of classifier performance. The results suggest that the GP approach could provide a tool that helps researchers make a reasoned decision regarding the applicability of a classifier to a particular problem.
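
The idea in the abstract (descriptors of a problem go in, an estimated classifier error comes out, and GP searches for the expression in between) can be illustrated with a toy sketch. This is not the authors' implementation: the function set, the mutation-only search with elitism (crossover is omitted for brevity), the parameter values, and the synthetic descriptors and target errors are all assumptions made for illustration.

```python
import math
import operator
import random


def pdiv(a, b):
    """Protected division: return 1.0 when the divisor is (near) zero."""
    return a / b if abs(b) > 1e-9 else 1.0


FUNCS = [operator.add, operator.sub, operator.mul, pdiv]  # assumed function set


def random_tree(n_desc, depth, rng):
    """Grow a random expression over descriptor indices and constants."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return ("x", rng.randrange(n_desc))
        return ("c", rng.uniform(-1.0, 1.0))
    return (rng.choice(FUNCS),
            random_tree(n_desc, depth - 1, rng),
            random_tree(n_desc, depth - 1, rng))


def evaluate(tree, x):
    """Evaluate an expression tree on one descriptor vector x."""
    if tree[0] == "x":
        return x[tree[1]]
    if tree[0] == "c":
        return tree[1]
    return tree[0](evaluate(tree[1], x), evaluate(tree[2], x))


def fitness(tree, X, y):
    """Mean absolute gap between estimated and measured error (lower is better)."""
    total = sum(abs(evaluate(tree, xi) - yi) for xi, yi in zip(X, y))
    return total / len(X) if math.isfinite(total) else float("inf")


def mutate(tree, n_desc, rng):
    """Replace one randomly chosen subtree with a fresh random subtree."""
    if tree[0] in ("x", "c") or rng.random() < 0.3:
        return random_tree(n_desc, 2, rng)
    func, left, right = tree
    if rng.random() < 0.5:
        return (func, mutate(left, n_desc, rng), right)
    return (func, left, mutate(right, n_desc, rng))


def evolve(X, y, pop_size=60, generations=30, seed=0):
    """Mutation-only evolutionary loop with binary tournaments and elitism."""
    rng = random.Random(seed)
    n_desc = len(X[0])
    pop = [random_tree(n_desc, 3, rng) for _ in range(pop_size)]
    best = min(pop, key=lambda t: fitness(t, X, y))
    history = [fitness(best, X, y)]
    for _ in range(generations):
        new_pop = [best]  # elitism: the best estimator always survives
        while len(new_pop) < pop_size:
            a, b = rng.sample(pop, 2)  # binary tournament selection
            parent = a if fitness(a, X, y) <= fitness(b, X, y) else b
            new_pop.append(mutate(parent, n_desc, rng))
        pop = new_pop
        best = min(pop, key=lambda t: fitness(t, X, y))
        history.append(fitness(best, X, y))
    return best, history


# Synthetic stand-in data (NOT the paper's 500 problems): two made-up
# descriptors per problem and a made-up "measured" error to regress on.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(40)]
y = [0.5 * x0 * x1 for x0, x1 in X]

best, history = evolve(X, y)
print("best fitness, first vs. last generation:", history[0], "->", history[-1])
```

In a real setting, each row of `X` would hold descriptive measures computed from one classification problem, and each entry of `y` would be the error the chosen classifier actually obtained on that problem. The evolved expression could then be applied to the descriptors of an unseen problem to estimate the error before training the classifier.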

Published in
Proceedings of the 14th European Conference on Genetic Programming
Volume 6621
Pages 274-285
Date of conference
27-29 April 2011