An empirical study of functional complexity as an indicator of overfitting in Genetic Programming

Published in Conferences Papers
  1. Leonardo Trujillo, Yuliana Mart\'ınez and Patricia Melin. How Many Neurons?: A Genetic Programming Answer. In Proceedings of the 13th Annual Conference Companion on Genetic and Evolutionary Computation. 2011, 175–176. URL, DOI BibTeX

    	author = "Trujillo, Leonardo and Mart\'{\i}nez, Yuliana and Melin, Patricia",
    	title = "How Many Neurons?: A Genetic Programming Answer",
    	booktitle = "Proceedings of the 13th Annual Conference Companion on Genetic and Evolutionary Computation",
    	series = "GECCO '11",
    	year = 2011,
    	isbn = "978-1-4503-0690-4",
    	location = "Dublin, Ireland",
    	pages = "175--176",
    	numpages = 2,
    	url = "",
    	doi = "10.1145/2001858.2001956",
    	acmid = 2001956,
    	publisher = "ACM",
    	address = "New York, NY, USA",
    	keywords = "artificial neural networks, genetic programming"

Recently, it has been stated that the complexity of a solution is a good indicator of the amount of overfitting it incurs. However, measuring the complexity of a program, in Genetic Programming, is not a trivial task. In this paper, we study the functional complexity and how it relates with overfitting on symbolic regression problems. We consider two measures of complexity, Slope-based Functional Complexity, inspired by the concept of curvature, and Regularity-based Functional Complexity based on the concept of Hölderian regularity. In general, both complexity measures appear to be poor indicators of program overfitting. However, results suggest that Regularity-based Functional Complexity could provide a good indication of overfitting in extreme cases.

Published in
Proceedings of the 14th European Conference on Genetic Programming
Volume 6621
Pages 262-273
Date of conference
27-29 Abril 2011