EPIA'03 - 11th Portuguese Conference on Artificial Intelligence

EKDB -- International Workshop on Extraction of Knowledge from Data Bases


Session: December 5, 10:00-11:30, Room A
Title: Is the UCI Repository Useful for Data Mining?
Carlos Soares
Abstract: We propose a methodology to determine whether the distribution of relative performance of algorithms in repositories of benchmark data sets is the same as in real-world data sets. A positive result is a necessary condition for the existence of meta-knowledge for algorithm recommendation in repositories. We apply the method to the UCI repository with positive results. We also propose an adaptation of this method to test whether tool developers are "overfitting" repositories, which yields negative results for the UCI repository.