EPIA'03 - 11th Portuguese Conference on Artificial Intelligence
EKDB -- International Workshop on Extraction of Knowledge from Data Bases
Session: December 5, 10:00-11:30, Room A
Title:
Is the UCI Repository Useful for Data Mining?

Carlos Soares
Abstract:
We propose a methodology to determine whether the distribution of relative
performance of algorithms in repositories of benchmark data sets is the same
as in real-world data sets. A positive result is a necessary condition for the existence
of meta-knowledge for algorithm recommendation in repositories. We apply the
method to the UCI repository with positive results. We also propose an adaptation
of this method to test whether tool developers are "overfitting" repositories, which
yields negative results in the UCI repository.