Zeno's Notes


Posts tagged KDD


Data Mining Competitions: They Are Very, Very Useful

Note: Read on if you are interested in data analysis, machine learning, or recommender systems.

At this year’s KDD conference there was, as every year, a workshop on the KDD Cup (at which I was a participant). In addition, and even more interestingly, there was a panel about data mining competitions.

Neal Lathia wrote a really nice and thought-provoking post about this panel discussion, and shared some of his opinions about the topic. I had a different view on some of the things he said, and wanted to write a comment on his blog. After I saw that the comment would be quite long, I decided to turn it into a proper blog post.


Filed under: kdd, kdd2011, kddcup, competition, challenge, prize, data mining, machine learning, recommender systems, science, engineering, academics, data analysis, predictive analysis


C5.0 and Cubist: GPL-licensed decision tree implementations

Ross Quinlan received the SIGKDD Innovation Award at KDD 2011 in San Diego.

Quinlan is well-known for his work on decision tree learning, in particular for developing the C4.5 algorithm and its successor, C5.0.

He also runs a company, RuleQuest Research, which sells tools and services related to his inventions.

[Photo: KDD 2011 Opening Session]

At the award session I found out that the single-threaded Linux versions of C5.0 (for classification) and Cubist (for regression) are available under the terms of the GNU General Public License, that is, they are free software. Nice! You can download them here.

Apart from having to install csh to build the programs, installation went without problems. They do not seem to be packaged for Debian yet, though. Any volunteers?

PS: The photo above was taken by Markus Weimer. Click on it to get to his Flickr photostream.

Filed under: KDD, kdd2011, data mining, machine learning, free software, GPL, GNU, open source, debian, decision tree, c4.5, c5.0, regression, classification