Zeno's Notes


Posts tagged classification


Large Scale Machine Learning and Other Animals: Steffen Rendle on Factorization Machines

Steffen and his method/software finally get some well-deserved attention.

Factorization Machines (FMs) are essentially polynomial prediction models with factorized interaction weights, and they can be used for regression, classification, and ranking.

They work particularly well for applications like recommendation, where the input data is sparse and many feature combinations seen at prediction time (e.g. user-item pairs) are never observed during training.
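To make this a bit more concrete, here is a minimal sketch of how a second-order FM scores one sparse input. It assumes the standard model form (global bias, linear weights, and pairwise interactions weighted by dot products of per-feature factor vectors). The names `fm_predict`, `w0`, `w`, and `V` are my own choices for illustration and are not taken from Steffen's software.

```python
def fm_predict(x, w0, w, V):
    """Score one sparse example with a second-order factorization machine.

    x  : dict mapping feature index -> value (only non-zero entries)
    w0 : global bias
    w  : list of linear weights, one per feature
    V  : list of factor vectors, one k-dimensional list per feature
    """
    k = len(V[0])
    linear = sum(w[i] * xi for i, xi in x.items())

    # Pairwise term, computed per factor dimension as
    # 0.5 * ((sum_i v_if * x_i)^2 - sum_i (v_if * x_i)^2),
    # which equals sum_{i<j} <v_i, v_j> * x_i * x_j without the double loop.
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * xi for i, xi in x.items())
        s_sq = sum((V[i][f] * xi) ** 2 for i, xi in x.items())
        pairwise += 0.5 * (s * s - s_sq)

    return w0 + linear + pairwise


# Toy parameters (made up for illustration): 3 features, 2 factors.
w0 = 0.5
w = [0.1, 0.2, 0.3]
V = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
score = fm_predict({0: 1.0, 2: 1.0}, w0, w, V)
```

Because the interaction weight for a pair of features is a dot product of their factor vectors, the model can still produce a sensible weight for a pair that never co-occurred in training, as long as each feature was seen with something.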

The cool thing is that you can mimic many advanced factorization models just by feature engineering for FMs. That means you can reuse the existing training algorithms: there is no need to derive and implement a new algorithm for a new prediction problem…
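As a small illustration of the feature-engineering idea, the sketch below one-hot encodes a (user, item) pair. With exactly two active features, the FM's pairwise term reduces to the dot product of the user's and the item's factor vectors, which is the interaction term of plain matrix factorization. The helper names and the toy factor values are hypothetical and only meant to show the mechanics.

```python
from itertools import combinations


def encode_user_item(user, item, n_users):
    """One-hot encode a (user, item) pair as a sparse feature dict.

    Users occupy indices [0, n_users); items follow after them.
    """
    return {user: 1.0, n_users + item: 1.0}


def pairwise_term(x, V):
    """Naive second-order FM term: sum over i<j of <V[i], V[j]> * x_i * x_j."""
    total = 0.0
    for i, j in combinations(sorted(x), 2):
        dot = sum(a * b for a, b in zip(V[i], V[j]))
        total += dot * x[i] * x[j]
    return total


# Toy setup (made up): 2 users, 3 items, 2 factors per feature.
n_users = 2
V = [[1.0, 0.0], [0.5, 0.5],               # user factors
     [2.0, 1.0], [0.0, 3.0], [1.0, 1.0]]   # item factors

x = encode_user_item(user=1, item=2, n_users=n_users)
interaction = pairwise_term(x, V)  # dot product of user 1 and item 2 factors
```

Adding more indicator or real-valued features to the same dict (for example, context or side information) changes the model being mimicked without touching the learning code.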

Filed under: factorization machines, FM, machine learning, data mining, recsys, recommender system, matrix factorization, regression, ranking, classification, personalization, kaggle, KDD Cup 2012


C5.0 and Cubist: GPL-licensed decision tree implementations

Ross Quinlan received the SIGKDD Innovation Award at KDD 2011 in San Diego.

Quinlan is well known for his work on decision tree learning, in particular for developing the C4.5 algorithm and its successor, C5.0.

He also runs a company, RuleQuest Research, that sells tools and services related to his inventions.

KDD 2011 Opening Session

At the award session I found out that the single-threaded Linux versions of C5.0 (for classification) and Cubist (for regression) are available under the terms of the GNU General Public License, that is, they are free software. Nice! You can download them here.

Apart from having to install csh to build the programs, installation went smoothly. They do not seem to be packaged for Debian yet, though. Any volunteers?

PS: The photo above was taken by Markus Weimer. Click on it to get to his flickr photostream.

Filed under: KDD, kdd2011, data mining, machine learning, free software, GPL, GNU, open source, debian, decision tree, c4.5, c5.0, regression, classification