Sunday 16 August 2015


Apache Mahout is an open source project of the Apache Software Foundation that produces free implementations of distributed or scalable machine learning algorithms, focused primarily on collaborative filtering, clustering and classification. "Mahout" is a Hindi word for an elephant driver.

In this post, I introduce these three areas (collaborative filtering, clustering and classification) in brief; later posts in this tutorial series explain them in detail.

Collaborative filtering

Collaborative filtering is a technique used by recommender systems. You can see it at work on e-commerce sites, which show you recommendations while you shop.

Most e-commerce sites use collaborative filtering to recommend products to users based on their past behavior, combined with the behavior of other users who have similar tastes. News web sites do the same when they suggest related stories based on the articles a user has already read.
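To make the idea concrete, here is a minimal sketch of user-based collaborative filtering in plain Python. It does not use Mahout's API; the users, items, ratings and function names are all made up for illustration. It scores the items a user has not rated yet by the ratings of other users, weighted by how similar those users are.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"book": 5, "lamp": 3, "pen": 4},
    "bob": {"book": 5, "lamp": 3, "mug": 5},
    "carol": {"lamp": 1, "pen": 5, "mug": 2},
}


def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = sqrt(sum(value ** 2 for value in a.values()))
    norm_b = sqrt(sum(value ** 2 for value in b.values()))
    return dot / (norm_a * norm_b)


def recommend(all_ratings, user):
    """Return (item, predicted rating) pairs for unrated items, best first."""
    own = all_ratings[user]
    weighted_sum = {}
    similarity_sum = {}
    for other, other_ratings in all_ratings.items():
        if other == user:
            continue
        similarity = cosine_similarity(own, other_ratings)
        if similarity <= 0:
            continue
        for item, rating in other_ratings.items():
            if item in own:
                continue  # only score items the user has not rated yet
            weighted_sum[item] = weighted_sum.get(item, 0.0) + similarity * rating
            similarity_sum[item] = similarity_sum.get(item, 0.0) + similarity
    predictions = [
        (item, weighted_sum[item] / similarity_sum[item]) for item in weighted_sum
    ]
    return sorted(predictions, key=lambda pair: pair[1], reverse=True)


print(recommend(ratings, "alice"))
```

In this toy data the only item alice has not rated is the mug, so it is the single recommendation, with a predicted rating that sits between bob's and carol's ratings for it.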

Clustering

Clustering comes under the unsupervised learning category. It is used to uncover hidden relations in huge data sets: it takes unlabeled data as input and groups similar items into clusters, based on their properties.

For example, Google News groups related news articles using clustering.
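As a rough illustration of grouping data by its properties, here is a small k-means sketch in plain Python. k-means is just one common clustering algorithm, picked here as an example. The points are invented, and taking the first k points as starting centroids is a simplifying assumption of mine, not how a production library would initialize.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns (labels, centroids)."""
    # Assumption: naive initialization with the first k points.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical 2-D data: three points near (1, 1), three near (8.5, 9).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

Notice that no labels are supplied anywhere: the two groups emerge only from the distances between the points, which is what makes this unsupervised.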

Classification

Classification comes under the supervised learning category. Here we train the system with a large number of labeled samples, and the system then predicts the label of new data based on what it learned from them. The quality of the results depends on the training samples.

For example, mail systems detect spam messages using this approach.
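The spam example can be sketched with a tiny naive Bayes classifier in plain Python. Naive Bayes is one common choice for text classification, and I use it here only as an illustration. The training messages, labels and function names below are hypothetical, and a real filter would need far more data.

```python
from collections import Counter
from math import log

# Hypothetical labeled training samples: (message text, label).
training = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch with the project team", "ham"),
]


def train(samples):
    """Count words per label; returns (word_counts, label_counts, vocabulary)."""
    word_counts = {}
    label_counts = Counter()
    vocabulary = set()
    for text, label in samples:
        words = text.split()
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
        vocabulary.update(words)
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_samples = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        # Prior: how common this label is in the training samples.
        score = log(count / total_samples)
        words_in_label = sum(word_counts[label].values())
        for word in text.split():
            # Laplace (add-one) smoothing so unseen words do not zero the score.
            score += log(
                (word_counts[label][word] + 1)
                / (words_in_label + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(training)
print(classify("claim free money", model))
print(classify("project meeting monday", model))
```

Unlike the clustering case, every training message carries a label here, and the predictions are only as good as those labeled samples.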
