It is built on top of other ItemSimilarity implementations and caches the
results of their similarity computations, so repeated lookups for the same
pair of items are not recomputed. To clear the cache for a given item, call
the clearCacheForItem(long itemID) method.
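To make the caching idea concrete, here is a minimal plain-Java sketch of a similarity wrapper that remembers results and can forget everything stored for one item. It is not Mahout's implementation: the names CachingSketch, PairSimilarity and CachingPairSimilarity are hypothetical, and the sketch assumes the wrapped similarity is symmetric and ignores the cache size limit.

```java
import java.util.HashMap;
import java.util.Map;

public class CachingSketch {

    /** Hypothetical stand-in for an item-similarity computation. */
    interface PairSimilarity {
        double similarity(long itemA, long itemB);
    }

    /** Wraps a delegate and remembers every result it has computed. */
    static class CachingPairSimilarity implements PairSimilarity {
        private final PairSimilarity delegate;
        private final Map<String, Double> cache = new HashMap<>();

        CachingPairSimilarity(PairSimilarity delegate) {
            this.delegate = delegate;
        }

        // Order the pair so (a, b) and (b, a) share one entry
        // (assumes the similarity is symmetric).
        private static String key(long a, long b) {
            return Math.min(a, b) + ":" + Math.max(a, b);
        }

        @Override
        public double similarity(long itemA, long itemB) {
            // Only call the delegate when the pair is not cached yet.
            return cache.computeIfAbsent(key(itemA, itemB),
                    k -> delegate.similarity(itemA, itemB));
        }

        /** Drops every cached pair that involves the given item. */
        void clearCacheForItem(long itemID) {
            String id = Long.toString(itemID);
            cache.keySet().removeIf(k -> {
                String[] parts = k.split(":");
                return parts[0].equals(id) || parts[1].equals(id);
            });
        }
    }

    public static void main(String[] args) {
        int[] delegateCalls = {0};
        // Toy similarity used only to count how often the delegate runs.
        PairSimilarity toy = (a, b) -> {
            delegateCalls[0]++;
            return 1.0 / (1 + Math.abs(a - b));
        };
        CachingPairSimilarity cached = new CachingPairSimilarity(toy);

        cached.similarity(4, 7);
        cached.similarity(7, 4);        // served from the cache
        System.out.println("delegate calls: " + delegateCalls[0]);

        cached.clearCacheForItem(4);    // forget all pairs involving item 4
        cached.similarity(4, 7);        // computed again
        System.out.println("delegate calls: " + delegateCalls[0]);
    }
}
```

Clearing the cache for one item is useful when that item's preferences have changed and its stored similarities may be stale, while the entries for all other items stay valid.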
Let's say I have the following input data.
customer.csv
1,4,3
1,7,2
1,8,2
1,10,1
2,3,2
2,4,3
2,6,3
2,7,1
2,9,1
3,0,3
3,3,2
3,4,1
3,8,3
3,9,1
4,2,5
4,3,4
4,7,3
4,9,2
5,4,5
5,6,4
5,7,1
5,8,3
Each line has the form customerID,itemID,rating. For example, 1,4,3 means
customer 1 likes item 4 and rated it 3.
import java.io.File;
import java.io.IOException;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.similarity.CachingItemSimilarity;
import org.apache.mahout.cf.taste.impl.similarity.TanimotoCoefficientSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;

public class CachingItemSimilarityEx {

    public static String dataFile = "/Users/harikrishna_gurram/customer.csv";

    public static void main(String[] args) throws IOException, TasteException {
        // Load the customer,item,rating records from the CSV file.
        DataModel model = new FileDataModel(new File(dataFile));

        // Underlying similarity whose results are cached.
        TanimotoCoefficientSimilarity similarity = new TanimotoCoefficientSimilarity(model);

        // Wrap it in a cache; 100 is the maximum cache size.
        CachingItemSimilarity cacheSimilarity = new CachingItemSimilarity(similarity, 100);

        long[] itemIds = { 3, 4, 6, 7, 8, 9, 10 };

        // Similarity of item 4 to each item in itemIds, in the same order.
        double[] distance = cacheSimilarity.itemSimilarities(4, itemIds);

        for (int i = 0; i < itemIds.length; i++) {
            System.out.println("distance between item 4 and " + itemIds[i] + " is " + distance[i]);
        }
    }
}
Output
distance between item 4 and 3 is 0.4
distance between item 4 and 4 is 1.0
distance between item 4 and 6 is 0.5
distance between item 4 and 7 is 0.6
distance between item 4 and 8 is 0.75
distance between item 4 and 9 is 0.4
distance between item 4 and 10 is 0.25
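Although the program labels them "distance", these values are similarities: 1.0 means the two items were rated by exactly the same customers. The Tanimoto coefficient ignores the rating values and only compares which customers rated each item: the number of customers who rated both items divided by the number who rated at least one. The plain-Java sketch below recomputes two of the values by hand from customer.csv. The class and method names are hypothetical and this is not Mahout code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class TanimotoByHand {

    /** Tanimoto coefficient: size of the intersection divided by size of the union. */
    static double tanimoto(Set<Long> a, Set<Long> b) {
        Set<Long> both = new HashSet<>(a);
        both.retainAll(b);                                // customers who rated both items
        int union = a.size() + b.size() - both.size();    // customers who rated either item
        // Returning 0.0 for two empty sets is a choice made for this sketch.
        return union == 0 ? 0.0 : (double) both.size() / union;
    }

    public static void main(String[] args) {
        // Customers who rated each item, read off customer.csv.
        Set<Long> item4 = new HashSet<>(Arrays.asList(1L, 2L, 3L, 5L));
        Set<Long> item3 = new HashSet<>(Arrays.asList(2L, 3L, 4L));
        Set<Long> item8 = new HashSet<>(Arrays.asList(1L, 3L, 5L));

        System.out.println(tanimoto(item4, item3)); // 2 shared / 5 total = 0.4
        System.out.println(tanimoto(item4, item8)); // 3 shared / 4 total = 0.75
    }
}
```

Items 4 and 3 share customers 2 and 3 out of five customers overall, giving 0.4, and items 4 and 8 share customers 1, 3 and 5 out of four, giving 0.75, which matches the output.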