Java – Mahout: To read a custom input file

I was playing with Mahout and found that FileDataModel accepts data in the following format:

     userId,itemId,pref (long,long,double)

I have some data in the format:

     String,long,double

What is the best/easiest way to use this dataset on Mahout?

Solution

One way is to create an extension of FileDataModel. You need to override the readUserIDFromString(String value) method and have it use some kind of parser to convert the string ID to a long. You can use one of the IDMigrator implementations, as Sean suggested.

For example, suppose you have an initialized MemoryIDMigrator. Then you can do this:

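@Override
protected long readUserIDFromString(String stringID) {
    // Hash the string user ID to a long
    long result = memoryIDMigrator.toLongID(stringID);
    // Remember the pair so the long can be mapped back to the string later
    memoryIDMigrator.storeMapping(result, stringID);
    return result;
}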

This way you can also use memoryIDMigrator for the reverse mapping, from the long ID back to the original string. If you don't need that, you can simply hash the string the same way their implementation does (in AbstractIDMigrator).
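To make the hashing and reverse-mapping idea concrete, here is a minimal sketch that uses only the JDK. StringIdMapper and its convertLine helper are hypothetical names and are not part of Mahout. The hash takes the first 8 bytes of an MD5 digest, which, as far as I recall, is similar to what AbstractIDMigrator does. Treat that detail as an assumption and check the Mahout source before relying on identical IDs.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an ID migrator; this is NOT a Mahout class.
class StringIdMapper {
    // Reverse mapping from the hashed long ID back to the original string.
    private final Map<Long, String> longToString = new HashMap<>();

    // Hash a string ID to a long: first 8 bytes of its MD5 digest, big-endian.
    // (Assumed to resemble Mahout's approach; not guaranteed to match it.)
    long toLongID(String stringID) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(stringID.getBytes(StandardCharsets.UTF_8));
            long hash = 0L;
            for (int i = 0; i < 8; i++) {
                hash = (hash << 8) | (digest[i] & 0xFFL);
            }
            return hash;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Record the pair so the long ID can be translated back later.
    void storeMapping(long longID, String stringID) {
        longToString.put(longID, stringID);
    }

    // Returns the original string, or null if the ID was never stored.
    String toStringID(long longID) {
        return longToString.get(longID);
    }

    // Convert one "String,long,double" line into "long,long,double".
    String convertLine(String line) {
        String[] parts = line.split(",", 2);
        String user = parts[0].trim();
        long userID = toLongID(user);
        storeMapping(userID, user);
        return userID + "," + parts[1].trim();
    }
}
```

The same object can later translate recommendation results back to the original string IDs through toStringID. Because the hash keeps only 64 bits of the digest, collisions are possible in principle, though unlikely for typical dataset sizes.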
