Java – Lanczos – feature vector in Mahout

I’m trying out machine learning with Java and Mahout. I have all the data I want downloaded into MySQL. Where I’m stuck is the point where my “SparseRowMatrix” variable is ready, with all the calculations and rearrangements done. I simply don’t understand how to call either of the two classes that look suitable:

1) org.apache.mahout.math.decomposer.lanczos.LanczosSolver

2) org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver

Any suggestions would help at this point!

Solution

DistributedLanczosSolver implements Hadoop’s Tool interface, so you can run it as a regular Hadoop job, for example:

hadoop jar $MAHOUT_HOME/mahout-examples-0.5-job.jar org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver --input /path/to/input --output /path/to/output --numCols 42 --numRows 42 --cleansvd "true" --rank 5

You can also call it directly from Java using the following method:

ToolRunner.run(new DistributedLanczosSolver().job(), args);
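
The args array in that call carries the same flags as the command line above. As a rough sketch (the LanczosArgs class and its build method are hypothetical helpers I made up for illustration, not part of Mahout), you could assemble it like this. The ToolRunner call itself is left as a comment because it needs Hadoop and Mahout on the classpath:

```java
import java.util.ArrayList;
import java.util.List;

public class LanczosArgs {

    // Hypothetical helper: builds the flag/value pairs used on the command line above.
    public static String[] build(String input, String output,
                                 int numRows, int numCols, int rank) {
        List<String> args = new ArrayList<>();
        args.add("--input");
        args.add(input);
        args.add("--output");
        args.add(output);
        args.add("--numCols");
        args.add(Integer.toString(numCols));
        args.add("--numRows");
        args.add(Integer.toString(numRows));
        args.add("--cleansvd");
        args.add("true");
        args.add("--rank");
        args.add(Integer.toString(rank));
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        String[] args = build("/path/to/input", "/path/to/output", 42, 42, 5);
        System.out.println(String.join(" ", args));
        // With Hadoop and Mahout on the classpath, hand the array to the job:
        // int exitCode = ToolRunner.run(new DistributedLanczosSolver().job(), args);
    }
}
```

ToolRunner.run returns the job's exit code as an int, so it is worth checking that value rather than ignoring it.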

Or, if you don’t need to run this in a distributed fashion, the LanczosSolver.solve method is the one you’re looking for. You pass it the input matrix together with the objects that will receive the computed eigenvectors and eigenvalues. It uses the Lanczos algorithm to do something complicated behind the scenes that I can’t explain, so I suggest checking the source code directly for more clarity.
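
For a rough idea of what a Lanczos iteration does, here is a minimal sketch in plain Java. To be clear, this is not Mahout's code: the LanczosSketch class and its method names are my own, it assumes a small dense symmetric matrix, and it skips the reorthogonalization and scaling that a production solver needs. It reduces the matrix to a small tridiagonal matrix (diagonal alpha, off-diagonal beta) whose eigenvalues approximate those of the original:

```java
import java.util.Arrays;

public class LanczosSketch {

    /**
     * Runs up to `steps` Lanczos iterations on the symmetric matrix `a`,
     * starting from the vector `start` (must be non-zero).
     * Returns {alpha, beta}: the diagonal and off-diagonal of the tridiagonal matrix.
     */
    public static double[][] tridiagonalize(double[][] a, double[] start, int steps) {
        int n = a.length;
        double[] alpha = new double[steps];
        double[] beta = new double[Math.max(steps - 1, 0)];
        double[] vPrev = new double[n];
        double[] v = scale(start, 1.0 / Math.sqrt(dot(start, start)));
        double betaPrev = 0.0;

        for (int j = 0; j < steps; j++) {
            double[] w = multiply(a, v);
            // Remove the component along the previous basis vector.
            for (int i = 0; i < n; i++) {
                w[i] -= betaPrev * vPrev[i];
            }
            alpha[j] = dot(w, v);
            // Remove the component along the current basis vector.
            for (int i = 0; i < n; i++) {
                w[i] -= alpha[j] * v[i];
            }
            if (j == steps - 1) {
                break;
            }
            double norm = Math.sqrt(dot(w, w));
            if (norm < 1e-12) {
                // Breakdown: the basis already spans an invariant subspace, stop early.
                return new double[][] {Arrays.copyOf(alpha, j + 1), Arrays.copyOf(beta, j)};
            }
            beta[j] = norm;
            vPrev = v;
            v = scale(w, 1.0 / norm);
            betaPrev = norm;
        }
        return new double[][] {alpha, beta};
    }

    static double dot(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    static double[] scale(double[] x, double factor) {
        double[] result = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            result[i] = x[i] * factor;
        }
        return result;
    }

    static double[] multiply(double[][] a, double[] x) {
        double[] result = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            result[i] = dot(a[i], x);
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] a = {{4, 1, 2}, {1, 3, 0}, {2, 0, 1}};
        double[][] t = tridiagonalize(a, new double[] {1, 0, 0}, 3);
        System.out.println("alpha = " + Arrays.toString(t[0]));
        System.out.println("beta  = " + Arrays.toString(t[1]));
    }
}
```

The point of the reduction is that the tridiagonal matrix is much smaller and cheaper to decompose than the original, which is why the approach suits large sparse matrices: the only operation needed on the big matrix is a matrix-vector product.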