Is Apache Hama good for decision trees?
I have implemented on Hadoop the framework (known as PLANET) that Google uses to build decision trees. It starts with a single vertex, and as the map/reduce jobs run, you add more and more vertices until the tree is fully built. A major problem is that many map/reduce jobs have to run one after the other, and starting new jobs all the time is very expensive.
I've read many times that Apache Hama is suitable for iterative algorithms such as graph algorithms. Can you build a new graph with Hama, or can you only feed in an existing graph and run computations on it? Would it be easy to port my project to Hama? Thanks
Yes, Hama can build decision trees more efficiently than MapReduce, using the algorithm described in the PLANET paper.
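To make the cost argument concrete, here is a rough sketch in plain Python (it does not use Hama's or Hadoop's API, and it is not the PLANET implementation) of growing a tree one level at a time. Each pass of the outer loop scans the data once for all open nodes; in a PLANET-style setup on Hadoop that pass would be a separate MapReduce job, whereas in a BSP framework it could be one superstep inside a single long-running job. The names `build_tree`, `route` and `predict`, the binary-feature restriction, and the Gini criterion are all assumptions made for the illustration.

```python
from collections import defaultdict


def gini(counts):
    """Gini impurity of a {label: count} histogram."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def route(tree, row):
    """Follow the splits decided so far and return the id of the node the row reaches."""
    node = 0
    while tree[node]["feature"] is not None:
        node = tree[node]["children"][row[tree[node]["feature"]]]
    return node


def predict(tree, row):
    """Majority label of the leaf that the row ends up in."""
    return tree[route(tree, row)]["label"]


def build_tree(rows, labels, max_depth=3):
    """Grow a tree over binary (0/1) features level by level.

    Returns (tree, passes), where passes counts the full scans of the data.
    """
    tree = {0: {"feature": None, "children": None, "label": None, "depth": 0}}
    frontier = {0}  # nodes still waiting for a split decision
    passes = 0
    n_features = len(rows[0])
    while frontier:
        passes += 1  # one MapReduce job per pass on Hadoop; one superstep under BSP
        # "Map" side: a single scan gathers label histograms for every open node.
        # stats[node][feature][value][label] -> count
        stats = defaultdict(
            lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
        )
        totals = defaultdict(lambda: defaultdict(int))
        for row, y in zip(rows, labels):
            node = route(tree, row)
            if node not in frontier:
                continue  # row sits in a finished leaf
            totals[node][y] += 1
            for f in range(n_features):
                stats[node][f][row[f]][y] += 1
        # "Reduce" side: pick the best split for each open node.
        next_frontier = set()
        for node in sorted(frontier):
            hist = totals[node]
            n = sum(hist.values())
            tree[node]["label"] = max(hist, key=hist.get) if hist else None
            best_gain, best_f = 0.0, None
            if n > 0 and tree[node]["depth"] < max_depth:
                for f in range(n_features):
                    weighted = sum(
                        sum(stats[node][f][v].values()) / n * gini(stats[node][f][v])
                        for v in (0, 1)
                    )
                    gain = gini(hist) - weighted
                    if gain > best_gain + 1e-12:
                        best_gain, best_f = gain, f
            if best_f is not None:
                left, right = len(tree), len(tree) + 1
                for child in (left, right):
                    tree[child] = {
                        "feature": None,
                        "children": None,
                        "label": None,
                        "depth": tree[node]["depth"] + 1,
                    }
                tree[node]["feature"] = best_f
                tree[node]["children"] = (left, right)  # indexed by feature value 0/1
                next_frontier.update((left, right))
        frontier = next_frontier
    return tree, passes
```

Calling `build_tree(rows, labels)` returns the tree together with the number of passes it needed. The number of passes grows with the depth of the tree, not with the number of nodes, so the fixed startup cost of a job is paid once per level when every pass is its own job. That per-level cost is what keeping the loop inside one job avoids.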
Hama does not require graphs as input. You can check out the Hama ML (Machine Learning) module, which typically processes raw feature vectors read directly from HDFS.
I created a new issue in the Apache Jira to track the progress of the algorithm.