PFP: Parallel frequent pattern mining in Mahout

Noor
This post has NOT been accepted by the mailing list yet.
I am not an expert in the Mahout library; I am trying to learn by reading. I am now working with PFP, but I am using just one computer, and the database fits in memory. I have the following questions:
1. How does PFP work on a single node, given that a parallel algorithm is meant to be distributed across a cluster of nodes? What I want to know is whether there is any benefit to using the mapreduce method when I have only a single node.

2. One of the parameters is -g. What does it indicate?

3. If I run the default PFP as shown on the Apache website, without specifying the number of mappers and reducers, what will the defaults be?
https://cwiki.apache.org/confluence/display/MAHOUT/Parallel+Frequent+Pattern+Mining

4. What is the difference between the sequential and mapreduce methods?

5. If I run it on a single node, is the database split into shards, with each mapper processing one shard?

6. Please advise me on any literature that explains how this works on a single node.