PFP: Parallel frequent pattern mining in Mahout


I am not an expert in the Mahout library; I am still learning and reading about it. I am now working with PFP, but I am using just one computer, and the database fits in that machine's memory. I have the following questions:
1. How does PFP work on a single node, given that the parallel work is meant to be distributed across a cluster of nodes? What I want to know is what benefit the MapReduce method offers if I am using a single node.

2. One of the parameters is -g. What does it indicate?

3. If I run the default PFP as shown on the Apache website, without specifying the number of mappers and reducers, what will the defaults be?

4. What is the difference between the sequential and mapreduce methods?

5. If I run it on a single node, is the database split into shards, with each mapper processing one shard?

6. Please advise me on any literature that explains how this works on a single node.