[jira] Created: (MAHOUT-213) storeMapping should not been called when toLongID() is called

11 messages

[jira] Created: (MAHOUT-213) storeMapping should not been called when toLongID() is called

Tim Allison (Jira)
storeMapping should not been called when toLongID() is called
-------------------------------------------------------------

                 Key: MAHOUT-213
                 URL: https://issues.apache.org/jira/browse/MAHOUT-213
             Project: Mahout
          Issue Type: Improvement
          Components: Collaborative Filtering
    Affects Versions: 0.4
            Reporter: Jeff Zhang
             Fix For: 0.4


In trunk, storeMapping() is always called whenever toLongID() is called. In my opinion, storeMapping() should be called only in initialize().
storeMapping() is expensive when a database is used to store the ID mapping. I believe the code should look like this:

{code}
  public void initialize(Iterable<String> stringIDs) throws TasteException {
    for (String stringID : stringIDs) {
      long longID = hash(stringID);
      storeMapping(longID, stringID);
    }
  }
{code}


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (MAHOUT-213) storeMapping should not been called when toLongID() is called

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/MAHOUT-213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang updated MAHOUT-213:
------------------------------

    Attachment: Mahout_213.patch

[jira] Updated: (MAHOUT-213) storeMapping should not been called when toLongID() is called

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/MAHOUT-213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang updated MAHOUT-213:
------------------------------

    Status: Patch Available  (was: Open)

[jira] Updated: (MAHOUT-213) storeMapping should not been called when toLongID() is called

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/MAHOUT-213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang updated MAHOUT-213:
------------------------------

    Attachment:     (was: Mahout_213.patch)

[jira] Commented: (MAHOUT-213) storeMapping should not been called when toLongID() is called

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/MAHOUT-213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787365#action_12787365 ]

Jeff Zhang commented on MAHOUT-213:
-----------------------------------

And the toLongID() method should look like this:


{code}
  @Override
  public long toLongID(String stringID) throws TasteException {
    long longID = hash(stringID);
    return longID;
  }
{code}

[jira] Updated: (MAHOUT-213) storeMapping should not been called when toLongID() is called

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/MAHOUT-213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang updated MAHOUT-213:
------------------------------

    Attachment: Mahout_213.patch

Attaching the patch.

[jira] Updated: (MAHOUT-213) storeMapping should not been called when toLongID() is called

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/MAHOUT-213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Zhang updated MAHOUT-213:
------------------------------

    Status: Open  (was: Patch Available)

[jira] Updated: (MAHOUT-213) storeMapping should not been called when toLongID() is called

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/MAHOUT-213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated MAHOUT-213:
-----------------------------

             Priority: Minor  (was: Major)
    Affects Version/s:     (was: 0.4)
                       0.3
        Fix Version/s:     (was: 0.4)
             Assignee: Sean Owen

Hmm, what problem is this solving? The idea is that the mapping must remember any String-to-long conversion it's done. initialize() is just a convenience method. In this model, all of the strings must be known upfront or else this won't work.
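
To make the trade-off concrete, here is a minimal sketch of the "remember every conversion" model described above. It is not Mahout's code: the class name, the in-memory map, the store-call counter, and the MD5-based hash are all stand-ins chosen for illustration. It shows that the reverse lookup works without any upfront initialization, and that every toLongID() call pays for a store.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not Mahout's IDMigrator: it stores a mapping on every conversion.
public class EagerMigratorSketch {

  private final Map<Long, String> longToString = new HashMap<>();
  private int storeCalls = 0;

  // Stand-in hash (assumption): the first 8 bytes of the MD5 digest, packed into a long.
  public static long hash(String value) {
    try {
      byte[] digest = MessageDigest.getInstance("MD5")
          .digest(value.getBytes(StandardCharsets.UTF_8));
      long result = 0L;
      for (int i = 0; i < 8; i++) {
        result = (result << 8) | (digest[i] & 0xFFL);
      }
      return result;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  // In a database-backed migrator this would be a write; here it is only counted.
  public void storeMapping(long longID, String stringID) {
    storeCalls++;
    longToString.put(longID, stringID);
  }

  // Every conversion is remembered, so the mapping can always be reversed later.
  public long toLongID(String stringID) {
    long longID = hash(stringID);
    storeMapping(longID, stringID);
    return longID;
  }

  public String toStringID(long longID) {
    return longToString.get(longID);
  }

  public int getStoreCalls() {
    return storeCalls;
  }

  public static void main(String[] args) {
    EagerMigratorSketch migrator = new EagerMigratorSketch();
    long id = migrator.toLongID("alice");
    migrator.toLongID("alice"); // same string again, stored a second time
    System.out.println(migrator.toStringID(id));    // prints "alice"
    System.out.println(migrator.getStoreCalls());   // prints "2"
  }
}
```

The repeated store for the same string is the overhead the reporter objects to; the benefit is that no string has to be known ahead of time.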

[jira] Commented: (MAHOUT-213) storeMapping should not been called when toLongID() is called

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/MAHOUT-213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787390#action_12787390 ]

Jeff Zhang commented on MAHOUT-213:
-----------------------------------

For example, toLongID() is public and will be called by users many times, while storeMapping() should be called only once for each ID.

Here is a code snippet showing where I call toLongID() in my HBaseDataModel implementation:

{code}
  public FastIDSet getItemIDsFromUser(long userID) throws TasteException {
    Get get = new Get(Bytes.toBytes(userID));
    try {
      Result user = userTable.get(get);
      NavigableMap<byte[], byte[]> map = user.getNoVersionMap().get(ItemFamily);
      FastIDSet set = new FastIDSet();
      for (Map.Entry<byte[], byte[]> entry : map.entrySet()) {
        String uID = Bytes.toString(entry.getValue());
        set.add(userIDMigrator.toLongID(uID));
      }
      return set;
    } catch (IOException e) {
      throw new TasteException(e);
    }
  }
{code}

[jira] Resolved: (MAHOUT-213) storeMapping should not been called when toLongID() is called

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/MAHOUT-213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved MAHOUT-213.
------------------------------

       Resolution: Fixed
    Fix Version/s: 0.3

[jira] Commented: (MAHOUT-213) storeMapping should not been called when toLongID() is called

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/MAHOUT-213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787432#action_12787432 ]

Sean Owen commented on MAHOUT-213:
----------------------------------

I can buy this. The caller now has to make sure the IDMigrator has seen all possible strings ahead of time. I've documented this. I agree that in some cases (like JDBC-backed storage) the storage overhead is high.
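
As a rough illustration of what this means for callers, the sketch below stores mappings only in initialize(), so toLongID() is a pure hash. Everything in it is hypothetical: the class name, the in-memory map, and the hash (a widened String.hashCode()) are assumptions made for the example, not Mahout's implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not Mahout's IDMigrator: mappings are stored only in initialize().
public class InitializeOnlyMigratorSketch {

  private final Map<Long, String> longToString = new HashMap<>();

  // Stand-in hash for illustration only (assumption); Mahout's own hash() is not reproduced here.
  public static long hash(String stringID) {
    return stringID.hashCode() & 0xFFFFFFFFL;
  }

  // The only place where mappings are stored.
  public void initialize(Iterable<String> stringIDs) {
    for (String stringID : stringIDs) {
      longToString.put(hash(stringID), stringID);
    }
  }

  // Pure computation: no storage access, so it is cheap to call many times.
  public long toLongID(String stringID) {
    return hash(stringID);
  }

  // Returns null for any ID whose string was never passed to initialize().
  public String toStringID(long longID) {
    return longToString.get(longID);
  }

  public static void main(String[] args) {
    InitializeOnlyMigratorSketch migrator = new InitializeOnlyMigratorSketch();
    migrator.initialize(Arrays.asList("alice", "bob"));
    System.out.println(migrator.toStringID(migrator.toLongID("alice"))); // prints "alice"
    System.out.println(migrator.toStringID(migrator.toLongID("carol"))); // prints "null"
  }
}
```

The second lookup comes back empty because "carol" was never passed to initialize(), which is the case callers now have to rule out themselves.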
