I have the following case where numItems = 1,000,000, prefs1Size = 900,000 and prefs2Size = 100.
This is the case of two users: one has seen 90% of the movies in the database, the other only 100 of the million movies. If they have 90 movies in common (user 2 has seen only 100 movies in total), I would expect the similarity to be higher than when they have only 10 movies in common. But the similarities I am getting are
0.9971 for intersection size 10 and
0 for intersection size 90.
This seems counterintuitive.
Am I missing something? Is there an explanation for the values above?
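
For reference, here is a minimal sketch of the arithmetic behind the case above. It assumes the similarity is a log-likelihood ratio (G-test) on the 2x2 table of seen/unseen counts, mapped to a score as 1 - 1/(1 + LLR). That is an assumption: the post does not name the measure, and the function names below are hypothetical. Under this assumption the score reflects how far the overlap deviates from what independence would predict, and with these sizes the overlap expected by chance is 100 * 0.9 = 90.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """G-test statistic for a 2x2 contingency table: 2 * sum(O * ln(O / E))."""
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                e = rows[i] * cols[j] / total  # count expected under independence
                g += o * math.log(o / e)
    return max(0.0, 2.0 * g)


def similarity(num_items, prefs1_size, prefs2_size, intersection_size):
    """Assumed mapping from the ratio to a score in [0, 1): 1 - 1 / (1 + LLR)."""
    llr = log_likelihood_ratio(
        intersection_size,                                          # seen by both
        prefs2_size - intersection_size,                            # only user 2
        prefs1_size - intersection_size,                            # only user 1
        num_items - prefs1_size - prefs2_size + intersection_size,  # neither
    )
    return 1.0 - 1.0 / (1.0 + llr)


# The two cases from the post.
print(similarity(1_000_000, 900_000, 100, 10))
print(similarity(1_000_000, 900_000, 100, 90))
```

With these inputs the sketch gives roughly 0.997 for an intersection of 10 and 0 for an intersection of 90, in line with the values reported above. Since 90 is exactly the chance-level overlap, the statistic is zero there, while an overlap of 10 is a large deviation (in the negative direction) and scores high.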