Re: running mahout with lucene.vector produces a dictionary output only

Robin Anil
+mahout-user

On Thu, Oct 28, 2010 at 7:06 PM, Mackram <[hidden email]> wrote:

>
> Hey everyone,
>
> I have a simple question to ask and hopefully someone can point me in the
> right direction. I have a Solr setup with several documents in it (in the
> thousands). I have downloaded the latest mahout trunk and attempted to
> create vectors out of the solr index as is shown in the wiki by using the
> command
>
> "./bin/mahout lucene.vector --dir /usr/local/solr/data/index/ --field
> englishContent --idField id --dictOut ~/dict.vec --norm 2 --output
> ~/out.txt --max 50"
>
> The command runs and produces the file dict.vec in my case but no out.txt.
> I have no idea why this is and would really appreciate a pointer to what
> could be happening.
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/running-mahout-with-lucene-vector-produces-a-dictionary-output-only-tp1786351p1786351.html
> Sent from the Lucene - General mailing list archive at Nabble.com.
>

Re: running mahout with lucene.vector produces a dictionary output only

Lahiru Samarakoon
Dear All,

I am facing the same problem.
Please advise.

Thanks,
Lahiru


Re: running mahout with lucene.vector produces a dictionary output only

Lahiru Samarakoon
Guys, any suggestions?

Thanks,
Lahiru


Re: running mahout with lucene.vector produces a dictionary output only

Mackram
This post has NOT been accepted by the mailing list yet.
Hey Everyone,

I was the one who originally had the problem, and I would like to point out that there is actually no problem. Remember that unless you tell lucene.vector to output to a JSON file, the output is a Hadoop SequenceFile, which for some reason did not appear when I ran a normal local ls. However, if you run hadoop fsck, you will see that your files are there. Furthermore, running any clustering on those files/directories should be no issue, even if you do not see them.

Well, at least the above is what happened to me.
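
A rough way to check where an output path ended up, following the explanation above, is sketched here. The helper names `is_sequence_file` and `locate_output` are hypothetical and not part of Mahout or Hadoop. The sketch assumes that the output, if it exists locally, is a Hadoop SequenceFile (whose header begins with the bytes `SEQ`), and that `hadoop fs -ls` can reach the file system Hadoop is configured to use. That the missing file lives in Hadoop's file system rather than on the local disk is an inference from the post, not something confirmed in this thread.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Mahout or Hadoop.

# Succeeds if FILE exists locally and begins with "SEQ", the three magic
# bytes that open a Hadoop SequenceFile header.
is_sequence_file() {
    [ -f "$1" ] || return 1
    [ "$(head -c 3 "$1" 2>/dev/null)" = "SEQ" ]
}

# Prints where the given path was found: on the local disk first, then in
# the file system Hadoop is configured to use (only if hadoop is on PATH).
locate_output() {
    if [ -e "$1" ]; then
        if is_sequence_file "$1"; then
            echo "local sequencefile"
        else
            echo "local other"
        fi
    elif command -v hadoop >/dev/null 2>&1 && hadoop fs -ls "$1" >/dev/null 2>&1; then
        echo "hadoop"
    else
        echo "missing"
    fi
}
```

For example, `locate_output "$HOME/out.txt"` checks the output path from the original command; `$HOME` is used because `~` is not expanded inside double quotes.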

Re: running mahout with lucene.vector produces a dictionary output only

bauboni
This post has NOT been accepted by the mailing list yet.
Thank you sir, that was happening to me too.

hadoop fsck DIR -files