Re: How to configure Apache gora to take only ol as column family ?
This issue can be addressed by commenting out all of the instances where
the WebPage object is augmented within each job.
An example would be as follows:
https://github.com/apache/nutch/blob/2.x/src/java/org/apache/nutch/parse/ParseUtil.java#L358
You need to step through the entire codebase and comment out the code that
sets (and maybe gets) values on the WebPage object.
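
To find the places that need commenting out, a small scanner along these lines might help. It is only a sketch: it assumes the WebPage instance is usually held in a variable named "page" and is written through setters or through put() on a returned map, which you should confirm against the code you actually find. The function names are hypothetical helpers, not part of Nutch.

```python
import re
from pathlib import Path

# Assumption: WebPage instances are held in a variable named "page" and are
# mutated via setX(...) calls or via getX().put(...) on a returned map.
# Adjust the pattern if the code you are reading uses other names.
MUTATION = re.compile(r"\bpage\.(set\w+\(|get\w+\(\)\.put\()")


def find_mutations(text):
    """Return (line_number, stripped_line) pairs that look like WebPage writes."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if MUTATION.search(line):
            hits.append((number, line.strip()))
    return hits


def scan_tree(root):
    """Yield (path, line_number, line) for every .java file under root."""
    for path in sorted(Path(root).rglob("*.java")):
        for number, line in find_mutations(path.read_text(errors="replace")):
            yield path, number, line
```

Running scan_tree over the source directory and printing each hit gives a checklist to work through by hand. Plain getters are deliberately not matched, so reads of the WebPage object still need a manual look.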
The alternative option is to create a new WebPage schema with only
the outlinks data structure, then use the 'ant generate-gora-src' target to
regenerate the WebPage class.
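
As a rough illustration of trimming the schema, the sketch below filters an Avro record schema (plain JSON) down to a chosen set of fields. The field name "outlinks" comes from the description above; the helper names and any file paths you pass in are assumptions, so check where the schema file actually lives in your checkout before using it.

```python
import json


def keep_only_fields(schema, wanted):
    """Return a copy of an Avro record schema restricted to the wanted field names."""
    trimmed = dict(schema)
    trimmed["fields"] = [f for f in schema["fields"] if f["name"] in wanted]
    missing = set(wanted) - {f["name"] for f in trimmed["fields"]}
    if missing:
        # Fail loudly rather than silently writing an empty schema.
        raise KeyError(f"fields not found in schema: {sorted(missing)}")
    return trimmed


def trim_schema_file(src, dst, wanted=("outlinks",)):
    """Read the schema at src, keep only the wanted fields, write it to dst."""
    with open(src, encoding="utf-8") as handle:
        schema = json.load(handle)
    with open(dst, "w", encoding="utf-8") as handle:
        json.dump(keep_only_fields(schema, wanted), handle, indent=2)
        handle.write("\n")
```

Writing to a separate output file keeps the original schema around for comparison. The Gora mapping file for your storage backend presumably needs the same fields removed, but that is an assumption to verify rather than something this sketch handles.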
https://github.com/apache/nutch/blob/2.x/build.xml#L612-L623
You can then attempt to recompile the project and address each compile
error sequentially until all you have remaining is code pertaining to
> From: suyash singh <[hidden email]>
> To: [hidden email]
> Cc:
> Date: Tue, 14 Mar 2017 01:30:49 +0530
> Subject: Re: extract elements from each url as json and write it to s3
> I think you have to use a database like MongoDB. Write your custom Gora
> MongoDB mapping.xml and pass your JSON object to it.