What's the status of Nutch-GUI?

What's the status of Nutch-GUI?

scott green
Hi

Is nutch-gui dead? Why can't I find any source in the svn repo?

Re: What's the status of Nutch-GUI?

Sami Siren-2
scott green wrote:
> Hi
>
> Is nutch-gui dead? why i cannot find any source in svn repo?

Unfortunately the sources for the admin gui never got into svn. It would
be great if someone could pick it up and bring it up to date to get it
integrated.

--
  Sami Siren


Re: What's the status of Nutch-GUI?

scott green
In reply to this post by scott green
On 11/21/06, scott green <[hidden email]> wrote:
> Hi
>
> Is nutch-gui dead? why i cannot find any source in svn repo?
>
I mean nutch-admin GUI.

Re: What's the status of Nutch-GUI?

chrismattmann
In reply to this post by Sami Siren-2
Hi Sami and Scott,

 This is on my TO-DO list as one of the items that I will begin working on
getting into the sources as a committer. Additionally, I plan on integrating
and testing the parse-xml plugin into the source tree. As soon as I get my
Apache account and SVN access, I will start working on this.

Thanks!

Cheers,
  Chris




RE: What's the status of Nutch-GUI?

Armel T. Nene-2
Hi Chris,

I am trying to extend parse-xml so that it can create Lucene fields
directly from an XML file. For example, a database table that has been
parsed into an XML file should be stored in the index with the relevant
fields, e.g. customer name, address and so on. This file will not have a
namespace associated with it and should not be stored as "xmlcontent" in
the database. Currently, parse-xml looks for known fields in the document
and stores the associated values under the field name. I have added an
extra condition: if the known fields are not present in the current
document, each element or node in the document becomes a new field in the
index, stored with its value.

Therefore, when parse-xml receives an XML document with no namespace
available, it will parse the document and store each element name as a new
field in the index, together with the element's value.

Let me know if I am on the right track. As far as I know, I don't have to
write a separate plugin for this feature; extending (or modifying)
parse-xml should be enough.

Cheers,

Armel
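
The idea described above (each element name becomes a field, with the element's text as its value) can be sketched with the JDK's own DOM parser. This is only an illustration under stated assumptions: it does not use parse-xml or Lucene, the class and method names (XmlFieldSketch, extractFields) are hypothetical, and it treats only leaf elements as fields.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: not part of parse-xml, Nutch, or Lucene.
final class XmlFieldSketch {

    /** Maps each leaf element name to the text values found under that name. */
    static Map<String, List<String>> extractFields(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Map<String, List<String>> fields = new LinkedHashMap<>();
            collect(doc.getDocumentElement(), fields);
            return fields;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Walks the tree; elements without child elements become fields. */
    private static void collect(Element element, Map<String, List<String>> fields) {
        NodeList children = element.getChildNodes();
        boolean hasChildElement = false;
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElement = true;
                collect((Element) child, fields);
            }
        }
        if (!hasChildElement) {
            String value = element.getTextContent().trim();
            if (!value.isEmpty()) {
                // A repeated element name appends another value to the same field.
                fields.computeIfAbsent(element.getTagName(), k -> new ArrayList<>())
                      .add(value);
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<customer><name>Ada</name><address>1 Main St</address></customer>";
        System.out.println(extractFields(xml));
    }
}
```

In a real plugin the resulting map would presumably be turned into index fields; how parse-xml does that is not shown in this thread.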



Re: What's the status of Nutch-GUI?

chrismattmann
Hi Armel,

On 11/20/06 1:44 PM, "Armel T. Nene" <[hidden email]> wrote:

> Hi Chris,
>
> I am trying to extend parse-xml to enable the creation of lucene fields
> straight from an xml file. For example, a database table that has been parse
> as an XML file should be stored in the index with the relevant fields, i.e.
> customer name, address and so on. This file will not have a namespace
> associated with it and should not be stored as "xmlcontent" in the database.
> Currently, parse-xml looks for known fields in the document and stores the
> associated values with the field name. I have added an extra conditions as
> if the known fields are not present in the current document, the element or
> node in the document should be the new field stored in the index with their
> value.

I think that this is fine.
>
> Therefore, when parse-xml receives an xml document with no namespace
> available, it will parse the document and store it element name as new field
> in the index and the element associated value.
>
> Let me know if I am on the right track because I know I don't have to write
> a separate plugin for this feature but just extending ( or modifying)
> parse-xml.

I think that parse-xml will support what you are talking about. In terms of
the "check" that you are doing to see if a field exists or not before adding
another value for it in the index, as I understood Lucene, I believe that
you could just omit this check and add the field regardless. If you add
multiple values for the same field in a Document, e.g.:

<snip>
Document doc = new Document();

doc.add(new Field("fieldname", "fieldvalue", ...));
doc.add(new Field("fieldname", "fieldvalue2",...));

</snip>

The values "fieldvalue" and "fieldvalue2" will both get stored in the
index for the key "fieldname". So, if I understand you correctly (which I
may not ;) ), then I think you can omit the check that you are talking about
above and just add the same field name twice.

HTH,
  Chris
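
The behaviour described above (adding the same field name twice keeps both values) can be modelled in plain Java. This is a stand-in to make the semantics concrete, not Lucene's Document or Field API: the class MultiValueDocSketch and its methods are hypothetical names chosen for the illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an index document; this is NOT Lucene's Document class.
final class MultiValueDocSketch {
    private final Map<String, List<String>> fields = new LinkedHashMap<>();

    /** Appends a value; values already stored under the same name are kept. */
    void add(String name, String value) {
        fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    /** Returns every value added under the name, in insertion order. */
    List<String> getValues(String name) {
        List<String> values = fields.getOrDefault(name, Collections.emptyList());
        return Collections.unmodifiableList(values);
    }

    public static void main(String[] args) {
        MultiValueDocSketch doc = new MultiValueDocSketch();
        // No "does this field already exist?" check is needed before adding.
        doc.add("fieldname", "fieldvalue");
        doc.add("fieldname", "fieldvalue2");
        System.out.println(doc.getValues("fieldname"));
    }
}
```

The point of the model is only that add never replaces an earlier value, which is why the existence check can be dropped.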


______________________________________________
Chris A. Mattmann
[hidden email]
Staff Member
Modeling and Data Management Systems Section (387)
Data Management Systems and Technologies Group

_________________________________________________
Jet Propulsion Laboratory            Pasadena, CA
Office: 171-266B                        Mailstop:  171-246
_______________________________________________________

Disclaimer:  The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.



RE: What's the status of Nutch-GUI?

Armel T. Nene-2
Chris, Rida,

Here are the changes that I have made to XMLParseConfig.java in the
populateConfig(Document doc) method:


    if (elemNode.getAttribute("nodeXpath") != null) {
        // checks "nodeXpath" but reads "namespace": possibly a typo in one of the two names
        String nodeXpath = elemNode.getAttributeValue("namespace");
        xip.setNodeXpath(nodeXpath);
    }
    List fieldList = XPath.selectNodes(elemNode, "field");

    if (fieldList != null) { // modified 20062011 by Armel
        for (int j = 0; j < fieldList.size(); j++) {
            Element elem = (Element) fieldList.get(j);
            XMLField xf = populateXMLField(elem);
            fieldsColl.add(xf);
        }
    }

    /*
     * modified by Armel, 20062011:
     * if fieldList is empty because elemNode doesn't contain
     * an element "field", build the field from elemNode itself.
     * The isEmpty() test covers the case where an empty list,
     * rather than null, comes back for "no matches".
     */
    if (fieldList == null || fieldList.isEmpty()) {
        XMLField xf = populateXMLField(elemNode);
        fieldsColl.add(xf);
    }

And the populateXMLField(Element el) method:

    if (elem.getAttribute("name") != null)
        xf.setFieldName(elem.getAttributeValue("name"));

    if (elem.getAttribute("name") == null) { // modified by Armel
        List att = elem.getAttributes();
        if (att != null) { // modified by Armel - loop and create field accordingly
            for (int i = 0; i < att.size(); i++) {
                Attribute at = (Attribute) att.get(i);
                // each pass overwrites the field name, so the last attribute wins
                xf.setFieldName(elem.getAttributeValue(at.getName()));
            }
        }
    }
    if (elem.getAttribute("xpath") != null)
        xf.setFieldXPath(elem.getAttributeValue("xpath"));

This is supposed to implement the feature I want; please advise.

Armel


Re: What's the status of Nutch-GUI?

Enis Soztutar
In reply to this post by scott green
scott green wrote:
> On 11/21/06, scott green <[hidden email]> wrote:
>> Hi
>>
>> Is nutch-gui dead? why i cannot find any source in svn repo?
>>
> I mean nutch-admin GUI.
>
Hi,

I am working on the patch to make it work with the current trunk. I will
upload the patch to Jira when I'm done.

Re: What's the status of Nutch-GUI?

scott green
Hi

I am now porting Stefan's admin GUI to my dev box and have run into some
errors; I hope someone can help. When I start the embedded Jetty web
application, I get this exception:

06/11/22 02:28:10 INFO util.Credential: Checking Resource aliases
06/11/22 02:28:11 INFO util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@102a0a5
Exception in thread "main" java.lang.ClassNotFoundException: org.apache.jasper.servlet.JspServlet
        at java.net.URLClassLoader$1.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at org.mortbay.http.HttpContext.loadClass(HttpContext.java:1262)
        at org.mortbay.jetty.servlet.Holder.start(Holder.java:188)
        at org.mortbay.jetty.servlet.ServletHolder.start(ServletHolder.java:219)
        at org.mortbay.jetty.servlet.ServletHandler.initializeServlets(ServletHandler.java:445)
        at org.mortbay.jetty.servlet.WebApplicationHandler.initializeServlets(WebApplicationHandler.java:323)
        at org.mortbay.jetty.servlet.WebApplicationContext.doStart(WebApplicationContext.java:511)
        at org.mortbay.util.Container.start(Container.java:72)
        at org.apache.nutch.admin.WebContainer.addComponentExtensions(WebContainer.java:152)
        at org.apache.nutch.admin.AdministrationApp.startContainer(AdministrationApp.java:41)
        at org.apache.nutch.admin.AdministrationApp.main(AdministrationApp.java:158)
06/11/22 02:28:24 INFO util.Container: Started HttpContext[/,/]

the code snippets:
      WebApplicationContext webContext =
          this.server.addWebApplication(contextName, new File(jsps).getCanonicalPath());
      webContext.setClassLoader(extension.getDescriptor().getClassLoader());
      webContext.setAttribute("component", component);
      webContext.setAttribute("components", components);
      if (instances != null) {
        webContext.setAttribute("instances", instances);
        webContext.setAttribute("container", this);
      }
      webContext.start();

So how can I put some required jars into the classloader?
Thanks

- Scott
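
One way to approach the question above (getting extra jars into a class loader) is to wrap an existing loader in a java.net.URLClassLoader that also searches the missing jars. The sketch below uses only JDK classes; the class name ExtraJarsSketch, the helper withExtraJars, and the jar path are hypothetical, and whether this fits the admin GUI's plugin class loading is an assumption that would need testing.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of Nutch or Jetty.
final class ExtraJarsSketch {

    /**
     * Builds a loader that delegates to the parent first and then
     * searches the given jar files.
     */
    static URLClassLoader withExtraJars(List<File> jars, ClassLoader parent) {
        URL[] urls = new URL[jars.size()];
        for (int i = 0; i < urls.length; i++) {
            try {
                urls[i] = jars.get(i).toURI().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Bad jar path: " + jars.get(i), e);
            }
        }
        return new URLClassLoader(urls, parent);
    }

    public static void main(String[] args) {
        // Assumed location; point this at wherever the Jasper jars actually live.
        List<File> jars = Arrays.asList(new File("lib/jasper-compiler.jar"));
        ClassLoader parent = ExtraJarsSketch.class.getClassLoader();
        URLClassLoader loader = withExtraJars(jars, parent);
        System.out.println(Arrays.toString(loader.getURLs()));
    }
}
```

In the snippet above, the loader returned by such a helper could in principle be passed to webContext.setClassLoader(...), with the plugin descriptor's class loader as the parent. Putting the jars on the JVM classpath at startup is the simpler alternative.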



Re: What's the status of Nutch-GUI?

Sami Siren-2
scott green wrote:

> So how can I put some required jars into the classloader?
> Thanks

Is there a start script (bin/nutch?) or something like that where you
could add jasper-compiler.jar so that it gets onto the JVM classpath?

--
  Sami Siren
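
Before and after changing the classpath, it can help to check whether the class named in the stack trace is actually visible. A minimal sketch using only the JDK follows; ClasspathCheckSketch and isLoadable are hypothetical names, and the check only says whether a given loader can find the class, not which jar provides it.

```java
// Hypothetical diagnostic; not part of Nutch or Jetty.
final class ClasspathCheckSketch {

    /** Returns true if the loader can find the class; the class is not initialized. */
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        // The class reported missing in the stack trace above.
        String jspServlet = "org.apache.jasper.servlet.JspServlet";
        System.out.println(jspServlet + " loadable: " + isLoadable(jspServlet, loader));
    }
}
```

Running this with the same classpath as the start script should report whether the Jasper classes are visible to the JVM.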

Re: What's the status of Nutch-GUI?

Stefan Groschupf
In reply to this post by scott green
Hi,
Sorry, I'm a bit out of the loop, but if you are patient I will help
you get things running with the latest trunk during the next week.
Maybe Marko can help too.

Cheers,
Stefan


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
101tec Inc.
search tech for web 2.1
Menlo Park, California
http://www.101tec.com




Re: What's the status of Nutch-GUI?

scott green
In reply to this post by Sami Siren-2
On 11/22/06, Sami Siren <[hidden email]> wrote:

> Is there a starts script (bin/nutch?) or something like that where you
> could add the jasper-compiler.jar so it gets into classpath of JVM.

Hi Sami

You are right. I added the jars to the JVM classpath and now it works, thanks.

- Scott


Re: What's the status of Nutch-GUI?

scott green
In reply to this post by Stefan Groschupf
Hi Stefan

Glad to hear that. I hope some changes in Hadoop can be committed so
that we don't need to maintain the non-official Hadoop jar.

-Scott


Re: What's the status of Nutch-GUI?

Sami Siren-2
scott green wrote:
> Hi Stefan
>
> Glad to here that. I hope some changes in Hadoop can be commited so
> that we dont need to maintain the non-offical hadoop jar.

I guess a non-official Hadoop jar is out of the question (as Hadoop moves
on so rapidly). What modifications are required? Couldn't we start
without them?

--
  Sami Siren


Re: What's the status of Nutch-GUI?

Stefan Groschupf
> I guess non-official hadoop jar is out of the question (as it goes on
> so rapidly). What are the modifications required, couldn't we start  
> without them?

Well, then we would have an admin GUI that does not work for local
installations, only for distributed installations.
See:
http://www.find23.net/nutch_guiToHadoop.pdf
section "required hadoop changes".

Stefan

Re: What's the status of Nutch-GUI?

Zaheed Haque
In reply to this post by scott green
Scott:

Would you be kind enough to upload your Nutch-Gui patch which works
with current trunk? I would like to give it a try.

Regards

On 11/22/06, scott green <[hidden email]> wrote:

> On 11/22/06, Sami Siren <[hidden email]> wrote:
> > scott green wrote:
> > > Hi
> > >
> > > I am now port Stefan to my dev-box. And some errors here, hope some
> > > one can help me. When I start embedded web application jetty, the
> > > exceptions:
> > >
> > > 06/11/22 02:28:10 INFO util.Credential: Checking Resource aliases
> > > 06/11/22 02:28:11 INFO util.Container: Started
> > > org.mortbay.jetty.servlet.WebApplicationHandler@102a0a5
> > > Exception in thread "main" java.lang.ClassNotFoundException:
> > > org.apache.jasper.servlet.JspServlet
> > >     at java.net.URLClassLoader$1.run(Unknown Source)
> > >     at java.security.AccessController.doPrivileged(Native Method)
> > >     at java.net.URLClassLoader.findClass(Unknown Source)
> > >     at java.lang.ClassLoader.loadClass(Unknown Source)
> > >     at java.lang.ClassLoader.loadClass(Unknown Source)
> > >     at org.mortbay.http.HttpContext.loadClass(HttpContext.java:1262)
> > >     at org.mortbay.jetty.servlet.Holder.start(Holder.java:188)
> > >     at
> > > org.mortbay.jetty.servlet.ServletHolder.start(ServletHolder.java:219)
> > >     at
> > > org.mortbay.jetty.servlet.ServletHandler.initializeServlets(ServletHandler.java:445)
> > >
> > >     at
> > > org.mortbay.jetty.servlet.WebApplicationHandler.initializeServlets(WebApplicationHandler.java:323)
> > >
> > >     at
> > > org.mortbay.jetty.servlet.WebApplicationContext.doStart(WebApplicationContext.java:511)
> > >
> > >     at org.mortbay.util.Container.start(Container.java:72)
> > >     at
> > > org.apache.nutch.admin.WebContainer.addComponentExtensions(WebContainer.java:152)
> > >
> > >     at
> > > org.apache.nutch.admin.AdministrationApp.startContainer(AdministrationApp.java:41)
> > >
> > >     at
> > > org.apache.nutch.admin.AdministrationApp.main(AdministrationApp.java:158)
> > > 06/11/22 02:28:24 INFO util.Container: Started HttpContext[/,/]
> > >
> > > the code snippets:
> > >      WebApplicationContext webContext =
> > > this.server.addWebApplication(contextName, new
> > > File(jsps).getCanonicalPath());
> > >      webContext.setClassLoader(extension.getDescriptor().getClassLoader());
> > >      webContext.setAttribute("component", component);
> > >      webContext.setAttribute("components", components);
> > >      if (instances != null) {
> > >        webContext.setAttribute("instances", instances);
> > >        webContext.setAttribute("container", this);
> > >      }
> > >      webContext.start();
> > >
> > > So how can I put some required jars into the classloader?
> > > Thanks
> >
> > Is there a start script (bin/nutch?) or something like that where you
> > could add the jasper-compiler.jar so it gets onto the JVM classpath?
>
> Hi Sami
>
> You are right. I added the jars to the JVM classpath and now it works, thanks.
>
> - Scott
>
> > --
> >  Sami Siren
> >
>
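
Scott's fix was to put the jars on the JVM classpath. As a programmatic alternative to the question "how can I put some required jars into the classloader?", here is a minimal sketch that wraps an existing class loader in a `URLClassLoader` which also sees extra jars. The class name `ExtraJarsClassLoader`, its helper methods, and the path `lib/jasper-compiler.jar` are assumptions made for illustration; they are not part of Nutch, Jetty, or the admin GUI patch, and this has not been tried against the admin GUI.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Nutch or Jetty. */
final class ExtraJarsClassLoader {

    /** Returns a loader that resolves classes from the parent first, then from the given jars. */
    static URLClassLoader withExtraJars(ClassLoader parent, List<File> jars) {
        URL[] urls = new URL[jars.size()];
        for (int i = 0; i < urls.length; i++) {
            try {
                urls[i] = jars.get(i).toURI().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Not a usable jar path: " + jars.get(i), e);
            }
        }
        return new URLClassLoader(urls, parent);
    }

    /** Returns true if the named class can be resolved through the loader. */
    static boolean canLoad(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed jar location; pass real jar paths as arguments instead.
        String[] paths = args.length > 0 ? args : new String[] {"lib/jasper-compiler.jar"};
        List<File> jars = new ArrayList<>();
        for (String path : paths) {
            jars.add(new File(path));
        }
        ClassLoader loader = withExtraJars(ExtraJarsClassLoader.class.getClassLoader(), jars);
        System.out.println("JspServlet loadable: "
                + canLoad(loader, "org.apache.jasper.servlet.JspServlet"));
    }
}
```

In the snippet quoted above, the loader returned by `withExtraJars(...)` could in principle be handed to `webContext.setClassLoader(...)`, with the plugin's class loader as the parent. Whether Jasper needs further jars beyond the one Sami names is not something this sketch checks.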

Re: What's the status of Nutch-GUI?

scott green
Hi

I will try my best. But I think Stefan is the right guy to do the
right thing :) He designed the admin GUI and implemented it.

- Scott

On 11/23/06, Zaheed Haque <[hidden email]> wrote:

> Scott:
>
> Would you be kind enough to upload your Nutch-Gui patch which works
> with current trunk? I would like to give it a try.
>
> Regards

Re: What's the status of Nutch-GUI?

Sami Siren-2
In reply to this post by Stefan Groschupf
Stefan Groschupf wrote:
>> I guess a non-official Hadoop jar is out of the question (as Hadoop
>> moves on so rapidly). What are the modifications required, couldn't we
>> start without them?
>
> Well, then we would have an admin GUI that does not work for local
> installations but only for distributed installations.
> See:
> http://www.find23.net/nutch_guiToHadoop.pdf
> Section "required hadoop changes".

I guess you refer to these:

•  LocalJobRunner:
   •  Run as a kind of singleton
   •  Have a kind of jobQueue
   •  Implement JobSubmissionProtocol status-report
      methods
   •  Implement killJob method

- How about writing a nutchrunner that just extends the functionality of
  LocalJobRunner?
- Scheduling (jobQueue) could be completely outside of the job runner?

--
  Sami Siren
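
To make the four LocalJobRunner items listed above concrete, here is a minimal sketch of a runner that is a singleton, queues jobs, reports their status, and can kill them. It deliberately does not extend Hadoop's LocalJobRunner or implement JobSubmissionProtocol; the class `QueuedLocalRunner`, its `Status` values, the job id format, and all method names are hypothetical and stand in for whatever a real patch would use.

```java
import java.util.Map;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a queued local runner; not Hadoop code. */
final class QueuedLocalRunner {

    enum Status { QUEUED, RUNNING, SUCCEEDED, FAILED, KILLED }

    // "Run as a kind of singleton": one shared instance per JVM.
    private static final QueuedLocalRunner INSTANCE = new QueuedLocalRunner();

    // "Have a kind of jobQueue": one worker thread runs jobs in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "queued-local-runner");
        t.setDaemon(true); // do not keep the JVM alive just for the queue
        return t;
    });
    private final Map<String, Status> statuses = new ConcurrentHashMap<>();
    private final Map<String, Future<?>> futures = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    private QueuedLocalRunner() { }

    static QueuedLocalRunner getInstance() { return INSTANCE; }

    /** Queues a job and returns its id. */
    String submit(Runnable job) {
        String id = "job_local_" + nextId.incrementAndGet();
        statuses.put(id, Status.QUEUED);
        futures.put(id, worker.submit(() -> {
            // Skip the job if it was killed while still waiting in the queue.
            if (!statuses.replace(id, Status.QUEUED, Status.RUNNING)) return;
            try {
                job.run();
                statuses.replace(id, Status.RUNNING, Status.SUCCEEDED);
            } catch (RuntimeException e) {
                statuses.replace(id, Status.RUNNING, Status.FAILED);
            }
        }));
        return id;
    }

    /** Status report; returns null for an unknown job id. */
    Status getStatus(String id) { return statuses.get(id); }

    /** Kills a queued or running job; returns false if it is unknown or already finished. */
    boolean killJob(String id) {
        Future<?> future = futures.get(id);
        if (future == null) return false;
        boolean marked = statuses.replace(id, Status.QUEUED, Status.KILLED)
                || statuses.replace(id, Status.RUNNING, Status.KILLED);
        if (marked) future.cancel(true); // interrupts the job if it is running
        return marked;
    }

    /** Blocks until the job is finished or cancelled, then returns its status. */
    Status waitFor(String id) {
        Future<?> future = futures.get(id);
        if (future == null) return null;
        try {
            future.get();
        } catch (CancellationException | ExecutionException e) {
            // The status map already records the outcome.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return statuses.get(id);
    }

    public static void main(String[] args) {
        QueuedLocalRunner runner = QueuedLocalRunner.getInstance();
        String id = runner.submit(() -> System.out.println("running a hypothetical crawl job"));
        System.out.println(id + " -> " + runner.waitFor(id));
    }
}
```

A running job is only interrupted, so it stops promptly only if the job code itself reacts to interruption; a real killJob would need whatever cancellation hooks the actual jobs offer.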




Re: What's the status of Nutch-GUI?

Stefan Groschupf
Hi Sami,
> I guess you refer to these:
> •  LocalJobRunner:
>   •  Run as a kind of singleton
>   •  Have a kind of jobQueue
>   •  Implement JobSubmissionProtocol status-report
>      methods
>   •  Implement killJob method
Right!

>
> - How about writing a nutchrunner that just extends the
> functionality of LocalJobRunner?
That would be one solution; however, I still hope that the Hadoop
developers understand that improving the local job runner would be of
general benefit.
Since it would be somewhat duplicated code it does not feel right, but
I also think it is better to do it this way than to never get this
issue solved.


> -scheduling (jobQueue) could be completely outside of jobrunner?

We solved that with Quartz and a file-based JobStore that we implemented
back then.

Stefan
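
The idea of a file-based job store that keeps the schedule outside the job runner can be illustrated with plain `java.util.Properties`. This is only a sketch of the concept: it is not Quartz's JobStore interface and not the implementation Stefan describes. The class `FileJobStore`, the job names, and the "job name = repeat interval in seconds" layout are all assumptions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical file-backed schedule store; not the Quartz JobStore API. */
final class FileJobStore {

    private final Path file;

    FileJobStore(Path file) {
        this.file = file;
    }

    /** Writes "job name = repeat interval in seconds", replacing the previous file contents. */
    void save(Map<String, Long> schedule) {
        Properties props = new Properties();
        for (Map.Entry<String, Long> entry : schedule.entrySet()) {
            props.setProperty(entry.getKey(), Long.toString(entry.getValue()));
        }
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "scheduled jobs");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the schedule back; a missing file means nothing is scheduled yet. */
    Map<String, Long> load() {
        Map<String, Long> schedule = new TreeMap<>();
        if (!Files.exists(file)) {
            return schedule;
        }
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String name : props.stringPropertyNames()) {
            schedule.put(name, Long.parseLong(props.getProperty(name)));
        }
        return schedule;
    }

    public static void main(String[] args) {
        // Assumed file name and job names, for illustration only.
        FileJobStore store = new FileJobStore(Paths.get("scheduled-jobs.properties"));
        Map<String, Long> schedule = new TreeMap<>();
        schedule.put("fetch-intranet", 3600L);
        schedule.put("rebuild-index", 86400L);
        store.save(schedule);
        // After a restart, a scheduler would reload the file and re-register each job.
        for (Map.Entry<String, Long> entry : store.load().entrySet()) {
            System.out.println(entry.getKey() + " repeats every " + entry.getValue() + " s");
        }
    }
}
```

Because the schedule lives in its own file, the scheduler can be restarted or replaced without touching the job runner, which is the separation Sami asks about.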

Re: What's the status of Nutch-GUI?

Doug Cutting
In reply to this post by Sami Siren-2
Sami Siren wrote:

> Stefan Groschupf wrote:
>> See:
>> http://www.find23.net/nutch_guiToHadoop.pdf
>> Section required hadoop changes.
>
> I guess you refer to these:
>
> •  LocalJobRunner:
>   •  Run as a kind of singleton
>   •  Have a kind of jobQueue
>   •  Implement JobSubmissionProtocol status-report
>      methods
>   •  Implement killJob method

Is there an issue in Hadoop's Jira for this?  Is there a patch that
implements these?  If there is, then I suggest folks vote for the issue.

> - How about writing a nutchrunner that just extends the functionality of
>   LocalJobRunner?
> - Scheduling (jobQueue) could be completely outside of the job runner?

These also sound like good solutions.  If it is not Nutch-specific,
then perhaps it could be integrated into Hadoop, so that it is
maintained as Hadoop evolves.  If that sounds like a good approach,
please submit a patch to Hadoop with some unit tests.

Cheers,

Doug