solr query result not read the latest xml file

solr query result not read the latest xml file

e8en
CONTENTS DELETED
The author has deleted this message.

Re: solr query result not read the latest xml file

iorixxx
> Hi everyone,
>
> I do these steps every time a new XML file is created (for example,
> cat_978.xml has just been created):
> 1. delete the index (<delete><query>AUC_CAT:978</query></delete>)
> 2. commit the new cat_978.xml (java -jar post.jar cat_978.xml)
> 3. restart Java (stop, then java -jar start.jar)
>
> If I don't do these steps, the query result shown in the browser still
> uses the old values (cat_978.xml, no changes at all) instead of reading
> the new cat_978.xml.
>
> What I want to ask: is there a way to avoid restarting Java, since it
> consumes too much time and resources?

You don't need to delete the old document. Solr replaces it automatically, assuming both documents have the same <uniqueKey>.

HTTP caching is probably causing the problem when you test with a browser. You can disable it in solrconfig.xml with <httpCaching never304="true"> (inside the <requestDispatcher> section).

Re: solr query result not read the latest xml file

e8en
CONTENTS DELETED
The author has deleted this message.

Re: solr query result not read the latest xml file

iorixxx


> I already set in my solrconfig.xml as you told me:
> <httpCaching never304="false"></httpCaching>
>
> and then I commit the xml
> and it's still not working
> the query result still show the old data :(
>
> do you have any suggestion?

Shouldn't it be never304="true"? You wrote never304="false".

Additionally, can't you try with something other than a browser, e.g. curl or wget?

AW: solr query result not read the latest xml file

Bastian S.
In reply to this post by e8en
make sure you send a <commit/> after add/delete to make the changes visible.

-----Original Message-----
From: e8en [mailto:[hidden email]]
Sent: Tuesday, 10 August 2010 10:04
To: [hidden email]
Subject: Re: solr query result not read the latest xml file


I already set in my solrconfig.xml as you told me:
<httpCaching never304="false"></httpCaching>

and then I commit the xml
and it's still not working
the query result still show the old data :(

do you have any suggestion?

Eben
--
View this message in context: http://lucene.472066.n3.nabble.com/solr-query-result-not-read-the-latest-xml-file-tp1066785p1068647.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: solr query result not read the latest xml file

e8en
In reply to this post by iorixxx
CONTENTS DELETED
The author has deleted this message.

Re: AW: solr query result not read the latest xml file

e8en
In reply to this post by Bastian S.
CONTENTS DELETED
The author has deleted this message.

Re: solr query result not read the latest xml file

iorixxx
In reply to this post by e8en
> yes I try with both value, never304="true" and
> never304="false" and none of
> them make it works

It must be <httpCaching never304="true"></httpCaching>, so let's forget about never304="false". But when you change something in solrconfig.xml you need to restart Jetty/Tomcat.

java -jar post.jar *.xml sends a <commit/> by default at the end.

> what is curl and wget?

They are command line tools.
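For example, a query can be sent with curl so that no browser cache sits between you and Solr. This is only a rough sketch: the base URL and the AUC_CAT field are taken from this thread, while SOLR_BASE, select_url and solr_query are made-up names for illustration.

```shell
#!/bin/sh
# Sketch: query Solr from the command line, bypassing any browser cache.
# Assumption: Solr answers at this base URL; override SOLR_BASE if not.
SOLR_BASE="${SOLR_BASE:-http://localhost:8983/solr}"

# Build the select URL for a simple FIELD:VALUE query.
select_url() {
  printf '%s/select/?q=%s:%s&indent=on\n' "$SOLR_BASE" "$1" "$2"
}

# Fetch the result. -s hides the progress meter; -i also prints the
# response headers, so caching-related headers can be inspected.
solr_query() {
  curl -s -i "$(select_url "$1" "$2")"
}
```

After sourcing the file, `solr_query AUC_CAT 978` prints the headers and the XML response. If the response here is current while the browser still shows old data, the browser cache is the likely culprit.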

> I use mozilla firefox browser
> I'm really newbie in programming world especially solr

Maybe you can configure Firefox to disable its cache.

AW: AW: solr query result not read the latest xml file

Bastian S.
In reply to this post by e8en
You can check the admin panel to see if there are pending deletes/commits in the statistics section.
Older versions of post.jar don't auto-commit the changes, so if your XML doesn't contain a <commit/>,
you could just create a commit.xml containing only the following:

<commit/>

and send it via post.jar. You can also curl it or use whatever you like:

curl http://<hostname>:<port>/solr/update -H "Content-Type: text/xml" --data-binary '<commit/>'

-----Original Message-----
From: e8en [mailto:[hidden email]]
Sent: Tuesday, 10 August 2010 10:22
To: [hidden email]
Subject: Re: AW: solr query result not read the latest xml file


hi Bastian,
how to send a <commit/>?
is it by typing : java -jar post.jar cat_978.xml?

if yes then I've already done that
any solution please?

Re: solr query result not read the latest xml file

e8en
In reply to this post by e8en
CONTENTS DELETED
The author has deleted this message.

Re: solr query result not read the latest xml file

Jan Høydahl / Cominvent
Hi,

Beware that post.jar is just an example tool for playing with the default example index located under the /solr/ namespace. It is very limited, and you should look elsewhere for a more production-ready and robust tool.

However, it does let you specify a custom URL. Please try:

java -jar post.jar -help

> SimplePostTool: version 1.2
> This is a simple command line tool for POSTing raw XML to a Solr
> port.  XML data can be read from files specified as commandline
> args; as raw commandline arg strings; or via STDIN.
> Examples:
>   java -Ddata=files -jar post.jar *.xml
>   java -Ddata=args  -jar post.jar '<delete><id>42</id></delete>'
>   java -Ddata=stdin -jar post.jar < hd.xml
> Other options controlled by System Properties include the Solr
> URL to POST to, and whether a commit should be executed.  These
> are the defaults for all System Properties...
>   -Ddata=files
>   -Durl=http://localhost:8983/solr/update
>   -Dcommit=yes
>


Thus for your index, try:
java -Durl=http://localhost:80/search/update -jar post.jar myfile.xml

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com

On 10. aug. 2010, at 12.10, e8en wrote:

>
> Finally I found the cause of my problem.
> Yes, you don't need to delete the index and restart Tomcat just to get
> the query result updated; you just need to commit the XML files.
>
> I made a custom URL as required by my client.
> default url -->
> http://localhost/solr/select/?q=ITEM_CAT:817&version=2.2&start=0&rows=10&indent=on
>
> my custom url -->
> http://localhost/search/select/?q=ITEM_CAT:817&version=2.2&start=0&rows=10&indent=on
>
> I made the custom URL by copying solr.war and renaming the copy to
> search.war, so there are two war files in the webapps folder.
> This is the cause of my problem: when I use the default URL there is no
> problem at all, but when I use my custom URL I have to delete, commit, and
> restart Tomcat to get correct query results.
>
> The question has now changed :)
> How do I make search.war behave exactly like solr.war?
> Maybe when I start Tomcat I should add some parameter so that it
> points to search.war instead of solr.war?
>
> When I remove solr.war, so that search.war is the only war file in the
> webapps folder, I can't commit; it says 'FATAL: Solr returned an
> error: Not Found'.
> This is because the app looks for solr.war, not search.war.


Re: solr query result not read the latest xml file

e8en
CONTENTS DELETED
The author has deleted this message.

Re: solr query result not read the latest xml file

Jan Høydahl / Cominvent
Hi,

Yes, this is normal behavior. Solr is *document* based; it does not know about *files*.
What happens here is that your source database (or whatever) has had deletions within this category in addition to updates, and you need to relay those to Solr.

The best way to integrate with your source system is through some connector which picks up deletes as well as adds (an update is just a special case of an add). If your source data is in a database, have a look at DataImportHandler, which can be set up to do things like this.

If your source data exists only as files on a file system, you'll have to write some scripts which take care of all of this, e.g. by first issuing the delete and then the add (tip: try -Dcommit=no on the delete request and -Dcommit=yes on the following add to avoid temporary loss of data).
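A minimal sketch of such a script, using only the post.jar options listed in its help output (-Ddata, -Durl, -Dcommit). The update URL and the ITEM_CAT field come from this thread; reindex_category, run and DRY_RUN are hypothetical names, and the script assumes post.jar is in the current directory.

```shell
#!/bin/sh
# Sketch of the delete-then-add sequence, with a single commit at the end.
# Assumption: this is the update URL of your webapp; override if needed.
SOLR_UPDATE_URL="${SOLR_UPDATE_URL:-http://localhost:8983/search/update}"

# Run a command, or only print it when DRY_RUN is set (useful for checking).
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# reindex_category FIELD VALUE FILE
# 1. delete every document matching FIELD:VALUE, without committing
# 2. post FILE and commit, so the delete and the add become visible together
reindex_category() {
  run java -Ddata=args -Dcommit=no -Durl="$SOLR_UPDATE_URL" \
      -jar post.jar "<delete><query>$1:$2</query></delete>" &&
  run java -Ddata=files -Dcommit=yes -Durl="$SOLR_UPDATE_URL" \
      -jar post.jar "$3"
}
```

After sourcing the file, `reindex_category ITEM_CAT 817 cat_817.xml` replaces the category in one go, and `DRY_RUN=1` prints the two commands instead of running them. Note that both requests go to the same update URL; a delete sent without -Durl would go to the default /solr/update instead.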

You need to think about what happens if a whole category is deleted. How would you know by simply looking at the file system?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Training in Europe - www.solrtraining.com

On 11. aug. 2010, at 04.10, e8en wrote:

>
> Thanks for your response, Jan.
> I just learned that post.jar is only an example tool,
> so what should I use instead of post.jar in production?
>
> By the way, I already tried this command:
> java -Durl=http://localhost:8983/search/update -jar post.jar cat_817.xml
>
> and IT WORKS !!
> cat_817.xml is reflected in the Solr query directly after I commit it.
> This is the URL:
> http://localhost:8983/search/select/?q=ITEM_CAT:817&version=2.2&start=0&rows=10&indent=on
>
> The problem is that it only works if the old XML contains fewer docs than
> the new XML. For example, if the old cat_817.xml contains 2 docs and the
> new cat_817.xml contains 10 docs, then I just have to re-index (java
> -Durl=http://localhost:8983/search/update -jar post.jar cat_817.xml) and
> the query returns the correct result (10 docs). It doesn't work the other
> way round.
> If the old cat_817.xml contains 10 docs and the new cat_817.xml contains
> 2 docs, then I have to delete the index first (java -Ddata=args -Dcommit=yes -jar
> post.jar "<delete><query>ITEM_CAT:817</query></delete>") and then re-index
> (java -Durl=http://localhost:8983/search/update -jar post.jar cat_817.xml)
> to get the updated query result (2 docs).
>
> Is this the normal process, or is something wrong with my Solr?
>
> Once again, thanks Jan, your help really makes my day brighter :)
> I believe your answer will help many Solr newbies, especially me.