Time of insert

Mahmoud Almokadem
Hello,

I'm using DIH on Solr 6 to index data from SQL Server. A document can
be indexed many times as it is updated. Is there a way to get the time
the document was first inserted into Solr?

And how can I get the dates on which the document was updated?

Thanks for your help,
Mahmoud

Re: Time of insert

Fuad Efendi
No; a historical log of document updates is not provided. Users need to
implement such functionality themselves if needed.


(In reply to Mahmoud Almokadem's message of February 6, 2017 at 3:32:34 PM.)

Re: Time of insert

Alexandre Rafalovitch
In reply to this post by Mahmoud Almokadem
If you are reindexing full documents, there is no way.

If you are actually doing updates using Solr's XML/JSON update messages,
then you can have a created_date field with a default value of NOW.
Similarly, you could probably do something with UpdateRequestProcessor
chains to get that NOW added somewhere.
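
To make the update route a little more concrete, here is a minimal sketch of the JSON body such an update might carry. The helper name buildAtomicUpdate, the document id, and the title field are assumptions for illustration only; the "set" modifier is Solr's atomic-update syntax. The point is that created_date is never sent, so the schema default should only apply when the document does not exist yet.

```javascript
// Hypothetical helper: wraps each changed field in Solr's atomic-update
// "set" modifier so that fields not listed here are left untouched.
function buildAtomicUpdate(id, changedFields) {
  var doc = { id: id };
  Object.keys(changedFields).forEach(function (name) {
    doc[name] = { set: changedFields[name] };
  });
  return doc;
}

// JSON array that could be POSTed to the collection's /update handler.
// "doc-1" and "title" are made-up example values.
var payload = [buildAtomicUpdate("doc-1", { title: "Revised title" })];
console.log(JSON.stringify(payload));
```

Whether this fits depends on how documents reach Solr; a full reindex of the document replaces it and does not go through this path.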

Regards,
   Alex.
----
http://www.solr-start.com/ - Resources for Solr users, new and experienced



Re: Time of insert

Mahmoud Almokadem
Thanks Alex for your reply. But the created_date field will be updated
every time the document is inserted into Solr. I want to record the first
time the document was indexed into Solr, and I'm using the DataImportHandler.

I also tried solr.TimestampUpdateProcessorFactory but got a
NullPointerException, so I switched to a default value for the field in
the schema:

  <field name="solr_time_stamp" type="tdate" indexed="true" stored="true"
         multiValued="false" omitNorms="true" termVectors="false"
         termPositions="false" termOffsets="false" default="NOW" />

But this field contains the time of the last update of the document, not
the first time the document was inserted.


Thanks,
Mahmoud

(In reply to Alexandre Rafalovitch's message of Tue, Feb 7, 2017 at 12:10 AM.)

Re: Time of insert

Alessandro Benedetti
Hi Mahmoud,
I need to double-check, but let's assume you use atomic updates and a stored created_date field that defaults to NOW.

1) The first time, the document is not in the index, so it gets the default NOW.
2) The second time, the atomic update changes only the subset of fields you send to Solr.
Under the hood, Solr fetches the existing document, changes only those fields, and indexes the result again.
created_date keeps the date fetched from the old version of the document, so the default is not used this time.
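
The two cases above can be illustrated with a toy in-memory model. This is only a simulation of the behaviour described here, not Solr code: the index object, the applyUpdate function, and the field names are all hypothetical, and the default is modelled as "fill created_date only when it is absent".

```javascript
// Toy model of the behaviour described above; not Solr's implementation.
var index = {};

function isModifier(v) {
  return v !== null && typeof v === "object" && !Array.isArray(v) &&
         ("set" in v || "add" in v);
}

// Plain values replace the whole document (full reindex); values of the
// form {set: v} or {add: v} are merged into the stored document.
function applyUpdate(doc, now) {
  var atomic = Object.keys(doc).some(function (k) { return isModifier(doc[k]); });
  var old = index[doc.id];
  var result = atomic && old ? JSON.parse(JSON.stringify(old)) : {};
  Object.keys(doc).forEach(function (k) {
    var v = doc[k];
    if (isModifier(v) && "set" in v) {
      result[k] = v.set;
    } else if (isModifier(v)) {
      result[k] = (result[k] || []).concat([v.add]);
    } else {
      result[k] = v;
    }
  });
  // Modelled schema default: applies only when the field has no value.
  if (!("created_date" in result)) result.created_date = now;
  index[doc.id] = result;
  return result;
}

applyUpdate({ id: "doc-1", title: "first" }, "2017-02-06T15:32:00Z");
applyUpdate({ id: "doc-1", title: { set: "second" } }, "2017-02-07T11:30:00Z");
console.log(index["doc-1"].created_date); // still the first timestamp
```

In this model a later full reindex of doc-1 would start from an empty document, so created_date would be filled with the new time again.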

Have you tried it?

Cheers
---------------
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io

Re: Time of insert

Mahmoud Almokadem
Thanks Alessandro,

I was using DIH as is, and no atomic updates were being issued by it.

I added this script to my script transformer section and everything worked
properly:

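// Use UTC: the literal 'Z' in the pattern labels the value as UTC, so a
// local time would be mislabeled unless the server clock runs in UTC.
var now = java.time.ZonedDateTime.now(java.time.ZoneOffset.UTC);
var dtf =
java.time.format.DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'");
var val = dtf.format(now);

// A map value keyed by 'add' appends val to the field instead of replacing it.
var hash = new java.util.HashMap();
hash.put('add', val);
row.put('time_stamp_log', hash);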


The time_stamp_log field now contains the log of updates to each document,
and created_date is set only once.

I think hash.put('add', val); triggers atomic updates on the documents.

When I remove this part of the script, the created_date field is updated
every time.
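
For readers who want to see the shape of the resulting document outside DIH, here is a minimal plain-JavaScript sketch. It is a stand-in rather than the DIH script itself: a plain object plays the role of the row map, and the function names solrTimestamp and withTimestampLog are hypothetical.

```javascript
// Hypothetical stand-in for the DIH row transformation shown above.
function solrTimestamp(date) {
  // toISOString() is always UTC; drop the milliseconds to match the
  // yyyy-MM-dd'T'HH:mm:ss'Z' pattern used in the script.
  return date.toISOString().replace(/\.\d{3}Z$/, "Z");
}

function withTimestampLog(row, date) {
  var out = {};
  Object.keys(row).forEach(function (k) { out[k] = row[k]; });
  // The {add: ...} value is what turns the document into an atomic update.
  out.time_stamp_log = { add: solrTimestamp(date) };
  return out;
}

// Made-up example row and a fixed date so the output is reproducible.
var doc = withTimestampLog({ id: "42", title: "Example" },
                           new Date(Date.UTC(2017, 1, 7, 9, 30, 0)));
console.log(JSON.stringify(doc));
```

Each import then appends one more entry to time_stamp_log, which gives the update history asked about at the start of the thread.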

Thanks for your help.



(In reply to Alessandro Benedetti's message of Tue, Feb 7, 2017 at 11:30 AM:
http://lucene.472066.n3.nabble.com/Time-of-insert-tp4319040p4319122.html,
sent from the Solr - User mailing list archive at Nabble.com.)