Solr 8.0.0 Delta import add/delete data

Anuj Bhargava
We have a MySQL database (news) with the following fields:
posting_id, date, name, currency, country, expiry, etc.

The database has more than 1200000 entries. Every day around 200000 or more
new records are added and around the same number are deleted.

posting_id is a unique ID for every record.

Could someone please help me write a delta-import script so that the index is
updated every day to match the records in the MySQL database (news)? If a
posting_id is no longer found in the database (news), the corresponding
document should be deleted from the Solr index, and the records with new
posting_ids should be indexed.
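
For illustration, the sync described above comes down to a set difference on posting_id. The following is only a minimal sketch in Python, assuming posting_id is also the uniqueKey of the Solr schema; the function names compute_delta and delete_payload are hypothetical and not part of Solr or the Data Import Handler.

```python
def compute_delta(db_ids, indexed_ids):
    """Return (to_add, to_delete) for syncing an index with a database.

    db_ids      -- posting_id values currently present in the database
    indexed_ids -- posting_id values currently present in the index
    """
    db_ids = set(db_ids)
    indexed_ids = set(indexed_ids)
    to_add = sorted(db_ids - indexed_ids)      # in the database, not yet indexed
    to_delete = sorted(indexed_ids - db_ids)   # indexed, but gone from the database
    return to_add, to_delete


def delete_payload(ids):
    """Build a JSON-style body for a delete-by-ID request to Solr's update handler.

    Assumes posting_id is the uniqueKey field of the schema.
    """
    return {"delete": [str(i) for i in ids]}
```

For example, compute_delta([1, 2, 3, 5], [2, 3, 4]) gives [1, 5] to add and [4] to delete. Fetching both ID lists and sending the requests are left out of the sketch.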

Re: Solr 8.0.0 Delta import add/delete data

Zheng Lin Edwin Yeo
Have you checked how long it takes to index all the entries?

Regards,
Edwin

On Fri, 12 Apr 2019 at 12:32, Anuj Bhargava <[hidden email]> wrote:

> We have a MySql database (news) which has the following fields -
> posting_id, date, name, currency, country, expiry ....etc
>
> The database has more than 1200000 entries. Daily around 200000 plus new
> records are added and around the same number deleted.
>
> posting_id is a unique ID for every record.
>
> Please can some help write a delta-import script so that the index file is
> update as per the records in the MySql database (news) everyday. If the
> posting_id is not found in the database (news) then the same is deleted
> from the solr indexed file and the records with new posting_id are indexed.
>

Re: Solr 8.0.0 Delta import add/delete data

Anuj Bhargava
The database has 1200000 entries. Around 20000 or more new records (not
200000 as mentioned earlier) are added daily and around the same number
are deleted.

A full import takes approximately 4 minutes to index.

Regards,

Anuj

On Mon, 15 Apr 2019 at 07:29, Zheng Lin Edwin Yeo <[hidden email]>
wrote:

> Have you tried how long does it take to index all the entries?
>
> Regards,
> Edwin

Re: Solr 8.0.0 Delta import add/delete data

Zheng Lin Edwin Yeo
Hi Anuj,

I think it would be faster and cleaner to re-index everything, since a full
import takes only 4 minutes and you also need deleted records removed.
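
As a rough sketch of how a daily full re-index could be triggered from a script: the helper below only builds the request URL for a Data Import Handler full-import. It assumes the handler is registered at /dataimport in solrconfig.xml and that the core is called news; both are assumptions, and full_import_url is a hypothetical helper name.

```python
from urllib.parse import urlencode


def full_import_url(base_url, core, handler="dataimport"):
    """Build the URL that asks the Data Import Handler for a full import.

    clean=true removes the existing documents before importing, so records
    deleted from the database also disappear from the index.
    The core name and handler path are assumptions about the setup.
    """
    params = {"command": "full-import", "clean": "true", "commit": "true"}
    return f"{base_url.rstrip('/')}/{core}/{handler}?{urlencode(params)}"
```

A daily cron job could then request full_import_url("http://localhost:8983/solr", "news"), for instance with urllib.request.urlopen.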

Also, when you delete records in Solr, it only marks them as deleted so
that they no longer show up in searches. The space used by those documents
is reclaimed when the segments they are in get merged. This happens
according to your merge policy settings, or when you optimize the index.
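
To get a feel for how much of the index is taken up by documents that are marked as deleted but not yet merged away, you can compare the Num Docs and Max Doc figures shown for the core in the Solr admin UI. A minimal sketch, where deleted_fraction is a hypothetical helper name:

```python
def deleted_fraction(num_docs, max_doc):
    """Fraction of index entries that are deleted but not yet merged away.

    num_docs -- live (searchable) documents
    max_doc  -- live documents plus documents still marked as deleted
    """
    if max_doc <= 0:
        return 0.0  # an empty index has nothing to reclaim
    return (max_doc - num_docs) / max_doc
```

For example, deleted_fraction(900, 1000) is 0.1, meaning 10% of the entries are waiting to be reclaimed by a merge.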

Regards,
Edwin

On Mon, 15 Apr 2019 at 11:16, Anuj Bhargava <[hidden email]> wrote:

> The database has 1200000 entries and 20000 (not 200000 as mentioned
> earlier) plus new records are added and around the same number deleted.
>
> For Full Import it takes approximately 4 minutes to Index.
>
> Regards,
>
> Anuj