Deleting with the DIH sometimes doesn't delete

Qwerky
I'm doing deletes with the DIH but getting mixed results. Sometimes the documents get deleted, other times I can still find them in the index. What would prevent a doc from getting deleted?

For example, I delete 594039 and get this in the logs:

2010-08-12 14:41:55,625 [Thread-210] INFO  [DataImporter] Starting Delta Import
2010-08-12 14:41:55,625 [Thread-210] INFO  [SolrWriter] Read productimportupdate.properties
2010-08-12 14:41:55,625 [Thread-210] INFO  [DocBuilder] Starting delta collection.
2010-08-12 14:41:55,625 [Thread-210] INFO  [DocBuilder] Running ModifiedRowKey() for Entity: item
2010-08-12 14:41:55,625 [Thread-210] INFO  [DocBuilder] Completed ModifiedRowKey for Entity: item rows obtained : 0
2010-08-12 14:41:55,625 [Thread-210] INFO  [DocBuilder] Completed DeletedRowKey for Entity: item rows obtained : 1
2010-08-12 14:41:55,625 [Thread-210] INFO  [DocBuilder] Completed parentDeltaQuery for Entity: item
2010-08-12 14:41:55,625 [Thread-210] INFO  [DocBuilder] Deleting stale documents
2010-08-12 14:41:55,625 [Thread-210] INFO  [SolrWriter] Deleting document: 594039
2010-08-12 14:41:55,703 [Thread-210] INFO  [SolrDeletionPolicy] newest commit = 1281030128383
2010-08-12 14:41:55,718 [Thread-210] DEBUG [SolrIndexWriter] Opened Writer DirectUpdateHandler2
2010-08-12 14:41:55,718 [Thread-210] INFO  [DocBuilder] Delta Import completed successfully
2010-08-12 14:41:55,718 [Thread-210] INFO  [DocBuilder] Import completed successfully
2010-08-12 14:41:55,718 [Thread-210] INFO  [DirectUpdateHandler2] start commit(optimize=true,waitFlush=false,waitSearcher=true,expungeDeletes=false)
2010-08-12 14:42:08,562 [Thread-210] DEBUG [SolrIndexWriter] Closing Writer DirectUpdateHandler2
2010-08-12 14:42:10,437 [Thread-210] INFO  [SolrDeletionPolicy] SolrDeletionPolicy.onCommit: commits:num=2
        commit{dir=E:\SOLR\kiosk\data\index,segFN=segments_8,version=1281030128383,generation=8,filenames=[_39.frq, _2i.fdx, _39.tis, _39.prx, _39.fnm, _2i.fdt, _39.tii, _39.nrm, segments_8]
        commit{dir=E:\SOLR\kiosk\data\index,segFN=segments_9,version=1281030128384,generation=9,filenames=[_3a.prx, _3a.tis, _3a.fnm, _3a.nrm, _3a.fdt, _3a.tii, _3a.fdx, _3a.frq, segments_9]
2010-08-12 14:42:10,437 [Thread-210] INFO  [SolrDeletionPolicy] newest commit = 1281030128384

This works fine; I can no longer find 594039 in the index. But a little later I delete two more (33252 and 105224), adding two docs in the same import, and get the following:

2010-08-12 15:27:42,828 [Thread-217] INFO  [DataImporter] Starting Delta Import
2010-08-12 15:27:42,828 [Thread-217] INFO  [SolrWriter] Read productimportupdate.properties
2010-08-12 15:27:42,828 [Thread-217] INFO  [DocBuilder] Starting delta collection.
2010-08-12 15:27:42,843 [Thread-217] INFO  [DocBuilder] Running ModifiedRowKey() for Entity: item
2010-08-12 15:27:42,843 [Thread-217] INFO  [DocBuilder] Completed ModifiedRowKey for Entity: item rows obtained : 2
2010-08-12 15:27:42,843 [Thread-217] INFO  [DocBuilder] Completed DeletedRowKey for Entity: item rows obtained : 2
2010-08-12 15:27:42,843 [Thread-217] INFO  [DocBuilder] Completed parentDeltaQuery for Entity: item
2010-08-12 15:27:42,843 [Thread-217] INFO  [DocBuilder] Deleting stale documents
2010-08-12 15:27:42,843 [Thread-217] INFO  [SolrWriter] Deleting document: 33252
2010-08-12 15:27:42,906 [Thread-217] INFO  [SolrDeletionPolicy] SolrDeletionPolicy.onInit: commits:num=1
        commit{dir=E:\SOLR\kiosk\data\index,segFN=segments_9,version=1281030128384,generation=9,filenames=[_3a.prx, _3a.tis, _3a.fnm, _3a.nrm, _3a.fdt, _3a.tii, _3a.fdx, _3a.frq, segments_9]
2010-08-12 15:27:42,906 [Thread-217] INFO  [SolrDeletionPolicy] newest commit = 1281030128384
2010-08-12 15:27:42,906 [Thread-217] DEBUG [SolrIndexWriter] Opened Writer DirectUpdateHandler2
2010-08-12 15:27:42,906 [Thread-217] INFO  [SolrWriter] Deleting document: 105224
2010-08-12 15:27:42,906 [Thread-217] INFO  [DocBuilder] Delta Import completed successfully
2010-08-12 15:27:42,906 [Thread-217] INFO  [DocBuilder] Import completed successfully
2010-08-12 15:27:42,906 [Thread-217] INFO  [DirectUpdateHandler2] start commit(optimize=true,waitFlush=false,waitSearcher=true,expungeDeletes=false)
2010-08-12 15:27:55,578 [Thread-217] DEBUG [SolrIndexWriter] Closing Writer DirectUpdateHandler2
2010-08-12 15:27:56,875 [Thread-217] INFO  [SolrDeletionPolicy] SolrDeletionPolicy.onCommit: commits:num=2
        commit{dir=E:\SOLR\kiosk\data\index,segFN=segments_9,version=1281030128384,generation=9,filenames=[_3a.prx, _3a.tis, _3a.fnm, _3a.nrm, _3a.fdt, _3a.tii, _3a.fdx, _3a.frq, segments_9]
        commit{dir=E:\SOLR\kiosk\data\index,segFN=segments_a,version=1281030128385,generation=10,filenames=[_3c.tis, _3c.fdt, _3c.fnm, _3c.nrm, _3c.tii, segments_a, _3c.fdx, _3c.prx, _3c.frq]
2010-08-12 15:27:56,875 [Thread-217] INFO  [SolrDeletionPolicy] newest commit = 1281030128385
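
To cross-check which IDs the DIH reported deleting in each run, the SolrWriter lines can be pulled out of the log. This is a minimal Python sketch, assuming the log format shown above; the function name `deleted_ids` is hypothetical and not part of Solr.

```python
import re

# Matches lines such as:
#   2010-08-12 15:27:42,843 [Thread-217] INFO  [SolrWriter] Deleting document: 33252
DELETE_LINE = re.compile(
    r"\[(Thread-\d+)\]\s+INFO\s+\[SolrWriter\] Deleting document: (\S+)"
)


def deleted_ids(log_text):
    """Return {thread name: [document IDs]} for every logged DIH delete."""
    by_thread = {}
    for line in log_text.splitlines():
        match = DELETE_LINE.search(line)
        if match:
            thread, doc_id = match.groups()
            by_thread.setdefault(thread, []).append(doc_id)
    return by_thread
```

Grouping by thread keeps the two delta imports apart, so the IDs from each run can be compared against what is still searchable afterwards.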

Re: Deleting with the DIH sometimes doesn't delete

Lance Norskog-2
Which version of Solr is this? How many documents are there in the
index? Etc. It is hard for us to help you without more details.


On Thu, Aug 12, 2010 at 8:32 AM, Qwerky <[hidden email]> wrote:

>
> I'm doing deletes with the DIH but getting mixed results. Sometimes the
> documents get deleted, other times I can still find them in the index. What
> would prevent a doc from getting deleted?

--
Lance Norskog
[hidden email]

Re: Deleting with the DIH sometimes doesn't delete

Qwerky
I'm using Solr 1.4.1 and I've got about 280,000 docs in the index. I'm using a multi-core setup (if that makes any difference) with 2 cores.

When I check the stats from the JSP, my updateHandler reports 3 deletes:
cumulative_deletesById : 3

When I search from the admin page, the docs are still found:

<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">16</int>
 <lst name="params">
  <str name="q">FKWEBITM:33252</str>
  <str name="start">0</str>
  <str name="indent">on</str>
  <str name="explainOther"/>
  <str name="wt">standard</str>
  <str name="hl.fl"/>
  <str name="fq"/>
  <str name="version">2.2</str>
  <str name="qt">standard</str>
  <str name="fl">FKWEBITM</str>
  <str name="rows">10</str>
 </lst>
</lst>
<result name="response" numFound="1" start="0">
 <doc>
  <str name="FKWEBITM">33252</str>
 </doc>
</result>
</response>

My unique key in my schema.xml is FKWEBITM:
<uniqueKey>FKWEBITM</uniqueKey>
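
The admin-page check above can be repeated for each deleted ID with a small script. This is a minimal Python sketch, assuming the standard XML response format shown above; the helper names `lookup_url` and `num_found` are hypothetical, and the base URL (host, port, and core name) is an assumption that has to match the actual multi-core setup.

```python
import urllib.parse
import xml.etree.ElementTree as ET


def lookup_url(base_url, key_field, doc_id):
    """Build a select URL that searches for one document by its unique key."""
    query = urllib.parse.urlencode(
        {"q": f"{key_field}:{doc_id}", "fl": key_field, "rows": 1}
    )
    return f"{base_url}/select?{query}"


def num_found(response_xml):
    """Read the numFound attribute from a standard Solr XML response."""
    root = ET.fromstring(response_xml)
    result = root.find("result[@name='response']")
    return int(result.get("numFound"))


# Example (base URL is an assumption, adjust to the real core):
# url = lookup_url("http://localhost:8983/solr/kiosk", "FKWEBITM", "33252")
# Fetch `url`, then pass the response body to num_found(); a value above 0
# means the document is still visible to the current searcher.
```

Running this against the core that the DIH actually writes to also rules out the case where the delete and the search hit different cores.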