Questions about DeleteFile method

Questions about DeleteFile method

Monsur Hossain-2
So after digging around FSDirectory's DeleteFile method, I noticed something
curious.  After an incremental index, the system tried to delete a lot of
*.f* files (like _5.f1, _5.f2), which didn't exist on the file system.
These files are named after the segment that is being deleted (for example,
there does exist a _5.cfs file, which is deleted).  Why is it trying to delete
these files that don't exist?

Also, when these files aren't found, DeleteFile throws an exception; the
calling method traps this exception and adds the filename to the "deletable"
file.  This can lead to a lot of exceptions being thrown during a large
indexing operation, which could incur a performance penalty.  For
performance reasons, should DeleteFile return a boolean (true if the file is
deleted, false if not), which the calling method can then handle
appropriately?  The calling method would still have to trap the exception,
but at least there'd be far fewer Exceptions thrown.
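
A minimal sketch of the boolean-return idea, in plain Java rather than the actual Lucene or Lucene.Net code. The class DeleteSketch and the methods deleteFile and deleteAll are hypothetical names made up for illustration; the only library call relied on is java.io.File.delete(), which already reports failure as false instead of throwing. Treating a missing file as "nothing left to do" is an assumption of this sketch, not something the thread settles.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DeleteSketch {
    // Hypothetical helper: true if the file was deleted, false otherwise.
    // No exception is raised for a file that simply is not there.
    static boolean deleteFile(File dir, String name) {
        return new File(dir, name).delete();
    }

    // Hypothetical caller: returns the names that still need deleting later
    // (the role the "deletable" file plays in the thread).
    static List<String> deleteAll(File dir, List<String> names) {
        List<String> deletable = new ArrayList<>();
        for (String name : names) {
            boolean deleted = deleteFile(dir, name);
            // Assumption: only files that still exist are worth retrying.
            if (!deleted && new File(dir, name).exists()) {
                deletable.add(name);
            }
        }
        return deletable;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("segments").toFile();
        new File(dir, "_5.cfs").createNewFile();
        // _5.f1 and _5.f2 do not exist, as in the case described above.
        List<String> pending = deleteAll(dir, Arrays.asList("_5.cfs", "_5.f1", "_5.f2"));
        System.out.println("still pending: " + pending);
        dir.delete();
    }
}
```

With this shape the common "file not found" case costs a boolean check instead of a throw and catch; genuinely unexpected failures could still be reported through exceptions by the caller.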

Thanks,
Monsur

P.S. I haven't done any perf tests to verify this; it was just a thought.
I'll look into pulling something together.



---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: Questions about DeleteFile method

Otis Gospodnetic-2
Judging from the method name, this is back in Lucene.Net, so maybe this
is a bug in the .Net port.  The .cfs file indicates that you are using
the compound index format, which means that *.fN files should not be
deleted explicitly like that.
I wonder if you see the same behaviour with Lucene (Java).

Otis



RE: Questions about DeleteFile method

Monsur Hossain

This does happen in the Java version (1.4.3), but I now have a better idea
of what's going on.  I felt really cool, wrote a big long explanation about
it, and then just for kicks checked the code in the Repository.  Guess what,
it will be fixed in the next version.  If you're interested, the issue is in
the files() method, around line 219, of Index\SegmentReader.java; in version
1.4.3 there's no check if the file exists before adding it to the Vector.
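
To make the missing check concrete, here is a rough sketch of the pattern. This is not the SegmentReader source: the class SegmentFilesSketch, the method normFiles, and the fieldCount parameter are hypothetical, and it uses java.io.File and a List where the real code works with a Directory and a Vector.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

final class SegmentFilesSketch {
    // Hypothetical stand-in for building the list of per-field files of a
    // segment: a name like "_5.f1" is only added if the file is really there.
    static List<String> normFiles(File dir, String segment, int fieldCount) {
        List<String> files = new ArrayList<>();
        for (int i = 0; i < fieldCount; i++) {
            String name = segment + ".f" + i;
            // The existence check that the 1.4.3 code is said to lack.
            if (new File(dir, name).exists()) {
                files.add(name);
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("segments").toFile();
        File f1 = new File(dir, "_5.f1");
        f1.createNewFile();
        // Only _5.f1 exists, so _5.f0 and _5.f2 are never queued for deletion.
        System.out.println(normFiles(dir, "_5", 3));
        f1.delete();
        dir.delete();
    }
}
```

Without the check, every candidate name ends up in the list and the later delete pass trips over the ones that were never written, which matches what I was seeing.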

Thanks,
Monsur

 


RE: Questions about DeleteFile method

George Aroush
Hi Monsur,

I just finished porting SegmentReader.java to C# last night for 1.9 RC1 and
I did see your observation -- this is now fixed in RC1.

All: Speaking of my port work for 1.9 RC1, I don't have a clear idea what to
do about java.util.zip.  There is no equivalent in .NET and it is being used
in Lucene 1.9 RC1 for Index.FieldsWriter and Index.FieldsReader.  Any
suggestions?
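
For reference, this is roughly the kind of java.util.zip usage a port has to reproduce. It is a sketch only, assuming the Java side goes through Deflater and Inflater; the class ZipSketch and its compress and uncompress methods are made-up names, not the FieldsWriter or FieldsReader code.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class ZipSketch {
    // Compress a byte array with the JDK's Deflater (zlib-wrapped deflate).
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Reverse of compress(); fails on truncated or dictionary-based input.
    static byte[] uncompress(byte[] input) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("incomplete or unsupported input");
                }
                out.write(buf, 0, n);
            }
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws DataFormatException {
        byte[] original = "a stored field value, repeated repeated repeated"
                .getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(original);
        byte[] unpacked = uncompress(packed);
        System.out.println(new String(unpacked, StandardCharsets.UTF_8));
    }
}
```

Whatever the .NET side uses would need to read and write the same byte format for indexes to stay compatible between the two versions.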

Regards,

-- George Aroush



Re: Questions about DeleteFile method

Pasha Bizhan
Hi,

  "George Aroush" <[hidden email]> wrote:

> All: Speaking of my port work for 1.9 RC1, I don't have a clear idea what to
> do about java.util.zip.  There is no equivalent in .NET and it is being used
> in Lucene 1.9 RC1 for Index.FieldsWriter and Index.FieldsReader.  Any
> suggestion?

SharpZipLib. We use it for our port :)) Our current compatibility
tests work well, but we don't have the final results yet.

Pasha Bizhan
http://lucenedotnet.com
