DO NOT REPLY [Bug 35029] New: - Inconsistent Read and write behavior


Bugzilla from bugzilla@apache.org
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG-RELATED
COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=35029>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=35029

           Summary: Inconsistent Read and write behavior
           Product: Lucene
           Version: 1.4
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Index
        AssignedTo: [hidden email]
        ReportedBy: [hidden email]


When a term is written for an undefined field, the field is inserted into the
index as field number -1, and reading the same index back throws an exception.
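
The failure mode can be illustrated outside of Lucene. The following is only a
minimal sketch: FieldTableSketch is a hypothetical stand-in for a field table
backed by a plain list, not Lucene's FieldInfos class. It assumes, as described
above, that looking up an unknown name yields -1 and that this number is later
used as an index on the read side.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a field table; this is NOT Lucene's FieldInfos.
public class FieldTableSketch {
    private final List<String> names = new ArrayList<>();

    /** Registers a field name (if new) and returns its number. */
    public int add(String name) {
        int existing = names.indexOf(name);
        if (existing >= 0) {
            return existing;
        }
        names.add(name);
        return names.size() - 1;
    }

    /** Returns the field number, or -1 when the name was never registered. */
    public int fieldNumber(String name) {
        return names.indexOf(name);
    }

    /** Returns the name for a field number; throws for numbers never assigned. */
    public String fieldName(int number) {
        return names.get(number);
    }

    public static void main(String[] args) {
        FieldTableSketch table = new FieldTableSketch();
        table.add("title");

        // Write side: an unknown field silently maps to -1.
        int number = table.fieldNumber("body");
        System.out.println("number that would be written: " + number);

        // Read side: using that number as an index fails.
        try {
            table.fieldName(number);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("read failed for field number " + number);
        }
    }
}
```

The write side raises no error here, so the problem only surfaces when the
stored number is resolved back to a field.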

First of all, the IndexWriter should not allow the operation to succeed if the
field is not known.

Second, if the data is allowed to be written, we should at least be able to
read it back without any problem.
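
The first point, rejecting the write up front, can be sketched as follows. This
is an illustration under stated assumptions, not the proposed patch itself:
CheckedFieldWriterSketch, writeField and bytesWritten are hypothetical names,
and java.io.DataOutputStream stands in for Lucene's output stream classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical writer that validates the field number before writing anything.
public class CheckedFieldWriterSketch {
    private final List<String> knownFields;
    private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(bytes);

    public CheckedFieldWriterSketch(List<String> knownFields) {
        this.knownFields = knownFields;
    }

    /** Writes one field, or throws before touching the stream if it is unknown. */
    public void writeField(String name, String value) throws IOException {
        int number = knownFields.indexOf(name); // -1 when the field is unknown
        if (number < 0) {
            throw new IOException("Unknown field " + name);
        }
        out.writeInt(number);
        out.writeUTF(value);
    }

    /** Number of bytes written so far. */
    public int bytesWritten() {
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        CheckedFieldWriterSketch writer =
            new CheckedFieldWriterSketch(Arrays.asList("title", "body"));
        writer.writeField("title", "hello");
        int before = writer.bytesWritten();
        try {
            writer.writeField("author", "nobody"); // not a known field
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("stream unchanged: " + (writer.bytesWritten() == before));
    }
}
```

Because the check runs before the first write call, a rejected field leaves no
partial record behind in the stream.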

If one uses the default IndexReader, IndexWriter and SegmentMerger, this error
may not occur.  However, it is a simple fix for the code to reject bad data.
Please review and commit the changes.  I am not sure whether any other classes
require a similar fix. Our usage uncovered the following files:

-- TermInfosWriter.java

private final void writeTerm(Term term)
    throws IOException {
    int iField = fieldInfos.fieldNumber(term.field);
    if (iField < 0) {
        throw new IOException("Unknown field " + term.field + "; term=" + term.text);
    }
    int start = stringDifference(lastTerm.text, term.text);
    int length = term.text.length() - start;

    output.writeVInt(start); // write shared prefix length
    output.writeVInt(length); // write delta length
    output.writeChars(term.text, start, length); // write delta chars

    output.writeVInt(iField); // write field num

    lastTerm = term;
}

 

-- FieldsReader.java

final Document doc(int n) throws IOException {
    indexStream.seek(n * 8L);
    long position = indexStream.readLong();
    fieldsStream.seek(position);

    Document doc = new Document();
    int numFields = fieldsStream.readVInt();
    for (int i = 0; i < numFields; i++) {
      int fieldNumber = fieldsStream.readVInt();
      byte bits = fieldsStream.readByte();
      // always consume the value so the stream stays aligned for the next field
      String stFieldValue = fieldsStream.readString();
      // skip fields stored with an invalid (negative) field number
      if (fieldNumber >= 0) {
          FieldInfo fi = fieldInfos.fieldInfo(fieldNumber);

          doc.add(new Field(fi.name, // name
                            stFieldValue, // read value
                            true, // stored
                            fi.isIndexed, // indexed
                            (bits & 1) != 0)); // tokenized
      }
    }

    return doc;
  }

-- FieldsWriter.java

final void addDocument(Document doc) throws IOException {
    indexStream.writeLong(fieldsStream.getFilePointer());
   
    int storedCount = 0;
    Enumeration fields  = doc.fields();
    while (fields.hasMoreElements()) {
      Field field = (Field)fields.nextElement();
      if (field.isStored())
        storedCount++;
    }
    fieldsStream.writeVInt(storedCount);
   
    fields  = doc.fields();
    while (fields.hasMoreElements()) {
      Field field = (Field)fields.nextElement();
      if (field.isStored()) {
        int iField = fieldInfos.fieldNumber(field.name());
        if (iField == -1) {
          throw new IOException("Unknown field " + field.name());
        }
        fieldsStream.writeVInt(iField);

        byte bits = 0;
        if (field.isTokenized())
          bits |= 1;
        fieldsStream.writeByte(bits);

        fieldsStream.writeString(field.stringValue());
      }
    }
  }
