More on "ant regenerate" target:

Erick Erickson
I have the git pull working for fetching a particular revision of nfkc.txt and the like. Now TestICUFoldingFilterFactory fails. Here's what I could find on that topic:

org.apache.lucene.analysis.icu.ICUFoldingFilter
  public static final Normalizer2 NORMALIZER = Normalizer2.getInstance(
    // TODO: if the wrong version of the ICU jar is used, loading these data files may give a strange error.
    // maybe add an explicit check? http://icu-project.org/apiref/icu4j/com/ibm/icu/util/VersionInfo.html
    ICUFoldingFilter.class.getResourceAsStream("utr30.nrm"),
    "utr30", Normalizer2.Mode.COMPOSE);
eventually calls:

com.ibm.icu.impl.Normalizer2Impl
 public Normalizer2Impl load(ByteBuffer bytes) {
    try {
      this.dataVersion = ICUBinary.readHeaderAndDataVersion(bytes, 1316121906, IS_ACCEPTABLE);
which throws
Caused by: com.ibm.icu.util.ICUUncheckedIOException: java.io.IOException: ICU data file error: Header authentication failed, please check if you have a valid ICU data file; data format 4e726d32, format version 4.0.0.0

0x4e726d32==1316121906, so the data format looks ok to my uninformed eye.
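A quick way to double-check that comparison is to decode the tag. This is only a sketch; the class and method names are made up for illustration. It confirms the two numbers are equal and shows that the four bytes spell an ASCII tag. A matching tag only says the file is the expected kind of data file. The same message also reports format version 4.0.0.0, which is presumably what the acceptance check rejects.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Lucene or ICU.
final class DataFormatTag {
    /** Renders a 32-bit ICU data format value as its four ASCII bytes. */
    static String toAscii(int dataFormat) {
        byte[] bytes = {
            (byte) (dataFormat >>> 24),
            (byte) (dataFormat >>> 16),
            (byte) (dataFormat >>> 8),
            (byte) dataFormat
        };
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        int expected = 1316121906; // constant passed to readHeaderAndDataVersion
        int reported = 0x4e726d32; // "data format" from the exception message
        System.out.println(expected == reported); // prints "true"
        System.out.println(toAscii(reported));    // prints "Nrm2"
    }
}
```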

The jar file I have for icu is: icu4j-62.1.jar

I looked at the nfc* files that are now fetched from GitHub and at least ./lucene/analysis/icu/src/data/utr30/nfc.txt is identical.
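If the text inputs are identical, the compiled utr30.nrm is the next thing worth looking at. Below is a rough sketch for reading the header fields of a generated file directly. The byte offsets (magic bytes at 2-3, data format at 12-15, format version at 16-19) are my understanding of the ICU binary header layout, so treat them as an assumption. The class name is invented for the example.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical debugging aid, not part of Lucene or ICU.
final class IcuHeaderPeek {
    /**
     * Summarizes the data format tag and format version of an ICU binary file.
     * Assumed layout: bytes 2-3 magic (0xda 0x27), 12-15 data format,
     * 16-19 format version.
     */
    static String describe(byte[] header) {
        if (header.length < 24) {
            throw new IllegalArgumentException("need at least 24 header bytes");
        }
        if ((header[2] & 0xff) != 0xda || (header[3] & 0xff) != 0x27) {
            throw new IllegalArgumentException("bad magic bytes, not an ICU data file?");
        }
        String format = new String(header, 12, 4, StandardCharsets.US_ASCII);
        String version = (header[16] & 0xff) + "." + (header[17] & 0xff) + "."
            + (header[18] & 0xff) + "." + (header[19] & 0xff);
        return format + " " + version;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a .nrm file, e.g. a freshly regenerated utr30.nrm
        byte[] header = new byte[24];
        try (DataInputStream in = new DataInputStream(Files.newInputStream(Paths.get(args[0])))) {
            in.readFully(header); // throws EOFException if the file is too short
        }
        System.out.println(describe(header));
    }
}
```

Running it on the checked-in utr30.nrm and on the regenerated one should show whether the two files were written with different format versions.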

I'll get back to this later this afternoon. Meanwhile, any pointers?

Re: More on "ant regenerate" target:

Robert Muir
Hi Erick. Sorry for the slow reply on this one. Make sure you have the correct icu4c version at the beginning of your PATH before running ant regenerate. It should match the icu4j version. It seems to me you have a mismatch.

On Wed, Dec 4, 2019, 2:32 PM Erick Erickson wrote:
> [...]

Re: More on "ant regenerate" target:

Robert Muir
IMO we should open an issue to make the regenerate task do some kind of check and fail in a clear way if this happens. It would save people from crazy debugging.
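As a starting point for that issue, here is a minimal sketch of what such a check could look like. Everything in it is hypothetical: the class and method names are invented, and it assumes the build can somehow obtain the icu4c version as a "major.minor" string (how it would get that from the tools on the PATH is left open). It only compares that string against the version embedded in the icu4j jar name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical fail-fast check, not an existing Lucene build task.
final class IcuVersionCheck {
    private static final Pattern MAJOR_MINOR = Pattern.compile("(\\d+)\\.(\\d+)");

    /** Pulls the first "major.minor" pair out of text such as "icu4j-62.1.jar". */
    static String extractVersion(String text) {
        Matcher m = MAJOR_MINOR.matcher(text);
        if (!m.find()) {
            throw new IllegalArgumentException("no major.minor version in: " + text);
        }
        return m.group(1) + "." + m.group(2);
    }

    /** Throws with a readable message when the two versions differ. */
    static void requireMatch(String icu4jSource, String icu4cSource) {
        String icu4j = extractVersion(icu4jSource);
        String icu4c = extractVersion(icu4cSource);
        if (!icu4j.equals(icu4c)) {
            throw new IllegalStateException(
                "icu4c " + icu4c + " on PATH does not match icu4j " + icu4j
                + "; regenerated data files may fail to load");
        }
    }

    public static void main(String[] args) {
        // "64.2" is an invented icu4c version, used only to trigger the failure path.
        try {
            requireMatch("icu4j-62.1.jar", "64.2");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing before the data files are generated would turn the "Header authentication failed" stack trace into a one-line message about the actual cause.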

On Sat, Dec 7, 2019 at 11:14 AM Robert Muir wrote:
> [...]