StandardTokenizer generation from JFlex grammar

StandardTokenizer generation from JFlex grammar

vempap
CONTENTS DELETED
The author has deleted this message.
RE: StandardTokenizer generation from JFlex grammar

steve_rowe
Hi Phani,

Assuming you're using Lucene 3.6.X, see:

<http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/lucene/core/src/java/org/apache/lucene/analysis/standard/READ_BEFORE_REGENERATING.txt>

and

<http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_3_6/lucene/common-build.xml?revision=1364130&view=markup#l356>

I've pasted the relevant contents below:

====
WARNING: if you change StandardTokenizerImpl*.jflex or UAX29URLEmailTokenizer
and need to regenerate the tokenizer, only use the trunk version
of JFlex 1.5 (with a minimum SVN revision 597) at the moment!
----
Please install the jFlex 1.5 version (currently not released)
from its SVN repository:

 svn co http://jflex.svn.sourceforge.net/svnroot/jflex/trunk jflex
 cd jflex
 mvn install

Then, create a build.properties file either in your home
directory, or within the Lucene directory and set the jflex.home
property to the path where the JFlex trunk checkout is located
(in the above example it's the directory called "jflex").
====
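
The last step above can be sketched roughly as follows. This is only an illustration: the paths held in JFLEX_HOME and LUCENE_DIR are hypothetical placeholders for your own JFlex trunk checkout and Lucene source directory, and the target name in the final comment is an assumption to verify against the build files linked above.

```shell
# Hypothetical location of the JFlex trunk checkout; adjust to your machine.
JFLEX_HOME="${JFLEX_HOME:-$HOME/jflex}"
# Hypothetical location of the Lucene source tree (defaults to the current directory).
LUCENE_DIR="${LUCENE_DIR:-.}"

# Record the checkout path under the jflex.home property, as the README describes.
printf 'jflex.home=%s\n' "$JFLEX_HOME" >> "$LUCENE_DIR/build.properties"

# Show the resulting file so the setting can be checked by eye.
cat "$LUCENE_DIR/build.properties"

# Assumption: the regeneration target is then run with ant from the Lucene
# directory (believed to be named "jflex" in the 3.x build); confirm the name
# in the build files before relying on it.
```

Putting build.properties in the Lucene directory keeps the setting local to one checkout; the README also allows placing it in your home directory instead.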

Steve

-----Original Message-----
From: vempap [mailto:[hidden email]]
Sent: Thursday, October 04, 2012 7:43 PM
To: [hidden email]
Subject: StandardTokenizer generation from JFlex grammar

Hello,

  I'm trying to regenerate the standard tokenizer from its JFlex
specification (StandardTokenizerImpl.jflex), but the generation fails with
errors. I eventually want to write my own JFlex file based on the standard
tokenizer, so I'm first trying to regenerate the original to get the hang
of things.

I'm using jflex 1.4.3 and I ran into the following error:

Error in file "<filename>" (line 64):
Syntax error.
HangulEx       = (!(!\p{Script:Hangul}|!\p{WB:ALetter})) ({Format} | {Extend})*


I also tried installing an Eclipse plugin from
http://cup-lex-eclipse.sourceforge.net/, which I thought would provide
options similar to the JavaCC plugin (http://eclipse-javacc.sourceforge.net/)
for generating classes from within Eclipse, but had no luck.

Any help would be appreciated.

Regards,
Phani.



--
View this message in context: http://lucene.472066.n3.nabble.com/StandardTokenizer-generation-from-JFlex-grammar-tp4011939.html
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]



RE: StandardTokenizer generation from JFlex grammar

vempap
CONTENTS DELETED
The author has deleted this message.