[lucy-user] Serialization error when inheriting from Lucy::Analysis::Normalizer


Knut Arne Bjørndal
Hi

I'm implementing a custom normalization analyzer and it would be
convenient to let it inherit from the default implementation, but I'm
having trouble with the serialization and deserialization.

I override dump to add some extra data to the serialized object, which
seems to be the part with the problem.

Minimal test case:
package My::Normalizer;
use base qw( Lucy::Analysis::Normalizer );

sub dump {
  shift->SUPER::dump(@_);
}

1;

Adding this to a schema causes an exception to be thrown when the schema
is deserialized:
Can't downcast from Lucy::Object::CharBuf to Lucy::Object::BoolNum
        lucy_Normalizer_load at .../Normalizer.c line 151
        at /usr/lib/perl5/Lucy.pm line 239
        Lucy::Index::Indexer::new('Lucy::Index::Indexer', 'index',
'index.1246/', 'schema', 'Lucy::Plan::Schema=SCALAR(0x2e4fc90)',
'create', 1)

Looking at schema.json a Lucy::Analysis::Normalizer is serialized to:
        {
          "_class": "Lucy::Analysis::Normalizer",
          "case_fold": true,
          "normalization_form": "NFKC",
          "strip_accents": false
        },
while my subclassed instance is serialized to:
        {
          "_class": "My::Normalizer",
          "case_fold": "1",
          "normalization_form": "NFKC",
          "strip_accents": "0"
        },

So it looks like the serializer doesn't correctly handle boolean values
when called from perl?
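
A minimal sketch of how a boolean can degrade into a string on the Perl side.
It uses the core JSON::PP module, not Lucy's own serializer, so it only
illustrates the general effect and is not a trace of what Lucy does
internally. The variable names are made up for the example.

```perl
use strict;
use warnings;
use JSON::PP;

my $json = JSON::PP->new->canonical;

# Decoding JSON "true" yields a dedicated boolean object, which
# encodes back to a bare true.
my $decoded    = $json->decode('{"case_fold": true}');
my $round_trip = $json->encode($decoded);
print "$round_trip\n";    # {"case_fold":true}

# Once the value has passed through a plain Perl string, the
# boolean type is gone and the encoder emits a quoted string.
my %copied      = ( case_fold => "$decoded->{case_fold}" );
my $stringified = $json->encode( \%copied );
print "$stringified\n";   # {"case_fold":"1"}
```

The second output has the same shape as the "case_fold": "1" entry in the
subclass's schema.json above.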

--
Knut Arne Bjørndal, Technician, Easy Connect AS - http://1890.no
E-mail: [hidden email]


Re: [lucy-user] Serialization error when inheriting from Lucy::Analysis::Normalizer

Marvin Humphrey
Hi,

On Mon, May 5, 2014 at 7:58 AM, Knut Arne Bjørndal
<[hidden email]> wrote:

> Minimal test case:

Thanks for providing the test case!  It allowed me to isolate the problem
right away -- here's a quick fix:

https://github.com/rectang/lucy/commit/91b7a5fde2e18370c81a9189591fa25b36b3e963

> So it looks like the serializer doesn't correctly handle boolean values
> when called from perl?

Round-trip serialization of booleans is tricky for languages like Perl which
don't provide a dedicated boolean type.  We can probably make some general
improvements on that front, but in any case it's better for client code like
Normalizer#Load to be forgiving and call To_Bool() on whatever object is
present rather than insist on a specific boolean type.
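
The "forgiving" approach can be sketched in Perl as follows. This is only an
illustration of the idea, not the code in the commit linked above: to_bool
and coerce_normalizer_dump are hypothetical helpers, and the real fix lives
in Lucy's C implementation of Normalizer's Load method.

```perl
use strict;
use warnings;

# Hypothetical helper (not Lucy's To_Bool): collapse any Perl value
# to 1 or 0 using Perl's own truthiness rules.
sub to_bool {
    my ($value) = @_;
    return $value ? 1 : 0;
}

# Hypothetical loader step: copy a dumped normalizer and coerce its
# boolean fields instead of insisting on a specific boolean type.
sub coerce_normalizer_dump {
    my ($dump) = @_;
    my %args = %$dump;
    $args{$_} = to_bool( $args{$_} ) for qw( case_fold strip_accents );
    return \%args;
}

# The dump produced by the subclass, with booleans stored as strings.
my $args = coerce_normalizer_dump({
    _class             => 'My::Normalizer',
    case_fold          => '1',
    normalization_form => 'NFKC',
    strip_accents      => '0',
});
printf "case_fold=%d strip_accents=%d\n",
    $args->{case_fold}, $args->{strip_accents};
```

With the string-valued dump from the original report, this prints
case_fold=1 strip_accents=0, so both "1" and "0" are accepted where a strict
type check would fail.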

Marvin Humphrey