[jira] Created: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

classic Classic list List threaded Threaded
33 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[jira] Created: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
----------------------------------------------------------------------------------------------------------------------------------------

                 Key: SOLR-912
                 URL: https://issues.apache.org/jira/browse/SOLR-912
             Project: Solr
          Issue Type: Improvement
          Components: clients - java
    Affects Versions: 1.4
         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
            Reporter: Kay Kay
            Priority: Minor
             Fix For: 1.3.1


The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

[jira] Updated: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated SOLR-912:
-------------------------

    Attachment: SOLR-912.patch

1) New Type-safe implementation of NamedList , ModernNamedList .

2) New interface INamedList<T> created

3) New Test case - ModernNamedList added.

4) Added more test  cases to NamedListTest .

Once the patch is approved - NamedList would be deprecated and the existing codebase in Solr would be replaced to use ModernNamedList<T> to be more type-safe.

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - java
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656381#action_12656381 ]

Noble Paul commented on SOLR-912:
---------------------------------

Type safety is not an overriding concern when a NamedList is used (that is the beauty of it). It does not help in any way. Most of the usages of NamedList involves heterogeneous values .

your implementation is not as efficient (memory usage) as the original one

The idea of having an interface is an overkill

-1

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - java
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656390#action_12656390 ]

Kay Kay commented on SOLR-912:
------------------------------

The interface is present only to enable the migration from NamedList (legacy) to the new one. (with similar properties of Cloneable, Serializable etc. ).

If type-safety is not a concern - is there a reason why NamedList<T> is defined as a generic type. We could probably define it as NamedList , with T replaced to be an object internally. not making it a generic type.

There seems to be no memory leaks as far as the container is concerned.  Creating an iterator object for every call to an iterator seems to be quite a bit of data redundancy issues when ideally we can use the iterator of one of the underlying objects as well.

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - java
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656399#action_12656399 ]

Noble Paul commented on SOLR-912:
---------------------------------

bq.The interface is present only to enable the migration from NamedList (legacy) to the new one. (with similar properties of Cloneable, Serializable etc. ).
This means we will need to use this interface wherever we use NamedList which is not desirable

NamedList is designed to achieve a specific purpose

Solr is not meant for java users only. It is also meant to be consumed over xml/json etc .There is no type safety in these. NamedList helps us to have a datastructure which can easily be converted back and forth from these.

bq.If type-safety is not a concern - is there a reason why NamedList<T> is defined as a generic type
It helps where it makes sense . But where it is not necessary I can totally omit that and javac does not complain. So , it is better to keep it generic than not.

bq.There seems to be no memory leaks as far as the container is concerned

There are no leaks. I was referring to the internal implementation . instead of one big array you keep a list of objects (means more objects, one per entry) .  

bq.Creating an iterator object for every call to an iterator

iterating over NamedList does not need to use an iterator. That is a choice left to the consumer of the API

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - java
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656408#action_12656408 ]

Kay Kay commented on SOLR-912:
------------------------------

The interface is mostly used as a proof of concept that the new class - ModernNamedList implements the same methods as that of NamedList. Eventually - the interface would be gotten rid of , once we are happy with the same.

As far as NamedList , I believe if we want to have the flexibility of allowing any type in it - we might as well define it to an Object. If we do qualify it as a specific type - then we might as well implement type-safety in the class. javac does not complain today because the compiler switch to indicate type-safety errors has been turned off.

Previous implementation used to be having a pre-jdk5 List , with members being String and a type depending on the index. The revised implementation has Map.Entry<String, T> interface - which is directly intuitive to what is required ( a Map with order being preserved , allowing duplicates and nulls ).  I did profile with 2 different implementations , involving Map.Entry<?> and a heterogenous list with String and a type (with insertion / deletion of 100,000 records). The current implementation in fact , failed in the performance comparison in both insertion / deletion in the middle of the List ( remove () ) , since we have to add/remove elements twice from the List (as in the current impl) , as compared to 1 insertion/deletion in the Map.Entry<> implementation. ) Given that addition/deletion in the List is worst-case linear - I believe the perceived performance degradation due to additional object , turns out to be not so bad when compared to 2-path insertion / deletion as we have today.

NamedList does seem to implement the interface Iterable<T> . I am not sure how the consumer of the API can have independent iterators (since only NamedList is supposed to be aware of the internal data structures and not the consumer). So I believe it would be upto NamedList<T> to provide an iterator to the user of the API.

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - java
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

[jira] Updated: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated SOLR-912:
-------------------------

    Component/s:     (was: clients - java)
                 Analysis

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657299#action_12657299 ]

Hoss Man commented on SOLR-912:
-------------------------------

If i'm understanding the discussion so far...

* ModernNamedList is being suggested as an alternate implementation of NamedList ... ideally the internals of NamedLIst would be replaced with the internals of ModernNamedList, but in this patch they are seperate classes so they can be compared.
* INamedList is included in the patch as a way to demonstrate that ModernNamedList fulfills the same contract as NamedList (for the purposes of testing etc)

do i have those aspects correct?

with that in mind: i'm not sure i understand what "itch" changing the implementation "scratches" ... the initial issue description says it's because NamedList " is not necessarily type-safe" but it's not clear what that statement is referring to ... later comments suggest that the motivation is to improve the performance of "remove" ... which hardly seems like something worth optimizing for.

I agree that having the internals based on a "list of pairs" certainly seems like it might be more intuitive to developers looking at the internals (then the current approach is), but how is the current approach less type safe for consumers using just the NamedList API?

If the "modern" approach is more performant then the existing impl and passes all of the tests then i suppose it would make sense to switch -- but i'm far more interested in how the performance compares for common cases (add/get/iterate) then for cases that hardly ever come up (remove).

My suggestion: provide two independent attachments.  One patch that just replaces the internals of NamedList with the approach you suggest so people can apply the patch, test it out, and verify the API/behavior; A second attachment that provides some benchmarks against the NmaedList class -- so people can read/run your benchmark with and with out the patch to see how the performance changes.


> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

[jira] Updated: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated SOLR-912:
-------------------------

    Attachment: NLProfile.java

a sample benchmarking program that works with the previous patch submitted.

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: NLProfile.java, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657627#action_12657627 ]

Kay Kay commented on SOLR-912:
------------------------------

| ModernNamedList is being suggested as an alternate implementation of NamedList ... ideally the internals of NamedLIst would be replaced with the internals of ModernNamedList, but in this patch they are seperate classes so they can be compared.

| INamedList is included in the patch as a way to demonstrate that ModernNamedList fulfills the same contract as NamedList (for the purposes of testing etc)

True.

Attached herewith is:  NLProfile.java - that contains sample benchmarking against the 2 implementations (will work with the previous page on the page).

Some results:

addAll / getAll():   increase in performance is almost [1-10]% range.

add: increase in performance by around 30% , probably because of the additional growth in the List implementation when size approaches capacity. And since, in NamedList - we insert 2 elements as opposed to one, ( as done in ModernNamedList) - it might be more pronounced.


iterator:   ~70% increase in performance in favor of the new implementation since it just reuses the iterator for the internal data structure.

The numbers should be present as comments in the corresponding methods - testAdd() , testAddAll(), testGetAll(), testIterator() .

I will attach the final patch once we are convinced with the  benchmark methodology and the numbers.


> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: NLProfile.java, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657630#action_12657630 ]

Kay Kay commented on SOLR-912:
------------------------------

Additional Info: JRE 6,  Linux 2.6.27-9 ,  3.2GB Memory, Dual-core Intel @ 2.53 Ghz.

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: NLProfile.java, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12657692#action_12657692 ]

Yonik Seeley commented on SOLR-912:
-----------------------------------

While we're going down the micro-benchmarking path, I tried eliminating ArrayList and got an additional 15-25% gain on common operations (create new, add between 5 and 15 elements, and then iterate over those elements later).  This was with Java 1.6.  -Xbatch improved the results even more... ~40% - but this is just a micro-benchmark.

{code}
class NamedList2<T> implements INamedList<T> {

  protected NamedListEntry<T>[] nvPairs;
  protected int size;

  public NamedList2() {
    nvPairs = new NamedListEntry[10];
    size = 0;
  }

  @Override
  public int size() {
    return size;
  }

  @Override
  public String getName(int idx) {
    if (idx >= size) throw new ArrayIndexOutOfBoundsException();
    return nvPairs[idx].key;
  }

  @Override
  public T getVal(int idx) {
    if (idx >= size) throw new ArrayIndexOutOfBoundsException();    
    return nvPairs[idx].value;
  }

  private void resize() {
    NamedListEntry<T>[] arr = new NamedListEntry[nvPairs.length << 1];
    System.arraycopy(nvPairs, 0, arr, 0, size);
    nvPairs = arr;
  }

  @Override
  public void add(String name, T val) {
    if (size >= nvPairs.length) {
      resize();
    }
    nvPairs[size++] = new NamedListEntry<T>(name, val);
  }

[...]
{code}

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.3.1
>
>         Attachments: NLProfile.java, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

[jira] Updated: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-912:
---------------------------------------

      Component/s:     (was: Analysis)
                   search
    Fix Version/s:     (was: 1.3.1)
                   1.4

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: NLProfile.java, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658114#action_12658114 ]

Kay Kay commented on SOLR-912:
------------------------------

System.arrayCopy is great. It is bound to perform much better because of the native code for the same.

Meanwhile - w.r.t resize() - ( trade-off because increasing size a lot would increase memory usage.  increase a size by a smaller factor would be resulting in a more frequent increases in size). I believe reading some theory that the ideal increase factor is somewhere close to  ( 1 + 2^0.5) / 2  or something similar to that.


The method - ensureCapacity(capacity) in ArrayList (Java 6) also seems to be a number along the lines ~ (1.5)

            int newCapacity = (oldCapacity * 3)/2 + 1;

+1 seems to be move away from 0, and keep incrementing the count. ( Hmm .. That piece of code - in Java 6 ArrayList can definitely make use of bitwise operators for the div-by-2 operation !!).



> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: NLProfile.java, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

[jira] Updated: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

     [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated SOLR-912:
-------------------------

    Attachment: SOLR-912.patch

Introduce another ctor. called   Type(Object [] ) to distinguish them from List<Map.Entry<String, T > > and List of objects.

Change the invocation in DebugComponent   . Highlight Component etc.

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: NLProfile.java, SOLR-912.patch, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

Re: [jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Mike Klaas
In reply to this post by Tim Allison (Jira)

On 19-Dec-08, at 8:27 AM, Kay Kay (JIRA) wrote:
>
> Meanwhile - w.r.t resize() - ( trade-off because increasing size a  
> lot would increase memory usage.  increase a size by a smaller  
> factor would be resulting in a more frequent increases in size). I  
> believe reading some theory that the ideal increase factor is  
> somewhere close to  ( 1 + 2^0.5) / 2  or something similar to that.

It should be benchmarked, but yes, a factor of two is typically more  
memory wasteful than the performance it gains (you have a 50% chance  
of wasting at least 1/4 of your memory, a 25% chance of wasting at  
least 3/8th, etc.)

> The method - ensureCapacity(capacity) in ArrayList (Java 6) also  
> seems to be a number along the lines ~ (1.5)
>
>    int newCapacity = (oldCapacity * 3)/2 + 1;
>
> +1 seems to be move away from 0, and keep incrementing the count.  
> ( Hmm .. That piece of code - in Java 6 ArrayList can definitely  
> make use of bitwise operators for the div-by-2 operation !!).

Let's not go crazy here guys.  This relatively trivial calculation is  
only called log(n) times, and certainly uses bit ops after the jit  
gets its hands on it.

-Mike
Reply | Threaded
Open this post in threaded view
|

Re: [jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Yonik Seeley
On Fri, Dec 19, 2008 at 7:10 PM, Mike Klaas <[hidden email]> wrote:

>
> On 19-Dec-08, at 8:27 AM, Kay Kay (JIRA) wrote:
>>            int newCapacity = (oldCapacity * 3)/2 + 1;
>>
>> +1 seems to be move away from 0, and keep incrementing the count. ( Hmm ..
>> That piece of code - in Java 6 ArrayList can definitely make use of bitwise
>> operators for the div-by-2 operation !!).
>
> Let's not go crazy here guys.  This relatively trivial calculation is only
> called log(n) times, and certainly uses bit ops after the jit gets its hands
> on it.

Log(n) point is well taken.

The translation from "/2" to ">>1" can only really take place when the
compiler knows it's unsigned.  This might be a simple enough case for
the compiler to figure it out, but who knows... it took forever (late
Java6) to generate native rotate instructions.

Oh, and Kay, the simplest way to strength-reduce x*3/2 is (x+x/2) or (x+(x>>1))

-Yonik
Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658249#action_12658249 ]

Hoss Man commented on SOLR-912:
-------------------------------

If there are performance gains to be had in the common case i'm all for it ... but i still feel like i'm not understanding the original goal: how does this approach give us more type safety?

We also need to make sure we don't eliminate any public constructors, which seems to be the case based on my quick glance at the latest patch.

> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: NLProfile.java, SOLR-912.patch, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

[jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Tim Allison (Jira)
In reply to this post by Tim Allison (Jira)

    [ https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658251#action_12658251 ]

Kay Kay commented on SOLR-912:
------------------------------

|  We also need to make sure we don't eliminate any public constructors, which seems to be the case based on my quick glance at the latest patch.


<code>
-   public NamedList(List nameValuePairs) {
-   nvPairs=nameValuePairs;
+  protected NamedList(List<Map.Entry<String, T>> nameValuePairs) {
+    nvPairs = nameValuePairs;
-   }
</code>

As part of ensuring type-safety , the previous code had a heterogenous List ctor. as before.  I changed the access level and added another public ctor.  ( Object [] ) with deprecated tag to it so that people are still able to use the functionality.

Otherwise - retaining the same signature after type safety would imply - people passing in a List of String-s and T-s , when the List expects Map.Entry<String , T > and would cause more confusion.

Thanks to the erasure of generics , List and List<Map.Entry<String, T>> are all equal , not helping here.
If backward compatibility is the key here-  I can revisit the patch again ensuring the same.


| If there are performance gains to be had in the common case i'm all for it ... but i still feel like i'm not understanding the original goal: how does this approach give us more type safety?

When I logged the issue - type-safety was the major reason behind the same. When I submitted by first patch and did the benchmarking - performance was also found to be a major constraint , (with incremental addition and creation of iterator objects).  NamedList seemed to be used all over the place. As long as we preserve the contract of the methods - this should definitely give an additional boost - since I discovered as part of profiling of the launch of SolrCore ( CoreContainer.Initializer.initalize() .. ) .




> org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList
> ----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-912
>                 URL: https://issues.apache.org/jira/browse/SOLR-912
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 1.4
>         Environment: Tomcat 6, JRE 6, Solr 1.3+ nightlies
>            Reporter: Kay Kay
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: NLProfile.java, SOLR-912.patch, SOLR-912.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> The implementation of NamedList - while being fast - is not necessarily type-safe. I have implemented an additional implementation of the same - ModernNamedList (a type-safe variation providing the same interface as NamedList) - while preserving the semantics in terms of ordering of elements and allowing null elements for key and values (keys are always Strings , while values correspond to generics ).

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply | Threaded
Open this post in threaded view
|

Re: [jira] Commented: (SOLR-912) org.apache.solr.common.util.NamedList - Typesafe efficient variant - ModernNamedList introduced - implementing the same API as NamedList

Kay Kay-2-3
In reply to this post by Yonik Seeley
On Fri, Dec 19, 2008 at 8:35 PM, Yonik Seeley <[hidden email]> wrote:

> On Fri, Dec 19, 2008 at 7:10 PM, Mike Klaas <[hidden email]> wrote:
> >
> > On 19-Dec-08, at 8:27 AM, Kay Kay (JIRA) wrote:
> >>            int newCapacity = (oldCapacity * 3)/2 + 1;
> >>
> >> +1 seems to be move away from 0, and keep incrementing the count. ( Hmm
> ..
> >> That piece of code - in Java 6 ArrayList can definitely make use of
> bitwise
> >> operators for the div-by-2 operation !!).
> >
> > Let's not go crazy here guys.  This relatively trivial calculation is
> only
> > called log(n) times, and certainly uses bit ops after the jit gets its
> hands
> > on it.
>
> Log(n) point is well taken.
>
> The translation from "/2" to ">>1" can only really take place when the
> compiler knows it's unsigned.  This might be a simple enough case for
> the compiler to figure it out, but who knows... it took forever (late
> Java6) to generate native rotate instructions.
>
> Oh, and Kay, the simplest way to strength-reduce x*3/2 is (x+x/2) or
> (x+(x>>1))


I agree.  If we are doing any numeric calculations with the numbers
predetermined - we might as well help the compiler generate the most
efficient code, instead of human readable ease . (That can always go in the
comments ).



>
> -Yonik
>
12