SolrCloud is sick.

Missing top level javadocs

Bram Van Dam
David Smiley mentioned this in the "SolrCloud is sick" thread. Instead
of hijacking that, I figured I'd start another thread.

On 03/11/2019 05:32, David Smiley wrote:
> <snip> requiring javadocs on all top level classes.  I think more javadocs and
> code comments would be very helpful -- especially for the major
> classes.

This sounds like something that's actionable.

I'm not sure if there are any guidelines regarding documentation on the
Solr project, but on my team there's a rule that says all classes must
have a top-level javadoc that explains the "why" of the class. "Why does
it exist/what's it for?"
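
To make the rule concrete, here is a minimal sketch of the kind of top-level comment meant. The class `RetryBudget` and everything in it are hypothetical and not taken from Solr; the point is only that the Javadoc answers "why does this exist?" rather than restating the class name.

```java
/**
 * Tracks how many retries a caller may still attempt against a flaky
 * downstream service.
 *
 * <p>Why it exists: retry loops scattered through the code each kept their
 * own counter, so no overall retry limit could be enforced. This class
 * (a hypothetical example) puts that limit in one place.
 */
public final class RetryBudget {
    private int remaining;

    public RetryBudget(int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        this.remaining = maxRetries;
    }

    /** Consumes one retry if any are left and reports whether the caller may retry. */
    public boolean tryAcquire() {
        if (remaining == 0) {
            return false;
        }
        remaining--;
        return true;
    }
}
```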

Excluding contrib, solrj and tests, there are some 400 source files with
classes with missing top level Javadoc. This includes some files with
undocumented nested "public static" classes -- couldn't find an obvious
way to exclude those using checkstyle.

Below is a "top ten most frequently modified files with missing Javadoc"
list. This is an arbitrary metric; the "most referenced classes" might
be more useful, but that was harder to hack together with shell foo.

solr/core/src/java/org/apache/solr/core/CoreContainer.java
solr/core/src/java/org/apache/solr/handler/admin/CollectionsHandler.java
solr/core/src/java/org/apache/solr/update/processor/DistributedUpdateProcessor.java
solr/core/src/java/org/apache/solr/handler/StreamHandler.java
solr/core/src/java/org/apache/solr/cloud/ElectionContext.java
solr/core/src/java/org/apache/solr/handler/component/RealTimeGetComponent.java
solr/core/src/java/org/apache/solr/update/DefaultSolrCoreState.java
solr/core/src/java/org/apache/solr/search/JoinQParserPlugin.java
solr/core/src/java/org/apache/solr/handler/component/HttpShardHandlerFactory.java
solr/core/src/java/org/apache/solr/handler/SolrConfigHandler.java
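
For anyone who would rather not do this in shell, the ranking above could be sketched roughly as follows. This is not the script that produced the list. It assumes the change history is available as one file path per line (for example from `git log --name-only --pretty=format:`) and that the set of offending files comes from the checkstyle run; the class and method names are made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public final class ChurnRanking {
    private ChurnRanking() {}

    /**
     * Ranks the offending files by how often they appear in the change log.
     *
     * @param changedPaths one entry per (commit, file) pair; blank lines are ignored
     * @param offenders    files reported as missing a top-level Javadoc
     * @param limit        maximum number of files to return
     */
    public static List<String> topModified(
            List<String> changedPaths, Set<String> offenders, int limit) {
        // Count how many times each offending file shows up in the history.
        Map<String, Long> counts = changedPaths.stream()
                .map(String::trim)
                .filter(offenders::contains)
                .collect(Collectors.groupingBy(path -> path, Collectors.counting()));

        // Most frequently modified first; ties are broken by path for stable output.
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

Swapping the input for a list of import statements would give the "most referenced classes" variant with the same counting logic.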

If there's any interest in this, I could write a patch to include
something like this in the build (ant or gradle, whatever).

 - Bram

The following Checkstyle configuration detects classes with missing Javadoc:

config.xml:
===========

<!DOCTYPE module PUBLIC
  "-//Checkstyle//DTD Checkstyle Configuration 1.3//EN"
  "https://checkstyle.org/dtds/configuration_1_3.dtd">

<module name="Checker">
        <module name="TreeWalker">
                <module name="MissingJavadocType"/>
        </module>
</module>

Bit of shell foo to list offending files:
=========================================

java -jar checkstyle-8.26-all.jar -c config.xml solr/ | cut -d ' ' -f 2 \
  | sed "s:.*/lucene-solr/::g" | cut -d ':' -f 1 | sort | uniq


---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Re: Missing top level javadocs

Andrzej Białecki-2
+1, I think it’s an excellent idea. The check should verify not only that the comment exists but also that it’s not empty; e.g., there’s an IntelliJ template that creates an empty top-level javadoc.
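
As a rough illustration of what "not empty" could mean, here is a small sketch that treats a Javadoc comment as empty when it contains nothing but delimiters, whitespace, and block tags such as `@author`. This is an assumption about the desired rule, not how Checkstyle implements any of its checks, and the class name is hypothetical.

```java
public final class JavadocText {
    private JavadocText() {}

    /**
     * Returns true if the comment holds no descriptive text, i.e. only the
     * comment delimiters, whitespace, or lines that start with a block tag.
     */
    public static boolean isEffectivelyEmpty(String comment) {
        String body = comment.trim();
        if (body.startsWith("/**")) {
            body = body.substring(3);
        }
        if (body.endsWith("*/")) {
            body = body.substring(0, body.length() - 2);
        }
        for (String line : body.split("\\R")) {
            // Drop the leading " * " decoration that Javadoc lines usually carry.
            String text = line.replaceFirst("^\\s*\\*+\\s?", "").trim();
            if (!text.isEmpty() && !text.startsWith("@")) {
                return false; // found real descriptive text
            }
        }
        return true;
    }
}
```

A template-generated comment that only carries an `@author` line would count as empty under this rule, which seems to be the case worth catching.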

> On 4 Nov 2019, at 16:40, Bram Van Dam <[hidden email]> wrote:

Re: SolrCloud is sick.

Erick Erickson
In reply to this post by Bram Van Dam
Bram:

Using Curator has been proposed before. It would require significant refactoring b/c of how deeply entwined raw ZK is in the code. That said, if we’re going to do major surgery it may be the right time to consider it.

Erick

> On Nov 4, 2019, at 9:24 AM, Bram Van Dam <[hidden email]> wrote:
>
>> SolrCloud is sick right now. The way low level Zookeeper is handled
>
> On an unrelated project, I've stopped using "raw" ZK client access and
> have switched to Curator. The API is a fair bit easier to work with, and
> it results in less ugly code. I realize that this won't go very far in
> resolving more fundamental issues, but it might be something that can
> help improve the shape of the code.
>
> - Bram

Re: Missing top level javadocs

david.w.smiley@gmail.com
In reply to this post by Andrzej Białecki-2
+1 fantastic!

~ David Smiley
Apache Lucene/Solr Search Developer


On Mon, Nov 4, 2019 at 10:45 AM Andrzej Białecki <[hidden email]> wrote:

Re: SolrCloud is sick.

Scott Blum
In reply to this post by Erick Erickson
Figuring out a better overall algorithmic & data structure design that's an order of magnitude improvement seems far more important than swapping out libraries.  And I say this as a Curator fan and committer. ;)

On Mon, Nov 4, 2019 at 11:44 AM Erick Erickson <[hidden email]> wrote:

Re: SolrCloud is sick.

Erick Erickson
If Curator would make that easier and we’re doing major surgery anyway….

But yeah, a nifty, new, more modern tool isn’t going to magically help if the design is flawed.

Or, if I’m putting my philosophical hat on, code doesn’t get gnarly intentionally. It gets gnarly because there are a bunch of problems to be solved and you don’t know what they are until you run into them. And it’s always a tension between fixing it enough to get by and fixing it by refactoring/redesign.

But eventually “fixing it enough to get by” totters under its own weight and becomes increasingly fragile, and you must take the hit and redo major portions of it. The questions now are:
1> are we at that point?
2> are we going to put the effort into rewriting some of the worst offenders?



> On Nov 4, 2019, at 1:28 PM, Scott Blum <[hidden email]> wrote:

Re: SolrCloud is sick.

Jörn Franke
I guess this is also fairly normal for software that grows over the years.
One could also write down the current use cases and interesting future use cases for Solr in a document and design anew from scratch, taking only the good pieces from the existing software.
Of course there is a certain amount of time where you need to maintain both, but this would also be the case for a major rewrite.

> On 04.11.2019 at 20:58, Erick Erickson <[hidden email]> wrote:

Re: SolrCloud is sick.

Mark Miller-3
There are 10,000 problems here.

So if you eventually land on one possible solution you agree on, we are a little closer.

There is no problem with the current design. Designs can always be improved, sure. I've made this one fast. You-won't-believe-me fast. The low-hanging fruit is astronomical, and there is more fruit above that.

We never focused on performance. Or at least didn't. That's after we harden.

Except performance is the key to everything.

SolrCloud is not the only problem. The design of Solr, of SolrCloud, they are fine. Change them, I don't care. Later. They are not a problem.

But Solr has as many problems as SolrCloud at this point. They just matter a whole hell of a lot less unless they are messing with SolrCloud. Standalone is more of a brute.

We have 60 modules that are interconnected. We have a huge code base. That is also fine.

We don't tend our garden. That's not fine. I've tended the garden before without one - more than once before. It's a great damn garden. You guys only get to see it grown over and full of weeds.

Anyway, no redesign, no library, no nothing like that is gonna save this.

This is hardly concrete awareness of a problem here. The awareness to figure out what actually are the problems and what must be done - that's expensive shit these days if you ask me. I've been wrong lots though.






On Mon, Nov 4, 2019 at 2:26 PM Jörn Franke <[hidden email]> wrote:

Re: SolrCloud is sick.

Mark Miller-3
Basically I can fix 99% of this without you guys - with simple care and effort and time that none of you are likely in the circumstances of being able to duplicate. Been there, done that, made it 100x-1000x faster to boot and added all kinds of fun.

But I can't build the rest of Solr. I don't care about facets. So let's meet halfway.

On Tue, Nov 5, 2019 at 5:14 AM Mark Miller <[hidden email]> wrote:

Re: SolrCloud is sick.

Mark Miller-3
And look, we started pretty deep in the hole. Solr started with tons of bugs or limitations that hardly mattered to it and hit SolrCloud in the eye like a train. And we were not set up to deal with that.

We never had a nice garden for SolrCloud. We started in a mess, thinking, eventually we clear the overgrowth, and we are all good. And then we started building our house and that garden went wild with a life of its own.

And our development practices, amazingly above many many many groups and standards out there, are woefully inadequate for what we are doing.

"Tests pass, I'm not sure about all this but I'm going to commit" (Tests never pass, must be a lie anyway)
"Leaving on vacation, going to fire this in"
"No one has looked at this huge thing, it's been a while, going to commit"
*commit*

And comments to that effect pretty much wrap up our careful and thoughtful attitude.

And then of course we come and clean up after, careful gardeners that we are ... no, we don't. We are not set up to be gardeners, we are not trying, even if we do, I only like grass and screw the other plants.

Without SolrCloud, Solr would be in trouble as well. Brute that it is, it could go a few more rounds. SolrCloud is a ballerina. Doesn't look it, cause we don't take care of it. But it is, and it cannot take the beating that the brute does.

- Mark

On Tue, Nov 5, 2019 at 5:19 AM Mark Miller <[hidden email]> wrote:
Basically I can fix 99% of this without you guys - with simple care and effort and time that non of you are likely in the circumstances of being able to duplicate.. Been there done that, made it 100x-1000x faster to boot and added all kinds of fun.

But I can't build the rest of Solr. I don't care about facets. So let's meet half way.

On Tue, Nov 5, 2019 at 5:14 AM Mark Miller <[hidden email]> wrote:
There are 10,000 problems here.

So if you eventually land on one possible solution you agree on, we a little closer.

There is no problem with the current design. Design's can always be improved, sure. I've made this one fast. You won't believe me fast. The low hanging fruit is astronomical, there is more fruit above that.

We never focused on performance. Or at least didn't. That's after we harden.

Except performance is the key to everything.

SolrCloud is not the only problem. The design of Solr, of SolrCloud, they are fine. Change them, I don't care. Later. They are not a problem.

But Solr has as many problems as SolrCloud at this point. This just mater  a whole hell of lot less unless they are messing with SolrCloud. Standalone is more of a brute.

We have 60 modules that are interconnected. We have a huge code base. That is also fine.

We don't tend our garden. That's not fine. I've tended the garden before without one - more than once before. It's a great damn garden. You guys only get to see it grown over and full of weeds.

Anyway, no redesign, no library, no nothing like that gonna save this.

This is hardly concrete awareness of a problem here. The awareness to figure out what actually are the problems and what must be done - that's expensive shit these days if you ask me. I've been wrong lots tough.






On Mon, Nov 4, 2019 at 2:26 PM Jörn Franke <[hidden email]> wrote:
I guess this is also a bit normal with software that grows over the years.
One could also say that one writes the current use cases and interesting future use cases for Solr in a document and designs from scratch new - taking only the good pieces out of the existing software.
Of course there is a certain amount of time where you need to maintain both - but this will be also the case for a major rewrite.

> Am 04.11.2019 um 20:58 schrieb Erick Erickson <[hidden email]>:
>
> If Curator would make that easier and we’re doing major surgery anyway….
>
> But yeah, a nifty, new, more modern tool isn’t going to magically help if the design is flawed.
>
> Or, if I’m putting my philosophical hat on, code doesn’t get gnarly intentionally. It gets gnarly because there are a bunch of problems to be solved and you don’t know what they are until you run into them. And it’s always a tension between fixing it enough to get by and fixing it by refactoring/redesign.
>
> But eventually “fixing it enough to get by” totters under it’s own weight and becomes increasingly fragile and you must take the hit and redo major portions of it. The questions now are:
> 1> are we at that point?
> 2> are we going to put the effort into rewriting some of the worst offenders?
>
>
>
>> On Nov 4, 2019, at 1:28 PM, Scott Blum <[hidden email]> wrote:
>>
>> Figuring out a better overall algorithmic & data structure design that's an order of magnitude improvement seems far more important than swapping out libraries.  And I say this as a Curator fan and committer. ;)
>>
>> On Mon, Nov 4, 2019 at 11:44 AM Erick Erickson <[hidden email]> wrote:
>> Bram:
>>
>> Using Curator has been proposed before. It would require significant refactoring b/c of how deeply entwined raw ZK is in the code. That said, if we’re going to do major surgery it may be the right time to consider it.
>>
>> Erick
>>
>>>> On Nov 4, 2019, at 9:24 AM, Bram Van Dam <[hidden email]> wrote:
>>>
>>>> SolrCloud is sick right now. The way low level ZooKeeper is handled
>>>
>>> On an unrelated project, I've stopped using "raw" ZK client access and
>>> have switched to Curator. The API is a fair bit easier to work with, and
>>> it results in less ugly code. I realize that this won't go very far in
>>> resolving more fundamental issues, but it might be something that can
>>> help improve the shape of the code.
>>>
>>> - Bram
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [hidden email]
>>> For additional commands, e-mail: [hidden email]
>>>

Re: SolrCloud is sick.

Mark Miller-3
I suppose I should toss one more out.

Hell yes, we will be using Curator.

It's insane for any group larger than 2-3 to directly use ZooKeeper. Even for that group, you want some damn good reasons to not use Curator. We can start using more assembly too (joke, Yonik).

Curator was an option initially. Then it was yet another project hosted by Netflix. Now it is essential.


- Mark

On Tue, Nov 5, 2019 at 9:41 AM Mark Miller <[hidden email]> wrote:
And look, we started pretty deep in the hole. Solr started with tons of bugs or limitations that hardly mattered to it and hit SolrCloud in the eye like a train. And we were not set up to deal with that.

We never had a nice garden for SolrCloud. We started in a mess, thinking, eventually we clear the overgrowth, and we are all good. And then we started building our house and that garden went wild with a life of its own.

And our development practices, amazingly above many many many groups and standards out there, are woefully inadequate for what we are doing.

"Tests pass, I'm not sure about all this but I'm going to commit" (Tests never pass, must be a lie anyway)
"Leaving on vacation, going to fire this in"
"No one has looked at this huge thing, it's been a while, going to commit"
*commit*

And comments to that effect pretty much wrap up our careful and thoughtful attitude.

And then of course we come and clean up after, careful gardeners that we are ... no, we don't. We are not set up to be gardeners, we are not trying, even if we do, I only like grass and screw the other plants.

Without SolrCloud, Solr would be in trouble as well. Brute that it is, it could go a few more rounds. SolrCloud is a ballerina. Doesn't look it, 'cause we don't take care of it. But it is, and it cannot take the beating that the brute does.

- Mark

On Tue, Nov 5, 2019 at 5:19 AM Mark Miller <[hidden email]> wrote:
Basically I can fix 99% of this without you guys - with simple care and effort and time that none of you are likely in the circumstances of being able to duplicate. Been there, done that, made it 100x-1000x faster to boot and added all kinds of fun.

But I can't build the rest of Solr. I don't care about facets. So let's meet half way.

On Tue, Nov 5, 2019 at 5:14 AM Mark Miller <[hidden email]> wrote:
There are 10,000 problems here.

So if you eventually land on one possible solution you agree on, we're a little closer.

There is no problem with the current design. Designs can always be improved, sure. I've made this one fast. You-won't-believe-me fast. The low-hanging fruit is astronomical, and there is more fruit above that.

We never focused on performance. Or at least didn't. That's after we harden.

Except performance is the key to everything.

SolrCloud is not the only problem. The design of Solr, of SolrCloud, they are fine. Change them, I don't care. Later. They are not a problem.

But Solr has as many problems as SolrCloud at this point. They just matter a whole hell of a lot less unless they are messing with SolrCloud. Standalone is more of a brute.

We have 60 modules that are interconnected. We have a huge code base. That is also fine.

We don't tend our garden. That's not fine. I've tended the garden before without one - more than once before. It's a great damn garden. You guys only get to see it grown over and full of weeds.

Anyway, no redesign, no library, nothing like that is gonna save this.

There is hardly any concrete awareness of a problem here. The awareness to figure out what the problems actually are and what must be done - that's expensive shit these days if you ask me. I've been wrong lots, though.







Re: SolrCloud is sick.

Mark Miller-3
I'll tell you what guys, development right now sucks. I don't enjoy it.

But when I start to put things in shape? I get this smile, and I start going with the feeling of I don't need you guys, I don't need users, I don't need a job, 'cause just this is friggin' nice.

On Tue, Nov 5, 2019 at 9:59 AM Mark Miller <[hidden email]> wrote:
I suppose I should toss one more out.

Hell yes, we will be using Curator.

It's insane for any group larger than 2-3 to directly use ZooKeeper. Even for that group, you want some damn good reasons to not use Curator. We can start using more assembly too (joke, Yonik).

Curator was an option initially. Then it was yet another project hosted by Netflix. Now it is essential.


- Mark


Re: SolrCloud is sick.

Mark Miller-3
It’s like 6-7 years since I quickly added a shitty collections API in my free time because we desperately needed SOMETHING. I don’t know if I tried to make it wait for proper state or what; it was a stub to try to get things moving. That call, to this day, along with all our other checks, until some test ones recently, is garbage.

If I downloaded a database, and a lot of the time, after the create-a-database call returned, my database was not ready, I’d say wow. Terrible bug got through. If it was a persistent issue for over half a decade? My god.
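
To make the complaint concrete, here is a minimal sketch of what "wait for proper state" could look like: a create call would block on a readiness check before it returns. This is not Solr's collections API. The class WaitForReady, the method await, and the fake readiness check in main are all hypothetical names made up for illustration, and a real check would have to inspect cluster state instead.

```java
import java.time.Duration;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not Solr code: block until a readiness check passes. */
public final class WaitForReady {

    private WaitForReady() {}

    /**
     * Polls isReady until it returns true or the timeout elapses.
     * Throws TimeoutException instead of returning quietly, so a caller
     * can never mistake "gave up waiting" for "ready".
     */
    public static void await(BooleanSupplier isReady, Duration timeout, Duration pollInterval)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!isReady.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("not ready after " + timeout);
            }
            Thread.sleep(pollInterval.toMillis());
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in (assumption) for "all replicas of the new collection are active":
        // this fake check starts succeeding on the third poll.
        AtomicInteger polls = new AtomicInteger();
        await(() -> polls.incrementAndGet() >= 3, Duration.ofSeconds(5), Duration.ofMillis(10));
        System.out.println("ready after " + polls.get() + " polls");
    }
}
```

The point of the sketch is only the contract: the call either returns once the thing is usable or fails loudly, never "returned, but not ready yet".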

Look I just spent that half decade upgrading from Solr 4 to whatever. I was mostly out of the loop. But this is crazy, me in there too.

Mark

On Tue, Nov 5, 2019 at 10:05 AM Mark Miller <[hidden email]> wrote:
I'll tell you what guys, development right now sucks. I don't enjoy it.

But when I start to put things in shape? I get this smile, and I start going with the feeling of I don't need you guys, I don't need users, I don't need a job, 'cause just this is friggin' nice.


Re: SolrCloud is sick.

Mark Miller-3
If you had any idea how much suffering just that has caused. Not just users, but us. 

Mark 

On Tue, Nov 5, 2019 at 10:38 AM Mark Miller <[hidden email]> wrote:
It’s like 6-7 years since I quickly added a shitty collections API in my free time because we desperately needed SOMETHING. I don’t know if I tried to make it wait for proper state or what; it was a stub to try to get things moving. That call, to this day, along with all our other checks, until some test ones recently, is garbage.

If I downloaded a database, and a lot of the time, after the create-a-database call returned, my database was not ready, I’d say wow. Terrible bug got through. If it was a persistent issue for over half a decade? My god.

Look I just spent that half decade upgrading from Solr 4 to whatever. I was mostly out of the loop. But this is crazy, me in there too.

Mark


Re: SolrCloud is sick.

Mark Miller-3
And now we are at the meat of it. Fill in here: https://issues.apache.org/jira/browse/SOLR-13888

We can play in a better world, we can have fun, but some of you are going to have to adjust your ways. In the most convenient way possible. You are all great people, I don't want to cause you annoyance, but there are certain requirements to building an aircraft, and there are certain requirements to building this.

On Tue, Nov 5, 2019 at 10:44 AM Mark Miller <[hidden email]> wrote:
If you had any idea how much suffering just that has caused. Not just users, but us. 

Mark 

On Tue, Nov 5, 2019 at 10:38 AM Mark Miller <[hidden email]> wrote:
It’s like 6-7 years since I quickly added a shitty collections API in my free time because we desperately needed SOMETHING. I don’t know if I tried to make it wait for proper state or what; it was a stub to try to get things moving. That call, to this day, along with all our other checks, until some test ones recently, is garbage.

If I downloaded a database, and a lot of the time, after the create-database call returned, my database was not ready, I’d say wow. Terrible bug got through. If it was a persistent issue for over half a decade? My god.
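
To make the complaint above concrete: when a create call can return before the thing it created is usable, callers end up polling for readiness themselves. The following is a minimal sketch of that caller-side workaround in plain Java. It is not Solr or SolrJ code; `ReadinessWaiter`, `awaitReady`, and the readiness check are hypothetical names used only for illustration.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a readiness check until it passes or a deadline expires. */
final class ReadinessWaiter {
    private ReadinessWaiter() {}

    /**
     * Returns true as soon as isReady reports true; returns false if the
     * timeout elapses (or the thread is interrupted) first.
     */
    static boolean awaitReady(BooleanSupplier isReady, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (isReady.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for "is the newly created collection usable yet?" (assumed check).
        AtomicInteger checks = new AtomicInteger();
        boolean ready = awaitReady(() -> checks.incrementAndGet() >= 3,
                Duration.ofSeconds(5), Duration.ofMillis(10));
        System.out.println("ready=" + ready + " after " + checks.get() + " checks");
    }
}
```

Every caller needing a loop like this is the symptom being described; a create call that only returns once the state is actually ready would make it unnecessary.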

Look, I just spent that half decade upgrading from Solr 4 to whatever. I was mostly out of the loop. But this is crazy, and I'm in there too.

Mark

On Tue, Nov 5, 2019 at 10:05 AM Mark Miller <[hidden email]> wrote:
I'll tell you what, guys, development right now sucks. I don't enjoy it.

But when I start to put things in shape? I get this smile, and I start going with the feeling of I don't need you guys, I don't need users, I don't need a job, cause just this is friggin nice.

On Tue, Nov 5, 2019 at 9:59 AM Mark Miller <[hidden email]> wrote:
I suppose I should toss one more out.

Hell yes, we will be using Curator.

It's insane for any group larger than 2-3 to directly use ZooKeeper. Even for that group, you want some damn good reasons to not use Curator. We can start using more assembly too (joke, Yonik).

Curator was an option initially. Then it was yet another project hosted by Netflix. Now it is essential.


- Mark

On Tue, Nov 5, 2019 at 9:41 AM Mark Miller <[hidden email]> wrote:
And look, we started pretty deep in the hole. Solr started with tons of bugs or limitations that hardly mattered to it and hit SolrCloud in the eye like a train. And we were not set up to deal with that.

We never had a nice garden for SolrCloud. We started in a mess, thinking, eventually we clear the overgrowth, and we are all good. And then we started building our house and that garden went wild with a life of its own.

And our development practices, amazingly above many many many groups and standards out there, are woefully inadequate for what we are doing.

"Tests pass, I'm not sure about all this but I'm going to commit" (Tests never pass, must be a lie anyway)
"Leaving on vacation, going to fire this in"
"No one has looked at this huge thing, it's been a while, going to commit"
*commit*

And comments to that effect pretty much wrap up our careful and thoughtful attitude.

And then of course we come and clean up after, careful gardeners that we are ... no, we don't. We are not set up to be gardeners, we are not trying, even if we do, I only like grass and screw the other plants.

Without SolrCloud, Solr would be in trouble as well. Brute that it is, it could go a few more rounds. SolrCloud is a ballerina. Doesn't look it, cause we don't take care of it. But it is, and it cannot take the beating that the brute does.

- Mark

On Tue, Nov 5, 2019 at 5:19 AM Mark Miller <[hidden email]> wrote:
Basically I can fix 99% of this without you guys - with simple care and effort and time that none of you are likely in the circumstances of being able to duplicate. Been there done that, made it 100x-1000x faster to boot and added all kinds of fun.

But I can't build the rest of Solr. I don't care about facets. So let's meet halfway.

On Tue, Nov 5, 2019 at 5:14 AM Mark Miller <[hidden email]> wrote:
There are 10,000 problems here.

So if you eventually land on one possible solution you agree on, we are a little closer.

There is no problem with the current design. Designs can always be improved, sure. I've made this one fast. You won't believe me fast. The low hanging fruit is astronomical, there is more fruit above that.

We never focused on performance. Or at least didn't. That's after we harden.

Except performance is the key to everything.

SolrCloud is not the only problem. The design of Solr, of SolrCloud, they are fine. Change them, I don't care. Later. They are not a problem.

But Solr has as many problems as SolrCloud at this point. They just matter a whole hell of a lot less unless they are messing with SolrCloud. Standalone is more of a brute.

We have 60 modules that are interconnected. We have a huge code base. That is also fine.

We don't tend our garden. That's not fine. I've tended the garden before without one - more than once before. It's a great damn garden. You guys only get to see it grown over and full of weeds.

Anyway, no redesign, no library, no nothing like that is gonna save this.

This is hardly concrete awareness of a problem here. The awareness to figure out what actually are the problems and what must be done - that's expensive shit these days if you ask me. I've been wrong lots, though.

On Mon, Nov 4, 2019 at 2:26 PM Jörn Franke <[hidden email]> wrote:
I guess this is also a bit normal with software that grows over the years.
One could also write down the current use cases and interesting future use cases for Solr in a document and design anew from scratch, taking only the good pieces out of the existing software.
Of course there is a certain amount of time where you need to maintain both - but this will also be the case for a major rewrite.

> On 04.11.2019 at 20:58, Erick Erickson <[hidden email]> wrote:
>
> If Curator would make that easier and we’re doing major surgery anyway….
>
> But yeah, a nifty, new, more modern tool isn’t going to magically help if the design is flawed.
>
> Or, if I’m putting my philosophical hat on, code doesn’t get gnarly intentionally. It gets gnarly because there are a bunch of problems to be solved and you don’t know what they are until you run into them. And it’s always a tension between fixing it enough to get by and fixing it by refactoring/redesign.
>
> But eventually “fixing it enough to get by” totters under its own weight and becomes increasingly fragile and you must take the hit and redo major portions of it. The questions now are:
> 1> are we at that point?
> 2> are we going to put the effort into rewriting some of the worst offenders?

Re: SolrCloud is sick.

Mark Miller-3
bq.   SolrCloud is a ballerina. Doesn't look it, cause we don't take care of it.

And this is why I'm so devastated by the Overseer. I don't blame any one person. Where was the manual, where was my intervention? I whispered Overseer and cut one more thing off my list of responsibilities.

But the Overseer is supposed to be so lightweight and easy breezy. Giving up leadership at the first signs of trouble, keeping communication small and tight with tiny JSON distrib queue pub/sub updates. Little about state change, hardly needed. Hardly ever talking to ZooKeeper.

Our whole system has now moved hard against this, but nothing so much as the Overseer. It has very scary, very tricky, custom ZK code. It has major communication with ZK. It has little to no ability to properly throttle itself or deal with things intelligently. It's almost a brute force tactic. And it clings to being Overseer like a moth to a flame. It's designed to be on dedicated hardware and then mostly to not make any reasonable use of that hardware.
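
As a rough illustration of the "small and tight" intent described above (tiny queued updates, bounded work per cycle, few writes), here is a minimal plain-Java sketch. It is not the Overseer's actual code and uses no ZooKeeper API; `StateUpdateBatcher`, its methods, and the replica and state names are hypothetical, and the single "write" is only simulated with a counter.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical sketch only; not Solr's Overseer. */
final class StateUpdateBatcher {
    /** A tiny state-change message, e.g. replica "core1" is now "active". */
    private static final class Update {
        final String replica;
        final String state;

        Update(String replica, String state) {
            this.replica = replica;
            this.state = state;
        }
    }

    private final Queue<Update> pending = new ArrayDeque<>();
    private final Map<String, String> publishedState = new LinkedHashMap<>();
    private int writes; // simulated writes to the coordination service

    /** Enqueue a small update; nothing is written yet. */
    void offer(String replica, String state) {
        pending.add(new Update(replica, state));
    }

    /**
     * Drains at most maxBatch updates (a crude throttle), keeps only the latest
     * state per replica, and counts one write for the whole batch.
     * Returns the number of updates drained.
     */
    int processBatch(int maxBatch) {
        Map<String, String> latest = new LinkedHashMap<>();
        int drained = 0;
        Update u;
        while (drained < maxBatch && (u = pending.poll()) != null) {
            latest.put(u.replica, u.state); // a later update for the same replica wins
            drained++;
        }
        if (drained > 0) {
            publishedState.putAll(latest); // stands in for a single write
            writes++;
        }
        return drained;
    }

    String stateOf(String replica) {
        return publishedState.get(replica);
    }

    int writes() {
        return writes;
    }

    public static void main(String[] args) {
        StateUpdateBatcher batcher = new StateUpdateBatcher();
        batcher.offer("core1", "recovering");
        batcher.offer("core2", "active");
        batcher.offer("core1", "active");
        int drained = batcher.processBatch(100);
        System.out.println(drained + " updates, " + batcher.writes()
                + " write(s), core1=" + batcher.stateOf("core1"));
    }
}
```

The `maxBatch` cap is one simple way to bound the work done per cycle; whether batching like this matches what the Overseer should do is an assumption of the sketch, not something the thread states.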

I blame me more than anyone for that. I am mad at me. It's just an absolute brain bash with a sledge hammer to the system. And I never communicated the system very well. I was overloaded.

On Tue, Nov 5, 2019 at 11:01 AM Mark Miller <[hidden email]> wrote:
And now we are at the meat of it. Fill in here: https://issues.apache.org/jira/browse/SOLR-13888

We can play in a better world, we can have fun, but some of you are going to have to adjust your ways. In the most convenient way possible. You are all great people, I don't want to cause you annoyance, but there are certain requirements to building an aircraft, and there are certain requirements to building this.


Re: SolrCloud is sick.

Mark Miller-3
Bottom line, garden is a fricken mess and there are tons of attempts to solve shit the wrong way cause nobody understands what was intended here. At least one of those is on me.

Mark

On Tue, Nov 5, 2019 at 10:16 PM Mark Miller <[hidden email]> wrote:
bq.   SolrCloud is a ballerina. Doesn't look it, cause we don't take care of it.

And this is why I'm so devastated by the Overseer. I don't blame any one person. Where was the manual, where was my intervention? I whispered Overseer and cut one more thing off my list of responsibilities.

But the Overseer is supposed to be so lightweight and easy breezy. Giving up leadership at the first signs of trouble, keeping communication small and tight with tiny JSON distrib queue pub/sub updates. Little about state change, hardly needed. Hardly ever talking to ZooKeeper.

Our whole system has now moved hard against this, but nothing so much as the Overseer. It has very scary, very tricky, custom ZK code. It has major communication with ZK. It has little to no ability to properly throttle itself or deal with things intelligently. It's almost a brute force tactic. And it clings to being Overseer like a moth to a flame. It's designed to be on dedicated hardware and then mostly to not make any reasonable use of that hardware.

I blame me more than anyone for that. I am mad at me. It's just an absolute brain bash with a sledge hammer to the system. And I never communicated the system very well. I was overloaded.


Re: SolrCloud is sick.

Scott Blum
In reply to this post by Mark Miller-3
WHO OVERSEES THE OVERSEER????

On Tue, Nov 5, 2019 at 5:16 PM Mark Miller <[hidden email]> wrote:
bq.   SolrCloud is a ballerina. Doesn't look it, cause we don't take care of it.

And this is why I'm so devastated by the Overseer. I don't blame any one person. Where was the manual, where was my intervention? I whispered Overseer and cut one more thing off my list of responsibilities.

But the Overseer is supposed to be so lightweight and easy breezy. Giving up leadership at the first signs of trouble, keeping communication small and tight with tiny JSON distrib queue pub/sub updates. Little about state change, hardly needed. Hardly ever talking to ZooKeeper.

Our whole system has now moved hard against this, but nothing so much as the Overseer. It has very scary, very tricky, custom ZK code. It has major communication with ZK. It has little to no ability to properly throttle itself or deal with things intelligently. It's almost a brute force tactic. And it clings to being Overseer like a moth to a flame. It's designed to be on dedicated hardware and then mostly to not make any reasonable use of that hardware.

I blame me more than anyone for that. I am mad at me. It's just an absolute brain bash with a sledge hammer to the system. And I never communicated the system very well. I was overloaded.

On Tue, Nov 5, 2019 at 11:01 AM Mark Miller <[hidden email]> wrote:
And now we are to meat of it. Fill in here: https://issues.apache.org/jira/browse/SOLR-13888

We can play in a better world, we can have fun, but some of you are going to have to adjust your ways. In the most convenient way possible. You are all great people, I don't want to cause you annoyance, but there are certain requirements to building an aircraft, and there certain requirements to building this.

On Tue, Nov 5, 2019 at 10:44 AM Mark Miller <[hidden email]> wrote:
If you had any idea how much suffering just that has caused. Not just users, but us. 

Mark 

On Tue, Nov 5, 2019 at 10:38 AM Mark Miller <[hidden email]> wrote:
It’s like 6-7 years since I quickly added a shitty collections API in my free time because we desperately needed SOMETHING. I don’t know if I tried to make it wait for proper state or what , it was a stub to try get things moving. That call, to this day, along with all our other checks, until some tests ones recently, is garbage.

If I downloaded a database, and a lot the time, after the create a database call returned, my database was not ready, I’d saw wow. Terrible bug got through. If it was a persistent issue for over half a decade? My god. 

Look I just spent that half decade upgrading from Solr 4 to whatever. I was mostly out of the loop. But this is crazy, me in there too.

Mark

On Tue, Nov 5, 2019 at 10:05 AM Mark Miller <[hidden email]> wrote:
I'll tell you what guys, development right now sucks. I don't enjoy.

But when I start to put things in shape? I get this smile, and I start going with the feeling of I don't need you guys, I don't users, I don't need a job, cause just this is figgen nice.

On Tue, Nov 5, 2019 at 9:59 AM Mark Miller <[hidden email]> wrote:
I suppose I should toss one more out.

Hell yes, we will be using curator.

It's insane for any group larger than 2-3 to directly use ZooKeeper. Even for that group, you want some damn good reasons to not use curator. We can start using more assembly too (joke Yonik).

Curator was an option initially. Then it was yet another project hosted by Netflix. Now it is essential.


- Mark

On Tue, Nov 5, 2019 at 9:41 AM Mark Miller <[hidden email]> wrote:
And look, we started pretty deep in the hole. Solr started with tons of bugs or limitations that hardly mattered to it and hit SolrCloud in the eye like a train. And we were not set up to deal with that.

We never had a nice garden for SolrCloud. We started in a mess, thinking, eventually we clear the overgrowth, and we are all good. And then we started building our house and that garden went wild with a life of its own.

And our development practices, amazingly above many many many groups and standards out there, are woefully inadequate for what we are doing.

"Tests pass, I'm not sure about all this but I'm going to commit" (Tests never pass, must be a lie anyway)
"Leaving on vacation, going to fire this in"
"No one has looked at this huge thing, it's been a while, going to commit"
*commit*

And comments to that effect pretty much wrap up our careful and thoughtful attitude.

And then of course we come and clean up after, careful gardeners that we are ... no, we don't. We are not set up to be gardeners, we are not trying, even if we do, I only like grass and screw the other plants.

Without SolrCloud, Solr would be in trouble as well. Brute that it is, it could go a few more rounds. SolrCloud is a ballerina. Doesn't look it, cause we don't take care of it. But it is, and it cannot take the beating that the brute does.

- Mark

On Tue, Nov 5, 2019 at 5:19 AM Mark Miller <[hidden email]> wrote:
Basically I can fix 99% of this without you guys - with simple care and effort and time that none of you are likely in the circumstances of being able to duplicate. Been there done that, made it 100x-1000x faster to boot and added all kinds of fun.

But I can't build the rest of Solr. I don't care about facets. So let's meet half way.

On Tue, Nov 5, 2019 at 5:14 AM Mark Miller <[hidden email]> wrote:
There are 10,000 problems here.

So if you eventually land on one possible solution you agree on, we are a little closer.

There is no problem with the current design. Designs can always be improved, sure. I've made this one fast. You won't believe me fast. The low hanging fruit is astronomical, there is more fruit above that.

We never focused on performance. Or at least didn't. That's after we harden.

Except performance is the key to everything.

SolrCloud is not the only problem. The design of Solr, of SolrCloud, they are fine. Change them, I don't care. Later. They are not a problem.

But Solr has as many problems as SolrCloud at this point. They just matter a whole hell of a lot less unless they are messing with SolrCloud. Standalone is more of a brute.

We have 60 modules that are interconnected. We have a huge code base. That is also fine.

We don't tend our garden. That's not fine. I've tended the garden before without one - more than once before. It's a great damn garden. You guys only get to see it grown over and full of weeds.

Anyway, no redesign, no library, no nothing like that gonna save this.

This is hardly concrete awareness of a problem here. The awareness to figure out what actually are the problems and what must be done - that's expensive shit these days if you ask me. I've been wrong lots though.






On Mon, Nov 4, 2019 at 2:26 PM Jörn Franke <[hidden email]> wrote:
I guess this is also a bit normal with software that grows over the years.
One could also write down the current use cases and interesting future use cases for Solr in a document and design anew from scratch, taking only the good pieces from the existing software.
Of course there is a certain amount of time where you need to maintain both - but this will also be the case for a major rewrite.

> On 04.11.2019 at 20:58, Erick Erickson <[hidden email]> wrote:
>
> If Curator would make that easier and we’re doing major surgery anyway….
>
> But yeah, a nifty, new, more modern tool isn’t going to magically help if the design is flawed.
>
> Or, if I’m putting my philosophical hat on, code doesn’t get gnarly intentionally. It gets gnarly because there are a bunch of problems to be solved and you don’t know what they are until you run into them. And it’s always a tension between fixing it enough to get by and fixing it by refactoring/redesign.
>
> But eventually “fixing it enough to get by” totters under its own weight and becomes increasingly fragile and you must take the hit and redo major portions of it. The questions now are:
> 1> are we at that point?
> 2> are we going to put the effort into rewriting some of the worst offenders?
>
>
>
>> On Nov 4, 2019, at 1:28 PM, Scott Blum <[hidden email]> wrote:
>>
>> Figuring out a better overall algorithmic & data structure design that's an order of magnitude improvement seems far more important than swapping out libraries.  And I say this as a Curator fan and committer. ;)
>>
>> On Mon, Nov 4, 2019 at 11:44 AM Erick Erickson <[hidden email]> wrote:
>> Bram:
>>
>> Using Curator has been proposed before. It would require significant refactoring b/c of how deeply entwined raw ZK is in the code. That said, if we’re going to do major surgery it may be the right time to consider it.
>>
>> Erick
>>
>>>> On Nov 4, 2019, at 9:24 AM, Bram Van Dam <[hidden email]> wrote:
>>>
>>>> SolrCloud is sick right now. The way low level Zookeeper is handled
>>>
>>> On an unrelated project, I've stopped using "raw" ZK client access and
>>> have switched to Curator. The API is a fair bit easier to work with, and
>>> it results in less ugly code. I realize that this won't go very far in
>>> resolving more fundamental issues, but it might be something that can
>>> help improve the shape of the code.
>>>
>>> - Bram
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [hidden email]
>>> For additional commands, e-mail: [hidden email]

Re: SolrCloud is sick.

Mark Miller-3
Another Overseer :)

I don't mean to contradict your curator statement either - I talk with the authority of god but with the confidence of no one ;)

On Tue, Nov 5, 2019 at 7:44 PM Scott Blum <[hidden email]> wrote:
WHO OVERSEES THE OVERSEER????

On Tue, Nov 5, 2019 at 5:16 PM Mark Miller <[hidden email]> wrote:
bq.   SolrCloud is a ballerina. Doesn't look it, cause we dont take care of it.

And this is why I'm so devastated by the Overseer. I don't blame any one person. Where was the manual, where was my intervention? I whispered Overseer and cut one more thing off my list of responsibilities.

But the overseer is supposed to be so lightweight and easy breezy. Giving up leadership at the first signs of trouble, keeping communication small and tight with tiny json distrib queue pub/sub updates. Little about stat change, hardly needed. Hardly ever talking to Zookeeper.

Our whole system is not moved hard against this, but nothing so much as the Overseer. It has very scary, very tricky, custom ZK code. It has major communication with ZK. It has little to weak ability to properly throttle itself or deal with things intelligently. It's almost a brute force tactic. And it clings to being Overseer like a moth to a flame. It's designed to be on dedicated hardware and then mostly to not make any reasonable use of that hardware.

I blame me more than anyone for that. I am mad at me. It's just an absolute brain bash with a sledge hammer to the system. And I never communicated the system very well. I was overloaded.

On Tue, Nov 5, 2019 at 11:01 AM Mark Miller <[hidden email]> wrote:
And now we are to the meat of it. Fill in here: https://issues.apache.org/jira/browse/SOLR-13888

We can play in a better world, we can have fun, but some of you are going to have to adjust your ways. In the most convenient way possible. You are all great people, I don't want to cause you annoyance, but there are certain requirements to building an aircraft, and there are certain requirements to building this.


Re: SolrCloud is sick.

Mark Miller-3
Sorry you guys got to play therapist for a bit. 10 years to get over, most of it buried. Had to let that beast out.

On Wed, Nov 6, 2019 at 12:33 AM Mark Miller <[hidden email]> wrote:
Another Overseer :)

I don't mean to contradict your curator statement either - I talk with the authority of god but with the confidence of no one ;)
