Tika coding style (Was: [jira] Commented: (TIKA-6) Port Nutch (or better) MimeType detection system into Tika)

Jukka Zitting
Hi,

On 9/21/07, Chris A. Mattmann (JIRA) <[hidden email]> wrote:
> As an FYI: do we want to adopt the Sun convention across the board here?
> What do others think?

I'd prefer Sun conventions (with spaces instead of tabs for indentation).

BR,

Jukka Zitting

Re: Tika coding style (Was: [jira] Commented: (TIKA-6) Port Nutch (or better) MimeType detection system into Tika)

Keith R. Bennett
Jukka Zitting wrote:
> Hi,
>
> On 9/21/07, Chris A. Mattmann (JIRA) <jira@apache.org> wrote:
>> As an FYI: do we want to adopt the Sun convention across the board here?
>> What do others think?
>
> I'd prefer Sun conventions (with spaces instead of tabs for indentation).
>
> BR,
>
> Jukka Zitting

All -

Here are my opinions, for what they're worth...

BTW, the Sun coding conventions can be found at http://java.sun.com/docs/codeconv/CodeConventions.pdf.

I think that on the whole, the Sun conventions are excellent.


--- Exceptions ---

1) (Already mentioned by Jukka) Contrary to Sun's recommendation, tabs should not be used for indentation.  I wholeheartedly agree.  My suggestion would be to ban tabs altogether, except in string literals, where they would be represented as "\t".

2) I believe we need to make an exception to Section 3.1.1, which says that the author's name should be included.  (The example does not include author names, but the descriptive text does.)  I believe we've decided not to include author names.
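
To illustrate point 1, here is a minimal sketch of how a literal tab character in a source line differs from the two-character "\t" escape inside a string literal. The class TabCheck and the method firstTabIndex are hypothetical names invented for this example; they are not part of Tika or of the Sun conventions.

```java
class TabCheck {
    // Hypothetical helper: index of the first literal tab character
    // in a line of source text, or -1 if the line contains none.
    static int firstTabIndex(String sourceLine) {
        return sourceLine.indexOf('\t');
    }

    public static void main(String[] args) {
        // A source line indented with a real tab character: would be rejected.
        String indentedWithTab = "\tint x = 0;";
        // A source line indented with spaces whose string literal contains
        // the backslash-t escape (two characters): would be accepted.
        String escapeInLiteral = "    String sep = \"\\t\";";

        System.out.println(firstTabIndex(indentedWithTab));
        System.out.println(firstTabIndex(escapeInLiteral));
    }
}
```

A check along these lines could run over each line of a source file, since under the proposed rule no line would ever contain a raw tab.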


--- Clarify? ---

3) Section 4.2 Wrapping lines:  Breaking before an operator includes the dot operator, right?  As in:

   .......................................foo(bar
       .someReallyReallyReallyReallyLongMethodName())
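
A compilable sketch of the same idea, assuming the answer to the question is yes: the continuation line begins with the dot operator and is indented 8 spaces, following the Sun wrapping examples. The class and method names are hypothetical and exist only to make the line long enough to wrap.

```java
import java.util.Locale;

class LineWrapExample {
    // Hypothetical helper with a deliberately long name so that calls to it
    // need to be wrapped.
    static String normalizeToLowerCaseAndTrimWhitespace(String rawValue) {
        // Each continuation line starts with the dot operator.
        return rawValue
                .trim()
                .toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // The break comes before the dot, not after it.
        String result = LineWrapExample
                .normalizeToLowerCaseAndTrimWhitespace("  Text/Plain  ");
        System.out.println(result);
    }
}
```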



--- I Will Conform If Necessary, But Here's My Peeve ---

I realize it would not be a good idea to spend time debating minor points, but I figured I'd mention these just in case there is a consensus to allow these exceptions.

4) 6.2 includes "Don’t wait to declare variables until their first use".  I believe that the code is clearer when one *does* wait until their first use because doing so:

a) eliminates the need for an additional unnecessarily verbose line and possibly initialization.

b) communicates more information to the reader, namely that the variable is not used until the point at which it is declared/initialized.
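
The difference in point 4 can be sketched as follows. Both methods compute the same result; the first follows Section 6.2 and declares everything at the start of the block, the second waits until first use. The class and method names are hypothetical, chosen only for this illustration.

```java
import java.util.Arrays;
import java.util.List;

class DeclarationPlacementExample {
    // Sun Section 6.2 style: all declarations at the start of the block.
    static int totalLengthDeclaredAtTop(List<String> names) {
        int total = 0;
        int length; // declared here, but not used until inside the loop
        for (String name : names) {
            length = name.length();
            total += length;
        }
        return total;
    }

    // Alternative argued for above: declare at the point of first use.
    static int totalLengthDeclaredAtFirstUse(List<String> names) {
        int total = 0;
        for (String name : names) {
            int length = name.length(); // scope is limited to the loop body
            total += length;
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("tika", "nutch");
        System.out.println(totalLengthDeclaredAtTop(names));
        System.out.println(totalLengthDeclaredAtFirstUse(names));
    }
}
```

In the second version the declaration line disappears and the variable cannot be referenced outside the loop, which is what points a) and b) describe.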


5) Section 8.1 says that methods should be separated by only 1 blank line:  Vertical space is one of the tools we have to communicate.  The amount of vertical space can be used to communicate the degree of connection or separateness.  Single blank lines are appropriate within methods to separate sections (even better to have smaller methods that don't need this).  I use two blank lines between methods to more clearly indicate to the reader the boundary between methods.

Regards,
Keith

Re: Tika coding style (Was: [jira] Commented: (TIKA-6) Port Nutch (or better) MimeType detection system into Tika)

Sami Siren-2
In reply to this post by Jukka Zitting
Jukka Zitting wrote:
> Hi,
>
> On 9/21/07, Chris A. Mattmann (JIRA) <[hidden email]> wrote:
>> As an FYI: do we want to adopt the Sun convention across the board here?
>> What do others think?
>
> I'd prefer Sun conventions (with spaces instead of tabs for indentation).

Currently there are differences in indentation 2 vs 4 spaces. Which one
should we use?

--
 Sami Siren

Re: Tika coding style (Was: [jira] Commented: (TIKA-6) Port Nutch (or better) MimeType detection system into Tika)

Keith R. Bennett
Sami -

Although you directed the question to Jukka, I'll express my opinion too (and, from what he said, I suspect he will agree with me).

In my opinion, 4 characters express the logical hierarchy much more clearly than 2 characters, so 4 should be used.

FWIW, the Sun conventions are pretty clear that each indentation level should be visually 4 characters wide, although they permit tab characters, which wreak havoc on readability in many editor configurations.

- Keith


Sami Siren-2 wrote:
> Jukka Zitting wrote:
>> Hi,
>>
>> On 9/21/07, Chris A. Mattmann (JIRA) <jira@apache.org> wrote:
>>> As an FYI: do we want to adopt the Sun convention across the board here?
>>> What do others think?
>>
>> I'd prefer Sun conventions (with spaces instead of tabs for indentation).
>
> Currently there are differences in indentation 2 vs 4 spaces. Which one
> should we use?
>
> --
>  Sami Siren

Re: Tika coding style (Was: [jira] Commented: (TIKA-6) Port Nutch (or better) MimeType detection system into Tika)

Doug Cutting
In reply to this post by Sami Siren-2
Sami Siren wrote:
> Currently there are differences in indentation 2 vs 4 spaces. Which one
> should we use?

I prefer 2-space indentation, since it makes it much easier to fit code
into 80 columns.

Doug

Re: Tika coding style (Was: [jira] Commented: (TIKA-6) Port Nutch (or better) MimeType detection system into Tika)

Jukka Zitting
Hi,

On 10/1/07, Doug Cutting <[hidden email]> wrote:
> Sami Siren wrote:
> > Currently there are differences in indentation 2 vs 4 spaces. Which one
> > should we use?
>
> I prefer 2-space indentation, since it makes it much easier to fit code
> into 80 columns.

I prefer wider indentations since they make it very clear when you
really should start breaking your code into smaller methods.

Also, 4 spaces is what the Sun Java conventions explicitly mention as
the unit of indentation.
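
As a rough sketch of the trade-off, the arithmetic below shows how many of the 80 columns remain for code at a given nesting depth under each indentation width. The class IndentBudget and its method are hypothetical names used only for this illustration.

```java
class IndentBudget {
    // Line width limit mentioned in the thread.
    static final int LINE_WIDTH = 80;

    // Columns left for code after indenting nestingDepth levels.
    static int remainingColumns(int indentWidth, int nestingDepth) {
        return LINE_WIDTH - indentWidth * nestingDepth;
    }

    public static void main(String[] args) {
        for (int depth = 1; depth <= 5; depth++) {
            System.out.println("depth " + depth
                    + ": 2-space leaves " + remainingColumns(2, depth)
                    + ", 4-space leaves " + remainingColumns(4, depth));
        }
    }
}
```

At five levels of nesting, 2-space indentation leaves 70 columns and 4-space leaves 60, so the wider indent signals sooner that a method is nested deeply enough to be worth splitting.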

BR,

Jukka Zitting

Re: Tika coding style (Was: [jira] Commented: (TIKA-6) Port Nutch (or better) MimeType detection system into Tika)

chrismattmann
Folks,

I'm fine with either 2 space or 4 space indentations. I'm -1 for any tab
indentations though, and I think we all agree on that.

Cheers,
  Chris



On 10/5/07 5:03 AM, "Jukka Zitting" <[hidden email]> wrote:

> Hi,
>
> On 10/1/07, Doug Cutting <[hidden email]> wrote:
>> Sami Siren wrote:
>>> Currently there are differences in indentation 2 vs 4 spaces. Which one
>>> should we use?
>>
>> I prefer 2-space indentation, since it makes it much easier to fit code
>> into 80 columns.
>
> I prefer wider indentations since they make it very clear when you
> really should start breaking your code into smaller methods.
>
> Also, 4 spaces is what the Sun Java conventions explicitly mention as
> the unit of indentation.
>
> BR,
>
> Jukka Zitting

______________________________________________
Chris Mattmann, Ph.D.
[hidden email]
Cognizant Development Engineer
Early Detection Research Network Project

_________________________________________________
Jet Propulsion Laboratory            Pasadena, CA
Office: 171-266B                     Mailstop:  171-246
_______________________________________________________

Disclaimer:  The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.