Re: Missing some trunk commit history

Re: Missing some trunk commit history

Sunil G
Hi Eric.

A branch merge has happened during that time, and hence you might have seen
some old commits from that branch. If you go down further, you could see
those commits.

Copied from my git log:

commit 40b0045ebe0752cd3d1d09be00acbabdea983799
Author: Weiwei Yang <[hidden email]>
Date:   Wed Dec 6 17:52:41 2017 +0800

    YARN-7610. Extend Distributed Shell to support launching job with opportunistic containers. Contributed by Weiwei Yang.

commit 56b1ff80dd9fbcde8d21a604eff0babb3a16418f
Author: Xiao Chen <[hidden email]>
Date:   Tue Dec 5 20:48:02 2017 -0800

    HDFS-12872. EC Checksum broken when BlockAccessToken is enabled.

commit 05c347fe51c01494ed8110f8f116a01c90205f13
Author: Weiwei Yang <[hidden email]>
Date:   Wed Dec 6 12:21:52 2017 +0800

    YARN-7611. Node manager web UI should display container type in containers page. Contributed by Weiwei Yang.

commit 73b86979d661f4ad56fcfc3a05a403dfcb2a860e
Author: Kai Zheng <[hidden email]>
Date:   Wed Dec 6 12:01:36 2017 +0800

    HADOOP-15039. Move SemaphoredDelegatingExecutor to hadoop-common. Contributed by Genmao Yu

commit 44b06d34a537f8b558007cc92a5d1a8e59b5d86b
Author: Akira Ajisaka <[hidden email]>
Date:   Wed Dec 6 11:40:33 2017 +0900

    HDFS-12889. Router UI is missing robots.txt file. Contributed by Bharat Viswanadham.

commit 0311cf05358cd75388f48f048c44fba52ec90f00
Author: Wangda Tan <[hidden email]>
Date:   Tue Dec 5 13:09:49 2017 -0800

    YARN-7381. Enable the configuration: yarn.nodemanager.log-container-debug-info.enabled by default in yarn-default.xml. (Xuan Gong via wangda)

    Change-Id: I1ed58dafad5cc276eea5c0b0813cf04f57d73a87

commit 6555af81a26b0b72ec3bee7034e01f5bd84b1564
Author: Aaron Fabbri <[hidden email]>
Date:   Tue Dec 5 11:06:32 2017 -0800

    HADOOP-14475 Metrics of S3A don't print out when enabled. Contributed by Younger and Sean Mackrory.



- Sunil
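
A note for readers comparing dates in a listing like the one above: the `Date:` line that plain `git log` prints is the author date. A rebase normally keeps the author date and resets only the committer date, which may be part of why rebased branch commits look out of place in the timeline. The following is a minimal sketch under assumptions, not something taken from the Hadoop repository: the throwaway repository, file name, identity and both dates are hypothetical.

```shell
#!/bin/sh
# Sketch: one commit whose author date and committer date differ,
# as happens when an older commit is rebased shortly before a merge.
# The repository, file name, identity and dates are all hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Committer"
git config user.email "committer@example.invalid"

echo change > change.txt
git add change.txt
# Author date: when the change was originally written.
# Committer date: when this commit object was (re)created.
GIT_AUTHOR_DATE="2017-11-30 12:00:00 +0000" \
GIT_COMMITTER_DATE="2017-12-14 12:00:00 +0000" \
git commit -q -m "Example change"

# Plain `git log` prints the author date on its "Date:" line.
git log -1

# --pretty=fuller prints AuthorDate and CommitDate separately.
git log -1 --pretty=fuller

# Compact form: author date, committer date, subject.
# Prints: 2017-11-30 2017-12-14 Example change
git log -1 --date=short --format='%ad %cd %s'
```

Running `git log --pretty=fuller` on a real clone shows the same two fields for every commit, so the two timelines can be compared directly.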


On Fri, Dec 15, 2017 at 12:29 AM Eric Yang <[hidden email]> wrote:

> Hi all,
>
> While troubleshooting a trunk build failure, I noticed that the commit history
> for trunk between Nov 30th and Dec 6th has been squashed or has disappeared for
> no apparent reason.  This seems to have taken place in the last 24 hours.  I can
> see the commit logs in the GitHub UI.  When doing a new clone from Apache Git and
> GitHub, the commit history between those dates is gone.  I usually
> maintain two git repositories, one for testing and one for development.
> Both repositories were synced with GitHub frequently, but only the test
> repository was updated today, and the missing history shows up only in the test
> repository.  This is why I have the impression that this might
> have happened in the last 24 hours.  I did some spot checks to see whether the
> missing commits are in trunk.  The code seems to be in place, and only the
> commit history is gone.
>
> Is there any way to fix the commit history?  Hopefully this is not a git
> bug, but some peer review might find the root cause, which could help us
> understand the damage.  Thank you
>
> Regards,
> Eric
>
>

Re: Missing some trunk commit history

Eric Yang-3
Hi Chris,

I am looking for a way to reduce the time spent testing the latest commits.
When trunk breaks, it is usually because someone did not run a full build to
test a patch.  If a feature merge adds 100+ commits, it is difficult to tell
where the breakage occurred.  A single merge point makes it easier for people
who did not do the merge to isolate the problem.  If I do a git pull and the
last hash I was tracking is 5 pages down in git log, it can cause panic, as it
did for me this morning, and it is exhausting to find the commit that actually
broke trunk by testing the 100+ commits individually with a full build.  The
people who did the feature merge have likely already run a full build to make
sure they didn't break trunk, but there is no easy indicator of where the
rebased commits start and end.  Therefore, other people have to spend extra
time testing each commit individually.  It costs me productivity to prove that
a unit test failure in my pre-commit patch was caused by someone else's
check-in.  I lost an entire day isolating a trunk build breakage in the node
manager that was caused by YARN-7381, and I only found it by relying on the way
GitHub sorts commits by date rather than the order in which git log shows the
commit history.  If I had been testing the commits one by one based on git log,
I would probably still be testing.  If we can propose using merge without
rebase for trunk, it might make analyzing bugs for pre-commit builds more
efficient.

regards,
Eric

On Thu, Dec 14, 2017 at 6:52 PM, Chris Douglas <[hidden email]> wrote:

> Eric-
>
> What problem are you trying to solve? Most of us understand how git works,
> you can omit that. -C
>
> On Thu, Dec 14, 2017 at 6:31 PM Eric Yang <[hidden email]> wrote:
>
> > We are currently requesting committers to commit code based on:
> > https://wiki.apache.org/hadoop/HowToCommit
> >
> > To set branch.autosetuprebase always:
> >
> > Based on the current preference, the history is linear, and it is
> > described in this graph as Rebase and Merge:
> >
> > https://wac-cdn.atlassian.com/dam/jcr:df39b1f1-2686-4ee5-90bf-9836783342ce/10.svg?cdnVersion=iq
> >
> > It could cause a false alarm that blames the wrong person for trunk
> > breakage, because it takes more time to iterate through all the commits
> > from the feature branch, while the recent commits (blue dots) are much
> > further back in history because of the rebase.  If it were only one merge
> > commit, it would be faster to skip over the entire branch and find recent
> > breakages.
> >
> > When several feature branches are merged in a short period of time,
> > the extra work of checking the revision history of the branches takes much
> > more time.  This is a pain point for people who care about trunk stability
> > but can’t afford to spend all day running a full build on each commit to
> > isolate the breakage.
> >
> > I understand your use case of looking at multiple branches to find a
> > commit, to make sure maintenance branches have the proper commits or
> > backports.  Rebase + merge works best for maintenance branches.  However,
> > I am not convinced that the rebase + merge strategy is an efficient way to
> > manage trunk stability.  Is there a better way to manage this?  Perhaps we
> > can recommend that trunk use merge without rebase, while maintenance
> > branches apply the rebase + merge strategy.  Thoughts?
> >
> > regards,
> > Eric
> >
> > On 12/14/17, 5:16 PM, "Chris Douglas" <[hidden email]> wrote:
> >
> >     I'm sorry, I literally don't understand what you've written. What do
> >     clicks on github have to do with merges?
> >
> >     Are you talking about git bisect, where one would first identify the
> >     branch where the error was introduced, then run a second regression
> >     over the feature branch? With similar semantics for blame?
> >
> >     Again, I'd rather have the history of the branch, with rebases prior
> >     to merge to ensure that feature branches don't create particularly
> >     complicated graphs.
> >
> >     Perhaps I haven't understood the problem you're solving. The thread
> >     started with confusion over dates. Is that the problem? Or that
> >     rebases create intermediate states that never existed on the branch
> >     (due to conflicts), and that complicates analysis? -C
> >
> >     On Thu, Dec 14, 2017 at 2:31 PM Eric Yang <[hidden email]> wrote:
> >
> >     > When details are rebased, there are many more entries to test
> >     > through in the linear history than with a single merge point when
> >     > isolating where the error might have occurred.  It is similar to
> >     > traversing a tree structure: for each branch, there are n branches
> >     > to walk through.  If we know where the problem is before traversing
> >     > the individual branches, it can expedite the process of finding the
> >     > root cause.  IMHO, comparing the number of clicks for pagination vs
> >     > the drop-down branch selection on github, the latter seems like more
> >     > work, but it is usually fewer clicks for feature branches that lived
> >     > for a couple of months.
> >     >
> >     > Regards,
> >     > Eric
> >     >
> >     > On 12/14/17, 2:09 PM, "Chris Douglas" <[hidden email]> wrote:
> >     >
> >     >     I'd rather have the history. Otherwise tools like blame point
> >     >     only to a parent/umbrella JIRA, not the issue where the change
> >     >     was discussed.
> >     >
> >     >     We can force a merge commit so it's clear the branch was
> >     >     developed outside the mainline. -C
> >     >
> >     >
> >     >     On Thu, Dec 14, 2017 at 1:18 PM, Eric Yang <[hidden email]> wrote:
> >     >     > +1 on squash merge to keep history compressed.  The rebase +
> >     >     > merge contains good deals, but it is easy for people who don't
> >     >     > know that the rebase option is turned on by default for Hadoop
> >     >     > to get confused.
> >     >     >
> >     >     > Regards,
> >     >     > Eric
> >     >     >
> >     >     > On 12/14/17, 12:06 PM, "Arun Suresh" <[hidden email]> wrote:
> >     >     >
> >     >     >     Another option - at least for feature branches - is to
> >     >     >     maybe squash merge; this way we see it as a single commit?
> >     >     >     Although we will lose the feature branch history (I am ok
> >     >     >     with that though)
> >     >     >
> >     >     >     Cheers
> >     >     >     -Arun
> >     >     >
> >     >     >     On Thu, Dec 14, 2017 at 11:32 AM, Eric Yang <[hidden email]> wrote:
> >     >     >
> >     >     >     > Thank you for the pointer.  I guess all merges are done
> >     >     >     > using rebase + merge.  This is the reason that the
> >     >     >     > timeline is out of order.
> >     >     >     >
> >     >     >     > Would it be more useful to merge without rebasing for
> >     >     >     > feature branch merges, to avoid timeline confusion?  The
> >     >     >     > argument for not rebasing is that it would be easier to
> >     >     >     > find out whether the root cause of a trunk failure was
> >     >     >     > the merge or some recent commits.
> >     >     >     >
> >     >     >     > Regards,
> >     >     >     > Eric
> >     >     >     >
> >     >     >     > From: Sunil G <[hidden email]>
> >     >     >     > Date: Thursday, December 14, 2017 at 11:11 AM
> >     >     >     > To: Eric Yang <[hidden email]>
> >     >     >     > Cc: Hadoop Common <[hidden email]>
> >     >     >     > Subject: Re: Missing some trunk commit history

Re: Missing some trunk commit history

Chris Douglas
On Thu, Dec 14, 2017 at 9:40 PM, Eric Yang <[hidden email]> wrote:
> I am looking for a way to reduce time spent on testing latest commits.
> [...]
> People who did the
> feature merge is likely already did the full build test to ensure they
> didn't break trunk, but there is no easy indicator where the rebase start
> and ends.

OK, I think I understand. If we force a merge commit (i.e., specify
--no-ff during the merge) then I think that has the property you're
looking for without squashing all the history into a single commit. -C
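
The effect of `--no-ff` described above can be sketched in a throwaway repository. This is an illustration under assumptions rather than the Hadoop workflow itself: the branch names, file names, identity and commit messages are hypothetical, while `git merge --no-ff` and `git log --first-parent` are standard git options.

```shell
#!/bin/sh
# Sketch: force a merge commit with --no-ff in a throwaway repository.
# Branch names, file names, identity and messages are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Committer"
git config user.email "committer@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "Base commit on trunk"
# Rename whatever the default branch is called to "trunk".
git branch -m trunk

# Two commits on a feature branch.
git checkout -q -b feature
echo one > one.txt
git add one.txt
git commit -q -m "Feature part 1"
echo two > two.txt
git add two.txt
git commit -q -m "Feature part 2"

# trunk has not moved, so a plain merge would fast-forward and leave
# no marker; --no-ff records an explicit merge commit instead.
git checkout -q trunk
git merge -q --no-ff -m "Merge branch 'feature' into trunk" feature

# Full history: the merge commit, both feature commits, and the base.
git log --oneline

# Mainline-only view: the whole feature branch appears as one entry.
git log --first-parent --oneline
```

The individual feature commits stay reachable for blame, while the `--first-parent` view gives a single merge point per feature branch to test when narrowing down which merge broke the build.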



Re: Missing some trunk commit history

Eric Yang-4
Good catch on the history votes.  It looks like we didn’t follow our own process because the voting thread wasn’t very clear about --no-ff vs --squash.  I think --no-ff is the right solution: the merge commit acts as a checkpoint that contains the delta of the merge, which makes it possible to check that the person who did the merge did it right.  The how-to-commit page also lacks this information, and it has auto rebase turned on.  I added a section to the how-to-commit page recommending the --no-ff option.  Please review it for correctness.  Thank you
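
As a possible complement to documenting the flag, git can make `--no-ff` the default for merges into one branch through the `branch.<name>.mergeoptions` setting. The sketch below is an assumption-laden illustration and is not taken from the how-to-commit page: the repository, branch names, identity and file names are hypothetical, and the setting applies only to the clone where it is configured.

```shell
#!/bin/sh
# Sketch: make --no-ff the default for merges into one branch.
# Repository, branch names, identity and file names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Committer"
git config user.email "committer@example.invalid"

echo base > base.txt
git add base.txt
git commit -q -m "Base commit"
git branch -m trunk

# Local setting: every merge into trunk behaves as if --no-ff was given.
git config branch.trunk.mergeoptions "--no-ff"

git checkout -q -b feature
echo work > work.txt
git add work.txt
git commit -q -m "Feature work"

git checkout -q trunk
# No --no-ff on the command line; --no-edit accepts the default message.
git merge -q --no-edit feature

# The tip of trunk is a merge commit: its hash followed by two parents.
git rev-list --parents -n 1 HEAD
```

Because the setting lives in each committer's local configuration, it would reduce forgotten flags but would not enforce the policy on its own.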

Regards,
Eric

From: Andrew Wang <[hidden email]>
Date: Friday, December 15, 2017 at 11:06 AM
To: Eric Yang <[hidden email]>
Cc: Chris Douglas <[hidden email]>, Eric Yang <[hidden email]>, Arun Suresh <[hidden email]>, Sunil G <[hidden email]>, Hadoop Common <[hidden email]>
Subject: Re: Missing some trunk commit history

We actually already did:

https://lists.apache.org/thread.html/43cd65c6b6c3c0e8ac2b3c76afd9eff1f78b177fabe9c4a96d9b3d0b@1440189889@%3Ccommon-dev.hadoop.apache.org%3E

On Fri, Dec 15, 2017 at 10:54 AM, Eric Yang <[hidden email]<mailto:[hidden email]>> wrote:
+1 for merge --no-ff for feature merges.
Do we all agree on this optimization going forward?

Regards,
Eric

On 12/15/17, 10:34 AM, "Chris Douglas" <[hidden email]> wrote:

    On Thu, Dec 14, 2017 at 9:40 PM, Eric Yang <[hidden email]> wrote:
    > I am looking for a way to reduce time spent on testing latest commits.
    > [...]
    > The people who did the
    > feature merge have likely already run the full build test to ensure they
    > didn't break trunk, but there is no easy indicator of where the rebased
    > commits start and end.

    OK, I think I understand. If we force a merge commit (i.e., specify
    --no-ff during the merge) then I think that has the property you're
    looking for without squashing all the history into a single commit. -C

    > Therefore, other people will have to spend extra time to test
    > each commit individually.  It hurts my productivity to have to prove that
    > my pre-commit patch unit test failure was caused by someone else's check-in.  I
    > lost an entire day isolating the trunk build breakage for node manager, which was
    > caused by YARN-7381, and I was only able to find it by using the github view
    > that sorts commits by date instead of the git log view of the commit
    > history.  If I had tested commits one by one based on git log, I would
    > probably not be done testing yet.  If we can propose to use merge without
    > rebase for trunk, it might be more efficient to analyze bugs for
    > pre-commit builds.
    >
    > regards,
    > Eric
    >
    > On Thu, Dec 14, 2017 at 6:52 PM, Chris Douglas <[hidden email]> wrote:
    >
    >> Eric-
    >>
    >> What problem are you trying to solve? Most of us understand how git works,
    >> you can omit that. -C
    >>
    >> On Thu, Dec 14, 2017 at 6:31 PM Eric Yang <[hidden email]> wrote:
    >>
    >> > We currently ask committers to commit code based on:
    >> > https://wiki.apache.org/hadoop/HowToCommit
    >> >
    >> > To set branch.autosetuprebase always:
    >> >
    >> > Based on the current preference, the history is linear, and it is described
    >> > in this graph as Rebase and Merge:
    >> >
    >> > https://wac-cdn.atlassian.com/dam/jcr:df39b1f1-2686-4ee5-90bf-9836783342ce/10.svg?cdnVersion=iq
    >> >
    >> > This can lead to blaming the wrong person for a trunk
    >> > breakage, because it takes more time to iterate through all the commits from
    >> > the feature branch, while the recent commits (blue dots) are much further back
    >> > in history because of the rebase.  If the branch were a single merge commit, it would
    >> > be faster to skip over the entire branch and find recent breakages.
    >> >
    >> > When several feature branches are merged in a short period of time,
    >> > the extra work of checking the revision history of each branch takes much more
    >> > time.  This is a pain point for people who care about trunk stability but
    >> > can't afford to spend all day running a full build on each commit to isolate the
    >> > breakage.
    >> >
    >> > I understand your use case of looking across multiple branches to find a commit,
    >> > to make sure maintenance branches have the proper commits or backports.
    >> > Rebase + merge works best for maintenance branches.  However, I am not
    >> > convinced that the rebase + merge strategy is an efficient way to manage trunk
    >> > stability.  Is there a better way to manage this?  Perhaps we can
    >> > recommend that trunk use merge without rebase, while maintenance branches apply
    >> > the rebase + merge strategy.  Thoughts?
    >> >
    >> > regards,
    >> > Eric
    >> >
    >> > On 12/14/17, 5:16 PM, "Chris Douglas" <[hidden email]> wrote:
    >> >
    >> >     I'm sorry, I literally don't understand what you've written. What do clicks
    >> >     on github have to do with merges?
    >> >
    >> >     Are you talking about git bisect, where one would first identify the branch
    >> >     where the error was introduced, then run a second regression over the
    >> >     feature branch? With similar semantics for blame?
    >> >
    >> >     Again, I'd rather have the history of the branch, with rebases prior to
    >> >     merge to ensure that feature branches don't create particularly complicated
    >> >     graphs.
    >> >
    >> >     Perhaps I haven't understood the problem you're solving. The thread started
    >> >     with confusion over dates. Is that the problem? Or that rebases create
    >> >     intermediate states that never existed on the branch (due to conflicts),
    >> >     and that complicates analysis? -C
    >> >
    >> >     On Thu, Dec 14, 2017 at 2:31 PM Eric Yang <[hidden email]> wrote:
    >> >
    >> >     > When details are rebased, the number of entries to test in the linear
    >> >     > history is much larger than with a single merge point when isolating where
    >> >     > the error might have occurred.  It is similar to traversing a tree structure:
    >> >     > for each branch, there are n branches to walk through.  If we can know where the
    >> >     > problem is before traversing the individual branches, it can expedite the
    >> >     > process of finding the root cause.  IMHO, comparing the number of clicks for
    >> >     > pagination vs. the drop-down branch selection on github, the latter seems like more
    >> >     > work, but it is usually fewer clicks for feature branches that lived for a
    >> >     > couple of months.
    >> >     >
    >> >     > Regards,
    >> >     > Eric
    >> >     >
    >> >     > On 12/14/17, 2:09 PM, "Chris Douglas" <[hidden email]> wrote:
    >> >     >
    >> >     >     I'd rather have the history. Otherwise tools like blame point only to
    >> >     >     a parent/umbrella JIRA, not the issue where the change was discussed.
    >> >     >
    >> >     >     We can force a merge commit so it's clear the branch was developed
    >> >     >     outside the mainline. -C
    >> >     >
    >> >     >
    >> >     >     On Thu, Dec 14, 2017 at 1:18 PM, Eric Yang <[hidden email]> wrote:
    >> >     >     > +1 on squash merge to keep history compressed.  The rebase + merge
    >> >     >     > contains good details, but it is easy for people to get confused if they
    >> >     >     > don't know that the rebase option is turned on by default for Hadoop.
    >> >     >     >
    >> >     >     > Regards,
    >> >     >     > Eric
    >> >     >     >
    >> >     >     > On 12/14/17, 12:06 PM, "Arun Suresh" <[hidden email]> wrote:
    >> >     >     >
    >> >     >     >     Another option - at least for feature branches - is to maybe squash
    >> >     >     >     merge - this way we see it as a single commit? Although we will lose
    >> >     >     >     the feature branch history (I am ok with that though)
    >> >     >     >
    >> >     >     >     Cheers
    >> >     >     >     -Arun
    >> >     >     >
    >> >     >     >     On Thu, Dec 14, 2017 at 11:32 AM, Eric Yang <[hidden email]> wrote:
    >> >     >     >
    >> >     >     >     > Thank you for the pointer.  I guess all merges are done using rebase +
    >> >     >     >     > merge.  This is the reason the timeline is out of order.
    >> >     >     >     >
    >> >     >     >     > Would it be more useful to merge without rebasing for feature branch
    >> >     >     >     > merges, to avoid timeline confusion?  The argument for not rebasing is that
    >> >     >     >     > it would be easier to find whether the root cause of a trunk failure was the
    >> >     >     >     > merge or some recent commits.
    >> >     >     >     >
    >> >     >     >     > Regards,
    >> >     >     >     > Eric
    >> >     >     >     >
    >> >     >     >     > From: Sunil G <[hidden email]>
    >> >     >     >     > Date: Thursday, December 14, 2017 at 11:11 AM
    >> >     >     >     > To: Eric Yang <[hidden email]>
    >> >     >     >     > Cc: Hadoop Common <[hidden email]>
    >> >     >     >     > Subject: Re: Missing some trunk commit history
    >> >     >     >     >
    >> >     >     >     > Hi Eric.
    >> >     >     >     >
    >> >     >     >     > A branch merge has happened during that time, and hence you might have
    >> >     >     >     > seen some old commits from that branch. If you go down further, you could
    >> >     >     >     > see those commits.
    >> >     >     >     >
    >> >     >     >     > Copied from my git log:
    >> >     >     >     >
    >> >     >     >     > commit 40b0045ebe0752cd3d1d09be00acbabdea983799
    >> >     >     >     > Author: Weiwei Yang <[hidden email]>
    >> >     >     >     > Date:   Wed Dec 6 17:52:41 2017 +0800
    >> >     >     >     >
    >> >     >     >     >     YARN-7610. Extend Distributed Shell to support launching job with opportunistic containers. Contributed by Weiwei Yang.
    >> >     >     >     >
    >> >     >     >     > commit 56b1ff80dd9fbcde8d21a604eff0babb3a16418f
    >> >     >     >     > Author: Xiao Chen <[hidden email]>
    >> >     >     >     > Date:   Tue Dec 5 20:48:02 2017 -0800
    >> >     >     >     >
    >> >     >     >     >     HDFS-12872. EC Checksum broken when BlockAccessToken is enabled.
    >> >     >     >     >
    >> >     >     >     > commit 05c347fe51c01494ed8110f8f116a01c90205f13
    >> >     >     >     > Author: Weiwei Yang <[hidden email]>
    >> >     >     >     > Date:   Wed Dec 6 12:21:52 2017 +0800
    >> >     >     >     >
    >> >     >     >     >     YARN-7611. Node manager web UI should display container type in containers page. Contributed by Weiwei Yang.
    >> >     >     >     >
    >> >     >     >     > commit 73b86979d661f4ad56fcfc3a05a403dfcb2a860e
    >> >     >     >     > Author: Kai Zheng <[hidden email]>
    >> >     >     >     > Date:   Wed Dec 6 12:01:36 2017 +0800
    >> >     >     >     >
    >> >     >     >     >     HADOOP-15039. Move SemaphoredDelegatingExecutor to hadoop-common. Contributed by Genmao Yu
    >> >     >     >     >
    >> >     >     >     > commit 44b06d34a537f8b558007cc92a5d1a8e59b5d86b
    >> >     >     >     > Author: Akira Ajisaka <[hidden email]>
    >> >     >     >     > Date:   Wed Dec 6 11:40:33 2017 +0900
    >> >     >     >     >
    >> >     >     >     >     HDFS-12889. Router UI is missing robots.txt file. Contributed by Bharat Viswanadham.
    >> >     >     >     >
    >> >     >     >     > commit 0311cf05358cd75388f48f048c44fba52ec90f00
    >> >     >     >     > Author: Wangda Tan <[hidden email]>
    >> >     >     >     > Date:   Tue Dec 5 13:09:49 2017 -0800
    >> >     >     >     >
    >> >     >     >     >     YARN-7381. Enable the configuration: yarn.nodemanager.log-container-debug-info.enabled by default in yarn-default.xml. (Xuan Gong via wangda)
    >> >     >     >     >
    >> >     >     >     >     Change-Id: I1ed58dafad5cc276eea5c0b0813cf04f57d73a87
    >> >     >     >     >
    >> >     >     >     > commit 6555af81a26b0b72ec3bee7034e01f5bd84b1564
    >> >     >     >     > Author: Aaron Fabbri <[hidden email]>
    >> >     >     >     > Date:   Tue Dec 5 11:06:32 2017 -0800
    >> >     >     >     >
    >> >     >     >     >     HADOOP-14475 Metrics of S3A don't print out when enabled. Contributed by Younger and Sean Mackrory.
    >> >     >     >     >
    >> >     >     >     >
    >> >     >     >     >
    >> >     >     >     > - Sunil
    >> >     >     >     >
    >> >     >     >     >
    >> >     >     >     > On Fri, Dec 15, 2017 at 12:29 AM Eric Yang <[hidden email]> wrote:
    >> >     >     >     > Hi all,
    >> >     >     >     >
    >> >     >     >     > While troubleshooting a trunk build failure, I noticed that the commit history
    >> >     >     >     > for trunk between Nov 30th and Dec 6th has been squashed or has disappeared for no
    >> >     >     >     > apparent reason.  This seems to have taken place in the last 24 hours.  I can see
    >> >     >     >     > the commit logs in the github UI, but when doing a new clone from Apache Git and
    >> >     >     >     > Github, the commit history between those dates is gone.  I usually
    >> >     >     >     > maintain two git repositories, one for testing and one for development.
    >> >     >     >     > Both repositories are synced with github frequently; only the test
    >> >     >     >     > repository was updated today, and the missing history shows up only in the test
    >> >     >     >     > repository.  This is why I have the impression that this
    >> >     >     >     > happened in the last 24 hours.  I did some spot checks to see whether the
    >> >     >     >     > missing commits are in trunk.  The code seems to be in place, and only the
    >> >     >     >     > commit history is gone.
    >> >     >     >     >
    >> >     >     >     > Is there any way to fix the commit history?  Hopefully this is not a git
    >> >     >     >     > bug, but some peer review might find the root cause, which could help us
    >> >     >     >     > understand the damage.  Thank you
    >> >     >     >     >
    >> >     >     >     > Regards,
    >> >     >     >     > Eric
    >> >     >     >     >