Re: [DISCUSS] Docker build process

Re: [DISCUSS] Docker build process

Jim Brennan
I agree with Steve and Marton.   I am ok with having the docker build as an
option, but I don't want it to be the default.
Jim


On Tue, Mar 19, 2019 at 12:19 PM Eric Yang <[hidden email]> wrote:

> Hi Marton,
>
> Thank you for your input.  I agree with most of what you said, with a few
> exceptions.  A security fix should result in a new version of the image
> rather than replacing an existing version.  The Dockerfile will most likely
> change to apply the security fix.  If it did not change, the source would
> become unstable over time and eventually stop being buildable.  When the
> Maven release is automated through Jenkins, this is as easy as clicking a
> button.  Jenkins even increments the target version automatically, with an
> option to edit.  It makes the release manager's job easier than Homer
> Simpson's job.
>
> If versioning is done correctly, older branches can have the same docker
> subproject, and Hadoop 2.7.8 can be released for older Hadoop branches.  We
> should not create a timeline paradox by allowing the history of Hadoop
> 2.7.1 to change.  That release has passed; let it stay that way.
>
> There is mounting evidence that the Hadoop community wants a docker profile
> for the developer image.  The precommit build will not catch some build
> errors, because more code can slip through when the docker build sits
> behind a profile.  I will make adjustments accordingly unless 7 more people
> come out and say otherwise.
>
> Regards,
> Eric
>
> On 3/19/19, 1:18 AM, "Elek, Marton" <[hidden email]> wrote:
>
>
>
>     Thank you, Eric, for describing the problem.
>
>     I have multiple small comments, trying to separate them.
>
>     I. separated vs in-build container image creation
>
>     > The disadvantages are:
>     >
>     > 1.  Require developer to have access to docker.
>     > 2.  Default build takes longer.
>
>
>     These are not the only disadvantages (IMHO), as I wrote in the
>     previous thread and in the issue [1].
>
>     In-build container image creation doesn't make it possible to:
>
>     1. modify the image later (e.g. apply security fixes to the container
>     itself or apply improvements to the startup scripts)
>     2. create images for older releases (e.g. Hadoop 2.7.1)
>
>     I think there are two kinds of images:
>
>     a) images for released artifacts
>     b) developer images
>
>     I would prefer to manage a) with separated branch repositories but b)
>     with (optional!) in-build process.
>
>     II. Agree with Steve. I think it's better to make it optional as most
>     of the time it's not required. I think it's better to support the
>     default dev build with the default settings (=just enough to start)
>
>     III. Maven best practices
>
>     (https://dzone.com/articles/maven-profile-best-practices)
>
>     I think this is a good article. But it argues not against profiles as
>     such, but against creating multiple versions of the same artifact under
>     the same name (e.g. jdk8/jdk11). In Hadoop, profiles are used to
>     introduce optional steps. I think that's fine, as the Maven
>     lifecycle/phase model is very static (compare it with the tree-based
>     approach in Gradle).
>
>     Marton
>
>     [1]: https://issues.apache.org/jira/browse/HADOOP-16091
>
>     On 3/13/19 11:24 PM, Eric Yang wrote:
>     > Hi Hadoop developers,
>     >
>     > In recent months, there have been various discussions on creating a
>     > docker build process for Hadoop.  Last month on the mailing list,
>     > when the Ozone team was planning a new repository for Hadoop/Ozone
>     > docker images, there was convergence toward making the docker build
>     > process inline.  A new feature has started to add the docker image
>     > build process inline in the Hadoop build.
>     > A few lessons were learned from making the docker build inline in
>     > YARN-7129.  The build environment must have docker for the docker
>     > build to succeed.  BUILD.txt states that, for an easy build
>     > environment, one should use Docker.  There is logic in place to
>     > ensure that the absence of docker does not trigger the docker build.
>     > The inline process tries to be as non-disruptive as possible to
>     > existing development environments, with one exception: if docker’s
>     > presence is detected but the user does not have rights to run docker,
>     > the build will fail.
>     >
>     > Now, some developers are pushing back on the inline docker build
>     > process because the existing environment did not make the docker
>     > build mandatory.  However, there are benefits to using an inline
>     > docker build process.  The listed benefits are:
>     >
>     > 1.  Source code tag, maven repository artifacts and docker hub
>     >     artifacts can all be produced in one build.
>     > 2.  Less manual labor to tag different source branches.
>     > 3.  Reduce intermediate build caches that may exist in multi-stage
>     >     builds.
>     > 4.  Release engineers and developers do not need to search a maze of
>     >     build flags to acquire artifacts.
>     >
>     > The disadvantages are:
>     >
>     > 1.  Require developer to have access to docker.
>     > 2.  Default build takes longer.
>     >
>     > The above disadvantages can be worked around by using the
>     > -DskipDocker flag to avoid the docker build completely, or
>     > -pl !modulename to bypass subprojects.
>     > Hadoop development has not followed Maven best practice, because a
>     > full Hadoop build requires a number of profiles and configuration
>     > parameters.  Some evolutions work against the Maven design and
>     > require forking separate source trees for different subprojects and
>     > pom files.  The Maven best practice article
>     > (https://dzone.com/articles/maven-profile-best-practices) explains
>     > that profiles should not be used to trigger different artifact
>     > builds, because this pattern introduces artifact naming conflicts in
>     > the Maven repository.  Maven offers flags to skip certain operations,
>     > such as -DskipTests, -Dmaven.javadoc.skip=true, -pl, or -DskipDocker.
>     > It seems worthwhile to make some corrections so that the Hadoop build
>     > follows best practice.
>     >
>     > Some developers have advocated for a separate build process for
>     > docker images.  We need consensus on the direction that will work
>     > best for the Hadoop development community.  Hence, my questions are:
>     >
>     > Do we want to have an inline docker build process in maven?
>     > If yes, it would be the developer’s responsibility to pass the
>     > -DskipDocker flag to skip docker.  Docker is mandatory for the
>     > default build.
>     > If no, what is the release flow for docker images going to look like?
>     >
>     > Thank you for your feedback.
>     >
>     > Regards,
>     > Eric
>     >
>
>     ---------------------------------------------------------------------
>     To unsubscribe, e-mail: [hidden email]
>     For additional commands, e-mail: [hidden email]
>
>
>
>
>

Re: [DISCUSS] Docker build process

Eric Yang-4
Hi Jonathan,

Thank you for your input.  A Google query for dockerfile-maven-plugin site:github.com returns 15,300 matches, and a query restricted to Apache-hosted projects returns 377.  I see that many projects opt to use a profile to avoid building docker images on every build, while others keep the process inline.  People have the right to opt out of compiling as an effective root user by passing the -DskipDocker flag.  Hence, the effective root user requirement is not permanent.

People did not change their viewpoints after the discussion in this email thread.  I understand that no one likes disruptive changes.  I don’t expect that calling a vote on this issue will change the outcome.  Sufficient facts have been presented from both points of view in this email thread.  I feel enough pushback from the community on the mandatory inline process, and I am flexible about changing to a profile-based process.  I don’t need to feel guilty for implementing a half-baked release process, and I respect the community decision.  Let’s digest the presented facts for the rest of the day.  I am OK with not calling the vote unless others think a voting procedure is required.

Regards,
Eric

From: Jonathan Eagles <[hidden email]>
Date: Tuesday, March 19, 2019 at 11:48 AM
To: Eric Yang <[hidden email]>
Cc: "Elek, Marton" <[hidden email]>, Hadoop Common <[hidden email]>, "[hidden email]" <[hidden email]>, Hdfs-dev <[hidden email]>, Eric Badger <[hidden email]>, Eric Payne <[hidden email]>, Jim Brennan <[hidden email]>
Subject: Re: [DISCUSS] Docker build process

This email discussion thread is the result of failing to reach consensus in the JIRA. If you participate in this discussion thread, please recognize that a considerable effort has been made by contributors on this JIRA. On the other hand, contributors to this JIRA need to listen carefully to the comments in this discussion thread since they represent the thoughts and voices of the open source community that will a) benefit from and b) bear the burden of this feature. Failing to listen to these voices is failing to deliver a feature in its best form.

My thoughts-

As shown in my comments on YARN-7129, I have particular concerns that resonate with those of other posters on this thread.
https://issues.apache.org/jira/browse/YARN-7129?focusedCommentId=16790842&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16790842
- Docker images don't evolve at the same rate as Hadoop (tends to favor a separate release cycle, perhaps project)
- Docker images could have many flavors and favoring one flavor (say ubuntu, or windows) over another takes away from Apache Hadoop's platform neutral stance (providing a single "one image fits all" stance is optimistic).
- Introduces release processes that could limit the community's ability to produce releases at a regular rate. (Effective root user permissions needed to create image limiting who can release, extra Docker image only releases)
- In addition, I worry this sends a complicated message to our consumers and will stagnate release adoption.

> I will make adjustments accordingly unless 7 more people come out and say otherwise.

I'm sorry if this is a bit of humor which is lost on me. However, Apache Hadoop has a set of bylaws that dictate the community's process on decision making.
https://hadoop.apache.org/bylaws.html

Best Regards,
jeagles
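
For readers following the thread, the profile-based (opt-in) approach that the discussion converged on could look roughly like the POM fragment below. This is only a sketch under stated assumptions, not the configuration actually committed to Hadoop: the profile id `docker`, the execution id, and the image repository name `example/hadoop-dev` are hypothetical, and the plugin version is just one published release of the dockerfile-maven-plugin mentioned above.

```xml
<!-- Sketch of an opt-in docker image build; all ids and names are illustrative. -->
<profiles>
  <profile>
    <!-- Hypothetical profile id; the image is built only when it is activated. -->
    <id>docker</id>
    <build>
      <plugins>
        <plugin>
          <groupId>com.spotify</groupId>
          <artifactId>dockerfile-maven-plugin</artifactId>
          <version>1.4.10</version>
          <executions>
            <execution>
              <!-- Hypothetical execution id -->
              <id>build-image</id>
              <!-- Run after the regular artifacts have been packaged -->
              <phase>package</phase>
              <goals>
                <goal>build</goal>
              </goals>
            </execution>
          </executions>
          <configuration>
            <!-- Hypothetical image name; the tag follows the Maven project version -->
            <repository>example/hadoop-dev</repository>
            <tag>${project.version}</tag>
          </configuration>
        </plugin>
      </plugins>
    </build>
  </profile>
</profiles>
```

With a layout like this, a plain `mvn package` would never touch docker, and a developer who wants the image would run `mvn package -Pdocker`. The inline alternative argued for earlier in the thread would instead place the plugin in the main `<build>` section, leaving developers without docker to opt out with a skip flag such as the `-DskipDocker` property discussed above.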