Exploring the Monorepo #5: Perfect Docker
Today we'll focus on solving problems.
For details on how these problems arose, see the previous blog post. Now let's look at how to solve them.
Table of contents
Dockerfile
Converging on a plan
It's critical to understand that Docker caches each line in the Dockerfile, and that the output of one line is the input of the next. So if a line generates new output all subsequent caches are invalidated. With that in mind, here's a common Docker anti-pattern that causes issue 1:
COPY . .
RUN pnpm install
If anything changes in any file then pnpm install has to run from scratch, because the COPY . . would produce a different output. This should always be optimized so that only the files necessary to install dependencies are copied in first, then dependencies are installed, and then the rest of the source-files are copied in. Something like this:
COPY package.json .
COPY pnpm-lock.yaml .
COPY pnpm-workspace.yaml .
COPY apps/web/package.json ./apps/web/
COPY libs/types/package.json ./libs/types/
RUN pnpm install
COPY . .
Now all steps up to and including pnpm install remain cached so long as none of those meta-files change, and so Docker will skip all those steps. This is a massive speedup.
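To make the caching rule concrete, here is a toy model of it. This is only a sketch and not Docker's actual algorithm: the function names, the string-based cache keys, and the file contents are all made up for illustration. Each layer's key depends on the previous layer's key, the instruction, and the content of the files that instruction copies in.

```javascript
// Toy model of Docker layer caching; NOT Docker's real implementation.
// A layer's key chains together the previous key, the instruction text,
// and the content of every file the instruction copies in.
function layerKeys(steps, files) {
  let previous = "base";
  return steps.map((step) => {
    const inputs = Object.keys(files)
      .filter((path) => step.copies(path))
      .sort()
      .map((path) => `${path}=${files[path]}`)
      .join(";");
    previous = `${previous}|${step.instruction}|${inputs}`;
    return previous;
  });
}

// Counts how many leading layers can be reused between two builds.
function cachedLayerCount(steps, before, after) {
  const a = layerKeys(steps, before);
  const b = layerKeys(steps, after);
  let count = 0;
  while (count < a.length && a[count] === b[count]) count++;
  return count;
}

const isMetaFile = (path) =>
  path.endsWith("package.json") || path === "pnpm-lock.yaml";

// The anti-pattern: copy everything, then install.
const naive = [
  { instruction: "COPY . .", copies: () => true },
  { instruction: "RUN pnpm install", copies: () => false },
];

// The optimized order: meta-files, install, then the rest.
const optimized = [
  { instruction: "COPY <meta-files>", copies: isMetaFile },
  { instruction: "RUN pnpm install", copies: () => false },
  { instruction: "COPY . .", copies: () => true },
];

const before = {
  "package.json": "v1",
  "pnpm-lock.yaml": "v1",
  "apps/web/src/index.ts": "v1",
};
// Only a source file changes between the two builds.
const after = { ...before, "apps/web/src/index.ts": "v2" };

console.log(cachedLayerCount(naive, before, after)); // 0
console.log(cachedLayerCount(optimized, before, after)); // 2
```

In this model a source-file change leaves nothing cached in the naive ordering, while the optimized ordering keeps both the meta-file copy and the install step.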
The downside is we're now manually specifying all those meta-files ☹️. And that leads to issue 2:
Using the COPY <meta-file> construct scales poorly because we have to author each Dockerfile with explicit and detailed information about which dependencies to copy in. And by using the COPY . . construct we copy all monorepo files, which needlessly bloats the image because for this example we only need the source-files from apps/web and libs/types (web only depends on types).
The key insight is that pnpm already understands how dependencies depend on each other, so we should be able to leverage that. We can't use pnpm directly from Dockerfile's COPY construct, but what if we use pnpm to generate a context that only contains the files needed for a specific package? Then the Dockerfile for that package could use COPY . . but it'd actually copy in just the right files…
And, hang on, let's consider the meta-files too. The challenge is that we can't easily isolate all the package.json files, so we resort to path-specific COPY commands. But what if we get really clever and create our custom context such that all the meta-files are placed in a /meta folder inside the context for easy copying, and we put the rest of the source-files in another folder?
Let's see if that'll work!
Custom context script
We introduced the custom context technique where we simply piped tar into Docker:
$ cd apps/web
$ tar -cf - ../.. | docker build -f apps/web/Dockerfile -
Now it's time we discard the naive tar command and come up with something more bespoke.
I've made a script that takes a Dockerfile and finds just the right files needed for that package, and outputs it all as a tarball so it's a drop-in replacement for the tar command.
ℹ️ BTW, the full script is available on GitHub1s.com if you'd like to have a look.
Here's how it's used:
$ pnpm --silent pnpm-context -- --list-files apps/web/Dockerfile
Dockerfile
deps/libs/types/.gitignore
deps/libs/types/Dockerfile
deps/libs/types/libs-types.iml
deps/libs/types/package.json
deps/libs/types/src/index.ts
deps/libs/types/tsconfig.json
meta/apps/web/package.json
meta/libs/types/package.json
meta/package.json
meta/pnpm-lock.yaml
meta/pnpm-workspace.yaml
pkg/apps/web/.gitignore
pkg/apps/web/apps-web.iml
pkg/apps/web/package.json
pkg/apps/web/src/client.tsx
pkg/apps/web/src/index.ts
pkg/apps/web/src/node.d.ts
pkg/apps/web/src/pages/App.css
pkg/apps/web/src/pages/App.tsx
pkg/apps/web/src/pages/Home.css
pkg/apps/web/src/pages/Home.spec.tsx
pkg/apps/web/src/pages/Home.tsx
pkg/apps/web/src/pages/react.svg
pkg/apps/web/src/server.tsx
pkg/apps/web/tsconfig.json
pkg/apps/web/typings/index.d.ts
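Now that's a lean context! Notice how only "libs/types" and "apps/web" files are present, and how the files are split across three folders: "deps", "meta", and "pkg". That's the mechanism we'll use in the Dockerfile to copy in just the meta-files, but we'll get to that in a moment.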
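The three-folder split in the listing above could be sketched roughly like this. It is a hypothetical illustration, not the code of the real pnpm-context.mjs script: the function name contextPaths, the META_FILES list, and the way the dependency list is passed in are all assumptions, and the real script gets the dependencies from pnpm rather than from a hard-coded array.

```javascript
// Hypothetical sketch; the real pnpm-context.mjs may work differently.
const META_FILES = ["package.json", "pnpm-lock.yaml", "pnpm-workspace.yaml"];

const baseName = (path) => path.split("/").pop();
const isInside = (path, dir) => path.startsWith(`${dir}/`);

// Returns every location a repository file gets inside the custom context.
// An empty array means the file is left out of the context entirely.
function contextPaths(file, dockerfile, targetPackage, dependencyPackages) {
  // The Dockerfile itself sits at the root of the context.
  if (file === dockerfile) return ["Dockerfile"];

  const inTarget = isInside(file, targetPackage);
  const inDependency = dependencyPackages.some((dir) => isInside(file, dir));
  const isRootFile = !file.includes("/");
  const paths = [];

  // Meta-files are duplicated into meta/ so they can be copied on their own.
  if (META_FILES.includes(baseName(file)) && (inTarget || inDependency || isRootFile)) {
    paths.push(`meta/${file}`);
  }
  if (inDependency) paths.push(`deps/${file}`);
  if (inTarget) paths.push(`pkg/${file}`);
  return paths;
}

const dockerfile = "apps/web/Dockerfile";
const target = "apps/web";
const dependencies = ["libs/types"]; // assumed to come from pnpm in practice

console.log(contextPaths("apps/web/package.json", dockerfile, target, dependencies));
// → meta/apps/web/package.json and pkg/apps/web/package.json
console.log(contextPaths("libs/types/src/index.ts", dockerfile, target, dependencies));
// → deps/libs/types/src/index.ts
```

Under these assumptions a package.json lands in two places, which matches the listing: once under meta/ for the install step and once next to the sources it belongs to.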
Actually, this context is too lean 😅: the root tsconfig.json file isn't included because pnpm has no way of knowing it's used, but our packages do depend on it. And the bin/postinstall script is also required. To fix this we can specify additional inclusion patterns using the -p argument:
$ pnpm --silent pnpm-context -- -p 'tsconfig.json' -p 'bin/' --list-files apps/web/Dockerfile
...
pkg/bin/preinstall
pkg/tsconfig.json
ℹ️ BTW, the repository actually calls pnpm-context.mjs with a few more arguments, see the "docker:build" script in package.json on GitHub1s.com for all the details.
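As a rough illustration of what the -p patterns do to the context, here is a minimal sketch. The matching rule is an assumption made for this example (a pattern ending in "/" matches a whole folder, anything else must match the path exactly); the real script may well use proper glob matching, and extraContextPath is a hypothetical name.

```javascript
// Hypothetical matching rule, chosen only for illustration:
//   - a pattern ending in "/" matches every file inside that folder
//   - any other pattern must equal the file path exactly
function extraContextPath(file, patterns) {
  const matched = patterns.some((pattern) =>
    pattern.endsWith("/") ? file.startsWith(pattern) : file === pattern
  );
  // Extra files are placed under pkg/, as in the listing above.
  return matched ? `pkg/${file}` : null;
}

const patterns = ["tsconfig.json", "bin/"];

console.log(extraContextPath("tsconfig.json", patterns)); // pkg/tsconfig.json
console.log(extraContextPath("bin/preinstall", patterns)); // pkg/bin/preinstall
console.log(extraContextPath("README.md", patterns)); // null
```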
Now the context is good. Let's see how we pipe it into Docker to build an image:
$ pnpm --silent pnpm-context -- -p 'tsconfig.json' -p 'bin/' \
    apps/web/Dockerfile | docker build --build-arg PACKAGE_PATH=apps/web - -t mono-web
[+] Building 3.1s (19/19) FINISHED
It works! But let's see how the Dockerfile actually works with this new context.
Dockerfile
ℹ️ BTW, in this article I'll only show explanatory snippets/examples of the Dockerfile, but you can see the full Dockerfile on GitHub1s.com.
Using the new custom context subfolders is pretty straightforward. Here's an example of how the new Dockerfile is structured:
ARG PACKAGE_PATH
# ↑ Specified via Docker's `--build-arg` argument
COPY ./meta .
RUN pnpm install --filter "{${PACKAGE_PATH}}..." --frozen-lockfile
# ↑ `...` selects the package and its dependencies
COPY ./deps .
RUN pnpm build --if-present --filter "{${PACKAGE_PATH}}^..."
# ↑ `^...` ONLY selects the dependencies of the package, but not the package itself
COPY ./pkg .
RUN pnpm build --if-present --filter "{${PACKAGE_PATH}}"
RUN pnpm test --if-present --filter "{${PACKAGE_PATH}}"
# Everything's built and good to go 🎉
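The two filter suffixes used in the Dockerfile can be pictured as simple graph walks. The sketch below is a toy model, not pnpm's implementation: the graph is hard-coded for this example's two packages, the function names are made up, and it assumes the dependency graph has no cycles.

```javascript
// Toy dependency graph for this example:
// package path -> workspace packages it depends on.
const graph = {
  "apps/web": ["libs/types"],
  "libs/types": [],
};

// Collects the transitive dependencies of `pkg`, excluding `pkg` itself.
// Assumes the graph is acyclic.
function dependenciesOf(graph, pkg, seen = new Set()) {
  for (const dep of graph[pkg] ?? []) {
    if (!seen.has(dep)) {
      seen.add(dep);
      dependenciesOf(graph, dep, seen);
    }
  }
  return seen;
}

// Models `--filter "{pkg}..."`: the package plus its dependencies.
const selectWithDependencies = (graph, pkg) => [pkg, ...dependenciesOf(graph, pkg)];

// Models `--filter "{pkg}^..."`: only the dependencies, not the package.
const selectOnlyDependencies = (graph, pkg) => [...dependenciesOf(graph, pkg)];

console.log(selectWithDependencies(graph, "apps/web")); // apps/web and libs/types
console.log(selectOnlyDependencies(graph, "apps/web")); // libs/types only
```

This is why the Dockerfile can build the dependencies right after copying deps, and only build the package itself once pkg has been copied in.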
With this structure pnpm install only runs if a meta-file changes, and the Dockerfile doesn't contain any manually-specified, package-specific paths. We've crushed issues #1 and 2! 🎉
Cache the pnpm store
It's nice that we preserve the pnpm install cache as much as we can, but when it does have to run it frustratingly re-downloads every single dependency from scratch. That's very wasteful in time and bandwidth! On our own machines pnpm downloads to a persisted store so it never has to re-download a package, but that store never gets persisted inside Docker because the cached layer holding it is thrown away as soon as a meta-file changes.
But Docker has a mechanism for exactly this: It allows a RUN command to mount a folder which is persisted on the host machine, so when the command runs it has access to files from previous runs. The code for this ends up a bit complex-looking, but it's worth the performance boost so let's try it out:
ARG PACKAGE_PATH
COPY ./meta .
RUN --mount=type=cache,id=pnpm-store,target=/root/.pnpm-store \
    # ↑ By caching the content-addressable store we stop
    # downloading the same dependencies again and again.
    # Unfortunately, doing this causes Docker to place
    # the pnpm content-addressable store on a different
    # virtual drive, which prohibits pnpm from
    # symlinking its content to its virtual store,
    # and that causes pnpm to fall back on copying the
    # files, and… that's totally fine! Except pnpm emits
    # many warnings that it's not using symlinks, so
    # we also must use `grep` to filter out those warnings.
    pnpm install --filter "{${PACKAGE_PATH}}..." \
        --frozen-lockfile \
    | grep --invert-match "cross-device link not permitted\|Falling back to copying packages from store"
# ↑ Using `--invert-match` to discard annoying output
It would be nice if we could tell pnpm to be quiet when it can't symlink, but we can survive this complexity.
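For readers less familiar with grep, the filter in that RUN command is equivalent to the following sketch. The two noise phrases come from the grep pattern itself; the sample lines and the function name dropNoise are made up for illustration and are not real pnpm output.

```javascript
// The two phrases taken from the grep pattern in the Dockerfile.
const NOISE = [
  "cross-device link not permitted",
  "Falling back to copying packages from store",
];

// Equivalent of `grep --invert-match "a\|b"`:
// keep only the lines that contain neither phrase.
function dropNoise(output) {
  return output
    .split("\n")
    .filter((line) => !NOISE.some((phrase) => line.includes(phrase)))
    .join("\n");
}

// Made-up sample lines, only to show the filtering.
const sample = [
  "first ordinary line",
  "a warning: cross-device link not permitted",
  "Falling back to copying packages from store",
  "second ordinary line",
].join("\n");

console.log(dropNoise(sample));
// first ordinary line
// second ordinary line
```

One thing to keep in mind with the shell version: by default a pipeline's exit status is that of its last command, so a failing pnpm install could be masked by grep.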
Strip dev-dependencies
We've reached the last issue: we're bloating the final image with dev-dependencies because we don't clean up after building apps/web inside the image. It's a waste we shouldn't allow.
The solution is to reset back to having no dependencies installed, and then only installing the production dependencies. This is pretty straightforward to do by using Docker stages:
FROM node:16-alpine AS base
# Install pnpm
FROM base AS dev
# Install all dependencies and build the package
FROM base AS prod
# Install just prod dependencies
With this approach the "prod" stage isn't affected by whatever happens in the "dev" stage. Nice! But because dev builds the package we do need some way to transfer files from dev to prod, because we need the final build code to be moved to prod stage. For that we can introduce an "assets" layer where we isolate just the files that should go into the prod stage. So we can do something like this:
FROM node:16-alpine AS base
RUN npm --global install pnpm
WORKDIR /root/monorepo
FROM base AS dev
# Install all dependencies and build the package
FROM dev AS assets
RUN rm -rf node_modules && pnpm recursive exec -- rm -rf ./node_modules ./src
# ↑ Reset back to no dependencies installed, and delete all
# src folders because we don't need source-files.
# This way whatever files got built are left behind.
FROM base AS prod
RUN pnpm install --prod --filter "{${PACKAGE_PATH}}..."
# ↑ Install just prod dependencies
COPY --from=assets /root/monorepo .
So here the "assets" stage isolates whatever code was generated in the dev stage, which the prod stage then copies into itself. Does it work?
$ cd apps/web
$ pnpm build
$ docker run mono-web
[razzle] > Started on port 3000
🎉
Updating the CI script
It's one thing to get all this working locally, but we also need to update our GitHub Actions CI script.
The first problem is that it won't run the pnpm-context.mjs script at all, because we never actually install the dependencies it needs. To do that we must run pnpm install just for the monorepo's root. There's an easy way to do that with the GitHub Action called pnpm/action-setup: it can both install pnpm and run pnpm install, so we can tell it to install the dependencies for the monorepository:
- uses: pnpm/action-setup@v2
  with:
    run_install: |
      - args: [--frozen-lockfile, --filter "exploring-the-monorepo"]
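But then we get another interesting error: the Docker build fails because we use the mount feature (to cache the pnpm store), and it turns out we need to enable "BuildKit" mode to use that. BuildKit is an upcoming set of features from Docker that aren't yet enabled by default, and the solution turns out to be rather simple: set the environment variable DOCKER_BUILDKIT:
$ DOCKER_BUILDKIT=1 docker build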
Conclusion
The issues we set out to vanquish have been resolved 🎉. We now build images that play nice with Docker caching, the Dockerfiles are free from manually-specified dependency concerns, and the final images are very lean and optimal. Quite nice!
I feel the pnpm investment is really paying off. It was already a nice CLI to use, and how amazing that they also have a pretty straightforward API we can use programmatically for our dependency-graph logic!
This article's title promised "perfect", did we achieve that? Well, no, perfection is a high bar, but we've addressed all the practical concerns I've experienced so I'm happy to call it a day here. We wouldn't want to get too carried away after all 👀 (I think for some, this entire article-series is already deep into "carried away" territory).
I'd love to hear if you have any questions or comments, or if there are any directions you'd like to see explored in future articles. So please leave a comment.
Reference
Original article (Exploring the Monorepo #5: Perfect Docker): https://dev.to/jonlauridsen/exploring-the-monorepo-5-perfect-docker-52aj
The text may be freely shared or copied, but please keep this URL as the reference.