Building with kaniko results in missing packages in node_modules #973

Open

Lewis-software opened this issue Mar 3, 2024 · 4 comments

@Lewis-software

I don't know if anyone else is experiencing this issue, but when we try to build with kaniko everything goes fine (no error shown).
When we deploy the image in production, we find that packages are missing from node_modules.
This is weird, because building locally works fine and the packages are listed in package.json.

Reading the kaniko logs, it seems that kaniko reads the folder from cache, and we wonder whether that may be the cause of the problem.

This is the build stage in our .gitlab-ci.yaml:

build-with-kaniko:
  stage: build
  image: gperdomor/nx-kaniko:20.11.1-alpine
  variables:
    # Nx Container
    INPUT_PUSH: 'true' # To push your image to the registry
    INPUT_ENGINE: 'kaniko' # Overriding engine of project.json files
  cache:
    key:
      files:
        - pnpm-lock.yaml
    paths:
      - .pnpm-store
  before_script:
    - npm i -g pnpm
    - pnpm config set store-dir .pnpm-store
    - pnpm i
    - NX_HEAD=$CI_COMMIT_SHA
    - NX_BASE=${CI_MERGE_REQUEST_DIFF_BASE_SHA:-$CI_COMMIT_BEFORE_SHA}
    # Login to registry
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"auth\":\"$(echo -n $CI_REGISTRY_USER:$CI_REGISTRY_PASSWORD | base64)\"}}}" > /kaniko/.docker/config.json
  script:
    - pnpm nx affected --base=$NX_BASE --head=$NX_HEAD --target=container --configuration=production --parallel=1
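
One way to test the cache theory is a one-off build with kaniko's layer cache disabled, then checking whether the resulting image still misses packages. A sketch that calls the executor directly (standard kaniko flags; the cache-test tag is made up, and this bypasses the plugin entirely):

# One-off build with layer caching off; if this image contains all
# packages, a stale cached layer is very likely the cause.
/kaniko/executor \
  --context "$CI_PROJECT_DIR" \
  --dockerfile "$CI_PROJECT_DIR/Dockerfile" \
  --destination "$CI_REGISTRY_IMAGE/my-app:cache-test" \
  --cache=false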

This is the error shown when launching the image in docker:

node:internal/modules/cjs/loader:1147
  throw err;
  ^
Error: Cannot find module 'cookie-parser'
Require stack:
- /usr/src/app/main.js
    at Module._resolveFilename (node:internal/modules/cjs/loader:1144:15)
    at Module._load (node:internal/modules/cjs/loader:985:27)
    at Module.require (node:internal/modules/cjs/loader:1235:19)
    at require (node:internal/modules/helpers:176:18)
    at Array.__webpack_modules__ (/usr/src/app/main.js:2492:18)
    at __webpack_require__ (/usr/src/app/main.js:2515:41)
    at /usr/src/app/main.js:2531:49
    at /usr/src/app/main.js:2581:3
    at /usr/src/app/main.js:2584:12
    at webpackUniversalModuleDefinition (/usr/src/app/main.js:3:20) {
  code: 'MODULE_NOT_FOUND',
  requireStack: [ '/usr/src/app/main.js' ]
}
Node.js v20.11.1
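
To confirm that the package is really absent from the image (and not just mis-resolved at runtime), something like this should work (a sketch; substitute the real image tag):

# Override dumb-init with a plain shell and look for the package;
# WORKDIR is /usr/src/app, so node_modules can be listed directly.
docker run --rm --entrypoint sh my-registry/my-app:latest \
  -c 'ls node_modules | grep cookie-parser'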

This is our package.json:

{
  "name": "my-app",
  "version": "0.0.1",
  "dependencies": {
    "@apollo/gateway": "2.7.1",
    "@apollo/subgraph": "2.7.1",
    "@nestjs-plugins/nestjs-nats-jetstream-transport": "2.2.6",
    "@nestjs/apollo": "12.1.0",
    "@nestjs/axios": "3.0.2",
    "@nestjs/common": "10.3.3",
    "@nestjs/config": "3.2.0",
    "@nestjs/core": "10.3.3",
    "@nestjs/graphql": "12.1.1",
    "@nestjs/jwt": "10.2.0",
    "@nestjs/microservices": "10.3.3",
    "@nestjs/platform-express": "10.3.3",
    "@nestjs/terminus": "10.2.3",
    "@nestjs/typeorm": "10.0.2",
    "@songkeys/nestjs-redis": "10.0.0",
    "axios": "1.6.7",
    "class-transformer": "0.5.1",
    "class-validator": "0.14.1",
    "cookie-parser": "1.4.6",
    "express": "4.18.3",
    "graphql": "16.8.1",
    "ioredis": "5.3.2",
    "multer": "1.4.5-lts.1",
    "nestjs-i18n": "10.4.5",
    "nestjs-pino": "4.0.0",
    "pg": "8.11.3",
    "pino-pretty": "10.3.1",
    "reflect-metadata": "0.2.1",
    "rxjs": "7.8.1",
    "tslib": "2.6.2",
    "typeorm": "0.3.20",
    "xml2js": "0.6.2"
  },
  "main": "main.js"
}

And this is our Dockerfile:

FROM docker.io/node:lts-alpine as deps
# Check https://github.com/nodejs/docker-node/tree/b4117f9333da4138b03a546ec926ef50a31506c3#nodealpine to understand why libc6-compat might be needed.
RUN apk add --no-cache libc6-compat
RUN npm i -g pnpm
WORKDIR /usr/src/app
COPY dist/apps/my-app/package*.json ./
COPY dist/apps/my-app/pnpm-lock.yaml ./
RUN pnpm install --frozen-lockfile --prod

# Production image, copy all the files and run nest
FROM docker.io/node:lts-alpine as runner
RUN apk add --no-cache dumb-init curl
ENV NODE_ENV production
ENV PORT 3000
WORKDIR /usr/src/app
COPY --from=deps /usr/src/app/node_modules ./node_modules
COPY --from=deps /usr/src/app/package.json ./package.json
COPY dist/apps/my-app .
RUN chown -R node:node .
USER node
EXPOSE 3000
CMD ["dumb-init", "node", "main.js"]

Does anyone have a solution to this?

@enlight3d

I have the same issue on a project where I build a backend with the NestJS framework. For me it's the tslib module that is missing, resulting in Docker images that are 100 MB instead of the usual 200 MB. The only fix I found is to remove the cache of each built app from my GitLab repo's container registry and then rebuild the app.
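
For reference, that cleanup can also be scripted against the GitLab API instead of going through the UI. A sketch (GITLAB_URL, GITLAB_TOKEN, PROJECT_ID, REPO_ID and TAG are placeholders to fill in):

# Repository IDs can be listed first via GET /projects/:id/registry/repositories.
# Delete the stale cache tag, then rebuild the app.
curl --request DELETE \
  --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
  "$GITLAB_URL/api/v4/projects/$PROJECT_ID/registry/repositories/$REPO_ID/tags/$TAG"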

@Lewis-software
Author

I have the same issue on a project where I build a backend with the NestJS framework. For me it's the tslib module that is missing, resulting in Docker images that are 100 MB instead of the usual 200 MB. The only fix I found is to remove the cache of each built app from my GitLab repo's container registry and then rebuild the app.

Thanks, I may try that as a temporary workaround. I hope they'll fix this eventually.

@gperdomor
Owner

Hi folks... This seems to be related to Kaniko and not to the plugin itself. The plugin basically builds the final command and arguments which are executed to build the image, but the whole build step is done by Docker, Podman or, in your case, Kaniko... In any case, can you provide a link to the repo if it's public, please?

Also, if you execute the same command (extracted from the GitLab logs) locally and the final image works, that's another confirmation that the problem is not the plugin.
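
For reference, a kaniko build can be reproduced locally with Docker using the official executor image, mounting the project as the build context (a sketch):

# Run the kaniko executor locally against the same Dockerfile;
# --no-push builds the image without needing registry credentials.
docker run --rm -v "$PWD":/workspace \
  gcr.io/kaniko-project/executor:latest \
  --context dir:///workspace \
  --dockerfile /workspace/Dockerfile \
  --no-push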

@enlight3d

enlight3d commented Apr 26, 2024

Hello @gperdomor, thanks for your reply. That's what I thought too. I was using the gperdomor/nx-kaniko:18.12.0-alpine image and now I'm trying gperdomor/nx-kaniko:20.12.2-alpine. Could you tell me which version of Kaniko those images are using?

Also, as an attempt to prevent caching issues, I tried following https://github.com/gperdomor/nx-tools/blob/main/packages/nx-container/docs/advanced/cache.md and implemented "cache-from" and "cache-to" for all the apps in my Nx project, but I don't see any changes in my GitLab container registry. Cache folders are still appName/cache instead of appName:buildcache.
For reference, here is the relevant part that I modified in each app's project.json:

"container": {
      "executor": "@nx-tools/nx-container:build",
      "dependsOn": ["build"],
      "options": {
        "engine": "docker",
        "metadata": {
          "images": ["$CI_REGISTRY/$CI_PROJECT_PATH/backend-features"],
          "load": true,
          "tags": [
            "type=schedule",
            "type=ref,event=branch",
            "type=ref,event=pr",
            "type=sha,prefix=sha-"
          ],
          "cache-from": [
            "type=registry,ref=$CI_REGISTRY/$CI_PROJECT_PATH/backend-features:buildcache"
          ],
          "cache-to": [
            "type=registry,ref=$CI_REGISTRY/$CI_PROJECT_PATH/backend-features:buildcache,mode=max"
          ]
        }
      }
    }

and here is my GitLab CI job:

build affected apps:
  stage: build
  image: gperdomor/nx-kaniko:20.12.2-alpine
  interruptible: true
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - .npm/
  variables:
    # Nx Container
    INPUT_PUSH: 'true' # To push your image to the registry
    INPUT_ENGINE: 'kaniko' # Override the engine to use for building the image
  before_script:
    - npm ci -f --cache .npm --prefer-offline
    - NX_HEAD=$CI_COMMIT_SHA
    - NX_BASE=${CI_MERGE_REQUEST_DIFF_BASE_SHA:-$CI_COMMIT_BEFORE_SHA}
    # Login to registry
    - echo "{\"auths\":{\"$CI_REGISTRY\":{\"auth\":\"$(echo -n ${CI_REGISTRY_USER}:${CI_REGISTRY_PASSWORD} | base64)\"}}}" > /kaniko/.docker/config.json
  script:
    - echo "Building apps..."
    - npx nx show projects --affected --with-target container --base=$NX_BASE --head=$NX_HEAD > apps.txt
    - npx nx affected --base=$NX_BASE --head=$NX_HEAD --target=container --parallel=1
  artifacts:
    paths:
      - apps.txt
  rules: # Only run on main branch but not tags
    - if: '$CI_COMMIT_BRANCH == "main" && $CI_COMMIT_TAG == null'

PS: I'm using the latest library versions:

    "@nx-tools/container-metadata": "^5.3.1",
    "@nx-tools/nx-container": "^5.3.1",

EDIT: OH WAIT! I put "cache-from" and "cache-to" inside metadata. I'm going to move them inside options and see if it fixes my issue!

EDIT 2: yup, I can now see in the logs that it's pushing to appName:buildcache.
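
For anyone landing here later, this is the layout that made the buildcache pushes appear, with "cache-from"/"cache-to" directly under options instead of under metadata (a sketch trimmed to the keys shown above):

"container": {
  "executor": "@nx-tools/nx-container:build",
  "dependsOn": ["build"],
  "options": {
    "engine": "docker",
    "cache-from": [
      "type=registry,ref=$CI_REGISTRY/$CI_PROJECT_PATH/backend-features:buildcache"
    ],
    "cache-to": [
      "type=registry,ref=$CI_REGISTRY/$CI_PROJECT_PATH/backend-features:buildcache,mode=max"
    ],
    "metadata": {
      "images": ["$CI_REGISTRY/$CI_PROJECT_PATH/backend-features"],
      "load": true,
      "tags": [
        "type=schedule",
        "type=ref,event=branch",
        "type=ref,event=pr",
        "type=sha,prefix=sha-"
      ]
    }
  }
}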

EDIT 3: but there are still some errors, see:

INFO[0066] Taking snapshot of files...                  
INFO[0066] Pushing layer type=registry,ref=registry.companyName.com/groupName/backend/backend-app/buildcache:9dbbf534e4cb2b3c653eb31ceed07b924d89357bc7db9013dc177c3bdacb8467 to cache now 
INFO[0066] Pushing image to type=registry,ref=registry.companyName.com/groupName/backend/backend-app/buildcache:9dbbf534e4cb2b3c653eb31ceed07b924d89357bc7db9013dc177c3bdacb8467 
INFO[0066] USER node                                    
INFO[0066] Cmd: USER                                    
INFO[0066] No files changed in this command, skipping snapshotting. 
INFO[0066] EXPOSE 3000                                  
INFO[0066] Cmd: EXPOSE                                  
INFO[0066] Adding exposed port: 3000/tcp                
INFO[0066] No files changed in this command, skipping snapshotting. 
INFO[0066] CMD ["dumb-init", "node", "main.js"]         
INFO[0066] No files changed in this command, skipping snapshotting. 
WARN[0066] Error uploading layer to cache: failed to push to destination type=registry,ref=registry.companyName.com/groupName/backend/backend-app/buildcache:23b6d6d66222c4e991ae8565aae3c4ea7cba20cf4bd727d3a9b8712484b8dbd2: Get "https://type=registry,ref=registry.companyName.com/v2/": dial tcp: lookup type=registry,ref=registry.companyName.com: no such host 
INFO[0066] Pushing image to registry.companyName.com/groupName/backend/backend-app:main 
INFO[0072] Pushed registry.companyName.com/groupName/backend/backend-app@sha256:069ef95d4d7cdafc7e19f754ae0e2226ccccfe1650062644a11b83cd1b099b19 
INFO[0072] Pushing image to registry.companyName.com/groupName/backend/backend-app:sha-d70ea95 
INFO[0072] Pushed registry.companyName.com/groupName/backend/backend-app@sha256:069ef95d4d7cdafc7e19f754ae0e2226ccccfe1650062644a11b83cd1b099b19 

If you take a look, there are errors pushing layers to the cache... any ideas? Is it because, when using kaniko, the cache-to and cache-from options don't work the same way as for the docker engine?
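
A guess from the warning text: the whole buildx-style string (type=registry,ref=...) seems to be handed to kaniko verbatim, and kaniko then tries to resolve type=registry,ref=registry.companyName.com as a hostname. Kaniko's native cache interface is just a bare repository passed via --cache-repo; a sketch of the equivalent flags (not the plugin's actual generated command):

# Kaniko only understands a plain repository for its layer cache,
# with no buildx-style type=/ref=/mode= syntax.
/kaniko/executor \
  --context dir:///workspace \
  --dockerfile /workspace/Dockerfile \
  --destination "$CI_REGISTRY/$CI_PROJECT_PATH/backend-features:main" \
  --cache=true \
  --cache-repo "$CI_REGISTRY/$CI_PROJECT_PATH/backend-features/buildcache"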
