`TargetGroup` sometimes does not attach to `ApplicationLoadBalancer` #1254

rpmccarter · 2024-04-03T23:26:57Z

What happened?

I was trying to create a single FargateService with two different TargetGroups attached to an ApplicationLoadBalancer (one tg for HTTP requests, one tg for socket connections). When deployed, one target group simply doesn't attach to the load balancer. What's even more concerning is that, when the exact same code is deployed to a second stack, it attaches just fine. I'm relatively new to Pulumi so there might be something I'm missing, but I assumed identical code should result in identical resources.

I understand this might not be reproducible. I mostly want to flag that I'm seeing inconsistency between environments, and hopefully get some answers on how this is possible.
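One way to narrow down an inconsistency like this is to compare what each stack actually recorded in its state. The sketch below is an illustration, not part of the original report: it assumes two state files produced by `pulumi stack export`, reads only the `deployment.resources[].type` field, and reports resource types whose counts differ. The sample data, placeholder URNs, and the stack name `other` are invented for the example.

```typescript
// Minimal slice of a `pulumi stack export` document.
// Only the fields read below are modelled; real exports contain much more.
interface ExportedResource {
  urn: string;
  type: string;
}

interface StackExport {
  deployment: { resources?: ExportedResource[] };
}

interface TypeCountDiff {
  type: string;
  left: number;
  right: number;
}

// Count how many resources of each type a stack export contains.
function countByType(stack: StackExport): Map<string, number> {
  const counts = new Map<string, number>();
  for (const res of stack.deployment.resources ?? []) {
    counts.set(res.type, (counts.get(res.type) ?? 0) + 1);
  }
  return counts;
}

// Report every resource type whose count differs between the two stacks.
function diffResourceCounts(left: StackExport, right: StackExport): TypeCountDiff[] {
  const l = countByType(left);
  const r = countByType(right);
  const allTypes = Array.from(l.keys()).concat(Array.from(r.keys()));
  const types = Array.from(new Set(allTypes)).sort();
  const diffs: TypeCountDiff[] = [];
  for (const type of types) {
    const lc = l.get(type) ?? 0;
    const rc = r.get(type) ?? 0;
    if (lc !== rc) {
      diffs.push({ type, left: lc, right: rc });
    }
  }
  return diffs;
}

// Invented sample data (placeholder URNs), NOT taken from the real stacks.
const devExport: StackExport = {
  deployment: {
    resources: [
      { urn: 'placeholder-urn-1', type: 'aws:lb/targetGroup:TargetGroup' },
      { urn: 'placeholder-urn-2', type: 'aws:lb/targetGroup:TargetGroup' },
      { urn: 'placeholder-urn-3', type: 'aws:lb/listener:Listener' },
    ],
  },
};

const otherExport: StackExport = {
  deployment: {
    resources: [
      { urn: 'placeholder-urn-4', type: 'aws:lb/targetGroup:TargetGroup' },
      { urn: 'placeholder-urn-5', type: 'aws:lb/targetGroup:TargetGroup' },
      { urn: 'placeholder-urn-6', type: 'aws:lb/listener:Listener' },
      { urn: 'placeholder-urn-7', type: 'aws:lb/listener:Listener' },
    ],
  },
};

const stackDiff = diffResourceCounts(devExport, otherExport);
```

In practice the two inputs would come from something like `pulumi stack export --stack dev --file dev.json`, parsed with `JSON.parse`. A difference in counts only shows where the stacks diverge, not why.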

Example

Unfortunately, this is part of our private infra so I won't be able to send the entire deploy script, but I'll try to send as much relevant info as possible. Here is the code for the target groups and load balancer:

const serverTg = new aws.lb.TargetGroup(`leaves-server-tg-${stack}`, {
  vpcId: defaultVpc.vpcId,
  stickiness: {
    type: 'lb_cookie',
  },
  port,
  protocol: 'HTTP',
  targetType: 'ip',
  protocolVersion: 'HTTP1',
  healthCheck: {
    path: '/api',
    port: 'traffic-port',
    protocol: 'HTTP',
    matcher: '200',
    enabled: true,
    interval: 60,
    timeout: 30,
  },
});

const socketTg = new aws.lb.TargetGroup(`leaves-socket-tg-${stack}`, {
  vpcId: defaultVpc.vpcId,
  port: 5001,
  protocol: 'HTTP',
  stickiness: {
    type: 'lb_cookie',
  },
  targetType: 'ip',
  protocolVersion: 'HTTP1',
  healthCheck: {
    path: '/api',
    port: `${port}`,
    protocol: 'HTTP',
    matcher: '200',
    enabled: true,
    interval: 60,
    timeout: 30,
  },
});

const lb = new awsx.lb.ApplicationLoadBalancer(`leaves-lb-${stack}`, {
  listeners: [
    {
      port: 443,
      protocol: 'HTTPS',
      certificateArn: lb_cert.arn,
      defaultActions: [
        {
          type: 'forward',
          targetGroupArn: serverTg.arn,
        },
      ],
    },
    {
      port: 8443,
      protocol: 'HTTPS',
      certificateArn: lb_cert.arn,
      defaultActions: [
        {
          type: 'forward',
          targetGroupArn: socketTg.arn,
        },
      ],
    },
  ],
});

And here's the code for the target service:

new awsx.ecs.FargateService(`leaves-server-service-${stack}`, {
  networkConfiguration: {
    assignPublicIp: true,
    securityGroups: [serviceSg.id],
    subnets: defaultVpc.publicSubnetIds,
  },
  cluster: cluster.arn,
  desiredCount: 4,
  taskDefinitionArgs: {
    taskRole: {
      roleArn: role.arn,
    },
    container: {
      name: 'server',
      image: image.imageUri,
      command: ['infisical', 'run', `--env=${stack}`, '--', 'yarn', 'server'],
      cpu: 2 * 1024,
      memory: 4 * 1024,
      environment: serverEnvironment,
      essential: true,
      portMappings: [
        {
          targetGroup: serverTg,
          containerPort: port,
        },
        {
          targetGroup: socketTg,
          containerPort: 5001,
        },
      ],
      healthCheck: {
        command: ['CMD-SHELL', `curl -f http://localhost:${port}/api/ || exit 1`],
        interval: 30,
        timeout: 5,
        retries: 3,
      },
    },
  },
});

Here are the target groups - the relevant ones are selected. Note that leaves-socket-tg-dev has no associated load balancer:
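For context, a target group shows up as associated with a load balancer once a listener (or listener rule) on that load balancer forwards to it. The sketch below is a hypothetical helper, not part of the original deploy script or of the awsx API: given a plain description of listeners and a list of target group identifiers, it returns the target groups that no listener default action forwards to. The interface shapes and sample values are assumptions made for illustration, and it deliberately ignores non-default listener rules.

```typescript
// Hypothetical, simplified shapes for illustration only;
// these are not the awsx or AWS SDK types.
interface ActionSpec {
  type: string;
  targetGroupArn?: string;
}

interface ListenerSpec {
  port: number;
  defaultActions: ActionSpec[];
}

// Return the target groups that no listener default action forwards to.
// Non-default listener rules are out of scope for this sketch.
function findUnattachedTargetGroups(
  listeners: ListenerSpec[],
  targetGroupArns: string[],
): string[] {
  const referenced = new Set<string>();
  for (const listener of listeners) {
    for (const action of listener.defaultActions) {
      if (action.type === 'forward' && action.targetGroupArn !== undefined) {
        referenced.add(action.targetGroupArn);
      }
    }
  }
  return targetGroupArns.filter((arn) => !referenced.has(arn));
}

// Invented sample: only the 443 listener exists, so 'socket-tg' is unattached.
const sampleListeners: ListenerSpec[] = [
  {
    port: 443,
    defaultActions: [{ type: 'forward', targetGroupArn: 'server-tg' }],
  },
];

const unattached = findUnattachedTargetGroups(sampleListeners, ['server-tg', 'socket-tg']);
```

Running a check like this against the listeners that actually exist in each environment would show whether the 8443 listener was created at all, or was created without a forward action to the socket target group.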

Output of `pulumi about`

CLI          
Version      3.112.0
Go Version   go1.22.1
Go Compiler  gc

Plugins
NAME        VERSION
aws         6.28.2
awsx        2.5.0
cloudflare  5.22.0
docker      4.5.3
docker      3.6.1
nodejs      unknown
tls         5.0.1

Host     
OS       darwin
Version  14.4
Arch     arm64

This project is written in nodejs: executable='/Users/rpmccarter/.nvm/versions/node/v20.10.0/bin/node' version='v20.10.0'

Current Stack: Mintlify/leaves/dev

TYPE                                                      URN
[removed]

Found no pending operations associated with dev

Backend        
Name           pulumi.com
URL            https://app.pulumi.com/Mintlify
User           Mintlify
Organizations  Mintlify
Token type     personal

Dependencies:
NAME                VERSION
@pulumi/aws         6.28.2
@pulumi/awsx        2.5.0
@pulumi/cloudflare  5.22.0
@pulumi/pulumi      3.109.0
@pulumi/tls         5.0.1
@types/node         16.18.22
rimraf              5.0.5
typescript          5.3.3

Pulumi locates its logs in /var/folders/dn/z0by0dcj1gnbkjr6_t71hp_m0000gn/T/ by default

Additional context

No response

Contributing

Vote on this issue by adding a 👍 reaction.
To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).


t0yv0 · 2024-04-05T13:13:07Z

Thanks for reporting this, @rpmccarter; this sounds pretty concerning. To clarify, does the failed state happen sporadically or every single time? Are there no errors reported? Does the condition still not resolve after some time (say, 5 minutes later)?

It will be difficult for our team to diagnose, so anything that narrows down the repro would be super helpful. If anyone else is running into this, please also let us know what you are observing.

mjeffryes · 2024-05-20T23:18:25Z

Any further context you can offer to help us reproduce this, @rpmccarter?

rpmccarter · 2024-05-22T04:19:30Z

Hey team, I'm fairly confident this is just a symptom of #1253. I'm just now running into a very similar issue where a Cloudflare Record fails to be created because of a missing field, `lb.loadBalancer.dnsName`. Closing this as a duplicate.

rpmccarter added the kind/bug (Some behavior is incorrect or out of spec) and needs-triage (Needs attention from the triage team) labels on Apr 3, 2024

t0yv0 added the impact/reliability (Something that feels unreliable or flaky) label and removed the needs-triage (Needs attention from the triage team) label on Apr 5, 2024

mikhailshilkov added the awaiting-feedback label on Apr 17, 2024

rpmccarter closed this as not planned (Won't fix, can't repro, duplicate, stale) on May 22, 2024
