Timeout while waiting for state to become 'tfSTABLE' (last state: 'tfPENDING', timeout: 20m0s) #1112
Comments
Following the links, I found this prior art: pulumi/terraform-provider-aws#59. One possibility here is to raise the default timeouts again.
pulumi/terraform-provider-aws#59 has some prior art on editing default timeouts. Perhaps we could increase the values found in https://github.com/hashicorp/terraform-provider-aws/blob/master/internal/service/ecs/service.go#L50
I'm leaving this in the tracker to accumulate upvotes. If it does, we can circle back to pulumi-aws and increase the default timeouts by patching upstream. For the moment, issues with flaky tests and examples in this repository can be resolved by applying the custom timeout transformation suggested by @danielrbradley.
Looking into this further by checking the AWS console, I realized that this isn't really a timeout issue. The container fails to come up because of a configuration issue, and the provider gives up waiting after 20m, so it looks like a timeout. In this case it was a CloudWatch issue, but presumably there could be other causes.
We should look into detecting such issues and notifying the user promptly and correctly.
I'm currently experiencing this issue.
If I understand correctly (which I may not, I'm still learning a bunch), it seems like the awsx implementation of the Fargate service needs to update how it handles logConfigurations and creates log groups when no logConfiguration is provided.
I ran into this as well. Deployments kept timing out, but when I retried immediately they would complete successfully almost instantly, yet the service was unavailable. I spent a good deal of time thinking it was some network config issue, but it turns out the whole thing was caused by the task failing to start because of the missing log group issue. I think 3 things could be improved here:
Frankly, I would prioritize 1 and 2, since they really gave me a sense of "spooky action", making it difficult to reason about how Pulumi works with AWS and eventually making me consider that there was something wrong with Pulumi.
What happened?
I'm receiving this error a lot when trying to test examples locally:
This timeout happens when trying to record example baseline behavior, say for ecs/nodejs/ on AWS 5.42.0 and AWSX 1.x.x, but also when running examples on the latest versions of the dependencies.
I have seen this affect the aws:ecs/service:Service through FargateService and other component resource wrappers.
For users affected by this issue, the current workaround per @danielrbradley is to apply a transformation that increases the custom timeout for the ECS service; see #1118 for a fully worked-out example.
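As a rough illustration of the shape of that workaround (this is not the example from #1118), the sketch below builds a transformation callback that raises the create and update timeouts only for resources of type aws:ecs/service:Service. The interfaces are local stand-ins that mirror the arguments Pulumi passes to a transformation, so the snippet runs on its own; the helper name extendEcsServiceTimeout and the "40m" value are hypothetical, chosen for illustration.

```typescript
// Minimal local stand-ins for the shapes Pulumi passes to a transformation
// callback. These are assumptions for illustration; in a real program, use
// the types exported by @pulumi/pulumi instead.
interface CustomTimeouts {
    create?: string;
    update?: string;
    delete?: string;
}

interface ResourceOptionsLike {
    customTimeouts?: CustomTimeouts;
    [key: string]: unknown;
}

interface TransformationArgsLike {
    type: string;
    name: string;
    props: Record<string, unknown>;
    opts: ResourceOptionsLike;
}

interface TransformationResultLike {
    props: Record<string, unknown>;
    opts: ResourceOptionsLike;
}

// Type token of the ECS service resource named in this issue.
const ECS_SERVICE_TYPE = "aws:ecs/service:Service";

// Hypothetical helper: returns a transformation that raises the create and
// update timeouts on ECS services and leaves every other resource untouched.
function extendEcsServiceTimeout(timeout: string) {
    return (args: TransformationArgsLike): TransformationResultLike | undefined => {
        if (args.type !== ECS_SERVICE_TYPE) {
            return undefined; // undefined means "no change" for this resource
        }
        return {
            props: args.props,
            opts: {
                ...args.opts,
                // Keep any existing timeouts (e.g. delete) and override the rest.
                customTimeouts: {
                    ...args.opts.customTimeouts,
                    create: timeout,
                    update: timeout,
                },
            },
        };
    };
}
```

In a Pulumi program, a callback like this would typically be passed in the transformations list of the component's resource options (for example on FargateService), so that it applies to the child aws:ecs/service:Service. Note that a longer timeout only helps when the service is genuinely slow to stabilize; it does not fix a task that never starts.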
Please upvote this issue if this affects your workflow, and we can consider increasing default timeouts in the AWS provider.
Example
N/A
Output of pulumi about
Additional context
No response
Contributing
Vote on this issue by adding a 👍 reaction.
To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).