Projects that have a broken tailscale connection for more than 5 minutes can not reestablish it #522

Closed
Tpuljak opened this issue May 10, 2024 · 3 comments · Fixed by #575

@Tpuljak
Member

Tpuljak commented May 10, 2024

Describe the bug
If a project's tailscale connection is broken for more than 5 minutes (e.g. the server is stopped for more than 5 minutes), the connection cannot be reestablished. This is caused by the ephemeral inactivity timeout configured on the headscale server, combined with the fact that projects use ephemeral network keys.

To Reproduce
Steps to reproduce the behavior:

  1. Run the Daytona Server
  2. Create a workspace and observe that you can use daytona ssh/code to connect to it.
  3. Stop the Daytona Server for more than 5 minutes, then start it again
  4. Observe that you cannot connect to the workspace you created in step 2.

Expected behavior
Projects should be able to reestablish a connection to the server no matter how long the connection was broken for.

Desktop (please complete the following information):

  • OS: any
  • Daytona Version: v0.14.0

Additional context
There are 3 possible solutions to the issue:

  1. Increase the ephemeral inactivity timeout to a longer duration.
  2. Allow projects to create non-ephemeral network keys. This would require the removal of created network keys once the workspace is removed.
  3. Introduce a retry policy in the agent that would attempt to regenerate an ephemeral network key if the tailscale connection is broken.

I favor approach 3 over 2, and would avoid 1, as it could lead to a pileup of stale nodes on the headscale server.

@Tpuljak Tpuljak added the bug Something isn't working label May 10, 2024
@vedranjukic
Member

Solution 2 seems very exact. It will require a bit more effort, but in the end, the workspace removal can require a cleanup anyway.

@Tpuljak
Member Author

Tpuljak commented May 13, 2024

> Solution 2 seems very exact. It will require a bit more effort, but in the end, the workspace removal can require a cleanup anyway.

Agreed. I would suggest that the user of the auth key becomes PROJECT_NAME-WORKSPACE_ID (currently it's daytona for all keys). This way we can list, and then wipe, all of that user's keys with:

client.ListPreAuthKeys(ctx, &v1.ListPreAuthKeysRequest{
  // Argument order matches the PROJECT_NAME-WORKSPACE_ID naming above.
  User: fmt.Sprintf("%s-%s", projectName, workspaceId),
})

@Tpuljak
Member Author

Tpuljak commented May 14, 2024

Upon further discussion, we will go with option 3, as it is cleaner and keeps the agent self-sufficient.
