Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feedback/improvements for dss #44

Open
DnPlas opened this issue Mar 8, 2024 · 4 comments
Open

Feedback/improvements for dss #44

DnPlas opened this issue Mar 8, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@DnPlas
Copy link
Contributor

DnPlas commented Mar 8, 2024

Bug Description

There are a couple of user experience limitations that are preventing us from using dss correctly. There are also small errors in the code. Here are a couple I have found while reviewing some PRs:

  1. Timeout - I am working on a small multipass machine with network limitations and the dss initialize --kubeconfig $KUBECONFIG command just times out with a message that is not very helpful message:
2024-03-08 09:17:36 [ERROR] [initialize] [initialize]: Timeout waiting for deployment 'mlflow-deployment' in namespace 'dss' to be ready. Deleting resources...

For enhancing the user experience it would be ideal to have:

  • Options to change the timeout
  • A better logging to understand what's wrong
  1. manifest_templates are not installed in /usr/local/lib/python3.10/dist-packages/dss/ which causes some of the dss commands to fail while looking for those package data files. I have identified some missing items in the setup.cfg file:
  • [options.package_data] should point to the files in manifest_templates/ instead of just manifest.yaml
  • Potentially we'll need to include MANIFESTS.in
  1. There seems to be a typo here. It should be 60(?).

  2. The logs in the initialise command is confusing:

2024-03-08 09:33:49 [INFO] [utils] [wait_for_deployment_ready]: Waiting for deployment mlflow in namespace dss to be ready...
2024-03-08 09:33:49 [ERROR] [initialize] [initialize]: Timeout waiting for deployment 'mlflow-deployment' in namespace 'dss' to be ready. Deleting resources...

We are waiting for the mlflow deployment to be ready, but then we refer to the mlflow-deployment. I see two problems with this:

  • We are using two names to refer to the same thing
  • The initialise command is talking about mlflow, which seems out of context considering that we are running dss initialize .... I'd expect better messaging for this, like: initializing dss -> deploying mlflow -> waiting for mlflow -> etc.
  1. Repository doesn't have a README.md, CONTRIBUTING.md.
  2. We could include a message in dss that tells users when any of the dependencies are missing or prerequisites are not met
  • pip packages
  • microk8s addons(?) <--- this could be tricky because the k8s node could be anything

To reproduce

  1. Install dss from source and run commands in the description
KUBECONFIG=~/.kube/config
pip install .
  1. Follow each step in the bug description
@DnPlas DnPlas added the bug Something isn't working label Mar 8, 2024
Copy link

Thank you for reporting us your feedback!

The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5436.

This message was autogenerated

@DnPlas DnPlas changed the title Feedback for dss Feedback/improvements for dss Mar 8, 2024
@ca-scribner
Copy link
Contributor

ca-scribner commented Mar 8, 2024

I'm +1 on pretty much everything here. Adding some notes:

Timeout

agreed that dss initialize has an arbitrary timeout. would a --timeout arg for all the commands work for you? I'm not sure what the best timeout behaviour is either - should the command just stop? should it stop and delete everything?

Logging

Agreed that the logging is inconsistent. My preference would be document anything we dislike, but not to bother changing it until most of the commands are implemented and tested. Reason being that we're going to have to go through at the end anyway and make sure it all feels consistent. wdyt?

manifest template installs

Yep that's a bug. To test it atm before this gets fixed, it should work if you do an editable install (pip install -e .)

@ca-scribner
Copy link
Contributor

I tried sorting out the manifest stuff and I don't know what I'm doing wrong, but I can't get it to include anything in the dist. If anyone else knows how to make it work, that would be helpful

@DnPlas
Copy link
Contributor Author

DnPlas commented Mar 13, 2024

Another piece of feedback: every time a command timesout, ALL the resources in the dss namespace are removed. I ran into it after initialising and trying to create a notebook. The latter timed out and all my resources were deleted. This behaviour is far from ideal, and we should avoid it at all costs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants