Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update TESTING.md #694

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Update TESTING.md #694

wants to merge 3 commits into from

Conversation

aramaraju
Copy link

@aramaraju aramaraju commented Feb 16, 2024

Added Guardrails test conditions for JB client

Test plan

Documentation change -reading through and review.

Added Guardrails test conditions for JB client
TESTING.md Outdated

### Guardrails:

- [ ] Check if Guardrails check passes for any code longer than 10 lines generated in Cody Chat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thank for adding test scenario for Guardrails but it looks a bit incomplete to me.
It does not explain what Guardrails check is and how to check if it passed.
Ideally it should be in the form:

  1. Do to trigger some action
  2. Check if results looks like that:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to put in a detailed description of actions, and tangible expectation. Let me know if that works OK.

TESTING.md Outdated Show resolved Hide resolved
@aramaraju
Copy link
Author

@pkukielka Does this loom help explain the feature and I can accommodate that into the test plans:

https://www.loom.com/share/077cf0f2b6174c5dbd2825653f6ed4cd?sid=6f76b12b-afd6-4181-802d-2eaa0c35b005

@pkukielka
Copy link
Contributor

@pkukielka Does this loom help explain the feature and I can accommodate that into the test plans:

https://www.loom.com/share/077cf0f2b6174c5dbd2825653f6ed4cd?sid=6f76b12b-afd6-4181-802d-2eaa0c35b005

Unfortunately I do not have access to this video.
But I'm sure any context added will be helpful.

@aramaraju
Copy link
Author

@pkukielka sharing the video through GDrive

@pkukielka
Copy link
Contributor

@pkukielka Does this loom help explain the feature and I can accommodate that into the test plans:

https://www.loom.com/share/077cf0f2b6174c5dbd2825653f6ed4cd?sid=6f76b12b-afd6-4181-802d-2eaa0c35b005

Thanks, that looks fine.
I found the easiest way to trigger the check is to simply ask chat this question:

Give me snipped to the javascript opensource code of your choice.

For me it always returns a bit of code which fails the guardrails.

@taylorsperry
Copy link
Contributor

@aramaraju are you still working on updates to this PR and sourcegraph/cody#3204?

@aramaraju
Copy link
Author

@taylorsperry looks like @pkukielka was able to find a test to verify this. I'm good with that to check for testing for the beta version of it

@taylorsperry
Copy link
Contributor

@aramaraju I think it would be good to have this as part of our TESTING.md doc too. Automated tests are obviously great, but I want to make sure we have QA eyes on all of our features too. And this is helpful resource for cross-functional teams (docs, marketing, etc) to understand the features the clients support.

@dominiccooney
Copy link
Contributor

+1 to @pkukielka 's comments about making the test plan literal with specific actions and expected outputs. It's not going to scale to have a 4 minute Loom for every feature.

Copy link
Contributor

@dominiccooney dominiccooney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting this back in @aramaraju 's queue for @pkukielka 's feedback.

@aramaraju
Copy link
Author

aramaraju commented Mar 29, 2024

@cbart Can you suggest a few test cases that you used with the JBs team? You know better the OSS code that's indexed. Thank you :)

@pkukielka
Copy link
Contributor

@aramaraju @cbart this is hanging unmerged for 2 months now.
Could you please reword the tests steps (I already give one working example earlier) so we can free you from this one? :)

@cbart
Copy link
Contributor

cbart commented Apr 16, 2024

Apologies for not responding more promptly here. I did not see the notifications. I'll work through this today.

@cbart
Copy link
Contributor

cbart commented Apr 16, 2024

Done now. The test only goes through success scenario (guardrails check passes).

Testing a guardrails-failed scenario for a generated snippet proves to be extremely hard in the regular use case. Even when giving Cody exact instructions to generate - really rewrite a given snippet it always seems to adjust indentation or make some tiny changes here and there - and attribution check is an exact match.

The way we've been doing this so far was to actually put a snippet in Cody chat ourselves (guardrails runs there too!) and then we have full control over how the snippet looks like, but in both editors that's not possible right now (link).

@cbart cbart enabled auto-merge (squash) April 16, 2024 09:45
Copy link
Contributor

@pkukielka pkukielka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants