Interactive automated browsing? (GPT-4V autopilot, simple scripting, etc) #131

Closed
walking-octopus opened this issue Dec 14, 2023 · 2 comments

Comments

@walking-octopus

I wonder whether a mechanism for keeping a browser session alive, with the CLI sending commands to it, could help automate simple actions or even hack together entire agents.

Say I want to write a script that auto-orders N of item X. Or maybe I open a session, start a chat with GPT-4, and ask it a question. It would then use the headless browser to take a screenshot, annotate the clickable parts with IDs, and send the screenshot back; GPT-4 chooses an interaction, and the loop repeats until it has the information it needs to continue the chat with the user.

This may be a little beyond the scope of the project, but I think it could be neat.

@nmstoker

Wouldn't this be better done directly with Playwright, which underlies shot-scraper?
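For illustration, here is a minimal sketch of what driving a long-lived session directly with Playwright might look like, assuming Playwright's Python sync API. The four-word command language (`goto`, `click`, `fill`, `screenshot`) and the helper names `parse_command`, `apply_command`, and `run_session` are hypothetical, invented for this example; none of them are part of shot-scraper or Playwright.

```python
import shlex


def parse_command(line):
    """Split a line such as 'fill "#qty" 3' into (action, args)."""
    parts = shlex.split(line)
    if not parts:
        raise ValueError("empty command")
    return parts[0], parts[1:]


def apply_command(page, line):
    """Run one text command against a Playwright Page (or a look-alike)."""
    action, args = parse_command(line)
    if action == "goto" and len(args) == 1:
        page.goto(args[0])
    elif action == "click" and len(args) == 1:
        page.click(args[0])
    elif action == "fill" and len(args) == 2:
        page.fill(args[0], args[1])
    elif action == "screenshot" and len(args) == 1:
        page.screenshot(path=args[0])
    else:
        raise ValueError(f"unsupported command: {line!r}")


def run_session(lines):
    """Keep a single browser page alive while executing commands in order."""
    # Imported lazily so the helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        try:
            for line in lines:
                apply_command(page, line)
        finally:
            browser.close()
```

Something like `run_session(['goto https://example.com', 'screenshot out.png'])` would then reuse one page for every step. An agent loop could feed each screenshot to a model and append the command it picks, though that part is left out of this sketch.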

@walking-octopus
Author

Yes, revisiting this issue, I do see it is most definitely not in the spirit of doing one thing well... I'll close it now.


2 participants