
Cache tests #124

Open
slavingia opened this issue Dec 20, 2024 · 6 comments

Comments

@slavingia
Contributor

slavingia commented Dec 20, 2024

A way to make tests faster.

Via snapshot images? Via URLs?

Bounty: $1,000 for each PR that improves the speed of a Shortest test.

@bojl

bojl commented Dec 24, 2024

One way of speeding up tests is to maintain a cache that maps the test input (prompt, site) to the steps the AI decides to take (mouse move, click, type, button press, etc.). That way, during the next identical test run you can just have Playwright execute the steps without calling the LLM until the test is complete (to verify that the test was successful). You might also be able to store Jest verification steps along the way (like validating that divs exist) to avoid the need to call the LLM at all.

This is useful for testing regressions, but obviously not as useful when the page you're testing has changed since the last test run.

Another thing you can do is allow the LLM to suggest multiple actions in a single response, rather than one action per prompt.
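A minimal sketch of the cache described above, keyed on the test input (prompt + site) and mapping to the recorded action sequence. The `ActionCache` name and the action shapes are illustrative assumptions, not Shortest's actual API:

```typescript
// Hypothetical action types an LLM-driven test might record.
type CachedAction =
  | { kind: "click"; selector: string }
  | { kind: "type"; selector: string; text: string }
  | { kind: "press"; key: string };

// Cache keyed on everything that determines the LLM's decisions:
// the prompt and the site under test.
class ActionCache {
  private store = new Map<string, CachedAction[]>();

  private key(prompt: string, site: string): string {
    return JSON.stringify([prompt, site]);
  }

  get(prompt: string, site: string): CachedAction[] | undefined {
    return this.store.get(this.key(prompt, site));
  }

  set(prompt: string, site: string, actions: CachedAction[]): void {
    this.store.set(this.key(prompt, site), actions);
  }
}
```

On a cache hit, the recorded actions would be handed straight to Playwright; on a miss (or after a failed replay), the LLM-driven path runs and repopulates the entry.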

@bojl

bojl commented Dec 24, 2024

Ah, I've been beaten to the punch, at least on the first suggestion: #134

@slavingia
Contributor Author

Yep, that's probably the right approach. And rerun a step (or the whole test) upon failure, in case of a page change.

@gladyshcodes

@slavingia Hi there! Here is how I see caching being implemented; what do you think?

Essentially, we could keep track of every step made, along with the LLM responses, in a cache. I believe it's better to do it per test rather than per test file. On the initial run, each step will be stored in the cache. On subsequent test runs, we would replay those instructions. If any of them fail, or the UI has changed, we will fall back to normal test execution and call the LLM, rebuilding the cache on the fly. I am not sure about the performance gain and effectiveness of such an approach yet; I just wanted to gather some feedback and thoughts :)
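The replay-with-fallback flow described above could be sketched roughly as follows. `replayStep` and `runWithLlm` are hypothetical stand-ins for Playwright replay of a cached step and the normal LLM-driven path; neither name comes from the project:

```typescript
type Step = string; // a recorded instruction, e.g. a serialized Playwright action

// Try to replay cached steps; on any failure (e.g. the UI changed),
// fall back to normal LLM-driven execution and return the fresh steps
// so the caller can rebuild the cache entry.
async function runTest(
  cached: Step[] | undefined,
  replayStep: (s: Step) => Promise<boolean>,
  runWithLlm: () => Promise<Step[]>,
): Promise<Step[]> {
  if (cached) {
    let allOk = true;
    for (const step of cached) {
      if (!(await replayStep(step))) {
        allOk = false; // a step failed: the page likely changed
        break;
      }
    }
    if (allOk) return cached; // fast path: no LLM calls needed
  }
  // Cache miss or failed replay: run normally and rebuild the cache.
  return runWithLlm();
}
```

The key property is that the LLM is only consulted on the slow path, so an unchanged page costs zero LLM calls after the first run.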

[Screenshot attachment: 2024-12-26 at 15 42 35]

@slavingia
Contributor Author

Overall flow makes sense! Should make most tests very fast, with computer use only needed in the case of failure, where certain things changed.

@gladyshcodes

gladyshcodes commented Dec 26, 2024

I will make a PR implementing this functionality soon 👍
