Delvin

A program that tries to solve tasks from the SWE-bench Lite dataset using simple actions.

Achieves a 23% pass rate on SWE-bench Lite as judged by LLM eval, so take that result with a big grain of salt!

Many more details are in our blog post.

Requirements

Python 3.9+
git
An Opper API key

Run it

python main.py

How does it work?

Delvin is a simple agent that can view, edit, and search files. Every action goes through a reflection stage, and results from the environment (bad diffs, linting errors) are fed back to the model.
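
To make the loop concrete, here is a minimal sketch of a view/edit/search agent with a reflection gate and environment feedback. It is not Delvin's actual code and does not use the Opper SDK: `Action`, `Workspace`, `reflect`, `run_agent`, and `scripted_policy` are hypothetical names, the "model" is a fixed script, and the "linter" is just a Python syntax check.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Action:
    name: str       # "view", "edit", or "search"
    path: str = ""
    text: str = ""  # new file content for "edit", query string for "search"


@dataclass
class Workspace:
    """In-memory stand-in for a checked-out repository (an assumption)."""
    files: Dict[str, str] = field(default_factory=dict)

    def run(self, action: Action) -> str:
        if action.name == "view":
            return self.files.get(action.path, f"error: no such file {action.path}")
        if action.name == "search":
            hits = [p for p, body in self.files.items() if action.text in body]
            return "\n".join(hits) or "no matches"
        if action.name == "edit":
            try:
                # Cheap stand-in for linting: reject edits that do not parse.
                compile(action.text, action.path, "exec")
            except SyntaxError as exc:
                return f"lint error: {exc.msg} (line {exc.lineno})"
            self.files[action.path] = action.text
            return "edit applied"
        return f"error: unknown action {action.name}"


def reflect(action: Action, history: List[str]) -> bool:
    """Hypothetical reflection stage. A real agent would ask a model to
    review the proposed action; here we only check that it is known."""
    return action.name in {"view", "edit", "search"}


def run_agent(
    propose: Callable[[List[str]], Optional[Action]],
    workspace: Workspace,
    max_steps: int = 10,
) -> List[str]:
    history: List[str] = []
    for _ in range(max_steps):
        action = propose(history)  # the "model" sees all earlier feedback
        if action is None:
            break
        if not reflect(action, history):
            history.append(f"rejected: {action.name}")
            continue
        # Environment results, including lint errors, go back into history.
        history.append(workspace.run(action))
    return history


def scripted_policy(history: List[str]) -> Optional[Action]:
    """Fixed script standing in for the model: search, bad edit, good edit."""
    script = [
        Action("search", text="def add"),
        Action("edit", path="calc.py", text="def add(a, b:\n    return a + b\n"),
        Action("edit", path="calc.py", text="def add(a, b):\n    return a + b\n"),
    ]
    return script[len(history)] if len(history) < len(script) else None
```

Running `run_agent(scripted_policy, Workspace(files={"calc.py": "def add(a, b):\n    return a - b\n"}))` shows the feedback path: the malformed edit comes back as a lint error in the history, and the next step applies a corrected edit.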

We use the Opper SDK to easily build structured functions. No dirty parsing needed :)

Can I use this without using Opper?

No. This repo relies extensively on our SDK and backend to interface with models and get tracing. It's just a hack project that turned out to work surprisingly well!

Can I run this on my own GitHub issues?

No. To adapt this to your own issues, you will have to edit the main file to load issues from GitHub. PRs welcome :)
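
As a starting point, the sketch below maps a GitHub issue (the JSON object returned by the GitHub REST API) onto the fields a SWE-bench instance carries. `issue_to_task` is a hypothetical helper, and whether the main file consumes exactly these fields is an assumption; check it before wiring this in. The base commit is not part of the issue, so you have to supply it yourself.

```python
from typing import Any, Dict


def issue_to_task(issue: Dict[str, Any], base_commit: str) -> Dict[str, str]:
    """Build a SWE-bench-style task dict from a GitHub issue object."""
    # repository_url looks like https://api.github.com/repos/<owner>/<name>
    owner, name = issue["repository_url"].rstrip("/").split("/")[-2:]
    title = issue.get("title") or ""
    body = issue.get("body") or ""  # the API returns null for an empty body
    return {
        "instance_id": f"{owner}__{name}-{issue['number']}",
        "repo": f"{owner}/{name}",
        "base_commit": base_commit,  # commit to check out before the agent runs
        "problem_statement": f"{title}\n\n{body}".strip(),
    }
```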
