Delvin

A program that tries to solve tasks from the SWE-bench Lite dataset using simple actions.

Achieves a 23% pass rate on SWE-bench Lite as judged by LLM eval, so take that result with a big grain of salt!

There are a lot more details in our blog post.

Requirements

  • Python 3.9+
  • git
  • An Opper API key

Run it

python main.py

How does it work?

Architecture

Delvin is a simple agent that can view, edit, and search files. Every action goes through a reflection stage, and results from the environment (bad diffs, linting errors) are fed back to the model.
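To make the loop above concrete, here is a minimal sketch of a propose, reflect, act cycle. It is not Delvin's actual code: `Action`, `Workspace`, `run_agent`, and the `propose`/`reflect` callables are hypothetical names, the model is replaced by plain Python functions, and an in-memory dict stands in for the repository. Error strings play the role of the bad diffs and linting errors mentioned above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Action:
    """Hypothetical action record; not the shape Delvin uses."""
    name: str       # "view", "edit", or "search"
    path: str = ""
    text: str = ""  # search query, or the new file content for "edit"


@dataclass
class Workspace:
    """In-memory stand-in for a checked-out repository."""
    files: Dict[str, str] = field(default_factory=dict)

    def apply(self, action: Action) -> str:
        """Run one action and return the observation fed back to the model."""
        if action.name == "view":
            return self.files.get(action.path, f"error: no such file {action.path}")
        if action.name == "search":
            hits = [p for p, body in self.files.items() if action.text in body]
            return "\n".join(hits) or "no matches"
        if action.name == "edit":
            if action.path not in self.files:
                return f"error: no such file {action.path}"
            self.files[action.path] = action.text
            return "ok"
        return f"error: unknown action {action.name}"


def run_agent(
    propose: Callable[[List[str]], Optional[Action]],
    reflect: Callable[[List[str], Action], str],
    workspace: Workspace,
    max_steps: int = 10,
) -> List[str]:
    """Propose an action, reflect on it, apply it, and record the result."""
    history: List[str] = []
    for _ in range(max_steps):
        action = propose(history)
        if action is None:  # the proposer signals that it is done
            break
        verdict = reflect(history, action)
        if verdict != "ok":
            # A rejected action is never applied; the reason goes back as feedback.
            history.append(f"reflection rejected {action.name}: {verdict}")
            continue
        history.append(f"{action.name} -> {workspace.apply(action)}")
    return history
```

In a real agent, `propose` and `reflect` would be model calls that receive `history` as context; here they are left as parameters so the loop can be exercised with scripted functions.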

We use the Opper SDK to easily build structured functions. No dirty parsing needed :)

Can I use this without using Opper?

No. This repo relies extensively on our SDK and backend to interface with models and get tracing. It's just a hack project that turned out to work surprisingly well!

Can I run this on my own GitHub issues?

No. If you want to adapt this to your own issues, you will have to edit the main file to load issues from GitHub. PRs welcome :)
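As a starting point for such an adaptation, here is a rough sketch that pulls open issues from the GitHub REST API and reshapes them into task records. The field names `instance_id`, `repo`, and `problem_statement` follow the SWE-bench dataset; whether `main.py` expects exactly these fields is an assumption, and `fetch_open_issues` and `issues_to_tasks` are hypothetical helpers that are not part of this repo.

```python
import json
import urllib.request
from typing import Dict, List


def fetch_open_issues(owner: str, repo: str) -> List[dict]:
    """Fetch the first page of open issues (unauthenticated, so rate-limited)."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues?state=open"
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def issues_to_tasks(repo_full_name: str, issues: List[dict]) -> List[Dict[str, str]]:
    """Convert GitHub issue objects into SWE-bench-style task dicts (assumed shape)."""
    tasks = []
    for issue in issues:
        # The issues endpoint also returns pull requests; they carry a "pull_request" key.
        if "pull_request" in issue:
            continue
        title = issue.get("title") or ""
        body = issue.get("body") or ""  # body is null for issues without a description
        tasks.append({
            "instance_id": f"{repo_full_name.replace('/', '__')}-{issue['number']}",
            "repo": repo_full_name,
            "problem_statement": f"{title}\n\n{body}".strip(),
        })
    return tasks
```

SWE-bench tasks also pin a `base_commit` to check out before editing. An issue does not carry one, so you would need to pick it yourself, for example the head of the default branch.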