[Feat] A competitive Web Browsing agent #1856

Merged
merged 18 commits into from
May 21, 2024

Conversation

@frankxu2004 frankxu2004 commented May 17, 2024

This PR aims to enable a competitive browsing agent for #1470.

I ported the simplified demo agent used in WebArena into our agent hub.

To test it, run the command below; it works best with GPT-4 LLMs such as GPT-4o.

poetry run python ./opendevin/core/main.py -i 5 -t "tell me the usa's president using google search" -c BrowsingAgent -m gpt-4o-2024-05-13

@frankxu2004 frankxu2004 marked this pull request as ready for review May 20, 2024 21:57

frankxu2004 commented May 20, 2024

Example logs:

17:52:07 - opendevin:INFO: browsing_agent.py:128 - Last action failed:
click('235')
Try again with the current state of the page.


# Current Accessibility Tree:
RootWebArea 'Google', focused
        [20] navigation ''
                [22] link 'About'
                [23] link 'Store'
                [31] link 'Gmail'
                [33] link 'Search for Images'
                [38] button 'Google apps', expanded=False
                        [39] image ''
                [40] link 'Sign in'
                [a] IframePresentational ''
        [48] image 'Google'
        [78] search ''
                [88] image ''
                [92] combobox 'Search' value='current president of the USA', focused, autocomplete='both', hasPopup='listbox', expanded=True, controls='Alh6id'
                [98] button 'Clear'
                        [100] image ''
                [103] button 'Search by voice'
                        [104] image ''
                [106] button 'Search by image'
                        [107] image ''
                [127] listbox '', multiselectable=False, orientation='vertical'
                        [141] option 'current president of the usa', selected=False
                        [141] option 'who is the president of the usa', selected=False
                        [141] option 'who is the president of the usa now', selected=False
                        [141] option 'who is the president of the usa 2024', selected=False
                        [141] option 'who is the president of the usa 2023', selected=False
                        [141] option 'president of the senate us', selected=False
                        [141] option 'who is the president of the usa 2020', selected=False
                        [141] option 'who is the president of the usa 2021', selected=False
                        [141] option 'who is the president of the usa today', selected=False
                        [141] option 'who is the president of the usa during ww1', selected=False
                [226] button 'Google Search'
                [227] button "I'm Feeling Lucky"
                [230] button 'Report inappropriate predictions'
                [235] button 'Google Search'
                [236] button "I'm Feeling Lucky"
        [271] contentinfo ''
                [275] link 'Advertising'
                [276] link 'Business'
                [277] link 'How Search works'
                [279] link 'Our third decade of climate action: join us'
                        [280] image ''
                [283] link 'Privacy'
                [284] link 'Terms'
                [289] button 'Settings', hasPopup='menu', expanded=False
                        generic '', hasPopup='menu'

# Previous Actions
goto('https://www.google.com')
fill('92', 'current president of the USA')
click('235')
click('235')


Here is an example with chain of thought of a valid action when clicking on a button:
"
In order to accomplish my goal I need to click on the button with bid 12
```click("12")```
"

17:52:07 - opendevin:INFO: browsing_agent.py:129 - In order to accomplish my goal, I need to click on the button with bid 226 to perform the Google search.
```click('226')```
17:52:07 - ACTION
BrowseInteractiveAction(browser_actions="click('226')", thought='In order to accomplish my goal, I need to click on the button with bid 226 to perform the Google search.', action='browse_interactive')


==============
STEP 4

17:52:13 - opendevin:INFO: browsing_agent.py:141 - Cost: 0.02 USD | Accumulated Cost: 0.05 USD
17:52:13 - opendevin:INFO: browsing_agent.py:128 - 

# Current Accessibility Tree:
RootWebArea 'current president of the USA - Google Search', focused
        [15] heading 'Accessibility Links'
        [18] link 'Skip to main content'
        [19] link 'Turn off continuous scrolling'
        [27] link 'Accessibility help'
        [31] link 'Accessibility feedback'
        [35] search ''
                [39] link 'Google'
                        [40] image 'Google'
                [56] combobox 'Search' value='current president of the USA', autocomplete='both', hasPopup='listbox', expanded=False, controls='Alh6id'
                [59] button 'Clear'
                        [61] image ''
                [63] button 'Search by voice'
                        [64] image ''
                [65] button 'Search by image'
                        [66] image ''
                [67] button 'Search'
                        [70] image ''
        [248] button 'Settings'
                [250] image ''
        [253] banner ''
                [256] button 'Google apps', expanded=False
                        [257] image ''
                [259] link 'Sign in'
        [280] navigation ''
                [283] navigation ''
                        [287] heading 'Filters and Topics'
                        [291] list ''
                                [292] listitem ''
                                        StaticText 'All'
                                [295] listitem ''
                                        [296] link 'Images'
                                [298] listitem ''
                                        [299] link 'News'
                                [301] listitem ''
                                        [302] link 'Videos'
                                [304] listitem ''
                                        [305] link 'Shopping'
                                [307] listitem ''
                                        [309] button 'More', hasPopup='menu', expanded=False
                                                [312] image ''
                        [336] button 'Tools', expanded=False, controls='hdtbMenus'
                [340] list ''
                        [346] button 'SafeSearch', hasPopup='menu', expanded=False
                                [353] image ''
        [554] main ''
                [558] heading 'Search Results'
                [575] heading 'United States/President'
                [583] heading 'Joe Biden'
                        [585] link 'Joe Biden'
                [597] button 'Credit: Getty Images/The White House'
                        [600] image 'Credit: Getty Images/The White House'
                StaticText 'The 46th and current president of the United States is Joseph R. Biden, Jr. He was sworn into office on January 20, 2021.'
                StaticText 'Dec 6, 2023'
                [618] link 'Presidents, vice presidents, and first ladies | USAGov USA.gov https://www.usa.gov › ... › U.S. facts and figures'
                        [620] heading 'Presidents, vice presidents, and first ladies | USAGov'
                [649] button 'About this result'
                        [652] image ''
                [656] heading 'People also search for'
                        [657] link 'People also search for'
                [660] link 'Benjamin Netanyahu (Trending)'
                        [666] image ''
                [670] link 'Donald Trump'
                [676] link 'Katie Britt (Trending)'
                        [682] image ''
                [686] link 'Jill Biden'
                [692] link 'Barack Obama'
                [698] link 'Neilia Hunter Biden'
                [704] link 'Kamala Harris'
                [725] button 'Feedback'
                StaticText 'Sources include:'
                [730] link 'Ballotpedia'
                StaticText ','
                [731] link 'Wikipedia'
                StaticText '.'
                [732] link 'Learn more'
                [753] heading 'People also ask'
                [760] button 'About this result'
                        [763] image ''
                [771] button 'Who is next in line for president of us?', expanded=False, controls='_CMZLZpjVDMvXseMP7LSn8Ao_44'
                [830] button 'Who is the new president of United States?', expanded=False, controls='_CMZLZpjVDMvXseMP7LSn8Ao_34'
                [889] button 'Who is the number 1 US president?', expanded=False, controls='_CMZLZpjVDMvXseMP7LSn8Ao_42'
                [948] button 'What number president is Joe Biden?', expanded=False, controls='_CMZLZpjVDMvXseMP7LSn8Ao_43'
                [1029] button 'Feedback'
                [1051] link 'President of the United States Wikipedia https://en.wikipedia.org › wiki › President_of_the_Unit...'
                        [1053] heading 'President of the United States'
                [1082] button 'About this result'
                        [1085] image ''
                [1089] emphasis ''
                        StaticText 'Joe Biden'
                StaticText 'is the 46th and current president of the United States, having assumed office on January 20, 2021.'
                StaticText '\u200e'
                [1094] link 'List'
                StaticText '· \u200e'
                [1095] link 'Powers'
                [1096] link 'Executive Office of the'
                [1097] link 'Vice President'
                [1105] link 'Joe Biden: The President The White House (.gov) https://www.whitehouse.gov › administration › presiden...'
                        [1107] heading 'Joe Biden: The President'
                [1136] button 'About this result'
                        [1139] image ''
                StaticText 'As President,'
                [1143] emphasis ''
                        StaticText 'Biden'
                StaticText "will restore America's leadership and build our communities back better. Joseph Robinette Biden, Jr. was born in Scranton, Pennsylvania, the\xa0..."
                [1153] link 'President Joe Biden (@potus) • Instagram photos and videos Instagram\xa0·\xa0potus 19.2M+ followers'
                        [1155] heading 'President Joe Biden (@potus) • Instagram photos and videos'
                [1182] button 'About this result'
                        [1185] image ''
                [1189] emphasis ''
                        StaticText '46th'
                StaticText 'President of the United States, husband to @flotus, proud dad and pop. Finishing the job for all Americans. Text me: (302) 404-0880 ... Photo by President\xa0...'
                [1199] link 'President Joe Biden Facebook\xa0·\xa0President Joe Biden 11.9M+ followers'
                        [1201] heading 'President Joe Biden'
                [1228] button 'About this result'
                        [1231] image ''
                [1235] emphasis ''
                        StaticText 'President Joe Biden'
                StaticText '. 10M likes · 72129 talking about this. 46th President of the United States, husband to @FLOTUS, proud father and pop. Text me (302)...'
                [1245] link 'The Executive Branch The White House (.gov) https://www.whitehouse.gov › ... › Our Government'
                        [1247] heading 'The Executive Branch'
                [1276] button 'About this result'
                        [1279] image ''
                [1283] emphasis ''
                        StaticText 'President'
                StaticText 'is both the head of state and head of government of the'
                [1284] emphasis ''
                        StaticText 'United States of America'
                StaticText ', and Commander-in-Chief of the armed forces. Under Article II of\xa0...'
                [1294] link 'President of the United States United States Mission to the United Nations (.gov) https://usun.usmission.gov › Our Leaders'
                        [1296] heading 'President of the United States'
                [1325] button 'About this result'
                        [1328] image ''
                [1332] emphasis ''
                        StaticText 'Joseph R. Biden'
                StaticText '. President Biden represented Delaware for 36 years in the U.S. Senate before becoming the 47th Vice President of the United States.'
                [1342] link 'President of the United States Ballotpedia https://ballotpedia.org › President_of_the_United_States'
                        [1344] heading 'President of the United States'
                [1373] button 'About this result'
                        [1376] image ''
                StaticText 'The current president is'
                [1380] emphasis ''
                        StaticText 'Joe Biden (D'
                StaticText '). Election ... The executive Power shall be vested in a President of the United States of America. ... The President, Vice\xa0...'
                [1391] link 'Joe Biden Wikipedia https://en.wikipedia.org › wiki › Joe_Biden'
                        [1393] heading 'Joe Biden'
                [1422] button 'About this result'
                        [1425] image ''
                [1429] emphasis ''
                        StaticText 'Joseph Robinette Biden Jr'
                StaticText 'is an American politician who is the 46th and current president of the United States since 2021. A member of the Democratic Party,\xa0...'
                StaticText '\u200e'
                [1434] link 'Political positions'
                StaticText '· \u200e'
                [1435] link 'Electoral history'
                [1436] link '2008 Presidential Campaign'
                [1437] link 'Jill Biden'
                [1445] link 'Images'
                [1452] button 'About this result'
                        [1455] image ''
                [1462] button 'Joe Biden: The President | The White House'
                        [1465] image 'Joe Biden: The President | The White House'
                [1468] link 'Joe Biden: The President | The White House The White House'
                        [1473] image ''
                [1483] button 'About this result'
                        [1486] image ''
                [1488] button 'President of the USA | Current Leader'
                        [1491] image 'President of the USA | Current Leader'
                [1494] link 'President of the USA | Current Leader PlanetRulers'
                        [1499] image ''
                [1509] button 'About this result'
                        [1512] image ''
                [1514] button 'Joe Biden: The President | The White House'
                        [1517] image 'Joe Biden: The President | The White House'
                [1520] link 'Joe Biden: The President | The White House The White House'
                        [1525] image ''
                [1535] button 'About this result'
                        [1538] image ''
                [1708] button 'Feedback'
                [1720] button '6 more images'
                        [1726] image ''
        [1771] heading 'Related searches'
        [1778] button 'About this result'
                [1781] image ''
        [1786] link 'who is the 46th president'
        [1792] link 'who is the vice president of the united states'
        [1799] link 'who is the prime minister of usa'
        [1805] link 'all presidents in order'
        [1811] link 'first president of usa'
        [1816] link '5 requirements to be president'
        [1821] link 'joe biden'
        [1826] link 'presidential line of succession today'
        generic '', hidden=True
        generic '', hidden=True
        generic '', owns='rhs'
                [1868] complementary ''
                        generic '', hidden=True
                        generic '', hidden=True
                        [1872] heading 'Complementary Results'
                        [1891] link 'Joe Biden'
                                [1892] heading 'Joe Biden'
                        [1895] heading '46th U.S. President'
                        [1900] button 'More options', hasPopup='menu', expanded=False
                                [1901] image 'More options'
                                        [1902] image ''
                        [1992] link ''
                                [1994] image ''
                        [2009] link 'whitehouse.gov'
                                [2011] image ''
                        [2018] heading 'Description'
                        StaticText 'Joseph Robinette Biden Jr. is an American politician who is the 46th and current president of the United States since 2021.'
                        [2023] link 'Wikipedia'
                        StaticText 'Born'
                        StaticText ':'
                        StaticText 'November 20, 1942 (age 81\xa0years),'
                        [2033] link 'Scranton, PA'
                        StaticText 'Edited works'
                        StaticText ':'
                        [2042] link 'Halting the Spread of HIV/AIDS: Future Efforts in the U. S. Bilateral and Multilateral Response: Congressional Hearings'
                        StaticText ','
                        [2045] link 'MORE'
                        StaticText 'Organizations founded'
                        StaticText ':'
                        [2055] link 'United States Department of Defense China Task Force'
                        StaticText ','
                        [2058] link 'MORE'
                        StaticText 'Grandchildren'
                        StaticText ':'
                        [2068] link 'Navy Joan Roberts'
                        StaticText ','
                        [2069] link 'Natalie Biden'
                        [2070] link 'Maisy Biden'
                        [2071] link 'Robert Biden II'
                        StaticText ','
                        [2072] link 'Naomi Biden'
                        [2073] link 'Finnegan Biden'
                        StaticText 'Grandparents'
                        StaticText ':'
                        [2082] link 'Ambrose J. Finnegan'
                        StaticText ','
                        [2083] link 'Mary Elizabeth Robinette Biden'
                        [2084] link 'Joseph H. Biden'
                        [2085] link 'Geraldine C. Blewitt'
                        StaticText 'Great-grandparents'
                        StaticText ':'
                        [2094] link 'George Hamilton Robinette'
                        StaticText ','
                        [2097] link 'MORE'
                        StaticText 'Marriage location'
                        StaticText ':'
                        [2107] link 'New York, NY'
                        StaticText 'Sources include:'
                        [2111] link 'Ballotpedia'
                        [2112] link 'Wikipedia'
                        StaticText '.'
                        [2113] link 'Learn more'
                        [2119] heading 'Profiles'
                        [2125] link 'Instagram'
                                [2127] image ''
                        [2132] link 'X (Twitter)'
                                [2134] image ''
                        [2139] link 'Facebook'
                                [2141] image ''
                        [2146] link 'YouTube'
                                [2148] image ''
                        [2159] link 'More about Joe Biden'
                        [2166] button 'Feedback'
                        generic '', hidden=True
                        generic '', hidden=True
        [1843] progressbar 'Loading...', live='polite', relevant='additions text', valuemin=0, valuemax=100, valuetext=''
        [1847] heading 'Page Navigation'
        [1848] button 'More results'
        [1856] button '', live='polite', relevant='additions text'
        [1864] navigation ''
        generic '', live='polite', relevant='additions text'
        generic '', live='polite', relevant='additions text'
        generic '', live='polite', relevant='additions text'
        generic '', live='polite', relevant='additions text'

# Previous Actions
goto('https://www.google.com')
fill('92', 'current president of the USA')
click('235')
click('235')
click('226')


Here is an example with chain of thought of a valid action when clicking on a button:
"
In order to accomplish my goal I need to click on the button with bid 12
```click("12")```
"

17:52:13 - opendevin:INFO: browsing_agent.py:129 - In order to accomplish my goal of telling you the current president of the USA, I need to send a message with the relevant information found in the search results.

```send_msg_to_user('The current president of the USA is Joe Biden.')```
17:52:13 - ACTION
MessageAction(content='The current president of the USA is Joe Biden.', wait_for_response=False, action='message')
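
The logs above show each model reply as a free-text thought followed by a browser action wrapped in triple backticks, which then appears as the thought and browser_actions fields of the logged action. A minimal sketch of how such a reply could be split is given below. The helper name parse_response and its fallback behavior are assumptions made for illustration; this is not the code in browsing_agent.py.

```python
import re

FENCE = "`" * 3  # the triple-backtick delimiter seen in the logs

# Non-greedy match of whatever sits between the first pair of fences.
ACTION_RE = re.compile(re.escape(FENCE) + r"(.*?)" + re.escape(FENCE), re.DOTALL)


def parse_response(response: str) -> tuple[str, str]:
    """Split a model reply into (thought, action).

    Hypothetical helper: text before the first fence is treated as the
    thought, and the fenced text is the browser action. If no fenced
    action is present, the action is returned as an empty string.
    """
    match = ACTION_RE.search(response)
    if match is None:
        return response.strip(), ""
    thought = response[: match.start()].strip()
    action = match.group(1).strip()
    return thought, action


reply = (
    "In order to accomplish my goal, I need to click on the button "
    "with bid 226 to perform the Google search.\n"
    + FENCE + "click('226')" + FENCE
)
thought, action = parse_response(reply)
print(action)  # prints: click('226')
```

Returning an empty action when no fence is found leaves it to the caller to decide whether to retry or to treat the reply as a plain message.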

@xingyaoww xingyaoww left a comment

LGTM! This mainly adds a browsing agent to the agent hub and tweaks the browser env a little. I think we can approve it to unblock the integration of BrowserGym.

EDIT: I also locally tested and confirmed the sample command works on my end!

PS: When we figure out a way to do task decomposition, CodeAct can eventually delegate complex web browsing tasks to this BrowsingAgent!

@yufansong yufansong left a comment

I left some nits. Mostly LGTM. I would appreciate it if you could add more comments or briefly explain your design and some of the parameter settings, so that other people can build on your code. I don't want to block our integration progress, so I'll approve it. I can help with follow-up refactoring or nits if you have no time.

agenthub/browsing_agent/README.md (outdated, resolved)
agenthub/browsing_agent/browsing_agent.py (outdated, resolved)
agenthub/browsing_agent/browsing_agent.py (outdated, resolved)
agenthub/browsing_agent/prompt.py (outdated, resolved)
agenthub/browsing_agent/prompt.py (outdated, resolved)
agenthub/browsing_agent/prompt.py (resolved)

frankxu2004 commented May 21, 2024

Thanks, @yufansong! I added some comments for the things that were not clear. Hope it's good for now. Since I changed BrowserOutputObservation a bit, some of the integration tests are failing; would you mind taking a look at how to fix those?

EDIT: NVM, just fixed those, should be ready to go

@yufansong yufansong enabled auto-merge (squash) May 21, 2024 18:54
@yufansong yufansong disabled auto-merge May 21, 2024 19:02
@yufansong yufansong enabled auto-merge (squash) May 21, 2024 19:03
@yufansong yufansong merged commit 1fe290a into OpenDevin:main May 21, 2024
23 checks passed
@frankxu2004 frankxu2004 deleted the browsing-agent branch May 21, 2024 20:03

li-boxuan commented May 22, 2024

Sad, our project's test coverage dropped by 5.87%... Let me see if there's anything we could do to test this.

@li-boxuan

I've made some progress in creating an integration test for this agent! Will create a PR in a day.

)


class SystemPrompt(PromptElement):
Collaborator

@frankxu2004 this prompt (along with many other prompts in this file) seems unused. Is that intentional?

Collaborator Author

Yes, basically this whole prompt.py file is not currently used. The current agent is a simplified version for ease of understanding. However, I included the file with the intention of incorporating a more complex agent that uses more comprehensive information as a next step. It's still useful here because it provides others with building blocks for prompts and shows what information could be included as context for LLMs.

Collaborator Author

These PRs are mostly for chasing the NeurIPS paper deadline, so not all features are implemented yet.

Collaborator

I see, sounds fair. I'm just having a bit of trouble reproducing poetry run python ./opendevin/core/main.py -i 5 -t "tell me the usa's president using google search" -c BrowsingAgent -m gpt-4o-2024-05-13... I tried about 5 times and only succeeded once.

Collaborator Author

That's a bit weird. What error are you seeing? Do you have logs?

Collaborator Author

Currently the agent does not return AgentFinishAction, so from the framework's point of view the run always ends in an error. Maybe I should add this finish action.
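
One way to read the comment above: once the agent has sent its answer to the user, the next step could return a finish action so the run stops cleanly instead of ending in an error. The sketch below uses stand-in dataclasses named after the actions that appear in the logs and in this thread. Their fields and the next_action helper are assumptions for illustration only; they are not OpenDevin's real classes and not necessarily the change that was made later.

```python
from dataclasses import dataclass
from typing import Union


# Stand-in action types; the real OpenDevin classes may have different fields.
@dataclass
class BrowseInteractiveAction:
    browser_actions: str
    thought: str = ""


@dataclass
class MessageAction:
    content: str
    wait_for_response: bool = False


@dataclass
class AgentFinishAction:
    pass


Action = Union[BrowseInteractiveAction, MessageAction, AgentFinishAction]


def next_action(previous: list[Action], proposed: Action) -> Action:
    """Return a finish action once the last step already messaged the user.

    Hypothetical policy: if the most recent action was a message to the
    user, the task is treated as done; otherwise the proposed action is
    passed through unchanged.
    """
    if previous and isinstance(previous[-1], MessageAction):
        return AgentFinishAction()
    return proposed
```

With this policy, the run in the example logs would end one step after the message naming the president was sent, rather than continuing until the iteration limit.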

Collaborator

logs.zip

Basically, it keeps clicking without making progress.

Collaborator Author

Yeah, sometimes it's like this. I improved the agent a bit and fixed some issues here #1993
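
The failure mode described in this thread (the same click repeated without progress, as with the two click('235') entries under "Previous Actions" in the example logs) can be spotted with a simple check over the action history. This is an illustrative sketch only; is_stuck and its window parameter are hypothetical and are not claimed to be the fix in #1993.

```python
def is_stuck(previous_actions: list[str], window: int = 3) -> bool:
    """Return True if the last `window` actions are all identical."""
    if window < 2 or len(previous_actions) < window:
        return False
    recent = previous_actions[-window:]
    return all(action == recent[0] for action in recent)


# Action history copied from the "Previous Actions" section of the logs.
history = [
    "goto('https://www.google.com')",
    "fill('92', 'current president of the USA')",
    "click('235')",
    "click('235')",
]
print(is_stuck(history, window=2))  # True: the last two actions repeat
```

A caller could use a positive result to add a hint to the next prompt or to stop early; which of these works better is left open here.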

super-dainiu pushed a commit to super-dainiu/OpenDevin that referenced this pull request May 23, 2024
* initial attempt at a browsing only agent

* add browsing agent

* update

* implement agent

* update

* fix comments

* remove unnecessary things from memory extras

* update image processing

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
li-boxuan added a commit that referenced this pull request Jun 5, 2024
* add ml-bench w/o exec env

* fix typos (#1956)

no functional change

* Refactored Logs (#1939)

* [Feat] A competitive Web Browsing agent (#1856)

* initial attempt at a browsing only agent

* add browsing agent

* update

* implement agent

* update

* fix comments

* remove unnecessary things from memory extras

* update image processing

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update README.md SWE-bench score (#1959)

* Update README.md SWE-bench score

Our most recent results on swe-bench lite are 25%, so this updates the README accordingly.

* Update

* fix: llm is_local function logic error (#1961)

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* doc: update documentation about poetry update (#1962)

* add doc

* Update Development.md

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* feat: add metrics related to cost for better observability (#1944)

* add metrics for total_cost

* make lint

* refact codeact

* change metrics into llm

* add costs list, add into state

* refactor log completion

* refactor and test others

* make lint

* Update opendevin/core/metrics.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/llm/llm.py

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor

* add code

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* doc: add more cmd in unit test documentation (#1963)

* --- (#1975)

updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1976)

updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Logging security (#1943)

* update .gitignore

* Rename the confusing 'INFO' style to 'DETAIL'

* override str and repr

* feat: api_key desensitize

* feat: add SensitiveDataFilter in file handler

* tweak regex, add tests

* more tweaks, include other attrs

* add env vars, those with equivalent config

* fix tests

* tests are invaluable

---------

Co-authored-by: Shimada666 <649940882@qq.com>

* --- (#1967)

updated-dependencies:
- dependency-name: react-dom
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: "@types/react-dom"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1968)

updated-dependencies:
- dependency-name: "@reduxjs/toolkit"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1969)

updated-dependencies:
- dependency-name: husky
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1970)

updated-dependencies:
- dependency-name: tailwind-merge
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1971)

updated-dependencies:
- dependency-name: i18next
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Refactor session management (#1810)

* refactor session mgmt

* defer file handling to runtime

* add todo

* refactor sessions a bit more

* remove messages logic from FE

* fix up socket handshake

* refactor frontend auth a bit

* first pass at redoing file explorer

* implement directory suffix

* fix up file tree

* close agent on websocket close

* remove session saving

* move file refresh

* remove getWorkspace

* plumb path/code differently

* fix build issues

* fix the tests

* fix npm build

* add session rehydration

* fix event serialization

* logspam

* fix user message rehydration

* add get_event fn

* agent state restoration

* change history tracking for codeact

* fix responsiveness of init

* fix lint

* lint

* delint

* fix prop

* update tests

* logspam

* lint

* fix test

* revert codeact

* change fileService to use API

* fix up session loading

* delint

* delint

* fix integration tests

* revert test

* fix up access to options endpoints

* fix initial files load

* delint

* fix file initialization

* fix mock server

* fix lint

* fix auth for html

* Update frontend/src/i18n/translation.json

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor sessions and sockets

* avoid reinitializing the same session

* fix reconnect issue

* change up intro message

* more guards on reinit

* rename agent_session

* delint

* fix a bunch of tests

* delint

* fix last test

* remove code editor context

* fix build

* fix any

* fix dot notation

* Update frontend/src/services/api.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix up error handling

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update frontend/src/services/session.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix build errs

* fix else

* add closed state

* delint

* Update opendevin/server/session/session.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* fix #1960 (#1964)

* Add ruff for shared mutable defaults (B) (#1938)

* Add ruff for shared mutable defaults (B)

* Apply B006, B008 on current files, except fast API

* Update agenthub/SWE_agent/prompts.py

Co-authored-by: Graham Neubig <neubig@gmail.com>

* fix unintended behavior change

* this is correct, tell Ruff to leave it alone

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Refactor integration testing CI, add optional Mac tests, and mark a few agents as deprecated (#1888)

* Add MacOS to integration tests

* Switch back to python 3.11

* Install Docker for macos pipeline

* regenerate.sh: Use environmental variable for sandbox type

* Pack different agents' tests into a single check

* Fix CodeAct tests

* Reduce file match and extensive debug logs

* Add TEST_IN_CI mode that reports codecov

* Small fix: don't quit if reusing old responses failed

* Merge codecov results

* Fix typos

* Remove coverage merge step - codecov automatically does that

* Make mac integration tests as optional - too slow

* Fix codecov args

* Add comments in yaml

* Include sandbox type in codecov report name

* Fix codecov report merge

* Revert renaming of test_matrix_success

* Remove SWEAgent and PlannerAgent from tests

* Mark planner agent and SWE agent as deprecated

* CodeCov: Ignore planner and sweagent

* Revert "Remove SWEAgent and PlannerAgent from tests"

This reverts commit 040cb3b.

* Remove all tests for SWE Agent

* Only keep basic tests for MonologueAgent and PlannerAgent

* Mark SWE Agent as deprecated, and ignore code coverage for it

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Fix Repeated Responses in Chat by Adding IPythonRunCellObservation (#1987)

Co-authored-by: jianghongwei <jianghongwei@58.com>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* Save CI cycles for backend tests (#1985)

* Fix typo in prompt (#1992)

* Refactor monologue and SWE agent to use the messages in state history (#1863)

* Refactor monologue to use the messages in state history

* add messages, clean up

* fix monologue

* update integration tests

* move private method

* update SWE agent to use the history from State

* integration tests for SWE agent

* rename monologue to initial_thoughts, since that is what it is

* fix: catch session file not existed exception when init EventStream(maybe creating a new session with no session files stored). (#1994)

* add ml-bench in readme

* Bump boto3 from 1.34.110 to 1.34.111 (#2001)

Bumps [boto3](https://github.com/boto/boto3) from 1.34.110 to 1.34.111.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](boto/boto3@1.34.110...1.34.111)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump docker from 7.0.0 to 7.1.0 (#2002)

Bumps [docker](https://github.com/docker/docker-py) from 7.0.0 to 7.1.0.
- [Release notes](https://github.com/docker/docker-py/releases)
- [Commits](docker/docker-py@7.0.0...7.1.0)

---
updated-dependencies:
- dependency-name: docker
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump litellm from 1.37.20 to 1.38.0 (#2005)

Bumps [litellm](https://github.com/BerriAI/litellm) from 1.37.20 to 1.38.0.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](BerriAI/litellm@v1.37.20...v1.38.0)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix SWE-Bench evaluation due to setuptools version (#1995)

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055.

* bump version

* fix session state after resuming (#1999)

* fix state resuming

* fix session reconnection

* fix lint

* Implement `agentskills` for OpenDevin to helpfully improve edit AND including more useful tools/skills (#1941)

* add draft for skills

* Implement and test agentskills functions: open_file, goto_line, scroll_down, scroll_up, create_file, search_dir, search_file, find_file

* Remove new_sample.txt file

* add some work from opendevin w/ fixes

* Add unit tests for agentskills module

* fix some issues and updated tests

* add more tests for open

* tweak and handle goto_line

* add tests for some edge cases

* add tests for scrolling

* add tests for edit

* add tests for search_dir

* update tests to use pytest

* use pytest --forked to avoid file op unit tests to interfere with each other via global var

* update doc based on swe agent tool

* update and add tests for find_file and search_file

* move agent_skills to plugins

* add agentskills as plugin and docs

* add agentskill to ssh box and fix sandbox integration

* remove extra returns in doc

* add agentskills to initial tool for jupyter

* support re-init jupyter kernel (for agentskills) after restart

* fix print window's issue with indentation and add testcases

* add prompt for codeact with the newest edit primitives

* modify the way line number is presented (remove leading space)

* change prompt to the newest display format

* support tracking of costs via metrics

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* implement and add tests for py linting

* remove extra text arg for incompatible subprocess ver

* remove sample.txt

* update test_edits integration tests

* fix all integration

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/runtime/plugins/agent_skills/agentskills.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055.

* bump version

* remove _AGENT_SKILLS_DOCS

* move flake8 to test dep

* update poetry.lock

* remove extra arg

* reduce max iter for eval

* update poetry

* fix integration tests

---------

Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* build: Add poetry command to use Python 3.11 for environment setup (#1972)

* Bump @react-types/shared from 3.23.0 to 3.23.1 in /frontend (#2006)

Bumps [@react-types/shared](https://github.com/adobe/react-spectrum) from 3.23.0 to 3.23.1.
- [Release notes](https://github.com/adobe/react-spectrum/releases)
- [Commits](https://github.com/adobe/react-spectrum/compare/@react-types/shared@3.23.0...@react-types/shared@3.23.1)

---
updated-dependencies:
- dependency-name: "@react-types/shared"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump @types/react-syntax-highlighter in /frontend (#2007)

Bumps [@types/react-syntax-highlighter](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-syntax-highlighter) from 15.5.11 to 15.5.13.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-syntax-highlighter)

---
updated-dependencies:
- dependency-name: "@types/react-syntax-highlighter"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump @typescript-eslint/parser from 7.9.0 to 7.10.0 in /frontend (#2008)

Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.9.0 to 7.10.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.10.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump lint-staged from 15.2.2 to 15.2.4 in /frontend (#2009)

Bumps [lint-staged](https://github.com/okonet/lint-staged) from 15.2.2 to 15.2.4.
- [Release notes](https://github.com/okonet/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](lint-staged/lint-staged@v15.2.2...v15.2.4)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update README.md

* Update README.md

* add run_infer.sh

* fix input output

* fix docker sandbox

* fix run

* update and clean run_infer.py

* add script to clean up dockers

* update repo uid

* add description

* new

* Update README.md

* use root for sandbox

* update readme

* update ml-bench conda env

* update readme

* update readme

* use try except

* modify raise exception

* add int

* update README

* longer time

* fix existing issues

* fix existing issue

* new docker image

* add metrics of cost

* add result parsing cost

* fix

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-31-157.ec2.internal>
Co-authored-by: RainRat <rainrat78@yahoo.ca>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Frank Xu <frankxu2004@gmail.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Shimada666 <649940882@qq.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Rahul Anand <62982824+zeul22@users.noreply.github.com>
Co-authored-by: jiangleo <jiangleo@users.noreply.github.com>
Co-authored-by: jianghongwei <jianghongwei@58.com>
Co-authored-by: Jeremi Joslin <jeremi@newlogic.com>
Co-authored-by: Aaron Xia <zhhuaxia@gmail.com>
Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: DaxServer <7479937+DaxServer@users.noreply.github.com>
Co-authored-by: Robert <871607149@qq.com>

li-boxuan added a commit that referenced this pull request Jun 6, 2024

* add ml-bench w/o exec env

* fix typos (#1956)

no functional change

* Refactored Logs (#1939)

* [Feat] A competitive Web Browsing agent (#1856)

* initial attempt at a browsing only agent

* add browsing agent

* update

* implement agent

* update

* fix comments

* remove unnecessary things from memory extras

* update image processing

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Update README.md SWE-bench score (#1959)

* Update README.md SWE-bench score

Our most recent results on swe-bench lite are 25%, so this updates the README accordingly.

* Update

* fix: llm is_local function logic error (#1961)

Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* doc: update documentation about poetry update (#1962)

* add doc

* Update Development.md

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* feat: add metrics related to cost for better observability (#1944)

* add metrics for total_cost

* make lint

* refact codeact

* change metrics into llm

* add costs list, add into state

* refactor log completion

* refactor and test others

* make lint

* Update opendevin/core/metrics.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/llm/llm.py

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor

* add code

---------

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* doc: add more cmd in unit test documentation (#1963)

* --- (#1975)

updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1976)

updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Logging security (#1943)

* update .gitignore

* Rename the confusing 'INFO' style to 'DETAIL'

* override str and repr

* feat: api_key desensitize

* feat: add SensitiveDataFilter in file handler

* tweak regex, add tests

* more tweaks, include other attrs

* add env vars, those with equivalent config

* fix tests

* tests are invaluable

---------

Co-authored-by: Shimada666 <649940882@qq.com>

* --- (#1967)

updated-dependencies:
- dependency-name: react-dom
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: "@types/react-dom"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1968)

updated-dependencies:
- dependency-name: "@reduxjs/toolkit"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1969)

updated-dependencies:
- dependency-name: husky
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1970)

updated-dependencies:
- dependency-name: tailwind-merge
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* --- (#1971)

updated-dependencies:
- dependency-name: i18next
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

* Refactor session management (#1810)

* refactor session mgmt

* defer file handling to runtime

* add todo

* refactor sessions a bit more

* remove messages logic from FE

* fix up socket handshake

* refactor frontend auth a bit

* first pass at redoing file explorer

* implement directory suffix

* fix up file tree

* close agent on websocket close

* remove session saving

* move file refresh

* remove getWorkspace

* plumb path/code differently

* fix build issues

* fix the tests

* fix npm build

* add session rehydration

* fix event serialization

* logspam

* fix user message rehydration

* add get_event fn

* agent state restoration

* change history tracking for codeact

* fix responsiveness of init

* fix lint

* lint

* delint

* fix prop

* update tests

* logspam

* lint

* fix test

* revert codeact

* change fileService to use API

* fix up session loading

* delint

* delint

* fix integration tests

* revert test

* fix up access to options endpoints

* fix initial files load

* delint

* fix file initialization

* fix mock server

* fix lint

* fix auth for html

* Update frontend/src/i18n/translation.json

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>

* refactor sessions and sockets

* avoid reinitializing the same session

* fix reconnect issue

* change up intro message

* more guards on reinit

* rename agent_session

* delint

* fix a bunch of tests

* delint

* fix last test

* remove code editor context

* fix build

* fix any

* fix dot notation

* Update frontend/src/services/api.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix up error handling

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/server/session/agent.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update frontend/src/services/session.ts

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* fix build errs

* fix else

* add closed state

* delint

* Update opendevin/server/session/session.py

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

---------

Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* fix #1960 (#1964)

* Add ruff for shared mutable defaults (B) (#1938)

* Add ruff for shared mutable defaults (B)

* Apply B006, B008 on current files, except fast API

* Update agenthub/SWE_agent/prompts.py

Co-authored-by: Graham Neubig <neubig@gmail.com>

* fix unintended behavior change

* this is correct, tell Ruff to leave it alone

---------

Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Refactor integration testing CI, add optional Mac tests, and mark a few agents as deprecated (#1888)

* Add MacOS to integration tests

* Switch back to python 3.11

* Install Docker for macos pipeline

* regenerate.sh: Use environmental variable for sandbox type

* Pack different agents' tests into a single check

* Fix CodeAct tests

* Reduce file match and extensive debug logs

* Add TEST_IN_CI mode that reports codecov

* Small fix: don't quit if reusing old responses failed

* Merge codecov results

* Fix typos

* Remove coverage merge step - codecov automatically does that

* Make mac integration tests as optional - too slow

* Fix codecov args

* Add comments in yaml

* Include sandbox type in codecov report name

* Fix codecov report merge

* Revert renaming of test_matrix_success

* Remove SWEAgent and PlannerAgent from tests

* Mark planner agent and SWE agent as deprecated

* CodeCov: Ignore planner and sweagent

* Revert "Remove SWEAgent and PlannerAgent from tests"

This reverts commit 040cb3b.

* Remove all tests for SWE Agent

* Only keep basic tests for MonologueAgent and PlannerAgent

* Mark SWE Agent as deprecated, and ignore code coverage for it

---------

Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>

* Fix Repeated Responses in Chat by Adding IPythonRunCellObservation (#1987)

Co-authored-by: jianghongwei <jianghongwei@58.com>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>

* Save CI cycles for backend tests (#1985)

* Fix typo in prompt (#1992)

* Refactor monologue and SWE agent to use the messages in state history (#1863)

* Refactor monologue to use the messages in state history

* add messages, clean up

* fix monologue

* update integration tests

* move private method

* update SWE agent to use the history from State

* integration tests for SWE agent

* rename monologue to initial_thoughts, since that is what it is

* fix: catch session file not existed exception when init EventStream(maybe creating a new session with no session files stored). (#1994)

* add ml-bench in readme

* Bump boto3 from 1.34.110 to 1.34.111 (#2001)

Bumps [boto3](https://github.com/boto/boto3) from 1.34.110 to 1.34.111.
- [Release notes](https://github.com/boto/boto3/releases)
- [Changelog](https://github.com/boto/boto3/blob/develop/CHANGELOG.rst)
- [Commits](boto/boto3@1.34.110...1.34.111)

---
updated-dependencies:
- dependency-name: boto3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump docker from 7.0.0 to 7.1.0 (#2002)

Bumps [docker](https://github.com/docker/docker-py) from 7.0.0 to 7.1.0.
- [Release notes](https://github.com/docker/docker-py/releases)
- [Commits](docker/docker-py@7.0.0...7.1.0)

---
updated-dependencies:
- dependency-name: docker
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump litellm from 1.37.20 to 1.38.0 (#2005)

Bumps [litellm](https://github.com/BerriAI/litellm) from 1.37.20 to 1.38.0.
- [Release notes](https://github.com/BerriAI/litellm/releases)
- [Commits](BerriAI/litellm@v1.37.20...v1.38.0)

---
updated-dependencies:
- dependency-name: litellm
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix SWE-Bench evaluation due to setuptools version (#1995)

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055.

* bump version

* fix session state after resuming (#1999)

* fix state resuming

* fix session reconnection

* fix lint

* Implement `agentskills` for OpenDevin to helpfully improve edit AND including more useful tools/skills (#1941)

* add draft for skills

* Implement and test agentskills functions: open_file, goto_line, scroll_down, scroll_up, create_file, search_dir, search_file, find_file

* Remove new_sample.txt file

* add some work from opendevin w/ fixes

* Add unit tests for agentskills module

* fix some issues and updated tests

* add more tests for open

* tweak and handle goto_line

* add tests for some edge cases

* add tests for scrolling

* add tests for edit

* add tests for search_dir

* update tests to use pytest

* use pytest --forked to avoid file op unit tests to interfere with each other via global var

* update doc based on swe agent tool

* update and add tests for find_file and search_file

* move agent_skills to plugins

* add agentskills as plugin and docs

* add agentskill to ssh box and fix sandbox integration

* remove extra returns in doc

* add agentskills to initial tool for jupyter

* support re-init jupyter kernel (for agentskills) after restart

* fix print window's issue with indentation and add testcases

* add prompt for codeact with the newest edit primitives

* modify the way line number is presented (remove leading space)

* change prompt to the newest display format

* support tracking of costs via metrics

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* implement and add tests for py linting

* remove extra text arg for incompatible subprocess ver

* remove sample.txt

* update test_edits integration tests

* fix all integration

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update opendevin/runtime/plugins/agent_skills/README.md

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update agenthub/codeact_agent/prompt.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* Update opendevin/runtime/plugins/agent_skills/agentskills.py

Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* correctly setup plugins for swebench eval

* bump swe-bench version and add logging

* Revert "correctly setup plugins for swebench eval"

This reverts commit 2bd1055.

* bump version

* remove _AGENT_SKILLS_DOCS

* move flake8 to test dep

* update poetry.lock

* remove extra arg

* reduce max iter for eval

* update poetry

* fix integration tests

---------

Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>

* build: Add poetry command to use Python 3.11 for environment setup (#1972)

* Bump @react-types/shared from 3.23.0 to 3.23.1 in /frontend (#2006)

Bumps [@react-types/shared](https://github.com/adobe/react-spectrum) from 3.23.0 to 3.23.1.
- [Release notes](https://github.com/adobe/react-spectrum/releases)
- [Commits](https://github.com/adobe/react-spectrum/compare/@react-types/shared@3.23.0...@react-types/shared@3.23.1)

---
updated-dependencies:
- dependency-name: "@react-types/shared"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump @types/react-syntax-highlighter in /frontend (#2007)

Bumps [@types/react-syntax-highlighter](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-syntax-highlighter) from 15.5.11 to 15.5.13.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react-syntax-highlighter)

---
updated-dependencies:
- dependency-name: "@types/react-syntax-highlighter"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump @typescript-eslint/parser from 7.9.0 to 7.10.0 in /frontend (#2008)

Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 7.9.0 to 7.10.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v7.10.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump lint-staged from 15.2.2 to 15.2.4 in /frontend (#2009)

Bumps [lint-staged](https://github.com/okonet/lint-staged) from 15.2.2 to 15.2.4.
- [Release notes](https://github.com/okonet/lint-staged/releases)
- [Changelog](https://github.com/lint-staged/lint-staged/blob/master/CHANGELOG.md)
- [Commits](lint-staged/lint-staged@v15.2.2...v15.2.4)

---
updated-dependencies:
- dependency-name: lint-staged
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update README.md

* Update README.md

* add run_infer.sh

* fix input output

* fix docker sandbox

* fix run

* update and clean run_infer.py

* add script to clean up dockers

* update repo uid

* add description

* new

* Update README.md

* use root for sandbox

* update readme

* update ml-bench conda env

* update readme

* update readme

* use try except

* modify raise exception

* add int

* update README

* longer time

* fix existing issues

* fix existing issue

* new docker image

* add metrics of cost

* add result parsing cost

* fix

* fix

* update summarize

* fix

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-31-157.ec2.internal>
Co-authored-by: RainRat <rainrat78@yahoo.ca>
Co-authored-by: மனோஜ்குமார் பழனிச்சாமி <smartmanoj42857@gmail.com>
Co-authored-by: Frank Xu <frankxu2004@gmail.com>
Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Co-authored-by: Shimada666 <649940882@qq.com>
Co-authored-by: Boxuan Li <liboxuan@connect.hku.hk>
Co-authored-by: Xingyao Wang <xingyao6@illinois.edu>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Engel Nyst <enyst@users.noreply.github.com>
Co-authored-by: Robert Brennan <accounts@rbren.io>
Co-authored-by: Rahul Anand <62982824+zeul22@users.noreply.github.com>
Co-authored-by: jiangleo <jiangleo@users.noreply.github.com>
Co-authored-by: jianghongwei <jianghongwei@58.com>
Co-authored-by: Jeremi Joslin <jeremi@newlogic.com>
Co-authored-by: Aaron Xia <zhhuaxia@gmail.com>
Co-authored-by: OpenDevin <opendevin@opendevin.ai>
Co-authored-by: DaxServer <7479937+DaxServer@users.noreply.github.com>
Co-authored-by: Robert <871607149@qq.com>