[Feat] A competitive Web Browsing agent #1856

frankxu2004 · 2024-05-17T02:13:01Z

This PR aims at enabling a competitive browsing agent for #1470.

I have ported the simplified demo agent used in WebArena into our agent hub.

For testing, it works best with GPT-4-class LLMs such as GPT-4o:

poetry run python ./opendevin/core/main.py -i 5 -t "tell me the usa's president using google search" -c BrowsingAgent -m gpt-4o-2024-05-13


frankxu2004 · 2024-05-20T22:29:53Z

Example logs:

17:52:07 - opendevin:INFO: browsing_agent.py:128 - Last action failed:
click('235')
Try again with the current state of the page.


# Current Accessibility Tree:
RootWebArea 'Google', focused
        [20] navigation ''
                [22] link 'About'
                [23] link 'Store'
                [31] link 'Gmail'
                [33] link 'Search for Images'
                [38] button 'Google apps', expanded=False
                        [39] image ''
                [40] link 'Sign in'
                [a] IframePresentational ''
        [48] image 'Google'
        [78] search ''
                [88] image ''
                [92] combobox 'Search' value='current president of the USA', focused, autocomplete='both', hasPopup='listbox', expanded=True, controls='Alh6id'
                [98] button 'Clear'
                        [100] image ''
                [103] button 'Search by voice'
                        [104] image ''
                [106] button 'Search by image'
                        [107] image ''
                [127] listbox '', multiselectable=False, orientation='vertical'
                        [141] option 'current president of the usa', selected=False
                        [141] option 'who is the president of the usa', selected=False
                        [141] option 'who is the president of the usa now', selected=False
                        [141] option 'who is the president of the usa 2024', selected=False
                        [141] option 'who is the president of the usa 2023', selected=False
                        [141] option 'president of the senate us', selected=False
                        [141] option 'who is the president of the usa 2020', selected=False
                        [141] option 'who is the president of the usa 2021', selected=False
                        [141] option 'who is the president of the usa today', selected=False
                        [141] option 'who is the president of the usa during ww1', selected=False
                [226] button 'Google Search'
                [227] button "I'm Feeling Lucky"
                [230] button 'Report inappropriate predictions'
                [235] button 'Google Search'
                [236] button "I'm Feeling Lucky"
        [271] contentinfo ''
                [275] link 'Advertising'
                [276] link 'Business'
                [277] link 'How Search works'
                [279] link 'Our third decade of climate action: join us'
                        [280] image ''
                [283] link 'Privacy'
                [284] link 'Terms'
                [289] button 'Settings', hasPopup='menu', expanded=False
                        generic '', hasPopup='menu'

# Previous Actions
goto('https://www.google.com')
fill('92', 'current president of the USA')
click('235')
click('235')


Here is an example with chain of thought of a valid action when clicking on a button:
"
In order to accomplish my goal I need to click on the button with bid 12
```click("12")```
"

17:52:07 - opendevin:INFO: browsing_agent.py:129 - In order to accomplish my goal, I need to click on the button with bid 226 to perform the Google search.
```click('226')```
17:52:07 - ACTION
BrowseInteractiveAction(browser_actions="click('226')", thought='In order to accomplish my goal, I need to click on the button with bid 226 to perform the Google search.', action='browse_interactive')


==============
STEP 4

17:52:13 - opendevin:INFO: browsing_agent.py:141 - Cost: 0.02 USD | Accumulated Cost: 0.05 USD
17:52:13 - opendevin:INFO: browsing_agent.py:128 - 

# Current Accessibility Tree:
RootWebArea 'current president of the USA - Google Search', focused
        [15] heading 'Accessibility Links'
        [18] link 'Skip to main content'
        [19] link 'Turn off continuous scrolling'
        [27] link 'Accessibility help'
        [31] link 'Accessibility feedback'
        [35] search ''
                [39] link 'Google'
                        [40] image 'Google'
                [56] combobox 'Search' value='current president of the USA', autocomplete='both', hasPopup='listbox', expanded=False, controls='Alh6id'
                [59] button 'Clear'
                        [61] image ''
                [63] button 'Search by voice'
                        [64] image ''
                [65] button 'Search by image'
                        [66] image ''
                [67] button 'Search'
                        [70] image ''
        [248] button 'Settings'
                [250] image ''
        [253] banner ''
                [256] button 'Google apps', expanded=False
                        [257] image ''
                [259] link 'Sign in'
        [280] navigation ''
                [283] navigation ''
                        [287] heading 'Filters and Topics'
                        [291] list ''
                                [292] listitem ''
                                        StaticText 'All'
                                [295] listitem ''
                                        [296] link 'Images'
                                [298] listitem ''
                                        [299] link 'News'
                                [301] listitem ''
                                        [302] link 'Videos'
                                [304] listitem ''
                                        [305] link 'Shopping'
                                [307] listitem ''
                                        [309] button 'More', hasPopup='menu', expanded=False
                                                [312] image ''
                        [336] button 'Tools', expanded=False, controls='hdtbMenus'
                [340] list ''
                        [346] button 'SafeSearch', hasPopup='menu', expanded=False
                                [353] image ''
        [554] main ''
                [558] heading 'Search Results'
                [575] heading 'United States/President'
                [583] heading 'Joe Biden'
                        [585] link 'Joe Biden'
                [597] button 'Credit: Getty Images/The White House'
                        [600] image 'Credit: Getty Images/The White House'
                StaticText 'The 46th and current president of the United States is Joseph R. Biden, Jr. He was sworn into office on January 20, 2021.'
                StaticText 'Dec 6, 2023'
                [618] link 'Presidents, vice presidents, and first ladies | USAGov USA.gov https://www.usa.gov › ... › U.S. facts and figures'
                        [620] heading 'Presidents, vice presidents, and first ladies | USAGov'
                [649] button 'About this result'
                        [652] image ''
                [656] heading 'People also search for'
                        [657] link 'People also search for'
                [660] link 'Benjamin Netanyahu (Trending)'
                        [666] image ''
                [670] link 'Donald Trump'
                [676] link 'Katie Britt (Trending)'
                        [682] image ''
                [686] link 'Jill Biden'
                [692] link 'Barack Obama'
                [698] link 'Neilia Hunter Biden'
                [704] link 'Kamala Harris'
                [725] button 'Feedback'
                StaticText 'Sources include:'
                [730] link 'Ballotpedia'
                StaticText ','
                [731] link 'Wikipedia'
                StaticText '.'
                [732] link 'Learn more'
                [753] heading 'People also ask'
                [760] button 'About this result'
                        [763] image ''
                [771] button 'Who is next in line for president of us?', expanded=False, controls='_CMZLZpjVDMvXseMP7LSn8Ao_44'
                [830] button 'Who is the new president of United States?', expanded=False, controls='_CMZLZpjVDMvXseMP7LSn8Ao_34'
                [889] button 'Who is the number 1 US president?', expanded=False, controls='_CMZLZpjVDMvXseMP7LSn8Ao_42'
                [948] button 'What number president is Joe Biden?', expanded=False, controls='_CMZLZpjVDMvXseMP7LSn8Ao_43'
                [1029] button 'Feedback'
                [1051] link 'President of the United States Wikipedia https://en.wikipedia.org › wiki › President_of_the_Unit...'
                        [1053] heading 'President of the United States'
                [1082] button 'About this result'
                        [1085] image ''
                [1089] emphasis ''
                        StaticText 'Joe Biden'
                StaticText 'is the 46th and current president of the United States, having assumed office on January 20, 2021.'
                StaticText '\u200e'
                [1094] link 'List'
                StaticText '· \u200e'
                [1095] link 'Powers'
                [1096] link 'Executive Office of the'
                [1097] link 'Vice President'
                [1105] link 'Joe Biden: The President The White House (.gov) https://www.whitehouse.gov › administration › presiden...'
                        [1107] heading 'Joe Biden: The President'
                [1136] button 'About this result'
                        [1139] image ''
                StaticText 'As President,'
                [1143] emphasis ''
                        StaticText 'Biden'
                StaticText "will restore America's leadership and build our communities back better. Joseph Robinette Biden, Jr. was born in Scranton, Pennsylvania, the\xa0..."
                [1153] link 'President Joe Biden (@potus) • Instagram photos and videos Instagram\xa0·\xa0potus 19.2M+ followers'
                        [1155] heading 'President Joe Biden (@potus) • Instagram photos and videos'
                [1182] button 'About this result'
                        [1185] image ''
                [1189] emphasis ''
                        StaticText '46th'
                StaticText 'President of the United States, husband to @flotus, proud dad and pop. Finishing the job for all Americans. Text me: (302) 404-0880 ... Photo by President\xa0...'
                [1199] link 'President Joe Biden Facebook\xa0·\xa0President Joe Biden 11.9M+ followers'
                        [1201] heading 'President Joe Biden'
                [1228] button 'About this result'
                        [1231] image ''
                [1235] emphasis ''
                        StaticText 'President Joe Biden'
                StaticText '. 10M likes · 72129 talking about this. 46th President of the United States, husband to @FLOTUS, proud father and pop. Text me (302)...'
                [1245] link 'The Executive Branch The White House (.gov) https://www.whitehouse.gov › ... › Our Government'
                        [1247] heading 'The Executive Branch'
                [1276] button 'About this result'
                        [1279] image ''
                [1283] emphasis ''
                        StaticText 'President'
                StaticText 'is both the head of state and head of government of the'
                [1284] emphasis ''
                        StaticText 'United States of America'
                StaticText ', and Commander-in-Chief of the armed forces. Under Article II of\xa0...'
                [1294] link 'President of the United States United States Mission to the United Nations (.gov) https://usun.usmission.gov › Our Leaders'
                        [1296] heading 'President of the United States'
                [1325] button 'About this result'
                        [1328] image ''
                [1332] emphasis ''
                        StaticText 'Joseph R. Biden'
                StaticText '. President Biden represented Delaware for 36 years in the U.S. Senate before becoming the 47th Vice President of the United States.'
                [1342] link 'President of the United States Ballotpedia https://ballotpedia.org › President_of_the_United_States'
                        [1344] heading 'President of the United States'
                [1373] button 'About this result'
                        [1376] image ''
                StaticText 'The current president is'
                [1380] emphasis ''
                        StaticText 'Joe Biden (D'
                StaticText '). Election ... The executive Power shall be vested in a President of the United States of America. ... The President, Vice\xa0...'
                [1391] link 'Joe Biden Wikipedia https://en.wikipedia.org › wiki › Joe_Biden'
                        [1393] heading 'Joe Biden'
                [1422] button 'About this result'
                        [1425] image ''
                [1429] emphasis ''
                        StaticText 'Joseph Robinette Biden Jr'
                StaticText 'is an American politician who is the 46th and current president of the United States since 2021. A member of the Democratic Party,\xa0...'
                StaticText '\u200e'
                [1434] link 'Political positions'
                StaticText '· \u200e'
                [1435] link 'Electoral history'
                [1436] link '2008 Presidential Campaign'
                [1437] link 'Jill Biden'
                [1445] link 'Images'
                [1452] button 'About this result'
                        [1455] image ''
                [1462] button 'Joe Biden: The President | The White House'
                        [1465] image 'Joe Biden: The President | The White House'
                [1468] link 'Joe Biden: The President | The White House The White House'
                        [1473] image ''
                [1483] button 'About this result'
                        [1486] image ''
                [1488] button 'President of the USA | Current Leader'
                        [1491] image 'President of the USA | Current Leader'
                [1494] link 'President of the USA | Current Leader PlanetRulers'
                        [1499] image ''
                [1509] button 'About this result'
                        [1512] image ''
                [1514] button 'Joe Biden: The President | The White House'
                        [1517] image 'Joe Biden: The President | The White House'
                [1520] link 'Joe Biden: The President | The White House The White House'
                        [1525] image ''
                [1535] button 'About this result'
                        [1538] image ''
                [1708] button 'Feedback'
                [1720] button '6 more images'
                        [1726] image ''
        [1771] heading 'Related searches'
        [1778] button 'About this result'
                [1781] image ''
        [1786] link 'who is the 46th president'
        [1792] link 'who is the vice president of the united states'
        [1799] link 'who is the prime minister of usa'
        [1805] link 'all presidents in order'
        [1811] link 'first president of usa'
        [1816] link '5 requirements to be president'
        [1821] link 'joe biden'
        [1826] link 'presidential line of succession today'
        generic '', hidden=True
        generic '', hidden=True
        generic '', owns='rhs'
                [1868] complementary ''
                        generic '', hidden=True
                        generic '', hidden=True
                        [1872] heading 'Complementary Results'
                        [1891] link 'Joe Biden'
                                [1892] heading 'Joe Biden'
                        [1895] heading '46th U.S. President'
                        [1900] button 'More options', hasPopup='menu', expanded=False
                                [1901] image 'More options'
                                        [1902] image ''
                        [1992] link ''
                                [1994] image ''
                        [2009] link 'whitehouse.gov'
                                [2011] image ''
                        [2018] heading 'Description'
                        StaticText 'Joseph Robinette Biden Jr. is an American politician who is the 46th and current president of the United States since 2021.'
                        [2023] link 'Wikipedia'
                        StaticText 'Born'
                        StaticText ':'
                        StaticText 'November 20, 1942 (age 81\xa0years),'
                        [2033] link 'Scranton, PA'
                        StaticText 'Edited works'
                        StaticText ':'
                        [2042] link 'Halting the Spread of HIV/AIDS: Future Efforts in the U. S. Bilateral and Multilateral Response: Congressional Hearings'
                        StaticText ','
                        [2045] link 'MORE'
                        StaticText 'Organizations founded'
                        StaticText ':'
                        [2055] link 'United States Department of Defense China Task Force'
                        StaticText ','
                        [2058] link 'MORE'
                        StaticText 'Grandchildren'
                        StaticText ':'
                        [2068] link 'Navy Joan Roberts'
                        StaticText ','
                        [2069] link 'Natalie Biden'
                        [2070] link 'Maisy Biden'
                        [2071] link 'Robert Biden II'
                        StaticText ','
                        [2072] link 'Naomi Biden'
                        [2073] link 'Finnegan Biden'
                        StaticText 'Grandparents'
                        StaticText ':'
                        [2082] link 'Ambrose J. Finnegan'
                        StaticText ','
                        [2083] link 'Mary Elizabeth Robinette Biden'
                        [2084] link 'Joseph H. Biden'
                        [2085] link 'Geraldine C. Blewitt'
                        StaticText 'Great-grandparents'
                        StaticText ':'
                        [2094] link 'George Hamilton Robinette'
                        StaticText ','
                        [2097] link 'MORE'
                        StaticText 'Marriage location'
                        StaticText ':'
                        [2107] link 'New York, NY'
                        StaticText 'Sources include:'
                        [2111] link 'Ballotpedia'
                        [2112] link 'Wikipedia'
                        StaticText '.'
                        [2113] link 'Learn more'
                        [2119] heading 'Profiles'
                        [2125] link 'Instagram'
                                [2127] image ''
                        [2132] link 'X (Twitter)'
                                [2134] image ''
                        [2139] link 'Facebook'
                                [2141] image ''
                        [2146] link 'YouTube'
                                [2148] image ''
                        [2159] link 'More about Joe Biden'
                        [2166] button 'Feedback'
                        generic '', hidden=True
                        generic '', hidden=True
        [1843] progressbar 'Loading...', live='polite', relevant='additions text', valuemin=0, valuemax=100, valuetext=''
        [1847] heading 'Page Navigation'
        [1848] button 'More results'
        [1856] button '', live='polite', relevant='additions text'
        [1864] navigation ''
        generic '', live='polite', relevant='additions text'
        generic '', live='polite', relevant='additions text'
        generic '', live='polite', relevant='additions text'
        generic '', live='polite', relevant='additions text'

# Previous Actions
goto('https://www.google.com')
fill('92', 'current president of the USA')
click('235')
click('235')
click('226')


Here is an example with chain of thought of a valid action when clicking on a button:
"
In order to accomplish my goal I need to click on the button with bid 12
```click("12")```
"

17:52:13 - opendevin:INFO: browsing_agent.py:129 - In order to accomplish my goal of telling you the current president of the USA, I need to send a message with the relevant information found in the search results.

```send_msg_to_user('The current president of the USA is Joe Biden.')```
17:52:13 - ACTION
MessageAction(content='The current president of the USA is Joe Biden.', wait_for_response=False, action='message')
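
The log above shows the model replying with free-text reasoning followed by an action wrapped in triple backticks, which is then logged as a BrowseInteractiveAction with separate thought and browser_actions fields. A minimal sketch of how such a reply could be split is given below. ParsedResponse and parse_response are hypothetical names used only for illustration, and this is not necessarily how browsing_agent.py does it.

```python
from dataclasses import dataclass

# Built programmatically so the marker never appears literally in this example.
FENCE = "`" * 3


@dataclass
class ParsedResponse:
    """Hypothetical container for illustration; not a class from the PR."""

    thought: str
    browser_actions: str


def parse_response(response: str) -> ParsedResponse:
    """Split an LLM reply into free-text thought and the fenced action string."""
    # Everything before the first fence is treated as the thought.
    thought, fence, rest = response.partition(FENCE)
    if not fence:
        # No fenced action found: keep the text as the thought, with no action.
        return ParsedResponse(thought=response.strip(), browser_actions="")
    # The action is whatever sits between the opening fence and the next fence.
    action = rest.split(FENCE, 1)[0]
    return ParsedResponse(thought=thought.strip(), browser_actions=action.strip())


reply = (
    "In order to accomplish my goal, I need to click on the button with "
    "bid 226 to perform the Google search.\n" + FENCE + "click('226')" + FENCE
)
parsed = parse_response(reply)
print(parsed.browser_actions)
```

Keeping the thought and the action separate mirrors the two fields visible in the logged action, and a reply without any fenced action degrades to an empty action string instead of raising.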


xingyaoww

LGTM! This mainly adds a browsing agent to the agent hub and tweaks the browser env a little. I think we can approve it to unblock the integration of BrowserGym.

EDIT: I also locally tested and confirmed the sample command works on my end!

PS: Once we figure out a way to do task decomposition, CodeAct can eventually delegate complex web browsing tasks to this BrowsingAgent!

yufansong

I left some nits, but this mostly LGTM. I would appreciate it if you could add more comments, or briefly explain your design and some of the parameter settings, so that other people can build on your code. I don't want to block our integration progress, so I'm approving it. I can help with follow-up refactoring or the nits if you don't have time.

agenthub/browsing_agent/README.md

agenthub/browsing_agent/browsing_agent.py

agenthub/browsing_agent/prompt.py


frankxu2004 · 2024-05-21T13:44:44Z

Thanks! @yufansong I added some comments for the things that were unclear. Hope it's good for now. Since I changed BrowserOutputObservation a bit, some of the integration tests are failing; would you mind taking a look at how to fix those?

EDIT: NVM, just fixed those, should be ready to go


li-boxuan · 2024-05-22T04:05:32Z

Sad, our project's test coverage dropped by 5.87%... Let me see if there's anything we can do to test this.

li-boxuan · 2024-05-22T07:32:44Z

I've made some progress in creating an integration test for this agent! Will create a PR in a day.

li-boxuan · 2024-05-22T07:53:18Z

agenthub/browsing_agent/prompt.py

+        )
+
+
+class SystemPrompt(PromptElement):


@frankxu2004 this prompt (along with many other prompts in this file) seems to be unused. Is that intentional?

Yes, basically the whole prompt.py file is currently unused. The agent is currently a simplified version for ease of understanding. However, I included the file with the intention of incorporating a more complex agent that uses more comprehensive information as a next step. It's still useful here, since it gives others building blocks for prompts and shows what information could be included as context for LLMs.

These PRs are mostly for chasing the NeurIPS paper deadline, so not all features are implemented yet.

I see, sounds fair. I'm just having a bit of trouble reproducing poetry run python ./opendevin/core/main.py -i 5 -t "tell me the usa's president using google search" -c BrowsingAgent -m gpt-4o-2024-05-13... I tried about 5 times and only succeeded once.

That's a bit weird. What error are you seeing? Do you have logs?

Currently the agent does not return AgentFinishAction, so in the eyes of the framework it always ends in an error. Maybe I should add this finish action.

logs.zip

Basically, it keeps clicking without making progress.

Yeah, sometimes it behaves like this. I improved the agent a bit and fixed some issues in #1993.
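
The looping behaviour described in this thread also shows up in the example logs, where click('235') appears twice in a row under Previous Actions. As a rough illustration only, a repeated-action check could look like the sketch below. The function names and the max_repeats threshold are hypothetical, and this is not a description of what #1993 changed.

```python
def count_trailing_repeats(actions: list[str]) -> int:
    """Return how many times the last action repeats at the end of the list."""
    if not actions:
        return 0
    last = actions[-1]
    count = 0
    # Walk backwards until an action differs from the most recent one.
    for action in reversed(actions):
        if action != last:
            break
        count += 1
    return count


def looks_stuck(actions: list[str], max_repeats: int = 2) -> bool:
    """Heuristic: flag the run once the same action was issued max_repeats times in a row."""
    return count_trailing_repeats(actions) >= max_repeats


# Action history copied from the example logs earlier in this PR.
previous_actions = [
    "goto('https://www.google.com')",
    "fill('92', 'current president of the USA')",
    "click('235')",
    "click('235')",
]
print(looks_stuck(previous_actions))  # prints True
```

A check like this only catches exact repeats; an agent that alternates between two unproductive actions would need a different heuristic.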

* initial attempt at a browsing only agent
* add browsing agent
* update
* implement agent
* update
* fix comments
* remove unnecessary things from memory extras
* update image processing

---------

Co-authored-by: Yufan Song <33971064+yufansong@users.noreply.github.com>

frankxu2004 added 3 commits May 16, 2024 22:11

initial attempt at a browsing only agent

47d09e6

Merge branch 'main' of https://github.com/OpenDevin/OpenDevin into browsing-agent

4b7a732

add browsing agent

58d4433

neubig assigned frankxu2004 May 17, 2024

frankxu2004 added 7 commits May 17, 2024 14:59

update

4b54527

Merge branch 'main' of https://github.com/OpenDevin/OpenDevin into browsing-agent

5dea62d

Merge branch 'main' of https://github.com/OpenDevin/OpenDevin into browsing-agent

748f0bb

Merge branch 'main' of https://github.com/OpenDevin/OpenDevin into browsing-agent

43f1cbf

Merge branch 'main' of https://github.com/OpenDevin/OpenDevin into browsing-agent

7a83cbb

implement agent

a88175e

Merge branch 'main' of https://github.com/OpenDevin/OpenDevin into browsing-agent

5b25147

frankxu2004 marked this pull request as ready for review May 20, 2024 21:57

update

e45df0a

Merge branch 'main' of https://github.com/OpenDevin/OpenDevin into browsing-agent

4bfc4fe

xingyaoww approved these changes May 21, 2024


xingyaoww mentioned this pull request May 21, 2024

Add: a mechanism for tracking contributions to the paper #1917

Draft

yufansong approved these changes May 21, 2024


frankxu2004 added 2 commits May 21, 2024 09:29

Merge branch 'main' of https://github.com/OpenDevin/OpenDevin into browsing-agent

680d8cc

fix comments

d68de8a

remove unnecessary things from memory extras

e9ded4f

frankxu2004 mentioned this pull request May 21, 2024

Incorporate BrowsingAgent's prompts to CodeAct to enable richer interactions with the browser. #1945

Open

frankxu2004 added 2 commits May 21, 2024 13:48

update image processing

8291869

Merge branch 'main' of https://github.com/OpenDevin/OpenDevin into browsing-agent

9036245

yufansong enabled auto-merge (squash) May 21, 2024 18:54

yufansong disabled auto-merge May 21, 2024 19:02

Merge branch 'main' into browsing-agent

e6e3f33

yufansong enabled auto-merge (squash) May 21, 2024 19:03

yufansong merged commit 1fe290a into OpenDevin:main May 21, 2024
23 checks passed

frankxu2004 deleted the browsing-agent branch May 21, 2024 20:03

li-boxuan reviewed May 22, 2024


li-boxuan May 22, 2024

frankxu2004 May 22, 2024

frankxu2004 May 22, 2024

li-boxuan May 23, 2024

frankxu2004 May 23, 2024

frankxu2004 May 23, 2024 •

edited

li-boxuan May 23, 2024

frankxu2004 May 23, 2024

