Browser agent
MyCompAgent
A DOM-driven browser agent that combines Gemini native function calling with Playwright execution, approval modes, memory, and run-level debug traces.
- Status: Live repository
- Proof type: Browser agent
- Stack: Playwright, Gemini function calling, memory, debug logs
What it is
A DOM-to-action browser loop
MyCompAgent converts browser state into a DOM snapshot, interprets that snapshot for the model, asks Gemini for native function calls, and executes approved actions through Playwright.
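The loop described above can be sketched in a few lines. This is a minimal illustration, not the repository's actual code: `run_loop`, `FunctionCall`, and the five injected callables are hypothetical names, and the real snapshot, Gemini, and Playwright steps are replaced by plain functions so the control flow is visible on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class FunctionCall:
    """A model-requested action (hypothetical shape, for illustration only)."""
    name: str
    args: Dict[str, str] = field(default_factory=dict)


History = List[Tuple[FunctionCall, str]]


def run_loop(
    take_snapshot: Callable[[], str],
    interpret: Callable[[str], str],
    ask_model: Callable[[str, History], Optional[FunctionCall]],
    approve: Callable[[FunctionCall], bool],
    execute: Callable[[FunctionCall], str],
    max_steps: int = 10,
) -> History:
    """Snapshot, interpret, ask the model, then execute only approved calls."""
    history: History = []
    for _ in range(max_steps):
        observation = interpret(take_snapshot())
        call = ask_model(observation, history)
        if call is None:
            # The model returned no function call, so the task is treated as done.
            break
        # Rejected calls are still recorded so the model can see the refusal.
        result = execute(call) if approve(call) else "rejected"
        history.append((call, result))
    return history
```

Passing the steps in as callables is a design choice made for this sketch: it keeps the loop testable without a browser or an API key. In a real run, `execute` would wrap Playwright and `ask_model` would wrap the Gemini chat session.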
Why it exists
Browser agents need controlled execution
The repo is built around the practical parts of browser control: human-in-the-loop modes, guardrails, persistent sessions, memory events, and evidence logs for what the model saw and did.
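One way to picture approval modes and guardrails working together is a small decision function that runs before every action. The mode names, the `RISKY_ACTIONS` set, and the `BLOCKED_HOSTS` entry below are assumptions made for illustration; the repository may define its modes and rules differently.

```python
from enum import Enum
from typing import Dict
from urllib.parse import urlparse


class ApprovalMode(Enum):
    """Hypothetical human-in-the-loop modes."""
    AUTO = "auto"                    # run everything the guardrails allow
    CONFIRM_RISKY = "confirm_risky"  # ask before actions that change state
    CONFIRM_ALL = "confirm_all"      # ask before every action


# Assumed for this sketch: which actions count as risky, which hosts are off limits.
RISKY_ACTIONS = {"type", "navigate"}
BLOCKED_HOSTS = {"accounts.example.com"}


def decide(mode: ApprovalMode, action: str, args: Dict[str, str]) -> str:
    """Return 'block', 'ask', or 'run' for a proposed action."""
    # Guardrails apply first and cannot be overridden by any mode.
    if action == "navigate":
        host = urlparse(args.get("url", "")).hostname or ""
        if host in BLOCKED_HOSTS:
            return "block"
    if mode is ApprovalMode.CONFIRM_ALL:
        return "ask"
    if mode is ApprovalMode.CONFIRM_RISKY and action in RISKY_ACTIONS:
        return "ask"
    return "run"
```

Checking guardrails before the mode means a blocked action is never offered for approval, so a hurried reviewer cannot wave it through.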
How it works
Snapshot, interpret, call, execute
01 Snapshot
02 Interpreter
03 Function
04 Playwright
- Parse the current page into a compact DOM snapshot with visible state.
- Maintain a multi-turn chat loop instead of one-shot action parsing.
- Use Gemini native function calling for click, type, navigate, wait, and screenshot actions.
- Record actions, LLM responses, browser state, interpreter state, memory events, snapshots, and video traces.
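To make the first and last points concrete, here is a rough sketch of a compact snapshot and a trace record. The `Element` fields, the `[index] <tag> text` line format, and the event keys are all hypothetical; in a real run the elements would be collected from the page through Playwright, and the repository's trace schema is likely richer.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Element:
    """A simplified DOM node (assumed shape, for illustration only)."""
    tag: str
    text: str
    visible: bool


def compact_snapshot(elements: List[Element]) -> str:
    """Render visible elements as one indexed line each, skipping hidden ones."""
    lines = []
    for index, el in enumerate(elements):
        if not el.visible:
            continue
        # The index stays stable so the model can refer to an element by number.
        lines.append(f"[{index}] <{el.tag}> {el.text.strip()}")
    return "\n".join(lines)


def trace_event(step: int, kind: str, payload: Dict[str, Any]) -> str:
    """Serialize one trace record as a single JSON line."""
    return json.dumps({"step": step, "kind": kind, "payload": payload}, sort_keys=True)
```

Writing one JSON object per line keeps a run's trace appendable and easy to filter by `kind`, for example to separate model responses from executed actions.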