Browser agent

MyCompAgent

A DOM-driven browser agent powered by Gemini native function calling, Playwright execution, approval modes, memory, and run-level debug traces.

MyCompAgent artifact board with DOM snapshot, Gemini function call, actions log, and Playwright surface.
Artifact-style board from repository surfaces, not a product screenshot.
Status
Live repository
Proof type
Browser agent
Stack
Playwright Gemini function calling memory debug logs

What it is

A DOM-to-action browser loop

MyCompAgent converts browser state into a DOM snapshot, interprets that snapshot for the model, asks Gemini for native function calls, and executes approved actions through Playwright.

Why it exists

Browser agents need controlled execution

The repo is built around the practical parts of browser control: human-in-loop modes, guardrails, persistent sessions, memory events, and evidence logs for what the model saw and did.

How it works

Snapshot, interpret, call, execute

01Snapshot
02Interpreter
03Function
04Playwright
  1. Parse the browser into a compact DOM snapshot with visible state.
  2. Maintain a multi-turn chat loop instead of one-shot action parsing.
  3. Use Gemini native function calling for click, type, navigate, wait, and screenshot actions.
  4. Record actions, LLM responses, browser state, interpreter state, memory events, snapshots, and video traces.