Site Interaction

Some workflows can only be done by a human clicking through a website. There's no CLI, no public API, no way to script it. But the website's frontend talks to a backend API — and if you can discover that API, you can drive it directly.

Problem: You Don't Know What API the Site Uses

Most sites have no documented API. You need to figure out what endpoints exist, what they accept, and what they return.

Solution: scout the site by intercepting network traffic. Launch the browser, click through the workflow manually, and watch what the frontend calls:

javascript

page.on("response", async (res) => {
  const url = res.url();
  if (!url.includes("/api/")) return;
  
  const body = await res.text().catch(() => "");
  console.log(
    `[${res.request().method()} ${res.status()}] ${url}`,
  );
  if (body.length < 1000) console.log(`  ${body}`);
});

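This reveals the endpoints and their response formats. To see request shapes as well, also log res.request().postData(). Because the frontend depends on these endpoints, they tend to change less often than the page markup, though an undocumented API can still change without notice.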
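The raw log from the listener gets noisy once a workflow touches many records. One way to condense it into a short endpoint list is to collapse ID-like path segments and count the hits. This is a minimal sketch: `normalizeEndpoint` and `catalogEndpoints` are hypothetical helper names, and the assumption that numeric or UUID segments are record IDs may not hold for every site.

```javascript
// Hypothetical helpers, not part of any library.
// Collapse ID-like path segments so /api/items/17 and /api/items/42
// are reported as the same endpoint. Query strings are dropped.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function normalizeEndpoint(method, rawUrl) {
  const { pathname } = new URL(rawUrl);
  const path = pathname
    .split("/")
    .map((seg) => (/^\d+$/.test(seg) || UUID_RE.test(seg) ? ":id" : seg))
    .join("/");
  return `${method.toUpperCase()} ${path}`;
}

// Count how often each normalized endpoint was seen.
function catalogEndpoints(entries) {
  const counts = new Map();
  for (const { method, url } of entries) {
    const key = normalizeEndpoint(method, url);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return Object.fromEntries(counts);
}

// Example input in the shape the response listener could collect.
const seen = [
  { method: "GET", url: "https://example.com/api/items/17" },
  { method: "GET", url: "https://example.com/api/items/42?expand=1" },
  { method: "post", url: "https://example.com/api/items" },
];
console.log(catalogEndpoints(seen));
```

To feed it, push `{ method: res.request().method(), url: res.url() }` onto an array inside the listener and pass the array to `catalogEndpoints` once the manual walkthrough is done.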

Test each step on its own before combining the steps into an end-to-end script.

Problem: You Need to Automate Actions on the Site

Once you know the API (or if there isn't one), you need to drive the site programmatically.

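Solution A: call the API via page.evaluate(() => fetch(...)). This is the preferred approach: it is faster and more resilient to UI redesigns, and because fetch runs in the page context, same-origin requests carry the session cookies automatically. Headers that the frontend sets itself, such as Authorization or a CSRF token, are not added for you and must be passed explicitly. Basic GET and POST calls look like this: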

javascript

// GET
const items = await page.evaluate(async () => {
  const res = await fetch("/api/items");
  return res.json();
});

// POST with JSON
const result = await page.evaluate(async (payload) => {
  const res = await fetch("/api/items", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  return { status: res.status, body: await res.json() };
}, { name: "new item", value: 42 });
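Repeating the fetch boilerplate for every call gets tedious, and a failed request is easy to miss when only the body is returned. A small wrapper can centralize both concerns. This is a sketch under assumptions: `apiFetch` is a hypothetical helper name, it assumes a same-origin API that usually returns JSON, and it relies only on `page.evaluate` accepting a function plus one serializable argument.

```javascript
// Hypothetical helper: run a same-origin request inside the page and
// throw in Node if the response status is not 2xx.
async function apiFetch(page, path, { method = "GET", payload = null } = {}) {
  const result = await page.evaluate(
    // This function runs in the browser, so it only sees its argument.
    async ({ path, method, payload }) => {
      const init = { method };
      if (payload !== null) {
        init.headers = { "Content-Type": "application/json" };
        init.body = JSON.stringify(payload);
      }
      const res = await fetch(path, init);
      // Read as text first: error responses are not always JSON.
      const text = await res.text();
      let body;
      try {
        body = JSON.parse(text);
      } catch {
        body = text;
      }
      return { ok: res.ok, status: res.status, body };
    },
    { path, method, payload },
  );

  if (!result.ok) {
    throw new Error(`${method} ${path} failed with status ${result.status}`);
  }
  return result.body;
}
```

With this in place, the POST above could be written as `await apiFetch(page, "/api/items", { method: "POST", payload: { name: "new item", value: 42 } })`, and a 4xx or 5xx response surfaces as an exception instead of a silently returned error body.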

Solution B: drive the UI directly. Some actions have no API equivalent — CAPTCHAs, OAuth flows, drag-and-drop, file inputs that trigger client-side processing:

javascript

await page.locator('button[aria-label="Submit"]').click();
await page.locator('input[name="email"]').fill("a@b.com");
await page.locator('input[type="file"]').setInputFiles("f.zip");

When to use which:

API	UI
Fast — no rendering or animations	Slow — waits for DOM updates
Resilient to UI redesigns	Breaks when selectors change
Can't solve CAPTCHAs or OAuth	Can handle any visual interaction

Prefer API. Fall back to UI when necessary.

Problem: The Page Isn't Ready Yet

SPAs render asynchronously. By default, goto() resolves on the page's load event, which often fires before the app is usable. waitForSelector() is fragile because selectors change across deploys.

Solution: poll a lightweight API endpoint for readiness instead of inspecting the DOM:

javascript

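// Poll /api/me every 2 seconds, for up to 120 attempts (about 4 minutes).
let ready = false;
for (let attempt = 0; attempt < 120 && !ready; attempt++) {
  ready = await page.evaluate(async () => {
    try {
      const res = await fetch("/api/me");
      return res.ok;
    } catch { return false; }
  }).catch(() => false); // evaluate itself can throw mid-navigation

  // Only wait when the check failed, so a ready page continues immediately.
  if (!ready) await page.waitForTimeout(2000);
}
// Fail loudly instead of carrying on with a page that never became ready.
if (!ready) throw new Error("Timed out waiting for /api/me");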

This works regardless of how the UI renders. If the API responds with a success status, the session is live.

Problem: The Site Redirected Somewhere Unexpected

You navigated to /app/settings but landed on /login, /welcome, or /customize instead.

Solution: check the URL after navigation and take corrective action:

javascript

await page.goto("https://example.com/app/settings");

const url = page.url();
if (url.includes("/login")) {
  console.log("Redirected to login. Waiting for sign-in...");
  // Wait for auth (see authentication.md)
} else if (url.includes("/welcome")) {
  // Click through to the right page
  await page.getByText("Go to settings").click();
}
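Substring checks on the full URL can misfire, for example when the target page carries a query string such as `?from=/login`. Comparing only the path avoids that. The sketch below uses the standard `URL` class; `classifyLanding` and its return labels are hypothetical names chosen for illustration, and the paths are the ones mentioned above.

```javascript
// Hypothetical helper: decide where a navigation ended up by comparing
// URL paths only, so query strings and fragments cannot cause false matches.
function classifyLanding(currentUrl, expectedPath) {
  const { pathname } = new URL(currentUrl);
  // Match the path itself or anything nested under it, but not /loginfoo.
  const under = (prefix) =>
    pathname === prefix || pathname.startsWith(prefix + "/");

  if (under(expectedPath)) return "ok";
  if (under("/login")) return "login";
  if (under("/welcome") || under("/customize")) return "onboarding";
  return "unknown";
}

// Redirected to the login page, with the original target in the query string.
console.log(classifyLanding("https://example.com/login?next=/app/settings", "/app/settings")); // "login"
// Landed where expected, even though the query string mentions /login.
console.log(classifyLanding("https://example.com/app/settings?from=/login", "/app/settings")); // "ok"
```

In a script, the result of `classifyLanding(page.url(), "/app/settings")` can drive a `switch` that waits for sign-in, clicks through onboarding, or aborts on `"unknown"`.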

Problem: You Can't Tell Why the Script Failed

The automation broke somewhere. Without seeing what the browser was showing, debugging is guesswork.

Solution: screenshot the page on every unhandled error:

javascript

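try {
  await doWork(page);
} catch (err) {
  // The screenshot can fail too (e.g. the page already closed); don't let that mask err.
  await page.screenshot({ path: "error.png" }).catch(() => {});
  console.error("Failed:", err.message);
  console.error("Screenshot: error.png");
  // Set the exit code rather than calling process.exit(1), which would
  // terminate immediately and skip the finally block below.
  process.exitCode = 1;
} finally {
  await browser.close();
}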

Save to a known path and log the location so the human (or agent) can inspect the browser state at the moment of failure.
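A fixed `error.png` is overwritten on every failure, which loses evidence when a script is retried or run in a loop. One option is to derive the file name from a label and a timestamp. This is only a sketch: `failureScreenshotPath` is a hypothetical helper, and the `screenshots` directory name is an assumption.

```javascript
// Hypothetical helper: build a unique, filesystem-safe screenshot path.
function failureScreenshotPath(label, now = new Date(), dir = "screenshots") {
  // ISO timestamps contain ":" and ".", which are awkward in file names.
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  // Reduce the label to lowercase letters, digits, and single dashes.
  const safeLabel = label
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `${dir}/${stamp}-${safeLabel || "error"}.png`;
}

console.log(
  failureScreenshotPath("Save settings", new Date("2024-01-02T03:04:05.678Z")),
); // screenshots/2024-01-02T03-04-05-678Z-save-settings.png
```

The returned string can be passed as the `path` option of `page.screenshot()` and logged in the same way as the fixed file name.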
