Site Interaction

Some workflows can only be done by a human clicking through a website. There's no CLI, no public API, no way to script it. But the website's frontend talks to a backend API — and if you can discover that API, you can drive it directly.

Problem: You Don't Know What API the Site Uses

Most sites have no documented API. You need to figure out what endpoints exist, what they accept, and what they return.

Solution: scout the site by intercepting network traffic. Launch the browser, click through the workflow manually, and watch what the frontend calls:

javascript
page.on("response", async (res) => {
  const url = res.url();
  if (!url.includes("/api/")) return;
  
  const body = await res.text().catch(() => "");
  console.log(
    `[${res.request().method()} ${res.status()}] ${url}`,
  );
  if (body.length < 1000) console.log(`  ${body}`);
});

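This reveals endpoints and response formats; log res.request().postData() as well if you need to see request bodies. These endpoints tend to be more stable than the DOM because the frontend depends on them, but they are undocumented and can change without notice.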
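After a scouting session has logged a few dozen calls, it can help to collapse them into a list of distinct endpoints. The sketch below is one way to do that, assuming you push a { method, url, status } record from the response handler. The name summarizeEndpoints and the ":id" placeholder convention are hypothetical, not part of any library.

```javascript
// Hypothetical helper: group observed calls by "METHOD /path", replacing
// purely numeric path segments with ":id" so /items/1 and /items/2 collapse.
function summarizeEndpoints(calls) {
  const summary = new Map();
  for (const { method, url, status } of calls) {
    // res.url() is absolute, so new URL() can parse it directly.
    const path = new URL(url).pathname
      .split("/")
      .map((segment) => (/^\d+$/.test(segment) ? ":id" : segment))
      .join("/");
    const key = `${method} ${path}`;
    const entry = summary.get(key) ?? { count: 0, statuses: new Set() };
    entry.count += 1;
    entry.statuses.add(status);
    summary.set(key, entry);
  }
  return summary;
}
```

To feed it, collect records inside the handler, for example calls.push({ method: res.request().method(), url: res.url(), status: res.status() }), and print the map once you have finished clicking through the workflow.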

Test each step independently before combining them into an end-to-end script.

Problem: You Need to Automate Actions on the Site

Once you know the API (or if there isn't one), you need to drive the site programmatically.

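Solution A: call the API via page.evaluate(() => fetch(...)). This is the preferred approach: it is faster, more resilient to UI redesigns, and the browser attaches the session's cookies automatically because fetch runs in the page context. If the site authenticates with a bearer token or requires a CSRF header, the browser will not add it for you; copy that header from the requests you observed while scouting.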

javascript
// GET
const items = await page.evaluate(async () => {
  const res = await fetch("/api/items");
  return res.json();
});

// POST with JSON
const result = await page.evaluate(async (payload) => {
  const res = await fetch("/api/items", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  return { status: res.status, body: await res.json() };
}, { name: "new item", value: 42 });
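The two calls above can be folded into one wrapper that also surfaces HTTP errors and tolerates non-JSON responses. This is a minimal sketch, not a definitive client: apiCall is a hypothetical name, and the headers option is an assumption for sites that need extra headers (the "X-CSRF-Token" name in the usage note is only an illustration). Everything the in-page function needs is passed as an argument, because page.evaluate serializes the function and it cannot see variables from the Node.js scope.

```javascript
// Hypothetical wrapper around page.evaluate(() => fetch(...)).
// Throws on non-2xx responses; returns parsed JSON, or raw text if not JSON.
async function apiCall(page, path, { method = "GET", payload = null, headers = {} } = {}) {
  const result = await page.evaluate(
    async ({ path, method, payload, headers }) => {
      // This function runs inside the page, so cookies are sent automatically.
      const init = { method, headers: { ...headers } };
      if (payload !== null) {
        init.headers["Content-Type"] = "application/json";
        init.body = JSON.stringify(payload);
      }
      const res = await fetch(path, init);
      const text = await res.text();
      let body;
      try {
        body = JSON.parse(text);
      } catch {
        body = text; // Not JSON (e.g. an HTML error page); keep the raw text.
      }
      return { ok: res.ok, status: res.status, body };
    },
    { path, method, payload, headers },
  );
  if (!result.ok) {
    throw new Error(`${method} ${path} failed with HTTP ${result.status}`);
  }
  return result.body;
}
```

Usage might look like await apiCall(page, "/api/items", { method: "POST", payload: { name: "new item", value: 42 }, headers: { "X-CSRF-Token": token } }), where token is whatever value you observed the frontend sending.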

Solution B: drive the UI directly. Some actions have no API equivalent — CAPTCHAs, OAuth flows, drag-and-drop, file inputs that trigger client-side processing:

javascript
await page.locator('button[aria-label="Submit"]').click();
await page.locator('input[name="email"]').fill("a@b.com");
await page.locator('input[type="file"]').setInputFiles("f.zip");

When to use which:

| API | UI |
| --- | --- |
| Fast: no rendering or animations | Slow: waits for DOM updates |
| Resilient to UI redesigns | Breaks when selectors change |
| Can't solve CAPTCHAs or OAuth | Can handle any visual interaction |

Prefer API. Fall back to UI when necessary.
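One way to encode that preference is a small helper that tries the API path first and only drives the UI if it fails. This is a sketch under assumptions: preferApi is a hypothetical name, and both arguments are functions you supply. It is only safe for actions that can be retried, since a POST that failed after the server applied it could otherwise be performed twice.

```javascript
// Hypothetical helper: run apiAction; if it throws, fall back to uiAction.
// Returns which path succeeded so the caller can log or measure it.
async function preferApi(apiAction, uiAction) {
  try {
    return { via: "api", value: await apiAction() };
  } catch (err) {
    console.warn(`API path failed (${err.message}); falling back to UI`);
    return { via: "ui", value: await uiAction() };
  }
}
```

For example, apiAction could wrap a page.evaluate(() => fetch(...)) call that throws on a non-2xx status, and uiAction could click through the equivalent form with page.locator(...).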

Problem: The Page Isn't Ready Yet

SPAs render asynchronously. goto() can resolve before the app is usable, and waitForSelector() is fragile because selectors change across deploys.

Solution: poll a lightweight API endpoint for readiness instead of inspecting the DOM:

javascript
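const MAX_ATTEMPTS = 120; // With a 2 s pause between checks: roughly 4 minutes
let ready = false;
for (let attempt = 0; attempt < MAX_ATTEMPTS && !ready; attempt++) {
  // The outer catch covers evaluate() failing, e.g. during a navigation.
  ready = await page.evaluate(async () => {
    try {
      const res = await fetch("/api/me");
      return res.ok;
    } catch { return false; }
  }).catch(() => false);

  // Check first, wait only if the session is not ready yet.
  if (!ready) await page.waitForTimeout(2000);
}
if (!ready) throw new Error(`Session not ready after ${MAX_ATTEMPTS} attempts`);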

This works regardless of how the UI renders. If the endpoint returns a 2xx status, the session is live.
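If several scripts need the same wait, the loop can be factored into a generic helper driven by a deadline rather than an attempt count. The following is a sketch, assuming a check function that resolves to true when the session is ready; pollUntil and its option names are hypothetical.

```javascript
// Hypothetical helper: call check() until it resolves truthy or time runs out.
// Resolves with the number of attempts made; rejects on timeout.
async function pollUntil(check, { intervalMs = 2000, timeoutMs = 240000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  let attempts = 0;
  while (true) {
    attempts++;
    // Treat a thrown error or rejected promise as "not ready yet".
    const ok = await Promise.resolve().then(check).catch(() => false);
    if (ok) return attempts;
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Not ready after ${attempts} attempts`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With Playwright, the check could be () => page.evaluate(async () => (await fetch("/api/me")).ok). A network failure inside the page rejects the evaluate call, which the helper treats as one more unsuccessful attempt.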

Problem: The Site Redirected Somewhere Unexpected

You navigated to /app/settings but landed on /login, /welcome, or /customize instead.

Solution: check the URL after navigation and take corrective action:

javascript
await page.goto("https://example.com/app/settings");

const url = page.url();
if (url.includes("/login")) {
  console.log("Redirected to login. Waiting for sign-in...");
  // Wait for auth (see authentication.md)
} else if (url.includes("/welcome")) {
  // Click through to the right page
  await page.getByText("Go to settings").click();
}
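Substring checks such as url.includes("/login") also match when the text appears in a query string, for example /app/home?from=/login. A slightly stricter variant compares only the path. The sketch below assumes the three redirect targets named above; classifyLanding and its return labels are hypothetical.

```javascript
// Hypothetical helper: decide where a navigation actually landed,
// looking only at the URL path and ignoring query string and fragment.
function classifyLanding(currentUrl, expectedPath) {
  const { pathname } = new URL(currentUrl);
  if (pathname === expectedPath || pathname.startsWith(`${expectedPath}/`)) {
    return "ok";
  }
  if (pathname.startsWith("/login")) return "login";
  if (pathname.startsWith("/welcome") || pathname.startsWith("/customize")) {
    return "onboarding";
  }
  return "unknown";
}
```

After page.goto(...), a script could switch on classifyLanding(page.url(), "/app/settings") and treat "unknown" as a failure worth a screenshot rather than continuing blindly.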

Problem: You Can't Tell Why the Script Failed

The automation broke somewhere. Without seeing what the browser was showing, debugging is guesswork.

Solution: screenshot the page on every unhandled error:

javascript
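try {
  await doWork(page);
} catch (err) {
  // The screenshot can fail too (e.g. page already closed); don't mask err.
  await page.screenshot({ path: "error.png" }).catch(() => {});
  console.error("Failed:", err.message);
  console.error("Screenshot: error.png");
  // Set exitCode instead of calling process.exit(1), which would terminate
  // immediately and skip the finally block, leaving the browser open.
  process.exitCode = 1;
} finally {
  await browser.close();
}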

Save to a known path and log the location so the human (or agent) can inspect the browser state at the moment of failure.
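A fixed filename is overwritten on every failure. If you want to keep a history, one option is a timestamped path plus the URL the page was on. This is a sketch only: captureFailure is a hypothetical name, and it assumes the target directory already exists.

```javascript
// Hypothetical helper: save a timestamped full-page screenshot and log
// where the page was when the error occurred. Returns the path, or null
// if the screenshot itself could not be taken.
async function captureFailure(page, err, dir = ".") {
  // ISO timestamps contain ":" and ".", which are awkward in filenames.
  const stamp = new Date().toISOString().replace(/[:.]/g, "-");
  const path = `${dir}/error-${stamp}.png`;
  const saved = await page
    .screenshot({ path, fullPage: true })
    .then(() => true, () => false);
  console.error(`Failed at ${page.url()}: ${err.message}`);
  console.error(saved ? `Screenshot: ${path}` : "Screenshot could not be captured");
  return saved ? path : null;
}
```

Inside a catch block this would be called as await captureFailure(page, err) before setting the exit code.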