
Cloning Cal.com's UX in 10 Minutes with Chrome DevTools MCP + AI

Chrome DevTools MCP lets your AI agent see websites. Screenshot a reference, write a brief, dispatch a subagent — pixel-perfect implementation in minutes.

Ilya Gindin

My user told me, in plain Russian: “Фиолетовый цвет вообще не понятно. Убирай и переделывай.”

Purple makes no sense. Remove it and redo.

That’s feedback. Not vague — surgical. The profile page I’d built for MeetCal looked like every other AI-generated UI: dark background, arbitrary purple accent (#6366f1), cards with colored left borders. The kind of thing that happens when you say “build me a scheduling UI” without showing the AI anything real to work from.

I looked at Cal.com. Clean. Professional. Obviously the right reference point. But how do you actually get from “I like that design” to “it’s implemented” without spending an afternoon in Figma or pixel-peeping in DevTools yourself?

Here’s the workflow I landed on. It took about 10 minutes.


The Problem With “Make It Look Like Cal.com”

When you describe a design in words, you introduce a telephone game. “Light background, simple cards, clean typography” means something different to every developer — human or AI.

My first attempt at the profile page proved this. I told the AI to build a scheduling profile listing. It came back with:

  • Background: #111827 (dark navy)
  • Accent: #6366f1 (indigo/purple)
  • Cards with a colored left border (a common “design pattern” that isn’t actually from anywhere)
  • A letter avatar because I hadn’t wired up real image URLs yet

Technically correct. Visually wrong. The problem wasn’t the code — it was the specification. I hadn’t given the AI anything concrete to look at.


Discovery: Chrome DevTools MCP

Chrome DevTools MCP is a Model Context Protocol server maintained by the Chrome DevTools team at Google. It exposes browser automation tools directly to your AI agent: navigate, click, take screenshots, inspect the DOM, run scripts against a live page.

The key tools I used:

  • navigate_page — go to a URL
  • take_screenshot — capture what the browser sees, full-page
  • take_snapshot — dump the accessibility tree (page structure as your agent reads it)

This isn’t Puppeteer you’re writing by hand. It’s your AI agent controlling a real browser tab, seeing exactly what you see. The agent can look at cal.com/bailey, take a screenshot, examine the DOM structure, and then go implement something that actually resembles it.

No telephone game. No interpretation gap.
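If you want to try it, the server registers like any other MCP server. Client config formats vary, so here is only a sketch of the shape most MCP clients expect, written as a TypeScript object for illustration; the package name `chrome-devtools-mcp` is from the project's README, but check your client's docs for the actual config file and location:

```typescript
// Sketch of a typical mcpServers registration entry. The package name
// chrome-devtools-mcp comes from the project README; the surrounding
// shape is the common MCP client convention, not a guaranteed API.
const mcpConfig = {
  mcpServers: {
    "chrome-devtools": {
      command: "npx",
      args: ["chrome-devtools-mcp@latest"],
    },
  },
};
```

Once registered, the agent gets navigate_page, take_screenshot, and take_snapshot as ordinary tools it can call on its own.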


The Workflow

Here’s what I actually did, step by step.

Step 1: Open a reference page in Chrome.

I navigated to a Cal.com user profile — someone with multiple event types — so the agent would see the full layout: avatar area, name, bio, event cards listed below.

Step 2: Take a screenshot with Chrome DevTools MCP.

navigate_page → take_screenshot (full page)

The screenshot saves to a local path. That path becomes an artifact I can pass to subagents.
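In MCP terms, those are two tool calls. A sketch of the sequence (the parameter names are my assumption of what the agent passes; check the server's tool schema for the exact shape):

```typescript
// Illustrative tool-call sequence, not a real client API: each entry
// pairs an MCP tool name with the arguments an agent might pass.
type ToolCall = { tool: string; args: { url?: string; fullPage?: boolean } };

const calls: ToolCall[] = [
  { tool: "navigate_page", args: { url: "https://cal.com/bailey" } },
  { tool: "take_screenshot", args: { fullPage: true } },
];
```

The second call is what produces the local screenshot path you carry forward into the brief.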

Step 3: Write a visual brief.

Screenshot plus specific values. Not “make it look clean” — actual numbers:

Reference: [screenshot path]
- Background: #f5f5f5 (light gray)
- Cards: white, thin 1px border #e5e7eb, no colored accents
- Avatar: circular, sits above cover area
- Event duration: pill-shaped badge, gray (#f3f4f6), rounded-full
- Typography: dark (#111827) on light, no purple anywhere
- CSS variables: must fall back gracefully for Telegram Mini App
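A visual brief is just structured data plus a screenshot path. Here's a hypothetical sketch of assembling one programmatically — the `VisualBrief` type, `renderBrief` helper, and the screenshot path are my own illustration, not part of any tool:

```typescript
// Hypothetical brief builder: none of these names come from Chrome
// DevTools MCP or Claude Code; they just show the brief as data.
interface VisualBrief {
  screenshotPath: string;
  rules: string[];
}

function renderBrief(brief: VisualBrief): string {
  return [
    `Reference: ${brief.screenshotPath}`,
    ...brief.rules.map((r) => `- ${r}`),
  ].join("\n");
}

const briefText = renderBrief({
  screenshotPath: "/tmp/calcom-profile.png", // illustrative path
  rules: [
    "Background: #f5f5f5 (light gray)",
    "Cards: white, thin 1px border #e5e7eb, no colored accents",
    "Typography: dark (#111827) on light, no purple anywhere",
  ],
});
```

Whether you build it with a helper or type it by hand, the point is the same: every line is a concrete, checkable value.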

Step 4: Dispatch an autonomous-developer subagent.

I passed it:

  • The screenshot path
  • The visual brief above
  • The paths to the current component files

The agent came back in about two minutes with a complete rewrite. Not suggestions — working code.
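The dispatch itself is nothing more than bundling those three artifacts. As a sketch — the payload shape and field names here are my own illustration, not an actual Claude Code interface:

```typescript
// Hypothetical dispatch payload for a subagent; field names are
// illustrative, not a real API.
interface SubagentTask {
  screenshot: string; // path produced by take_screenshot
  brief: string;      // the visual brief text
  files: string[];    // component files the agent may rewrite
}

const task: SubagentTask = {
  screenshot: "/tmp/calcom-profile.png",            // illustrative
  brief: "Reference: /tmp/calcom-profile.png ...",  // the brief above
  files: ["src/pages/Profile.tsx", "src/pages/Profile.css"], // example paths
};
```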


Before and After

This is the concrete diff.

Before:

background: #111827;          /* dark navy */
accent: #6366f1;              /* purple */
card border-left: 4px solid purple;
avatar: letter placeholder, purple background

After:

background: #f5f5f5;          /* light gray */
cards: white, border: 1px solid #e5e7eb
no left border accent
avatar: real Telegram profile photo, circular crop
duration: pill badge, background #f3f4f6
bio text: visible, charcoal

The layout structure also changed. Before: cards stacked with emphasis on the border accent. After: clean card list where the event title and duration pill do all the work — the same visual hierarchy Cal.com uses.

One detail worth noting: the CSS variables. Telegram Mini Apps inject their own CSS variables for theme colors (--tg-theme-bg-color, etc.). The rewrite used these as primary values with explicit hex fallbacks for browser previewing:

background-color: var(--tg-theme-bg-color, #f5f5f5);
color: var(--tg-theme-text-color, #111827);

That’s not something I’d have thought to specify in a verbal description. The agent saw the component context, understood it was a Telegram Mini App, and handled it.
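The pattern generalizes: any themed value becomes var(--platform-variable, #fallback). A tiny hypothetical helper (my own sketch, not a Telegram or React API) makes the fallback rule impossible to forget:

```typescript
// Hypothetical helper: builds a CSS var() expression with a hex
// fallback so the same component renders inside Telegram (where the
// --tg-theme-* variables are injected) and in a plain browser preview.
function tgVar(name: string, fallback: string): string {
  return `var(--tg-theme-${name}, ${fallback})`;
}

const styles = {
  backgroundColor: tgVar("bg-color", "#f5f5f5"),
  color: tgVar("text-color", "#111827"),
};
```

Put the rule in the brief once and every themed property the agent writes follows it.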


What Makes This Workflow Actually Powerful

Three things, in order of importance.

1. The AI can see the reference.

This is the whole thing. When you pass a screenshot, you eliminate the verbal telephone game. The agent sees the same page you do — the actual rendered output, not your description of it. Cal.com’s card padding, the way the avatar overlaps the cover area, the exact weight of the border — all of that is in the screenshot without you needing to articulate it.

take_snapshot goes further. It gives the agent the DOM tree and ARIA structure of the page. Useful if you want the agent to understand component hierarchy, not just visual output.

2. Visual brief beats verbal description.

The screenshot gets you 80%. The remaining 20% is specific values. “Light background” is ambiguous. #f5f5f5 is not. “Remove the purple” is a direction. “Neutral palette: gray and white only, badges use #f3f4f6” is a specification.

Invest five minutes writing this before you dispatch the agent. You’ll save 30 minutes of iteration.

3. Autonomous execution means no back-and-forth.

The subagent pattern here is: read the brief, read the existing code, rewrite, verify it compiles, done. No intermediate “does this look right?” No waiting for me to approve changes one at a time. It treats the brief as a contract and fulfills it.

The constraint is the brief quality. A good brief returns working code. A vague brief returns a guess.


The Tools

  • Chrome DevTools MCP (by Google) — browser automation as MCP tools. navigate_page, take_screenshot, and take_snapshot are the three you’ll use most; see the official documentation and Addy Osmani’s writeup for more
  • Claude Code + autonomous-developer agent — reads the visual brief, understands the existing component structure, rewrites CSS and JSX, verifies TypeScript compiles
  • Telegram Mini App CSS variables — if you’re building for Telegram, use var(--tg-theme-*) with hex fallbacks so the component works in both environments

Lessons

Screenshot first, describe second. The screenshot is your spec. Write the visual brief as an addendum, not a replacement.

Color is where “AI-generated” comes from. The dark backgrounds and arbitrary accent colors aren’t intentional design choices — they’re what an AI produces when it has no reference. Give it a reference and it stops guessing.

Include specific values in the brief. Hex codes, pixel measurements, border radii. The more precise the brief, the less interpretation the agent needs to do.

CSS variables + fallbacks = works everywhere. If you’re building for an embedded context like Telegram, write var(--platform-variable, #fallback) from the start. The agent will respect this if you put it in the brief.

The “10 minutes” is real. Screenshot: 1 minute. Brief: 3-4 minutes. Agent execution: 2 minutes. Review and minor tweaks: 3 minutes. Compare that to doing it by hand — opening DevTools, inspecting each element, copying values, rewriting CSS manually. Same output, different time cost.


The user approved the redesign without comment. No feedback meant the purple was gone and the thing looked professional. That’s the signal.

Chrome DevTools MCP is in its early stages — the toolset will expand. But even now, navigate + screenshot + brief + agent is a complete workflow for UI work. You’re not describing what you want. You’re showing it.
