Carousel Studio

Help

What this app does

Carousel Studio turns a 30–90 second voice memo into a publish-ready Instagram carousel post (1080×1350) on-brand with your palette, fonts, and image style. You dictate the idea, the app cleans it into a brief, then generates the slides, and you tune every detail directly in the preview.

The full loop is: Record → transcript → brief → carousel → refine → export ZIP.

Getting started in 60 seconds

  1. Create a palette from a URL (we'll pull the brand colors from the site) or build one by hand.
  2. Create a brand: pick the palette, pick body + mono fonts, write a one-line voice tone, optionally upload a logo, optionally describe the image style for AI generation.
  3. From Projects, click New project, give it a name, pick the brand.
  4. In the studio, hit ● Record, talk for 30–90 seconds, click Transcribe, then Extract brief, then Regenerate from scratch.
  5. Edit slides directly in the preview. Use Apply refinement to change specific things in plain English. Click Download ZIP ↓ when you're done.

Brands

A brand bundles everything that should stay constant across carousels: colors (via a palette), fonts, voice tone, optional logo, and optional image style prompt. Multiple projects share one brand.

Fields

  • Brand palette — the single source of color for slides. The studio derives all high-contrast scheme combinations from this palette automatically.
  • Body font / Mono font — picked from a curated list of Google Fonts. The body font drives headlines; mono drives kickers and footers. The picker previews each font in its actual typeface.
  • Brand voice — a sentence describing tone. Injected verbatim into Claude's system prompt so generated slides sound like you. Example: "warm and conversational, no jargon, no exclamation points."
  • Image style guidelines — prepended to AI image prompts on individual slides so all generated images stay consistent. Example: "editorial film photography, warm natural light, muted earth tones, shallow depth of field, no people in frame, no text."
  • Logo — uploaded image. Available as a top-right corner overlay per slide.

Sample preview

At the bottom of the brand editor you'll see two live sample slides for "lower back pain" that update as you change palette, fonts, voice, or logo. The right one can be filled with an AI image to show how your brand's image style looks. The samples are a reference, not a save — your real edits happen in the form fields above.

The brand voice is the single most important brand field. A specific voice ("sharp, contrarian, no platitudes") produces dramatically better slides than a vague one ("professional").

Palettes

A palette is a named set of brand colors. From a palette the studio computes all the high-contrast color schemes available for slides.

Three ways to make one

  • Extract from URL. Paste any URL. We scrape the page's CSS, theme-color meta tag, and OG image; rank colors by frequency and selector prominence; return the top 6–8 swatches. Click "Save as palette" to keep them.
  • Build from a curated scheme. Pick one of the 8 defaults as a starting point, then tweak.
  • Build from scratch. Three colors (background, foreground, accent) plus as many Extra colors as you want. More swatches = more high-contrast combinations available in the studio.

Roles

bg fills the slide. fg is for body/secondary text. accent is for the kicker underline, decoration strokes, and is the default headline color. Don't worry about getting the exact pairing right — the studio derives all high-contrast pairs from the swatches.

The studio (project editor)

The studio is the page at /app/projects/<id>. Left rail holds project-level inputs (record, brief, save status). The main column has:

  • Action bar — brand select, scheme select (with swatches), custom palette override, final-slide CTA, Regenerate from scratch, Download ZIP, Delete project.
  • Refinement panel — type a plain-English change; preserves everything you don't mention.
  • Seamless backdrop card — one AI image stretched across N slides.
  • Slide grid — 3 across on desktop, 2 / 1 on smaller screens. Each slide is the actual 1080×1350 canvas scaled to fit, fully editable.
  • Caption card — Claude-written Instagram caption with hashtags.

Voice → brief → carousel

  1. Hit ● Record in the left rail. Live level meter appears. Talk for 30–90 seconds about the idea — the hook, the through-line, 3–5 supporting points, the payoff.
  2. Stop, then Transcribe. We send the audio to Whisper and drop the result in the Raw transcript box. You can edit it.
  3. Hit Extract brief. Claude reads the transcript, drops ums and restarts, and fills in Hook / Points / Payoff fields.
  4. Hit Regenerate from scratch. Claude designs the carousel using the brief, your brand voice, and the active color scheme.

If a generation feels off, edit the brief fields directly and regenerate. The brief is the single most important input — the slides are downstream of it.

Brief from a URL (PubMed, articles)

Paste a link into the URL field under the brief textareas and hit Brief from URL. The server fetches the page, extracts the readable content, and asks Claude to summarize it into the same Hook / Points / Payoff shape as a voice memo.

  • PubMed (pubmed.ncbi.nlm.nih.gov/<PMID>): we fetch the abstract via NCBI's official E-utilities API — title, authors, journal, and the full abstract land cleanly. Claude leads the brief with the headline finding and anchors the points in methods + results.
  • PMC (ncbi.nlm.nih.gov/pmc/articles/PMC<id>/): we scrape the article body. Works for full-text open-access papers.
  • Any other URL: we fetch the HTML, strip scripts / nav / footer, prefer <article> or <main> when present, and pass the readable text to Claude.
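
For the PubMed case, here is a minimal sketch of how a link could be turned into an E-utilities request. The helper names are hypothetical, and the choice of efetch with rettype=abstract and retmode=xml is an assumption; this page only says the abstract comes from NCBI's E-utilities API.

```typescript
// Hypothetical helpers; the app's real fetch code is not shown in this help page.
const EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi";

// Pull the numeric PMID out of a pubmed.ncbi.nlm.nih.gov/<PMID> link.
// Returns null for anything that is not a PubMed article link.
function extractPmid(link: string): string | null {
  const match = link.match(/^https?:\/\/pubmed\.ncbi\.nlm\.nih\.gov\/(\d+)\/?(?:[?#].*)?$/);
  return match?.[1] ?? null;
}

// Build an efetch URL for the abstract record (parameter choices are assumed).
function efetchAbstractUrl(pmid: string): string {
  return `${EFETCH}?db=pubmed&id=${encodeURIComponent(pmid)}&rettype=abstract&retmode=xml`;
}
```

Links that don't match fall through to the PMC or generic HTML path described above.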

Quantitative results (sample size, effect sizes, %, p-values) are kept verbatim — Claude is instructed not to paraphrase numbers. Paywalled or thin pages will produce a sparse brief and a note in the project; in that case paste what you can grab manually into the Raw transcript box and use Extract brief instead.

After extraction, double-check the brief fields — especially numbers and author names — before regenerating. Source citations belong in the brief notes (and on individual slides via the Cite8 fact-check pass), not in the hook itself.

Refining a carousel

The Apply refinement button changes only what you describe and preserves everything else. Examples that work:

  • change the kicker on slide 2 to THE FIX
  • shorten the body on slide 4
  • replace the footer on slide 3 with the source name
  • swap slides 3 and 4
  • make slide 5 sharper

Positional vocabulary: "top" or "above" means the kicker. "Middle" or "headline" means the body. "Bottom", "below", or "footer" means the footer. "Slide 3" or "third slide" addresses by 1-based index.

Editing slide text directly

Every text field on a slide is directly editable. Click any kicker, body, or footer to type; changes autosave 500 ms after the last keystroke. Empty fields show italicized placeholder text so you know where the slot is.

The same applies to the hero slide's CTA and brand line.

Inline emphasis (markdown shorthand)

Wrap text in these markers to style the words inside. They work on standard, tweet, and tip-step text fields. While the field is focused you see the raw markers; on blur the field re-renders with the styling applied so you can confirm before exporting.

  • **word** — accent color + bold (uses the slide's --slide-accent).
  • __word__ — bold (no color change).
  • _word_ — underline.
  • ~~word~~ — strikethrough.
  • `word` — inline monospace with a subtle background chip.

Markers don't span newlines, and longer markers match first (so __foo__ renders as bold rather than as two _ underline markers, and **foo** doesn't get parsed as two * opens).
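
The matching rules can be sketched roughly like this. It is an illustration only: the output tags, the accent class name, and the function names are assumptions, not the app's actual markup.

```typescript
// Hypothetical renderer. Rules run in order, so the two-character markers
// (**, __, ~~) are consumed before the single-character _ rule sees the text.
// Every pattern excludes \n, so a marker pair never spans a line break.
const RULES: Array<[RegExp, string]> = [
  [/\*\*([^\n*]+)\*\*/g, '<strong class="accent">$1</strong>'],
  [/__([^\n_]+)__/g, "<strong>$1</strong>"],
  [/~~([^\n~]+)~~/g, "<s>$1</s>"],
  [/`([^\n`]+)`/g, "<code>$1</code>"],
  [/_([^\n_]+)_/g, "<u>$1</u>"],
];

// Escape HTML first so typed angle brackets are shown literally.
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function renderInline(text: string): string {
  let html = escapeHtml(text);
  for (const [pattern, replacement] of RULES) {
    html = html.replace(pattern, replacement);
  }
  return html;
}
```

A simple pass like this would also underline the middle of snake_case_words, so a real implementation likely needs word-boundary checks.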

Slide templates (kinds)

Each slide can be rendered as a different kind of template. Switch kinds using the kind picker at the top of every slide's controls. Your text edits are kept in place where fields overlap between kinds.

Standard

The default. Big editorial body headline, optional all-caps kicker, footer line, decoration glyph, optional logo overlay, optional AI background image. The body auto-shrinks (88 / 68 / 52 px) based on length.

Tweet

An X/Twitter-style content slide. Renders an author header at the top (avatar, display name, verified blue checkmark, @handle), a bold headline, then a multi-paragraph body in a soft sans-serif. Great for "myth busting" series where the headline is the quoted claim and the paragraphs are the rebuttal.

  • Author identity is set on the brand, not the slide — open /app/brands/[id] and fill in the Author block (avatar upload, display name, handle, verified checkbox). Every tweet slide in every project for that brand picks it up automatically.
  • Headline is the slide's existing kicker field, rendered as the bold first line. Type the claim (e.g. #1 "Dairy causes massive inflammation.") directly on the slide.
  • Paragraphs live in body, separated by blank lines. Edits go back to body as one string with \n\n between paragraphs so fact-check and autosave keep working.
  • Background images, the logo overlay, fact-check dots, citations, reordering, and ZIP export all work just like the standard kind.

Want auto-numbering across a tweet series? Type the #1, #2 prefix into the headline yourself for now — auto-numbering across consecutive tweet slides is on the wave-2 list.

Tip step

A dark "numbered tip" slide. Renders a thin accent stripe down the left edge, a heavy N. step prefix + heading, a body paragraph, and an accent-colored takeaway line below — plus a corner swipe arrow.

  • Step numbers are automatic — derived from the slide's position among tip-step slides. Reorder and the numbers renumber.
  • Three text fields: heading (title), body (the explanation paragraph — used for fact-check), and callout (the colored takeaway line).
  • The controls panel adds four color pickers (BG / Stripe / Step / Callout) with × resets that fall back to the scheme.
  • Inline accent. Wrap words in **double asterisks** inside the heading, body, or callout to render them in the slide's accent color (and weight). While you're typing the field switches to raw text so you can see the markers; on blur it re-renders with the accent applied.
  • Font sizes use Hd / Bd / Ca inputs that map to the same kickerSize / bodySize / footerSize fields the other kinds use — so the → all bulk button propagates across kinds.
  • Background image, fact-check, citations, decoration, reorder, logo overlay, and ZIP export all behave the same way as on the other kinds.
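
The automatic step numbering can be pictured with a small sketch. The Slide shape and the function name are hypothetical; the only behavior taken from this page is that a step's number comes from its position among tip-step slides.

```typescript
// Hypothetical slide shape; "tip-step" is assumed to be the stored kind name.
interface Slide {
  kind: string;
  title: string;
}

// Number each tip-step slide by its position among tip-step slides only.
// Slides of any other kind get null and don't consume a number.
function tipStepNumbers(slides: Slide[]): Array<number | null> {
  let step = 0;
  return slides.map((slide) => (slide.kind === "tip-step" ? ++step : null));
}
```

Because the numbers are recomputed from the current order, moving a slide up or down renumbers the series without any stored counter.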

Editorial

An essay-style content slide. Renders a heavy navy heading, a multi-paragraph soft sans body, and a persistent brand chrome bar pinned along the bottom (small wordmark on the left, brand name + author display name in mono on the right).

  • Two text zones: heading (the section title) and body (paragraphs separated by blank lines, just like the tweet kind). Both support the inline markdown markers (**accent**, __bold__, _underline_, ~~strike~~, `code`).
  • Brand chrome bar at the bottom pulls its content from the brand: wordmark = logoUrl, line 1 = name, line 2 = author.displayName. Set them once per brand and every editorial slide picks them up.
  • BG color picker lets you override the scheme on a per-slide basis (pastel mint, soft cream, navy, whatever fits the topic). × resets to the scheme background.
  • Font sizes use Hd / Bd mapped to the shared kickerSize / bodySize fields so the → all bulk button keeps working across kinds.

Tip cover

The matching cover for a tip-step series. Same chrome as the step slides (left accent stripe + corner swipe arrow + per-slide colors) but the body region is a single oversize chunky display headline + a softer subtitle, centered. Use as slide 1 of a series; flip the rest to Tip step.

Editorial cover

The matching cover for an editorial series. Same bottom brand chrome bar editorial-body uses, but the body region is an oversize all-caps headline + a smaller sub-question / sub-line. Background follows the project scheme; the per-slide BG picker overrides.

Both covers expose Hd / Sub font-size inputs that map to kickerSize / bodySize — the → all bulk button still propagates across kinds.

Bar chart

A data visualization slide. Renders a bold chart title, an optional subtitle or unit label, horizontal bars scaled to the largest value, and an optional italic source / summary note at the bottom. Great for comparison stats, survey results, or before-and-after numbers.

  • Bars editor — the controls panel shows a row per bar (label text input, numeric value, color swatch). Click + Add bar to insert a new row, or use a row's remove control to delete it. Bar widths update live as you type values.
  • Unit suffix — a small text field (e.g. % or pts) that is appended to every value label on the slide. Leave blank for plain numbers.
  • Bar colors each default to the scheme accent; click the swatch to override per-bar.
  • BG color picker per slide overrides the scheme background (same as the other kinds).
  • The chart title, subtitle, and note fields all support the inline markdown conventions (**accent**, __bold__, etc.).
  • Citations, fact-check dots, reorder buttons, and ZIP export work the same as every other kind.
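
Scaling bars to the largest value comes down to a little arithmetic, sketched here. The function names and the handling of negative or all-zero values are assumptions for illustration.

```typescript
interface Bar {
  label: string;
  value: number;
}

// Width of each bar as a percentage of the widest bar.
// Assumption: negative values are clamped to 0, and an all-zero chart has 0% bars.
function barWidths(bars: Bar[]): number[] {
  const max = Math.max(0, ...bars.map((b) => b.value));
  return bars.map((b) => (max > 0 ? (Math.max(0, b.value) / max) * 100 : 0));
}

// The unit suffix is appended to every value label; blank means plain numbers.
function valueLabel(value: number, unit: string): string {
  return `${value}${unit}`;
}
```

With values 50, 25, and 100, the bars render at 50%, 25%, and 100% of the available width, so the largest value always fills the row.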

Compare table

Two-column side-by-side comparison (e.g. "old way" vs "new way"). Edit the left/right column labels and pick a color for each header (defaults to green / salmon). Add as many aspect rows as you need — each row shows a row label plus a paragraph for the left side and a paragraph for the right side.

Data table

Generic N-column data table. Edit headers as a comma-separated list ("Row, Col A, Col B, Col C"). Each row has a label, an optional row-label color (defaults to scheme accent), and the remaining cell values entered pipe-separated ("12 | 34 | 56"). Headers render bold on the scheme-foreground background.
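
As a rough sketch of how those two inputs might be parsed: the function names are hypothetical, and padding or truncating a row to fit the header count is an assumption rather than documented behavior.

```typescript
interface TableRow {
  label: string;
  cells: string[];
}

// "Row, Col A, Col B, Col C" -> ["Row", "Col A", "Col B", "Col C"]
function parseHeaders(input: string): string[] {
  return input.split(",").map((h) => h.trim()).filter((h) => h.length > 0);
}

// "12 | 34 | 56" -> ["12", "34", "56"]
function parseCells(input: string): string[] {
  return input.split("|").map((c) => c.trim());
}

// The first header labels the row-label column, so a row carries
// columnCount - 1 value cells. Short rows are padded with empty cells
// and long rows are truncated (assumed behavior).
function parseRow(label: string, cellInput: string, columnCount: number): TableRow {
  const cells = parseCells(cellInput).slice(0, columnCount - 1);
  while (cells.length < columnCount - 1) cells.push("");
  return { label, cells };
}
```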

Glossary

Left-column terms (rendered in the accent color) paired with right-column definitions. Great for jargon-busting carousels.

Term card

A bold two-tone headline (wrap the highlighted phrase in *asterisks* to color it with the scheme accent), followed by chip tags and a 2×2 grid of icon cards. Each card has an icon (emoji or character), title, and short body.

Flowchart

Sequential boxes connected by arrows. Pick vertical (↓) or horizontal (→) layout. Each node has a label and an optional color override.

Pyramid

Stacked-tier diagram from narrow top to wide base (or invert it). Each tier has a label, an optional short note rendered below the label in smaller text, and a color.

Timeline

Vertical timeline with circular markers connected by an accent line. Each event has a short marker (number, year, or letter), a title, and an optional body paragraph.

Stat callout

One oversize stat slide. A giant value (e.g. "78%") in the accent color, a one-line label below it, and an optional smaller context paragraph. Best for slides that punch you in the face with a single number.

Quote

Pull-quote slide. Big serif italic quote text, attribution name, and optional role / affiliation line. A decorative open-quote mark anchors the layout.

Checklist

Title + a list of checked / unchecked items. Toggle the per-item checkbox in the controls panel; checked items get a strikethrough and a filled accent box, unchecked items show an empty box.

2×2 matrix

Quadrant chart with editable X and Y axis labels and four cells (top-left, top-right, bottom-left, bottom-right). Each cell has a title (in accent color) and an optional body line. Useful for prioritization frameworks (e.g. Eisenhower, ICE).

Per-slide size + accent

Every structured kind (bar-chart, glossary, term-card, timeline, stat-callout, etc.) has two extra controls in its panel:

  • Size slider — scales every text element on that slide between 70% and 150%. Defaults to 100%. The slider updates the slide live (no full re-render) so you don't lose focus while dragging.
  • Accent color — overrides the slide's accent color (used for term highlights, chart bars, timeline markers, quote marks, chip backgrounds, etc.) without changing the scheme. × resets to the scheme accent.

AI picks the kind

When you Regenerate from scratch or Apply refinement, Claude is now free to pick a different kind per slide when it genuinely fits the content — bar-chart when the brief has numbers, stat-callout when one number is the whole point, quote when the brief includes an attributed quotation, timeline for chronological events, etc.

Decision rules in the system prompt prevent fabrication: Claude won't invent stats, bars, or quotes just to fill a richer layout — slides without supporting data stay as the standard headline format.

Reordering, adding, deleting slides

Each slide's controls include ↑ Move up / ↓ Move down / ✕ Delete buttons. Use them to rearrange or remove slides after generation. The hero always stays last and isn't reorderable or deletable.

At the end of the slide grid (before the hero) there's a + Add slide tile — click it to insert a fresh standard slide you can then edit, restyle, or change the kind of.

Fact-check dots travel with the slide on reorder and are dropped when the slide is deleted. If a seamless backdrop range would split, get truncated, or empty out, it adjusts automatically (and is dropped when emptied).

Instagram preview

The Instagram preview card above the slide grid shows your carousel exactly as a viewer would scroll through it on Instagram — including the hero CTA at the end. Swipe on touch, or use the ‹ / › buttons and dot indicators to step through each slide.

Edits in the slide grid refresh the preview in real time, so it's a fast way to gut-check pacing and reading flow before you export.

Per-element font sizes

Each slide's controls include three small number inputs labeled K / B / F: kicker px / body px / footer px. Leave them empty to use the auto-sizing rules (88 / 68 / 52 px for the body depending on length; 28 for the kicker; 22 for the footer). Set a number to override that single element on that single slide.

If you change body text and the auto-shrink picks a size you don't like, override the body field with your preferred px value. Clear it to return to auto.
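
The interaction between auto-sizing and overrides can be sketched as below. The pixel values come from this page; the length cutoffs SHORT_MAX and MEDIUM_MAX are made-up placeholders, since the real thresholds aren't documented here.

```typescript
// Made-up cutoffs for illustration only; the real length thresholds are not documented.
const SHORT_MAX = 60;
const MEDIUM_MAX = 140;

// Auto-shrink: shorter body text gets the larger size.
function autoBodySize(body: string): number {
  if (body.length <= SHORT_MAX) return 88;
  if (body.length <= MEDIUM_MAX) return 68;
  return 52;
}

interface SizeOverrides {
  kickerSize?: number;
  bodySize?: number;
  footerSize?: number;
}

// An empty (undefined) override falls back to the auto rule for that element.
function resolveSizes(body: string, overrides: SizeOverrides = {}): { kicker: number; body: number; footer: number } {
  return {
    kicker: overrides.kickerSize ?? 28,
    body: overrides.bodySize ?? autoBodySize(body),
    footer: overrides.footerSize ?? 22,
  };
}
```

Each element resolves independently, so overriding the body size leaves the kicker and footer on their defaults.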

Decorations

The icon grid below each slide picks a small graphic that sits above the footer: arrows, dots, underlines, caret, star, plus, cross, check, zigzag. Plus a none option.

Below the picker, a color input lets you override the decoration's color per slide. Click Use scheme accent to clear the override.

AI background images per slide

Click + Generate AI image on any slide. You'll be asked for a theme (atmospheric / nature / texture / light-shadow / custom) and a prompt. We always append "No text in the image" so the AI doesn't hallucinate captions.

The slide-specific prompt drives the subject; your brand's image style is appended as a stylistic modifier so all slides share a look. The result becomes a full-bleed background.

Opacity slider

Each image-backed slide gets an Image opacity 0–100% slider in its controls. Drop it below 40% if you want the image to be a subtle texture rather than the focus. Text keeps the scheme colors below 40% and flips to white at 40% and above for legibility.
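
In code terms the threshold rule is a one-liner. The names below are hypothetical, and using pure white (#ffffff) is an assumption about the exact color.

```typescript
const WHITE_TEXT_THRESHOLD = 40; // percent, per the rule described above

// imageOpacity is null when the slide has no background image.
function textColor(schemeColor: string, imageOpacity: number | null): string {
  if (imageOpacity === null) return schemeColor;
  return imageOpacity >= WHITE_TEXT_THRESHOLD ? "#ffffff" : schemeColor;
}
```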

Image gen takes 10–30 seconds depending on prompt complexity. Don't refresh; the request is in flight.

Seamless backdrop across slides

One AI image stretched across a contiguous slide range so a reader swiping sees a continuous backdrop. Set From slide and to, write a prompt, click Generate backdrop.

How it works

The image is generated at 1536×1024 (gpt-image-2's native landscape size), then stretched via CSS background-size and per-slide background-position. A 4-slide backdrop spans 4320 px, about 2.8× the source's 1536 px width; a 5-slide one spans 5400 px, about 3.5×. The stretch factor is shown in the status line after generation.
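
The stretch figures follow from the slide and source widths. Here is a sketch of that arithmetic, plus one way the per-slide CSS could be computed; the function names and the exact CSS values are assumptions.

```typescript
const SLIDE_WIDTH = 1080; // px, one carousel slide
const SOURCE_WIDTH = 1536; // px, the generated landscape image

// How much wider the full backdrop span is than the source image.
function stretchFactor(slideCount: number): number {
  return (slideCount * SLIDE_WIDTH) / SOURCE_WIDTH;
}

// CSS for the slide at 0-based `index` in a backdrop spanning `slideCount` slides.
// The image is sized to slideCount x the slide width; a percentage
// background-position of index / (slideCount - 1) then selects that slide's slice.
function backdropCss(index: number, slideCount: number): { backgroundSize: string; backgroundPosition: string } {
  const x = slideCount > 1 ? (index / (slideCount - 1)) * 100 : 0;
  return {
    backgroundSize: `${slideCount * 100}% 100%`,
    backgroundPosition: `${x}% 0%`,
  };
}
```

For four slides that gives 4 × 1080 / 1536 ≈ 2.81, and for five slides 5 × 1080 / 1536 ≈ 3.52.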

Best with

Atmospheric / textural / gradient prompts where the image has no central focal point. Detailed scenes (faces, objects) don't stretch cleanly.

Backdrop opacity

Drag the Backdrop opacity slider 0–100%. All backdrop slides update at once in place. Same 40% threshold flips text white.

Overriding individual slides

A per-slide + Generate AI image on a slide inside the backdrop range replaces the backdrop on that one slide only. The rest of the backdrop continues uninterrupted.

Backdrop prompts ignore the brand's image style on purpose — you usually want a different look for a sweeping backdrop than for a subject-focused slide image.

The hero / CTA slide

The carousel ends with a hero slide added by the system — not generated by Claude. Top 62% is the brand's logo or an AI-generated hero image. Bottom 38% is a CTA strip with a thick accent-color top border, the CTA headline (editable), and the brand line (editable).

Hero controls include K/B-style size inputs for the CTA and brand line, plus + Generate hero image / Clear image. The CTA dropdown in the action bar sets the default text; click into the hero to edit it directly per project.

Fact-checking with Cite8

Cite8 is a clinical RAG service that verifies claims against peer-reviewed PubMed literature and links each supporting/contradicting source. The studio integrates it via the Fact-check claims button in the refinement bar.

How it works

  • Each slide's body text is sent to Cite8's /api/v1/verify-claim endpoint.
  • Cite8 returns peer-reviewed citations with one of: supports / partial / contradicts / unrelated / unknown.
  • The studio rolls those up into a single per-slide indicator: green (well-supported), yellow (partial / no clear source / needs sourcing), red (contradicted by literature).
  • The dot lives on each slide's number badge. Click a dot to see the citations and PubMed links.
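
One plausible way to roll the verdicts up into a dot is sketched below. The precedence (any contradiction wins, then any support, otherwise yellow) is an assumption; this page doesn't spell out how mixed results are weighed.

```typescript
type Verdict = "supports" | "partial" | "contradicts" | "unrelated" | "unknown";
type Indicator = "green" | "yellow" | "red";

// Assumed precedence: red beats green, and anything else is yellow,
// including an empty result ("no relevant sources found").
function rollUp(verdicts: Verdict[]): Indicator {
  if (verdicts.some((v) => v === "contradicts")) return "red";
  if (verdicts.some((v) => v === "supports")) return "green";
  return "yellow";
}
```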

When to use it

Whenever a slide makes a factual claim — especially medical, health, or science adjacent. Cite8 is PubMed-backed, so it's strongest on biomedical content. Non-medical claims will mostly come back yellow ("no relevant sources found"); that's expected.

When Cite8 itself is unavailable

If the Cite8 service is down or unreachable, every slide's verification call will fail. Rather than silently marking every slide yellow, the studio detects the total outage and pops a dedicated modal:

  • "Cite8 is unavailable right now. We couldn't verify any of the slides' claims."
  • The Cite8 error text is shown verbatim for diagnostic context.
  • Override all (publish without verification) — marks every slide overridden so the modal won't reappear on subsequent generations. Use this only when you've decided to ship without the fact-check.
  • Close the modal to wait and try the Fact-check claims button again later.

Citing studies on the slide itself

When Cite8 returns a supports verification for a slide, the studio auto-populates a small citation line beneath the slide's footer — a CITE8 chip plus the PubMed ID and publication year of the strongest supporting study. The line is editable like any other slide text; editing it locks the slide so future fact-check runs won't overwrite your wording. Clear the text and the citation line disappears from the rendered slide.

Auto-check after generation

Every time a carousel is freshly generated or refined, the studio runs the fact-check automatically. If any slide comes back red (contradicted by literature), a modal pops up listing those slides with the contradicting claim, the reasoning, and links to the PubMed sources. For each red slide you can:

  • Override (keep claim) — accept the slide as-is. The slide is marked overridden so it won't trigger the modal again.
  • Edit slide — closes the modal and scrolls the slide into view with the body field focused, so you can soften or rephrase.
  • Override all — clears every red flag in one click. Use this when you've decided the topic is controversial-but-defensible and you've reviewed each one.

Configuration

Set two Worker env vars: CITE8_BASE_URL (e.g. your Railway deployment URL) and CITE8_API_KEY (Bearer token). Without them, the button still works but returns "unconfigured" status for every slide.
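
To show how those two variables fit together, here is a hedged sketch that prepares a verify-claim request without sending it. The POST method, the JSON body shape ({ claim }), and the helper name are assumptions; only the endpoint path, the env var names, and the Bearer scheme come from this page.

```typescript
interface Cite8Env {
  CITE8_BASE_URL?: string;
  CITE8_API_KEY?: string;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Returns "unconfigured" when either variable is missing, mirroring the
// status described above. The request body shape is a guess.
function prepareVerifyRequest(env: Cite8Env, claim: string): PreparedRequest | "unconfigured" {
  if (!env.CITE8_BASE_URL || !env.CITE8_API_KEY) return "unconfigured";
  const base = env.CITE8_BASE_URL.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/api/v1/verify-claim`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${env.CITE8_API_KEY}`,
      },
      body: JSON.stringify({ claim }),
    },
  };
}
```

The returned url and init could be passed straight to fetch(url, init) inside the Worker.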

A red light isn't a final verdict — Cite8 returns the strongest matching evidence, which may not be the most recent. Click through to the linked papers to make the call yourself.

Caption generation

Below the slide grid, the Caption card has a Generate button. We pass your brief, slides, hero CTA, and brand voice to Claude and ask for an Instagram caption that:

  • Hooks in the first line (before the 125-char "...more" cutoff).
  • Delivers value beyond what the slides show.
  • Ends with one clear CTA aligned to the hero slide.
  • Includes up to 5 niche-relevant hashtags.

The caption autosaves. Copy sends it to the clipboard ready to paste into Instagram.

Exporting + saving to camera roll

Download ZIP ↓ renders each slide as a 1080×1350 PNG and bundles them into a zip named <slug>-<scheme>-<date>.zip with files like 01-hook.png, 02-slide-2.png, …, N-hero.png.
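
The naming pattern can be sketched like this. The two-digit zero padding of every index and the YYYY-MM-DD date format are assumptions inferred from the examples, and the function names are hypothetical.

```typescript
// File names for a carousel of `totalSlides` slides, hero included.
// The first slide is the hook, the last is the hero, the rest are slide-N.
function slideFileNames(totalSlides: number): string[] {
  return Array.from({ length: totalSlides }, (_, i) => {
    const n = i + 1;
    const prefix = String(n).padStart(2, "0");
    const label = n === 1 ? "hook" : n === totalSlides ? "hero" : `slide-${n}`;
    return `${prefix}-${label}.png`;
  });
}

// <slug>-<scheme>-<date>.zip, using the UTC date in YYYY-MM-DD form (assumed format).
function zipName(slug: string, scheme: string, date: Date): string {
  return `${slug}-${scheme}-${date.toISOString().slice(0, 10)}.zip`;
}
```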

On mobile, every slide additionally shows a 📲 Save to Photos button. We render that slide and trigger the system share sheet so iOS surfaces "Save Image" and Android shows the system save sheet.

How color schemes work

Schemes are bg / fg / accent triples. The studio picks one of two sources:

  • Curated schemes — 8 hand-picked combinations (cream + burgundy, white + black, etc.). Shown when no palette is active.
  • Palette-derived schemes — when a palette is active (either via Custom palette override or inherited from the brand), the studio computes every (bg, fg) pair from the palette swatches that passes WCAG AA contrast (≥4.5:1), sorts by contrast descending, caps at 16. Accent is auto-picked as a third swatch with ≥3:1 contrast against the background.
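
Palette-derived schemes can be sketched as follows, using the WCAG 2.x relative luminance and contrast ratio formulas. The function names, the 6-digit hex assumption, picking the first qualifying swatch as accent, and returning null when none qualifies are all illustrative assumptions.

```typescript
// sRGB channel (0-255) to linear value, per the WCAG 2.x definition.
function channel(c8: number): number {
  const c = c8 / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a 6-digit hex color such as "#1a2b3c".
function luminance(hex: string): number {
  const n = parseInt(hex.replace("#", ""), 16);
  const r = (n >> 16) & 255;
  const g = (n >> 8) & 255;
  const b = n & 255;
  return 0.2126 * channel(r) + 0.7152 * channel(g) + 0.0722 * channel(b);
}

// WCAG contrast ratio, from 1 (identical) to 21 (black on white).
function contrast(a: string, b: string): number {
  const la = luminance(a);
  const lb = luminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

interface Scheme {
  bg: string;
  fg: string;
  accent: string | null;
  ratio: number;
}

// Every ordered (bg, fg) pair at >= 4.5:1, sorted by contrast, capped at 16.
function deriveSchemes(swatches: string[]): Scheme[] {
  const schemes: Scheme[] = [];
  for (const bg of swatches) {
    for (const fg of swatches) {
      if (fg === bg) continue;
      const ratio = contrast(bg, fg);
      if (ratio < 4.5) continue;
      // Accent: a third swatch with >= 3:1 against the background (first match, assumed).
      const accent = swatches.find((s) => s !== bg && s !== fg && contrast(bg, s) >= 3) ?? null;
      schemes.push({ bg, fg, accent, ratio });
    }
  }
  return schemes.sort((a, b) => b.ratio - a.ratio).slice(0, 16);
}
```

This is also why adding Extra colors to a palette unlocks more schemes: each new swatch adds candidate pairs that may clear the 4.5:1 bar.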

The scheme picker shows actual color preview chips for every option, so you can choose by look not hex code.

Tips & gotchas

Tips

  • Brief specificity drives slide quality. If a generation feels generic, the brief was too vague. Edit the brief fields and regenerate.
  • Refinements are surgical. Use them for small targeted changes ("shorten slide 4"). Re-generate when you want a wholly new direction.
  • Backdrop + scheme contrast. If white text over backdrop feels harsh, lower the opacity to 30–35%; text flips back to scheme colors and the backdrop becomes a soft texture.
  • Capture a long memo, edit the transcript. Whisper is surprisingly forgiving. Just talk; clean up the transcript text before clicking Extract brief.

Gotchas

  • Image gen is per-call billing. Each AI image call costs a few cents. Don't spam regenerate.
  • The first generation is slowest. The Cloudflare Workers runtime cold-starts on the first call; subsequent calls are faster.
  • Autosaves commit to your data repo. Every edit produces a git commit. That's intentional (audit trail), but your repo's commit history will be noisy.