Merge branch 'main' into pr/227

2024-06-27 14:44:24 +08:00 · 2024-06-27 14:44:24 +08:00 · 0200274e61
commit 0200274e61
parent 7c1fb7a54d 901ce332a5
81 changed files with 7026 additions and 814 deletions
--- a/.github/FUNDING.yml
+++ b/.github/FUNDING.yml
@ -0,0 +1 @@
+github: [abi]
--- a/.github/ISSUE_TEMPLATE/bug_report.md
+++ b/.github/ISSUE_TEMPLATE/bug_report.md
@ -0,0 +1,21 @@
+---
+name: Bug report
+about: Create a report to help us improve
+title: ''
+labels: ''
+assignees: ''
+
+---
+
+**Describe the bug**
+A clear and concise description of what the bug is.
+
+**To Reproduce**
+Steps to reproduce the behavior:
+1. Go to '...'
+2. Click on '....'
+3. Scroll down to '....'
+4. See error
+
+**Screenshots of backend AND frontend terminal logs**
+If applicable, add screenshots to help explain your problem.
--- a/.github/ISSUE_TEMPLATE/custom.md
+++ b/.github/ISSUE_TEMPLATE/custom.md
@ -0,0 +1,10 @@
+---
+name: Custom issue template
+about: Describe this issue template's purpose here.
+title: ''
+labels: ''
+assignees: ''
+
+---
+
+
--- a/.github/ISSUE_TEMPLATE/feature_request.md
+++ b/.github/ISSUE_TEMPLATE/feature_request.md
@ -0,0 +1,20 @@
+---
+name: Feature request
+about: Suggest an idea for this project
+title: ''
+labels: ''
+assignees: ''
+
+---
+
+**Is your feature request related to a problem? Please describe.**
+A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+
+**Describe the solution you'd like**
+A clear and concise description of what you want to happen.
+
+**Describe alternatives you've considered**
+A clear and concise description of any alternative solutions or features you've considered.
+
+**Additional context**
+Add any other context or screenshots about the feature request here.
--- a/.vscode/settings.json
+++ b/.vscode/settings.json
@ -1,3 +1,5 @@
 {
-  "python.analysis.typeCheckingMode": "strict"
+  "python.analysis.typeCheckingMode": "strict",
+  "python.analysis.extraPaths": ["./backend"],
+  "python.autoComplete.extraPaths": ["./backend"]
 }
--- a/Evaluation.md
+++ b/Evaluation.md
@ -0,0 +1,19 @@
+## Evaluating models and prompts
+
+The evaluation dataset consists of 16 screenshots. A Python script for running screenshot-to-code on the dataset and a UI for rating outputs are included. With this setup, we can compare and evaluate various models and prompts.
+
+### Running evals
+
+- Input screenshots should be located at `backend/evals_data/inputs` and the outputs will be written to `backend/evals_data/outputs`. If you want to change these locations, modify `EVALS_DIR` in `backend/evals/config.py`. You can download the input screenshot dataset here: TODO.
+- Set a stack and model (`STACK` var, `MODEL` var) in `backend/run_evals.py`
+- Run `OPENAI_API_KEY=sk-... python run_evals.py` - this runs the screenshot-to-code on the input dataset in parallel but it will still take a few minutes to complete.
+- Once the script is done, you can find the outputs in `backend/evals_data/outputs`.
+
+### Rating evals
+
+In order to view and rate the outputs, visit your front-end at `/evals`.
+
+- Rate each output on a scale of 1-4
+- You can also print the page as PDF to share your results with others.
+
+Generally, I run three tests for each model/prompt + stack combo and take the average score out of those tests to evaluate.
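
The averaging step described at the end of Evaluation.md can be sketched as follows. This is only an illustration: the ratings and the `run_ratings` structure are hypothetical and are not produced by `run_evals.py` or the `/evals` UI.

```python
from statistics import mean

# Hypothetical ratings (scale 1-4) for three runs of one model/prompt + stack combo.
run_ratings = {
    "run_1": [3, 4, 2, 3],
    "run_2": [4, 4, 3, 3],
    "run_3": [3, 3, 3, 4],
}

# Average within each run, then average the three run scores.
run_scores = {run: mean(ratings) for run, ratings in run_ratings.items()}
overall = mean(list(run_scores.values()))
print(run_scores, overall)
```
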
--- a/README.md
+++ b/README.md
@ -1,32 +1,53 @@
 # screenshot-to-code

-This simple app converts a screenshot to code (HTML/Tailwind CSS, or React or Bootstrap or Vue). It uses GPT-4 Vision to generate the code and DALL-E 3 to generate similar-looking images. You can now also enter a URL to clone a live website!
+A simple tool to convert screenshots, mockups and Figma designs into clean, functional code using AI. **Now supporting Claude Sonnet 3.5 and GPT-4O!**

 https://github.com/abi/screenshot-to-code/assets/23818/6cebadae-2fe3-4986-ac6a-8fb9db030045

+Supported stacks:
+
+- HTML + Tailwind
+- React + Tailwind
+- Vue + Tailwind
+- Bootstrap
+- Ionic + Tailwind
+- SVG
+
+Supported AI models:
+
+- Claude Sonnet 3.5 - Best model!
+- GPT-4O - also recommended!
+- GPT-4 Turbo (Apr 2024)
+- GPT-4 Vision (Nov 2023)
+- Claude 3 Sonnet
+- DALL-E 3 for image generation
+
 See the [Examples](#-examples) section below for more demos.

-## 🚀 Try It Out!
+We also just added experimental support for taking a video/screen recording of a website in action and turning that into a functional prototype. 

-🆕 [Try it here](https://screenshottocode.com) (bring your own OpenAI key - **your key must have access to GPT-4 Vision. See [FAQ](#%EF%B8%8F-faqs) section below for details**). Or see [Getting Started](#-getting-started) below for local install instructions.
+![google in app quick 3](https://github.com/abi/screenshot-to-code/assets/23818/8758ffa4-9483-4b9b-bb66-abd6d1594c33)

-## 🌟 Recent Updates
+[Learn more about video here](https://github.com/abi/screenshot-to-code/wiki/Screen-Recording-to-Code).

- Dec 11 - Start a new project from existing code (allows you to come back to an older project)
- Dec 7 - 🔥 🔥 🔥 View a history of your edits, and branch off them
- Nov 30 - Dark mode, output code in Ionic (thanks [@dialmedu](https://github.com/dialmedu)), set OpenAI base URL
- Nov 28 - 🔥 🔥 🔥 Customize your stack: React or Bootstrap or TailwindCSS
- Nov 23 - Send in a screenshot of the current replicated version (sometimes improves quality of subsequent generations)
- Nov 21 - Edit code in the code editor and preview changes live thanks to [@clean99](https://github.com/clean99)
- Nov 20 - Paste in a URL to screenshot and clone (requires [ScreenshotOne free API key](https://screenshotone.com?via=screenshot-to-code))
- Nov 19 - Support for dark/light code editor theme - thanks [@kachbit](https://github.com/kachbit)
- Nov 16 - Added a setting to disable DALL-E image generation if you don't need that
- Nov 16 - View code directly within the app
- Nov 15 - You can now instruct the AI to update the code as you wish. It is helpful if the AI messed up some styles or missed a section.
+[Follow me on Twitter for updates](https://twitter.com/_abi_).
+
+## Sponsors
+
+<a href="https://konghq.com/products/kong-konnect?utm_medium=referral&utm_source=github&utm_campaign=platform&utm_content=screenshot-to-code" target="_blank" title="Kong - powering the API world"><img src="https://picoapps.xyz/s2c-sponsors/Kong-GitHub-240x100.png"></a>
+
+## 🚀 Hosted Version
+
+[Try it live on the hosted version (paid)](https://screenshottocode.com).

 ## 🛠 Getting Started

-The app has a React/Vite frontend and a FastAPI backend. You will need an OpenAI API key with access to the GPT-4 Vision API.
+The app has a React/Vite frontend and a FastAPI backend. 
+
+Keys needed:
+
+* [OpenAI API key with access to GPT-4](https://github.com/abi/screenshot-to-code/blob/main/Troubleshooting.md)
+* Anthropic key (optional) - only if you want to use Claude Sonnet, or for experimental video support.

 Run the backend (I use Poetry for package management - `pip install poetry` if you don't have it):

@ -38,11 +59,7 @@ poetry shell
 poetry run uvicorn main:app --reload --port 7001
 ```

-You can also run the backend (when you're in `backend`):
-
-```bash
-poetry run pyright
-```
+If you want to use Anthropic, add `ANTHROPIC_API_KEY` to `backend/.env`. You can also set up the keys using the settings dialog on the front-end (click the gear icon after loading the frontend).

 Run the frontend:

@ -62,10 +79,6 @@ For debugging purposes, if you don't want to waste GPT4-Vision credits, you can
 MOCK=true poetry run uvicorn main:app --reload --port 7001
 ```

-## Configuration
-
- You can configure the OpenAI base URL if you need to use a proxy: Set OPENAI_BASE_URL in the `backend/.env` or directly in the UI in the settings dialog
-
 ## Docker

 If you have Docker installed on your system, in the root directory, run:
@ -81,6 +94,9 @@ The app will be up and running at http://localhost:5173. Note that you can't dev

 - **I'm running into an error when setting up the backend. How can I fix it?** [Try this](https://github.com/abi/screenshot-to-code/issues/3#issuecomment-1814777959). If that still doesn't work, open an issue.
 - **How do I get an OpenAI API key?** See https://github.com/abi/screenshot-to-code/blob/main/Troubleshooting.md
+- **How can I configure an OpenAI proxy?** - If you're not able to access the OpenAI API directly (due to e.g. country restrictions), you can try a VPN or you can configure the OpenAI base URL to use a proxy: Set OPENAI_BASE_URL in the `backend/.env` or directly in the UI in the settings dialog. Make sure the URL has "v1" in the path so it should look like this:  `https://xxx.xxxxx.xxx/v1`
+- **How can I update the backend host that my front-end connects to?** - Configure VITE_HTTP_BACKEND_URL and VITE_WS_BACKEND_URL in `frontend/.env.local`. For example, set VITE_HTTP_BACKEND_URL=http://124.10.20.1:7001
+- **Seeing UTF-8 errors when running the backend?** - On Windows, open the .env file with Notepad++, then go to Encoding and select UTF-8.
 - **How can I provide feedback?** For feedback, feature requests and bug reports, open an issue or ping me on [Twitter](https://twitter.com/_abi_).

 ## 📚 Examples
@ -101,6 +117,4 @@ https://github.com/abi/screenshot-to-code/assets/23818/3fec0f77-44e8-4fb3-a769-a

 ## 🌍 Hosted Version

-🆕 [Try it here](https://screenshottocode.com) (bring your own OpenAI key - **your key must have access to GPT-4 Vision. See [FAQ](#%EF%B8%8F-faqs) section for details**). Or see [Getting Started](#-getting-started) for local install instructions.
-
-[!["Buy Me A Coffee"](https://www.buymeacoffee.com/assets/img/custom_images/orange_img.png)](https://www.buymeacoffee.com/abiraja)
+🆕 [Try it here (paid)](https://screenshottocode.com). Or see [Getting Started](#-getting-started) for local install instructions to use with your own API keys.
--- a/Troubleshooting.md
+++ b/Troubleshooting.md
@ -1,17 +1,22 @@
-### Getting an OpenAI API key with GPT4-Vision model access
+### Getting an OpenAI API key with GPT-4 model access

 You don't need a ChatGPT Pro account. Screenshot to code uses API keys from your OpenAI developer account. In order to get access to the GPT4 Vision model, log into your OpenAI account and then, follow these instructions:

 1. Open [OpenAI Dashboard](https://platform.openai.com/)
 1. Go to Settings > Billing
 1. Click at the Add payment details
-<img width="1030" alt="285636868-c80deb92-ab47-45cd-988f-deee67fbd44d" src="https://github.com/abi/screenshot-to-code/assets/23818/4e0f4b77-9578-4f9a-803c-c12b1502f3d7">
+<img width="900" alt="285636868-c80deb92-ab47-45cd-988f-deee67fbd44d" src="https://github.com/abi/screenshot-to-code/assets/23818/4e0f4b77-9578-4f9a-803c-c12b1502f3d7">
+
 4. You have to buy some credits. The minimum is $5.
-
 5. Go to Settings > Limits and check at the bottom of the page, your current tier has to be "Tier 1" to have GPT4 access
-<img width="785" alt="285636973-da38bd4d-8a78-4904-8027-ca67d729b933" src="https://github.com/abi/screenshot-to-code/assets/23818/8d07cd84-0cf9-4f88-bc00-80eba492eadf">
-6. Go to Screenshot to code and paste it in the Settings dialog under OpenAI key (gear icon). Your key is only stored in your browser. Never stored on our servers.
+<img width="900" alt="285636973-da38bd4d-8a78-4904-8027-ca67d729b933" src="https://github.com/abi/screenshot-to-code/assets/23818/8d07cd84-0cf9-4f88-bc00-80eba492eadf">

-Some users have also reported that it can take upto 30 minutes after your credit purchase for the GPT4 vision model to be activated.
+6. Navigate to the OpenAI [API keys](https://platform.openai.com/api-keys) page, then create and copy a new secret key.
+7. Go to Screenshot to code and paste the key in the Settings dialog under OpenAI key (gear icon). Your key is only stored in your browser. It is never stored on our servers.

-If you've followed these steps, and it still doesn't work, feel free to open a Github issue.
+## Still not working?
+
+- Some users have also reported that it can take up to 30 minutes after your credit purchase for the GPT4 vision model to be activated.
+- You need to add credits to your account AND set it to renew when credits run out in order to be upgraded to Tier 1. Make sure your "Settings > Limits" page shows that you are at Tier 1.
+
+If you've followed these steps, and it still doesn't work, feel free to open a GitHub issue. We only provide support for the open source version since we don't have debugging logs on the hosted version. If you're looking to use the hosted version, we recommend getting a paid subscription on screenshottocode.com.
--- a/backend/.gitignore
+++ b/backend/.gitignore
@ -154,3 +154,7 @@ cython_debug/

 # Temporary eval output
 evals_data
+
+
+# Temporary video evals (Remove before merge)
+video_evals
--- a/backend/.pre-commit-config.yaml
+++ b/backend/.pre-commit-config.yaml
@ -0,0 +1,25 @@
+# See https://pre-commit.com for more information
+# See https://pre-commit.com/hooks.html for more hooks
+repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v3.2.0
+    hooks:
+      - id: end-of-file-fixer
+      - id: check-yaml
+      - id: check-added-large-files
+  - repo: local
+    hooks:
+      - id: poetry-pytest
+        name: Run pytest with Poetry
+        entry: poetry run --directory backend pytest
+        language: system
+        pass_filenames: false
+        always_run: true
+        files: ^backend/
+      # - id: poetry-pyright
+      #   name: Run pyright with Poetry
+      #   entry: poetry run --directory backend pyright
+      #   language: system
+      #   pass_filenames: false
+      #   always_run: true
+      #   files: ^backend/
--- a/backend/Dockerfile
+++ b/backend/Dockerfile
@ -1,4 +1,4 @@
-FROM python:3.12-slim-bullseye
+FROM python:3.12.3-slim-bullseye

 ENV POETRY_VERSION 1.4.1

--- a/backend/access_token.py
+++ b/backend/access_token.py
@ -1,21 +0,0 @@
-import json
-import os
-import httpx
-
-
-async def validate_access_token(access_code: str):
-    async with httpx.AsyncClient() as client:
-        url = (
-            "https://backend.buildpicoapps.com/screenshot_to_code/validate_access_token"
-        )
-        data = json.dumps(
-            {
-                "access_code": access_code,
-                "secret": os.environ.get("PICO_BACKEND_SECRET"),
-            }
-        )
-        headers = {"Content-Type": "application/json"}
-
-        response = await client.post(url, content=data, headers=headers)
-        response_data = response.json()
-        return response_data
--- a/backend/codegen/__init__.py
+++ b/backend/codegen/__init__.py
--- a/backend/codegen/test_utils.py
+++ b/backend/codegen/test_utils.py
@ -0,0 +1,57 @@
+import unittest
+from codegen.utils import extract_html_content
+
+
+class TestUtils(unittest.TestCase):
+
+    def test_extract_html_content_with_html_tags(self):
+        text = "<html><body><p>Hello, World!</p></body></html>"
+        expected = "<html><body><p>Hello, World!</p></body></html>"
+        result = extract_html_content(text)
+        self.assertEqual(result, expected)
+
+    def test_extract_html_content_without_html_tags(self):
+        text = "No HTML content here."
+        expected = "No HTML content here."
+        result = extract_html_content(text)
+        self.assertEqual(result, expected)
+
+    def test_extract_html_content_with_partial_html_tags(self):
+        text = "<html><body><p>Hello, World!</p></body>"
+        expected = "<html><body><p>Hello, World!</p></body>"
+        result = extract_html_content(text)
+        self.assertEqual(result, expected)
+
+    def test_extract_html_content_with_multiple_html_tags(self):
+        text = "<html><body><p>First</p></body></html> Some text <html><body><p>Second</p></body></html>"
+        expected = "<html><body><p>First</p></body></html>"
+        result = extract_html_content(text)
+        self.assertEqual(result, expected)
+
+    ## The following are tests based on actual LLM outputs
+
+    def test_extract_html_content_some_explanation_before(self):
+        text = """Got it! You want the song list to be displayed horizontally. I'll update the code to ensure that the song list is displayed in a horizontal layout.
+
+        Here's the updated code:
+
+        <html lang="en"><head></head><body class="bg-black text-white"></body></html>"""
+        expected = '<html lang="en"><head></head><body class="bg-black text-white"></body></html>'
+        result = extract_html_content(text)
+        self.assertEqual(result, expected)
+
+    def test_markdown_tags(self):
+        text = "```html<head></head>```"
+        expected = "```html<head></head>```"
+        result = extract_html_content(text)
+        self.assertEqual(result, expected)
+
+    def test_doctype_text(self):
+        text = '<!DOCTYPE html><html lang="en"><head></head><body></body></html>'
+        expected = '<html lang="en"><head></head><body></body></html>'
+        result = extract_html_content(text)
+        self.assertEqual(result, expected)
+
+
+if __name__ == "__main__":
+    unittest.main()
--- a/backend/codegen/utils.py
+++ b/backend/codegen/utils.py
@ -0,0 +1,14 @@
+import re
+
+
+def extract_html_content(text: str):
+    # Use regex to find content within <html> tags and include the tags themselves
+    match = re.search(r"(<html.*?>.*?</html>)", text, re.DOTALL)
+    if match:
+        return match.group(1)
+    else:
+        # Otherwise, return the generated text unchanged
+        print(
+            "[HTML Extraction] No <html> tags found in the generated content: " + text
+        )
+        return text
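
A note on the pattern in `backend/codegen/utils.py`: because both quantifiers are non-greedy and `re.DOTALL` lets `.` match newlines, the search returns the first complete `<html ...>...</html>` span and drops any explanation or `<!DOCTYPE>` around it. The standalone sketch below copies the pattern from the diff; the helper name `first_html_document` and the sample string are made up for illustration.

```python
import re

# Same pattern as in backend/codegen/utils.py above.
HTML_PATTERN = re.compile(r"(<html.*?>.*?</html>)", re.DOTALL)


def first_html_document(text: str) -> str:
    """Return the first <html>...</html> span, or the input unchanged if none is found."""
    match = HTML_PATTERN.search(text)
    return match.group(1) if match else text


# Illustrative input (not taken from the repository): chatter before and after the HTML.
chatty = 'Here is the updated code:\n\n<html lang="en">\n<body>Hi</body>\n</html>\nDone.'
print(first_html_document(chatty))
```

As the `test_markdown_tags` case above shows, text without `<html>` tags (for example a fenced fragment) passes through untouched, so callers may still want to handle that case separately.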
--- a/backend/config.py
+++ b/backend/config.py
@ -3,8 +3,13 @@
 # TODO: Should only be set to true when value is 'True', not any arbitrary truthy value
 import os

+ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY", None)
+
+# Debugging-related

 SHOULD_MOCK_AI_RESPONSE = bool(os.environ.get("MOCK", False))
+IS_DEBUG_ENABLED = bool(os.environ.get("IS_DEBUG_ENABLED", False))
+DEBUG_DIR = os.environ.get("DEBUG_DIR", "")

 # Set to True when running in production (on the hosted version)
 # Used as a feature flag to enable or disable certain features
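
The TODO at the top of `backend/config.py` points at a real pitfall: `bool(os.environ.get("MOCK", False))` is true for any non-empty string, including `"false"` and `"0"`. One possible way to address it is sketched below. `env_flag` is a hypothetical helper, not something in the repository, and accepting only a case-insensitive `"true"` is an assumption about the intended behavior.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Hypothetical helper: True only when the variable is explicitly 'true' (any case)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() == "true"


os.environ["MOCK"] = "false"
print(bool(os.environ.get("MOCK", False)))  # True: any non-empty string is truthy
print(env_flag("MOCK"))                     # False
```
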
--- a/backend/custom_types.py
+++ b/backend/custom_types.py
@ -0,0 +1,7 @@
+from typing import Literal
+
+
+InputMode = Literal[
+    "image",
+    "video",
+]
--- a/backend/debug/DebugFileWriter.py
+++ b/backend/debug/DebugFileWriter.py
@ -0,0 +1,30 @@
+import os
+import logging
+import uuid
+
+from config import DEBUG_DIR, IS_DEBUG_ENABLED
+
+
+class DebugFileWriter:
+    def __init__(self):
+        if not IS_DEBUG_ENABLED:
+            return
+
+        try:
+            self.debug_artifacts_path = os.path.expanduser(
+                f"{DEBUG_DIR}/{str(uuid.uuid4())}"
+            )
+            os.makedirs(self.debug_artifacts_path, exist_ok=True)
+            print(f"Debugging artifacts will be stored in: {self.debug_artifacts_path}")
+        except Exception:
+            logging.error("Failed to create debug directory")
+
+    def write_to_file(self, filename: str, content: str) -> None:
+        try:
+            with open(os.path.join(self.debug_artifacts_path, filename), "w") as file:
+                file.write(content)
+        except Exception as e:
+            logging.error(f"Failed to write to file: {e}")
+
+    def extract_html_content(self, text: str) -> str:
+        return str(text.split("<html>")[-1].rsplit("</html>", 1)[0] + "</html>")
--- a/backend/debug/__init__.py
+++ b/backend/debug/__init__.py
--- a/backend/evals/core.py
+++ b/backend/evals/core.py
@ -1,29 +1,40 @@
 import os
+from config import ANTHROPIC_API_KEY

-from llm import stream_openai_response
+from llm import Llm, stream_claude_response, stream_openai_response
 from prompts import assemble_prompt
 from prompts.types import Stack
-from utils import pprint_prompt


-async def generate_code_core(image_url: str, stack: Stack) -> str:
+async def generate_code_core(image_url: str, stack: Stack, model: Llm) -> str:
    prompt_messages = assemble_prompt(image_url, stack)
    openai_api_key = os.environ.get("OPENAI_API_KEY")
+    anthropic_api_key = ANTHROPIC_API_KEY
    openai_base_url = None

-    pprint_prompt(prompt_messages)
-
    async def process_chunk(content: str):
        pass

-    if not openai_api_key:
-        raise Exception("OpenAI API key not found")
+    if model == Llm.CLAUDE_3_SONNET or model == Llm.CLAUDE_3_5_SONNET_2024_06_20:
+        if not anthropic_api_key:
+            raise Exception("Anthropic API key not found")

-    completion = await stream_openai_response(
-        prompt_messages,
-        api_key=openai_api_key,
-        base_url=openai_base_url,
-        callback=lambda x: process_chunk(x),
-    )
+        completion = await stream_claude_response(
+            prompt_messages,
+            api_key=anthropic_api_key,
+            callback=lambda x: process_chunk(x),
+            model=model,
+        )
+    else:
+        if not openai_api_key:
+            raise Exception("OpenAI API key not found")
+
+        completion = await stream_openai_response(
+            prompt_messages,
+            api_key=openai_api_key,
+            base_url=openai_base_url,
+            callback=lambda x: process_chunk(x),
+            model=model,
+        )

    return completion
--- a/backend/image_generation.py
+++ b/backend/image_generation.py
@ -5,7 +5,7 @@ from openai import AsyncOpenAI
 from bs4 import BeautifulSoup


-async def process_tasks(prompts: List[str], api_key: str, base_url: str):
+async def process_tasks(prompts: List[str], api_key: str, base_url: str | None):
    tasks = [generate_image(prompt, api_key, base_url) for prompt in prompts]
    results = await asyncio.gather(*tasks, return_exceptions=True)

@ -20,17 +20,18 @@ async def process_tasks(prompts: List[str], api_key: str, base_url: str):
    return processed_results


-async def generate_image(prompt: str, api_key: str, base_url: str):
+async def generate_image(
+    prompt: str, api_key: str, base_url: str | None
+) -> Union[str, None]:
    client = AsyncOpenAI(api_key=api_key, base_url=base_url)
-    image_params: Dict[str, Union[str, int]] = {
-        "model": "dall-e-3",
-        "quality": "standard",
-        "style": "natural",
-        "n": 1,
-        "size": "1024x1024",
-        "prompt": prompt,
-    }
-    res = await client.images.generate(**image_params)
+    res = await client.images.generate(
+        model="dall-e-3",
+        quality="standard",
+        style="natural",
+        n=1,
+        size="1024x1024",
+        prompt=prompt,
+    )
    await client.close()
    return res.data[0].url

@ -63,13 +64,13 @@ def create_alt_url_mapping(code: str) -> Dict[str, str]:

 async def generate_images(
    code: str, api_key: str, base_url: Union[str, None], image_cache: Dict[str, str]
-):
+) -> str:
    # Find all images
    soup = BeautifulSoup(code, "html.parser")
    images = soup.find_all("img")

    # Extract alt texts as image prompts
-    alts = []
+    alts: List[str | None] = []
    for img in images:
        # Only include URL if the image starts with https://placehold.co
        # and it's not already in the image_cache
@ -80,10 +81,10 @@ async def generate_images(
            alts.append(img.get("alt", None))

    # Exclude images with no alt text
-    alts = [alt for alt in alts if alt is not None]
+    filtered_alts: List[str] = [alt for alt in alts if alt is not None]

    # Remove duplicates
-    prompts = list(set(alts))
+    prompts = list(set(filtered_alts))

    # Return early if there are no images to replace
    if len(prompts) == 0:
--- a/backend/image_processing/__init__.py
+++ b/backend/image_processing/__init__.py
--- a/backend/image_processing/utils.py
+++ b/backend/image_processing/utils.py
@ -0,0 +1,52 @@
+import base64
+import io
+import time
+from PIL import Image
+
+CLAUDE_IMAGE_MAX_SIZE = 5 * 1024 * 1024
+
+
+# Process image so it meets Claude requirements
+def process_image(image_data_url: str) -> tuple[str, str]:
+
+    media_type = image_data_url.split(";")[0].split(":")[1]
+    base64_data = image_data_url.split(",")[1]
+
+    # If image is already under max size, return as is
+    if len(base64_data) <= CLAUDE_IMAGE_MAX_SIZE:
+        print("[CLAUDE IMAGE PROCESSING] no processing needed")
+        return (media_type, base64_data)
+
+    # Time image processing
+    start_time = time.time()
+
+    image_bytes = base64.b64decode(base64_data)
+    img = Image.open(io.BytesIO(image_bytes))
+
+    # Convert and compress as JPEG
+    quality = 95
+    output = io.BytesIO()
+    img = img.convert("RGB")  # Ensure image is in RGB mode for JPEG conversion
+    img.save(output, format="JPEG", quality=quality)
+
+    # Reduce quality until image is under max size
+    while (
+        len(base64.b64encode(output.getvalue())) > CLAUDE_IMAGE_MAX_SIZE
+        and quality > 10
+    ):
+        # Lower the quality first so each pass re-encodes at a new setting
+        quality -= 5
+        output = io.BytesIO()
+        img.save(output, format="JPEG", quality=quality)
+
+    # Log so we know it was modified
+    old_size = len(base64_data)
+    new_size = len(base64.b64encode(output.getvalue()))
+    print(
+        f"[CLAUDE IMAGE PROCESSING] image size updated: old size = {old_size} bytes, new size = {new_size} bytes"
+    )
+
+    end_time = time.time()
+    processing_time = end_time - start_time
+    print(f"[CLAUDE IMAGE PROCESSING] processing time: {processing_time:.2f} seconds")
+
+    return ("image/jpeg", base64.b64encode(output.getvalue()).decode("utf-8"))
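
The first half of `process_image` (splitting the data URL and checking the size limit) can be tried without Pillow. The sketch below mirrors those steps from the diff. `split_data_url` and `needs_compression` are illustrative names rather than repository functions, and the payload is fake bytes, not a real image.

```python
import base64

# Same limit as in the diff above; it is compared against the length of the base64 text.
CLAUDE_IMAGE_MAX_SIZE = 5 * 1024 * 1024


def split_data_url(image_data_url: str) -> tuple[str, str]:
    """Split 'data:<media type>;base64,<payload>' into (media_type, payload)."""
    header, base64_data = image_data_url.split(",", 1)
    media_type = header.split(";")[0].split(":")[1]
    return media_type, base64_data


def needs_compression(base64_data: str) -> bool:
    """True when the encoded payload exceeds the limit and would be re-encoded as JPEG."""
    return len(base64_data) > CLAUDE_IMAGE_MAX_SIZE


payload = base64.b64encode(b"fake image bytes for illustration").decode("ascii")
media_type, data = split_data_url(f"data:image/png;base64,{payload}")
print(media_type, needs_compression(data))
```
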
--- a/backend/llm.py
+++ b/backend/llm.py
@ -1,8 +1,35 @@
-from typing import Awaitable, Callable, List
+import base64
+from enum import Enum
+from typing import Any, Awaitable, Callable, List, cast
+from anthropic import AsyncAnthropic
 from openai import AsyncOpenAI
 from openai.types.chat import ChatCompletionMessageParam, ChatCompletionChunk
+from config import IS_DEBUG_ENABLED
+from debug.DebugFileWriter import DebugFileWriter
+from image_processing.utils import process_image

-MODEL_GPT_4_VISION = "gpt-4-vision-preview"
+from utils import pprint_prompt
+
+
+# Actual model versions that are passed to the LLMs and stored in our logs
+class Llm(Enum):
+    GPT_4_VISION = "gpt-4-vision-preview"
+    GPT_4_TURBO_2024_04_09 = "gpt-4-turbo-2024-04-09"
+    GPT_4O_2024_05_13 = "gpt-4o-2024-05-13"
+    CLAUDE_3_SONNET = "claude-3-sonnet-20240229"
+    CLAUDE_3_OPUS = "claude-3-opus-20240229"
+    CLAUDE_3_HAIKU = "claude-3-haiku-20240307"
+    CLAUDE_3_5_SONNET_2024_06_20 = "claude-3-5-sonnet-20240620"
+
+
+# Raises ValueError if the string does not match a known model value
+def convert_frontend_str_to_llm(frontend_str: str) -> Llm:
+    if frontend_str == "gpt_4_vision":
+        return Llm.GPT_4_VISION
+    elif frontend_str == "claude_3_sonnet":
+        return Llm.CLAUDE_3_SONNET
+    else:
+        return Llm(frontend_str)


 async def stream_openai_response(
@ -10,27 +37,192 @@ async def stream_openai_response(
    api_key: str,
    base_url: str | None,
    callback: Callable[[str], Awaitable[None]],
+    model: Llm,
 ) -> str:
    client = AsyncOpenAI(api_key=api_key, base_url=base_url)

-    model = MODEL_GPT_4_VISION
-
    # Base parameters
-    params = {"model": model, "messages": messages, "stream": True, "timeout": 600}
+    params = {
+        "model": model.value,
+        "messages": messages,
+        "stream": True,
+        "timeout": 600,
+        "temperature": 0.0,
+    }

-    # Add 'max_tokens' only if the model is a GPT4 vision model
-    if model == MODEL_GPT_4_VISION:
+    # Add 'max_tokens' only if the model is a GPT-4 Vision, GPT-4 Turbo, or GPT-4o model
+    if (
+        model == Llm.GPT_4_VISION
+        or model == Llm.GPT_4_TURBO_2024_04_09
+        or model == Llm.GPT_4O_2024_05_13
+    ):
        params["max_tokens"] = 4096
-        params["temperature"] = 0

    stream = await client.chat.completions.create(**params)  # type: ignore
    full_response = ""
    async for chunk in stream:  # type: ignore
        assert isinstance(chunk, ChatCompletionChunk)
-        content = chunk.choices[0].delta.content or ""
-        full_response += content
-        await callback(content)
+        if (
+            chunk.choices
+            and len(chunk.choices) > 0
+            and chunk.choices[0].delta
+            and chunk.choices[0].delta.content
+        ):
+            content = chunk.choices[0].delta.content or ""
+            full_response += content
+            await callback(content)

    await client.close()

    return full_response
+
+
+# TODO: Have a separate function that translates OpenAI messages to Claude messages
+async def stream_claude_response(
+    messages: List[ChatCompletionMessageParam],
+    api_key: str,
+    callback: Callable[[str], Awaitable[None]],
+    model: Llm,
+) -> str:
+
+    client = AsyncAnthropic(api_key=api_key)
+
+    # Base parameters
+    max_tokens = 4096
+    temperature = 0.0
+
+    # Translate OpenAI messages to Claude messages
+    system_prompt = cast(str, messages[0].get("content"))
+    claude_messages = [dict(message) for message in messages[1:]]
+    for message in claude_messages:
+        if not isinstance(message["content"], list):
+            continue
+
+        for content in message["content"]:  # type: ignore
+            if content["type"] == "image_url":
+                content["type"] = "image"
+
+                # Extract base64 data and media type from data URL
+                # Example base64 data URL: data:image/png;base64,iVBOR...
+                image_data_url = cast(str, content["image_url"]["url"])
+
+                # Process image and split media type and data
+                # so it works with Claude (under 5mb in base64 encoding)
+                (media_type, base64_data) = process_image(image_data_url)
+
+                # Remove OpenAI parameter
+                del content["image_url"]
+
+                content["source"] = {
+                    "type": "base64",
+                    "media_type": media_type,
+                    "data": base64_data,
+                }
+
+    # Stream Claude response
+    async with client.messages.stream(
+        model=model.value,
+        max_tokens=max_tokens,
+        temperature=temperature,
+        system=system_prompt,
+        messages=claude_messages,  # type: ignore
+    ) as stream:
+        async for text in stream.text_stream:
+            await callback(text)
+
+    # Return final message
+    response = await stream.get_final_message()
+
+    # Close the Anthropic client
+    await client.close()
+
+    return response.content[0].text
+
+
+async def stream_claude_response_native(
+    system_prompt: str,
+    messages: list[Any],
+    api_key: str,
+    callback: Callable[[str], Awaitable[None]],
+    include_thinking: bool = False,
+    model: Llm = Llm.CLAUDE_3_OPUS,
+) -> str:
+
+    client = AsyncAnthropic(api_key=api_key)
+
+    # Base model parameters
+    max_tokens = 4096
+    temperature = 0.0
+
+    # Multi-pass flow
+    current_pass_num = 1
+    max_passes = 2
+
+    prefix = "<thinking>"
+    response = None
+
+    # For debugging
+    full_stream = ""
+    debug_file_writer = DebugFileWriter()
+
+    while current_pass_num <= max_passes:
+        current_pass_num += 1
+
+        # Set up message depending on whether we have a <thinking> prefix
+        messages_to_send = (
+            messages + [{"role": "assistant", "content": prefix}]
+            if include_thinking
+            else messages
+        )
+
+        pprint_prompt(messages_to_send)
+
+        async with client.messages.stream(
+            model=model.value,
+            max_tokens=max_tokens,
+            temperature=temperature,
+            system=system_prompt,
+            messages=messages_to_send,  # type: ignore
+        ) as stream:
+            async for text in stream.text_stream:
+                print(text, end="", flush=True)
+                full_stream += text
+                await callback(text)
+
+        response = await stream.get_final_message()
+        response_text = response.content[0].text
+
+        # Write each pass's code to .html file and thinking to .txt file
+        if IS_DEBUG_ENABLED:
+            debug_file_writer.write_to_file(
+                f"pass_{current_pass_num - 1}.html",
+                debug_file_writer.extract_html_content(response_text),
+            )
+            debug_file_writer.write_to_file(
+                f"thinking_pass_{current_pass_num - 1}.txt",
+                response_text.split("</thinking>")[0],
+            )
+
+        # Set up messages array for next pass
+        messages += [
+            {"role": "assistant", "content": str(prefix) + response.content[0].text},
+            {
+                "role": "user",
+                "content": "You've done a good job with a first draft. Improve this further based on the original instructions so that the app is fully functional and looks like the original video of the app we're trying to replicate.",
+            },
+        ]
+
+        print(
+            f"Token usage: Input Tokens: {response.usage.input_tokens}, Output Tokens: {response.usage.output_tokens}"
+        )
+
+    # Close the Anthropic client
+    await client.close()
+
+    if IS_DEBUG_ENABLED:
+        debug_file_writer.write_to_file("full_stream.txt", full_stream)
+
+    if not response:
+        raise Exception("No HTML response found in AI response")
+    else:
+        return response.content[0].text
--- a/backend/mock_llm.py
+++ b/backend/mock_llm.py
--- a/backend/poetry.lock
+++ b/backend/poetry.lock
--- a/backend/prompts/init.py
+++ b/backend/prompts/init.py
@ -1,6 +1,7 @@
 from typing import List, NoReturn, Union

 from openai.types.chat import ChatCompletionMessageParam, ChatCompletionContentPartParam
+from llm import Llm

 from prompts.imported_code_prompts import IMPORTED_CODE_SYSTEM_PROMPTS
 from prompts.screenshot_system_prompts import SYSTEM_PROMPTS
@ -17,7 +18,7 @@ Generate code for a SVG that looks exactly like this.


 def assemble_imported_code_prompt(
-    code: str, stack: Stack, result_image_data_url: Union[str, None] = None
+    code: str, stack: Stack, model: Llm
 ) -> List[ChatCompletionMessageParam]:
    system_content = IMPORTED_CODE_SYSTEM_PROMPTS[stack]

@ -26,16 +27,25 @@ def assemble_imported_code_prompt(
        if stack != "svg"
        else "Here is the code of the SVG: " + code
    )
-    return [
-        {
-            "role": "system",
-            "content": system_content,
-        },
-        {
-            "role": "user",
-            "content": user_content,
-        },
-    ]
+
+    if model == Llm.CLAUDE_3_5_SONNET_2024_06_20:
+        return [
+            {
+                "role": "system",
+                "content": system_content + "\n " + user_content,
+            }
+        ]
+    else:
+        return [
+            {
+                "role": "system",
+                "content": system_content,
+            },
+            {
+                "role": "user",
+                "content": user_content,
+            },
+        ]
    # TODO: Use result_image_data_url


--- a/backend/prompts/claude_prompts.py
+++ b/backend/prompts/claude_prompts.py
@ -0,0 +1,114 @@
+# Not used yet
+# References:
+# https://github.com/hundredblocks/transcription_demo
+# https://docs.anthropic.com/claude/docs/prompt-engineering
+# https://github.com/anthropics/anthropic-cookbook/blob/main/multimodal/best_practices_for_vision.ipynb
+
+VIDEO_PROMPT = """
+You are an expert at building single page, funtional apps using HTML, Jquery and Tailwind CSS.
+You also have perfect vision and pay great attention to detail.
+
+You will be given screenshots in order at consistent intervals from a video of a user interacting with a web app. You need to re-create the same app exactly such that the same user interactions will produce the same results in the app you build.
+
+- Make sure the app looks exactly like the screenshot.
+- Pay close attention to background color, text color, font size, font family, 
+padding, margin, border, etc. Match the colors and sizes exactly.
+- For images, use placeholder images from https://placehold.co and include a detailed description of the image in the alt text so that an image generation AI can generate the image later.
+- If some fuctionality requires a backend call, just mock the data instead.
+- MAKE THE APP FUNCTIONAL using Javascript. Allow the user to interact with the app and get the same behavior as the video.
+
+In terms of libraries,
+
+- Use this script to include Tailwind: <script src="https://cdn.tailwindcss.com"></script>
+- You can use Google Fonts
+- Font Awesome for icons: <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/all.min.css"></link>
+- Use jQuery: <script src="https://code.jquery.com/jquery-3.7.1.min.js"></script>
+
+Before generating the code for the app, think step-by-step: first, about the user flow depicated in the video and then about you how would you build it and how you would structure the code. Do the thinking within <thinking></thinking> tags. Then, provide your code within <html></html> tags.
+"""
+
+VIDEO_PROMPT_ALPINE_JS = """
+You are an expert at building single page, funtional apps using HTML, Alpine.js and Tailwind CSS.
+You also have perfect vision and pay great attention to detail.
+
+You will be given screenshots in order at consistent intervals from a video of a user interacting with a web app. You need to re-create the same app exactly such that the same user interactions will produce the same results in the app you build.
+
+- Make sure the app looks exactly like the screenshot.
+- Pay close attention to background color, text color, font size, font family, 
+padding, margin, border, etc. Match the colors and sizes exactly.
+- For images, use placeholder images from https://placehold.co and include a detailed description of the image in the alt text so that an image generation AI can generate the image later.
+- If some fuctionality requires a backend call, just mock the data instead.
+- MAKE THE APP FUNCTIONAL using Javascript. Allow the user to interact with the app and get the same behavior as the video.
+
+In terms of libraries,
+
+- Use this script to include Tailwind: <script src="https://cdn.tailwindcss.com"></script>
+- You can use Google Fonts
+- Font Awesome for icons: <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/all.min.css"></link>
+- Use Alpine.js: <script defer src="https://cdn.jsdelivr.net/npm/alpinejs@3.x.x/dist/cdn.min.js"></script>
+
+Before generating the code for the app, think step-by-step: first, about the user flow depicated in the video and then about you how would you build it and how you would structure the code. Do the thinking within <thinking></thinking> tags. Then, provide your code within <html></html> tags.
+"""
+
+
+HTML_TAILWIND_CLAUDE_SYSTEM_PROMPT = """
+You have perfect vision and pay great attention to detail which makes you an expert at building single page apps using Tailwind, HTML and JS.
+You take screenshots of a reference web page from the user, and then build single page apps 
+using Tailwind, HTML and JS.
+You might also be given a screenshot (The second image) of a web page that you have already built, and asked to
+update it to look more like the reference image(The first image).
+
+- Make sure the app looks exactly like the screenshot.
+- Do not leave out smaller UI elements. Make sure to include every single thing in the screenshot.
+- Pay close attention to background color, text color, font size, font family, 
+padding, margin, border, etc. Match the colors and sizes exactly.
+- In particular, pay attention to background color and overall color scheme.
+- Use the exact text from the screenshot.
+- Do not add comments in the code such as "<!-- Add other navigation links as needed -->" and "<!-- ... other news items ... -->" in place of writing the full code. WRITE THE FULL CODE.
+- Make sure to always get the layout right (if things are arranged in a row in the screenshot, they should be in a row in the app as well)
+- Repeat elements as needed to match the screenshot. For example, if there are 15 items, the code should have 15 items. DO NOT LEAVE comments like "<!-- Repeat for each news item -->" or bad things will happen.
+- For images, use placeholder images from https://placehold.co and include a detailed description of the image in the alt text so that an image generation AI can generate the image later.
+
+In terms of libraries,
+
+- Use this script to include Tailwind: <script src="https://cdn.tailwindcss.com"></script>
+- You can use Google Fonts
+- Font Awesome for icons: <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/all.min.css"></link>
+
+Return only the full code in <html></html> tags.
+Do not include markdown "```" or "```html" at the start or end.
+"""
+
+#
+
+REACT_TAILWIND_CLAUDE_SYSTEM_PROMPT = """
+You have perfect vision and pay great attention to detail which makes you an expert at building single page apps using React/Tailwind.
+You take screenshots of a reference web page from the user, and then build single page apps 
+using React and Tailwind CSS.
+You might also be given a screenshot (The second image) of a web page that you have already built, and asked to
+update it to look more like the reference image(The first image).
+
+- Make sure the app looks exactly like the screenshot.
+- Do not leave out smaller UI elements. Make sure to include every single thing in the screenshot.
+- Pay close attention to background color, text color, font size, font family, 
+padding, margin, border, etc. Match the colors and sizes exactly.
+- In particular, pay attention to background color and overall color scheme.
+- Use the exact text from the screenshot.
+- Do not add comments in the code such as "<!-- Add other navigation links as needed -->" and "<!-- ... other news items ... -->" in place of writing the full code. WRITE THE FULL CODE.
+- Make sure to always get the layout right (if things are arranged in a row in the screenshot, they should be in a row in the app as well)
+- CREATE REUSABLE COMPONENTS FOR REPEATING ELEMENTS. For example, if there are 15 similar items in the screenshot, your code should include a reusable component that generates these items. and use loops to instantiate these components as needed.
+- For images, use placeholder images from https://placehold.co and include a detailed description of the image in the alt text so that an image generation AI can generate the image later.
+
+In terms of libraries,
+
+- Use these script to include React so that it can run on a standalone page:
+    <script src="https://unpkg.com/react/umd/react.development.js"></script>
+    <script src="https://unpkg.com/react-dom/umd/react-dom.development.js"></script>
+    <script src="https://unpkg.com/@babel/standalone/babel.js"></script>
+- Use this script to include Tailwind: <script src="https://cdn.tailwindcss.com"></script>
+- You can use Google Fonts
+- Font Awesome for icons: <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/all.min.css"></link>
+
+Return only the full code in <html></html> tags.
+Do not include markdown "```" or "```html" at the start or end.
+"""
--- a/backend/prompts/test_prompts.py
+++ b/backend/prompts/test_prompts.py
@ -311,35 +311,35 @@ def test_prompts():
    tailwind_prompt = assemble_prompt(
        "image_data_url", "html_tailwind", "result_image_data_url"
    )
-    assert tailwind_prompt[0]["content"] == TAILWIND_SYSTEM_PROMPT
+    assert tailwind_prompt[0].get("content") == TAILWIND_SYSTEM_PROMPT
    assert tailwind_prompt[1]["content"][2]["text"] == USER_PROMPT  # type: ignore

    react_tailwind_prompt = assemble_prompt(
        "image_data_url", "react_tailwind", "result_image_data_url"
    )
-    assert react_tailwind_prompt[0]["content"] == REACT_TAILWIND_SYSTEM_PROMPT
+    assert react_tailwind_prompt[0].get("content") == REACT_TAILWIND_SYSTEM_PROMPT
    assert react_tailwind_prompt[1]["content"][2]["text"] == USER_PROMPT  # type: ignore

    bootstrap_prompt = assemble_prompt(
        "image_data_url", "bootstrap", "result_image_data_url"
    )
-    assert bootstrap_prompt[0]["content"] == BOOTSTRAP_SYSTEM_PROMPT
+    assert bootstrap_prompt[0].get("content") == BOOTSTRAP_SYSTEM_PROMPT
    assert bootstrap_prompt[1]["content"][2]["text"] == USER_PROMPT  # type: ignore

    ionic_tailwind = assemble_prompt(
        "image_data_url", "ionic_tailwind", "result_image_data_url"
    )
-    assert ionic_tailwind[0]["content"] == IONIC_TAILWIND_SYSTEM_PROMPT
+    assert ionic_tailwind[0].get("content") == IONIC_TAILWIND_SYSTEM_PROMPT
    assert ionic_tailwind[1]["content"][2]["text"] == USER_PROMPT  # type: ignore

    vue_tailwind = assemble_prompt(
        "image_data_url", "vue_tailwind", "result_image_data_url"
    )
-    assert vue_tailwind[0]["content"] == VUE_TAILWIND_SYSTEM_PROMPT
+    assert vue_tailwind[0].get("content") == VUE_TAILWIND_SYSTEM_PROMPT
    assert vue_tailwind[1]["content"][2]["text"] == USER_PROMPT  # type: ignore

    svg_prompt = assemble_prompt("image_data_url", "svg", "result_image_data_url")
-    assert svg_prompt[0]["content"] == SVG_SYSTEM_PROMPT
+    assert svg_prompt[0].get("content") == SVG_SYSTEM_PROMPT
    assert svg_prompt[1]["content"][2]["text"] == SVG_USER_PROMPT  # type: ignore


--- a/backend/pyproject.toml
+++ b/backend/pyproject.toml
@ -14,10 +14,15 @@ openai = "^1.2.4"
 python-dotenv = "^1.0.0"
 beautifulsoup4 = "^4.12.2"
 httpx = "^0.25.1"
+pre-commit = "^3.6.2"
+anthropic = "^0.18.0"
+moviepy = "^1.0.3"
+pillow = "^10.3.0"
+types-pillow = "^10.2.0.20240520"

 [tool.poetry.group.dev.dependencies]
 pytest = "^7.4.3"
-pyright = "^1.1.345"
+pyright = "^1.1.352"

 [build-system]
 requires = ["poetry-core"]
--- a/backend/routes/evals.py
+++ b/backend/routes/evals.py
@ -7,10 +7,13 @@ from evals.config import EVALS_DIR

 router = APIRouter()

+# Update this if the number of outputs generated per input changes
+N = 1
+

 class Eval(BaseModel):
    input: str
-    output: str
+    outputs: list[str]


@router.get("/evals")
@ -25,21 +28,27 @@ async def get_evals():
            input_file_path = os.path.join(input_dir, file)
            input_file = await image_to_data_url(input_file_path)

-            # Construct the corresponding output file name
-            output_file_name = file.replace(".png", ".html")
-            output_file_path = os.path.join(output_dir, output_file_name)
+            # Construct the corresponding output file names
+            output_file_names = [
+                file.replace(".png", f"_{i}.html") for i in range(0, N)
+            ]  # Assuming 3 outputs for each input

-            # Check if the output file exists
-            if os.path.exists(output_file_path):
-                with open(output_file_path, "r") as f:
-                    output_file_data = f.read()
-            else:
-                output_file_data = "Output file not found."
+            output_files_data: list[str] = []
+            for output_file_name in output_file_names:
+                output_file_path = os.path.join(output_dir, output_file_name)
+                # Check if the output file exists
+                if os.path.exists(output_file_path):
+                    with open(output_file_path, "r") as f:
+                        output_files_data.append(f.read())
+                else:
+                    output_files_data.append(
+                        "<html><h1>Output file not found.</h1></html>"
+                    )

            evals.append(
                Eval(
                    input=input_file,
-                    output=output_file_data,
+                    outputs=output_files_data,
                )
            )

--- a/backend/routes/generate_code.py
+++ b/backend/routes/generate_code.py
@ -2,19 +2,30 @@ import os
 import traceback
 from fastapi import APIRouter, WebSocket
 import openai
-from config import IS_PROD, SHOULD_MOCK_AI_RESPONSE
-from llm import stream_openai_response
+from codegen.utils import extract_html_content
+from config import ANTHROPIC_API_KEY, IS_PROD, SHOULD_MOCK_AI_RESPONSE
+from custom_types import InputMode
+from llm import (
+    Llm,
+    convert_frontend_str_to_llm,
+    stream_claude_response,
+    stream_claude_response_native,
+    stream_openai_response,
+)
 from openai.types.chat import ChatCompletionMessageParam
 from mock_llm import mock_completion
-from typing import Dict, List, cast, get_args
+from typing import Dict, List, Union, cast, get_args
 from image_generation import create_alt_url_mapping, generate_images
 from prompts import assemble_imported_code_prompt, assemble_prompt
-from access_token import validate_access_token
 from datetime import datetime
 import json
+from prompts.claude_prompts import VIDEO_PROMPT
 from prompts.types import Stack
+from utils import pprint_prompt

-from utils import pprint_prompt  # type: ignore
+# from utils import pprint_prompt
+from video.utils import extract_tag_content, assemble_claude_prompt_video
+from ws.constants import APP_ERROR_WEB_SOCKET_CODE  # type: ignore


 router = APIRouter()
@ -49,7 +60,7 @@ async def stream_code(websocket: WebSocket):
        message: str,
    ):
        await websocket.send_json({"type": "error", "value": message})
-        await websocket.close()
+        await websocket.close(APP_ERROR_WEB_SOCKET_CODE)

    # TODO: Are the values always strings?
    params: Dict[str, str] = await websocket.receive_json()
@ -60,52 +71,70 @@ async def stream_code(websocket: WebSocket):
    generated_code_config = ""
    if "generatedCodeConfig" in params and params["generatedCodeConfig"]:
        generated_code_config = params["generatedCodeConfig"]
-    print(f"Generating {generated_code_config} code")
-
-    # Get the OpenAI API key from the request. Fall back to environment variable if not provided.
-    # If neither is provided, we throw an error.
-    openai_api_key = None
-    if "accessCode" in params and params["accessCode"]:
-        print("Access code - using platform API key")
-        res = await validate_access_token(params["accessCode"])
-        if res["success"]:
-            openai_api_key = os.environ.get("PLATFORM_OPENAI_API_KEY")
-        else:
-            await websocket.send_json(
-                {
-                    "type": "error",
-                    "value": res["failure_reason"],
-                }
-            )
-            return
-    else:
-        if params["openAiApiKey"]:
-            openai_api_key = params["openAiApiKey"]
-            print("Using OpenAI API key from client-side settings dialog")
-        else:
-            openai_api_key = os.environ.get("OPENAI_API_KEY")
-            if openai_api_key:
-                print("Using OpenAI API key from environment variable")
-
-    if not openai_api_key:
-        print("OpenAI API key not found")
-        await websocket.send_json(
-            {
-                "type": "error",
-                "value": "No OpenAI API key found. Please add your API key in the settings dialog or add it to backend/.env file. If you add it to .env, make sure to restart the backend server.",
-            }
-        )
-        return
-
-    # Validate the generated code config
    if not generated_code_config in get_args(Stack):
        await throw_error(f"Invalid generated code config: {generated_code_config}")
        return
    # Cast the variable to the Stack type
    valid_stack = cast(Stack, generated_code_config)

+    # Validate the input mode
+    input_mode = params.get("inputMode")
+    if not input_mode in get_args(InputMode):
+        await throw_error(f"Invalid input mode: {input_mode}")
+        raise Exception(f"Invalid input mode: {input_mode}")
+    # Cast the variable to the right type
+    validated_input_mode = cast(InputMode, input_mode)
+
+    # Read the model from the request. Fall back to default if not provided.
+    code_generation_model_str = params.get(
+        "codeGenerationModel", Llm.GPT_4O_2024_05_13.value
+    )
+    try:
+        code_generation_model = convert_frontend_str_to_llm(code_generation_model_str)
+    except:
+        await throw_error(f"Invalid model: {code_generation_model_str}")
+        raise Exception(f"Invalid model: {code_generation_model_str}")
+    exact_llm_version = None
+
+    print(
+        f"Generating {generated_code_config} code for uploaded {input_mode} using {code_generation_model} model..."
+    )
+
+    # Get the OpenAI API key from the request. Fall back to environment variable if not provided.
+    # If neither is provided, we throw an error.
+    openai_api_key = None
+    if params["openAiApiKey"]:
+        openai_api_key = params["openAiApiKey"]
+        print("Using OpenAI API key from client-side settings dialog")
+    else:
+        openai_api_key = os.environ.get("OPENAI_API_KEY")
+        if openai_api_key:
+            print("Using OpenAI API key from environment variable")
+
+    if not openai_api_key and (
+        code_generation_model == Llm.GPT_4_VISION
+        or code_generation_model == Llm.GPT_4_TURBO_2024_04_09
+        or code_generation_model == Llm.GPT_4O_2024_05_13
+    ):
+        print("OpenAI API key not found")
+        await throw_error(
+            "No OpenAI API key found. Please add your API key in the settings dialog or add it to backend/.env file. If you add it to .env, make sure to restart the backend server."
+        )
+        return
+
+    # Get the Anthropic API key from the request. Fall back to environment variable if not provided.
+    # If neither is provided, we throw an error later only if Claude is used.
+    anthropic_api_key = None
+    if "anthropicApiKey" in params and params["anthropicApiKey"]:
+        anthropic_api_key = params["anthropicApiKey"]
+        print("Using Anthropic API key from client-side settings dialog")
+    else:
+        anthropic_api_key = ANTHROPIC_API_KEY
+        if anthropic_api_key:
+            print("Using Anthropic API key from environment variable")
+
    # Get the OpenAI Base URL from the request. Fall back to environment variable if not provided.
-    openai_base_url = None
+    openai_base_url: Union[str, None] = None
    # Disable user-specified OpenAI Base URL in prod
    if not os.environ.get("IS_PROD"):
        if "openAiBaseURL" in params and params["openAiBaseURL"]:
@ -139,7 +168,7 @@ async def stream_code(websocket: WebSocket):
    if params.get("isImportedFromCode") and params["isImportedFromCode"]:
        original_imported_code = params["history"][0]
        prompt_messages = assemble_imported_code_prompt(
-            original_imported_code, valid_stack
+            original_imported_code, valid_stack, code_generation_model
        )
        for index, text in enumerate(params["history"][1:]):
            if index % 2 == 0:
@ -190,18 +219,60 @@ async def stream_code(websocket: WebSocket):

            image_cache = create_alt_url_mapping(params["history"][-2])

-    pprint_prompt(prompt_messages)
+    if validated_input_mode == "video":
+        video_data_url = params["image"]
+        prompt_messages = await assemble_claude_prompt_video(video_data_url)
+
+    # pprint_prompt(prompt_messages)  # type: ignore

    if SHOULD_MOCK_AI_RESPONSE:
-        completion = await mock_completion(process_chunk)
+        completion = await mock_completion(
+            process_chunk, input_mode=validated_input_mode
+        )
    else:
        try:
-            completion = await stream_openai_response(
-                prompt_messages,
-                api_key=openai_api_key,
-                base_url=openai_base_url,
-                callback=lambda x: process_chunk(x),
-            )
+            if validated_input_mode == "video":
+                if not anthropic_api_key:
+                    await throw_error(
+                        "Video only works with Anthropic models. No Anthropic API key found. Please add the environment variable ANTHROPIC_API_KEY to backend/.env or in the settings dialog"
+                    )
+                    raise Exception("No Anthropic key")
+
+                completion = await stream_claude_response_native(
+                    system_prompt=VIDEO_PROMPT,
+                    messages=prompt_messages,  # type: ignore
+                    api_key=anthropic_api_key,
+                    callback=lambda x: process_chunk(x),
+                    model=Llm.CLAUDE_3_OPUS,
+                    include_thinking=True,
+                )
+                exact_llm_version = Llm.CLAUDE_3_OPUS
+            elif (
+                code_generation_model == Llm.CLAUDE_3_SONNET
+                or code_generation_model == Llm.CLAUDE_3_5_SONNET_2024_06_20
+            ):
+                if not anthropic_api_key:
+                    await throw_error(
+                        "No Anthropic API key found. Please add the environment variable ANTHROPIC_API_KEY to backend/.env or in the settings dialog"
+                    )
+                    raise Exception("No Anthropic key")
+
+                completion = await stream_claude_response(
+                    prompt_messages,  # type: ignore
+                    api_key=anthropic_api_key,
+                    callback=lambda x: process_chunk(x),
+                    model=code_generation_model,
+                )
+                exact_llm_version = code_generation_model
+            else:
+                completion = await stream_openai_response(
+                    prompt_messages,  # type: ignore
+                    api_key=openai_api_key,
+                    base_url=openai_base_url,
+                    callback=lambda x: process_chunk(x),
+                    model=code_generation_model,
+                )
+                exact_llm_version = code_generation_model
        except openai.AuthenticationError as e:
            print("[GENERATE_CODE] Authentication failed", e)
            error_message = (
@ -237,8 +308,16 @@ async def stream_code(websocket: WebSocket):
            )
            return await throw_error(error_message)

+    if validated_input_mode == "video":
+        completion = extract_tag_content("html", completion)
+
+    print("Exact used model for generation: ", exact_llm_version)
+
+    # Strip the completion of everything except the HTML content
+    completion = extract_html_content(completion)
+
    # Write the messages dict into a log so that we can debug later
-    write_logs(prompt_messages, completion)
+    write_logs(prompt_messages, completion)  # type: ignore

    try:
        if should_generate_images:
--- a/backend/run_evals.py
+++ b/backend/run_evals.py
@ -1,6 +1,8 @@
 # Load environment variables first
 from dotenv import load_dotenv

+from llm import Llm
+
 load_dotenv()

 import os
@ -11,6 +13,10 @@ from evals.config import EVALS_DIR
 from evals.core import generate_code_core
 from evals.utils import image_to_data_url

+STACK = "ionic_tailwind"
+MODEL = Llm.GPT_4O_2024_05_13
+N = 1  # Number of outputs to generate
+

 async def main():
    INPUT_DIR = EVALS_DIR + "/inputs"
@ -23,16 +29,21 @@ async def main():
    for filename in evals:
        filepath = os.path.join(INPUT_DIR, filename)
        data_url = await image_to_data_url(filepath)
-        task = generate_code_core(data_url, "vue_tailwind")
-        tasks.append(task)
+        for _ in range(N):  # Generate N tasks for each input
+            task = generate_code_core(image_url=data_url, stack=STACK, model=MODEL)
+            tasks.append(task)

    results = await asyncio.gather(*tasks)

    os.makedirs(OUTPUT_DIR, exist_ok=True)

-    for filename, content in zip(evals, results):
-        # File name is derived from the original filename in evals
-        output_filename = f"{os.path.splitext(filename)[0]}.html"
+    for i, content in enumerate(results):
+        # Calculate index for filename and output number
+        eval_index = i // N
+        output_number = i % N
+        filename = evals[eval_index]
+        # File name is derived from the original filename in evals with an added output number
+        output_filename = f"{os.path.splitext(filename)[0]}_{output_number}.html"
        output_filepath = os.path.join(OUTPUT_DIR, output_filename)
        with open(output_filepath, "w") as file:
            file.write(content)
--- a/backend/test_llm.py
+++ b/backend/test_llm.py
@ -0,0 +1,41 @@
+import unittest
+from llm import convert_frontend_str_to_llm, Llm
+
+
+class TestConvertFrontendStrToLlm(unittest.TestCase):
+    def test_convert_valid_strings(self):
+        self.assertEqual(
+            convert_frontend_str_to_llm("gpt_4_vision"),
+            Llm.GPT_4_VISION,
+            "Should convert 'gpt_4_vision' to Llm.GPT_4_VISION",
+        )
+        self.assertEqual(
+            convert_frontend_str_to_llm("claude_3_sonnet"),
+            Llm.CLAUDE_3_SONNET,
+            "Should convert 'claude_3_sonnet' to Llm.CLAUDE_3_SONNET",
+        )
+        self.assertEqual(
+            convert_frontend_str_to_llm("claude-3-opus-20240229"),
+            Llm.CLAUDE_3_OPUS,
+            "Should convert 'claude-3-opus-20240229' to Llm.CLAUDE_3_OPUS",
+        )
+        self.assertEqual(
+            convert_frontend_str_to_llm("gpt-4-turbo-2024-04-09"),
+            Llm.GPT_4_TURBO_2024_04_09,
+            "Should convert 'gpt-4-turbo-2024-04-09' to Llm.GPT_4_TURBO_2024_04_09",
+        )
+        self.assertEqual(
+            convert_frontend_str_to_llm("gpt-4o-2024-05-13"),
+            Llm.GPT_4O_2024_05_13,
+            "Should convert 'gpt-4o-2024-05-13' to Llm.GPT_4O_2024_05_13",
+        )
+
+    def test_convert_invalid_string_raises_exception(self):
+        with self.assertRaises(ValueError):
+            convert_frontend_str_to_llm("invalid_string")
+        with self.assertRaises(ValueError):
+            convert_frontend_str_to_llm("another_invalid_string")
+
+
+if __name__ == "__main__":
+    unittest.main()
--- a/backend/video/utils.py
+++ b/backend/video/utils.py
@ -0,0 +1,134 @@
+# Extract HTML content from the completion string
+import base64
+import io
+import mimetypes
+import os
+import tempfile
+import uuid
+from typing import Any, Union, cast
+from moviepy.editor import VideoFileClip  # type: ignore
+from PIL import Image
+import math
+
+
+DEBUG = True
+TARGET_NUM_SCREENSHOTS = (
+    20  # Should be max that Claude supports (20) - reduce to save tokens on testing
+)
+
+
+async def assemble_claude_prompt_video(video_data_url: str) -> list[Any]:
+    images = split_video_into_screenshots(video_data_url)
+
+    # Save images to tmp if we're debugging
+    if DEBUG:
+        save_images_to_tmp(images)
+
+    # Validate number of images
+    print(f"Number of frames extracted from video: {len(images)}")
+    if len(images) > 20:
+        print(f"Too many screenshots: {len(images)}")
+        raise ValueError("Too many screenshots extracted from video")
+
+    # Convert images to the message format for Claude
+    content_messages: list[dict[str, Union[dict[str, str], str]]] = []
+    for image in images:
+
+        # Convert Image to buffer
+        buffered = io.BytesIO()
+        image.save(buffered, format="JPEG")
+
+        # Encode bytes as base64
+        base64_data = base64.b64encode(buffered.getvalue()).decode("utf-8")
+        media_type = "image/jpeg"
+
+        content_messages.append(
+            {
+                "type": "image",
+                "source": {
+                    "type": "base64",
+                    "media_type": media_type,
+                    "data": base64_data,
+                },
+            }
+        )
+
+    return [
+        {
+            "role": "user",
+            "content": content_messages,
+        },
+    ]
+
+
+# Returns a list of images/frame (RGB format)
+def split_video_into_screenshots(video_data_url: str) -> list[Image.Image]:
+    target_num_screenshots = TARGET_NUM_SCREENSHOTS
+
+    # Decode the base64 URL to get the video bytes
+    video_encoded_data = video_data_url.split(",")[1]
+    video_bytes = base64.b64decode(video_encoded_data)
+
+    mime_type = video_data_url.split(";")[0].split(":")[1]
+    suffix = mimetypes.guess_extension(mime_type)
+
+    with tempfile.NamedTemporaryFile(suffix=suffix, delete=True) as temp_video_file:
+        print(temp_video_file.name)
+        temp_video_file.write(video_bytes)
+        temp_video_file.flush()
+        clip = VideoFileClip(temp_video_file.name)
+        images: list[Image.Image] = []
+        total_frames = cast(int, clip.reader.nframes)  # type: ignore
+
+        # Calculate frame skip interval by dividing total frames by the target number of screenshots
+        # Ensuring a minimum skip of 1 frame
+        frame_skip = max(1, math.ceil(total_frames / target_num_screenshots))
+
+        # Iterate over each frame in the clip
+        for i, frame in enumerate(clip.iter_frames()):
+            # Save every nth frame
+            if i % frame_skip == 0:
+                frame_image = Image.fromarray(frame)  # type: ignore
+                images.append(frame_image)
+                # Ensure that we don't capture more than the desired number of frames
+                if len(images) >= target_num_screenshots:
+                    break
+
+        # Close the video file to release resources
+        clip.close()
+
+        return images
+
+
+# Save a list of PIL images to a random temporary directory
+def save_images_to_tmp(images: list[Image.Image]):
+
+    # Create a unique temporary directory
+    unique_dir_name = f"screenshots_{uuid.uuid4()}"
+    tmp_screenshots_dir = os.path.join(tempfile.gettempdir(), unique_dir_name)
+    os.makedirs(tmp_screenshots_dir, exist_ok=True)
+
+    for idx, image in enumerate(images):
+        # Generate a unique image filename using index
+        image_filename = f"screenshot_{idx}.jpg"
+        tmp_filepath = os.path.join(tmp_screenshots_dir, image_filename)
+        image.save(tmp_filepath, format="JPEG")
+
+    print("Saved to " + tmp_screenshots_dir)
+
+
+def extract_tag_content(tag: str, text: str) -> str:
+    """
+    Extracts content for a given tag from the provided text.
+
+    :param tag: The tag to search for.
+    :param text: The text to search within.
+    :return: The content found within the tag, if any.
+    """
+    tag_start = f"<{tag}>"
+    tag_end = f"</{tag}>"
+    start_idx = text.find(tag_start)
+    end_idx = text.find(tag_end, start_idx)
+    if start_idx != -1 and end_idx != -1:
+        return text[start_idx : end_idx + len(tag_end)]
+    return ""
--- a/backend/video_to_app.py
+++ b/backend/video_to_app.py
@ -0,0 +1,122 @@
+# Load environment variables first
+
+from dotenv import load_dotenv
+
+load_dotenv()
+
+
+import base64
+import mimetypes
+import time
+import subprocess
+import os
+import asyncio
+from datetime import datetime
+from prompts.claude_prompts import VIDEO_PROMPT
+from utils import pprint_prompt
+from config import ANTHROPIC_API_KEY
+from video.utils import extract_tag_content, assemble_claude_prompt_video
+from llm import (
+    Llm,
+    stream_claude_response_native,
+)
+
+STACK = "html_tailwind"
+
+VIDEO_DIR = "./video_evals/videos"
+SCREENSHOTS_DIR = "./video_evals/screenshots"
+OUTPUTS_DIR = "./video_evals/outputs"
+
+
+async def main():
+    video_filename = "shortest.mov"
+    is_followup = False
+
+    if not ANTHROPIC_API_KEY:
+        raise ValueError("ANTHROPIC_API_KEY is not set")
+
+    # Get previous HTML
+    previous_html = ""
+    if is_followup:
+        previous_html_file = max(
+            [
+                os.path.join(OUTPUTS_DIR, f)
+                for f in os.listdir(OUTPUTS_DIR)
+                if f.endswith(".html")
+            ],
+            key=os.path.getctime,
+        )
+        with open(previous_html_file, "r") as file:
+            previous_html = file.read()
+
+    video_file = os.path.join(VIDEO_DIR, video_filename)
+    mime_type = mimetypes.guess_type(video_file)[0]
+    with open(video_file, "rb") as file:
+        video_content = file.read()
+    video_data_url = (
+        f"data:{mime_type};base64,{base64.b64encode(video_content).decode('utf-8')}"
+    )
+
+    prompt_messages = await assemble_claude_prompt_video(video_data_url)
+
+    # Tell the model to continue
+    # {"role": "assistant", "content": SECOND_MESSAGE},
+    # {"role": "user", "content": "continue"},
+
+    if is_followup:
+        prompt_messages += [
+            {"role": "assistant", "content": previous_html},
+            {
+                "role": "user",
+                "content": "You've done a good job with a first draft. Improve this further based on the original instructions so that the app is fully functional like in the original video.",
+            },
+        ]  # type: ignore
+
+    async def process_chunk(content: str):
+        print(content, end="", flush=True)
+
+    response_prefix = "<thinking>"
+
+    pprint_prompt(prompt_messages)  # type: ignore
+
+    start_time = time.time()
+
+    completion = await stream_claude_response_native(
+        system_prompt=VIDEO_PROMPT,
+        messages=prompt_messages,
+        api_key=ANTHROPIC_API_KEY,
+        callback=lambda x: process_chunk(x),
+        model=Llm.CLAUDE_3_OPUS,
+        include_thinking=True,
+    )
+
+    end_time = time.time()
+
+    # Prepend the response prefix to the completion
+    completion = response_prefix + completion
+
+    # Extract the outputs
+    html_content = extract_tag_content("html", completion)
+    thinking = extract_tag_content("thinking", completion)
+
+    print(thinking)
+    print(f"Operation took {end_time - start_time} seconds")
+
+    os.makedirs(OUTPUTS_DIR, exist_ok=True)
+
+    # Generate a unique filename based on the current time
+    timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
+    filename = f"video_test_output_{timestamp}.html"
+    output_path = os.path.join(OUTPUTS_DIR, filename)
+
+    # Write the HTML content to the file
+    with open(output_path, "w") as file:
+        file.write(html_content)
+
+    print(f"Output file path: {output_path}")
+
+    # Show a notification
+    subprocess.run(["osascript", "-e", 'display notification "Coding Complete"'])
+
+
+asyncio.run(main())
--- a/backend/ws/init.py
+++ b/backend/ws/init.py
--- a/backend/ws/constants.py
+++ b/backend/ws/constants.py
@ -0,0 +1,2 @@
+# WebSocket protocol (RFC 6455) allows for the use of custom close codes in the range 4000-4999
+APP_ERROR_WEB_SOCKET_CODE = 4332
--- a/blog/evaluating-claude.md
+++ b/blog/evaluating-claude.md
@ -0,0 +1,59 @@
+# Claude 3 for converting screenshots to code
+
+Claude 3 dropped yesterday, claiming to rival GPT-4 on a wide variety of tasks. I maintain a very popular open source project called “screenshot-to-code” (this one!) that uses GPT-4 vision to convert screenshots/designs into clean code. Naturally, I was excited to see how good Claude 3 was at this task.
+
+**TLDR:** Claude 3 is on par with GPT-4 vision for screenshot to code, better in some ways but worse in others.
+
+## Evaluation Setup
+
+I don’t know of a public benchmark for “screenshot to code” so I created simple evaluation setup for the purposes of testing:
+
+- **Evaluation Dataset**: 16 screenshots with a mix of UI elements, landing pages, dashboards and popular websites.
+<img width="784" alt="Screenshot 2024-03-05 at 3 05 52 PM" src="https://github.com/abi/screenshot-to-code/assets/23818/c32af2db-eb5a-44c1-9a19-2f0c3dd11ab4">
+
+- **Evaluation Metric**: Replication accuracy, as in “How close does the generated code look to the screenshot?” While there are other metrics that are important like code quality, speed and so on, this is by far the #1 thing most users of the repo care about.
+- **Evaluation Mechanism**: Each output is subjectively rated by a human on a rating scale from 0 to 4. 4 = very close to an exact replica while 0 = nothing like the screenshot. With 16 screenshots, the maximum any model can score is 64.
+
+
+To make the evaluation process easy, I created [a Python script](https://github.com/abi/screenshot-to-code/blob/main/backend/run_evals.py) that runs code for all the inputs in parallel. I also made a simple UI to do a side-by-side comparison of the input and output.
+
+![Google Chrome](https://github.com/abi/screenshot-to-code/assets/23818/38126f8f-205d-4ed1-b8cf-039e81dcc3d0)
+
+
+## Results
+
+Quick note about what kind of code we’ll be generating: currently, screenshot-to-code supports generating code in HTML + Tailwind, React, Vue, and several other frameworks. Stacks can impact the replication accuracy quite a bit. For example, because Bootstrap uses a relatively restrictive set of user elements, generations using Bootstrap tend to have a distinct "Bootstrap" style.
+
+I only ran the evals on HTML/Tailwind here which is the stack where GPT-4 vision tends to perform the best.
+
+Here are the results (average of 3 runs for each model):
+
+- GPT-4 Vision obtains a score of **65.10%** - this is what we’re trying to beat
+- Claude 3 Sonnet receives a score of **70.31%**, which is a bit better.
+- Surprisingly, Claude 3 Opus which is supposed to be the smarter and slower model scores worse than both GPT-4 vision and Claude 3 Sonnet, comes in at **61.46%**.
+
+Overall, a very strong showing for Claude 3. Obviously, there's a lot of subjectivity involved in this evaluation but Claude 3 is definitely on par with GPT-4 Vision, if not better.
+
+You can see the [side-by-side comparison for a run of Claude 3 Sonnet here](https://github.com/abi/screenshot-to-code-files/blob/main/sonnet%20results.png). And for [a run of GPT-4 Vision here](https://github.com/abi/screenshot-to-code-files/blob/main/gpt%204%20vision%20results.png).
+
+Other notes:
+
+- The prompts used are optimized for GPT-4 vision. Adjusting the prompts a bit for Claude did yield a small improvement. But nothing game-changing and potentially not worth the trade-off of maintaining two sets of prompts.
+- All the models excel at code quality - the quality is usually comparable to a human or better.
+- Claude 3 is much less lazy than GPT-4 Vision. When asked to recreate Hacker News, GPT-4 Vision will only create two items in the list and leave comments in this code like `<!-- Repeat for each news item -->` and `<!-- ... other news items ... -->`.
+<img width="699" alt="Screenshot 2024-03-05 at 9 25 04 PM" src="https://github.com/abi/screenshot-to-code/assets/23818/04b03155-45e0-40b0-8de0-b1f0b4382bee">
+
+While Claude 3 Sonnet can sometimes be lazy too, most of the time, it does what you ask it to do.
+
+<img width="904" alt="Screenshot 2024-03-05 at 9 30 23 PM" src="https://github.com/abi/screenshot-to-code/assets/23818/b7c7d1ba-47c1-414d-928f-6989e81cf41d">
+
+- For some reason, all the models struggle with side-by-side "flex" layouts
+<img width="1090" alt="Screenshot 2024-03-05 at 9 20 58 PM" src="https://github.com/abi/screenshot-to-code/assets/23818/8957bb3a-da66-467d-997d-1c7cc24e6d9a">
+
+- Claude 3 Sonnet is a lot faster
+- Claude 3 gets background and text colors wrong quite often! (like in the Hacker News image above)
+- My suspicion is that Claude 3 Opus results can be improved to be on par with the other models through better prompting
+  
+Overall, I'm very impressed with Claude 3 Sonnet for this use case. I've added it as an alternative to GPT-4 Vision in the open source repo (hosted version update coming soon).
+
+If you’d like to contribute to this effort, I have some documentation on [running these evals yourself here](https://github.com/abi/screenshot-to-code/blob/main/Evaluation.md). I'm also working on a better evaluation mechanism with Elo ratings and would love some help on that.
--- a/frontend/.gitignore
+++ b/frontend/.gitignore
@ -25,3 +25,6 @@ dist-ssr

 # Env files
 .env*
+
+# Test files
+src/tests/results/
--- a/frontend/Dockerfile
+++ b/frontend/Dockerfile
@ -1,4 +1,4 @@
-FROM node:20.9-bullseye-slim
+FROM node:22-bullseye-slim

 # Set the working directory in the container
 WORKDIR /app
@ -6,6 +6,9 @@ WORKDIR /app
 # Copy package.json and yarn.lock
 COPY package.json yarn.lock /app/

+# Set the environment variable to skip Puppeteer download
+ENV PUPPETEER_SKIP_DOWNLOAD=true
+
 # Install dependencies
 RUN yarn install

@ -16,4 +19,4 @@ COPY ./ /app/
 EXPOSE 5173

 # Command to run the application
-CMD ["yarn", "dev", "--host", "0.0.0.0"]
+CMD ["yarn", "dev", "--host", "0.0.0.0"]
--- a/frontend/components.json
+++ b/frontend/components.json
@ -13,4 +13,4 @@
    "components": "@/components",
    "utils": "@/lib/utils"
  }
-}
+}
--- a/frontend/index.html
+++ b/frontend/index.html
@ -2,11 +2,7 @@
 <html lang="en">
  <head>
    <meta charset="UTF-8" />
-    <link
-      rel="icon"
-      type="image/svg+xml"
-      href="https://picoapps.xyz/favicon.png"
-    />
+    <link rel="icon" type="image/png" href="/favicon/main.png" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />

    <!-- Google Fonts -->
--- a/frontend/jest.config.js
+++ b/frontend/jest.config.js
@ -0,0 +1,9 @@
+export default {
+  preset: "ts-jest",
+  testEnvironment: "node",
+  setupFiles: ["<rootDir>/src/setupTests.ts"],
+  transform: {
+    "^.+\\.tsx?$": "ts-jest",
+  },
+  testTimeout: 30000,
+};
--- a/frontend/package.json
+++ b/frontend/package.json
@ -10,7 +10,7 @@
    "build-hosted": "tsc && vite build --mode prod",
    "lint": "eslint . --ext ts,tsx --report-unused-disable-directives --max-warnings 0",
    "preview": "vite preview",
-    "test": "vitest"
+    "test": "jest"
  },
  "dependencies": {
    "@codemirror/lang-html": "^6.4.6",
@ -45,21 +45,29 @@
    "tailwind-merge": "^2.0.0",
    "tailwindcss-animate": "^1.0.7",
    "thememirror": "^2.0.1",
-    "vite-plugin-checker": "^0.6.2"
+    "vite-plugin-checker": "^0.6.2",
+    "webm-duration-fix": "^1.0.4",
+    "zustand": "^4.5.2"
  },
  "devDependencies": {
+    "@types/jest": "^29.5.12",
    "@types/node": "^20.9.0",
+    "@types/puppeteer": "^7.0.4",
    "@types/react": "^18.2.15",
    "@types/react-dom": "^18.2.7",
    "@typescript-eslint/eslint-plugin": "^6.0.0",
    "@typescript-eslint/parser": "^6.0.0",
    "@vitejs/plugin-react": "^4.0.3",
    "autoprefixer": "^10.4.16",
+    "dotenv": "^16.4.5",
    "eslint": "^8.45.0",
    "eslint-plugin-react-hooks": "^4.6.0",
    "eslint-plugin-react-refresh": "^0.4.3",
+    "jest": "^29.7.0",
    "postcss": "^8.4.31",
+    "puppeteer": "^22.6.4",
    "tailwindcss": "^3.3.5",
+    "ts-jest": "^29.1.2",
    "typescript": "^5.0.2",
    "vite": "^4.4.5",
    "vite-plugin-html": "^3.2.0",
--- a/frontend/public/favicon/coding.png
+++ b/frontend/public/favicon/coding.png
--- a/frontend/public/favicon/main.png
+++ b/frontend/public/favicon/main.png
--- a/frontend/src/.env.jest.example
+++ b/frontend/src/.env.jest.example
@ -0,0 +1,2 @@
+TEST_SCREENSHOTONE_API_KEY=
+TEST_ROOT_PATH=
--- a/frontend/src/App.tsx
+++ b/frontend/src/App.tsx
@ -35,6 +35,13 @@ import { extractHistoryTree } from "./components/history/utils";
 import toast from "react-hot-toast";
 import ImportCodeSection from "./components/ImportCodeSection";
 import { Stack } from "./lib/stacks";
+import { CodeGenerationModel } from "./lib/models";
+import ModelSettingsSection from "./components/ModelSettingsSection";
+import { extractHtml } from "./components/preview/extractHtml";
+import useBrowserTabIndicator from "./hooks/useBrowserTabIndicator";
+import TipLink from "./components/core/TipLink";
+import SelectAndEditModeToggleButton from "./components/select-and-edit/SelectAndEditModeToggleButton";
+import { useAppStore } from "./store/app-store";

 const IS_OPENAI_DOWN = false;

@ -42,27 +49,36 @@ function App() {
  const [appState, setAppState] = useState<AppState>(AppState.INITIAL);
  const [generatedCode, setGeneratedCode] = useState<string>("");

+  const [inputMode, setInputMode] = useState<"image" | "video">("image");
+
  const [referenceImages, setReferenceImages] = useState<string[]>([]);
  const [executionConsole, setExecutionConsole] = useState<string[]>([]);
  const [updateInstruction, setUpdateInstruction] = useState("");
  const [isImportedFromCode, setIsImportedFromCode] = useState<boolean>(false);

+  const { disableInSelectAndEditMode } = useAppStore();
+
  // Settings
  const [settings, setSettings] = usePersistedState<Settings>(
    {
      openAiApiKey: null,
      openAiBaseURL: null,
+      anthropicApiKey: null,
      screenshotOneApiKey: null,
      isImageGenerationEnabled: true,
      editorTheme: EditorTheme.COBALT,
      generatedCodeConfig: Stack.HTML_TAILWIND,
+      codeGenerationModel: CodeGenerationModel.CLAUDE_3_5_SONNET_2024_06_20,
      // Only relevant for hosted version
      isTermOfServiceAccepted: false,
-      accessCode: null,
    },
    "setting"
  );

+  // Code generation model from local storage or the default value
+  const selectedCodeGenerationModel =
+    settings.codeGenerationModel || CodeGenerationModel.GPT_4_VISION;
+
  // App history
  const [appHistory, setAppHistory] = useState<History>([]);
  // Tracks the currently shown version from app history
@ -73,6 +89,26 @@ function App() {

  const wsRef = useRef<WebSocket>(null);

+  const showReactWarning =
+    selectedCodeGenerationModel ===
+      CodeGenerationModel.GPT_4_TURBO_2024_04_09 &&
+    settings.generatedCodeConfig === Stack.REACT_TAILWIND;
+
+  const showBetterModelMessage =
+    selectedCodeGenerationModel !== CodeGenerationModel.GPT_4O_2024_05_13 &&
+    selectedCodeGenerationModel !==
+      CodeGenerationModel.CLAUDE_3_5_SONNET_2024_06_20 &&
+    appState === AppState.INITIAL;
+
+  const showSelectAndEditFeature =
+    (selectedCodeGenerationModel === CodeGenerationModel.GPT_4O_2024_05_13 ||
+      selectedCodeGenerationModel ===
+        CodeGenerationModel.CLAUDE_3_5_SONNET_2024_06_20) &&
+    settings.generatedCodeConfig === Stack.HTML_TAILWIND;
+
+  // Indicate coding state using the browser tab's favicon and title
+  useBrowserTabIndicator(appState === AppState.CODING);
+
  // When the user already has the settings in local storage, newly added keys
  // do not get added to the settings so if it's falsy, we populate it with the default
  // value
@ -125,6 +161,26 @@ function App() {
    setAppHistory([]);
    setCurrentVersion(null);
    setShouldIncludeResultImage(false);
+    disableInSelectAndEditMode();
+  };
+
+  const regenerate = () => {
+    if (currentVersion === null) {
+      toast.error(
+        "No current version set. Please open a Github issue as this shouldn't happen."
+      );
+      return;
+    }
+
+    // Retrieve the previous command
+    const previousCommand = appHistory[currentVersion];
+    if (previousCommand.type !== "ai_create") {
+      toast.error("Only the first version can be regenerated.");
+      return;
+    }
+
+    // Re-run the create
+    doCreate(referenceImages, inputMode);
  };

  const cancelCodeGeneration = () => {
@ -133,6 +189,11 @@ function App() {
    cancelCodeGenerationAndReset();
  };

+  const previewCode =
+    inputMode === "video" && appState === AppState.CODING
+      ? extractHtml(generatedCode)
+      : generatedCode;
+
  const cancelCodeGenerationAndReset = () => {
    // When this is the first version, reset the entire app state
    if (currentVersion === null) {
@ -189,7 +250,9 @@ function App() {
                parentIndex: parentVersion,
                code,
                inputs: {
-                  prompt: updateInstruction,
+                  prompt: params.history
+                    ? params.history[params.history.length - 1]
+                    : updateInstruction,
                },
              },
            ];
@ -212,16 +275,18 @@ function App() {
  }

  // Initial version creation
-  function doCreate(referenceImages: string[]) {
+  function doCreate(referenceImages: string[], inputMode: "image" | "video") {
    // Reset any existing state
    reset();

    setReferenceImages(referenceImages);
+    setInputMode(inputMode);
    if (referenceImages.length > 0) {
      doGenerateCode(
        {
          generationType: "create",
          image: referenceImages[0],
+          inputMode,
        },
        currentVersion
      );
@ -229,7 +294,15 @@ function App() {
  }

  // Subsequent updates
-  async function doUpdate() {
+  async function doUpdate(
+    updateInstruction: string,
+    selectedElement?: HTMLElement
+  ) {
+    if (updateInstruction.trim() === "") {
+      toast.error("Please include some instructions for AI on what to update.");
+      return;
+    }
+
    if (currentVersion === null) {
      toast.error(
        "No current version set. Contact support or open a Github issue."
@ -247,13 +320,24 @@ function App() {
      return;
    }

-    const updatedHistory = [...historyTree, updateInstruction];
+    let modifiedUpdateInstruction = updateInstruction;
+
+    // Send in a reference to the selected element if it exists
+    if (selectedElement) {
+      modifiedUpdateInstruction =
+        updateInstruction +
+        " referring to this element specifically: " +
+        selectedElement.outerHTML;
+    }
+
+    const updatedHistory = [...historyTree, modifiedUpdateInstruction];

    if (shouldIncludeResultImage) {
      const resultImage = await takeScreenshot();
      doGenerateCode(
        {
          generationType: "update",
+          inputMode,
          image: referenceImages[0],
          resultImage: resultImage,
          history: updatedHistory,
@ -265,6 +349,7 @@ function App() {
      doGenerateCode(
        {
          generationType: "update",
+          inputMode,
          image: referenceImages[0],
          history: updatedHistory,
          isImportedFromCode,
@ -291,6 +376,13 @@ function App() {
    }));
  }

+  function setCodeGenerationModel(codeGenerationModel: CodeGenerationModel) {
+    setSettings((prev) => ({
+      ...prev,
+      codeGenerationModel,
+    }));
+  }
+
  function importFromCode(code: string, stack: Stack) {
    setIsImportedFromCode(true);

@ -312,7 +404,7 @@ function App() {

  return (
    <div className="mt-2 dark:bg-black dark:text-white">
-      {IS_RUNNING_ON_CLOUD && <PicoBadge settings={settings} />}
+      {IS_RUNNING_ON_CLOUD && <PicoBadge />}
      {IS_RUNNING_ON_CLOUD && (
        <TermsOfServiceDialog
          open={!settings.isTermOfServiceAccepted}
@ -334,10 +426,33 @@ function App() {
            }
          />

-          {IS_RUNNING_ON_CLOUD &&
-            !(settings.openAiApiKey || settings.accessCode) && (
-              <OnboardingNote />
-            )}
+          <ModelSettingsSection
+            codeGenerationModel={selectedCodeGenerationModel}
+            setCodeGenerationModel={setCodeGenerationModel}
+            shouldDisableUpdates={
+              appState === AppState.CODING || appState === AppState.CODE_READY
+            }
+          />
+
+          {showReactWarning && (
+            <div className="text-sm bg-yellow-200 rounded p-2">
+              Sorry - React is not currently working with GPT-4 Turbo. Please
+              use GPT-4 Vision or Claude Sonnet. We are working on a fix.
+            </div>
+          )}
+
+          {showBetterModelMessage && (
+            <div className="rounded-lg p-2 bg-fuchsia-200">
+              <p className="text-gray-800 text-sm">
+                Now supporting GPT-4o and Claude Sonnet 3.5. Higher quality and
+                2x faster. Give it a try!
+              </p>
+            </div>
+          )}
+
+          {appState !== AppState.CODE_READY && <TipLink />}
+
+          {IS_RUNNING_ON_CLOUD && !settings.openAiApiKey && <OnboardingNote />}

          {IS_OPENAI_DOWN && (
            <div className="bg-black text-white dark:bg-white dark:text-black p-3 rounded">
@ -352,11 +467,25 @@ function App() {
              {/* Show code preview only when coding */}
              {appState === AppState.CODING && (
                <div className="flex flex-col">
+                  {/* Speed disclaimer for video mode */}
+                  {inputMode === "video" && (
+                    <div
+                      className="bg-yellow-100 border-l-4 border-yellow-500 text-yellow-700
+                    p-2 text-xs mb-4 mt-1"
+                    >
+                      Code generation from videos can take 3-4 minutes. We do
+                      multiple passes to get the best result. Please be patient.
+                    </div>
+                  )}
+
                  <div className="flex items-center gap-x-1">
                    <Spinner />
                    {executionConsole.slice(-1)[0]}
                  </div>
-                  <div className="flex mt-4 w-full">
+
+                  <CodePreview code={generatedCode} />
+
+                  <div className="flex w-full">
                    <Button
                      onClick={cancelCodeGeneration}
                      className="w-full dark:text-white dark:bg-gray-700"
@ -364,7 +493,6 @@ function App() {
                      Cancel
                    </Button>
                  </div>
-                  <CodePreview code={generatedCode} />
                </div>
              )}

@ -387,26 +515,25 @@ function App() {
                      />
                    </div>
                    <Button
-                      onClick={doUpdate}
-                      className="dark:text-white dark:bg-gray-700"
+                      onClick={() => doUpdate(updateInstruction)}
+                      className="dark:text-white dark:bg-gray-700 update-btn"
                    >
                      Update
                    </Button>
                  </div>
-                  <div className="flex items-center gap-x-2 mt-2">
+                  <div className="flex items-center justify-end gap-x-2 mt-2">
                    <Button
-                      onClick={downloadCode}
-                      className="flex items-center gap-x-2 dark:text-white dark:bg-gray-700"
+                      onClick={regenerate}
+                      className="flex items-center gap-x-2 dark:text-white dark:bg-gray-700 regenerate-btn"
                    >
-                      <FaDownload /> Download
-                    </Button>
-                    <Button
-                      onClick={reset}
-                      className="flex items-center gap-x-2 dark:text-white dark:bg-gray-700"
-                    >
-                      <FaUndo />
-                      Reset
+                      🔄 Regenerate
                    </Button>
+                    {showSelectAndEditFeature && (
+                      <SelectAndEditModeToggleButton />
+                    )}
+                  </div>
+                  <div className="flex justify-end items-center mt-2">
+                    <TipLink />
                  </div>
                </div>
              )}
@ -420,14 +547,27 @@ function App() {
                        "scanning relative": appState === AppState.CODING,
                      })}
                    >
-                      <img
-                        className="w-[340px] border border-gray-200 rounded-md"
-                        src={referenceImages[0]}
-                        alt="Reference"
-                      />
+                      {inputMode === "image" && (
+                        <img
+                          className="w-[340px] border border-gray-200 rounded-md"
+                          src={referenceImages[0]}
+                          alt="Reference"
+                        />
+                      )}
+                      {inputMode === "video" && (
+                        <video
+                          muted
+                          autoPlay
+                          loop
+                          className="w-[340px] border border-gray-200 rounded-md"
+                          src={referenceImages[0]}
+                        />
+                      )}
                    </div>
                    <div className="text-gray-400 uppercase text-sm text-center mt-1">
-                      Original Screenshot
+                      {inputMode === "video"
+                        ? "Original Video"
+                        : "Original Screenshot"}
                    </div>
                  </div>
                )}
@ -482,29 +622,59 @@ function App() {
        {(appState === AppState.CODING || appState === AppState.CODE_READY) && (
          <div className="ml-4">
            <Tabs defaultValue="desktop">
-              <div className="flex justify-end mr-8 mb-4">
-                <TabsList>
-                  <TabsTrigger value="desktop" className="flex gap-x-2">
-                    <FaDesktop /> Desktop
-                  </TabsTrigger>
-                  <TabsTrigger value="mobile" className="flex gap-x-2">
-                    <FaMobile /> Mobile
-                  </TabsTrigger>
-                  <TabsTrigger value="code" className="flex gap-x-2">
-                    <FaCode />
-                    Code
-                  </TabsTrigger>
-                </TabsList>
+              <div className="flex justify-between mr-8 mb-4">
+                <div className="flex items-center gap-x-2">
+                  {appState === AppState.CODE_READY && (
+                    <>
+                      <Button
+                        onClick={reset}
+                        className="flex items-center ml-4 gap-x-2 dark:text-white dark:bg-gray-700"
+                      >
+                        <FaUndo />
+                        Reset
+                      </Button>
+                      <Button
+                        onClick={downloadCode}
+                        variant="secondary"
+                        className="flex items-center gap-x-2 mr-4 dark:text-white dark:bg-gray-700 download-btn"
+                      >
+                        <FaDownload /> Download
+                      </Button>
+                    </>
+                  )}
+                </div>
+                <div className="flex items-center">
+                  <TabsList>
+                    <TabsTrigger value="desktop" className="flex gap-x-2">
+                      <FaDesktop /> Desktop
+                    </TabsTrigger>
+                    <TabsTrigger value="mobile" className="flex gap-x-2">
+                      <FaMobile /> Mobile
+                    </TabsTrigger>
+                    <TabsTrigger value="code" className="flex gap-x-2">
+                      <FaCode />
+                      Code
+                    </TabsTrigger>
+                  </TabsList>
+                </div>
              </div>
              <TabsContent value="desktop">
-                <Preview code={generatedCode} device="desktop" />
+                <Preview
+                  code={previewCode}
+                  device="desktop"
+                  doUpdate={doUpdate}
+                />
              </TabsContent>
              <TabsContent value="mobile">
-                <Preview code={generatedCode} device="mobile" />
+                <Preview
+                  code={previewCode}
+                  device="mobile"
+                  doUpdate={doUpdate}
+                />
              </TabsContent>
              <TabsContent value="code">
                <CodeTab
-                  code={generatedCode}
+                  code={previewCode}
                  setCode={setGeneratedCode}
                  settings={settings}
                />
--- a/frontend/src/components/ImageUpload.tsx
+++ b/frontend/src/components/ImageUpload.tsx
@ -3,6 +3,10 @@ import { useState, useEffect, useMemo } from "react";
 import { useDropzone } from "react-dropzone";
 // import { PromptImage } from "../../../types";
 import { toast } from "react-hot-toast";
+import { URLS } from "../urls";
+import { Badge } from "./ui/badge";
+import ScreenRecorder from "./recording/ScreenRecorder";
+import { ScreenRecorderState } from "../types";

 const baseStyle = {
  flex: 1,
@ -51,19 +55,31 @@ type FileWithPreview = {
 } & File;

 interface Props {
-  setReferenceImages: (referenceImages: string[]) => void;
+  setReferenceImages: (
+    referenceImages: string[],
+    inputMode: "image" | "video"
+  ) => void;
 }

 function ImageUpload({ setReferenceImages }: Props) {
  const [files, setFiles] = useState<FileWithPreview[]>([]);
+  // TODO: Switch to Zustand
+  const [screenRecorderState, setScreenRecorderState] =
+    useState<ScreenRecorderState>(ScreenRecorderState.INITIAL);
+
  const { getRootProps, getInputProps, isFocused, isDragAccept, isDragReject } =
    useDropzone({
      maxFiles: 1,
-      maxSize: 1024 * 1024 * 5, // 5 MB
+      maxSize: 1024 * 1024 * 20, // 20 MB
      accept: {
+        // Image formats
        "image/png": [".png"],
        "image/jpeg": [".jpeg"],
        "image/jpg": [".jpg"],
+        // Video formats
+        "video/quicktime": [".mov"],
+        "video/mp4": [".mp4"],
+        "video/webm": [".webm"],
      },
      onDrop: (acceptedFiles) => {
        // Set up the preview thumbnail images
@ -78,7 +94,14 @@ function ImageUpload({ setReferenceImages }: Props) {
        // Convert images to data URLs and set the prompt images state
        Promise.all(acceptedFiles.map((file) => fileToDataURL(file)))
          .then((dataUrls) => {
-            setReferenceImages(dataUrls.map((dataUrl) => dataUrl as string));
+            if (dataUrls.length > 0) {
+              setReferenceImages(
+                dataUrls.map((dataUrl) => dataUrl as string),
+                (dataUrls[0] as string).startsWith("data:video")
+                  ? "video"
+                  : "image"
+              );
+            }
          })
          .catch((error) => {
            toast.error("Error reading files" + error);
@ -140,15 +163,34 @@ function ImageUpload({ setReferenceImages }: Props) {

  return (
    <section className="container">
-      {/* eslint-disable-next-line @typescript-eslint/no-explicit-any */}
-      <div {...getRootProps({ style: style as any })}>
-        <input {...getInputProps()} />
-        <p className="text-slate-700 text-lg">
-          Drag & drop a screenshot here, <br />
-          or paste from clipboard, <br />
-          or click to upload
-        </p>
-      </div>
+      {screenRecorderState === ScreenRecorderState.INITIAL && (
+        /* eslint-disable-next-line @typescript-eslint/no-explicit-any */
+        <div {...getRootProps({ style: style as any })}>
+          <input {...getInputProps()} className="file-input" />
+          <p className="text-slate-700 text-lg">
+            Drag & drop a screenshot here, <br />
+            or click to upload
+          </p>
+        </div>
+      )}
+      {screenRecorderState === ScreenRecorderState.INITIAL && (
+        <div className="text-center text-sm text-slate-800 mt-4">
+          <Badge>New!</Badge> Upload a screen recording (.mp4, .mov) or record
+          your screen to clone a whole app (experimental).{" "}
+          <a
+            className="underline"
+            href={URLS["intro-to-video"]}
+            target="_blank"
+          >
+            Learn more.
+          </a>
+        </div>
+      )}
+      <ScreenRecorder
+        screenRecorderState={screenRecorderState}
+        setScreenRecorderState={setScreenRecorderState}
+        generateCode={setReferenceImages}
+      />
    </section>
  );
 }
--- a/frontend/src/components/ImportCodeSection.tsx
+++ b/frontend/src/components/ImportCodeSection.tsx
@ -38,7 +38,9 @@ function ImportCodeSection({ importFromCode }: Props) {
  return (
    <Dialog>
      <DialogTrigger asChild>
-        <Button variant="secondary">Import from Code</Button>
+        <Button className="import-from-code-btn" variant="secondary">
+          Import from Code
+        </Button>
      </DialogTrigger>
      <DialogContent className="sm:max-w-[425px]">
        <DialogHeader>
@ -62,7 +64,7 @@ function ImportCodeSection({ importFromCode }: Props) {
        />

        <DialogFooter>
-          <Button type="submit" onClick={doImport}>
+          <Button className="import-btn" type="submit" onClick={doImport}>
            Import
          </Button>
        </DialogFooter>
--- a/frontend/src/components/ModelSettingsSection.tsx
+++ b/frontend/src/components/ModelSettingsSection.tsx
@ -0,0 +1,65 @@
+import {
+  Select,
+  SelectContent,
+  SelectGroup,
+  SelectItem,
+  SelectTrigger,
+} from "./ui/select";
+import {
+  CODE_GENERATION_MODEL_DESCRIPTIONS,
+  CodeGenerationModel,
+} from "../lib/models";
+import { Badge } from "./ui/badge";
+
+interface Props {
+  codeGenerationModel: CodeGenerationModel;
+  setCodeGenerationModel: (codeGenerationModel: CodeGenerationModel) => void;
+  shouldDisableUpdates?: boolean;
+}
+
+function ModelSettingsSection({
+  codeGenerationModel,
+  setCodeGenerationModel,
+  shouldDisableUpdates = false,
+}: Props) {
+  return (
+    <div className="flex flex-col gap-y-2 justify-between text-sm">
+      <div className="grid grid-cols-3 items-center gap-4">
+        <span>AI Model:</span>
+        <Select
+          value={codeGenerationModel}
+          onValueChange={(value: string) =>
+            setCodeGenerationModel(value as CodeGenerationModel)
+          }
+          disabled={shouldDisableUpdates}
+        >
+          <SelectTrigger className="col-span-2" id="output-settings-js">
+            <span className="font-semibold">
+              {CODE_GENERATION_MODEL_DESCRIPTIONS[codeGenerationModel].name}
+            </span>
+          </SelectTrigger>
+          <SelectContent>
+            <SelectGroup>
+              {Object.values(CodeGenerationModel).map((model) => (
+                <SelectItem key={model} value={model}>
+                  <div className="flex items-center">
+                    <span className="font-semibold">
+                      {CODE_GENERATION_MODEL_DESCRIPTIONS[model].name}
+                    </span>
+                    {CODE_GENERATION_MODEL_DESCRIPTIONS[model].inBeta && (
+                      <Badge className="ml-2" variant="secondary">
+                        Beta
+                      </Badge>
+                    )}
+                  </div>
+                </SelectItem>
+              ))}
+            </SelectGroup>
+          </SelectContent>
+        </Select>
+      </div>
+    </div>
+  );
+}
+
+export default ModelSettingsSection;
--- a/frontend/src/components/PicoBadge.tsx
+++ b/frontend/src/components/PicoBadge.tsx
@ -1,6 +1,4 @@
-import { Settings } from "../types";
-
-export function PicoBadge({ settings }: { settings: Settings }) {
+export function PicoBadge() {
  return (
    <>
      <a
@ -14,26 +12,14 @@ export function PicoBadge({ settings }: { settings: Settings }) {
          feature requests?
        </div>
      </a>
-      {!settings.accessCode && (
-        <a href="https://picoapps.xyz?ref=screenshot-to-code" target="_blank">
-          <div
-            className="fixed z-50 bottom-5 right-5 rounded-md shadow text-black
+      <a href="https://picoapps.xyz?ref=screenshot-to-code" target="_blank">
+        <div
+          className="fixed z-50 bottom-5 right-5 rounded-md shadow text-black
         bg-white px-4 text-xs py-3 cursor-pointer"
-          >
-            an open source project by Pico
-          </div>
-        </a>
-      )}
-      {settings.accessCode && (
-        <a href="mailto:support@picoapps.xyz" target="_blank">
-          <div
-            className="fixed z-50 bottom-5 right-5 rounded-md shadow text-black
-         bg-white px-4 text-xs py-3 cursor-pointer"
-          >
-            email support
-          </div>
-        </a>
-      )}
+        >
+          an open source project by Pico
+        </div>
+      </a>
    </>
  );
 }
--- a/frontend/src/components/Preview.tsx
+++ b/frontend/src/components/Preview.tsx
@ -1,24 +1,35 @@
-import { useEffect, useRef } from "react";
+import { useEffect, useRef, useState } from "react";
 import classNames from "classnames";
-// import useThrottle from "../hooks/useThrottle";
+import useThrottle from "../hooks/useThrottle";
+import EditPopup from "./select-and-edit/EditPopup";

 interface Props {
  code: string;
  device: "mobile" | "desktop";
+  doUpdate: (updateInstruction: string, selectedElement?: HTMLElement) => void;
 }

-function Preview({ code, device }: Props) {
-  const throttledCode = code;
-  // Temporary disable throttling for the preview not updating when the code changes
-  // useThrottle(code, 200);
+function Preview({ code, device, doUpdate }: Props) {
  const iframeRef = useRef<HTMLIFrameElement | null>(null);

+  // Don't update code more often than every 200ms.
+  const throttledCode = useThrottle(code, 200);
+
+  // Select and edit functionality
+  const [clickEvent, setClickEvent] = useState<MouseEvent | null>(null);
+
  useEffect(() => {
    const iframe = iframeRef.current;
-    if (iframe && iframe.contentDocument) {
-      iframe.contentDocument.open();
-      iframe.contentDocument.write(throttledCode);
-      iframe.contentDocument.close();
+    if (iframe) {
+      iframe.srcdoc = throttledCode;
+
+      // Set up click handler for select and edit functionality
+      iframe.addEventListener("load", function () {
+        iframe.contentWindow?.document.body.addEventListener(
+          "click",
+          setClickEvent
+        );
+      });
    }
  }, [throttledCode]);

@ -37,6 +48,7 @@ function Preview({ code, device }: Props) {
          }
        )}
      ></iframe>
+      <EditPopup event={clickEvent} iframeRef={iframeRef} doUpdate={doUpdate} />
    </div>
  );
 }
--- a/frontend/src/components/SettingsDialog.tsx
+++ b/frontend/src/components/SettingsDialog.tsx
@ -22,7 +22,6 @@ import {
  AccordionItem,
  AccordionTrigger,
 } from "./ui/accordion";
-import AccessCodeSection from "./settings/AccessCodeSection";

 interface Props {
  settings: Settings;
@ -47,15 +46,10 @@ function SettingsDialog({ settings, setSettings }: Props) {
          <DialogTitle className="mb-4">Settings</DialogTitle>
        </DialogHeader>

-        {/* Access code */}
-        {IS_RUNNING_ON_CLOUD && (
-          <AccessCodeSection settings={settings} setSettings={setSettings} />
-        )}
-
        <div className="flex items-center space-x-2">
          <Label htmlFor="image-generation">
            <div>DALL-E Placeholder Image Generation</div>
-            <div className="font-light mt-2">
+            <div className="font-light mt-2 text-xs">
              More fun with it but if you want to save money, turn it off.
            </div>
          </Label>
@ -70,29 +64,31 @@ function SettingsDialog({ settings, setSettings }: Props) {
            }
          />
        </div>
-        <div className="flex flex-col space-y-4">
-          <Label htmlFor="openai-api-key">
-            <div>OpenAI API key</div>
-            <div className="font-light mt-2 leading-relaxed">
-              Only stored in your browser. Never stored on servers. Overrides
-              your .env config.
-            </div>
-          </Label>
+        <div className="flex flex-col space-y-6">
+          <div>
+            <Label htmlFor="openai-api-key">
+              <div>OpenAI API key</div>
+              <div className="font-light mt-1 mb-2 text-xs leading-relaxed">
+                Only stored in your browser. Never stored on servers. Overrides
+                your .env config.
+              </div>
+            </Label>

-          <Input
-            id="openai-api-key"
-            placeholder="OpenAI API key"
-            value={settings.openAiApiKey || ""}
-            onChange={(e) =>
-              setSettings((s) => ({
-                ...s,
-                openAiApiKey: e.target.value,
-              }))
-            }
-          />
+            <Input
+              id="openai-api-key"
+              placeholder="OpenAI API key"
+              value={settings.openAiApiKey || ""}
+              onChange={(e) =>
+                setSettings((s) => ({
+                  ...s,
+                  openAiApiKey: e.target.value,
+                }))
+              }
+            />
+          </div>

          {!IS_RUNNING_ON_CLOUD && (
-            <>
+            <div>
              <Label htmlFor="openai-api-key">
                <div>OpenAI Base URL (optional)</div>
                <div className="font-light mt-2 leading-relaxed">
@ -111,9 +107,31 @@ function SettingsDialog({ settings, setSettings }: Props) {
                  }))
                }
              />
-            </>
+            </div>
          )}

+          <div>
+            <Label htmlFor="anthropic-api-key">
+              <div>Anthropic API key</div>
+              <div className="font-light mt-1 text-xs leading-relaxed">
+                Only stored in your browser. Never stored on servers. Overrides
+                your .env config.
+              </div>
+            </Label>
+
+            <Input
+              id="anthropic-api-key"
+              placeholder="Anthropic API key"
+              value={settings.anthropicApiKey || ""}
+              onChange={(e) =>
+                setSettings((s) => ({
+                  ...s,
+                  anthropicApiKey: e.target.value,
+                }))
+              }
+            />
+          </div>
+
          <Accordion type="single" collapsible className="w-full">
            <AccordionItem value="item-1">
              <AccordionTrigger>Screenshot by URL Config</AccordionTrigger>
--- a/frontend/src/components/UrlInputSection.tsx
+++ b/frontend/src/components/UrlInputSection.tsx
@ -6,7 +6,7 @@ import { toast } from "react-hot-toast";

 interface Props {
  screenshotOneApiKey: string | null;
-  doCreate: (urls: string[]) => void;
+  doCreate: (urls: string[], inputMode: "image" | "video") => void;
 }

 export function UrlInputSection({ doCreate, screenshotOneApiKey }: Props) {
@ -46,7 +46,7 @@ export function UrlInputSection({ doCreate, screenshotOneApiKey }: Props) {
        }

        const res = await response.json();
-        doCreate([res.url]);
+        doCreate([res.url], "image");
      } catch (error) {
        console.error(error);
        toast.error(
@ -69,7 +69,7 @@ export function UrlInputSection({ doCreate, screenshotOneApiKey }: Props) {
      <Button
        onClick={takeScreenshot}
        disabled={isLoading}
-        className="bg-slate-400"
+        className="bg-slate-400 capture-btn"
      >
        {isLoading ? "Capturing..." : "Capture"}
      </Button>
--- a/frontend/src/components/core/TipLink.tsx
+++ b/frontend/src/components/core/TipLink.tsx
@ -0,0 +1,16 @@
+import { URLS } from "../../urls";
+
+function TipLink() {
+  return (
+    <a
+      className="text-xs underline text-gray-500 text-right"
+      href={URLS.tips}
+      target="_blank"
+      rel="noopener"
+    >
+      Tips for better results
+    </a>
+  );
+}
+
+export default TipLink;
--- a/frontend/src/components/evals/EvalsPage.tsx
+++ b/frontend/src/components/evals/EvalsPage.tsx
@ -4,7 +4,7 @@ import RatingPicker from "./RatingPicker";

 interface Eval {
  input: string;
-  output: string;
+  outputs: string[];
 }

 function EvalsPage() {
@ -38,18 +38,22 @@ function EvalsPage() {
      <div className="flex flex-col gap-y-4 mt-4 mx-auto justify-center">
        {evals.map((e, index) => (
          <div className="flex flex-col justify-center" key={index}>
-            <div className="flex gap-x-2 justify-center">
+            <h2 className="font-bold text-lg ml-4">{index}</h2>
+            <div className="flex gap-x-2 justify-center ml-4">
+              {/* Update w if N changes to a fixed number like w-[600px] */}
              <div className="w-1/2 p-1 border">
-                <img src={e.input} />
-              </div>
-              <div className="w-1/2 p-1 border">
-                {/* Put output into an iframe */}
-                <iframe
-                  srcDoc={e.output}
-                  className="w-[1200px] h-[800px] transform scale-[0.60]"
-                  style={{ transformOrigin: "top left" }}
-                ></iframe>
+                <img src={e.input} alt={`Input for eval ${index}`} />
              </div>
+              {e.outputs.map((output, outputIndex) => (
+                <div className="w-1/2 p-1 border" key={outputIndex}>
+                  {/* Put output into an iframe */}
+                  <iframe
+                    srcDoc={output}
+                    className="w-[1200px] h-[800px] transform scale-[0.60]"
+                    style={{ transformOrigin: "top left" }}
+                  ></iframe>
+                </div>
+              ))}
            </div>
            <div className="ml-8 mt-4 flex justify-center">
              <RatingPicker
--- a/frontend/src/components/history/utils.test.ts
+++ b/frontend/src/components/history/utils.test.ts
@ -1,4 +1,3 @@
-import { expect, test } from "vitest";
 import { extractHistoryTree, renderHistory } from "./utils";
 import type { History } from "./history_types";

@ -84,147 +83,149 @@ const basicBadHistory: History = [
  },
 ];

-test("should correctly extract the history tree", () => {
-  expect(extractHistoryTree(basicLinearHistory, 2)).toEqual([
-    "<html>1. create</html>",
-    "use better icons",
-    "<html>2. edit with better icons</html>",
-    "make text red",
-    "<html>3. edit with better icons and red text</html>",
-  ]);
+describe("History Utils", () => {
+  test("should correctly extract the history tree", () => {
+    expect(extractHistoryTree(basicLinearHistory, 2)).toEqual([
+      "<html>1. create</html>",
+      "use better icons",
+      "<html>2. edit with better icons</html>",
+      "make text red",
+      "<html>3. edit with better icons and red text</html>",
+    ]);

-  expect(extractHistoryTree(basicLinearHistory, 0)).toEqual([
-    "<html>1. create</html>",
-  ]);
+    expect(extractHistoryTree(basicLinearHistory, 0)).toEqual([
+      "<html>1. create</html>",
+    ]);

-  // Test branching
-  expect(extractHistoryTree(basicBranchingHistory, 3)).toEqual([
-    "<html>1. create</html>",
-    "use better icons",
-    "<html>2. edit with better icons</html>",
-    "make text green",
-    "<html>4. edit with better icons and green text</html>",
-  ]);
+    // Test branching
+    expect(extractHistoryTree(basicBranchingHistory, 3)).toEqual([
+      "<html>1. create</html>",
+      "use better icons",
+      "<html>2. edit with better icons</html>",
+      "make text green",
+      "<html>4. edit with better icons and green text</html>",
+    ]);

-  expect(extractHistoryTree(longerBranchingHistory, 4)).toEqual([
-    "<html>1. create</html>",
-    "use better icons",
-    "<html>2. edit with better icons</html>",
-    "make text green",
-    "<html>4. edit with better icons and green text</html>",
-    "make text bold",
-    "<html>5. edit with better icons and green, bold text</html>",
-  ]);
+    expect(extractHistoryTree(longerBranchingHistory, 4)).toEqual([
+      "<html>1. create</html>",
+      "use better icons",
+      "<html>2. edit with better icons</html>",
+      "make text green",
+      "<html>4. edit with better icons and green text</html>",
+      "make text bold",
+      "<html>5. edit with better icons and green, bold text</html>",
+    ]);

-  expect(extractHistoryTree(longerBranchingHistory, 2)).toEqual([
-    "<html>1. create</html>",
-    "use better icons",
-    "<html>2. edit with better icons</html>",
-    "make text red",
-    "<html>3. edit with better icons and red text</html>",
-  ]);
+    expect(extractHistoryTree(longerBranchingHistory, 2)).toEqual([
+      "<html>1. create</html>",
+      "use better icons",
+      "<html>2. edit with better icons</html>",
+      "make text red",
+      "<html>3. edit with better icons and red text</html>",
+    ]);

-  // Errors
+    // Errors

-  // Bad index
-  expect(() => extractHistoryTree(basicLinearHistory, 100)).toThrow();
-  expect(() => extractHistoryTree(basicLinearHistory, -2)).toThrow();
+    // Bad index
+    expect(() => extractHistoryTree(basicLinearHistory, 100)).toThrow();
+    expect(() => extractHistoryTree(basicLinearHistory, -2)).toThrow();

-  // Bad tree
-  expect(() => extractHistoryTree(basicBadHistory, 1)).toThrow();
-});
-
-test("should correctly render the history tree", () => {
-  expect(renderHistory(basicLinearHistory, 2)).toEqual([
-    {
-      isActive: false,
-      parentVersion: null,
-      summary: "Create",
-      type: "Create",
-    },
-    {
-      isActive: false,
-      parentVersion: null,
-      summary: "use better icons",
-      type: "Edit",
-    },
-    {
-      isActive: true,
-      parentVersion: null,
-      summary: "make text red",
-      type: "Edit",
-    },
-  ]);
-
-  // Current version is the first version
-  expect(renderHistory(basicLinearHistory, 0)).toEqual([
-    {
-      isActive: true,
-      parentVersion: null,
-      summary: "Create",
-      type: "Create",
-    },
-    {
-      isActive: false,
-      parentVersion: null,
-      summary: "use better icons",
-      type: "Edit",
-    },
-    {
-      isActive: false,
-      parentVersion: null,
-      summary: "make text red",
-      type: "Edit",
-    },
-  ]);
-
-  // Render a history with code
-  expect(renderHistory(basicLinearHistoryWithCode, 0)).toEqual([
-    {
-      isActive: true,
-      parentVersion: null,
-      summary: "Imported from code",
-      type: "Imported from code",
-    },
-    {
-      isActive: false,
-      parentVersion: null,
-      summary: "use better icons",
-      type: "Edit",
-    },
-    {
-      isActive: false,
-      parentVersion: null,
-      summary: "make text red",
-      type: "Edit",
-    },
-  ]);
-
-  // Render a non-linear history
-  expect(renderHistory(basicBranchingHistory, 3)).toEqual([
-    {
-      isActive: false,
-      parentVersion: null,
-      summary: "Create",
-      type: "Create",
-    },
-    {
-      isActive: false,
-      parentVersion: null,
-      summary: "use better icons",
-      type: "Edit",
-    },
-    {
-      isActive: false,
-      parentVersion: null,
-      summary: "make text red",
-      type: "Edit",
-    },
-    {
-      isActive: true,
-      parentVersion: "v2",
-      summary: "make text green",
-      type: "Edit",
-    },
-  ]);
+    // Bad tree
+    expect(() => extractHistoryTree(basicBadHistory, 1)).toThrow();
+  });
+
+  test("should correctly render the history tree", () => {
+    expect(renderHistory(basicLinearHistory, 2)).toEqual([
+      {
+        isActive: false,
+        parentVersion: null,
+        summary: "Create",
+        type: "Create",
+      },
+      {
+        isActive: false,
+        parentVersion: null,
+        summary: "use better icons",
+        type: "Edit",
+      },
+      {
+        isActive: true,
+        parentVersion: null,
+        summary: "make text red",
+        type: "Edit",
+      },
+    ]);
+
+    // Current version is the first version
+    expect(renderHistory(basicLinearHistory, 0)).toEqual([
+      {
+        isActive: true,
+        parentVersion: null,
+        summary: "Create",
+        type: "Create",
+      },
+      {
+        isActive: false,
+        parentVersion: null,
+        summary: "use better icons",
+        type: "Edit",
+      },
+      {
+        isActive: false,
+        parentVersion: null,
+        summary: "make text red",
+        type: "Edit",
+      },
+    ]);
+
+    // Render a history with code
+    expect(renderHistory(basicLinearHistoryWithCode, 0)).toEqual([
+      {
+        isActive: true,
+        parentVersion: null,
+        summary: "Imported from code",
+        type: "Imported from code",
+      },
+      {
+        isActive: false,
+        parentVersion: null,
+        summary: "use better icons",
+        type: "Edit",
+      },
+      {
+        isActive: false,
+        parentVersion: null,
+        summary: "make text red",
+        type: "Edit",
+      },
+    ]);
+
+    // Render a non-linear history
+    expect(renderHistory(basicBranchingHistory, 3)).toEqual([
+      {
+        isActive: false,
+        parentVersion: null,
+        summary: "Create",
+        type: "Create",
+      },
+      {
+        isActive: false,
+        parentVersion: null,
+        summary: "use better icons",
+        type: "Edit",
+      },
+      {
+        isActive: false,
+        parentVersion: null,
+        summary: "make text red",
+        type: "Edit",
+      },
+      {
+        isActive: true,
+        parentVersion: "v2",
+        summary: "make text green",
+        type: "Edit",
+      },
+    ]);
+  });
 });
--- a/frontend/src/components/preview/extractHtml.ts
+++ b/frontend/src/components/preview/extractHtml.ts
@ -0,0 +1,16 @@
+// Not robust enough to support <html lang='en'> for instance
+export function extractHtml(code: string): string {
+  const lastHtmlStartIndex = code.lastIndexOf("<html>");
+  let htmlEndIndex = code.indexOf("</html>", lastHtmlStartIndex);
+
+  if (lastHtmlStartIndex !== -1) {
+    // If "</html>" is found, adjust htmlEndIndex to include the "</html>" tag
+    if (htmlEndIndex !== -1) {
+      htmlEndIndex += "</html>".length;
+      return code.slice(lastHtmlStartIndex, htmlEndIndex);
+    }
+    // If "</html>" is not found, return the rest of the string starting from the last "<html>"
+    return code.slice(lastHtmlStartIndex);
+  }
+  return "";
+}
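A minimal sketch of how `extractHtml` behaves on a few inputs, including the limitation named in its leading comment. The function body is copied from the diff above so the snippet runs on its own; the sample strings are hypothetical and invented for illustration only.

```typescript
// Copied from the diff above so the sketch is self-contained.
function extractHtml(code: string): string {
  const lastHtmlStartIndex = code.lastIndexOf("<html>");
  let htmlEndIndex = code.indexOf("</html>", lastHtmlStartIndex);

  if (lastHtmlStartIndex !== -1) {
    if (htmlEndIndex !== -1) {
      htmlEndIndex += "</html>".length;
      return code.slice(lastHtmlStartIndex, htmlEndIndex);
    }
    return code.slice(lastHtmlStartIndex);
  }
  return "";
}

// Hypothetical inputs (assumptions, not taken from the PR).
const wrapped = "Sure, here it is:\n<html><body>hi</body></html>\nDone.";
const partial = "<html>first</html> retry: <html><body>partial";
const withAttribute = "<html lang='en'><body>hi</body></html>";

// Surrounding text is dropped; only the document is kept.
console.log(extractHtml(wrapped)); // "<html><body>hi</body></html>"

// Only the last "<html>" counts; with no closing tag the rest is returned.
console.log(extractHtml(partial)); // "<html><body>partial"

// An opening tag with attributes does not match the literal "<html>".
console.log(extractHtml(withAttribute)); // ""
```

The third case is the one the comment warns about: a caller that may receive `<html lang='en'>` would probably need a more tolerant match than a literal string search.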
--- a/frontend/src/components/preview/simpleHash.ts
+++ b/frontend/src/components/preview/simpleHash.ts
@ -0,0 +1,10 @@
+
+export function simpleHash(str: string, seed = 0) {
+  let hash = seed;
+  for (let i = 0; i < str.length; i++) {
+    const char = str.charCodeAt(i);
+    hash = (hash << 5) - hash + char;
+    hash |= 0; // Convert to 32bit integer
+  }
+  return hash;
+}
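A short sketch of what `simpleHash` returns for small inputs, assuming nothing beyond the function itself. The body is copied from the diff above; how the PR calls it is not shown in this section, so the inputs here are hypothetical.

```typescript
// Copied from the diff above so the sketch is self-contained.
function simpleHash(str: string, seed = 0): number {
  let hash = seed;
  for (let i = 0; i < str.length; i++) {
    const char = str.charCodeAt(i);
    hash = (hash << 5) - hash + char; // equivalent to hash * 31 + char
    hash |= 0; // Convert to 32bit integer
  }
  return hash;
}

// Hypothetical inputs chosen so the arithmetic is easy to follow.
const empty = simpleHash(""); // the seed is returned unchanged: 0
const single = simpleHash("a"); // 0 * 31 + 97 = 97
const pair = simpleHash("ab"); // 97 * 31 + 98 = 3105
const seeded = simpleHash("a", 1); // 1 * 31 + 97 = 128

// Long inputs wrap around instead of growing without bound.
const long = simpleHash("x".repeat(10000));

console.log(empty, single, pair, seeded, long);
```

The hash is deterministic and cheap, but it is not collision resistant, so it probably suits change detection or cache keys better than anything security related.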
--- a/frontend/src/components/recording/ScreenRecorder.tsx
+++ b/frontend/src/components/recording/ScreenRecorder.tsx
@ -0,0 +1,145 @@
+import { useState } from "react";
+import { Button } from "../ui/button";
+import { ScreenRecorderState } from "../../types";
+import { blobToBase64DataUrl } from "./utils";
+import fixWebmDuration from "webm-duration-fix";
+import toast from "react-hot-toast";
+
+interface Props {
+  screenRecorderState: ScreenRecorderState;
+  setScreenRecorderState: (state: ScreenRecorderState) => void;
+  generateCode: (
+    referenceImages: string[],
+    inputMode: "image" | "video"
+  ) => void;
+}
+
+function ScreenRecorder({
+  screenRecorderState,
+  setScreenRecorderState,
+  generateCode,
+}: Props) {
+  const [mediaStream, setMediaStream] = useState<MediaStream | null>(null);
+  const [mediaRecorder, setMediaRecorder] = useState<MediaRecorder | null>(
+    null
+  );
+  const [screenRecordingDataUrl, setScreenRecordingDataUrl] = useState<
+    string | null
+  >(null);
+
+  const startScreenRecording = async () => {
+    try {
+      // Get the screen recording stream
+      const stream = await navigator.mediaDevices.getDisplayMedia({
+        video: true,
+        audio: { echoCancellation: true },
+      });
+      setMediaStream(stream);
+
+      // TODO: Test across different browsers
+      // Create the media recorder
+      const options = { mimeType: "video/webm" };
+      const mediaRecorder = new MediaRecorder(stream, options);
+      setMediaRecorder(mediaRecorder);
+
+      const chunks: BlobPart[] = [];
+
+      // Accumulate chunks as data is available
+      mediaRecorder.ondataavailable = (e: BlobEvent) => chunks.push(e.data);
+
+      // When media recorder is stopped, create a data URL
+      mediaRecorder.onstop = async () => {
+        // TODO: Do I need to fix duration if it's not a webm?
+        const completeBlob = await fixWebmDuration(
+          new Blob(chunks, {
+            type: options.mimeType,
+          })
+        );
+
+        const dataUrl = await blobToBase64DataUrl(completeBlob);
+
+        setScreenRecordingDataUrl(dataUrl);
+        setScreenRecorderState(ScreenRecorderState.FINISHED);
+      };
+
+      // Start recording
+      mediaRecorder.start();
+      setScreenRecorderState(ScreenRecorderState.RECORDING);
+    } catch (error) {
+      toast.error("Could not start screen recording");
+      throw error;
+    }
+  };
+
+  const stopScreenRecording = () => {
+    // Stop the recorder
+    if (mediaRecorder) {
+      mediaRecorder.stop();
+      setMediaRecorder(null);
+    }
+
+    // Stop the screen sharing stream
+    if (mediaStream) {
+      mediaStream.getTracks().forEach((track) => {
+        track.stop();
+      });
+    }
+  };
+
+  const kickoffGeneration = () => {
+    if (screenRecordingDataUrl) {
+      generateCode([screenRecordingDataUrl], "video");
+    } else {
+      toast.error("Screen recording does not exist. Please try again.");
+      throw new Error("No screen recording data url");
+    }
+  };
+
+  return (
+    <div className="flex items-center justify-center my-3">
+      {screenRecorderState === ScreenRecorderState.INITIAL && (
+        <Button onClick={startScreenRecording}>Record Screen</Button>
+      )}
+
+      {screenRecorderState === ScreenRecorderState.RECORDING && (
+        <div className="flex items-center flex-col gap-y-4">
+          <div className="flex items-center mr-2 text-xl gap-x-1">
+            <span className="block h-10 w-10 bg-red-600 rounded-full mr-1 animate-pulse"></span>
+            <span>Recording...</span>
+          </div>
+          <Button onClick={stopScreenRecording}>Finish Recording</Button>
+        </div>
+      )}
+
+      {screenRecorderState === ScreenRecorderState.FINISHED && (
+        <div className="flex items-center flex-col gap-y-4">
+          <div className="flex items-center mr-2 text-xl gap-x-1">
+            <span>Screen Recording Captured.</span>
+          </div>
+          {screenRecordingDataUrl && (
+            <video
+              muted
+              autoPlay
+              loop
+              className="w-[340px] border border-gray-200 rounded-md"
+              src={screenRecordingDataUrl}
+            />
+          )}
+          <div className="flex gap-x-2">
+            <Button
+              variant="secondary"
+              onClick={() =>
+                setScreenRecorderState(ScreenRecorderState.INITIAL)
+              }
+            >
+              Re-record
+            </Button>
+            <Button onClick={kickoffGeneration}>Generate</Button>
+          </div>
+        </div>
+      )}
+    </div>
+  );
+}
+
+export default ScreenRecorder;
--- a/frontend/src/components/recording/utils.ts
+++ b/frontend/src/components/recording/utils.ts
@ -0,0 +1,31 @@
+export function downloadBlob(blob: Blob) {
+  // Create a URL for the blob object
+  const videoURL = URL.createObjectURL(blob);
+
+  // Create a temporary anchor element and trigger the download
+  const a = document.createElement("a");
+  a.href = videoURL;
+  a.download = "recording.webm";
+  document.body.appendChild(a);
+  a.click();
+  document.body.removeChild(a);
+
+  // Clear object URL
+  URL.revokeObjectURL(videoURL);
+}
+
+export function blobToBase64DataUrl(blob: Blob): Promise<string> {
+  return new Promise((resolve, reject) => {
+    const reader = new FileReader();
+    reader.onloadend = () => {
+      if (reader.result) {
+        resolve(reader.result as string);
+      } else {
+        reject(new Error("FileReader did not return a result."));
+      }
+    };
+    reader.onerror = () =>
+      reject(new Error("FileReader encountered an error."));
+    reader.readAsDataURL(blob);
+  });
+}
--- a/frontend/src/components/select-and-edit/EditPopup.tsx
+++ b/frontend/src/components/select-and-edit/EditPopup.tsx
@ -0,0 +1,143 @@
+import React, { useEffect, useRef, useState } from "react";
+import { Textarea } from "../ui/textarea";
+import { Button } from "../ui/button";
+import { addHighlight, getAdjustedCoordinates, removeHighlight } from "./utils";
+import { useAppStore } from "../../store/app-store";
+
+interface EditPopupProps {
+  event: MouseEvent | null;
+  iframeRef: React.RefObject<HTMLIFrameElement>;
+  doUpdate: (updateInstruction: string, selectedElement?: HTMLElement) => void;
+}
+
+const EditPopup: React.FC<EditPopupProps> = ({
+  event,
+  iframeRef,
+  doUpdate,
+}) => {
+  // App state
+  const { inSelectAndEditMode } = useAppStore();
+
+  // Create a wrapper ref to store inSelectAndEditMode so the value is not stale
+  // in an event listener
+  const inSelectAndEditModeRef = useRef(inSelectAndEditMode);
+
+  // Update the ref whenever the state changes
+  useEffect(() => {
+    inSelectAndEditModeRef.current = inSelectAndEditMode;
+  }, [inSelectAndEditMode]);
+
+  // Popup state
+  const [popupVisible, setPopupVisible] = useState(false);
+  const [popupPosition, setPopupPosition] = useState({ x: 0, y: 0 });
+
+  // Edit state
+  const [selectedElement, setSelectedElement] = useState<
+    HTMLElement | undefined
+  >(undefined);
+  const [updateText, setUpdateText] = useState("");
+
+  // Textarea ref for focusing
+  const textareaRef = useRef<HTMLTextAreaElement | null>(null);
+
+  function onUpdate(updateText: string) {
+    // Perform the update
+    doUpdate(
+      updateText,
+      selectedElement ? removeHighlight(selectedElement) : selectedElement
+    );
+
+    // Unselect the element
+    setSelectedElement(undefined);
+
+    // Hide the popup
+    setPopupVisible(false);
+  }
+
+  // Remove highlight and reset state when not in select and edit mode
+  useEffect(() => {
+    if (!inSelectAndEditMode) {
+      if (selectedElement) removeHighlight(selectedElement);
+      setSelectedElement(undefined);
+      setPopupVisible(false);
+    }
+  }, [inSelectAndEditMode, selectedElement]);
+
+  // Handle the click event
+  useEffect(() => {
+    // Return if not in select and edit mode
+    if (!inSelectAndEditModeRef.current || !event) {
+      return;
+    }
+
+    // Prevent default to avoid issues like label clicks triggering textareas, etc.
+    event.preventDefault();
+
+    const targetElement = event.target as HTMLElement;
+
+    // Return if no target element
+    if (!targetElement) return;
+
+    // Highlight and set the selected element
+    setSelectedElement((prev) => {
+      // Remove style from previous element
+      if (prev) {
+        removeHighlight(prev);
+      }
+      return addHighlight(targetElement);
+    });
+
+    // Calculate adjusted coordinates
+    const adjustedCoordinates = getAdjustedCoordinates(
+      event.clientX,
+      event.clientY,
+      iframeRef.current?.getBoundingClientRect()
+    );
+
+    // Show the popup at the click position
+    setPopupVisible(true);
+    setPopupPosition({ x: adjustedCoordinates.x, y: adjustedCoordinates.y });
+
+    // Reset the update text
+    setUpdateText("");
+
+    // Focus the textarea
+    textareaRef.current?.focus();
+  }, [event, iframeRef]);
+
+  // Focus the textarea when the popup is visible (we can't do this only when handling the click event
+  // because the textarea is not rendered yet)
+  // We need to also do it in the click event because popupVisible doesn't change values in that event
+  useEffect(() => {
+    if (popupVisible) {
+      textareaRef.current?.focus();
+    }
+  }, [popupVisible]);
+
+  if (!popupVisible) return null;
+
+  return (
+    <div
+      className="absolute bg-white p-4 border border-gray-300 rounded shadow-lg w-60"
+      style={{ top: popupPosition.y, left: popupPosition.x }}
+    >
+      <Textarea
+        ref={textareaRef}
+        value={updateText}
+        onChange={(e) => setUpdateText(e.target.value)}
+        placeholder="Tell the AI what to change about this element..."
+        onKeyDown={(e) => {
+          if (e.key === "Enter") {
+            e.preventDefault();
+            onUpdate(updateText);
+          }
+        }}
+      />
+      <div className="flex justify-end mt-2">
+        <Button onClick={() => onUpdate(updateText)}>Update</Button>
+      </div>
+    </div>
+  );
+};
+
+export default EditPopup;
--- a/frontend/src/components/select-and-edit/SelectAndEditModeToggleButton.tsx
+++ b/frontend/src/components/select-and-edit/SelectAndEditModeToggleButton.tsx
@ -0,0 +1,22 @@
+import { GiClick } from "react-icons/gi";
+import { useAppStore } from "../../store/app-store";
+import { Button } from "../ui/button";
+
+function SelectAndEditModeToggleButton() {
+  const { inSelectAndEditMode, toggleInSelectAndEditMode } = useAppStore();
+
+  return (
+    <Button
+      onClick={toggleInSelectAndEditMode}
+      className="flex items-center gap-x-2 dark:text-white dark:bg-gray-700 regenerate-btn"
+      variant={inSelectAndEditMode ? "destructive" : "default"}
+    >
+      <GiClick className="text-lg" />
+      <span>
+        {inSelectAndEditMode ? "Exit selection mode" : "Select and update"}
+      </span>
+    </Button>
+  );
+}
+
+export default SelectAndEditModeToggleButton;
--- a/frontend/src/components/select-and-edit/utils.ts
+++ b/frontend/src/components/select-and-edit/utils.ts
@ -0,0 +1,22 @@
+export function removeHighlight(element: HTMLElement) {
+  element.style.outline = "";
+  element.style.backgroundColor = "";
+  return element;
+}
+
+export function addHighlight(element: HTMLElement) {
+  element.style.outline = "2px dashed #1846db";
+  element.style.backgroundColor = "#bfcbf5";
+  return element;
+}
+
+export function getAdjustedCoordinates(
+  x: number,
+  y: number,
+  rect: DOMRect | undefined
+) {
+  const offsetX = rect ? rect.left : 0;
+  const offsetY = rect ? rect.top : 0;
+
+  return { x: x + offsetX, y: y + offsetY };
+}
--- a/frontend/src/components/settings/AccessCodeSection.tsx
+++ b/frontend/src/components/settings/AccessCodeSection.tsx
@ -1,142 +0,0 @@
-import { useEffect, useState } from "react";
-import { Settings } from "../../types";
-import { Button } from "../ui/button";
-import { Input } from "../ui/input";
-import { Label } from "../ui/label";
-import useThrottle from "../../hooks/useThrottle";
-import { Progress } from "../ui/progress";
-import { PICO_BACKEND_FORM_SECRET } from "../../config";
-
-interface Props {
-  settings: Settings;
-  setSettings: React.Dispatch<React.SetStateAction<Settings>>;
-}
-
-interface UsageResponse {
-  used_credits: number;
-  total_credits: number;
-  is_valid: boolean;
-}
-
-enum FetchState {
-  EMPTY = "EMPTY",
-  LOADING = "LOADING",
-  INVALID = "INVALID",
-  VALID = "VALID",
-}
-
-function AccessCodeSection({ settings, setSettings }: Props) {
-  const [isLoading, setIsLoading] = useState(false);
-  const [isValid, setIsValid] = useState(false);
-  const [usedCredits, setUsedCredits] = useState(0);
-  const [totalCredits, setTotalCredits] = useState(0);
-  const throttledAccessCode = useThrottle(settings.accessCode || "", 500);
-
-  const fetchState = (() => {
-    if (!settings.accessCode) return FetchState.EMPTY;
-    if (isLoading) return FetchState.LOADING;
-    if (!isValid) return FetchState.INVALID;
-    return FetchState.VALID;
-  })();
-
-  async function fetchUsage(accessCode: string) {
-    const res = await fetch(
-      "https://backend.buildpicoapps.com/screenshot_to_code/get_access_code_usage",
-      {
-        method: "POST",
-        headers: {
-          "Content-Type": "application/json",
-        },
-        body: JSON.stringify({
-          access_code: accessCode,
-          secret: PICO_BACKEND_FORM_SECRET,
-        }),
-      }
-    );
-    const usage = (await res.json()) as UsageResponse;
-
-    if (!usage.is_valid) {
-      setIsValid(false);
-    } else {
-      setIsValid(true);
-      setUsedCredits(usage.used_credits);
-      setTotalCredits(usage.total_credits);
-    }
-
-    setIsLoading(false);
-  }
-
-  useEffect(() => {
-    // Don't do anything if access code is empty
-    if (!throttledAccessCode) return;
-
-    setIsLoading(true);
-    setIsValid(true);
-
-    // Wait for 500 ms before fetching usage
-    setTimeout(async () => {
-      await fetchUsage(throttledAccessCode);
-    }, 500);
-  }, [throttledAccessCode]);
-
-  return (
-    <div className="flex flex-col space-y-4 bg-slate-200 p-4 rounded dark:text-white dark:bg-slate-800">
-      <Label htmlFor="access-code">
-        <div>Access Code</div>
-      </Label>
-
-      <Input
-        id="access-code"
-        className="border-gray-700 dark:border-gray-700 dark:bg-gray-800 dark:text-white"
-        placeholder="Enter your Screenshot to Code access code"
-        value={settings.accessCode || ""}
-        onChange={(e) =>
-          setSettings((s) => ({
-            ...s,
-            accessCode: e.target.value,
-          }))
-        }
-      />
-
-      {fetchState === "EMPTY" && (
-        <div className="flex items-center justify-between">
-          <a href="https://buy.stripe.com/8wM6sre70gBW1nqaEE" target="_blank">
-            <Button size="sm" variant="secondary">
-              Buy credits
-            </Button>
-          </a>
-        </div>
-      )}
-
-      {fetchState === "LOADING" && (
-        <div className="flex items-center justify-between">
-          <span className="text-xs text-gray-700">Loading...</span>
-        </div>
-      )}
-
-      {fetchState === "INVALID" && (
-        <>
-          <div className="flex items-center justify-between">
-            <span className="text-xs text-gray-700">Invalid access code</span>
-          </div>
-        </>
-      )}
-
-      {fetchState === "VALID" && (
-        <>
-          <Progress value={(usedCredits / totalCredits) * 100} />
-          <div className="flex items-center justify-between">
-            <span className="text-xs text-gray-700">
-              {usedCredits} out of {totalCredits} credits used
-            </span>
-            <a href="https://buy.stripe.com/8wM6sre70gBW1nqaEE" target="_blank">
-              <Button size="sm">Add credits</Button>
-            </a>
-          </div>
-        </>
-      )}
-    </div>
-  );
-}
-
-export default AccessCodeSection;
--- a/frontend/src/components/ui/select.tsx
+++ b/frontend/src/components/ui/select.tsx
@ -159,4 +159,4 @@ export {
  SelectSeparator,
  SelectScrollUpButton,
  SelectScrollDownButton,
-}
+}
--- a/frontend/src/constants.ts
+++ b/frontend/src/constants.ts
@ -1 +1,3 @@
+// The WebSocket protocol (RFC 6455) allows custom close codes in the range 4000-4999
+export const APP_ERROR_WEB_SOCKET_CODE = 4332;
 export const USER_CLOSE_WEB_SOCKET_CODE = 4333;
--- a/frontend/src/generateCode.ts
+++ b/frontend/src/generateCode.ts
@ -1,6 +1,9 @@
 import toast from "react-hot-toast";
 import { WS_BACKEND_URL } from "./config";
-import { USER_CLOSE_WEB_SOCKET_CODE } from "./constants";
+import {
+  APP_ERROR_WEB_SOCKET_CODE,
+  USER_CLOSE_WEB_SOCKET_CODE,
+} from "./constants";
 import { FullGenerationSettings } from "./types";

 const ERROR_MESSAGE =
@ -46,9 +49,13 @@ export function generateCode(
    if (event.code === USER_CLOSE_WEB_SOCKET_CODE) {
      toast.success(CANCEL_MESSAGE);
      onCancel();
+    } else if (event.code === APP_ERROR_WEB_SOCKET_CODE) {
+      console.error("Known server error", event);
+      onCancel();
    } else if (event.code !== 1000) {
-      console.error("WebSocket error code", event);
+      console.error("Unknown server or connection error", event);
      toast.error(ERROR_MESSAGE);
+      onCancel();
    } else {
      onComplete();
    }
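Review note: the close-code branching in this hunk could be expressed as a small pure function, which would make it testable without a live WebSocket. The following is only a minimal sketch under stated assumptions: `classifyCloseCode` and `CloseOutcome` are hypothetical names that do not exist in this change, the two 4xxx constants are copied from `constants.ts`, and 1000 is the normal-closure code defined by RFC 6455.

```typescript
// Hypothetical helper, not part of this PR: maps a WebSocket close code
// to the outcome that the onclose handler in generateCode.ts acts on.
const APP_ERROR_WEB_SOCKET_CODE = 4332; // copied from constants.ts
const USER_CLOSE_WEB_SOCKET_CODE = 4333; // copied from constants.ts
const NORMAL_CLOSURE_CODE = 1000; // RFC 6455 normal closure

// Hypothetical outcome labels, chosen for illustration only.
type CloseOutcome = "cancelled" | "known_error" | "unknown_error" | "complete";

function classifyCloseCode(code: number): CloseOutcome {
  // The user cancelled the generation
  if (code === USER_CLOSE_WEB_SOCKET_CODE) return "cancelled";
  // The backend closed the socket with the app-specific error code
  if (code === APP_ERROR_WEB_SOCKET_CODE) return "known_error";
  // Any other non-normal code: unknown server or connection error
  if (code !== NORMAL_CLOSURE_CODE) return "unknown_error";
  // Normal closure: generation finished
  return "complete";
}
```

In this sketch the handler would call `onCancel()` for every outcome except `"complete"`, and show the error toast only for `"unknown_error"`, which matches the branches in the hunk.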
--- a/frontend/src/hooks/useBrowserTabIndicator.ts
+++ b/frontend/src/hooks/useBrowserTabIndicator.ts
@ -0,0 +1,29 @@
+import { useEffect } from "react";
+
+const CODING_SETTINGS = {
+  title: "Coding...",
+  favicon: "/favicon/coding.png",
+};
+const DEFAULT_SETTINGS = {
+  title: "Screenshot to Code",
+  favicon: "/favicon/main.png",
+};
+
+const useBrowserTabIndicator = (isCoding: boolean) => {
+  useEffect(() => {
+    const settings = isCoding ? CODING_SETTINGS : DEFAULT_SETTINGS;
+
+    // Set favicon
+    const faviconEl = document.querySelector(
+      "link[rel='icon']"
+    ) as HTMLLinkElement | null;
+    if (faviconEl) {
+      faviconEl.href = settings.favicon;
+    }
+
+    // Set title
+    document.title = settings.title;
+  }, [isCoding]);
+};
+
+export default useBrowserTabIndicator;
--- a/frontend/src/lib/models.ts
+++ b/frontend/src/lib/models.ts
@ -0,0 +1,20 @@
+// Keep in sync with backend (llm.py)
+// Order here matches dropdown order
+export enum CodeGenerationModel {
+  CLAUDE_3_5_SONNET_2024_06_20 = "claude-3-5-sonnet-20240620",
+  GPT_4O_2024_05_13 = "gpt-4o-2024-05-13",
+  GPT_4_TURBO_2024_04_09 = "gpt-4-turbo-2024-04-09",
+  GPT_4_VISION = "gpt_4_vision",
+  CLAUDE_3_SONNET = "claude_3_sonnet",
+}
+
+// Will generate a static error if a model in the enum above is not in the descriptions
+export const CODE_GENERATION_MODEL_DESCRIPTIONS: {
+  [key in CodeGenerationModel]: { name: string; inBeta: boolean };
+} = {
+  "gpt-4o-2024-05-13": { name: "GPT-4o 🌟", inBeta: false },
+  "claude-3-5-sonnet-20240620": { name: "Claude 3.5 Sonnet 🌟", inBeta: false },
+  "gpt-4-turbo-2024-04-09": { name: "GPT-4 Turbo (Apr 2024)", inBeta: false },
+  gpt_4_vision: { name: "GPT-4 Vision (Nov 2023)", inBeta: false },
+  claude_3_sonnet: { name: "Claude 3 Sonnet", inBeta: false },
+};
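Review note: the "static error" comment in this file relies on a mapped type whose keys must cover every model. The following is only a minimal sketch of that technique with made-up data: `MODELS`, `Model`, `DESCRIPTIONS` and `dropdownOptions` are hypothetical names, the model ids are placeholders, and a string-literal union is used in place of the enum to keep the snippet self-contained.

```typescript
// Hypothetical stand-in for CodeGenerationModel; array order is the dropdown order.
const MODELS = ["model-a", "model-b"] as const;
type Model = (typeof MODELS)[number];

// Mapped type: omitting a key for any Model is a compile-time error.
// Key order in this object does not affect dropdown order.
const DESCRIPTIONS: { [key in Model]: { name: string; inBeta: boolean } } = {
  "model-b": { name: "Model B", inBeta: true },
  "model-a": { name: "Model A", inBeta: false },
};

// Builds dropdown entries in MODELS order, labelling beta models.
function dropdownOptions(): { value: Model; label: string }[] {
  return MODELS.map((value) => {
    const { name, inBeta } = DESCRIPTIONS[value];
    return { value, label: inBeta ? `${name} (beta)` : name };
  });
}
```

Removing either entry from `DESCRIPTIONS` in this sketch fails type checking, which is the behavior the comment in `models.ts` describes.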
--- a/frontend/src/setupTests.ts
+++ b/frontend/src/setupTests.ts
@ -0,0 +1,3 @@
+// So jest test runner can read env vars from .env file
+import { config } from "dotenv";
+config({ path: ".env.jest" });
--- a/frontend/src/store/app-store.ts
+++ b/frontend/src/store/app-store.ts
@ -0,0 +1,15 @@
+import { create } from "zustand";
+
+// Store for app-wide state
+interface AppStore {
+  inSelectAndEditMode: boolean;
+  toggleInSelectAndEditMode: () => void;
+  disableInSelectAndEditMode: () => void;
+}
+
+export const useAppStore = create<AppStore>((set) => ({
+  inSelectAndEditMode: false,
+  toggleInSelectAndEditMode: () =>
+    set((state) => ({ inSelectAndEditMode: !state.inSelectAndEditMode })),
+  disableInSelectAndEditMode: () => set({ inSelectAndEditMode: false }),
+}));
--- a/frontend/src/tests/fixtures/simple_button.png
+++ b/frontend/src/tests/fixtures/simple_button.png
--- a/frontend/src/tests/fixtures/simple_ui_with_image.png
+++ b/frontend/src/tests/fixtures/simple_ui_with_image.png
--- a/frontend/src/tests/qa.test.ts
+++ b/frontend/src/tests/qa.test.ts
@ -0,0 +1,274 @@
+import puppeteer, { Browser, Page, ElementHandle } from "puppeteer";
+import { Stack } from "../lib/stacks";
+import { CodeGenerationModel } from "../lib/models";
+
+const TESTS_ROOT_PATH = process.env.TEST_ROOT_PATH;
+
+// Fixtures
+const FIXTURES_PATH = `${TESTS_ROOT_PATH}/fixtures`;
+const SIMPLE_SCREENSHOT = FIXTURES_PATH + "/simple_button.png";
+const SCREENSHOT_WITH_IMAGES = `${FIXTURES_PATH}/simple_ui_with_image.png`;
+
+// Results
+const RESULTS_DIR = `${TESTS_ROOT_PATH}/results`;
+
+describe("e2e tests", () => {
+  let browser: Browser;
+  let page: Page;
+
+  const DEBUG = false;
+  const IS_HEADLESS = true;
+
+  const stacks = Object.values(Stack).slice(0, DEBUG ? 1 : undefined);
+  const models = Object.values(CodeGenerationModel).slice(
+    0,
+    DEBUG ? 1 : undefined
+  );
+
+  beforeAll(async () => {
+    browser = await puppeteer.launch({ headless: IS_HEADLESS });
+    page = await browser.newPage();
+    await page.goto("http://localhost:5173/");
+
+    // Set screen size
+    await page.setViewport({ width: 1080, height: 1024 });
+
+    // TODO: Does this need to be moved?
+    // const client = await page.createCDPSession();
+    // Set download behavior path
+    // await client.send("Page.setDownloadBehavior", {
+    //   behavior: "allow",
+    //   downloadPath: DOWNLOAD_PATH,
+    // });
+  });
+
+  afterAll(async () => {
+    await browser.close();
+  });
+
+  // Create tests
+  models.forEach((model) => {
+    stacks.forEach((stack) => {
+      it(
+        `Create for : ${model} & ${stack}`,
+        async () => {
+          const app = new App(
+            page,
+            stack,
+            model,
+            `create_screenshot_${model}_${stack}`
+          );
+          await app.init();
+          // Generate from screenshot
+          await app.uploadImage(SCREENSHOT_WITH_IMAGES);
+        },
+        60 * 1000
+      );
+
+      it(
+        `Create from URL for : ${model} & ${stack}`,
+        async () => {
+          const app = new App(
+            page,
+            stack,
+            model,
+            `create_url_${model}_${stack}`
+          );
+          await app.init();
+          // Generate from URL
+          await app.generateFromUrl("https://a.picoapps.xyz/design-fear");
+        },
+        60 * 1000
+      );
+    });
+  });
+
+  // Update tests - for every model (doesn’t need to be repeated for each stack - fix to HTML Tailwind only)
+  models.forEach((model) => {
+    ["html_tailwind"].forEach((stack) => {
+      it(
+        `update: ${model}`,
+        async () => {
+          const app = new App(page, stack, model, `update_${model}_${stack}`);
+          await app.init();
+
+          // Generate from screenshot
+          await app.uploadImage(SIMPLE_SCREENSHOT);
+          // Regenerate works for v1
+          await app.regenerate();
+          // Make an update
+          await app.edit("make the button background blue", "v2");
+          // Make another update
+          await app.edit("make the text italic", "v3");
+          // Branch off v2 and make an update
+          await app.clickVersion("v2");
+          await app.edit("make the text yellow", "v4");
+        },
+        90 * 1000
+      );
+    });
+  });
+
+  // Start from code tests - for every model
+  models.forEach((model) => {
+    ["html_tailwind"].forEach((stack) => {
+      it.skip(
+        `Start from code: ${model}`,
+        async () => {
+          const app = new App(
+            page,
+            stack,
+            model,
+            `start_from_code_${model}_${stack}`
+          );
+          await app.init();
+
+          await app.importFromCode();
+
+          // Regenerate works for v1
+          // await app.regenerate();
+          // // Make an update
+          // await app.edit("make the header blue", "v2");
+          // // Make another update
+          // await app.edit("make all text italic", "v3");
+          // // Branch off v2 and make an update
+          // await app.clickVersion("v2");
+          // await app.edit("make all text red", "v4");
+        },
+        90 * 1000
+      );
+    });
+  });
+});
+
+class App {
+  private screenshotPathPrefix: string;
+  private page: Page;
+  private stack: string;
+  private model: string;
+
+  constructor(page: Page, stack: string, model: string, testId: string) {
+    this.page = page;
+    this.stack = stack;
+    this.model = model;
+    this.screenshotPathPrefix = `${RESULTS_DIR}/${testId}`;
+  }
+
+  async init() {
+    await this.setupLocalStorage();
+  }
+
+  async setupLocalStorage() {
+    const setting = {
+      openAiApiKey: null,
+      openAiBaseURL: null,
+      screenshotOneApiKey: process.env.TEST_SCREENSHOTONE_API_KEY,
+      isImageGenerationEnabled: true,
+      editorTheme: "cobalt",
+      generatedCodeConfig: this.stack,
+      codeGenerationModel: this.model,
+      isTermOfServiceAccepted: false,
+      accessCode: null,
+    };
+
+    await this.page.evaluate((setting) => {
+      localStorage.setItem("setting", JSON.stringify(setting));
+    }, setting);
+
+    // Reload the page to apply the local storage
+    await this.page.reload();
+  }
+
+  async _screenshot(step: string) {
+    await this.page.screenshot({
+      path: `${this.screenshotPathPrefix}_${step}.png`,
+    });
+  }
+
+  async _waitUntilVersionIsReady(version: string) {
+    await this.page.waitForNetworkIdle();
+    await this.page.waitForFunction(
+      (version) => document.body.innerText.includes(version),
+      {
+        timeout: 30000,
+      },
+      version
+    );
+    // Wait for 3s so that the HTML and JS have time to render before screenshotting
+    await new Promise((resolve) => setTimeout(resolve, 3000));
+  }
+
+  async generateFromUrl(url: string) {
+    // Type in the URL
+    await this.page.type('input[placeholder="Enter URL"]', url);
+    await this._screenshot("typed_url");
+
+    // Click the capture button and wait for the code to be generated
+    await this.page.click("button.capture-btn");
+    await this._waitUntilVersionIsReady("v1");
+    await this._screenshot("url_result");
+  }
+
+  // Uploads a screenshot and waits for the generated code
+  async uploadImage(screenshotPath: string) {
+    // Upload file
+    const fileInput = (await this.page.$(
+      ".file-input"
+    )) as ElementHandle<HTMLInputElement>;
+    if (!fileInput) {
+      throw new Error("File input element not found");
+    }
+    await fileInput.uploadFile(screenshotPath);
+    await this._screenshot("image_uploaded");
+
+    // Wait for the code to be generated (this method does not click a generate button)
+    await this._waitUntilVersionIsReady("v1");
+    await this._screenshot("image_results");
+  }
+
+  // Makes a text edit and waits for a new version
+  async edit(edit: string, version: string) {
+    // Type in the edit
+    await this.page.type(
+      'textarea[placeholder="Tell the AI what to change..."]',
+      edit
+    );
+    await this._screenshot(`typed_${version}`);
+
+    // Click the update button and wait for the code to be generated
+    await this.page.click(".update-btn");
+    await this._waitUntilVersionIsReady(version);
+    await this._screenshot(`done_${version}`);
+  }
+
+  async clickVersion(version: string) {
+    await this.page.evaluate((version) => {
+      document.querySelectorAll("div").forEach((div) => {
+        if (div.innerText.includes(version)) {
+          div.click();
+        }
+      });
+    }, version);
+  }
+
+  async regenerate() {
+    await this.page.click(".regenerate-btn");
+    await this._waitUntilVersionIsReady("v1");
+    await this._screenshot("regenerate_results");
+  }
+
+  // Work in progress
+  async importFromCode() {
+    await this.page.click(".import-from-code-btn");
+
+    await this.page.type("textarea", "<html>hello world</html>");
+
+    await this.page.select("#output-settings-js", "HTML + Tailwind");
+
+    await this._screenshot("typed_code");
+
+    await this.page.click(".import-btn");
+
+    await this._waitUntilVersionIsReady("v1");
+  }
+}
--- a/frontend/src/types.ts
+++ b/frontend/src/types.ts
@ -1,4 +1,5 @@
 import { Stack } from "./lib/stacks";
+import { CodeGenerationModel } from "./lib/models";

 export enum EditorTheme {
  ESPRESSO = "espresso",
@ -12,9 +13,10 @@ export interface Settings {
  isImageGenerationEnabled: boolean;
  editorTheme: EditorTheme;
  generatedCodeConfig: Stack;
+  codeGenerationModel: CodeGenerationModel;
  // Only relevant for hosted version
  isTermOfServiceAccepted: boolean;
-  accessCode: string | null;
+  anthropicApiKey: string | null; // Anthropic API key
 }

 export enum AppState {
@ -23,8 +25,15 @@ export enum AppState {
  CODE_READY = "CODE_READY",
 }

+export enum ScreenRecorderState {
+  INITIAL = "initial",
+  RECORDING = "recording",
+  FINISHED = "finished",
+}
+
 export interface CodeGenerationParams {
  generationType: "create" | "update";
+  inputMode: "image" | "video";
  image: string;
  resultImage?: string;
  history?: string[];
--- a/frontend/src/urls.ts
+++ b/frontend/src/urls.ts
@ -0,0 +1,5 @@
+export const URLS = {
+  "intro-to-video":
+    "https://github.com/abi/screenshot-to-code/wiki/Screen-Recording-to-Code",
+  tips: "https://git.new/s5ywP0e",
+};
--- a/frontend/yarn.lock
+++ b/frontend/yarn.lock
--- a/sweep.yaml
+++ b/sweep.yaml
@ -1,42 +0,0 @@
-# Sweep AI turns bugs & feature requests into code changes (https://sweep.dev)
-# For details on our config file, check out our docs at https://docs.sweep.dev/usage/config
-
-# This setting contains a list of rules that Sweep will check for. If any of these rules are broken in a new commit, Sweep will create an pull request to fix the broken rule.
-rules:
-  - "All docstrings and comments should be up to date."
-['All new business logic should have corresponding unit tests.', 'Refactor large functions to be more modular.', 'Add docstrings to all functions and file headers.']
-
-# This is the branch that Sweep will develop from and make pull requests to. Most people use 'main' or 'master' but some users also use 'dev' or 'staging'.
-branch: 'main'
-
-# By default Sweep will read the logs and outputs from your existing Github Actions. To disable this, set this to false.
-gha_enabled: True
-
-# This is the description of your project. It will be used by sweep when creating PRs. You can tell Sweep what's unique about your project, what frameworks you use, or anything else you want.
-#
-# Example:
-#
-# description: sweepai/sweep is a python project. The main api endpoints are in sweepai/api.py. Write code that adheres to PEP8.
-description: ''
-
-# This sets whether to create pull requests as drafts. If this is set to True, then all pull requests will be created as drafts and GitHub Actions will not be triggered.
-draft: False
-
-# This is a list of directories that Sweep will not be able to edit.
-blocked_dirs: []
-
-# This is a list of documentation links that Sweep will use to help it understand your code. You can add links to documentation for any packages you use here.
-#
-# Example:
-#
-# docs:
-#   - PyGitHub: ["https://pygithub.readthedocs.io/en/latest/", "We use pygithub to interact with the GitHub API"]
-docs: []
-
-# Sandbox executes commands in a sandboxed environment to validate code changes after every edit to guarantee pristine code. For more details, see the [Sandbox](./sandbox) page.
-sandbox:
-  install:
-    - trunk init
-  check:
-    - trunk fmt {file_path} || return 0
-    - trunk check --fix --print-failures {file_path}