Merge branch 'main' into qa-testing

2024-05-20 16:11:55 -04:00 · 2024-05-20 16:11:55 -04:00 · f01403d480
commit f01403d480
parent df38041e77 1f61d02da6
13 changed files with 151 additions and 39 deletions
--- a/.github/ISSUE_TEMPLATE/bug_report.md
+++ b/.github/ISSUE_TEMPLATE/bug_report.md
@ -0,0 +1,21 @@
+---
+name: Bug report
+about: Create a report to help us improve
+title: ''
+labels: ''
+assignees: ''
+---
+**Describe the bug**
+A clear and concise description of what the bug is.
+**To Reproduce**
+Steps to reproduce the behavior:
+1. Go to '...'
+2. Click on '....'
+3. Scroll down to '....'
+4. See error
+**Screenshots of backend AND frontend terminal logs**
+If applicable, add screenshots to help explain your problem.
--- a/.github/ISSUE_TEMPLATE/custom.md
+++ b/.github/ISSUE_TEMPLATE/custom.md
@ -0,0 +1,10 @@
+---
+name: Custom issue template
+about: Describe this issue template's purpose here.
+title: ''
+labels: ''
+assignees: ''
+---
--- a/.github/ISSUE_TEMPLATE/feature_request.md
+++ b/.github/ISSUE_TEMPLATE/feature_request.md
@ -0,0 +1,20 @@
+---
+name: Feature request
+about: Suggest an idea for this project
+title: ''
+labels: ''
+assignees: ''
+---
+**Is your feature request related to a problem? Please describe.**
+A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+**Describe the solution you'd like**
+A clear and concise description of what you want to happen.
+**Describe alternatives you've considered**
+A clear and concise description of any alternative solutions or features you've considered.
+**Additional context**
+Add any other context or screenshots about the feature request here.
--- a/README.md
+++ b/README.md
@ -1,6 +1,6 @@
 # screenshot-to-code
-A simple tool to convert screenshots, mockups and Figma designs into clean, functional code using AI.
+A simple tool to convert screenshots, mockups and Figma designs into clean, functional code using AI. **Now supporting GPT-4O!**
 https://github.com/abi/screenshot-to-code/assets/23818/6cebadae-2fe3-4986-ac6a-8fb9db030045
@ -15,8 +15,10 @@ Supported stacks:
 Supported AI models:
-- GPT-4 Vision
-- Claude 3 Sonnet (faster, and on par or better than GPT-4 vision for many inputs)
+- GPT-4O - Best model!
+- GPT-4 Turbo (Apr 2024)
+- GPT-4 Vision (Nov 2023)
+- Claude 3 Sonnet
 - DALL-E 3 for image generation
 See the [Examples](#-examples) section below for more demos.
@ -82,7 +84,7 @@ The app will be up and running at http://localhost:5173. Note that you can't dev
 - **I'm running into an error when setting up the backend. How can I fix it?** [Try this](https://github.com/abi/screenshot-to-code/issues/3#issuecomment-1814777959). If that still doesn't work, open an issue.
 - **How do I get an OpenAI API key?** See https://github.com/abi/screenshot-to-code/blob/main/Troubleshooting.md
-- **How can I configure an OpenAI proxy?** - you can configure the OpenAI base URL if you need to use a proxy: Set OPENAI_BASE_URL in the `backend/.env` or directly in the UI in the settings dialog
+- **How can I configure an OpenAI proxy?** - If you're not able to access the OpenAI API directly (due to e.g. country restrictions), you can try a VPN or you can configure the OpenAI base URL to use a proxy: Set OPENAI_BASE_URL in the `backend/.env` or directly in the UI in the settings dialog. Make sure the URL has "v1" in the path so it should look like this:  `https://xxx.xxxxx.xxx/v1`
 - **How can I update the backend host that my front-end connects to?** - Configure VITE_HTTP_BACKEND_URL and VITE_WS_BACKEND_URL in front/.env.local For example, set VITE_HTTP_BACKEND_URL=http://124.10.20.1:7001
 - **Seeing UTF-8 errors when running the backend?** - On windows, open the .env file with notepad++, then go to Encoding and select UTF-8. 
 - **How can I provide feedback?** For feedback, feature requests and bug reports, open an issue or ping me on [Twitter](https://twitter.com/_abi_).
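Note on the proxy entry above: it asks for a base URL with "v1" in its path. A minimal sketch of a sanity check for that value follows. The helper name `has_v1_path` and the example hosts are hypothetical and not part of the repository.

```python
from urllib.parse import urlparse


def has_v1_path(base_url: str) -> bool:
    """Return True if the URL path contains a 'v1' segment.

    Hypothetical helper for checking an OPENAI_BASE_URL value before use.
    """
    segments = [s for s in urlparse(base_url).path.split("/") if s]
    return "v1" in segments


if __name__ == "__main__":
    # Example hosts are placeholders, not real proxy endpoints.
    for url in ("https://proxy.example.com/v1", "https://proxy.example.com"):
        print(url, has_v1_path(url))
```

A check like this could run once at startup, so that a base URL missing the path segment is reported before the first API call fails.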
--- a/Troubleshooting.md
+++ b/Troubleshooting.md
@ -11,7 +11,8 @@ You don't need a ChatGPT Pro account. Screenshot to code uses API keys from your
 5. Go to Settings > Limits and check at the bottom of the page, your current tier has to be "Tier 1" to have GPT4 access
 <img width="900" alt="285636973-da38bd4d-8a78-4904-8027-ca67d729b933" src="https://github.com/abi/screenshot-to-code/assets/23818/8d07cd84-0cf9-4f88-bc00-80eba492eadf">
-6. Go to Screenshot to code and paste it in the Settings dialog under OpenAI key (gear icon). Your key is only stored in your browser. Never stored on our servers.
+6. Navigate to OpenAI [api keys](https://platform.openai.com/api-keys) page and create and copy a new secret key.
+7. Go to Screenshot to code and paste it in the Settings dialog under OpenAI key (gear icon). Your key is only stored in your browser. Never stored on our servers.
 ## Still not working?
--- a/backend/llm.py
+++ b/backend/llm.py
@ -13,6 +13,7 @@ from utils import pprint_prompt
 class Llm(Enum):
    GPT_4_VISION = "gpt-4-vision-preview"
    GPT_4_TURBO_2024_04_09 = "gpt-4-turbo-2024-04-09"
+    GPT_4O_2024_05_13 = "gpt-4o-2024-05-13"
    CLAUDE_3_SONNET = "claude-3-sonnet-20240229"
    CLAUDE_3_OPUS = "claude-3-opus-20240229"
    CLAUDE_3_HAIKU = "claude-3-haiku-20240307"
@ -47,7 +48,11 @@ async def stream_openai_response(
    }
    # Add 'max_tokens' only if the model is a GPT4 vision or Turbo model
-    if model == Llm.GPT_4_VISION or model == Llm.GPT_4_TURBO_2024_04_09:
+    if (
+        model == Llm.GPT_4_VISION
+        or model == Llm.GPT_4_TURBO_2024_04_09
+        or model == Llm.GPT_4O_2024_05_13
+    ):
        params["max_tokens"] = 4096
    stream = await client.chat.completions.create(**params)  # type: ignore
--- a/backend/routes/evals.py
+++ b/backend/routes/evals.py
@ -7,10 +7,13 @@ from evals.config import EVALS_DIR
 router = APIRouter()
+# Update this if the number of outputs generated per input changes
+N = 1
 class Eval(BaseModel):
    input: str
-    output: str
+    outputs: list[str]
@router.get("/evals")
@ -25,21 +28,27 @@ async def get_evals():
            input_file_path = os.path.join(input_dir, file)
            input_file = await image_to_data_url(input_file_path)
-            # Construct the corresponding output file name
-            output_file_name = file.replace(".png", ".html")
-            output_file_path = os.path.join(output_dir, output_file_name)
+            # Construct the corresponding output file names
+            output_file_names = [
+                file.replace(".png", f"_{i}.html") for i in range(0, N)
+            ]  # Assuming 3 outputs for each input
-            # Check if the output file exists
-            if os.path.exists(output_file_path):
-                with open(output_file_path, "r") as f:
-                    output_file_data = f.read()
-            else:
-                output_file_data = "Output file not found."
+            output_files_data: list[str] = []
+            for output_file_name in output_file_names:
+                output_file_path = os.path.join(output_dir, output_file_name)
+                # Check if the output file exists
+                if os.path.exists(output_file_path):
+                    with open(output_file_path, "r") as f:
+                        output_files_data.append(f.read())
+                else:
+                    output_files_data.append(
+                        "<html><h1>Output file not found.</h1></html>"
+                    )
            evals.append(
                Eval(
                    input=input_file,
-                    output=output_file_data,
+                    outputs=output_files_data,
                )
            )
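Note on the hunk above: each input image now maps to N output files named `<input>_<i>.html`, and a placeholder page is returned for any file that is missing. A minimal standalone sketch of that lookup follows. The function names `output_file_names` and `read_outputs` are hypothetical and only mirror the logic in the diff.

```python
import os

N = 1  # outputs per input, mirroring the constant added in the hunk above

NOT_FOUND_HTML = "<html><h1>Output file not found.</h1></html>"


def output_file_names(input_name: str, n: int = N) -> list[str]:
    """Derive the expected output names, e.g. 'a.png' -> ['a_0.html', ...]."""
    return [input_name.replace(".png", f"_{i}.html") for i in range(n)]


def read_outputs(output_dir: str, input_name: str, n: int = N) -> list[str]:
    """Read every expected output file, substituting a placeholder if missing."""
    outputs: list[str] = []
    for name in output_file_names(input_name, n):
        path = os.path.join(output_dir, name)
        if os.path.exists(path):
            with open(path, "r") as f:
                outputs.append(f.read())
        else:
            outputs.append(NOT_FOUND_HTML)
    return outputs
```

Because the placeholder is itself HTML, the front end can pass every entry of `outputs` to an iframe without special-casing missing files.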
--- a/backend/routes/generate_code.py
+++ b/backend/routes/generate_code.py
@ -85,7 +85,7 @@ async def stream_code(websocket: WebSocket):
    # Read the model from the request. Fall back to default if not provided.
    code_generation_model_str = params.get(
-        "codeGenerationModel", Llm.GPT_4_VISION.value
+        "codeGenerationModel", Llm.GPT_4O_2024_05_13.value
    )
    try:
        code_generation_model = convert_frontend_str_to_llm(code_generation_model_str)
@ -112,6 +112,7 @@ async def stream_code(websocket: WebSocket):
    if not openai_api_key and (
        code_generation_model == Llm.GPT_4_VISION
        or code_generation_model == Llm.GPT_4_TURBO_2024_04_09
+        or code_generation_model == Llm.GPT_4O_2024_05_13
    ):
        print("OpenAI API key not found")
        await throw_error(
--- a/backend/run_evals.py
+++ b/backend/run_evals.py
@ -13,8 +13,9 @@ from evals.config import EVALS_DIR
 from evals.core import generate_code_core
 from evals.utils import image_to_data_url
-STACK = "html_tailwind"
+STACK = "ionic_tailwind"
-MODEL = Llm.CLAUDE_3_SONNET
+MODEL = Llm.GPT_4O_2024_05_13
+N = 1  # Number of outputs to generate
 async def main():
@ -28,16 +29,21 @@ async def main():
    for filename in evals:
        filepath = os.path.join(INPUT_DIR, filename)
        data_url = await image_to_data_url(filepath)
-        task = generate_code_core(image_url=data_url, stack=STACK, model=MODEL)
-        tasks.append(task)
+        for _ in range(N):  # Generate N tasks for each input
+            task = generate_code_core(image_url=data_url, stack=STACK, model=MODEL)
+            tasks.append(task)
     results = await asyncio.gather(*tasks)
     os.makedirs(OUTPUT_DIR, exist_ok=True)
-    for filename, content in zip(evals, results):
-        # File name is derived from the original filename in evals
-        output_filename = f"{os.path.splitext(filename)[0]}.html"
+    for i, content in enumerate(results):
+        # Calculate index for filename and output number
+        eval_index = i // N
+        output_number = i % N
+        filename = evals[eval_index]
+        # File name is derived from the original filename in evals with an added output number
+        output_filename = f"{os.path.splitext(filename)[0]}_{output_number}.html"
        output_filepath = os.path.join(OUTPUT_DIR, output_filename)
        with open(output_filepath, "w") as file:
            file.write(content)
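Note on the hunk above: tasks are appended N at a time per input, and `asyncio.gather` returns results in the order the awaitables were passed, so a flat result index `i` maps back to input `i // N` and output number `i % N`. A minimal sketch of that mapping follows. The function name `output_filename_for` is hypothetical.

```python
import os


def output_filename_for(evals: list[str], i: int, n: int) -> str:
    """Map a flat result index to '<input stem>_<output number>.html'.

    Assumes results are ordered as n consecutive outputs per input,
    which is how the tasks are appended in the hunk above.
    """
    eval_index, output_number = divmod(i, n)
    stem = os.path.splitext(evals[eval_index])[0]
    return f"{stem}_{output_number}.html"


if __name__ == "__main__":
    inputs = ["a.png", "b.png"]
    for i in range(len(inputs) * 2):
        print(i, output_filename_for(inputs, i, 2))
```

The names produced here match the `_{i}.html` pattern that `backend/routes/evals.py` reads, provided both files use the same value of N.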
--- a/backend/test_llm.py
+++ b/backend/test_llm.py
@ -24,6 +24,11 @@ class TestConvertFrontendStrToLlm(unittest.TestCase):
            Llm.GPT_4_TURBO_2024_04_09,
            "Should convert 'gpt-4-turbo-2024-04-09' to Llm.GPT_4_TURBO_2024_04_09",
        )
+        self.assertEqual(
+            convert_frontend_str_to_llm("gpt-4o-2024-05-13"),
+            Llm.GPT_4O_2024_05_13,
+            "Should convert 'gpt-4o-2024-05-13' to Llm.GPT_4O_2024_05_13",
+        )
    def test_convert_invalid_string_raises_exception(self):
        with self.assertRaises(ValueError):
--- a/frontend/src/App.tsx
+++ b/frontend/src/App.tsx
@ -63,7 +63,7 @@ function App() {
      isImageGenerationEnabled: true,
      editorTheme: EditorTheme.COBALT,
      generatedCodeConfig: Stack.HTML_TAILWIND,
-      codeGenerationModel: CodeGenerationModel.GPT_4_VISION,
+      codeGenerationModel: CodeGenerationModel.GPT_4O_2024_05_13,
      // Only relevant for hosted version
      isTermOfServiceAccepted: false,
    },
@ -84,6 +84,15 @@ function App() {
  const wsRef = useRef<WebSocket>(null);
+  const showReactWarning =
+    selectedCodeGenerationModel ===
+      CodeGenerationModel.GPT_4_TURBO_2024_04_09 &&
+    settings.generatedCodeConfig === Stack.REACT_TAILWIND;
+  const showGpt4OMessage =
+    selectedCodeGenerationModel !== CodeGenerationModel.GPT_4O_2024_05_13 &&
+    appState === AppState.INITIAL;
  // Indicate coding state using the browser tab's favicon and title
  useBrowserTabIndicator(appState === AppState.CODING);
@ -391,6 +400,22 @@ function App() {
            }
          />
+          {showReactWarning && (
+            <div className="text-sm bg-yellow-200 rounded p-2">
+              Sorry - React is not currently working with GPT-4 Turbo. Please
+              use GPT-4 Vision or Claude Sonnet. We are working on a fix.
+            </div>
+          )}
+          {showGpt4OMessage && (
+            <div className="rounded-lg p-2 bg-fuchsia-200">
+              <p className="text-gray-800 text-sm">
+                Now supporting GPT-4o. Higher quality and 2x faster. Give it a
+                try!
+              </p>
+            </div>
+          )}
          {appState !== AppState.CODE_READY && <TipLink />}
          {IS_RUNNING_ON_CLOUD && !settings.openAiApiKey && <OnboardingNote />}
--- a/frontend/src/components/evals/EvalsPage.tsx
+++ b/frontend/src/components/evals/EvalsPage.tsx
@ -4,7 +4,7 @@ import RatingPicker from "./RatingPicker";
 interface Eval {
  input: string;
-  output: string;
+  outputs: string[];
 }
 function EvalsPage() {
@ -38,18 +38,22 @@ function EvalsPage() {
      <div className="flex flex-col gap-y-4 mt-4 mx-auto justify-center">
        {evals.map((e, index) => (
          <div className="flex flex-col justify-center" key={index}>
-            <div className="flex gap-x-2 justify-center">
+            <h2 className="font-bold text-lg ml-4">{index}</h2>
+            <div className="flex gap-x-2 justify-center ml-4">
+              {/* Update w if N changes to a fixed number like w-[600px] */}
               <div className="w-1/2 p-1 border">
-                <img src={e.input} />
+                <img src={e.input} alt={`Input for eval ${index}`} />
               </div>
-              <div className="w-1/2 p-1 border">
-                {/* Put output into an iframe */}
-                <iframe
-                  srcDoc={e.output}
-                  className="w-[1200px] h-[800px] transform scale-[0.60]"
-                  style={{ transformOrigin: "top left" }}
-                ></iframe>
-              </div>
+              {e.outputs.map((output, outputIndex) => (
+                <div className="w-1/2 p-1 border" key={outputIndex}>
+                  {/* Put output into an iframe */}
+                  <iframe
+                    srcDoc={output}
+                    className="w-[1200px] h-[800px] transform scale-[0.60]"
+                    style={{ transformOrigin: "top left" }}
+                  ></iframe>
+                </div>
+              ))}
            </div>
            <div className="ml-8 mt-4 flex justify-center">
              <RatingPicker
--- a/frontend/src/lib/models.ts
+++ b/frontend/src/lib/models.ts
@ -1,7 +1,9 @@
 // Keep in sync with backend (llm.py)
 // Order here matches dropdown order
 export enum CodeGenerationModel {
-  GPT_4_VISION = "gpt_4_vision",
+  GPT_4O_2024_05_13 = "gpt-4o-2024-05-13",
+  GPT_4_TURBO_2024_04_09 = "gpt-4-turbo-2024-04-09",
+  GPT_4_VISION = "gpt_4_vision",
  CLAUDE_3_SONNET = "claude_3_sonnet",
 }
@ -9,7 +11,8 @@ export enum CodeGenerationModel {
 export const CODE_GENERATION_MODEL_DESCRIPTIONS: {
  [key in CodeGenerationModel]: { name: string; inBeta: boolean };
 } = {
+  "gpt-4o-2024-05-13": { name: "GPT-4o 🌟", inBeta: false },
+  "gpt-4-turbo-2024-04-09": { name: "GPT-4 Turbo (Apr 2024)", inBeta: false },
   gpt_4_vision: { name: "GPT-4 Vision (Nov 2023)", inBeta: false },
   claude_3_sonnet: { name: "Claude 3 Sonnet", inBeta: false },
-  "gpt-4-turbo-2024-04-09": { name: "GPT-4 Turbo (Apr 2024)", inBeta: false },
 };