From cd7cd841ed9ac110fb98cae1bf4548a6ecab22a9 Mon Sep 17 00:00:00 2001
From: Abi Raja
Date: Tue, 5 Mar 2024 21:52:24 -0500
Subject: [PATCH] Update evaluating-claude.md

---
 blog/evaluating-claude.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/blog/evaluating-claude.md b/blog/evaluating-claude.md
index 99dc94d..373a343 100644
--- a/blog/evaluating-claude.md
+++ b/blog/evaluating-claude.md
@@ -2,6 +2,8 @@
 
 Claude 3 dropped yesterday, claiming to rival GPT-4 on a wide variety of tasks. I maintain a very popular open source project called “screenshot-to-code” (this one!) that uses GPT-4 vision to convert screenshots/designs into clean code. Naturally, I was excited to see how good Claude 3 is at this task.
 
+**TLDR:** Claude 3 is on par with GPT-4 vision for screenshot to code, better in some ways but worse in others.
+
 ## Evaluation Setup
 
 I don’t know of a public benchmark for “screenshot to code,” so I created a simple evaluation setup for the purposes of testing: