Synthetic Souls

_posts

Improving GPT-4's codebase understanding with ctags
GPT code editing benchmarks
Building a better repository map with tree sitter
Code editing benchmarks for OpenAI's "1106" models
Speed benchmarks of GPT-4 Turbo and gpt-3.5-turbo-1106
Unified diffs make GPT-4 Turbo 3X less lazy
The January GPT-4 Turbo is lazier than the last version
Claude 3 beats GPT-4 on Aider's code editing benchmark
GPT-4 Turbo with Vision is a step backwards for coding
Aider in your browser
Drawing graphs with aider, GPT-4o and matplotlib
A draft post.
Linting code for LLMs with tree-sitter
How aider scored SOTA 26.3% on SWE Bench Lite
Aider has written 7% of its own code
Aider is SOTA for both SWE Bench and SWE Bench Lite
Sonnet is the opposite of lazy
Coding with Llama 3.1, new DeepSeek Coder & Mistral Large
LLMs are bad at returning code in JSON
Sonnet seems as good as ever

