_posts
Improving GPT-4's codebase understanding with ctags
GPT code editing benchmarks
Building a better repository map with tree sitter
Code editing benchmarks for OpenAI's "1106" models
Speed benchmarks of GPT-4 Turbo and gpt-3.5-turbo-1106
Unified diffs make GPT-4 Turbo 3X less lazy
The January GPT-4 Turbo is lazier than the last version
Claude 3 beats GPT-4 on Aider's code editing benchmark
GPT-4 Turbo with Vision is a step backwards for coding
Aider in your browser
Drawing graphs with aider, GPT-4o and matplotlib
A draft post.
Linting code for LLMs with tree-sitter
How aider scored SOTA 26.3% on SWE Bench Lite
Aider has written 7% of its own code
Aider is SOTA for both SWE Bench and SWE Bench Lite
Sonnet is the opposite of lazy
Coding with Llama 3.1, new DeepSeek Coder & Mistral Large
LLMs are bad at returning code in JSON
Sonnet seems as good as ever