This market predicts what the highest score on the SWE-Bench Pro public dataset leaderboard will be as of January 1, 2026.
Current top performers on the SWE-Bench Pro public dataset (as of September 24, 2025):
OpenAI GPT-5: 23.26%
Claude Opus 4.1: 22.71%
Resolution Criteria: This market will resolve to the score range that contains the highest score on the official SWE-Bench Pro public dataset leaderboard (https://scale.com/leaderboard/swe_bench_pro_public) as of January 1, 2026.
Update 2025-12-12 (PST) (AI summary of creator comment): The market will resolve based on Scale AI's verified scores on the official SWE-Bench Pro public dataset leaderboard, not self-reported scores from model creators.
Self-reported scores (like Claude Opus 4.5's 52.0% or GPT 5.2 Thinking's 55.6%) will only count if Scale AI independently verifies them.
Example: Claude Opus 4.5 self-reported 52.0%, but Scale AI evaluated it at 45.89%, so the market would resolve to the range containing 45.89%.
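The resolution rule above can be sketched in a few lines of Python. This is only an illustration, not the creator's actual procedure: the entry list, the function names, and the 5-point range width are all assumptions, since the market's answer ranges are not listed here. GPT 5.2 Thinking is treated as unverified purely for the sake of the example.

```python
# Hypothetical snapshot: (model, self-reported score, Scale AI verified score or None).
# Scores are taken from the text above; the list layout itself is an assumption.
ENTRIES = [
    ("OpenAI GPT-5", None, 23.26),
    ("Claude Opus 4.1", None, 22.71),
    ("Claude Opus 4.5", 52.0, 45.89),
    ("GPT 5.2 Thinking", 55.6, None),  # assumed unverified in this sketch
]


def highest_verified(entries):
    """Return the highest Scale AI verified score, ignoring self-reported ones."""
    verified = [v for _, _, v in entries if v is not None]
    return max(verified) if verified else None


def score_range(score, width=5.0):
    """Map a score to a [lower, upper) range; the 5-point width is hypothetical."""
    lower = (score // width) * width
    return lower, lower + width


top = highest_verified(ENTRIES)
print(top, score_range(top))
```

With these inputs the self-reported 52.0% and 55.6% are ignored, and the verified 45.89% decides the range.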
I created a market for mid-2026: https://manifold.markets/RenanCunha/best-swebench-pro-public-score-by-j
This market might be a little scuffed because I'm cheap with mana. Feel free to make another, more precise market for SWE-Bench Pro.