Kimi K3 Thinking's METR 50% time horizon | Manifold

Dec 14

<1h: 3%
1h - 1.5h: 7%
1.5h - 2h: 12%
2h - 2.5h: 14%
2.5h - 3h: 16%
3h - 3.5h: 15%
3.5h - 4h: 10%
4h - 4.5h: 7%
4.5h - 5h: 6%
5h - 5.5h: 6%
Other: 6%

This market resolves to the first 50% time horizon that METR reports for Moonshot AI's Kimi K3 Thinking. If METR evaluates a Kimi K3 family model that reasons before answering (i.e. a reasoning model) but whose name does not contain "Thinking" (as Kimi K2 Thinking's did), that model still counts as Kimi K3 Thinking for the purposes of this market. Kimi K3 Code, Kimi K3 Heavy, and similar variants all count if they are the first such model that METR evaluates and reports on.

The 50% time horizon is a measure of AI autonomy based on the length of tasks an AI can do: roughly, it is the time humans take to complete tasks that an AI system can complete successfully 50% of the time. See METR's "Measuring AI Ability to Complete Long Tasks" for the technical definition. When that paper was published, Claude 3.7 Sonnet (released in February 2025) was the leading model, with a 50% horizon of 59 minutes.
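
As a rough illustration of the definition above: METR's paper fits a logistic curve to success probability as a function of the log of human task time, and the 50% horizon is the task length where that curve crosses 0.5. The sketch below assumes that functional form. The function name and the coefficient values are hypothetical, chosen only so the result lands near the 59-minute figure quoted above; they are not METR's fitted values.

```python
import math


def horizon_minutes(intercept: float, slope: float, p: float = 0.5) -> float:
    """Task length (in human minutes) at which the modeled success rate equals p.

    Assumed model: P(success) = sigmoid(intercept + slope * log2(minutes)),
    with slope < 0 so that longer tasks are harder.
    """
    if slope >= 0:
        raise ValueError("slope must be negative: success should fall with task length")
    logit = math.log(p / (1 - p))  # equals 0.0 when p == 0.5
    return 2 ** ((logit - intercept) / slope)


# Hypothetical coefficients for illustration only.
slope = -0.6
intercept = 0.6 * math.log2(59)
print(round(horizon_minutes(intercept, slope)))  # 59
```

Passing a stricter threshold such as p=0.8 gives a shorter horizon under the same curve, which is why the market specifies the 50% figure.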

Left bounds inclusive, right bounds exclusive.
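
To make the bound convention concrete, here is a minimal sketch of how a reported horizon would map to an answer. The function name is hypothetical, and it assumes that "Other" covers everything at or above 5.5h, which the description does not state explicitly.

```python
import math


def bucket_for(horizon_hours: float) -> str:
    """Return the answer label covering a 50% time horizon given in hours.

    Left bounds are inclusive and right bounds exclusive, so exactly 3.0h
    falls in "3h - 3.5h", not "2.5h - 3h".
    """
    if horizon_hours < 0:
        raise ValueError("horizon must be non-negative")
    if horizon_hours < 1:
        return "<1h"
    if horizon_hours >= 5.5:
        return "Other"  # assumption: "Other" means 5.5h or longer
    lower = math.floor(horizon_hours * 2) / 2  # round down to a half-hour boundary
    return f"{lower:g}h - {lower + 0.5:g}h"


print(bucket_for(59 / 60))  # <1h
print(bucket_for(3.0))      # 3h - 3.5h
print(bucket_for(5.5))      # Other
```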

See also:

/jim/gpt-52-metr

/jim/claude-45-opuss-metr50-horizon (jim's version)

/Bayesian/claude-opus-45s-metr50-time-horizon (my version)

/Bayesian/gemini-3s-50-time-horizon-per-metr

/Bayesian/grok-420s-metr-50-time-horizon

/Bayesian/grok-5s-50-time-horizon-per-metr

/Bayesian/r2s-50-time-horizon-per-metr

/Bayesian/kimi-k3-thinkings-metr-50-time-hori

Technical AI Timelines

I think METR has stopped operating just to spite you

@bens or Opus has a 30-day time horizon, so it takes them weeks to run its tests

People are also trading

Gemini 3's METR 50% time horizon

Grok 4.20's METR 50% time horizon

Will a Kimi reasoning model top LMArena by EOY2025?

Claude Opus 4.5's METR-50 time horizon

Gemini 3.0 Pro outperforms GPT-5 on METR 50% time horizon?

-9% 1d, 80% chance

Will GPT-5.2's METR 50% time horizon exceed 3 hours 30 minutes?

R2's METR 50% time horizon

Will a model achieve a METR 50% time-horizon of 4+ hours by the end of 2025?

+6% 1d, 31% chance

Grok 5's METR 50% time horizon

Will the METR long-horizons have a >6 month doubling time for at least a 4 month period before 2026?
