Articles related to "Evals"
Measuring what matters: An intro to AI evals
Braintrust Blog
2025-10-10T23:05:13.000000Z
OpenAI Debuts Agent Builder and AgentKit: A Visual-First Stack for Building, Deploying, and Evaluating AI Agents
MarkTechPost@AI
2025-10-07T03:00:34.000000Z
A/B testing can't keep up with AI
Braintrust Blog
2025-10-02T12:51:38.000000Z
Podcast recommendation | Building products whose capabilities you can't predict
孔某人的低维认知
2025-09-25T10:02:01.000000Z
No Title
OpenAI Cookbook
2025-06-25T07:08:39.000000Z