Trending
Articles related to "Evals"
Measuring what matters: An intro to AI evals
Braintrust Blog 2025-10-10T23:05:13.000000Z
OpenAI Debuts Agent Builder and AgentKit: A Visual-First Stack for Building, Deploying, and Evaluating AI Agents
MarkTechPost@AI 2025-10-07T03:00:34.000000Z
A/B testing can't keep up with AI
Braintrust Blog 2025-10-02T12:51:38.000000Z
Podcast recommendation | Building products whose capabilities you cannot predict
孔某人的低维认知 2025-09-25T10:02:01.000000Z
No Title
OpenAI Cookbook 2025-06-25T07:08:39.000000Z