Trending
Articles related to "Empirical Research"
Remote Labor Index: Measuring AI Automation of Remote Work
cs.AI updates on arXiv.org 2025-10-31T04:09:47.000000Z
Reflections on the Reproducibility of Commercial LLM Performance in Empirical Software Engineering Studies
cs.AI updates on arXiv.org 2025-10-30T04:19:56.000000Z
A Comparison of Conversational Models and Humans in Answering Technical Questions: the Firefox Case
cs.AI updates on arXiv.org 2025-10-28T04:12:22.000000Z
Convergence and Generalization of Anti-Regularization for Parametric Models
cs.AI updates on arXiv.org 2025-10-27T06:35:02.000000Z
PTFA: An LLM-based Agent that Facilitates Online Consensus Building through Parallel Thinking
cs.AI updates on arXiv.org 2025-10-23T04:46:10.000000Z
Benchmarking Large Language Models for Personalized Guidance in AI-Enhanced Learning
cs.AI updates on arXiv.org 2025-10-23T04:24:11.000000Z
"Over-the-Hood" AI Inclusivity Bugs and How 3 AI Product Teams Found and Fixed Them
cs.AI updates on arXiv.org 2025-10-23T04:15:08.000000Z
Illusions of reflection: open-ended task reveals systematic failures in Large Language Models' reflective reasoning
cs.AI updates on arXiv.org 2025-10-22T04:13:19.000000Z
Beyond Final Code: A Process-Oriented Error Analysis of Software Development Agents in Real-World GitHub Scenarios
cs.AI updates on arXiv.org 2025-10-20T04:15:05.000000Z
Paper Copilot: Tracking the Evolution of Peer Review in AI Conferences
cs.AI updates on arXiv.org 2025-10-16T04:26:16.000000Z
Generative AI and Firm Productivity: Field Experiments in Online Retail
cs.AI updates on arXiv.org 2025-10-15T04:55:24.000000Z
Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning
cs.AI updates on arXiv.org 2025-10-15T04:51:45.000000Z
AI Adoption Across Mission-Driven Organizations
cs.AI updates on arXiv.org 2025-10-07T04:15:52.000000Z
Designing Empirical Studies on LLM-Based Code Generation: Towards a Reference Framework
cs.AI updates on arXiv.org 2025-10-07T04:15:50.000000Z
FaithCoT-Bench: Benchmarking Instance-Level Faithfulness of Chain-of-Thought Reasoning
cs.AI updates on arXiv.org 2025-10-07T04:07:24.000000Z
The Three Regimes of Offline-to-Online Reinforcement Learning
cs.AI updates on arXiv.org 2025-10-03T04:14:21.000000Z
Memorize or Generalize? Evaluating LLM Code Generation with Code Rewriting
cs.AI updates on arXiv.org 2025-10-01T06:02:12.000000Z
A Measurement Study of Model Context Protocol
cs.AI updates on arXiv.org 2025-10-01T06:00:07.000000Z