本期的 15 篇论文如下:
00:23 🎯 AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders(AdaSPEC:面向高效推测解码的选择性知识蒸馏)
00:57 🤖 Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1(低成本人机协作论文一键成页:低于0.1美元)
01:35 🔍 Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence(Open-o3视频:显式时空证据支撑的开放域视频推理)
02:06 🎬 HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives(HoloCine:端到端生成多镜头长时电影级叙事视频)
02:52 🌀 Loopholing Discrete Diffusion: Deterministic Bypass of the Sampling Wall(绕过离散扩散采样墙的确定性捷径)
03:33 💎 Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values(每个问题都有它的价值:显式人类价值驱动的强化学习)
04:06 ⚖ The Massive Legal Embedding Benchmark (MLEB)(大规模法律嵌入评测基准(MLEB))
04:48 🔍 DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion(DyPE:面向超高分辨率扩散模型的动态位置外推方法)
05:33 🕵 Conan: Progressive Learning to Reason Like a Detective over Multi-Scale Visual Evidence(柯南:像侦探一样在多尺度视觉证据上渐进式推理)
06:12 🤖 Search Self-play: Pushing the Frontier of Agent Capability without Supervision(搜索自博弈:无需监督即可拓展智能体能力边界)
06:56 🎭 Investigating Safety Vulnerabilities of Large Audio-Language Models Under Speaker Emotional Variations(探究大音频语言模型在说话人情绪变化下的安全漏洞)
07:42 🖼 LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas(LayerComposer:基于空间感知分层画布的交互式个性化文生图)
08:10 🎧 SAKE: Towards Editing Auditory Attribute Knowledge of Large Audio-Language Models(SAKE:面向大型音频-语言模型听觉属性知识编辑的探索)
08:51 🖼 ARGenSeg: Image Segmentation with Autoregressive Image Generation Model(ARGenSeg:基于自回归图像生成的图像分割)
09:39 🧩 Seed3D 1.0: From Images to High-Fidelity Simulation-Ready 3D Assets(Seed3D 1.0:从单张图像生成高保真、可仿真的3D资产)

【关注我们】
您还可以在以下平台找到我们,获得播客内容以外更多信息
小红书: AI速递
