
<p>The 15 papers in this episode are as follows:</p><p>[00:24] 🧠 Qwen3-VL Technical Report</p><p>[00:57] 🧠 PretrainZero: Reinforcement Active Pretraining</p><p>[01:36] 🎬 ViDiC: Video Difference Captioning</p><p>[02:24] 🧠 OneThinker: All-in-one Reasoning Model for Image and Video</p><p>[03:07] 🔄 Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation</p><p>[03:59] ⚙ Steering Vision-Language-Action Models as Anti-Exploration: A Test-Time Scaling Approach</p><p>[04:46] 🤖 SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL</p><p>[05:22] 🔧 Thinking with Programming Vision: Towards a Unified View for Thinking with Images</p><p>[06:01] 🔄 Flowing Backwards: Improving Normalizing Flows via Reverse Representation Alignment</p><p>[06:51] 🎮 RELIC: Interactive Video World M...