2025.11.24 | Open-source 7B model sets a new bar in multimodal reasoning; GeoVista, a small model for precise geolocation

Published on Nov 24
10 min
HuggingFace Daily AI Paper Digest
<p>The 15 papers in this episode are:</p><p>[00:21] 🧠 OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe</p><p>[01:04] 🌍 GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization</p><p>[01:41] 🎯 SAM 3: Segment Anything with Concepts</p><p>[02:31] 📊 Unveiling Intrinsic Dimension of Texts: from Academic Abstract to Creative Story</p><p>[03:09] 🧠 O-Mem: Omni Memory System for Personalized, Long Horizon, Self-Evolving Agents</p><p>[03:43] 🦜 Parrot: Persuasion and Agreement Robustness Rating of Output Truth -- A Sycophancy Robustness Benchmark for LLMs</p><p>[04:26] 🧠 RynnVLA-002: A Unified Vision-Language-Action and World Model</p><p>[05:19] 🧠 VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models</p><p>[0...