Schwertlilien
As a recorder: notes and ideas.

2026-1-17

ACM MM25 related

Segmentation

5039 Simple but Effective: Sub-Volume Contrastive Learning for Class-Imbalanced Semi-Supervised 3D Medical Image Segmentation
Xianrun Xu, Baoyao Yang, Wanyun Li, Jingsong Lin, Yufei Xu

3459 OFFSET: Segmentation-based Focus Shift Revision for Composed Image Retrieval
Zhiwei Chen, Yupeng Hu, Zixu Li, Zhiheng Fu, Xuemeng Song, Liqiang Nie

5849 Textual and Visual Guided Task Adaptation for Source-free Cross-Domain Few-Shot Segmentation
Jianming Liu, Wenlong Qiu, Haitao Wei

660 Epipolar Consistency-based Network for Structure-Aware LF Semantic Segmentation
Chen Gao, Youfang Lin, Wenbin Wang, Shuo Zhang

4430 PRIME: Prototype-Driven Class Incremental Learning for Medical Image Segmentation
Shengqian Zhu, yu chengrong, Wenbo Qi, Jiafei Wu, Ying Song, Guangjun Li, Zhang Yi, Xiaogang Xu, Junjie Hu

4371 Graph-Guided Dual-Level Augmentation for 3D Scene Segmentation
Hongbin Lin, Yifan Jiang, Juangui Xu, Jesse Xu, Yi Lu, Zhengyu Hu, Ying-Cong Chen, Hao Wang

5332 Symmetrical Awareness Generation for Pelvic Image Segmentation
Yize Song, Yunqing Chen, Zhou Wang, Cheng Chen, Ruoxiu Xiao

3763 RoDeCon-Net: Medical Image Segmentation via Robust Decoupling and Contrast-Enhanced Fusion
Yongquan Xue, Zhaoru Guo, Zhaozhao Su, Chong Peng, Jun Feng, Pan Zhou, Marcin Pietron, Xiyuan Wang, Liejun Wang, Panpan Zheng

3577 EMIFS: Efficient Multi-scale Information Fusion Self-supervision for Medical Image Segmentation
Luyao Ren, Wenxin Yu, Zhiqiang Zhang, Chang Liu

963 Novel Category Discovery with X-Agent Attention for Open-Vocabulary Semantic Segmentation
Jiahao Li, Yang Lu, Yachao Zhang, Fangyong Wang, Yuan Xie, Yanyun Qu

713 What You Perceive Is What You Conceive: A Cognition-Inspired Framework for Open Vocabulary Image Segmentation
Jianghang Lin, Yue Hu, Jiangtao Shen, Yunhang Shen, Liujuan Cao, Shengchuan Zhang, Rongrong Ji

2708 BrainSegDMlF: A Dynamic Fusion-enhanced SAM for Brain Lesion Segmentation
Hongming Wang, Yifeng Wu, Huimin Huang, Hongtao Wu, Jia-Xuan Jiang, Xiaodong Zhang, Hao Zheng, Yawen Huang, Xian Wu, Yefeng Zheng, Jinping Xu, Jing Cheng

1690 CFSSeg: Closed-Form Solution for Class-Incremental Semantic Segmentation of 2D Images and 3D Point Clouds
Jiaxu Li, Rui Li, Jianyu Qi, Songning Lai, Linpu Lv, Kejia Fan, Jianheng Tang, Yutao Yue, Dongzhan Zhou, Yunhuai Liu, Huiping Zhuang

5392 DiffuSeg: Diffusion-Enhanced Cross-Modal Semantic Segmentation for RGB-D
Jun Yang, MAOYU MAO

6349 Diffusion-Guided Knowledge Distillation for Weakly-Supervised Low-light Semantic Segmentation
Chunyan Wang, Dong Zhang, Jinhui Tang

2569 EIR-SDG: Explore Invariant Representation for Single-source Domain Generalization in Medical Image Segmentation
Ziwei Niu, Shiao Xie, Ziyue Wang, Yen Chen, Yueming Jin, Lanfen Lin

1368 MM-Prompt: Multi-modality and Multi-granularity Prompts for Few-Shot Segmentation
Hang Xiong, Runmin Cong, Jinpeng Chen, Chen Zhang, Feng Li, Huihui Bai, Sam Kwong

2301 Unified Medical Image Segmentation with State Space Modeling Snake
Ruicheng Zhang, Haowei Guo, Kanghui Tian, Jun Zhou, Mingliang Yan, Zeyu Zhang, Shen Zhao

4294 Discovering Maximum Frequency Consensus: Lightweight Federated Learning for Medical Image Segmentation
Lingren Wang, Wenxuan Tu, Jieren Cheng, Jianan Wang, Xiangyan Tang, Chenchen Wang

2851 Mitigating Query Selection Bias in Referring Video Object Segmentation
Dingwei Zhang, Dong Zhang, Jinhui Tang

2138 StitchFusion: Weaving Any Visual Modalities to Enhance Multimodal Semantic Segmentation
Bingyu Li, Da Zhang, Zhiyuan Zhao, Junyu Gao, Xuelong Li

3027 SAM-TTT: Segment Anything Model via Reverse Parameter Configuration and Test-Time Training for Camouflaged Object Detection
Zhenni Yu, Li Zhao, Guobao Xiao, Xiaoqin Zhang

985 VLHP: Learning Discriminative Vision-Language Hybrid Prototypes for Weakly Supervised Semantic Segmentation
Jingyuan Fang, Yang Ning, Xiushan Nie, Xinfeng Liu, Zhiyong Cheng

5993 PLATO-TTA: Prototype-Guided Pseudo-Labeling and Adaptive Tuning for Multi-Modal Test-Time Adaptation of 3D Segmentation
Jianxiang Xie, Yao Wu, Yachao Zhang, Xiaopei Zhang, Yuan Xie, Yanyun Qu

1971 Mobile U-ViT: Revisiting large kernel and U-shaped ViT for efficient medical image segmentation
Fenghe Tang, Bingkun Nian, Jianrui Ding, Wenxin Ma, Quan Quan, Chengqi Dong, Jie Yang, Wei Liu, S Kevin Zhou

2297 MARL-MambaContour: Unleashing Multi-Agent Deep Reinforcement Learning for Active Contour Optimization in Medical Image Segmentation
Ruicheng Zhang, Yu Sun, Zeyu Zhang, Jinai Li, Xiaofan Liu, Hoi Au, Haowei Guo, Puxin Yan

Food

7082 Ingredients-Guided and Nutrients-Prompted Network for Food Nutrition Estimation
Donglin Zhang, Boyuan Ma, Xiaojun Wu, Josef Kittler

6477 Like or Not to Like: An Usecase of Vietnamese Street Food Videos on YouTube
Nguyen Duy, Hoang Hoan, Thanh-Trung Phan

6239 DSDGF-Nutri: A Decoupled Self-Distillation Network with Gating Fusion For Food Nutritional Assessment
Sujuan Hou, Zhihui Feng, Hao Xiong, Weiqing Min, Peng Li, Shuqiang Jiang

5720 Spatial-Aware Multi-Modal Information Fusion for Food Nutrition Estimation
Dongjian Yu, Weiqing Min, Xin Jin, Qian Jiang, Shuqiang Jiang

Reinforcement Learning

6726 RAIDX: A Retrieval-Augmented Generation and GRPO Reinforcement Learning Framework for Explainable Deepfake Detection
Tianxiao Li, Zhenglin Huang, Haiquan Wen, Yiwei He, Shuchang Lyu, Baoyuan Wu, Guangliang Cheng

5378 Embodied-R: Collaborative Framework for Activating Embodied Spatial Reasoning in Foundation Models via Reinforcement Learning
Baining Zhao, Ziyou Wang, Jianjie Fang, Chen Gao, Fanhang Man, Jinqiang Cui, Xin Wang, Xinlei Chen, Yong Li, Wenwu Zhu

5301 From Pixels to Temporal Correlations: Learning Informative Representations for Reinforcement Learning Pre-training
Jinwen Wang, Youfang Lin, Xiaobo Hu, Siyu Yang, Sheng Han, Shuo Wang, Kai Lv

3346 PRE-MAP: Personalized Reinforced Eye-tracking and Multimodal LLM for High-Resolution Multi-Attribute Point Prediction
Hanbing Wu, Ping Jiang, Anyang Su, Chenxu Zhao, Tianyu Fu, Minghui Wu, Beiping Tan, Huiying Li

3208 Collaboration Wins More: Dual-Modal Collaborative Attention Reinforcement for Mitigating Large Vision Language Models Hallucination
Jiye Xie, Yifei Gao, Liangliang You, Xiang Xu, Haoran Xu, Zhiqiang Kou, Kexue Fu, Youyang Qu, Wenjie Yang, Jianwei Guo, Weiliang Meng, Longxiang Gao, Haoran Yang, Changwei Wang, Yu Zhang

7168 Dark Side of Modalities: Reinforced Multimodal Distillation for Multimodal Knowledge Graph Reasoning
Yu Zhao, Ying Zhang, Xuhui Sui, Baohang Zhou, Haoze Zhu, Jeff Pan, Xiaojie Yuan

5642 RecipeRAG: Advancing Recipe Generation with Reinforced Retrieval Augmented Generation
Jinghan Yang, Zhenbo Xu, Dehua Ma, Liu Liu, Fei Liu, Gong Huang, Zhaofeng He

2600 Multimodal Dual Population Evolutionary Reinforcement Learning
Yao Zhang, Ping Huang, Rui Zhang

1252 Graph-based Approximate Nearest Neighbor Search by Deep Reinforcement Routing
Mingjie Li, Junhao Lin, Dian Ouyang, Ying Zhang, Wei Wang

2732 Secure Tug-of-War (SecTOW): Iterative Defense-Attack Training with Reinforcement Learning for Multimodal Model Security
Mz Dai, Shixuan Liu, Zhiyuan Zhao, Junyu Gao, Hao Sun, Xuelong Li

2297 MARL-MambaContour: Unleashing Multi-Agent Deep Reinforcement Learning for Active Contour Optimization in Medical Image Segmentation
Ruicheng Zhang, Yu Sun, Zeyu Zhang, Jinai Li, Xiaofan Liu, Hoi Au, Haowei Guo, Puxin Yan

MARL-MambaContour

Three main problems it addresses

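1. Mask cavities

  • Scenario: brain MRI tumor segmentation (the BraTS2023 dataset in the paper)
    Edema and necrosis make the signal intensity inside a tumor non-uniform: some necrotic regions have gray levels close to those of normal brain tissue.
  • Problem with pixel-wise methods: they only decide whether each individual pixel "is tumor", so pixels with atypical signal inside the tumor get classified as "non-tumor". The final tumor mask (white region) ends up with a black "hole" in the middle, like a cookie with a hole in it.
  • Practical impact: clinicians cannot compute the tumor volume accurately and may underestimate the lesion size, which affects treatment planning.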
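
A minimal sketch of how such a cavity could be detected in a predicted binary mask. This is only an illustration of the failure mode, not the paper's method: the function name `count_cavity_pixels`, the toy mask, and the choice of 4-connectivity are all assumptions made here. Background pixels that cannot be reached from the image border are counted as cavity pixels.

```python
from collections import deque

def count_cavity_pixels(mask):
    """Count background pixels fully enclosed by foreground (4-connectivity).

    `mask` is a 2D list of 0/1 values. Hypothetical helper for illustration.
    """
    h, w = len(mask), len(mask[0])
    reached = [[False] * w for _ in range(h)]
    queue = deque()
    # Seed the flood fill with every background pixel on the image border.
    for r in range(h):
        for c in range(w):
            on_border = r in (0, h - 1) or c in (0, w - 1)
            if on_border and mask[r][c] == 0:
                reached[r][c] = True
                queue.append((r, c))
    # Spread through background pixels that are connected to the border.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < h and 0 <= nc < w
                    and mask[nr][nc] == 0 and not reached[nr][nc]):
                reached[nr][nc] = True
                queue.append((nr, nc))
    # Background pixels never reached are enclosed by the mask: cavities.
    return sum(
        1
        for r in range(h)
        for c in range(w)
        if mask[r][c] == 0 and not reached[r][c]
    )

# Toy "tumor" mask with one misclassified pixel in the middle.
toy_tumor_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(count_cavity_pixels(toy_tumor_mask))  # 1
```

A non-zero count means the mask area underestimates the lesion, which is exactly the volume error described above.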

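2. Disrupted connectivity

  • Scenario: spine CT segmentation (the VerSe dataset in the paper)
    The spine is a continuous chain of vertebrae, but slice thickness and artifacts in CT can weaken the gray-level intensity of pixels where adjacent vertebrae connect.
  • Problem with pixel-wise methods: they look only at the local features of each pixel, so the weakened pixels at the junctions are classified as "non-vertebra" and the originally continuous spine segmentation is "broken". For example, a gap appears between the 3rd and 4th vertebrae, leaving two isolated masks.
  • Practical impact: spinal continuity cannot be assessed reliably, and pathologies such as vertebral fusion or fractures may be missed.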
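
One way to make this failure measurable is to count connected components in the predicted mask: a structure that should be continuous yields one component, a broken one yields several. The sketch below is an assumption-laden illustration (hypothetical function `count_components`, toy masks, 4-connectivity), not something taken from the paper.

```python
from collections import deque

def count_components(mask):
    """Count 4-connected foreground components in a 2D 0/1 mask.

    Hypothetical helper for illustration only.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = 0
    for r0 in range(h):
        for c0 in range(w):
            if mask[r0][c0] == 0 or seen[r0][c0]:
                continue
            # Found an unvisited foreground pixel: a new component starts here.
            components += 1
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < h and 0 <= nc < w
                            and mask[nr][nc] == 1 and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
    return components

# Toy "spine" column: intact versus broken by one misclassified row.
intact_spine = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
broken_spine = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
print(count_components(intact_spine))  # 1
print(count_components(broken_spine))  # 2
```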

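3. Erroneous regional mergers

  • Scenario: abdominal CT organ segmentation (the RAOS dataset in the paper)
    The liver and gallbladder are adjacent; in some cases the gallbladder wall is thin and its density differs little from the liver's, and image noise adds further interference.
  • Problem with pixel-wise methods: they look only at the local features of each pixel and cannot distinguish "liver boundary pixels" from "gallbladder boundary pixels", so pixels from two separate organs are assigned to the same class. In the result, the liver and gallbladder masks "stick together" into one merged region.
  • Practical impact: organ boundaries are confused, which can lead to misjudging organ positions during surgical planning and increase surgical risk.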
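
A merger can be flagged, assuming a reference label map is available, by checking whether a single connected component of the prediction covers more than one reference organ. The sketch below is hypothetical throughout: the function `find_merged_components`, the label ids (1 for liver, 2 for gallbladder), and the toy arrays are illustrative assumptions, not the paper's evaluation protocol.

```python
from collections import deque

def find_merged_components(pred_mask, ref_labels):
    """List the reference labels covered by each predicted component
    that spans more than one organ (4-connectivity).

    `pred_mask` is a 2D 0/1 list; `ref_labels` is a 2D list of integer
    organ ids with 0 as background. Hypothetical helper for illustration.
    """
    h, w = len(pred_mask), len(pred_mask[0])
    seen = [[False] * w for _ in range(h)]
    merged = []
    for r0 in range(h):
        for c0 in range(w):
            if pred_mask[r0][c0] == 0 or seen[r0][c0]:
                continue
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            organs = set()
            while queue:
                r, c = queue.popleft()
                # Record which reference organ this predicted pixel lies on.
                if ref_labels[r][c] != 0:
                    organs.add(ref_labels[r][c])
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < h and 0 <= nc < w
                            and pred_mask[nr][nc] == 1 and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            # One predicted blob covering several organs is a merger.
            if len(organs) > 1:
                merged.append(sorted(organs))
    return merged

# Assumed ids: 1 = liver, 2 = gallbladder, separated by one background column.
ref = [
    [1, 1, 0, 2, 2],
    [1, 1, 0, 2, 2],
]
pred_separate = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
]
pred_merged = [
    [1, 1, 1, 1, 1],  # one misclassified pixel bridges the two organs
    [1, 1, 0, 1, 1],
]
print(find_merged_components(pred_separate, ref))  # []
print(find_merged_components(pred_merged, ref))    # [[1, 2]]
```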