VIGIL: Part-Grounded Structured Reasoning for Generalizable Deepfake Detection
cs.CV | Tags: CV, Transformer, Trending, Embodied AI, Multimodal
VIGIL Authors
March 23, 2026
arXiv: 2603.21526v1



Abstract

This paper presents VIGIL, a novel part-centric structured forensic framework for deepfake detection using multimodal large language models. The approach employs a plan-then-examine pipeline where the model first plans which facial parts warrant inspection based on global visual cues, then examines each part with independently sourced forensic evidence. A stage-gated injection mechanism delivers part-level forensic evidence only during examination to ensure unbiased part selection. The framework is inspired by expert forensic practice and aims to improve the reliability of deepfake detection by separating evidence generation from manipulation localization, addressing the issue of hallucinated explanations in current MLLM-based methods.
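The plan-then-examine flow with stage-gated evidence injection can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the paper's implementation: the part list, the threshold-based `plan` and `examine` functions, the `evidence_source` callable, and the "any suspicious part means fake" rule are all hypothetical, and simple thresholds stand in for the multimodal LLM's reasoning. The point it shows is that part selection sees only global cues, while part-level evidence is queried only for parts already selected.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical part vocabulary; the abstract does not list the facial parts.
PARTS = ["eyes", "nose", "mouth", "skin"]


@dataclass
class PartFinding:
    part: str
    evidence: float   # score from an independent forensic source (assumed in [0, 1])
    suspicious: bool


def plan(global_cues: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Stage 1: pick parts to inspect from global visual cues only.

    No part-level forensic evidence is available here, so it cannot
    bias which parts get selected.
    """
    return [p for p in PARTS if global_cues.get(p, 0.0) >= threshold]


def examine(
    parts: List[str],
    evidence_source: Callable[[str], float],
    threshold: float = 0.5,
) -> List[PartFinding]:
    """Stage 2: evidence is injected here, one selected part at a time."""
    findings = []
    for part in parts:
        score = evidence_source(part)  # the gate: queried only after planning
        findings.append(PartFinding(part, score, score >= threshold))
    return findings


def detect(
    global_cues: Dict[str, float],
    evidence_source: Callable[[str], float],
) -> Tuple[str, List[PartFinding]]:
    """Run both stages; the aggregation rule is an assumed placeholder."""
    selected = plan(global_cues)
    findings = examine(selected, evidence_source)
    verdict = "fake" if any(f.suspicious for f in findings) else "real"
    return verdict, findings
```

In this sketch the gate is structural: `plan` has no parameter through which part-level evidence could reach it, so the separation does not depend on the caller's discipline.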



Categories

cs.CV, cs.AI
