Abstract
Metacognition, the ability to assess one's own cognitive performance, is a fundamental capability documented across species where internal confidence estimates guide adaptive behavior. This research investigates whether Large Language Models (LLMs) actively utilize confidence signals to regulate their behavior through a four-phase abstention paradigm. The study first establishes internal confidence estimates without abstention options, then reveals that LLMs apply implicit thresholds to these estimates when deciding whether to answer or abstain. Findings demonstrate that confidence serves as the dominant predictor of behavior, with effect sizes an order of magnitude larger than knowledge retrieval accessibility or semantic features. Causal evidence is provided through activation steering experiments, where manipulating internal confidence signals correspondingly shifts abstention rates, demonstrating a direct causal relationship between confidence estimation and behavioral regulation.
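The abstention mechanism described above can be sketched as a simple threshold rule: an internal confidence estimate is compared against an implicit cutoff, and steering that estimate shifts the abstention rate. The following is a minimal illustrative sketch, not the paper's implementation; the function names, threshold value, and steering offset are all assumptions.

```python
# Toy sketch of threshold-based abstention and confidence steering.
# All names and constants are illustrative, not taken from the paper.

def abstention_decision(confidence: float, threshold: float = 0.5) -> str:
    """Answer when internal confidence clears the implicit threshold,
    abstain otherwise."""
    return "answer" if confidence >= threshold else "abstain"

def steer(confidence: float, alpha: float) -> float:
    """Toy analogue of activation steering: shift the confidence signal
    by alpha and clip to [0, 1]. A positive alpha should lower the
    abstention rate, mirroring the causal result reported in the abstract."""
    return min(1.0, max(0.0, confidence + alpha))

confidences = [0.2, 0.45, 0.6, 0.9]
baseline = [abstention_decision(c) for c in confidences]
steered = [abstention_decision(steer(c, 0.2)) for c in confidences]
print(baseline)  # two abstentions at the default threshold
print(steered)   # one abstention after steering confidence upward
```

The point of the sketch is only the direction of the effect: raising the internal confidence signal, with the decision rule held fixed, reduces abstentions.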