Abstract
Large language models (LLMs) are increasingly deployed in human-AI teams as support agents for complex tasks such as information retrieval, programming, and decision-making assistance. This paper studies the potential role of LLMs as defensive supervisors within mixed human-AI teams to detect malicious behavior. Using a dataset consisting of multi-party conversations and decisions over a 25-round horizon, we formulate the problem of malicious behavior detection from interaction traces. We find that LLMs are capable of identifying malicious behavior effectively, demonstrating their potential as defensive actors in collaborative environments.