Reasoning or Rhetoric? An Empirical Analysis of Moral Reasoning Explanations in Large Language Models
Tags: cs.CL · Large Language Models · Transformer
Researchers
March 23, 2026
arXiv: 2603.21854v1

Authors: 1

Tags: 3

Content: PDF available; original + Chinese translation


Abstract

This paper investigates whether large language models demonstrate genuine moral reasoning capabilities or merely produce superficially convincing reasoning-like outputs. The study analyzes responses from 13 different LLMs across six classical moral dilemmas using Kohlberg's stages of moral development as an evaluation framework. Through an LLM-as-judge scoring pipeline validated across three judge models, over 600 responses were classified and analyzed. The research reveals a significant finding that LLM responses predominantly exhibit post-conventional reasoning patterns (Stages 5-6), which contradicts typical human developmental trajectories where such reasoning emerges later in moral development. This inversion suggests that alignment training may produce outputs that mimic advanced moral reasoning without the underlying developmental progression characteristic of human moral cognition.

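The paper does not include its scoring code here; a minimal sketch of the kind of LLM-as-judge pipeline the abstract describes might look like the following. All names, the prompt, and the majority-vote aggregation are assumptions for illustration; `judges` stands in for calls to the three judge-model APIs mentioned in the abstract.

```python
from collections import Counter

# Kohlberg's six stages, grouped into the three classical levels.
STAGE_LEVELS = {
    1: "pre-conventional", 2: "pre-conventional",
    3: "conventional",     4: "conventional",
    5: "post-conventional", 6: "post-conventional",
}

# Hypothetical judge prompt; the paper's actual wording is not given here.
JUDGE_PROMPT = (
    "Classify the moral reasoning in the response below into one of "
    "Kohlberg's stages (1-6). Answer with a single digit.\n\n{response}"
)

def judge_stage(response: str, judges) -> int:
    """Score one response with each judge model and take the majority vote."""
    votes = [judge(JUDGE_PROMPT.format(response=response)) for judge in judges]
    stage, _count = Counter(votes).most_common(1)[0]
    return stage

def stage_distribution(responses, judges) -> dict:
    """Map majority-vote stages to Kohlberg levels and return their proportions."""
    counts = Counter(STAGE_LEVELS[judge_stage(r, judges)] for r in responses)
    total = sum(counts.values())
    return {level: n / total for level, n in counts.items()}
```

Cross-judge validation here is reduced to a simple majority vote over three judges; the paper's actual validation procedure may differ, and a real pipeline would also need parsing and retry logic for malformed judge outputs.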

PDF

View or download the full paper on arXiv

Categories

cs.CL · cs.AI
