Abstract
This paper introduces dynActivation, a per-layer trainable activation function that dynamically interpolates between base nonlinearities and linear paths using lightweight learned scalars. The proposed method is evaluated across vision tasks (CIFAR-10, MNIST) and language modeling tasks, demonstrating significant improvements in training efficiency (up to 54% faster) and performance. On CIFAR-10, dynActivation(Mish) achieves up to 14.02% improvement over static Mish, with 24% reduction in convergence time. In deep network scaling experiments (up to 75 layers), dynActivation maintains robust performance (95.3-99.3% accuracy) while ReLU collapses below 80%, demonstrating that adaptive nonlinearity linearization in deep layers enhances both training stability and final model quality.
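The abstract describes dynActivation as interpolating between a base nonlinearity and a linear path via a lightweight learned scalar. The paper's actual formulation isn't given here, so the following is a minimal sketch of one plausible reading, assuming a sigmoid-gated per-layer blend between Mish and the identity; the names `dyn_activation` and `gate`, and the sigmoid gating itself, are illustrative assumptions rather than the authors' implementation.

```python
import math

def mish(x: float) -> float:
    # Mish nonlinearity: x * tanh(softplus(x))
    return x * math.tanh(math.log1p(math.exp(x)))

def dyn_activation(x: float, gate: float) -> float:
    # Hypothetical sketch: blend the base nonlinearity with the identity.
    # `gate` is a per-layer trainable scalar; alpha = sigmoid(gate) in (0, 1).
    alpha = 1.0 / (1.0 + math.exp(-gate))
    return alpha * mish(x) + (1.0 - alpha) * x

# gate -> +inf recovers pure Mish; gate -> -inf recovers a pure linear path,
# which is one way deep layers could "linearize" as the abstract suggests.
```

In a real network the scalar would be a trainable parameter updated by backpropagation alongside the weights, with one gate per layer so each depth can settle on its own degree of nonlinearity.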