Research on Adaptive Gradient Clipping Methods Based on Asynchronous Federated Learning Framework
Citation: LI Miaomiao, HU Xiaoming, BAI Shuangjie, LIU Yan. Research on adaptive gradient clipping methods based on asynchronous federated learning framework[J]. Journal of Shanghai Polytechnic University, 2025, 42(1): 51-58
Author affiliations:
LI Miaomiao, HU Xiaoming, BAI Shuangjie, LIU Yan: a. School of Computer and Information Engineering; b. Institute of Artificial Intelligence, Shanghai Polytechnic University, Shanghai 201209, China
Abstract: As a distributed machine learning paradigm, the asynchronous federated learning framework suffers from inconsistent update paces across nodes, which slows convergence and degrades both the stability of the training process and the final performance of the model. To improve the performance of asynchronous federated learning and address these issues, a new adaptive gradient clipping (AGC) method is proposed. It employs a dynamically adjusted clipping threshold: by adapting the threshold to different scenarios, it prevents gradient explosion while avoiding excessive compression when gradients are small, thereby improving the stability of model training. On this basis, an asynchronous federated learning framework incorporating AGC (AGC-FedAsync) is proposed. Experimental results show that on the CIFAR-10 dataset, the best accuracy of the ResNet34 model under AGC-FedAsync improves from 17.55% to 26.96%, an increase of 9.41 percentage points, and the final accuracy improves from 10.00% to 24.15%, an increase of 14.15 percentage points. The ResNet18 and ResNet50 models also achieve significant accuracy gains. Even on the more challenging and complex CIFAR-100 dataset, AGC-FedAsync not only improves recognition accuracy but also enhances the model's adaptability to different client characteristics, optimizing model training efficiency while protecting data privacy.
Keywords: asynchronous federated learning; adaptive gradient clipping; residual network
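The abstract describes the clipping rule only at a high level: a threshold that adjusts dynamically so large gradients are capped without a fixed hard limit and small gradients are not over-compressed. A minimal sketch of that general idea, using an exponential moving average of recent gradient norms as the adaptive statistic, might look like the following (the EMA rule and the `beta`/`scale` constants are illustrative assumptions, not the paper's exact method):

```python
import numpy as np

def adaptive_clip(grad, ema_norm, beta=0.9, scale=1.5, eps=1e-8):
    """Clip `grad` against a dynamically adjusted threshold.

    The threshold tracks an exponential moving average (EMA) of recent
    gradient norms: it rises when gradients have been large, preventing
    explosion without a fixed hard cap, and falls when gradients have
    been small, avoiding over-compression.
    """
    norm = np.linalg.norm(grad)
    ema_norm = beta * ema_norm + (1.0 - beta) * norm  # running norm estimate
    threshold = scale * ema_norm                      # dynamic clip level
    if norm > threshold:
        grad = grad * (threshold / (norm + eps))      # rescale, keep direction
    return grad, ema_norm

# Usage: a small gradient passes through; a spike is rescaled down.
ema = 1.0
g_small = np.array([0.1, 0.1])
g_spike = np.array([100.0, 0.0])
g1, ema = adaptive_clip(g_small, ema)  # below threshold: returned unchanged
g2, ema = adaptive_clip(g_spike, ema)  # above threshold: norm reduced
```

Because the threshold follows the gradient's own recent history, the same rule behaves as a loose cap during stable training and a tight one after a spike, which is the stability property the abstract attributes to AGC.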
 
Research on Adaptive Gradient Clipping Methods Based on Asynchronous Federated Learning Framework
Abstract: As a distributed machine learning paradigm, the asynchronous federated learning framework faces challenges due to the inconsistent update pace among nodes, which can slow convergence and affect both the stability of the training process and the final performance of the model. To enhance the performance of asynchronous federated learning and address these issues, a new adaptive gradient clipping (AGC) method is proposed. This method employs a dynamically adjusted clipping threshold that is modified according to different scenarios to prevent gradient explosion and to avoid excessive compression when gradients are small, thereby improving the stability of model training. On this basis, an asynchronous federated learning framework with AGC (AGC-FedAsync) is introduced. Experimental results show that on the CIFAR-10 dataset, the best accuracy of the ResNet34 model under AGC-FedAsync improves from 17.55% to 26.96%, an increase of 9.41 percentage points; the final accuracy improves from 10.00% to 24.15%, an increase of 14.15 percentage points. The ResNet18 and ResNet50 models also achieve significant accuracy improvements. Even on the more challenging and complex CIFAR-100 dataset, AGC-FedAsync not only improves the model's recognition accuracy but also enhances its adaptability to different client characteristics, optimizing model training efficiency while protecting data privacy.
Keywords: asynchronous federated learning; adaptive gradient clipping; residual network
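AGC-FedAsync builds on an asynchronous aggregation scheme in which the server merges each client update as soon as it arrives, rather than waiting for a synchronization round. A common staleness-aware merge rule is sketched below; the polynomial decay and the `alpha0`/`a` constants are assumptions for illustration, since the abstract does not specify AGC-FedAsync's exact aggregation:

```python
import numpy as np

def fedasync_merge(w_server, w_client, staleness, alpha0=0.6, a=0.5):
    """Staleness-aware asynchronous server update (FedAsync-style sketch).

    Each arriving client model is mixed into the server model with a
    weight that decays with staleness (how many server versions behind
    the client's snapshot was), so updates computed against an old
    server model count less.
    """
    alpha = alpha0 * (staleness + 1.0) ** (-a)       # staler => smaller weight
    return (1.0 - alpha) * w_server + alpha * w_client

# Usage: a fresh update moves the server model much more than a stale one.
w_s = np.zeros(3)
w_c = np.ones(3)
w_fresh = fedasync_merge(w_s, w_c, staleness=0)  # mixing weight 0.6
w_stale = fedasync_merge(w_s, w_c, staleness=8)  # mixing weight 0.2
```

Down-weighting stale updates addresses the "inconsistent update pace" problem the abstract identifies: fast clients cannot be repeatedly overwritten by clients whose gradients were computed against an outdated model.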