Secure Distributed Machine Learning in the 5G/6G Era

By Huawei Huang, Dec. 31, 2019

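Update, Aug. 11, 2020: A paper from this project, “PIRATE: A Blockchain-based Secure Framework of Distributed Machine Learning in 5G Networks”, has been accepted by IEEE Network, a top journal in computer networking and communications that publishes hot research topics, key issues, and the latest advances from the networking community. The 2019 impact factor of IEEE Network is 9.590 (released in 2020), and the journal is placed in Zone 1 (the top tier) of the Chinese Academy of Sciences journal ranking.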

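About the paper:

As the manufacturing cost of AI chips keeps falling, more and more mobile devices are becoming capable of machine learning. At the same time, network connection quality, the bottleneck of distributed machine learning, will improve greatly with the arrival of the 5G/6G era. To meet the demand for new forms of distributed machine learning in that era, we need a framework that can support distributed machine learning at large scale. Most importantly, when large numbers of users participate, the security of distributed machine learning deserves serious attention.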

Figure 1. The proposed PIRATE framework has two critical components: 1) reliability assessment; 2) blockchain (BC) systems for distributed SGD (D-SGD). Gradient aggregations and model parameters are protected by blockchains. Meanwhile, reliability assessment determines the participants of distributed learning tasks.

Figure 2. Under the “Ring Allreduce” mechanism, malicious attackers can launch attacks from both inside and outside. (1) Outside attackers can contaminate the training models held by target nodes. (2) Outside attackers can also attack partial gradient aggregations. (3) Byzantine computing nodes can send harmful aggregations that damage the convergence of learning tasks.

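In theory, a malicious attacker can target every stage of distributed machine learning. To defend against arbitrary attacks that hinder training convergence, we propose PIRATE (Figure 1), a blockchain-based secure distributed learning framework. Because verification in blockchain systems is flexible, the framework is not limited to protecting training convergence as described in the paper; it also has great potential for other kinds of security protection. More protection mechanisms can be built on top of it, such as privacy protection for the devices taking part in distributed learning, protection against Model Poisoning Attacks, and incentive mechanisms for all participants.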

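As shown in Figure 1, PIRATE consists of two main parts: 1) device reliability assessment, which evaluates how reliable a device is and thereby decides whether it may take part in a learning task; 2) a secure SGD (Stochastic Gradient Descent) framework built on multiple shard chains.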

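We adopt the decentralized Ring AllReduce structure (Figure 2). This architecture spreads the network load more evenly and allows gradient computation to proceed while computation results are being verified. In addition, so that nodes can communicate efficiently and verifiably under Ring AllReduce, we use a sharding-based blockchain: all nodes are divided into multiple committees, and each node only needs to verify the gradient computations within its own committee. This distributed verification greatly reduces the latency caused by broadcasting.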

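Simulation experiments show that, under 5G network conditions, with relatively large training models and large numbers of participating users, the proposed PIRATE framework uses less storage than a comparable framework, LearningChain, and trains distributed machine learning models faster.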

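A preliminary version of this work has been posted on arXiv. The first author is Sicong Zhou (周思聪), a first-year graduate student in our lab. For a first paper, it is forward-looking and commendable.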

Paper links:

arXiv: https://arxiv.org/abs/1912.07860

ResearchGate: ResearchGate page

Direct download: IEEE-Network_Pirate

===============================================
【English Version】

News: The paper titled “PIRATE: A Blockchain-based Secure Framework of Distributed Machine Learning in 5G Networks” was accepted by IEEE Network (IF: 9.59) on Aug. 11, 2020.

Introduction:

As the production cost of AI chips gradually falls to an acceptable level, mobile devices are increasingly equipped with the computational resources needed for machine learning. Meanwhile, network conditions, the bottleneck of distributed machine learning, will improve substantially as we move into the 5G/6G era. To exploit the merits of 5G/6G networks, a large-scale distributed learning framework is needed. In particular, when large numbers of users participate, security problems become even more critical.

To protect against arbitrary attacks that hinder training convergence, we propose PIRATE, a blockchain-based secure distributed learning framework. Because verification in blockchain systems is flexible, the framework has great potential: more protection mechanisms can be built on top of it, e.g., privacy protection, protection against Model Poisoning Attacks, and incentive mechanisms.

As shown in Figure 1, PIRATE has two components: 1) reliability assessment, which decides whether a device can take part in a learning task; 2) a secure SGD framework based on multiple shard chains.

We adopt the decentralized Ring AllReduce architecture (Figure 2), which makes better use of network resources and lets devices verify computation results while computing gradients.
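
For readers unfamiliar with Ring AllReduce, the following is a minimal single-process sketch of the plain communication pattern only. It does not include PIRATE's blockchain-based verification, and the function name `ring_allreduce` and the list-based "nodes" are illustrative assumptions, not code from the paper. Each node splits its gradient into as many chunks as there are nodes and passes one chunk to its ring neighbor per step, so no single node has to receive every full gradient.

```python
def ring_allreduce(grads):
    """Sum equal-length gradient vectors held by nodes arranged in a ring.

    grads[i] is the local gradient of node i. Returns one summed vector per
    node. Illustrative simulation only: no networking, no verification.
    """
    n = len(grads)
    dim = len(grads[0])
    assert dim % n == 0, "sketch assumes the gradient splits evenly into n chunks"
    size = dim // n
    # chunks[i][c] is node i's local copy of chunk c.
    chunks = [[list(g[c * size:(c + 1) * size]) for c in range(n)] for g in grads]

    # Phase 1 (scatter-reduce): after n-1 steps, node i holds the complete
    # sum for chunk (i + 1) % n.
    for step in range(n - 1):
        # Snapshot all messages first so that sends within a step are simultaneous.
        msgs = [(i, (i - step) % n, list(chunks[i][(i - step) % n])) for i in range(n)]
        for sender, c, payload in msgs:
            receiver = (sender + 1) % n
            chunks[receiver][c] = [a + b for a, b in zip(chunks[receiver][c], payload)]

    # Phase 2 (all-gather): completed chunks circulate around the ring and
    # overwrite the partial sums held by each receiver.
    for step in range(n - 1):
        msgs = [(i, (i + 1 - step) % n, list(chunks[i][(i + 1 - step) % n])) for i in range(n)]
        for sender, c, payload in msgs:
            receiver = (sender + 1) % n
            chunks[receiver][c] = payload

    # Flatten each node's chunks back into a single vector.
    return [[x for chunk in node for x in chunk] for node in chunks]


local_grads = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
summed = ring_allreduce(local_grads)
```

In this toy run every node ends up with the same element-wise sum of the three gradients. Each node only ever talks to its successor in the ring, which is why the intermediate partial aggregations in Figure 2 are a natural attack surface.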

Furthermore, to enable efficient and verifiable communication under the Ring AllReduce setting, we use a sharding-based blockchain. In particular, we divide nodes into multiple committees, and each node is only required to verify the gradients computed within its own committee. This division greatly reduces broadcast latency.

Simulation experiments show that, with 5G/6G network conditions, relatively large training models, and large numbers of participants, PIRATE outperforms a similar framework, LearningChain, in terms of storage complexity and latency.
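
To give a rough feel for why committees help, the sketch below counts broadcast messages per verification round with and without sharding. It rests on simplifying assumptions that are not taken from the paper: committees are formed by slicing a node list into fixed-size groups (the actual assignment rule is not described in this post), every member of a group sends its gradient to every other member, and the message count is used only as a crude proxy for latency. The helper names are hypothetical.

```python
def split_into_committees(node_ids, committee_size):
    """Partition node IDs into consecutive groups (assumed assignment rule)."""
    return [node_ids[i:i + committee_size]
            for i in range(0, len(node_ids), committee_size)]


def broadcast_messages(group_size):
    """Messages needed when every member sends to every other member."""
    return group_size * (group_size - 1)


nodes = list(range(12))
committees = split_into_committees(nodes, 4)

# Without sharding: all 12 nodes verify each other's gradients.
global_cost = broadcast_messages(len(nodes))

# With sharding: verification traffic stays inside each committee.
sharded_cost = sum(broadcast_messages(len(c)) for c in committees)
```

With 12 nodes in committees of 4, the all-to-all count drops from 132 messages to 36 in this toy model. The real saving depends on the consensus protocol and network conditions evaluated in the paper.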

Please feel free to download and read from the following URLs:

arXiv: https://arxiv.org/abs/1912.07860

ResearchGate: PDF
————————————
Author: Huawei Huang
