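Lab Paper on Blockchain Performance Optimization Accepted by ICDCS

Our lab has recently made new progress in blockchain performance optimization: the paper “MVCom: Scheduling Most Valuable Committees for the Large-Scale Sharded Blockchain” has been accepted as a full paper by the 41st IEEE International Conference on Distributed Computing Systems (ICDCS 2021), a top-tier conference in distributed computing.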


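About the Conference

ICDCS is a prestigious and highly influential top-tier international conference in the field of distributed computing systems. This year, the ICDCS Research Track received 489 submissions worldwide, of which only 97 were accepted, an acceptance rate of 19.8%.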


About the Paper

Huawei Huang, Zhenyi Huang, Xiaowen Peng, Zibin Zheng, Song Guo, “MVCom: Scheduling Most Valuable Committees for the Large-Scale Sharded Blockchain”, ICDCS, 2021. [RG-Page & PDF]

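Targeting classic blockchain sharding protocols, the paper proposes a mechanism that accelerates the appending of main-chain blocks and thereby improves the throughput of large-scale sharding-based blockchains. Specifically, at the beginning of each round of a sharding protocol, the heterogeneity of the nodes that make up the shard committees causes the time spent on committee formation and intra-shard consensus to be unevenly distributed across committees. This imbalance imposes long delays on the transactions in some shards. The paper therefore proposes a method that, at the beginning of each round, selects the most valuable group of shard committees and lets them take part early, with priority, in generating that round's main-chain block. In this way, the paper strikes a balance between transaction throughput and the waiting latency within shards in a large-scale sharded blockchain.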
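The trade-off described above can be made concrete with a toy model. The sketch below is not the scheduling algorithm from the MVCom paper; it assumes a hypothetical utility, total transactions admitted minus `alpha` times the time the main-chain block has to wait for the slowest admitted committee. The names `Committee`, `select_committees`, and `alpha` are illustrative only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Committee:
    name: str
    tx_count: int    # transactions packed by this shard in the current round
    latency: float   # committee formation + intra-shard consensus time (seconds)


def select_committees(committees: List[Committee],
                      alpha: float) -> Tuple[List[Committee], float]:
    """Pick the subset maximizing sum(tx_count) - alpha * max(latency).

    Because tx_count is non-negative, for any waiting deadline it is best to
    admit every committee that finishes by then, so scanning prefixes of the
    latency-sorted list is enough for this toy objective.
    """
    ordered = sorted(committees, key=lambda c: c.latency)
    best_subset: List[Committee] = []
    best_utility = 0.0  # utility of admitting no committee at all
    total_tx = 0
    for i, c in enumerate(ordered):
        total_tx += c.tx_count
        utility = total_tx - alpha * c.latency  # c is the slowest admitted so far
        if utility > best_utility:
            best_utility = utility
            best_subset = ordered[: i + 1]
    return best_subset, best_utility
```

With four committees whose latencies are 10, 12, 15, and 60 seconds, a moderate `alpha` admits the first three and leaves the straggler for a later block, which is the kind of throughput-versus-waiting balance the paragraph describes.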

Download link:

https://www.researchgate.net/publication/350152541_MVCom_Scheduling_Most_Valuable_Committees_for_the_Large-Scale_Sharded_Blockchain

Presentation slides:

Experimental data and processing methods: please download them from the Datasets page.

[Paper Sharing] OptChain: Optimal Transactions Placement for Scalable Blockchain Sharding

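Today I would like to share a paper on blockchain sharding theory that I have just read: “OptChain: Optimal Transactions Placement for Scalable Blockchain Sharding”, published at IEEE ICDCS 2019, one of the top conferences in distributed and parallel computing.

The paper starts from the observation that most transactions in a sharded blockchain network are cross-shard. These cross-shard transactions both reduce system throughput and increase the confirmation time of transactions. Can we then place transactions sensibly so that the number of cross-shard transactions falls, improving throughput and reducing cross-shard confirmation latency at the same time? The answer is yes; see the OptChain paper for details. It proposes a lightweight, real-time transaction placement strategy that puts transactions that are already related, or are about to become related, into the same shard. In addition, OptChain maintains load balance among shards to preserve the parallelism of the sharding mechanism.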
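The placement idea can be illustrated with a deliberately simplified heuristic. The sketch below is not OptChain's actual algorithm; it assumes each transaction lists the earlier transactions it spends from (`parents`), and places a new transaction in the shard that holds most of its parents unless that shard is already too far ahead of the lightest shard. `GreedyPlacer` and `max_imbalance` are hypothetical names introduced for this example.

```python
from collections import Counter
from typing import Dict, List


class GreedyPlacer:
    """Toy transaction placer: follow the parents, but cap load imbalance."""

    def __init__(self, num_shards: int, max_imbalance: int = 2):
        self.shard_of: Dict[str, int] = {}   # tx id -> shard index
        self.load = [0] * num_shards         # transactions placed per shard
        self.max_imbalance = max_imbalance   # allowed lead over the lightest shard

    def place(self, tx_id: str, parents: List[str]) -> int:
        # Count how many known parents live in each shard.
        votes = Counter(self.shard_of[p] for p in parents if p in self.shard_of)
        lightest = min(self.load)
        # Only shards that are not too far ahead of the lightest are eligible.
        allowed = [s for s in range(len(self.load))
                   if self.load[s] - lightest < self.max_imbalance]
        # Prefer more parents, then lower load, then lower shard index.
        shard = max(allowed, key=lambda s: (votes[s], -self.load[s], -s))
        self.shard_of[tx_id] = shard
        self.load[shard] += 1
        return shard

    def is_cross_shard(self, tx_id: str, parents: List[str]) -> bool:
        # A transaction is cross-shard if any known parent lives elsewhere.
        home = self.shard_of[tx_id]
        return any(self.shard_of.get(p, home) != home for p in parents)
```

Related transactions stay together while the load gap is small; once one shard gets `max_imbalance` ahead, the next transaction is pushed to a lighter shard even if that makes it cross-shard, which mirrors the tension between locality and load balance mentioned above.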

PS: For more research papers that, like this one, aim to improve the performance of the blockchain itself, please refer to the survey “A Survey of State-of-the-Art on Blockchains: Theories, Modelings, and Tools” [arXiv page: https://arxiv.org/abs/2007.03520].

A New Survey on Blockchains’ Theories, Modelings, and Tools

Dear all,

I would like to share our latest blockchain survey, titled “A Survey of State-of-the-Art on Blockchains: Theories, Modelings, and Tools”. The survey focuses on the theoretical models, analytical models, and evaluation tools of blockchains.

#========= Chinese Version:

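We recently released our latest blockchain survey on arXiv, titled “A Survey of State-of-the-Art on Blockchains: Theories, Modelings, and Tools”. Compared with other existing blockchain surveys, this one examines the fundamental operating mechanisms of the blockchain itself from the perspectives of theoretical modeling, analytical models, and experimental evaluation tools. We hope it can serve as a useful reference handbook for researchers, engineers and developers, and blockchain educators.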

arXiv link: https://arxiv.org/abs/2007.03520

Resilient Routing for the Control-Channel of Software-Defined Networks – A Revisit of a JSAC Article

By Huawei Huang, Feb. 16th, 2020


=============== English Version ================

This blog post introduces the motivation and background of one of my earlier research articles, whose publication details are as follows:

Huawei Huang, Song Guo, Weifa Liang, Keqiu Li, Baoliu Ye, and Weihua Zhuang, “Near-Optimal Routing Protection for In-Band Software-Defined Heterogeneous Networks”, IEEE Journal on Selected Areas in Communications (JSAC), vol. 16, no. 20, pp. 7421-7432, November 2016. (CCF-A, Computer Networks)

Perspectives

  • Writing this article was a great pleasure, because the proposed algorithm provides near-optimal routing protection for control-plane traffic in in-band software-defined networks. Importantly, the approach can be extended to general routing protection in the data plane of software-defined networks.

What is it about?

  • In software-defined heterogeneous networks, we study a weighted cost minimization problem in which control-plane traffic load balancing and control-channel setup cost are jointly considered when selecting the protection paths for control channels. Since multiple-resource-constrained routing is proved to be NP-complete, we propose a near-optimal algorithm based on the Markov approximation technique. In particular, we extend our solution to an online case that can handle dynamic single-link failures. The resulting performance fluctuation is also analyzed theoretically.
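For readers unfamiliar with Markov approximation, the following is a minimal sketch of the general idea, not the algorithm from the article. It assumes each switch picks one protection path from a small candidate list, and that the weighted cost is the maximum link load plus `weight` times the total setup cost. A Metropolis-style random walk over path choices then has a stationary distribution proportional to exp(-beta * cost), so low-cost configurations are visited most often. All names, the cost model, and the parameter values are assumptions made for illustration.

```python
import math
import random
from typing import Dict, List, Tuple

Path = Tuple[str, ...]  # a protection path, written as the links it traverses


def total_cost(choice: Dict[str, int],
               candidates: Dict[str, List[Path]],
               setup_cost: Dict[str, float],
               weight: float) -> float:
    """Assumed weighted cost: max link load + weight * total setup cost."""
    link_load: Dict[str, int] = {}
    setup = 0.0
    for switch, idx in choice.items():
        for link in candidates[switch][idx]:
            link_load[link] = link_load.get(link, 0) + 1
            setup += setup_cost[link]
    return max(link_load.values(), default=0) + weight * setup


def markov_approximation(candidates: Dict[str, List[Path]],
                         setup_cost: Dict[str, float],
                         weight: float = 0.1,
                         beta: float = 5.0,
                         steps: int = 2000,
                         seed: int = 0) -> Tuple[Dict[str, int], float]:
    rng = random.Random(seed)
    choice = {sw: 0 for sw in candidates}  # start from each switch's first path
    cost = total_cost(choice, candidates, setup_cost, weight)
    best, best_cost = dict(choice), cost
    switches = sorted(candidates)
    for _ in range(steps):
        # Symmetric proposal: one random switch tries one random candidate path.
        sw = rng.choice(switches)
        new_idx = rng.randrange(len(candidates[sw]))
        if new_idx == choice[sw]:
            continue
        trial = dict(choice)
        trial[sw] = new_idx
        trial_cost = total_cost(trial, candidates, setup_cost, weight)
        # Metropolis rule: always accept improvements, sometimes accept worse
        # states, giving a stationary distribution ~ exp(-beta * cost).
        if trial_cost <= cost or rng.random() < math.exp(-beta * (trial_cost - cost)):
            choice, cost = trial, trial_cost
            if cost < best_cost:
                best, best_cost = dict(choice), cost
    return best, best_cost
```

A larger `beta` concentrates the chain on near-optimal configurations but makes it slower to escape local minima; this sketch does not model link failures or the online case.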

Why is it important?

  • Even though SDN brings many advantages to software-defined heterogeneous networks (HetNets), it also comes with many challenges. One particular concern is the resilience of the control traffic, i.e., the communication between the data plane and the control plane. In an in-band software-defined heterogeneous network, where control-plane traffic shares the medium with data-plane traffic, even a single link failure may disconnect a large number of packet-switching devices from their controllers, causing much worse damage than in the out-of-band case. For example, when failures are caused by disasters such as earthquakes and tsunamis, the core network links between switches and controllers may be disconnected. That would result in severe performance degradation, including packet loss, routing loops, suboptimal or infeasible routing actions, high network latency, and even service unavailability. The consequences become even worse in wide-area software-defined HetNets. Therefore, routing protection at the control plane of in-band HetNets is a fundamental issue.

=============== Chinese Version ================

Providing Reliable Routing for the Control Channel of Software-Defined Networks – A Revisit of a Representative JSAC Article

This blog post introduces one of my earlier technical papers published in JSAC. The paper details are as follows:

Huawei Huang, Song Guo, Weifa Liang, Keqiu Li, Baoliu Ye, and Weihua Zhuang, “Near-Optimal Routing Protection for In-Band Software-Defined Heterogeneous Networks”, IEEE Journal on Selected Areas in Communications (JSAC), vol. 16, no. 20, pp. 7421-7432, November 2016. (CCF-A, Computer Networks)

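Perspectives

  • I am very glad to see that this paper, which makes up one third of my doctoral dissertation, can provide a reliable routing protection strategy for the control channel of software-defined networks.

What is it about?

  • In this paper, we consider heterogeneous networks built on software-defined networking and study a joint optimization problem that accounts both for the cost of setting up control channels and for how reliably control channels recover from link failures. To tackle this NP-complete problem, we design a near-optimal algorithm based on the Markov approximation technique. The algorithm can handle single-link failures effectively in real time. We also provide a theoretical analysis of the algorithm's performance fluctuation.

Why is it important?

  • In heterogeneous networks based on software-defined networking, such as 5G edge community networks, providing a reliable routing protection strategy for the control channel is crucial, because the control channel steers the service traffic behind the scenes. This is especially true for SDN control channels built in the “in-band” fashion, where service traffic and control traffic travel over the same network links. A simple single-link failure can therefore cause packet loss for a large share of both control flows and service flows, with disastrous consequences for the user's service experience. How to provide highly reliable routing protection with fast recovery for the control channel of software-defined networks is thus a crucial research topic.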

=============== Huawei Huang (黄华威) ================

Secure Distributed Machine Learning in the 5G/6G Era

By Huawei Huang, Dec. 31, 2019

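      News as of August 11, 2020: a paper from this research, titled “PIRATE: A Blockchain-based Secure Framework of Distributed Machine Learning in 5G Networks”, has been accepted by IEEE Network. IEEE Network is a top journal in computer networking and communications that publishes hot research topics, key issues, and the latest research advances of the computer networking community. Its 2019 impact factor is 9.590 (the latest figure as of 2020), and it is ranked in Zone 1 of the Chinese Academy of Sciences (CAS) journal ranking.

Paper overview:

As the manufacturing cost of AI chips gradually falls, more and more mobile devices are becoming capable of machine learning. Meanwhile, the quality of network connectivity, the bottleneck of distributed machine learning, will improve greatly with the arrival of the 5G/6G era. To meet the demands of new forms of distributed machine learning in the coming 5G/6G era, we need a framework that can support large-scale distributed machine learning. Most importantly, when a large number of users participate, the security of distributed machine learning deserves serious attention.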

Figure 1. The proposed PIRATE framework has two critical components: 1) reliability assessment; 2) blockchain (BC) systems for distributed SGD (D-SGD). Gradient aggregations and model parameters are protected by blockchains. Meanwhile, reliability assessment determines the participants of distributed learning tasks.
Figure 2. Under the “Ring Allreduce” mechanism, malicious attackers can attack from both inside and outside. (1) Outside attackers can contaminate the training models in target nodes. (2) Outside attackers can also attack partial gradient aggregations. (3) Byzantine computing nodes can send harmful aggregations that damage the convergence of learning tasks.

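      In theory, malicious attackers can attack every stage of distributed machine learning. Against arbitrary attacks that hinder training convergence, we propose PIRATE, shown in Figure 1: a blockchain-based secure distributed learning framework. Thanks to the flexibility of verification offered by blockchain technology, the framework is not limited to protecting training convergence as described in this paper; it also has great potential for other kinds of security protection. More protection mechanisms can be built on top of it, such as privacy protection for devices taking part in distributed learning, protection against Model Poisoning Attacks, and incentive mechanisms for all participants.

      As shown in Figure 1, PIRATE consists of two main parts: first, device reliability assessment, which analyzes the reliability of a device and thereby decides whether it may take part in a learning task; second, a secure SGD (Stochastic Gradient Descent) framework based on multiple shard chains.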

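      We adopt the decentralized Ring AllReduce architecture (Figure 2). This architecture spreads network load more evenly and allows gradients to be computed while computation results are being verified. In addition, so that nodes can communicate efficiently and verifiably under Ring AllReduce, we use a sharding-based blockchain: all nodes are divided into multiple committees, and each node only needs to verify the gradient computations within its own committee. This distributed verification greatly reduces the latency caused by broadcasting.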
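As background, the plain Ring AllReduce communication pattern can be simulated in a few lines. The sketch below shows only the standard two-phase scheme (scatter-reduce, then all-gather) and none of PIRATE's blockchain-based verification. For simplicity it assumes each of the n nodes holds a gradient of exactly n numbers, so that one chunk is one number; the function name `ring_allreduce` is illustrative.

```python
from typing import List


def ring_allreduce(grads: List[List[float]]) -> List[List[float]]:
    """Simulate Ring AllReduce (sum) for n nodes, each holding n chunks."""
    n = len(grads)
    assert all(len(g) == n for g in grads), "each node needs exactly n chunks"
    data = [list(g) for g in grads]  # data[i] is the local buffer of node i

    # Phase 1, scatter-reduce: in step t, node i sends chunk (i - t) mod n to
    # its right neighbour, which adds it to its own copy. After n - 1 steps,
    # node i holds the complete sum of chunk (i + 1) mod n.
    for step in range(n - 1):
        # Snapshot all outgoing values first so the sends are simultaneous.
        sends = [(i, (i - step) % n, data[i][(i - step) % n]) for i in range(n)]
        for src, chunk, value in sends:
            data[(src + 1) % n][chunk] += value

    # Phase 2, all-gather: in step t, node i forwards chunk (i + 1 - t) mod n,
    # which is already fully reduced, and the neighbour overwrites its copy.
    for step in range(n - 1):
        sends = [(i, (i + 1 - step) % n, data[i][(i + 1 - step) % n]) for i in range(n)]
        for src, chunk, value in sends:
            data[(src + 1) % n][chunk] = value

    return data
```

Each node sends and receives only one chunk per step and talks only to its ring neighbour, which is why the pattern spreads network load evenly instead of concentrating it on a central parameter server.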

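       Simulation experiments show that, under 5G network conditions, with relatively large training models and large-scale user participation, the proposed PIRATE framework uses less storage than the comparable framework LearningChain and trains distributed machine learning models faster.

       The preliminary work of this study has been uploaded to arXiv. The first author is Sicong Zhou, a first-year graduate student in our lab; for a first paper, it is forward-looking and commendable.

Paper links:

arXiv link: https://arxiv.org/abs/1912.07860

ResearchGate link: ResearchGate page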

===============================================
【English Version】

News: The paper titled “PIRATE: A Blockchain-based Secure Framework of Distributed Machine Learning in 5G Networks” has been accepted by IEEE Network (IF: 9.59) on Aug. 11th, 2020.

Introduction:

With the production cost of AI chips gradually falling to an acceptable level, mobile devices are increasingly equipped with the computational resources needed for machine learning. Meanwhile, network conditions, the bottleneck of distributed machine learning, will improve substantially as we march into the 5G era. To exploit the merits of 5G/6G networks, a large-scale distributed learning framework is needed. In the large-scale scenario in particular, security problems become even more critical.

  To protect against arbitrary attacks that hinder convergence, we propose PIRATE, a blockchain-based secure distributed learning framework. The framework has great potential because it exploits the verification flexibility of blockchain techniques. Such flexibility enables more protection mechanisms to be built on top of the framework, e.g., privacy protection, protection against Model Poisoning Attacks, and incentive mechanisms.

  As shown in Figure 1, PIRATE has two components: 1) reliability assessment, which decides whether a device could take part in a learning task; 2) a secure SGD framework based on multiple shard chains.

  We use the decentralized Ring AllReduce architecture (Figure 2), which makes better use of network resources and enables devices to verify computation results while computing gradients.

  Furthermore, in order to conduct efficient and verifiable communication under the Ring AllReduce setting, we utilize a sharding-based blockchain technique. In particular, we divide nodes into multiple committees, in which nodes are only required to verify gradients within their committee. Such division greatly reduces the latency of broadcasting.
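A back-of-the-envelope count shows why committee-local verification helps. The sketch below is an illustration under a simplifying assumption that is not taken from the paper: verification requires every node to broadcast to every other node in its group, and the number of messages is used as a rough proxy for broadcast latency. Both function names are hypothetical.

```python
def broadcast_messages(num_nodes: int) -> int:
    """Messages in one all-to-all round: each node sends to every other node."""
    return num_nodes * (num_nodes - 1)


def sharded_broadcast_messages(num_nodes: int, num_committees: int) -> int:
    """Same count when nodes are split as evenly as possible into committees
    and each node only broadcasts within its own committee."""
    base, extra = divmod(num_nodes, num_committees)
    sizes = [base + 1] * extra + [base] * (num_committees - extra)
    return sum(broadcast_messages(size) for size in sizes)
```

Under this assumption, 1000 nodes broadcasting globally exchange 999000 messages per round, while the same nodes split into 10 committees exchange 99000, roughly a tenfold reduction.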

  Simulation experiments show that, under the condition of 5G/6G networks, relatively large training models, and large-scale participants, PIRATE outperforms a similar framework, LearningChain, in terms of storage complexity and latency.

  Please feel free to download and read from the following URLs:

  arXiv: https://arxiv.org/abs/1912.07860

  ResearchGate: PDF
————————————
Author: Huawei Huang

A recent paper on Blockchain has been submitted to arXiv

By Huawei Huang, Dec. 23, 2019

Topic: Consensus of Blockchain Systems

1. Paper Title: PIRATE: A Blockchain-based Secure Framework of Distributed Machine Learning in 5G Networks.

Summary: A sharding-based blockchain framework for Byzantine-resilient distributed learning in the decentralized 5G computing environment.

Authors: Sicong Zhou*, Huawei Huang*, Wuhui Chen*, Zibin Zheng*, and Song Guo†

Affiliations: *Sun Yat-sen University and †Hong Kong Polytechnic University.

Revisiting an Earlier Paper: An Outlook on Big Data Storage Based on Low-Earth-Orbit Satellites

By Huawei Huang, Dec. 17th, 2019

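While sorting through our past papers, I found that in February 2018 we published a mini-survey in IEEE Wireless Communications (a CAS Zone 1 journal, IF = 11.0) titled “Envision of Wireless Big Data Storage for Low-Earth-Orbit Satellite-based Cloud”.

    [Attachment download: IEEE-WCM-2018Huang-Envisioned.pdf] [ResearchGate page URL]

    The background and motivation of the paper are summarized below:

    In 2016, Cloud Constellation, a US startup, launched the SpaceBelt project to lead the development of a cloud storage system based on low-Earth-orbit (LEO) satellites. The goal is to build, for enterprises and governments, a data center that is completely isolated from the terrestrial Internet and operates in low Earth orbit, intended to keep data absolutely secure (we tentatively call it a “space-based datacenter”).

    Inspired by this latest business development in the satellite communications industry, we can see that, as 6G research gradually unfolds over the next few years, a global Internet based on LEO satellites will be a key direction. Some research results have already appeared in academia. For example, some studies (see [5-7] in the paper) propose using LEO satellite communication systems to transmit data and to offload data traffic from terrestrial networks. However, it is not hard to see that in these existing studies the satellites serve only as data relay devices. In essence, the satellite system remains an “appendage” of the terrestrial Internet or terrestrial core network, extended into the sky.

    On the other hand, our investigation found that the topic of the “space-based datacenter” had not yet been raised in academia. Inspired by the SpaceBelt project, the paper therefore ventures a prediction: within the next five years, research in this direction should begin to appear and grow fairly quickly. To fill this gap, the paper discusses and summarizes a number of valuable research questions and the technical challenges ahead.

    We believe that this paper will shed a little early light on 6G research.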
————————————
Author: Huawei Huang

A Long Paper on NFV Officially Published in IEEE TCC

Dec. 10, 2019, by Huawei Huang

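      We are glad to see that one of our long papers on network function virtualization (NFV) has been officially published in IEEE Transactions on Cloud Computing (TCC). The paper details are as follows:

     Today, the traffic generated by end users' many applications must be processed by different kinds of network function services before it reaches data center servers. For example, network traffic passes through virtualized network function nodes such as firewalls, deep packet inspection, load balancers, and video encoders/decoders. This paper studies the orchestration and deployment of service function chains (SFCs) built from hybrid types of virtualized network functions, and proposes a solution that orchestrates service chains quickly for diverse network function demands, deploys flexibly, and maximizes operating profit.

     IEEE Transactions on Cloud Computing (TCC) is a high-caliber journal in cloud computing with an impact factor of 5.967 (CAS SCI Zone 1); it publishes only a few dozen high-quality papers each year.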

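      Another result on the same topic has also been published in IEEE Transactions on Cloud Computing (TCC); it is available online but has not yet appeared in print. The paper details are as follows:

       Unlike traditional service chain orchestration methods, the second paper argues that orchestration must take two characteristics into account together: i) the variability of the “horizontal” traffic between different virtualized network function nodes, and ii) the “vertical” degradation in service capability caused by coordination among service nodes of the same type.

       In addition, for hybrid network function systems composed of virtual machines and physical devices, the authors plan to propose a highly fault-tolerant, fast service chain orchestration scheme based on machine learning predictions. The proposed method will help service providers and network operators deliver reliable services to 5G/Beyond 5G users efficiently and robustly.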