Datasets

Datasets and code used in our papers. You are welcome to use these datasets; please cite the corresponding publications.


#1. Dataset & Codes for Predicting Machine Failures

Background: This dataset supports failure prediction with machine learning methods such as SVM, random forest, or deep learning algorithms. Besides the original dataset, I also provide two reports written by two visiting students during their visit to my lab in July 2019.
Huawei Huang and Song Guo, “Proactive Failure Recovery for NFV in Distributed Edge Computing”, IEEE Communications Magazine, vol. 57, no. 5, pp. 131-137, March 2019
  • The dataset after preprocessing:

  • The related technical reports and code from the two visiting students:


#2. Dataset & Codes for Predicting Server Failures

Background: This dataset is used to predict failures of server machines in a datacenter. The related publications are as follows.

Huakun Huang, Lingjun Zhao, Huawei Huang, Song Guo, "Machine Fault Detection for Intelligent Self-Driving Networks", IEEE Communications Magazine, vol. 58, no. 1, pp. 40-46, January 2020 [RG-Page]
Huakun Huang, Shuxue Ding, Lingjun Zhao, Huawei Huang, et al., "Real-Time Fault-Detection for IIoT Facilities using GBRBM-based DNN", IEEE Internet of Things Journal, Oct. 21, 2019. DOI: 10.1109/JIOT.2019.2948396 [RG-Page]
  • Original dataset and cleaned dataset:
  • Processing code:


#3. Dataset & Codes of MVCom (published at ICDCS 2021)

Background: This code and dataset package shows how we implemented the algorithms used in our paper, including the proposed SE algorithm and three baselines (SA, DP, WOA).
Huawei Huang, Zhenyi Huang, Xiaowen Peng, Zibin Zheng, Song Guo, “MVCom: Scheduling Most Valuable Committees for the Large-Scale Sharded Blockchain”, ICDCS, July 2021 [RG-Page & PDF]

At the request of some readers, I provide the full code of the algorithms used in this paper: the SE algorithm (Markov-based Algorithm, MA), the SA algorithm, the DP algorithm, and the WOA algorithm. Some figure-plotting code and partial data are also included. (Updated on Nov. 7, 2022, by Huawei HUANG)




#4. Dataset of ContextFL (published at ICDCS 2022)

Background: This dataset is used to predict the CPU, network connection, and app usage of mobile users while they use their smartphones. The related publication is as follows.
Huawei Huang, Ruixin Li, Jialiang Liu, Sicong Zhou, Kangying Lin, and Zibin Zheng, “ContextFL: Context-aware Federated Learning by Estimating the Training and Reporting Phases of Mobile Clients”, in Proc. of IEEE International Conference on Distributed Computing Systems (ICDCS), 2022. [RG-Page], [WeChat official account introduction article]