Wed Jul 10, 2024

PowerStorm: Energy-Aware Streaming Analytics Job Scheduling for Edge Computing

The Internet of Things (IoT) is rapidly evolving, connecting the physical and digital worlds through internet-enabled devices. This growth is enabling new applications such as autonomous robotic swarms, intelligent transportation, and AR-powered wearables, and it is driving an exponential increase in data, expected to exceed 79 ZB by 2025. To handle this influx, Distributed Stream Processing Engines (DSPEs) such as Storm, Spark Streaming, and Flink are being deployed at the edge, where processing data near its source rather than in the cloud supports scalable, low-latency IoT applications. However, edge computing faces challenges such as resource heterogeneity and varying network capabilities, which can create performance bottlenecks. To address these issues, advanced streaming job schedulers improve performance by accounting for this heterogeneity when allocating tasks. At the same time, the growing energy footprint of ICT, driven by AI advancements, has surpassed 1% of global energy demand. With most enterprise data expected to be processed outside traditional data centers by 2025, balancing energy efficiency and performance in edge computing is becoming increasingly critical.

To tackle these issues, we presented PowerStorm at CloudCom 2023. PowerStorm provides a comprehensive approach to optimizing streaming analytics jobs within DSPEs, balancing performance against energy efficiency. It contributes a generalized problem description and an algorithmic framework that DSPEs can use to navigate the trade-offs between these two objectives when assigning the operators of a streaming analytics job to worker nodes. It also provides a scheduler designed specifically for Apache Storm that incorporates our energy-aware algorithmic framework.
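
To give a feel for the kind of trade-off involved, here is a minimal sketch of a greedy, energy-aware operator placement. It is an illustration only and not PowerStorm's actual algorithm: the `Node` and `Operator` types, their fields, the `alpha` weight, and the cost formula are all hypothetical and chosen for brevity.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_capacity: float   # available CPU units (hypothetical unit)
    watts_per_cpu: float  # assumed marginal power draw per CPU unit used
    speed: float          # relative processing speed, higher is faster


@dataclass
class Operator:
    name: str
    cpu_demand: float     # CPU units the operator needs


def place(operators, nodes, alpha=0.5):
    """Greedily assign each operator to the cheapest node that still fits it.

    alpha=1.0 considers energy only, alpha=0.0 considers performance only.
    This is an illustrative heuristic, not the scheduler from the paper.
    """
    remaining = {n.name: n.cpu_capacity for n in nodes}
    max_watts = max(n.watts_per_cpu for n in nodes)
    max_speed = max(n.speed for n in nodes)

    def cost(node):
        energy = node.watts_per_cpu / max_watts   # normalized to (0, 1]
        slowness = 1.0 - node.speed / max_speed   # normalized to [0, 1)
        return alpha * energy + (1.0 - alpha) * slowness

    assignment = {}
    # Place the most demanding operators first so they are not crowded out.
    for op in sorted(operators, key=lambda o: o.cpu_demand, reverse=True):
        candidates = [n for n in nodes if remaining[n.name] >= op.cpu_demand]
        if not candidates:
            raise ValueError(f"no node can host operator {op.name}")
        best = min(candidates, key=cost)
        assignment[op.name] = best.name
        remaining[best.name] -= op.cpu_demand
    return assignment


if __name__ == "__main__":
    nodes = [Node("edge-board", 2.0, 2.0, 1.0), Node("edge-server", 8.0, 10.0, 4.0)]
    ops = [Operator("spout", 1.0), Operator("bolt", 1.0)]
    print(place(ops, nodes, alpha=1.0))  # energy-first placement
    print(place(ops, nodes, alpha=0.0))  # performance-first placement
```

Sweeping `alpha` between 0 and 1 shows how the same topology moves from fast but power-hungry nodes to slower, more frugal ones. A real scheduler would also have to account for network conditions and inter-operator traffic, which this sketch ignores.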

The main goal of the collaboration between the LInC lab and 6G-SANDBOX is to demonstrate the effectiveness of our scheduler through a set of experiments on 6G-SANDBOX infrastructures. Specifically:

  • We will run multiple streaming benchmarks on the Berlin testbed, which will be enriched with a set of 5G-enabled physical IoT devices. The results will benefit our research and will also highlight new opportunities for energy-aware analytics job scheduling on 5G infrastructures, which are expected to consist of multiple edge devices with diverse resource capacities, network conditions, and energy limitations.

  • In our experimental studies, we will showcase PowerStorm's performance in comparison to Apache Storm's default scheduler and R-Storm, which is the most widely used open-source resource-aware scheduler for Storm.

  • Lastly, the extracted monitoring metrics will be combined into a comprehensive dataset, which will be made publicly available.

Keywords: Stream Processing, 5G/6G Networks, Edge Computing, Apache Storm
