Bayesian-Driven Automated Scaling in Stream Computing with Multiple QoS Targets

Abstract

Stream processing systems commonly rely on auto-scaling to ensure resource efficiency and quality of service (QoS). Existing auto-scaling solutions lack accuracy in resource allocation because they rely on static QoS-resource models that fail to account for high workload variability and use indirect metrics that carry much distracting information. Moreover, different types of QoS metrics have different characteristics and thus need individual auto-scaling methods. In this paper, we propose a versatile auto-scaling solution for operator-level parallelism configuration, called AuTraScale+, to meet throughput, processing-time latency, and event-time latency targets. AuTraScale+ follows the Bayesian optimization framework to make scaling decisions. First, it uses a Gaussian process model to eliminate the negative influence of uncertain factors on the accuracy of the performance model. Second, it leverages an expected improvement-based (EI-based) acquisition function to search for and recommend the optimal configuration quickly. In addition, to make a more accurate scaling decision when the new model is not ready, AuTraScale+ uses a transfer learning algorithm to estimate the benefits of all configurations at a new rate based on existing models and then recommends the optimal one. We implement and evaluate AuTraScale+ on the Flink platform. The experimental results on three representative workloads demonstrate that, compared with state-of-the-art methods, AuTraScale+ reduces resource consumption by 66.6% and 36.7% in the scale-down and scale-up scenarios, respectively, while achieving the throughput and processing-time latency targets. Compared with other methods of optimizing event-time latency, AuTraScale+ saves 26.9% of resources on average.
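The two Bayesian optimization steps named in the abstract (a Gaussian process model followed by an EI-based acquisition function) can be illustrated with a minimal sketch. This is not the AuTraScale+ implementation: it assumes a single operator whose parallelism is one integer, an RBF kernel with fixed hyperparameters, and a hypothetical scalar benefit score (higher is better) measured for each configuration already tried. All function names and numbers are illustrative.

```python
import math

def rbf(a, b, length=2.0, var=1.0):
    # Squared-exponential kernel; the hyperparameters are assumed, not fitted.
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    # Solve A x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-3):
    # Posterior mean and standard deviation of a GP at x_new.
    # Observations are centered so the zero-mean prior is reasonable.
    mean_y = sum(ys) / len(ys)
    centered = [y - mean_y for y in ys]
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    alpha = solve(K, centered)
    mean = mean_y + sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)
    var = rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v))
    return mean, math.sqrt(max(var, 0.0))

def expected_improvement(mean, std, best, xi=0.01):
    # Closed-form EI for maximization; xi is a small exploration margin.
    if std <= 1e-12:
        return 0.0
    z = (mean - best - xi) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mean - best - xi) * cdf + std * pdf

def recommend(observed, candidates):
    # Return the untried parallelism with the highest expected improvement.
    xs, ys = list(observed.keys()), list(observed.values())
    best = max(ys)
    untried = [c for c in candidates if c not in observed]
    return max(untried,
               key=lambda c: expected_improvement(*gp_posterior(xs, ys, c), best))

# Hypothetical benefit scores measured at parallelism 1, 4 and 8.
observed = {1: 0.2, 4: 0.7, 8: 0.5}
print(recommend(observed, range(1, 9)))
```

In this sketch each scaling round would measure the score at the recommended parallelism, add it to `observed`, and call `recommend` again. A multi-operator job would need a vector-valued configuration and a kernel over that vector, which is omitted here.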

Publication
IEEE Transactions on Parallel and Distributed Systems (TPDS)
Zhang Liang
Ph.D. Student

My research interests include resource scaling and task scheduling in stream computing and edge computing scenarios.
