DOI: 10.1145/3468264.3473936
research-article

Intelligent container reallocation at Microsoft 365

Published: 18 August 2021

    Abstract

    Containers have gained popularity in microservice-based systems because they facilitate agile development, resource governance, and software maintenance. Container reallocation aims to balance workloads by reallocating containers across physical machines, and it affects the overall performance of microservice-based systems. However, container scheduling and reallocation remain an open problem due to their complexity in real-world scenarios. In this paper, we propose a novel Multi-Phase Local Search (MPLS) algorithm to optimize container reallocation. The experimental results show that our optimization algorithm outperforms state-of-the-art methods. In practice, it has been successfully applied to the Microsoft 365 system to mitigate hotspot machines and balance workloads across the entire system.
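
    To make the idea of reallocating containers to mitigate hotspot machines concrete, the following is a minimal single-phase local search sketch. It is not the paper's MPLS algorithm, whose phases are not described in this abstract. It assumes a single scalar resource demand per container and no machine capacity limits, and the names `machine_loads` and `local_search_rebalance` are hypothetical.

```python
from typing import Dict, List, Tuple


def machine_loads(assignment: Dict[str, int],
                  demand: Dict[str, float],
                  n_machines: int) -> List[float]:
    """Sum the demand of the containers placed on each machine."""
    loads = [0.0] * n_machines
    for container, machine in assignment.items():
        loads[machine] += demand[container]
    return loads


def local_search_rebalance(assignment: Dict[str, int],
                           demand: Dict[str, float],
                           n_machines: int,
                           max_moves: int = 100
                           ) -> Tuple[Dict[str, int], List[Tuple[str, int, int]]]:
    """Illustrative local search (an assumption, not the paper's method).

    Repeatedly move one container from the most loaded machine to the
    least loaded machine, as long as the move narrows the gap between them.
    """
    current = dict(assignment)  # work on a copy; the input stays untouched
    moves: List[Tuple[str, int, int]] = []
    for _ in range(max_moves):
        loads = machine_loads(current, demand, n_machines)
        hot = max(range(n_machines), key=loads.__getitem__)
        cold = min(range(n_machines), key=loads.__getitem__)
        gap = loads[hot] - loads[cold]
        # A move helps only if the container's demand is smaller than the gap;
        # otherwise the cold machine would end up at least as loaded as the
        # hot machine was.
        candidates = [c for c, m in current.items()
                      if m == hot and 0 < demand[c] < gap]
        if not candidates:
            break  # local optimum for this neighbourhood
        # Prefer the container whose demand is closest to half the gap,
        # which equalises the two machines as much as a single move can.
        best = min(candidates, key=lambda c: abs(demand[c] - gap / 2))
        current[best] = cold
        moves.append((best, hot, cold))
    return current, moves


if __name__ == "__main__":
    demand = {"a": 5.0, "b": 3.0, "c": 2.0, "d": 1.0}
    start = {"a": 0, "b": 0, "c": 0, "d": 0}  # every container on machine 0
    balanced, moves = local_search_rebalance(start, demand, n_machines=2)
    print(machine_loads(balanced, demand, 2), moves)
```

    Each accepted move strictly lowers the sum of squared machine loads, so the loop terminates even without the `max_moves` bound. A production reallocator would also need to model multiple resource dimensions, capacity constraints, and migration cost, none of which this sketch attempts.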


    Cited By

    • (2023) Understanding and Addressing the Allocation of Microservices into Containers: A Review. IETE Journal of Research, 1–14. DOI: 10.1080/03772063.2023.2205864. Online publication date: 30-Apr-2023
    • (2022) SamplingCA: effective and efficient sampling-based pairwise testing for highly configurable software systems. Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 1185–1197. DOI: 10.1145/3540250.3549155. Online publication date: 7-Nov-2022


      Published In

      ESEC/FSE 2021: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
      August 2021
      1690 pages
      ISBN:9781450385626
      DOI:10.1145/3468264
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. Container reallocation
      2. local search optimization
      3. workload balance


      Conference

      ESEC/FSE '21

      Acceptance Rates

      Overall Acceptance Rate 112 of 543 submissions, 21%
