
Beyond Model-Free: Leveraging Model-Based Reinforcement Learning for Software Systems and Computer Networks Optimization

Project Description

Reinforcement Learning (RL) is increasingly used to solve control problems beyond traditional domains such as robotics, autonomous driving, and power management. In software systems and networking, RL has been applied to cloud job scheduling, cellular network resource allocation, video streaming rate adaptation, and dynamic voltage and frequency scaling (DVFS) control in mobile devices. Almost all existing efforts rely on model-free RL because of its simplicity and broad applicability: policies are learned directly through trial and error, without an explicit model of the environment. However, model-free RL often suffers from sample inefficiency and slow convergence, which limits its practicality in real-world systems.
This project investigates the potential of model-based RL for practical control problems. A key observation is that in many systems settings, accurate state-transition models can be constructed and trained efficiently. With such a model, an RL agent first learns the system dynamics and then derives an optimal control policy through dynamic programming techniques (e.g., value iteration), which can often be computed within seconds for realistic state spaces.
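As a minimal illustration of this two-step workflow, the Python sketch below fits a tabular state-transition model from logged transitions by counting, then runs value iteration on the learned model. It is a sketch under stated assumptions, not the project's implementation: the four-state environment, the function names (`toy_step`, `estimate_model`, `value_iteration`), and the discount factor are all hypothetical and chosen only for illustration.

```python
import numpy as np

def estimate_model(transitions, n_states, n_actions):
    """Fit a tabular model from (s, a, r, s_next) samples by counting."""
    counts = np.zeros((n_states, n_actions, n_states))
    reward_sum = np.zeros((n_states, n_actions))
    for s, a, r, s_next in transitions:
        counts[s, a, s_next] += 1
        reward_sum[s, a] += r
    visits = counts.sum(axis=2)
    safe_visits = np.maximum(visits, 1)
    # Assumption: unvisited (s, a) pairs get a uniform next-state guess
    # and zero reward.
    P = np.where(visits[:, :, None] > 0,
                 counts / safe_visits[:, :, None],
                 1.0 / n_states)
    R = reward_sum / safe_visits
    return P, R

def value_iteration(P, R, gamma=0.95, tol=1e-8, max_iters=10_000):
    """Solve the learned MDP; returns state values and a greedy policy."""
    V = np.zeros(P.shape[0])
    for _ in range(max_iters):
        Q = R + gamma * (P @ V)        # Bellman backup, shape (S, A)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = R + gamma * (P @ V)
    return V, Q.argmax(axis=1)

def toy_step(s, a, n_states=4):
    """Hypothetical deterministic environment: action 1 moves up one
    state, action 0 moves down one state; reward 1 on entering the top state."""
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return reward, s_next

# Step 1: collect transitions (here, one sample per state-action pair).
n_states, n_actions = 4, 2
transitions = []
for s in range(n_states):
    for a in range(n_actions):
        r, s_next = toy_step(s, a)
        transitions.append((s, a, r, s_next))

# Step 2: learn the model, then plan on it with value iteration.
P, R = estimate_model(transitions, n_states, n_actions)
V, policy = value_iteration(P, R)
print("policy:", policy.tolist())
print("values:", np.round(V, 3).tolist())
```

In a real system the transitions would be stochastic and come from traces or a simulator. The counting estimator above handles repeated, noisy samples per state-action pair, but larger or continuous state spaces would need a different model class.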
We will investigate model-based RL across multiple applications, including cloud job scheduling, resource management in cloud clusters, and power optimization in edge devices. The success of this project will expand the applicability of RL in systems design and contribute to more efficient, intelligent infrastructure.

Start Date

January 1, 2026

Postdoc Qualifications

1. Ph.D. in Computer Science or Electrical and Computer Engineering

2. Strong background in Reinforcement Learning (RL): Demonstrated expertise in both model-free and model-based RL, with a solid grasp of dynamic programming, value iteration, and policy optimization methods.

3. Systems and Networking Knowledge: Experience in at least one relevant domain such as cloud computing, mobile/edge systems, computer networks, or operating systems, with an interest in applying ML to real-world systems control problems.

4. Applied Machine Learning Experience: Proficiency in developing and implementing ML algorithms using frameworks such as PyTorch or TensorFlow, with a track record of applying them to practical problems.

5. Programming and Experimentation Skills: Strong skills in Python/C++ (or similar), experience with simulation environments and system-level benchmarking, and the ability to design reproducible experiments.

6. Research and Communication Ability: Proven research record (e.g., publications in ML, systems, or networking venues) and strong written/oral communication skills for disseminating results and collaborating across disciplines.

Co-advisors

Prof. Y. Charlie Hu, [email protected], ECE, https://engineering.purdue.edu/~ychu/

Prof. Shaoshuai Mou, Aeronautics and Astronautics, https://engineering.purdue.edu/AAE/people/ptProfile?resource_id=124981

Bibliography

P. C. Heredia, J. George, S. Mou. Distributed Offline Reinforcement Learning. Proceedings of IEEE Conference on Decision and Control, Cancun, Mexico, 2022.

Y. Xie, S. Mou, S. Sundaram. Communication-Efficient and Resilient Distributed Q-Learning. IEEE Transactions on Neural Networks and Learning Systems, 1-14, 2023.

A. Jajoo, Y. C. Hu, X. Lin, and N. Deng. SLearn: A Case for Task Sampling based Learning for Cluster Job Scheduling. IEEE Transactions on Cloud Computing, Vol. 11(3), pp. 2664-2680, July-September 2023.

Z. Jonny Kong, Qiang Xu, Jiayi Meng, and Y. Charlie Hu. AccuMO: Accuracy-Centric Multitask Offloading in Edge-Assisted Mobile Augmented Reality. In Proc. of ACM MobiCom, October 2-6, 2023.

Z. Jonny Kong, Qiang Xu, and Y. Charlie Hu. PPIPE: Efficient Video Analytics Serving on Heterogeneous GPU Clusters via Pool-Based Pipeline Parallelism. In Proc. of USENIX ATC, July 7-9, 2025.