Computer Science and Engineering, Department of

Computer Science, Computer Engineering, and Bioinformatics: Dissertations, Theses, and Student Research
First Advisor
Leen-Kiat Soh
Committee Members
Nirnimesh Ghose, Hau Chan
Date of this Version
7-23-2025
Document Type
Thesis
Citation
A thesis presented to the faculty of the Graduate College at the University of Nebraska in partial fulfillment of requirements for the degree of Master of Science
Major: Computer Science
Under the supervision of Professor Leen-Kiat Soh
Lincoln, Nebraska, July 23, 2025
Abstract
Multi-agent systems (MAS) have significant potential for modeling real-world scenarios, such as wildfire fighting or ridesharing, that require coordinated action among autonomous entities, or agents, in complex, dynamic environments. Effective decision-theoretic planning is critical within MAS: each agent must weigh both immediate and future states and coordinate with the other agents (its neighbors) to decide what to do at present. This holds especially for multiagent planning, in which the planning agent directly models its neighbors in order to estimate their optimal actions. Such planning is challenged by partial observability, openness, and diverse agent types with different goals, capabilities, or both.
Current state-of-the-art frameworks for this planning have drawbacks. The standard Interactive Partially Observable Markov Decision Process (I-POMDP) scales poorly, because it becomes intractable for the planning agent to maintain an interactive belief over the beliefs of each of its neighbors. The relatively faster I-POMDP-Lite incorrectly assumes full observability in order to approximately solve a nested MDP for each neighbor. Although solvers like Partially Observable Monte Carlo Planning (POMCP) enhance I-POMDP by accounting for partial observability and approximating the resulting belief updates, planning ad hoc for each neighbor still renders real-time computation impractical for large domains. Offline solvers such as Point-Based Value Iteration (PBVI) improve scalability and account for partial observability, but have not previously been analyzed with respect to agent openness.
Addressing these gaps, this thesis investigates PBVI as a scalable offline solver for partially observable settings and integrates PBVI-generated neighbor action policies into I-POMDP-based planning, while adopting a model-aware belief update algorithm for efficient belief updates. Our contributions include notation-level refinements and a reimplementation of the PBVI algorithm to better handle environments with both fully and partially observable components, along with efficient belief updates, adaptable implementations, and a comprehensive empirical analysis of agent behavior under varying levels of agent openness in environments with different agent types (cooperating and competing). These modifications make effective MAS planning computationally feasible, fostering practical applicability across complex, real-world domains.
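For readers unfamiliar with the belief updates mentioned above, the following is a minimal sketch of the textbook Bayes-filter belief update for a discrete, single-agent POMDP. It is not the model-aware belief update algorithm developed in the thesis, and it does not model neighbors or agent openness. The function name, the dictionary layout of the transition and observation models, and the toy wildfire states, action, and probabilities are all assumptions chosen purely for illustration.

```python
def belief_update(belief, action, observation, transition, observe):
    """Textbook discrete POMDP belief update (illustrative sketch only).

    belief:      dict mapping state -> probability
    transition:  dict mapping (state, action) -> {next_state: probability}
    observe:     dict mapping (next_state, action) -> {observation: probability}
    """
    # Prediction step: push the current belief through the transition model.
    predicted = {}
    for state, prob in belief.items():
        for next_state, p_trans in transition[(state, action)].items():
            predicted[next_state] = predicted.get(next_state, 0.0) + prob * p_trans

    # Correction step: weight each predicted state by the observation likelihood.
    weighted = {
        next_state: prob * observe[(next_state, action)].get(observation, 0.0)
        for next_state, prob in predicted.items()
    }

    # Normalize so the new belief sums to one.
    total = sum(weighted.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {state: prob / total for state, prob in weighted.items()}


# Hypothetical two-state toy model (all numbers are made up for illustration).
TRANSITION = {
    ("burning", "wait"): {"burning": 0.9, "out": 0.1},
    ("out", "wait"): {"out": 1.0},
}
OBSERVE = {
    ("burning", "wait"): {"smoke": 0.8, "clear": 0.2},
    ("out", "wait"): {"smoke": 0.1, "clear": 0.9},
}

prior = {"burning": 0.5, "out": 0.5}
posterior = belief_update(prior, "wait", "smoke", TRANSITION, OBSERVE)
```

In this toy setting, observing smoke shifts probability mass toward the "burning" state. Point-based solvers such as PBVI compute value estimates at a finite set of belief points of this kind rather than over the full continuous belief space.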