Abstract
Collaborative inference between large cloud-hosted models and small edge-deployed models offers a promising way to balance the accuracy and efficiency of ML-based applications in vehicular networks. Selecting the appropriate models and hosting nodes for each inference task is crucial to collaborative inference in these networks. However, existing solutions, which are primarily based on deep reinforcement learning (DRL), suffer from critical limitations, including delayed and suboptimal model and node selection decisions in dynamic environments. To address these challenges, we propose a dynamic model and node selection strategy for a collaborative inference framework, grounded in active inference theory. Our strategy dynamically aligns task requirements with model capabilities and node capacities by considering factors such as vehicular mobility, latency constraints, task complexity, and model accuracy. Additionally, when a significant drop in inference accuracy is detected, we fine-tune and update the models deployed on both the edge and the cloud, ensuring reliable and up-to-date inference. By leveraging active inference to minimize free energy through Bayesian belief updates, our framework reduces average latency by 23.2%, lowers task failure rates by 67%, and achieves superior load balancing compared to existing methods. It also demonstrates robust dynamic performance, with a 5.1% failure rate under 200% traffic surges, and its hybrid update strategy maintains 85.4% accuracy after 72 hours, effectively addressing the complex and dynamic conditions of vehicular networks.
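The abstract names Bayesian belief updates and free-energy minimization as the basis for node selection but gives no formulas. The sketch below is a toy illustration of that general idea only, not the paper's algorithm: each candidate node carries a Beta belief about meeting a task deadline, the belief is updated from observed outcomes, and the node with the lowest score is chosen. The score (negative log expected success minus an uncertainty bonus) is a simplified stand-in for expected free energy, and every name here (`NodeBelief`, `score`, `select_node`, `explore_weight`) is hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class NodeBelief:
    """Beta(alpha, beta) belief that a node meets a task's deadline (assumed model)."""
    name: str
    alpha: float = 1.0  # prior pseudo-count of successes
    beta: float = 1.0   # prior pseudo-count of failures

    def mean(self) -> float:
        # Posterior mean probability of meeting the deadline.
        return self.alpha / (self.alpha + self.beta)

    def variance(self) -> float:
        # Posterior variance, used here as a crude measure of uncertainty.
        n = self.alpha + self.beta
        return (self.alpha * self.beta) / (n * n * (n + 1.0))

    def update(self, success: bool) -> None:
        # Conjugate Bayesian update for a Bernoulli outcome.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def score(belief: NodeBelief, explore_weight: float = 1.0) -> float:
    """Free-energy-like score (an assumption): lower is better."""
    risk = -math.log(belief.mean())                   # surprise of the preferred outcome
    info_bonus = explore_weight * belief.variance()   # reward for reducing uncertainty
    return risk - info_bonus


def select_node(beliefs: list[NodeBelief], explore_weight: float = 1.0) -> NodeBelief:
    # Pick the node whose score is lowest.
    return min(beliefs, key=lambda b: score(b, explore_weight))


if __name__ == "__main__":
    edge, cloud = NodeBelief("edge"), NodeBelief("cloud")
    for ok in [True] * 8 + [False] * 2:
        edge.update(ok)
    for ok in [True] * 2 + [False] * 8:
        cloud.update(ok)
    print(select_node([edge, cloud]).name)
```

A fuller treatment would condition these beliefs on the factors the abstract lists (vehicular mobility, latency constraints, task complexity, and model accuracy); this sketch deliberately omits them.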
| Original language | English (US) |
|---|---|
| Title of host publication | Proceedings - 2025 IEEE International Conference on Web Services, ICWS 2025 |
| Editors | Rong N. Chang, Carl K. Chang, Jingwei Yang, Nimanthi Atukorala, Dan Chen, Sumi Helal, Sasu Tarkoma, Qiang He, Tevfik Kosar, Claudio Agostino Ardagna, Amin Beheshti, Bo Cheng, Walid Gaaloul |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 232-243 |
| Number of pages | 12 |
| Edition | 2025 |
| ISBN (Electronic) | 9798331555634 |
| State | Published - 2025 |
| Event | 2025 IEEE International Conference on Web Services, ICWS 2025 - Helsinki, Finland; Duration: Jul 7 2025 → Jul 12 2025 |
Conference
| Conference | 2025 IEEE International Conference on Web Services, ICWS 2025 |
|---|---|
| Country/Territory | Finland |
| City | Helsinki |
| Period | 7/7/25 → 7/12/25 |
All Science Journal Classification (ASJC) codes
- Information Systems
- Computer Science Applications
- Computer Networks and Communications
- Information Systems and Management
- Artificial Intelligence
Fingerprint
Dive into the research topics of 'Dynamic Model and Node Selection for Collaborative Inference of Large/Small Models in Vehicular Networks'. Together they form a unique fingerprint.