Dynamic Model and Node Selection for Collaborative Inference of Large/Small Models in Vehicular Networks

  • Mengke Zheng
  • Zhihui Lu
  • Qiang Duan
  • Baoqi Huang
  • Shijing Hut

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Collaborative inference between large cloud-hosted models and small edge-deployed models offers a promising solution for balancing the accuracy and efficiency of ML-based applications in vehicular networks. Selecting the appropriate models and their hosting nodes for the various inference tasks plays a crucial role in this setting. However, existing solutions, which are primarily based on deep reinforcement learning (DRL), suffer from critical limitations, including delayed and suboptimal decisions on model and node selection in dynamic environments. To address these challenges, we propose a dynamic model and node selection strategy for a collaborative inference framework, grounded in active inference theory. Our strategy dynamically aligns task requirements with model capabilities and node capacities by considering factors such as vehicular mobility, latency constraints, task complexity, and model accuracy. Additionally, when significant drops in inference accuracy are detected, we fine-tune and update the models deployed on both the edge and the cloud, ensuring reliable and up-to-date inference. By leveraging active inference to minimize free energy through Bayesian belief updates, our framework reduces average latency by 23.2%, lowers task failure rates by 67%, and achieves superior load balancing compared to existing methods. It also demonstrates robust dynamic performance, with a 5.1% failure rate under 200% traffic surges, and its hybrid update strategy maintains 85.4% accuracy after 72 hours, effectively addressing the complex and dynamic conditions of vehicular networks.
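
As a rough illustration of the idea named in the abstract (selecting by minimizing free energy with Bayesian belief updates), the following Python sketch scores each candidate model/node pair by a simplified expected free energy and picks the lowest. It is not the authors' algorithm: the two-state load model, the on-time/late observations, the candidate names, and every probability below are assumptions invented for the example.

```python
import math

# Hypothetical setup (not from the paper):
#   hidden state s: 0 = node lightly loaded, 1 = node congested
#   observation  o: 0 = task finished on time, 1 = task finished late

def bayes_update(prior, likelihood, obs):
    """Posterior q(s | obs), proportional to P(obs | s) * q(s)."""
    unnorm = [p * likelihood[s][obs] for s, p in enumerate(prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def entropy(dist):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def expected_free_energy(belief, likelihood, preferred):
    """Simplified expected free energy: risk + ambiguity.

    risk      = KL(predicted outcomes || preferred outcomes)
    ambiguity = expected entropy of the observation model
    """
    n_obs = len(preferred)
    predicted = [
        sum(belief[s] * likelihood[s][o] for s in range(len(belief)))
        for o in range(n_obs)
    ]
    risk = sum(
        q * math.log(q / preferred[o]) for o, q in enumerate(predicted) if q > 0
    )
    ambiguity = sum(belief[s] * entropy(likelihood[s]) for s in range(len(belief)))
    return risk + ambiguity

def select(candidates, preferred):
    """Return the candidate name with the lowest expected free energy."""
    return min(
        candidates,
        key=lambda name: expected_free_energy(
            candidates[name]["belief"], candidates[name]["likelihood"], preferred
        ),
    )

# Illustrative numbers only: likelihood[s][o] = P(o | s).
candidates = {
    "edge_small": {"belief": [0.5, 0.5], "likelihood": [[0.95, 0.05], [0.6, 0.4]]},
    "cloud_large": {"belief": [0.5, 0.5], "likelihood": [[0.9, 0.1], [0.3, 0.7]]},
}
preferred = [0.9, 0.1]  # the agent "prefers" on-time completion

first_choice = select(candidates, preferred)

# A late result on the edge node shifts its belief toward "congested".
edge = candidates["edge_small"]
edge["belief"] = bayes_update(edge["belief"], edge["likelihood"], obs=1)
second_choice = select(candidates, preferred)

print(first_choice, second_choice)
```

In this toy setting, one late observation is enough to raise the edge candidate's score above the cloud candidate's, so the selection changes without any retraining step. The paper's actual state space, preferences, and update rules are not given in the abstract and may differ substantially.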

Original language: English (US)
Title of host publication: Proceedings - 2025 IEEE International Conference on Web Services, ICWS 2025
Editors: Rong N. Chang, Carl K. Chang, Jingwei Yang, Nimanthi Atukorala, Dan Chen, Sumi Helal, Sasu Tarkoma, Qiang He, Tevfik Kosar, Claudio Agostino Ardagna, Amin Beheshti, Bo Cheng, Walid Gaaloul
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 232-243
Number of pages: 12
Edition: 2025
ISBN (Electronic): 9798331555634
State: Published - 2025
Event: 2025 IEEE International Conference on Web Services, ICWS 2025 - Helsinki, Finland
Duration: Jul 7 2025 → Jul 12 2025

Conference

Conference: 2025 IEEE International Conference on Web Services, ICWS 2025
Country/Territory: Finland
City: Helsinki
Period: 7/7/25 → 7/12/25

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Computer Networks and Communications
  • Information Systems and Management
  • Artificial Intelligence
