VidQ: Video Query Using Optimized Audio-Visual Processing

Noor Felemban, Fidan Mehmeti, Thomas F. La Porta

Research output: Contribution to journal › Article › peer-review


As mobile devices become more prevalent in everyday life and the amount of recorded and stored video increases, efficient techniques for searching video content become more important. When a user sends a query searching for a specific action in a large amount of data, the goal is to respond to the query accurately and quickly. In this paper, we address the problem of responding in a timely manner to queries that search for specific actions on mobile devices, utilizing both visual and audio processing approaches. We build a system, called VidQ, which consists of several stages and uses various Convolutional Neural Networks (CNNs) and Speech APIs to respond to such queries. As state-of-the-art computer vision and speech algorithms are computationally intensive, we use servers with GPUs to assist mobile users in the process. After a query is issued, we identify the different stages of processing that will take place. Then, we identify the order of these stages. Finally, by solving an optimization problem that captures the system behavior, we distribute the processing among the available network resources to minimize the processing time. Results show that VidQ reduces the completion time by at least 50% compared to other approaches.
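The final step described above, distributing ordered processing stages across available resources to minimize processing time, can be illustrated with a toy sketch. This is not the optimization problem formulated in the paper, which the abstract does not specify. The stage names, the two resources, the timing values, and the single fixed transfer cost are all hypothetical assumptions chosen only to show the trade-off between computing locally and offloading to a GPU server.

```python
from itertools import product

def best_placement(stages, resources, compute_time, transfer_time):
    """Exhaustively search stage-to-resource placements and return the fastest.

    stages: ordered list of stage names (hypothetical).
    resources: list of resource names; the data starts on resources[0].
    compute_time[stage][resource]: assumed seconds to run a stage on a resource.
    transfer_time: assumed seconds to move data between two different resources.
    """
    best = None
    for placement in product(resources, repeat=len(stages)):
        total = 0.0
        location = resources[0]  # the video starts on the mobile device
        for stage, resource in zip(stages, placement):
            if resource != location:
                total += transfer_time  # pay once per move between resources
                location = resource
            total += compute_time[stage][resource]
        if best is None or total < best[0]:
            best = (total, placement)
    return best

# Hypothetical stages and timings, not taken from the paper.
stages = ["audio_filter", "object_detection", "action_recognition"]
resources = ["mobile", "gpu_server"]
compute_time = {
    "audio_filter": {"mobile": 0.5, "gpu_server": 1.0},
    "object_detection": {"mobile": 20.0, "gpu_server": 2.0},
    "action_recognition": {"mobile": 30.0, "gpu_server": 3.0},
}

total, placement = best_placement(stages, resources, compute_time, transfer_time=4.0)
print(total, placement)
```

Under these assumed numbers, the cheapest plan runs the light audio stage on the mobile device and offloads the two vision stages to the GPU server. Exhaustive search is only practical for a handful of stages and resources; a real system would need a more scalable formulation.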

Original language: English (US)
Pages (from-to): 1338-1352
Number of pages: 15
Journal: IEEE/ACM Transactions on Networking
Issue number: 3
State: Published - Jun 1 2023

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Science Applications
  • Computer Networks and Communications
  • Electrical and Electronic Engineering

