Edge Guided Generation Network for Video Prediction

Kai Xu, Guorong Li, Huijuan Xu, Weigang Zhang, Qingming Huang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Scopus citations


Video prediction is a challenging problem because video appearance and motion vary in highly complex ways. Traditional methods that directly predict pixel values often produce blurring and artifacts. Furthermore, cumulative errors can lead to a sharp drop in prediction quality in long-term prediction. To alleviate these problems, we propose a novel edge guided video prediction network, which first models the dynamics of frame edges and predicts the future frame edges, then generates the future frames under the guidance of the predicted edges. Specifically, our network consists of two modules: a ConvLSTM-based edge prediction module and an edge guided frame generation module. The whole network is differentiable and can be trained end-to-end without manual supervision. Extensive experiments on the KTH human action dataset and the challenging KITTI autonomous driving dataset demonstrate that our method achieves better results than state-of-the-art methods, especially in long-term video prediction.
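The two-stage data flow described in the abstract (predict future edges first, then generate frames guided by them) can be illustrated with a toy sketch. This is not the authors' implementation: the frames are 1-D rows of pixels, and `extract_edges`, `predict_edges`, `generate_frame`, and `rollout` are hypothetical stand-ins. In particular, `predict_edges` replaces the ConvLSTM module with a constant-shift assumption, `generate_frame` replaces the learned generator with a simple fill rule, and feeding predicted edges back into the history for long-term rollout is an assumption about how multi-step prediction might be arranged.

```python
from typing import List

Frame = List[float]  # toy 1-D "frame": one row of pixel intensities
Edges = List[float]  # edge strength between pixel i and pixel i + 1


def extract_edges(frame: Frame) -> Edges:
    """Edge strength as the absolute difference between neighbouring pixels."""
    return [abs(b - a) for a, b in zip(frame, frame[1:])]


def predict_edges(history: List[Edges]) -> Edges:
    """Stand-in for the ConvLSTM edge predictor (assumption): the strongest
    edge keeps moving by the offset it moved between the last two edge maps."""
    prev, last = history[-2], history[-1]
    shift = last.index(max(last)) - prev.index(max(prev))
    shifted = [0.0] * len(last)
    for i, strength in enumerate(last):
        j = i + shift
        if 0 <= j < len(last):
            shifted[j] = strength
    return shifted


def generate_frame(last_frame: Frame, edges: Edges) -> Frame:
    """Stand-in for the edge guided generator (assumption): reuse the two
    intensity levels of the last frame and place the boundary at the
    strongest predicted edge."""
    left, right = last_frame[0], last_frame[-1]
    boundary = edges.index(max(edges))
    return [left if i <= boundary else right for i in range(len(last_frame))]


def rollout(frames: List[Frame], steps: int) -> List[Frame]:
    """Predict `steps` future frames: edges first, then frames guided by them."""
    edge_history = [extract_edges(f) for f in frames]
    last = frames[-1]
    predicted = []
    for _ in range(steps):
        edges = predict_edges(edge_history)   # stage 1: future edges
        last = generate_frame(last, edges)    # stage 2: frame from edges
        edge_history.append(edges)            # predicted edges feed the next step
        predicted.append(last)
    return predicted


# A bright region whose boundary moves one pixel to the right per frame.
observed = [
    [1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
]
future = rollout(observed, steps=2)
```

In this toy setting the boundary stays sharp across steps because each generated frame is built from a predicted edge map rather than from averaged pixel values, which mirrors the motivation given in the abstract.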

Original language: English (US)
Title of host publication: 2018 IEEE International Conference on Multimedia and Expo, ICME 2018
Publisher: IEEE Computer Society
ISBN (Electronic): 9781538617373
State: Published - Oct 8 2018
Event: 2018 IEEE International Conference on Multimedia and Expo, ICME 2018 - San Diego, United States
Duration: Jul 23 2018 to Jul 27 2018

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X


Conference: 2018 IEEE International Conference on Multimedia and Expo, ICME 2018
Country/Territory: United States
City: San Diego

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computer Science Applications

