F. J. Chang, Y. Y. Lin, and K.-J. Hsu, “Multiple structured-instance learning for semantic segmentation with uncertain training data”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 360-367, 2014.
X. Zhu, Y. Xiong, J. Dai, L. Yuan, and Y. Wei, “Deep feature flow for video recognition”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2349–2358, 2017.
Y. Li, J. Shi, and D. Lin, “Low-latency video semantic segmentation”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
P. Hu, F. Caba, O. Wang, Z. Lin, S. Sclaroff and F. Perazzi, “Temporally distributed networks for fast video semantic segmentation”, CVPR, pp. 8818–8827, 2020.
H. Wang, W. Wang and J. Liu, “Temporal memory attention for video semantic segmentation”, CVPR, 2021.
M. Khalooei, M. Fakhredanesh and M. Sabokrou, “Dominant and rare events detection and localization in video using Generative Adversarial Network”, Journal of Soft Computing and Information Technology (JSCIT), vol. 8, no. 3, pp. 40-51, 2019.
M. Fakhredanesh and S. Roostaei, “Action Change Detection in Video Based on HOG”, Journal of Electrical and Computer Engineering Innovations (JECEI), pp. 135-144, 2020.
M. Fayyaz, M. H. Saffar, M. Sabokrou, M. Fathy and R. Klette, “STFCN: Spatio-temporal FCN for semantic video segmentation”, CoRR, 2016.
P. Fischer, A. Dosovitskiy, E. Ilg, P. Hausser, C. Hazırbas, V. Golkov, P. van der Smagt, D. Cremers, and T. Brox, “FlowNet: Learning optical flow with convolutional networks”, IEEE International Conference on Computer Vision (ICCV), 2015.
E. L. Denton, S. Chintala, R. Fergus, et al., “Deep generative image models using a Laplacian pyramid of adversarial networks”, in Proc. Neural Information Processing Systems (NIPS), pp. 1486-1494, 2015.
F. Galasso, M. Keuper, T. Brox and B. Schiele, “Spectral graph reduction for efficient image and streaming video segmentation”, IEEE Conference on Computer Vision and Pattern Recognition, pp. 49-56, 2014.
A. Khoreva, F. Galasso, M. Hein and B. Schiele, “Classifier based graph construction for video segmentation”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 951-960, 2015.
S. Hickson, S. Birchfield, I. Essa, and H. Christensen, “Efficient hierarchical graph-based segmentation of RGBD videos”, IEEE Conference on Computer Vision and Pattern Recognition, pp. 344-351, 2014.
S. Ardeshir, K. Malcolm and M. Shah, “Geo-semantic segmentation”, IEEE Conference on Computer Vision and Pattern Recognition, pp. 2792-2799, 2015.
G. Bertasius, L. Torresani, S. X. Yu and J. Shi, “Convolutional Random Walk Networks for Semantic Image Segmentation”, arXiv:1605.07681, 2016.
M. P. Kumar, H. Turki, D. Preston and D. Koller, “Parameter estimation and energy minimization for region-based semantic segmentation”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, pp. 1373-1386, 2015.
M. Volpi and V. Ferrari, “Semantic segmentation of urban scenes by learning local class interactions”, IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 1-9, 2015.
A. Sharma, O. Tuzel and D. W. Jacobs, “Deep hierarchical parsing for semantic segmentation”, IEEE Conference on Computer Vision and Pattern Recognition, pp. 530-538, 2015.
Z. Liu, X. Li, P. Luo, C.-C. Loy and X. Tang, “Semantic image segmentation via deep parsing network”, IEEE International Conference on Computer Vision, pp. 1377-1385, 2015.
B. Liu, X. He, and S. Gould, “Multi-class semantic video segmentation with exemplar-based object reasoning”, IEEE Winter Conference on Applications of Computer Vision, pp. 1014-1021, 2015.
L. Sevilla-Lara, D. Sun, V. Jampani, and M. J. Black, “Optical flow with semantic segmentation and localized layers”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
G. Csurka and F. Perronnin, “An efficient approach to semantic segmentation”, International Journal of Computer Vision, vol. 95, pp. 198-212, 2011.
C.-F. Tsai, K. McGarry, and J. Tait, “Image classification using hybrid neural networks”, 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 431-432, 2003.
T. Blaschke, C. Burnett, and A. Pekkarinen, “Image segmentation methods for object-based analysis and classification”, in Remote Sensing Image Analysis: Including the Spatial Domain, Springer, pp. 211-236, 2004.
S. Hochreiter and J. Schmidhuber, “Long short-term memory”, Neural Computation, pp. 1735–1780, 1997.
K. Cho, B. van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk and Y. Bengio, “Learning phrase representations using RNN encoder-decoder for statistical machine translation”, EMNLP, 2014.
J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation”, CVPR, pp. 3431–3440, 2015.
S. Zheng et al., “Conditional random fields as recurrent neural networks”, IEEE International Conference on Computer Vision, pp. 1529-1537, 2015.
V. Badrinarayanan, A. Kendall and R. Cipolla, “SegNet: A deep convolutional encoder-decoder architecture for image segmentation”, CoRR, 2015.
H. Zhao, J. Shi, X. Qi, X. Wang and J. Jia, “Pyramid scene parsing network”, CVPR, 2017.
A. Kundu, V. Vineet and V. Koltun, “Feature space optimization for semantic video segmentation”, CVPR, 2016.
X. Jin, X. Li, H. Xiao, X. Shen, Z. Lin, J. Yang, Y. Chen, J. Dong, L. Liu and Z. Jie, “Video scene parsing with predictive feature learning”, ICCV, 2017.
S. Jain, X. Wang and J. Gonzalez, “Accel: A corrective fusion network for efficient semantic segmentation on video”, CVPR, 2019.
E. Shelhamer, K. Rakelly, J. Hoffman, and T. Darrell, “Clockwork convnets for video semantic segmentation”, European Conference on Computer Vision (ECCV) Workshops, pp. 852-868, 2016.
J. Carreira, V. Patraucean, L. Mazare, A. Zisserman and S. Osindero, “Massively parallel video networks”, ECCV, 2018.
Y. He, W. Chiu, M. Keuper and M. Fritz, “STD2P: RGBD semantic segmentation using spatio-temporal data-driven pooling”, CVPR, 2017.
G. Hinton, O. Vinyals and J. Dean, “Distilling the knowledge in a neural network”, arXiv:1503.02531, 2015.
G. Huang, Z. Liu, L. van der Maaten and K. Weinberger, “Densely connected convolutional networks”, CVPR, 2017.
S. Chandra, C. Couprie and I. Kokkinos, “Deep spatio-temporal random fields for efficient video segmentation”, IEEE Conference on Computer Vision and Pattern Recognition, pp. 8915–8924, 2018.
A. Handa, V. Patraucean and R. Cipolla, “Spatio-temporal video autoencoder with differentiable memory”, ICLR Workshop, 2016.
N. Ballas, L. Yao, C. Pal, and A. Courville, “Delving deeper into convolutional networks for learning video representations”, 2016.
R. Gadde, V. Jampani, and P. V. Gehler, “Semantic video CNNs through representation warping”, IEEE International Conference on Computer Vision (ICCV), 2017.
F. Yu and V. Koltun, “Multi-scale context aggregation by dilated convolutions”, ICLR, 2016.
, “LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation”, IEEE Conference on Computer Vision and Pattern Recognition (CVPR
X. Li, A. You, Z. Zhu, H. Zhao, M. Yang, K. Yang, Sh. Tan and Y. Tong, “Semantic Flow for Fast and Accurate Scene Parsing”, ECCV, pp. 775-793, 2020.
Semantic Segmentation”, CVPR, 2021.
Ch. Yu, J. Wang, Ch. Peng and Ch. Gao, “BiSeNet: Bilateral Segmentation Network for Real-Time Semantic Segmentation”, ECCV, pp. 334-349, 2018.
, “Efficient Semantic Video Segmentation with Per-Frame Inference”, ECCV, pp. 352-368, 2020.
, “Deep Dual-resolution Networks for Real-time and Accurate Semantic Segmentation of Road Scenes”, CVPR, 2021.
L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy and A. L. Yuille, “Semantic image segmentation with deep convolutional nets and fully connected CRFs”, ICLR, 2015.