In this paper, entitled "The Good, the Bad, and the Ugly: Neural Networks Straight from JPEG", we investigate whether spatial resolution and JPEG quality affect the performance of CNNs fed with DCT coefficients. More specifically, we study several aspects of a state-of-the-art CNN recently proposed by Gueguen et al. [1], which is a modified version of the ResNet-50 architecture [2]. Despite the speed-up obtained by only partially decoding JPEG images, their architectural changes increase the computational complexity and the number of parameters of the network. To alleviate these drawbacks, we propose a Frequency Band Selection (FBS) technique that selects the most relevant DCT coefficients before feeding them to the network. A comparison among the original ResNet-50 network [2], the modified ResNet-50 network proposed by Gueguen et al. [1], and our improved version with FBS is presented below.

Original ResNet-50 network [2]

ResNet-50 using DCT as input [1]

ResNet-50 using DCT and FBS

Our experiments analyzed how the computational complexity and accuracy of the network change when only the 64, 32, or 16 lowest-frequency DCT coefficients of each color component are used as input. We also evaluated the performance of CNNs fed with DCT coefficients under different spatial resolutions and JPEG quality settings. Our results demonstrated that the evaluated networks were robust to changes in JPEG quality, but susceptible to variations in spatial resolution. In addition, our FBS method proved effective in reducing the computational complexity of the network.
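To make the idea of keeping only the lowest-frequency coefficients concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the implementation used in the paper: it assumes the coefficients of one color component are stored as an array of 8x8 blocks, and that "lowest frequency" means the first k positions in the standard JPEG zigzag scan. The function names `zigzag_indices` and `select_lowest_frequencies` are hypothetical.

```python
import numpy as np


def zigzag_indices(n=8):
    """Return the (row, col) positions of an n x n block in JPEG zigzag order."""
    # Positions are grouped by anti-diagonal (row + col). Odd diagonals are
    # walked with the row increasing, even diagonals with the column increasing.
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def select_lowest_frequencies(blocks, k):
    """Keep the k lowest-frequency coefficients of every 8x8 DCT block.

    blocks: array of shape (..., 8, 8), one 8x8 coefficient block per position
            (assumed layout for a single color component).
    Returns an array of shape (..., k), with coefficients in zigzag order.
    """
    rows, cols = zip(*zigzag_indices(blocks.shape[-1])[:k])
    return blocks[..., list(rows), list(cols)]


# Usage: a toy component with 2 x 3 blocks, reduced from 64 to 16 channels.
blocks = np.arange(2 * 3 * 64).reshape(2, 3, 8, 8)
selected = select_lowest_frequencies(blocks, 16)
print(selected.shape)
```

With k = 64 every coefficient is kept (only reordered), while k = 32 and k = 16 halve and quarter the number of input channels per component, which is where a reduction in the cost of the first layers would come from under these assumptions.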

This paper will be presented at the virtual 2020 IEEE International Conference on Image Processing (ICIP), 25-28 October! Join us to discuss the latest in image processing and emerging technology. #ICIP2020 https://l.feathr.co/IEEE-International-Conference-on-Image-Processing-2020-ICIP_Jurandy-Almeida

[1] L. Gueguen, A. Sergeev, B. Kadlec, R. Liu, and J. Yosinski, "Faster Neural Networks Straight from JPEG," in Annual Conference on Neural Information Processing Systems (NIPS'18), 2018, pp. 3937-3948.

[2] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," in IEEE International Conference on Computer Vision and Pattern Recognition (CVPR'16), 2016, pp. 770-778.


S. F. Santos, N. Sebe, and J. Almeida, "The Good, the Bad, and the Ugly: Neural Networks Straight from JPEG," in IEEE International Conference on Image Processing (ICIP'20), 2020, pp. 1-5.