Title of invention: Objective assessment method for stereoscopic video quality based on wavelet transform
Abstract: An objective method for assessing stereoscopic video quality based on a wavelet transform fuses the brightness values of the pixels in the left viewpoint image and the right viewpoint image of a stereoscopic image by binocular brightness information fusion, obtaining a binocular fusion brightness image of the stereoscopic image. This fusion overcomes, to some extent, the difficulty of assessing stereoscopic perception quality in stereoscopic video quality assessment and effectively improves the accuracy of objective stereoscopic video quality assessment. When weighting the quality of each frame group in the binocular fusion brightness image video corresponding to a distorted stereoscopic video, the method takes into account the sensitivity of human vision to the various types of information in the video and determines the weight of each frame group from its motion intensity and brightness difference.
Publication number: US9460501(B1)  Publication date: 2016.10.04
Application number: US201615054151  Filing date: 2016.02.26
Applicant: Ningbo University  Inventors: Jiang Gangyi; Song Yang; Peng Zongju; Chen Fen; Zheng Kaihui; Liu Shanshan
Classification: G06K9/00; G06T7/00; H04N13/00; G06T7/20  Main classification: G06K9/00
Agency:  Agent:
Principal claim: 1. An objective assessment method for a stereoscopic video quality based on a wavelet transform, comprising steps of:
① representing an original undistorted stereoscopic video by $V_{org}$, and representing a distorted stereoscopic video to-be-assessed by $V_{dis}$;
② calculating a binocular fusion brightness of each pixel in each frame of a stereoscopic image of the $V_{org}$; denoting the binocular fusion brightness of a first pixel having coordinates of $(u,v)$ in an $f$-th frame of the stereoscopic image of the $V_{org}$ as $B_{org}^{f}(u,v)$, $B_{org}^{f}(u,v)=\sqrt{\left(I_{org}^{R,f}(u,v)\right)^{2}+\left(I_{org}^{L,f}(u,v)\right)^{2}+2\left(I_{org}^{R,f}(u,v)\times I_{org}^{L,f}(u,v)\times\cos\partial\right)\times\lambda}$; then according to the respective binocular fusion brightnesses of all the pixels in each frame of the stereoscopic image of the $V_{org}$, obtaining a binocular fusion brightness image of each frame of the stereoscopic image in the $V_{org}$; denoting the binocular fusion brightness image of the $f$-th frame of the stereoscopic image in the $V_{org}$ as $B_{org}^{f}$, wherein a second pixel having the coordinates of $(u,v)$ in the $B_{org}^{f}$ has a pixel value of the $B_{org}^{f}(u,v)$; according to the respective binocular fusion brightness images of all the stereoscopic images in the $V_{org}$, obtaining a binocular fusion brightness image video corresponding to the $V_{org}$, denoted as $B_{org}$, wherein an $f$-th frame of the binocular fusion brightness image in the $B_{org}$ is the $B_{org}^{f}$; and calculating a binocular fusion brightness of each pixel in each frame of a stereoscopic image of the $V_{dis}$; denoting the binocular fusion brightness of a third pixel having the coordinates of $(u,v)$ in an $f$-th frame of the stereoscopic image of the $V_{dis}$ as $B_{dis}^{f}(u,v)$, $B_{dis}^{f}(u,v)=\sqrt{\left(I_{dis}^{R,f}(u,v)\right)^{2}+\left(I_{dis}^{L,f}(u,v)\right)^{2}+2\left(I_{dis}^{R,f}(u,v)\times I_{dis}^{L,f}(u,v)\times\cos\partial\right)\times\lambda}$; then according to the respective binocular fusion brightnesses of all the pixels in each frame of the stereoscopic image of the $V_{dis}$, obtaining a binocular fusion brightness image of each frame of the stereoscopic image in the $V_{dis}$; denoting the binocular fusion brightness image of the $f$-th frame of the stereoscopic image in the $V_{dis}$ as $B_{dis}^{f}$, wherein a fourth pixel having the coordinates of $(u,v)$ in the $B_{dis}^{f}$ has a pixel value of the $B_{dis}^{f}(u,v)$; according to the respective binocular fusion brightness images of all the stereoscopic images in the $V_{dis}$, obtaining a binocular fusion brightness image video corresponding to the $V_{dis}$, denoted as $B_{dis}$, wherein an $f$-th frame of the binocular fusion brightness image in the $B_{dis}$ is the $B_{dis}^{f}$; wherein: $1\le f\le N_{f}$, wherein the $f$ has an initial value of 1; the $N_{f}$ represents a total frame number of the stereoscopic images respectively in the $V_{org}$ and the $V_{dis}$; $1\le u\le U$, $1\le v\le V$, wherein the $U$ represents a width of the stereoscopic image respectively in the $V_{org}$ and the $V_{dis}$, and the $V$ represents a height of the stereoscopic image respectively in the $V_{org}$ and the $V_{dis}$; the $I_{org}^{R,f}(u,v)$ represents a brightness value of a fifth pixel having the coordinates of $(u,v)$ in a right viewpoint image of the $f$-th frame of the stereoscopic image of the $V_{org}$; the $I_{org}^{L,f}(u,v)$ represents a brightness value of a sixth pixel having the coordinates of $(u,v)$ in a left viewpoint image of the $f$-th frame of the stereoscopic image of the $V_{org}$; the $I_{dis}^{R,f}(u,v)$ represents a brightness value of a seventh pixel having the coordinates of $(u,v)$ in a right viewpoint image of the $f$-th frame of the stereoscopic image of the $V_{dis}$; the $I_{dis}^{L,f}(u,v)$ represents a brightness value of an eighth pixel having the coordinates of $(u,v)$ in a left viewpoint image of the $f$-th frame of the stereoscopic image of the $V_{dis}$; the $\partial$ represents a fusion angle; and the $\lambda$ represents a brightness parameter of a display;
③ adopting $2^{n}$ frames of the binocular fusion brightness images as a frame group; respectively dividing the $B_{org}$ and the $B_{dis}$ into $n_{GoF}$ frame groups; denoting an $i$-th frame group in the $B_{org}$ as $G_{org}^{i}$; and denoting an $i$-th frame group in the $B_{dis}$ as $G_{dis}^{i}$; wherein: the $n$ is an integer in a range of $[3,5]$; $n_{GoF}=\left\lfloor \frac{N_{f}}{2^{n}} \right\rfloor$, wherein the $\lfloor\ \rfloor$ is a round-down symbol; and $1\le i\le n_{GoF}$;
④ processing each frame group in the $B_{org}$ with a one-level three-dimensional wavelet transform, and obtaining eight groups of first sub-band sequences corresponding to each frame group in the $B_{org}$, wherein: the eight groups of the first sub-band sequences comprise four groups of first time-domain high-frequency sub-band sequences and four groups of first time-domain low-frequency sub-band sequences; and each group of the first sub-band sequence comprises $\frac{2^{n}}{2}$ first wavelet coefficient matrixes; and processing each frame group in the $B_{dis}$ with the one-level three-dimensional wavelet transform, and obtaining eight groups of second sub-band sequences corresponding to each frame group in the $B_{dis}$, wherein: the eight groups of the second sub-band sequences comprise four groups of second time-domain high-frequency sub-band sequences and four groups of second time-domain low-frequency sub-band sequences; and each group of the second sub-band sequence comprises $\frac{2^{n}}{2}$ second wavelet coefficient matrixes;
⑤ calculating respective qualities of two groups among the eight groups of the second sub-band sequences corresponding to each frame group in the $B_{dis}$; and denoting a quality of a $j$-th group of the second sub-band sequence corresponding to the $G_{dis}^{i}$ as $Q^{i,j}$, $Q^{i,j}=\frac{\sum_{k=1}^{K}\mathrm{SSIM}\left(VI_{org}^{i,j,k},VI_{dis}^{i,j,k}\right)}{K}$, wherein: $j=1,5$; the $1\le k\le K$; the $K$ represents a total number of the wavelet coefficient matrixes respectively in each group of the first sub-band sequence corresponding to each frame group in the $B_{org}$ and each group of the second sub-band sequence corresponding to each frame group in the $B_{dis}$; and $K=\frac{2^{n}}{2}$; the $VI_{org}^{i,j,k}$ represents a $k$-th first wavelet coefficient matrix of a $j$-th group of the first sub-band sequence corresponding to the $G_{org}^{i}$; the $VI_{dis}^{i,j,k}$ represents a $k$-th second wavelet coefficient matrix of the $j$-th group of the second sub-band sequence corresponding to the $G_{dis}^{i}$; and $\mathrm{SSIM}(\ )$ is a structural similarity calculation function;
⑥ according to the respective qualities of two groups among the eight groups of the second sub-band sequences corresponding to each frame group in the $B_{dis}$, calculating a quality of each frame group in the $B_{dis}$; and denoting the quality of the $G_{dis}^{i}$ as $Q_{GoF}^{i}$, $Q_{GoF}^{i}=w_{G}\times Q^{i,1}+(1-w_{G})\times Q^{i,5}$, wherein: the $w_{G}$ is a weight of the $Q^{i,1}$; the $Q^{i,1}$ represents the quality of a first group of the second sub-band sequence corresponding to the $G_{dis}^{i}$; and the $Q^{i,5}$ represents the quality of a fifth group of the second sub-band sequence corresponding to the $G_{dis}^{i}$; and
⑦ according to the quality of each frame group in the $B_{dis}$, calculating an objective assessment quality of the $V_{dis}$ and denoting the objective assessment quality of the $V_{dis}$ as $Q_{v}$, $Q_{v}=\frac{\sum_{i=1}^{n_{GoF}}w^{i}\times Q_{GoF}^{i}}{\sum_{i=1}^{n_{GoF}}w^{i}}$, wherein the $w^{i}$ is a weight of the $Q_{GoF}^{i}$.
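The arithmetic in steps 2, 3, 6 and 7 of the claim above can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names are hypothetical, and the claim does not fix numeric values for the fusion angle, the display brightness parameter, the weight w_G, or the frame-group weights, so all of them are caller-supplied here. The sketch reads the frame-group size as 2 to the power n, which is an interpretation of the claim's notation. The three-dimensional wavelet transform and the SSIM computation of steps 4 and 5 are omitted; the sub-band qualities `q1` and `q5` are assumed to have been computed elsewhere.

```python
import math


def fuse_brightness(left, right, fusion_angle_rad, lam):
    """Step 2: per-pixel binocular fusion brightness of one stereoscopic frame.

    left, right: equally sized 2-D lists of brightness values.
    fusion_angle_rad and lam are assumptions supplied by the caller.
    """
    c = math.cos(fusion_angle_rad)
    return [
        [math.sqrt(r * r + l * l + 2.0 * (r * l * c) * lam)
         for l, r in zip(row_l, row_r)]
        for row_l, row_r in zip(left, right)
    ]


def split_into_frame_groups(frames, n):
    """Step 3: floor(Nf / 2**n) groups of 2**n frames; leftover frames are dropped."""
    if not 3 <= n <= 5:
        raise ValueError("the claim restricts n to the range [3, 5]")
    size = 2 ** n
    n_gof = len(frames) // size
    return [frames[i * size:(i + 1) * size] for i in range(n_gof)]


def frame_group_quality(q1, q5, w_g):
    """Step 6: combine the qualities of sub-band groups 1 and 5."""
    return w_g * q1 + (1.0 - w_g) * q5


def video_quality(group_qualities, weights):
    """Step 7: weighted average of the frame-group qualities."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * q for w, q in zip(weights, group_qualities)) / total
```

With a fusion angle of zero and a brightness parameter of 1, the fusion formula reduces to the sum of the two brightness values, which gives a quick sanity check for `fuse_brightness`.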
Address: Ningbo, Zhejiang CN