High coding efficiency is achieved when disparity-compensated prediction is performed on an encoding (decoding) target picture using depth information that represents the three-dimensional position of an object in a reference picture. A correspondence point on the reference picture is set for each pixel of the encoding target picture. Object depth information, which is the depth information for the pixel at the integer pixel position on the encoding target picture indicated by the correspondence point, is set. A tap length for pixel interpolation is determined using the object depth information and reference picture depth information for the pixel at the integer pixel position, or at an integer pixel position around the fractional pixel position, on the reference picture indicated by the correspondence point. A pixel value at the integer pixel position or the fractional pixel position on the reference picture indicated by the correspondence point is generated using an interpolation filter in accordance with the tap length. Inter-view picture prediction is performed by setting the generated pixel value as the predicted value of the pixel at the integer pixel position on the encoding target picture indicated by the correspondence point.
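The steps above can be illustrated with a minimal one-dimensional sketch. The abstract does not state how the tap length is derived from the two depth values or which interpolation filter is used, so everything concrete here is an assumption made for illustration: the function names (`choose_tap_length`, `interpolate`, `predict_pixel`), the `threshold` and `max_tap` parameters, the rule of widening the filter only while surrounding reference pixels have a depth close to the object depth, the normalized Lanczos kernel, and the single-tap fallback that picks the neighbor with the closest depth.

```python
import math

def choose_tap_length(ref_depth, x, object_depth, max_tap=8, threshold=4.0):
    """Hypothetical rule: widen the filter symmetrically around x while the
    reference pixels being added have a depth close to the object depth."""
    left = math.floor(x)
    half = 0
    while 2 * (half + 1) <= max_tap:
        lo, hi = left - half, left + 1 + half
        if lo < 0 or hi >= len(ref_depth):
            break  # filter support would leave the picture
        if (abs(ref_depth[lo] - object_depth) > threshold
                or abs(ref_depth[hi] - object_depth) > threshold):
            break  # next pixel likely belongs to a different object
        half += 1
    return max(1, 2 * half)

def lanczos(d, a):
    """Lanczos kernel with support a (assumed filter, not from the source)."""
    if d == 0:
        return 1.0
    if abs(d) >= a:
        return 0.0
    pd = math.pi * d
    return (math.sin(pd) / pd) * (math.sin(pd / a) / (pd / a))

def interpolate(ref_pixels, x, tap):
    """Normalized interpolation over `tap` pixels around x (tap is even)."""
    half = tap // 2
    left = math.floor(x)
    num = den = 0.0
    for i in range(left - half + 1, left + half + 1):
        w = lanczos(x - i, half)
        num += w * ref_pixels[i]
        den += w
    return num / den

def predict_pixel(ref_pixels, ref_depth, x, object_depth):
    """Predicted value for one target pixel whose correspondence point is x."""
    tap = choose_tap_length(ref_depth, x, object_depth)
    if tap == 1:
        # Assumed fallback: take the neighbor whose depth best matches.
        left = math.floor(x)
        right = min(left + 1, len(ref_pixels) - 1)
        best = min((left, right), key=lambda i: abs(ref_depth[i] - object_depth))
        return float(ref_pixels[best])
    return interpolate(ref_pixels, x, tap)
```

Under these assumptions, a correspondence point that falls next to a depth discontinuity receives a short filter, so pixel values from a different object are not blended into the predicted value, while a point inside a region of uniform depth receives the longest filter allowed by `max_tap`.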