An image processing method includes outputting a predetermined region of one or more frames of video data as a two-dimensional (2D) image and the remaining regions of the one or more frames as a three-dimensional (3D) image by using metadata of the video data, wherein the metadata includes information for classifying the frames into predetermined units.
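
The following is a minimal sketch of how such a method might be organized. It is an illustration under stated assumptions, not the disclosed implementation. The names `Unit`, `find_unit`, `shift_row`, and `render_stereo`, the rectangular shape of the 2D region, and the use of a uniform horizontal shift to form left-eye and right-eye views are all hypothetical. Frames are modeled as lists of pixel rows so that the example needs only the standard library.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Frame = List[List[int]]  # one grayscale frame: rows of pixel values


@dataclass
class Unit:
    """Hypothetical metadata record: a run of frames forming one unit."""
    first_frame: int  # index of the first frame in the unit (inclusive)
    last_frame: int   # index of the last frame in the unit (inclusive)
    # Assumed rectangle (top, left, bottom, right) to be shown as 2D;
    # bottom and right are exclusive. None means no 2D region in this unit.
    region_2d: Optional[Tuple[int, int, int, int]] = None


def find_unit(units: List[Unit], frame_index: int) -> Optional[Unit]:
    """Use the metadata to classify a frame into the unit that contains it."""
    for unit in units:
        if unit.first_frame <= frame_index <= unit.last_frame:
            return unit
    return None


def shift_row(row: List[int], shift: int) -> List[int]:
    """Shift a row horizontally by `shift` pixels, repeating the edge pixel."""
    width = len(row)
    return [row[min(max(x - shift, 0), width - 1)] for x in range(width)]


def render_stereo(frame: Frame, unit: Optional[Unit],
                  disparity: int = 2) -> Tuple[Frame, Frame]:
    """Return (left, right) views of a frame.

    Assumption: a uniform shift stands in for real 2D-to-3D conversion.
    Pixels inside the unit's 2D region are copied unchanged into both
    views, so they have zero disparity and appear flat.
    """
    left = [shift_row(row, disparity) for row in frame]
    right = [shift_row(row, -disparity) for row in frame]
    if unit is not None and unit.region_2d is not None:
        top, lft, bottom, rgt = unit.region_2d
        for y in range(top, bottom):
            left[y][lft:rgt] = frame[y][lft:rgt]
            right[y][lft:rgt] = frame[y][lft:rgt]
    return left, right
```

In this sketch, the 2D region is identical in both views while the rest of the frame differs between them, which is one simple way to realize "2D inside the region, 3D elsewhere". A real system would likely derive per-pixel depth rather than apply a constant shift.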