Title of invention: Online coupled camera pose estimation and dense reconstruction from video
Abstract: A product may receive each image in a stream of video images of a scene, and before processing the next image, generate information indicative of the position and orientation of an image capture device that captured the image at the time of capturing the image. The product may do so by identifying distinguishable image feature points in the image; determining a coordinate for each identified image feature point; and for each identified image feature point, attempting to identify one or more distinguishable model feature points in a three dimensional (3D) model of at least a portion of the scene that appears likely to correspond to the identified image feature point. Thereafter, the product may find each of the following that, in combination, produce a consistent projection transformation of the 3D model onto the image: a subset of the identified image feature points for which one or more corresponding model feature points were identified; and, for each image feature point that has multiple likely corresponding model feature points, one of the corresponding model feature points. The product may update a 3D model of at least a portion of the scene following the receipt of each video image and before processing the next video image based on the generated information indicative of the position and orientation of the image capture device at the time of capturing the received image. The product may display the updated 3D model after each update to the model.
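The selection step in the abstract (keep a subset of image feature points, and for each point with several candidate model points keep one) can be illustrated with a minimal sketch. It assumes a single camera pose hypothesis `(R, t)` is already available and scores candidates by reprojection error under a simple pinhole model. The function names, the `max_error` threshold, and the intrinsics `focal` and `center` are hypothetical and do not come from the patent. How pose hypotheses are generated is not described in this record.

```python
import math

def reproject(point3d, R, t, focal, center):
    """Project a 3D model point into the image with a pinhole camera (assumed model)."""
    x, y, z = (sum(R[i][j] * point3d[j] for j in range(3)) + t[i] for i in range(3))
    if z <= 0:  # point is behind the camera, so it cannot be observed
        return None
    return (focal * x / z + center[0], focal * y / z + center[1])

def consistent_matches(candidates, R, t, focal, center, max_error=2.0):
    """Select the matches that agree with one pose hypothesis.

    candidates: list of (pixel, [model points]) pairs, one per image feature.
    Returns {feature index: chosen model point}. Features whose candidates all
    reproject farther than max_error pixels away are left out of the subset.
    """
    chosen = {}
    for idx, (pixel, model_points) in enumerate(candidates):
        best, best_err = None, max_error
        for p in model_points:
            proj = reproject(p, R, t, focal, center)
            if proj is None:
                continue
            err = math.hypot(proj[0] - pixel[0], proj[1] - pixel[1])
            if err <= best_err:  # keep the candidate with the smallest error
                best, best_err = p, err
        if best is not None:
            chosen[idx] = best
    return chosen

# Example: identity pose, three features, the first with two candidates.
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
features = [
    ((50.0, 50.0), [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]),
    ((75.0, 50.0), [(0.5, 0.0, 2.0)]),
    ((10.0, 10.0), [(0.0, 0.0, 2.0)]),
]
result = consistent_matches(features, IDENTITY, (0.0, 0.0, 0.0), 100.0, (50.0, 50.0))
```

In a complete system this scoring would typically run over many pose hypotheses, keeping the hypothesis with the largest consistent subset. That outer loop is omitted here.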
Publication number: US9483703(B2) Publication date: 2016.11.01
Application number: US201414120370 Filing date: 2014.05.14
Applicant: UNIVERSITY OF SOUTHERN CALIFORNIA Inventors: Medioni Gerard;Kang Zhuoliang
Classification: H04N13/02;G06K9/46;G06T7/00;G06K9/00 Main classification: H04N13/02
Agency: McDermott Will & Emery LLP Agent: McDermott Will & Emery LLP
Principal claim: 1. A product comprising a non-transitory, tangible, computer-readable storage medium containing a program of instructions that causes a computer system running the program of instructions to cause at least the following to occur: receive a stream of video images of a scene, each image having been captured by an image capture device while located at a particular position and having a particular orientation, at least two of the images having been captured by the image capture device while at different locations; after receiving each image and before processing the next image, generate information indicative of the position and orientation of the image capture device at the time of capturing the image and update a three dimensional (3D) model by performing at least the following: identifying distinguishable image feature points in the image; for each identified image feature point, attempting to identify one or more distinguishable model feature points in a three dimensional (3D) model of at least a portion of the scene that appears likely to correspond to the identified image feature point, where the correspondence is determined by a matching algorithm that performs at least the following: back-projects the feature point in the three dimensional (3D) model onto the previously-received image; finds an estimated pixel location on the current image using dense optical flow; and searches near the estimate pixel location to find the matched image feature point; finding each of the following that, in combination, produce a consistent projection transformation of the 3D model onto the image: a subset of identified image feature points for which one or more corresponding model feature points were identified; and for each image feature point that has multiple likely corresponding model feature points, one of the corresponding model feature points; and updating the three dimensional (3D) model by using the projection transformation of the current image to estimate geometry information.
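The three matching steps recited in the claim (back-project the model point onto the previous image, follow dense optical flow to an estimated pixel in the current image, search nearby for a feature) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the pinhole projection, the flow field stored as a `flow[row][col] = (du, dv)` grid, the `radius` search window, and the nearest-feature rule are all hypothetical choices. A practical matcher would likely also compare feature descriptors.

```python
import math

def project(point3d, pose, focal, center):
    """Pinhole projection of a 3D point; pose = (R, t), R a 3x3 nested list (assumed model)."""
    R, t = pose
    x, y, z = (sum(R[i][j] * point3d[j] for j in range(3)) + t[i] for i in range(3))
    if z <= 0:  # behind the camera
        return None
    return (focal * x / z + center[0], focal * y / z + center[1])

def match_model_point(point3d, prev_pose, flow, features, focal, center, radius=3.0):
    """Return the index of the current-image feature matched to point3d, or None."""
    # Step 1: back-project the model point onto the previously received image.
    uv = project(point3d, prev_pose, focal, center)
    if uv is None:
        return None
    u, v = uv
    # Step 2: read the dense optical flow at that pixel to estimate where the
    # point has moved to in the current image.
    col, row = int(round(u)), int(round(v))
    if not (0 <= row < len(flow) and 0 <= col < len(flow[0])):
        return None
    du, dv = flow[row][col]
    est_u, est_v = u + du, v + dv
    # Step 3: search near the estimate for the closest detected feature point.
    best, best_dist = None, radius
    for i, (fu, fv) in enumerate(features):
        dist = math.hypot(fu - est_u, fv - est_v)
        if dist <= best_dist:
            best, best_dist = i, dist
    return best

# Example: an 8x8 flow field with uniform motion of (+2, +1) pixels.
IDENTITY_POSE = ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], (0.0, 0.0, 0.0))
uniform_flow = [[(2.0, 1.0)] * 8 for _ in range(8)]
detected = [(0.0, 0.0), (6.5, 5.0), (20.0, 20.0)]
match = match_model_point((0.0, 0.0, 2.0), IDENTITY_POSE, uniform_flow, detected, 100.0, (4.0, 4.0))
```

Restricting the search to a small window around the flow-predicted pixel keeps the per-frame cost low, which fits the claim's requirement that each image be handled before the next one is processed.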
Address: Los Angeles CA US