Radiometric Scene Decomposition

Scene Reflectance, Illumination, and Geometry from RGB-D Images
Stephen Lombardi and Ko Nishino
Drexel University

[Figure: fullrad header]
Recovering the radiometric properties of a scene (i.e., the reflectance, illumination, and geometry) is a long-sought ability of computer vision that can provide invaluable information for a wide range of applications. Deciphering the radiometric ingredients from the appearance of a real-world scene, as opposed to a single isolated object, is particularly challenging as a scene generally consists of various objects with different material compositions, exhibiting complex reflectance and light interactions that are also part of the illumination. We introduce the first method for radiometric decomposition of real-world scenes that handles these intricacies. We use RGB-D images to bootstrap geometry recovery and simultaneously recover the complex reflectance and natural illumination while refining the noisy initial geometry and segmenting the scene into different material regions. Most importantly, we handle real-world scenes consisting of multiple objects of unknown materials, which necessitates the modeling of spatially-varying complex reflectance, natural illumination, texture, interreflection, and shadows. We systematically evaluate the effectiveness of our method on synthetic scenes and demonstrate its application to real-world scenes. The results show that rich radiometric information can be recovered from RGB-D images and demonstrate a new role RGB-D sensors can play in general scene understanding tasks.
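For context, a decomposition of this kind can be viewed as inverting the standard image formation (reflection) model sketched below; this is a generic formulation, not the paper's exact parameterization of reflectance and illumination:

    % Outgoing radiance at surface point x toward the camera direction w_o:
    % the incident natural illumination L, modulated by the spatially-varying
    % BRDF f and the foreshortening term, integrated over the hemisphere Omega.
    I(x, \omega_o) = \int_{\Omega} f(x, \omega_i, \omega_o)\,
        L(\omega_i)\, \max(0,\, n(x) \cdot \omega_i)\, d\omega_i

Recovering f, L, and the geometry n jointly from the observed images I is what makes the problem an inverse one.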
  • Radiometric Scene Decomposition: Scene Reflectance, Illumination, and Geometry from RGB-D Images
    S. Lombardi and K. Nishino,
    in Proc. of International Conference on 3D Vision (3DV’16), Oct. 2016.
    [ paper ][ project ]

Results

[Figure: fullrad synthetic]
We conducted an extensive quantitative evaluation using three different mesh geometries rendered with three different MERL BRDFs under three different illumination environments (27 scenes in total), each captured from four viewpoints. These are the results for the St. Peter's illumination environment. For each reflectance function and mesh, we show a novel view of the scene and use our inferred reflectance, illumination, and geometry to predict the appearance of that novel view. We quantitatively evaluate the reflectance estimates by computing the log-space root-mean-square error against the measured BRDF, and the geometry by computing the distance from each vertex of the estimated mesh to the closest point on the ground-truth geometry, averaged over all vertices. The percentage change in geometry error achieved by our method is shown in parentheses (a negative number indicates a reduction in error). The high-frequency content of this illumination environment can make reflectance and illumination inference difficult when the true BRDF is sharply specular, as in the second reflectance function.
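As a concrete reference for these two error measures, here is a minimal sketch; the epsilon value, the nearest-vertex approximation of the closest point on the ground-truth surface, and all function names are our assumptions rather than the paper's implementation:

    # Sketch of the two evaluation metrics described above.
    import numpy as np
    from scipy.spatial import cKDTree

    def log_rmse(brdf_est, brdf_true, eps=1e-4):
        """Log-space RMSE between estimated and measured BRDF values.

        brdf_est, brdf_true: arrays of BRDF values sampled at matching
        (light, view) directions. eps guards against log(0); its value
        is our assumption, not taken from the paper.
        """
        diff = np.log(brdf_est + eps) - np.log(brdf_true + eps)
        return np.sqrt(np.mean(diff ** 2))

    def mean_geometry_error(verts_est, verts_true):
        """Average distance from estimated vertices to the ground truth.

        Approximates 'closest point on the ground-truth geometry' by the
        nearest ground-truth vertex; densely sampling the true surface
        (or using point-to-triangle distance) would tighten this.
        """
        tree = cKDTree(verts_true)
        dists, _ = tree.query(verts_est)
        return dists.mean()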
[Figure: fullrad synthetic1]
Results on a synthetic scene. The first column shows the estimated appearance using the inferred reflectance, illumination, and segmentation of the scene, the inferred illumination, and the estimated segmentation map. The second column shows the input appearance and ground truth appearance from a novel view, ground truth illumination, and ground truth segmentation. The third column shows the inferred reflectance functions visualized as a series of spheres rendered with varying point-light directions. Our method successfully infers the reflectance, illumination, and texture, even for the brass teapot that reflects other objects in the scene.
[Figure: fullrad synthetic2]
Results on a synthetic scene. The first column shows the estimated appearance using the inferred reflectance, illumination, and segmentation of the scene, the inferred illumination, and the estimated segmentation map. The second column shows the input appearance and ground truth appearance from a novel view, ground truth illumination, and ground truth segmentation. The third column shows the inferred reflectance functions visualized as a series of spheres rendered with varying point-light directions. Our method recovers the main color and location of the illuminants in the scene, such as the blue light from the sky and the orange light from the buildings.
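The reflectance visualizations in these figures (spheres rendered under a series of point-light directions) are straightforward to reproduce. Below is a minimal sketch that uses a Blinn-Phong BRDF as a stand-in for the inferred reflectance; all parameter values are our assumptions:

    # Render a sphere lit by a point light for several light directions,
    # as in the reflectance visualizations above. Blinn-Phong is only a
    # stand-in for the reflectance functions inferred by the method.
    import numpy as np

    def render_sphere(light_dir, res=128, kd=0.5, ks=0.5, shininess=50.0):
        # Unit-sphere normals on an orthographic image grid.
        x, y = np.meshgrid(np.linspace(-1, 1, res), np.linspace(-1, 1, res))
        mask = x**2 + y**2 <= 1.0
        z = np.sqrt(np.clip(1.0 - x**2 - y**2, 0.0, None))
        n = np.stack([x, y, z], axis=-1)

        l = np.asarray(light_dir, dtype=float)
        l /= np.linalg.norm(l)
        v = np.array([0.0, 0.0, 1.0])          # viewer along +z
        h = (l + v) / np.linalg.norm(l + v)    # Blinn-Phong half vector

        n_dot_l = np.clip(n @ l, 0.0, None)
        n_dot_h = np.clip(n @ h, 0.0, None)
        img = kd * n_dot_l + ks * (n_dot_h ** shininess) * (n_dot_l > 0)
        return np.where(mask, img, 0.0)

    # One sphere per light direction, swept across the equator.
    spheres = [render_sphere([np.sin(t), 0.0, np.cos(t)])
               for t in np.linspace(-1.2, 1.2, 5)]

Sweeping the light direction, as in the last two lines, produces the series of spheres shown in the third columns of the figures.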
[Figure: fullrad real1]
Results on a real scene. (a) Three input RGB-D images. (b) Predicted appearance from a novel view and geometry, illumination, and segmentation estimates. The first column shows the estimated appearance using the inferred reflectance, illumination, and segmentation of the scene, the refined geometry, the inferred illumination, and the inferred segmentation map. The second column shows the input appearance and ground truth appearance from a novel view, the input geometry, and the ground truth illumination. (c) Inferred reflectance functions visualized as a series of spheres rendered with varying point-light directions. Our method correctly recognizes the bright specular highlight on the table and uses other highlights in the scene to recover a detailed illumination map.
[Figure: fullrad real2]
Results on a real scene. (a) Four input RGB-D images. (b) Predicted appearance from a novel view and geometry, illumination, and segmentation estimates. The first column shows the estimated appearance using the inferred reflectance, illumination, and segmentation of the scene, the refined geometry, the inferred illumination, and the inferred segmentation map. The second column shows the input appearance and ground truth appearance from a novel view, the input geometry, and the ground truth illumination. (c) Inferred reflectance functions visualized as a series of spheres rendered with varying point-light directions. Again, our method recognizes the bright light on the table as a specular highlight and recovers the texture of the penguin and salt shakers.
[Figure: fullrad real3]
Results on a real scene. (a) Three input RGB-D images. (b) Predicted appearance from a novel view and geometry, illumination, and segmentation estimates. The first column shows the estimated appearance using the inferred reflectance, illumination, and segmentation of the scene, the refined geometry, the inferred illumination, and the inferred segmentation map. The second column shows the input appearance and ground truth appearance from a novel view, the input geometry, and the ground truth illumination. (c) Inferred reflectance functions visualized as a series of spheres rendered with varying point-light directions. This scene contains a complex spatial arrangement of objects, yet we reproduce its appearance well.