This New Rendering Framework Lets Neural Networks Turn 2D Images 3D
Researchers at Nvidia say they have created a rendering framework that can produce 3D objects from 2D images with the correct shape, color, texture and lighting, which could help machine learning models achieve depth perception.
The rendering framework, called DIB-R (a differentiable interpolation-based renderer), was presented this week at the annual Conference on Neural Information Processing Systems in Vancouver, Canada.
The framework, when wrapped around a neural network, learns to predict shape, texture, and light from single images and generate 3D shapes from a photo.
In the paper presented this week, the researchers (from Nvidia, the University of Toronto, Vector Institute, McGill University and Aalto University) noted: "Many machine learning models operate on images, but ignore the fact that images are 2D projections formed by 3D geometry interacting with light, in a process called rendering...
"Enabling machine learning models to understand the image formation process could facilitate disentanglement of geometry from the lighting effects, which is key in achieving invariance and robustness."
DIB-R 2D to 3D Rendering
DIB-R uses an encoder-decoder architecture to transform the 2D input image into a feature map, which is then used to predict properties of the image such as shape, color, texture and lighting.
DIB-R starts from a polygon sphere and deforms it until it matches the object in the 2D image it is trying to reproduce in 3D. The researchers trained the model on several image datasets, ranging from a collection of bird photos to images of vehicles.
It could potentially be used by archaeological researchers to create 3D images of objects that have been discovered and imaged during excavations.
Training the model takes just two days on a single NVIDIA V100 GPU. Once trained, DIB-R can create a 3D object from a 2D image within 100 milliseconds. DIB-R is built on the machine learning framework PyTorch.
The researchers noted: "Key to our approach is to view foreground rasterization as a weighted interpolation of local properties and background rasterization as a distance-based aggregation of global geometry. Our approach allows for accurate optimization over vertex positions, colors, normals, light directions and texture coordinates through a variety of lighting models."
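To make the two ideas in that quote concrete, here is a minimal sketch in plain Python. It is not the DIB-R implementation: the function names, the 2D setup, the squared-distance falloff and the `sigma` parameter are all assumptions chosen for illustration. A foreground pixel gets its color by barycentric interpolation of the covering triangle's vertex colors, and a background pixel gets a soft coverage value aggregated from its distance to every triangle.

```python
import math

def barycentric(p, a, b, c):
    """Barycentric weights of 2D point p with respect to triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    w0 = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    w1 = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return w0, w1, 1.0 - w0 - w1

def shade_foreground(p, tri, colors):
    """Foreground pixel: weighted interpolation of the three vertex colors."""
    w = barycentric(p, *tri)
    return tuple(sum(wi * col[k] for wi, col in zip(w, colors)) for k in range(3))

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    apx, apy = p[0] - a[0], p[1] - a[1]
    denom = abx * abx + aby * aby
    t = 0.0 if denom == 0 else max(0.0, min(1.0, (apx * abx + apy * aby) / denom))
    return math.hypot(p[0] - (a[0] + t * abx), p[1] - (a[1] + t * aby))

def soft_alpha(p, triangles, sigma=0.1):
    """Background pixel (outside every triangle): soft coverage from distances.

    Assumed form: alpha = 1 - prod_f (1 - exp(-d(p, f)^2 / sigma)),
    where d is the distance from p to the nearest edge of triangle f.
    """
    miss = 1.0
    for a, b, c in triangles:
        d = min(point_segment_distance(p, a, b),
                point_segment_distance(p, b, c),
                point_segment_distance(p, c, a))
        miss *= 1.0 - math.exp(-d * d / sigma)
    return 1.0 - miss

if __name__ == "__main__":
    tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
    rgb = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    print(shade_foreground((0.25, 0.25), tri, rgb))  # pixel inside the triangle
    print(soft_alpha((-0.1, 0.5), [tri]))            # pixel just outside it
```

Both functions vary smoothly as the vertices move, which is the property that lets gradients flow from pixel values back to vertex positions and colors. In a framework such as PyTorch the same arithmetic would be written on tensors so that automatic differentiation can compute those gradients.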