Notes on Learning Generative Models of 3D Structures
Overview
I wanted to get a sense of where the research is at on using deep learning models to generate 3D models, for applications in procedural generation tools and synthetic training datasets. I came across a video going over the 2020 survey paper Learning Generative Models of 3D Structures. Below are some notes I took while watching.
Motivation
- 3D graphics are now critical to many industries
- The huge cost of data capture and human labeling leads to a lack of training data
Generative models
\[ \text{generative: } P(X) \quad \text{vs. discriminative: } P(Y \mid X) \]
Instead of learning to predict some attribute Y given an input X, a generative model learns the entire input distribution, enabling it to sample new objects X directly.
Can be useful in simulating real-world environments and synthetically generating training data
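As a toy illustration of the difference (my own example, not from the paper): a generative model fits P(X) and can draw brand-new samples, while a discriminative model only maps a given X to a label. Here the Gaussian fit and the toy classifier are purely illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=[2.0, -1.0], scale=0.5, size=(1000, 2))   # observed data

# Generative: model P(X) (here, a single Gaussian) and sample new X from it.
mu, cov = X.mean(axis=0), np.cov(X, rowvar=False)
new_samples = rng.multivariate_normal(mu, cov, size=5)        # brand-new "objects"

# Discriminative: model P(Y | X) only, e.g. a fixed logistic classifier.
def p_y_given_x(x, w=np.array([1.0, 1.0]), b=0.0):
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))                 # probability of label Y=1

print(new_samples)
print(p_y_given_x(X[:3]))
```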
Structure-Aware Representations
- Scope: learned generative models of structured 3D content
Learned:
- Determined from data ↔︎ specified by hand or with rules
Structured:
- 3D shapes and scenes that are decomposed into sub-structures ↔︎ a monolithic chunk of geometry
Structure-Aware:
- Express 3D shapes and scenes using abstractions that allow manipulation of their high-level structure
- represent the geometry of the atomic structural elements
- represent the structural patterns
Structure-Aware Representations
- Representations of Part/Object Geometry
- Voxel Grid
- Point Cloud
- Implicit Surface
- A function that determines whether a point is inside or outside the surface (see the first sketch after this list)
- Triangle Mesh
- Representations of Structure
- Segmented geometry
- Links a label to each part of the entity’s geometry
- Part sets
- an unordered set of atoms (pieces)
- Relationship graphs
- With edges between different parts of a scene or object
- Hierarchies (trees)
- Hierarchical Graphs
- Combine relationship graphs and hierarchies (both appear in the second sketch after this list)
- Deterministic Programs
- Can be made to output any of the above representations
- Beneficial for making patterns clear
- Allows editing by users
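A minimal sketch (my own illustration, not from the paper) of how the geometry representations above might look in code: a voxel grid and a point cloud are just arrays, a triangle mesh is vertices plus index triples, and an implicit surface is a function whose sign tells inside from outside.

```python
import numpy as np

# Voxel grid: a dense 3D occupancy array (1 = filled, 0 = empty).
voxels = np.zeros((32, 32, 32), dtype=np.uint8)
voxels[12:20, 12:20, 12:20] = 1  # a small solid cube

# Point cloud: an unordered (N, 3) array of surface samples.
points = np.random.default_rng(0).uniform(-1, 1, size=(2048, 3))

# Triangle mesh: vertex positions plus integer triangles indexing into them.
vertices = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
faces = np.array([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]])  # a tetrahedron

# Implicit surface: f(p) < 0 inside, > 0 outside, == 0 on the surface.
def sphere_sdf(p, center=np.zeros(3), radius=0.5):
    return np.linalg.norm(p - center, axis=-1) - radius

inside = sphere_sdf(points) < 0  # boolean mask: which samples fall inside the sphere
```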
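And a second sketch for the structural side (again my own toy example, with hypothetical part names): a part set, relationship edges between parts, and a hierarchy that groups parts into sub-structures. Combining the edges with the tree gives a hierarchical graph.

```python
# Part set: an unordered collection of parts, each with a label and its own geometry.
parts = {
    "seat":  {"kind": "cuboid", "size": (0.5, 0.05, 0.5)},
    "back":  {"kind": "cuboid", "size": (0.5, 0.5, 0.05)},
    "leg_0": {"kind": "cuboid", "size": (0.05, 0.4, 0.05)},
    "leg_1": {"kind": "cuboid", "size": (0.05, 0.4, 0.05)},
}

# Relationship graph: edges between parts (adjacency, symmetry, support, ...).
edges = [
    ("seat", "back", "adjacent"),
    ("seat", "leg_0", "supported_by"),
    ("seat", "leg_1", "supported_by"),
    ("leg_0", "leg_1", "symmetric"),
]

# Hierarchy: a tree that groups parts into sub-structures; together with the edge
# list above this forms a hierarchical graph.
hierarchy = ("chair", [("base", ["leg_0", "leg_1"]), ("top", ["seat", "back"])])
```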
Methodologies
Program synthesis
- Constraint-based program synthesis
- Used when only a few training examples are available
- Tries to find the minimum-cost program that satisfies a set of constraints (a minimal sketch follows)
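A minimal sketch of that search, with an entirely made-up program space: enumerate candidate programs, keep those that satisfy the constraints, and return the cheapest one. The operation names, costs, and constraints are hypothetical; real systems use solvers rather than brute-force enumeration.

```python
from itertools import product

# Hypothetical program space: a "program" is a tuple of drawing ops, each with a cost.
OPS = {"cube": 1.0, "cylinder": 1.0, "repeat4": 0.5}

def cost(program):
    return sum(OPS[op] for op in program)

def satisfies(program, constraints):
    # Constraints are predicates over the program (e.g. "must contain a repeat loop").
    return all(c(program) for c in constraints)

def synthesize(constraints, max_len=3):
    best = None
    for length in range(1, max_len + 1):
        for program in product(OPS, repeat=length):
            if satisfies(program, constraints) and (best is None or cost(program) < cost(best)):
                best = program
    return best

# Example: the cheapest program that uses a repeat loop and a cube.
print(synthesize([lambda p: "repeat4" in p, lambda p: "cube" in p]))
```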
Classical Probabilistic Models
- Probabilistic graphical models
- Input Type:
- Small dataset, not large enough to train a deep learning model
- Fixed structure
- Examples:
- Factor graph
- Bayesian network
- Markov random field
- Probabilistic grammars
- Input Type:
- Small dataset, not large enough to train a deep learning model
- Dynamic, tree-like structure
- Examples:
- Context-free grammar (CFG)
- Used in natural language processing
- a start symbol
- a set of terminals and non-terminals
- a set of production rules that map a non-terminal to a sequence of terminals and/or non-terminals
- generates a tree where the leaf nodes are terminals
- Probabilistic CFG (PCFG)
- Adds a probability to each rule (a small PCFG sketch follows after this list)
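A tiny PCFG sketch (my own toy grammar, not from the paper): uppercase non-terminals expand via rules sampled by probability, expansion stops when only terminals remain, and the result is a derivation tree whose leaves are terminals.

```python
import random

# Toy probabilistic grammar for a chair-like layout.
# Each non-terminal maps to a list of (probability, expansion) rules;
# lowercase symbols are terminals.
RULES = {
    "CHAIR": [(1.0, ["BASE", "seat", "BACK"])],
    "BASE":  [(0.7, ["leg", "leg", "leg", "leg"]), (0.3, ["pedestal"])],
    "BACK":  [(0.8, ["backrest"]), (0.2, [])],   # 20% chance of a stool (no back)
}

def expand(symbol, rng=random):
    if symbol not in RULES:                       # terminal: leaf node
        return symbol
    rules = RULES[symbol]
    r, acc = rng.random(), 0.0
    for prob, expansion in rules:                 # sample a rule by its probability
        acc += prob
        if r <= acc:
            return (symbol, [expand(s, rng) for s in expansion])
    return (symbol, [expand(s, rng) for s in rules[-1][1]])  # float-rounding fallback

print(expand("CHAIR"))
# e.g. ('CHAIR', [('BASE', ['leg', 'leg', 'leg', 'leg']), 'seat', ('BACK', ['backrest'])])
```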
Deep Generative Models
- Input Type:
- Big dataset
- Autoregressive models
- Input Type:
- Not globally coherent
- Iteratively consumes its output from one iteration as input for the next iteration (see the first sketch after this list)
- Weakness:
- If one step drifts from the training data, it can cause subsequent output to diverge further
- Deep latent variable models
- Input Type:
- Globally-coherent
- Variational AutoEncoders (VAE)
- Generative Adversarial Networks (GAN)
- Core idea:
- Sample from a low-dimensional latent space and use a trained generator to map latent vectors to actual 3D shapes, which are hard to sample directly (a minimal sketch follows after this list)
- Use a global latent variable to control the generation
- VAEs are trained with a reconstruction loss between the input and the generated output
- Often perform better than autoregressive models in terms of global coherence
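A minimal sketch of the autoregressive loop described above (all names hypothetical): each step conditions on everything generated so far, which is exactly why an early mistake can compound into drift.

```python
def generate_autoregressive(next_part_model, max_parts=10):
    """Generate a list of parts one at a time, feeding the output back in."""
    parts = []
    for _ in range(max_parts):
        part = next_part_model(parts)   # predict the next part given everything so far
        if part is None:                # model emits a stop token
            break
        parts.append(part)              # this output becomes input on the next iteration
    return parts

# Toy stand-in for a trained model: add four legs, then a seat, then stop.
def toy_next_part(parts):
    schedule = ["leg", "leg", "leg", "leg", "seat"]
    return schedule[len(parts)] if len(parts) < len(schedule) else None

print(generate_autoregressive(toy_next_part))  # ['leg', 'leg', 'leg', 'leg', 'seat']
```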
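And a sketch of the latent-variable idea: a generator maps a low-dimensional latent vector to shape parameters, so new shapes are produced simply by sampling z from a fixed prior. Here the "generator" is an untrained random linear map purely for illustration; in a real VAE it is the decoder, trained with a reconstruction loss plus a regularizer on the latent space.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, SHAPE_DIM = 8, 64           # e.g. 64 numbers parameterizing part boxes

# Stand-in for a trained generator/decoder: a fixed random linear map plus tanh.
W = rng.normal(size=(LATENT_DIM, SHAPE_DIM))

def generate(z):
    return np.tanh(z @ W)               # maps latent code -> shape parameters

# Sampling new shapes = sampling the latent prior and decoding.
z = rng.normal(size=(5, LATENT_DIM))    # 5 draws from a standard normal prior
shapes = generate(z)                    # 5 "shapes", each a 64-dim parameter vector
print(shapes.shape)                     # (5, 64)
```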
Structure Type
- Recurrent Neural Network
- Data represented as a linear chain
- Recursive Neural Network (RvNN)
- Data represented as a tree (a minimal sketch follows after this list)
- Graph Convolutional Network
- Data represented as a graph
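A small sketch of the tree-structured case, since it is the least familiar of the three: an RvNN encodes a part hierarchy by recursively merging child codes into a parent code, bottom-up. The weights here are random placeholders and the chair hierarchy is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16                                   # size of each part/node feature code

# Random stand-ins for learned weights: a leaf encoder and a two-child merge function.
W_leaf  = rng.normal(size=(3, D))        # encodes a raw part feature (here 3 numbers)
W_merge = rng.normal(size=(2 * D, D))

def encode(node):
    """node is either a raw part feature vector, or a (left, right) pair of subtrees."""
    if isinstance(node, tuple):
        left, right = encode(node[0]), encode(node[1])
        return np.tanh(np.concatenate([left, right]) @ W_merge)   # merge child codes
    return np.tanh(np.asarray(node) @ W_leaf)                     # encode a leaf part

# A tiny chair hierarchy: ((leg, leg), (seat, back)), each part a 3-vector of sizes.
chair = (([0.05, 0.4, 0.05], [0.05, 0.4, 0.05]), ([0.5, 0.05, 0.5], [0.5, 0.5, 0.05]))
print(encode(chair).shape)               # (16,) -- a single code for the whole tree
```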
Neural Program Synthesis
Application
Synthesize a plausible program that recreates an existing piece of 3D content
Recover shape-generating programs from an existing 3D shape
Learning Shape Abstractions by Assembling Volumetric Primitives (2017)
- Learned to reconstruct 3D shapes with simple geometric primitives
- Decomposes shapes into primitives and uses the Chamfer distance as the loss function (a short sketch follows)
- https://github.com/shubhtuls/volumetricPrimitives
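The Chamfer distance mentioned above can be written in a few lines. This is a common symmetric formulation (the paper's exact variant, e.g. squared distances, may differ): for each point in one set, take the distance to its nearest neighbor in the other set, and average in both directions.

```python
import numpy as np

def chamfer_distance(A, B):
    """Symmetric Chamfer distance between point sets A (N, 3) and B (M, 3)."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)  # (N, M) pairwise distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Example: compare a sampled shape against a noisy reconstruction of it.
rng = np.random.default_rng(0)
target = rng.uniform(-1, 1, size=(512, 3))
reconstruction = target + rng.normal(scale=0.05, size=target.shape)
print(chamfer_distance(target, reconstruction))
```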
Learning Shape Abstractions by Assembling Volumetric Primitives
Learning to Infer and Execute 3D Shape Programs (2019)
- The model outputs a 3D shape program consisting of loops and other high-level structures (a hypothetical example follows)
- https://github.com/HobbitLong/shape2prog
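To make the "program with loops" idea concrete, here is a hypothetical chair program in a made-up mini-DSL (not the paper's actual DSL): a loop places the four legs at the seat corners, followed by single statements for the seat and back. A model like shape2prog is trained to emit programs in this spirit.

```python
# Hypothetical chair program: each statement emits a cuboid (position, size).
def chair_program():
    shapes = []
    # Loop structure: four legs placed at the corners of the seat.
    for dx in (-0.2, 0.2):
        for dz in (-0.2, 0.2):
            shapes.append(("cuboid", (dx, 0.2, dz), (0.05, 0.4, 0.05)))
    shapes.append(("cuboid", (0.0, 0.425, 0.0), (0.5, 0.05, 0.5)))    # seat
    shapes.append(("cuboid", (0.0, 0.65, -0.225), (0.5, 0.4, 0.05)))  # back
    return shapes

print(len(chair_program()))  # 6 primitives: 4 legs + seat + back
```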
Superquadrics Revisited: Learning 3D Shape Parsing beyond Cuboids (2019)
- Parses shapes into superquadric primitives instead of cuboids
- https://github.com/paschalidoud/superquadric_parsing
Perform visual program induction directly from 2D images (Liu et al. 2019)
Other Applications:
- Part-based shape synthesis
- Indoor scene synthesis