Teaching

Winter 2023/2024

Lecture:
3D Computer Vision


Summary

Computer vision has led to many recent technology breakthroughs and is today one of the most in-demand fields. 3D computer vision is becoming increasingly important, and the field has recently shown remarkable progress.

This lecture will teach you the fundamentals of 3D computer vision. After building intuition for the field and covering traditional methods, it will proceed to state-of-the-art approaches and provide you with a basis for working on any 3D computer vision problem.

The lecture starts with fundamentals on projective geometry and sensor devices. It will then dive into correspondence and depth estimation techniques that provide the foundation of 3D reconstruction. Starting with feature extraction and matching, the lecture will continue with stereo depth estimation, optical flow and scene flow estimation. Having covered the fundamentals, the lecture will proceed to complete reconstruction pipelines ranging from SfM, COLMAP and KinectFusion to the modern deep learning techniques NeRF and NeuS.
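To make the projective-geometry fundamentals concrete, here is a minimal sketch (an illustration, not course material) of pinhole projection with hypothetical intrinsics K and pose (R, t), using NumPy:

```python
import numpy as np

def project_points(K, R, t, points_w):
    """Project 3D world points into pixel coordinates with a pinhole camera.

    K: 3x3 intrinsic matrix; R: 3x3 rotation; t: 3-vector translation;
    points_w: (N, 3) world-space points. Returns (N, 2) pixel coordinates.
    """
    points_c = points_w @ R.T + t     # world -> camera frame
    uvw = points_c @ K.T              # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:3]   # perspective divide

# Example: identity pose, focal length 500, principal point (320, 240).
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R, t = np.eye(3), np.zeros(3)
pts = np.array([[0.0, 0.0, 2.0],    # on the optical axis
                [0.5, 0.0, 2.0]])   # 0.5 m to the right
print(project_points(K, R, t, pts))  # → [[320. 240.] [445. 240.]]
```

A point on the optical axis lands exactly on the principal point; shifting it sideways moves the projection by f * x / z pixels, which is the essence of the image-formation lectures.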

The lecture will then advance to online reconstruction with simultaneous localization and mapping (SLAM) approaches and show applications in autonomous driving and AR technology. Finally, the lecture will conclude with the most challenging discipline of generic non-rigid reconstruction.

The lecture will be accompanied by hands-on exercises comprising sheets and coding tasks. Towards the end of the lecture, there will be a challenging and fun project to set up a simple complete SLAM pipeline.

The lecture is the prerequisite for the 3D Real World Modeling and Inference lecture, which will be held in summer and presents the foundations of modern deep learning techniques in 3D.

Requirements

  • As a prerequisite for this course, you must have taken either “High Level Computer Vision” or “Neural Networks: Theory and Implementation” with the computer vision project in the end. You must be familiar with CNNs and how to implement and train them with PyTorch.
  • Having taken “Computer Graphics” and “Image Processing and Computer Vision” is helpful, but not required.

Credit Points

9 ECTS (advanced lecture with project)

Lecturer

Prof. Eddy Ilg

CMS

https://cms.sic.saarland/compvis2324/

Syllabus

Lecture 1: Acquisition Devices (RGB-, ToF-, Event-Cameras; LIDARs; IMUs) and Calibration
Lecture 2: Feature Extraction and Matching
Lecture 3: Projective Geometry and Image Formation, Part 1
Lecture 4: Projective Geometry and Image Formation, Part 2
Lecture 5: Rotations and Spherical Harmonics
Lecture 6: Stereo Depth Estimation Methods
Lecture 7: Optical Flow Estimation Methods
Lecture 8: Scene Flow Estimation Methods
Lecture 9: Rigid 3D Reconstruction: SfM, COLMAP, KinectFusion, Sphere Tracing and Implicit Rep.
Lecture 10: Rigid 3D Reconstruction: Volumetric Representations (NeRF + NeuS)
Lecture 11: Visual Localization
Lecture 12: Simultaneous Localization and Mapping (SLAM) and Autonomous Driving
Lecture 13: Non-Rigid 3D Reconstruction (DynamicFusion, OcclusionFusion)
Lecture 14: Non-Rigid 3D Reconstruction (NeRFies, NeuS2, DynIBaR)

Exercises

Exercise 1: Geometry Fundamentals
Exercise 2: Estimating Poses with COLMAP and Visualizing Camera Rays
Exercise 3: Feature Matching with HOG and Deep-Learned Features
Exercise 4: Semi-Global Matching for Stereo Depth Estimation
Exercise 5: FlowNet for Optical Flow Estimation
Project: Implementing and Running a SLAM Pipeline
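As a flavor of what the SLAM project involves, a minimal sketch of the core bookkeeping step, chaining relative camera poses into a trajectory (assuming poses are given as 4x4 homogeneous matrices; this is an illustration, not the project template):

```python
import numpy as np

def compose_trajectory(relative_poses):
    """Chain relative camera poses into absolute poses.

    relative_poses: list of 4x4 homogeneous transforms, each mapping
    frame i to frame i-1. Returns absolute poses, starting at identity.
    """
    trajectory = [np.eye(4)]
    for T_rel in relative_poses:
        trajectory.append(trajectory[-1] @ T_rel)
    return trajectory

# Two steps of 1 m forward motion along x.
step = np.eye(4)
step[0, 3] = 1.0
traj = compose_trajectory([step, step])
print(traj[-1][0, 3])  # → 2.0
```

A real SLAM front-end estimates each relative pose from feature matches or dense alignment, and a back-end then optimizes the whole trajectory; this sketch only shows the composition step.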

Summer 2024

Lecture:
3D Real World Modeling and Inference

Summary

Computer vision and deep learning have led to many recent technology breakthroughs, and computer vision is today one of the most in-demand fields.

However, most deep-learning approaches for computer vision are constructed in 2D. One can argue that they do not understand the 3D world and have inherent limitations. Therefore, 3D computer vision is becoming increasingly important and will likely lead the next generation of algorithms. This lecture will teach you the fundamental generic 3D representations for creating models of the world and performing inference with these models. The material will set you up to work at the forefront of the new era of 3D computer vision in industry and academia.

The lecture will start with the intuitive differences between 2D and 3D models and then introduce the basic 3D representations, ranging from point clouds to meshes, triplanes and voxel grids. Basic and modern deep learning techniques to train and perform inference on these representations will be introduced. Afterwards, the lecture will cover recent implicit representations based on MLPs and how to encode 3D scenes with them efficiently. Since many modern approaches involve generative models, the basics of generative models, from autodecoders to GANs and finally diffusion models, will be presented.
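One ingredient behind encoding 3D scenes efficiently with MLPs is the positional encoding used by NeRF-style methods: raw coordinates are lifted to sinusoids of increasing frequency so a small network can represent fine detail. A minimal NumPy sketch (the frequency scaling is one common convention, not the only one):

```python
import numpy as np

def positional_encoding(x, num_freqs=4):
    """NeRF-style positional encoding.

    Maps each coordinate to [sin(2^k * pi * x), cos(2^k * pi * x)]
    for k = 0 .. num_freqs - 1, so an MLP fed these features can
    represent high-frequency detail.
    x: (N, D) coordinates. Returns (N, 2 * D * num_freqs).
    """
    freqs = 2.0 ** np.arange(num_freqs) * np.pi   # (num_freqs,)
    angles = x[:, :, None] * freqs                # (N, D, num_freqs)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(x.shape[0], -1)

pts = np.random.rand(8, 3)                 # 8 points in 3D
print(positional_encoding(pts).shape)      # → (8, 24)
```

Without such an encoding, plain MLPs are biased towards smooth functions and blur away geometric detail.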

The lecture will continue with recent deep learning representations, such as signed distance fields (SDFs) and neural radiance fields (NeRFs), and cover advanced generative models for 3D in the form of GANs. Recent language models will be recapped, along with how they relate to modern computer vision. Subsequently, diffusion models that generate 3D scenes from language will be introduced.

The last part of the lecture will cover 3D geometry and material (BRDF) representations and how to perform reconstruction from images. It will then give an overview of how 3D reconstructions of objects can be obtained from just 2D images in the wild and, finally, provide a glimpse of the future of deep learning with algorithms that evolve and learn incrementally.

Requirements

  • As a prerequisite for this course, you must have taken either “High Level Computer Vision” or “Neural Networks: Theory and Implementation” with the computer vision project in the end. You must be familiar with CNNs and how to implement and train them with PyTorch.
  • Having taken “3D Computer Vision” is recommended.
  • “Computer Graphics” and “Image Processing and Computer Vision” are helpful, but not required.

Credit Points

6 ECTS

Lecturers

Prof. Eddy Ilg, Dr. Jan Eric Lenssen

Syllabus

Lecture 1: Intro, 3D Inductive Bias, Graph Neural Networks, Point Clouds and PointNet
Lecture 2: Meshes, Triplanes, Voxel Grids, Sparse Voxel Grids, 3D CNNs
Lecture 3: Mixed Point-Voxel Representations and Continuous Convolutions
Lecture 4: MLPs, Position Encoding, Implicit Representations
Lecture 5: Generative Model Basics: Autodecoders, GANs and Diffusion Models
Lecture 6: Signed Distance Functions, DeepSDF, Occupancy Networks
Lecture 7: Volumetric Rendering, NeRF, Neural Feature Fields
Lecture 8: Generative Models: 3D GANs
Lecture 9: Language Models, from CLIP to LERF
Lecture 10: Generative Models: Diffusion Models
Lecture 11: Bidirectional Reflectance Distribution Functions (BRDFs)
Lecture 12: Material Reconstruction and Relighting
Lecture 13: Object Reconstruction in the Wild
Lecture 14: Incremental Learning

Summer 2023

Seminar:
Photorealistic 3D Reconstruction with Deep Learning

Summary

Computer vision has led to many recent technology breakthroughs and is one of the most in-demand fields. Even more so, 3D computer vision is becoming increasingly important, and the field has recently shown remarkable progress.

In this seminar, we will look at one of the most important aspects of understanding the 3D world: joint reconstruction of geometry and materials. While recent approaches based on NeRF, NeuS and DeepSDF have shown remarkable progress in reconstructing the geometry, material reconstruction is today still largely neglected. We expect that reconstructing materials will be key to the next generation of 3D computer vision algorithms.
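To give a toy illustration of what "material" means here, the simplest material model is the Lambertian BRDF, where outgoing radiance is the albedo times a cosine foreshortening term. The seminar papers handle far richer spatially-varying BRDFs; this sketch only shows the basic shading equation:

```python
import numpy as np

def lambertian_radiance(albedo, normal, light_dir, light_intensity=1.0):
    """Outgoing radiance of a Lambertian surface: (rho / pi) * L * max(0, n.l).

    albedo: (3,) RGB reflectance in [0, 1];
    normal, light_dir: unit 3-vectors.
    """
    cos_term = max(0.0, float(np.dot(normal, light_dir)))
    return np.asarray(albedo) / np.pi * light_intensity * cos_term

# A red surface lit head-on vs. at a 60-degree grazing angle.
n = np.array([0.0, 0.0, 1.0])
head_on = lambertian_radiance([0.8, 0.1, 0.1], n, np.array([0.0, 0.0, 1.0]))
grazing = lambertian_radiance([0.8, 0.1, 0.1], n,
                              np.array([np.sin(np.pi / 3), 0.0,
                                        np.cos(np.pi / 3)]))
print(head_on, grazing)  # grazing is exactly half of head_on
```

Joint reconstruction of geometry and materials means recovering both the normals and the albedo-like quantities (and much more general BRDF parameters) from images alone.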

The seminar will bring you up to speed with the concepts and state-of-the-art literature. After a few introductory lectures, it will continue with presentations reviewing the most important and most recent papers in the field. Overall, the seminar will make you familiar with the key literature and enable you to start research in the field.

Every student is expected to give a 30-minute presentation, followed by a 15-minute discussion, and to hand in a write-up at the end of the seminar. Apart from the technical content, we offer mentoring on how to give compelling presentations. This provides you with the opportunity to learn key skills for job applications and your later career.

The seminar is offered by the Computer Vision and Perception Lab (https://cvmp.cs.uni-saarland.de/), which focuses on building the next generation of machine perception algorithms. The lab is offering Master's thesis, HiWi and PhD positions. Please contact ilg@cs.uni-saarland.de if you are interested.

Requirements

A background in deep learning and computer vision is required. A background in computer graphics is helpful.

Places

12

Syllabus

Introduction to NeRF and Material Estimation

Topic 1: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Topic 2: NeRD: Neural Reflectance Decomposition from Image Collections

Learning Priors for Material Estimation

Topic 3: NeRFactor: Neural Factorization of Shape and Reflectance Under an Unknown Illumination
Topic 4: Neural Radiance Transfer Fields for Relightable Novel-View Synthesis with Global Illumination

Hard Cases and Controlled Conditions

Topic 5: On Joint Estimation of Pose, Geometry and svBRDF from a Handheld Scanner
Topic 6: NEMTO: Neural Environment Matting for Novel View and Relighting Synthesis of Transparent Objects

Reconstruction in the Wild

Topic 7: De-rendering 3D Objects in the Wild
Topic 8: SAMURAI: Shape And Material from Unconstrained Real-world Arbitrary Image collections

Winter 2022/2023

Seminar:
3D Object Representation and Reconstruction
with Machine Learning

Summary

Computer vision has led to many recent technology breakthroughs and is currently one of the most in-demand fields. Due to the statistical and complex nature of the world, it is also one of the hardest disciplines. Deep learning has proven to be the method of choice and has fueled these successes.

However, most deep-learning approaches for computer vision are constructed in 2D. One can argue that they do not understand the 3D world and have inherent limitations. Therefore, 3D computer vision is becoming increasingly important and will likely lead the next generation of algorithms.

The seminar will bring you up to speed with the concepts and state-of-the-art literature of 3D representations and 3D reconstruction with deep learning. After a few introductory lectures, it will continue with presentations reviewing the most important and most recent papers in the field, covering SDF- and occupancy-based representations as well as neural radiance fields (NeRFs). Overall, the seminar will make you familiar with the key literature and enable you to start research work in the field.

Every student is expected to give a 45-minute presentation and to hand in a write-up and an implementation.

The seminar is offered by the new Computer Vision and Perception Lab (https://cvmp.cs.uni-saarland.de/), which focuses on building the next generation of machine perception algorithms that are not rigid but able to adapt to their environment and evolve. The lab is currently offering Master's and PhD positions.

Requirements

This seminar focuses on state-of-the-art research and is intended for advanced students who are already acquainted with machine learning. In particular, a basic understanding of projection and 3D geometry, prior experience with convolutional neural networks, and hands-on experience implementing neural networks with PyTorch are required. It is recommended to have attended the High Level Computer Vision lecture. Prior attendance of Computer Graphics may be helpful, but is not required.

Places

20

Syllabus

Point Processing

Topic 1: PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
Topic 2: PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
Topic 3: Deep Parametric Continuous Convolutional Neural Networks

Voxel Grids

Topic 4: Octree Generating Networks: Efficient Convolutional Architectures for High-resolution 3D Outputs
Topic 5: HoloGAN: Unsupervised Learning of 3D Representations From Natural Images

Mesh Estimation

Topic 6: Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images
Topic 7: Soft Rasterizer: A Differentiable Renderer for Image-based 3D Reasoning

Signed Distance Functions

Topic 8: DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation
Topic 9: Deep Local Shapes: Learning Local SDF Priors for Detailed 3D Reconstruction
Topic 10: Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance

Neural Radiance Fields

Topic 11: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Topic 12: NeRF++: Analyzing and Improving Neural Radiance Fields
Topic 13: NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction
Topic 14: PixelNeRF: Neural Radiance Fields from One or Few Images
Topic 15: Point-NeRF: Point-based Neural Radiance Fields
Topic 16: Plenoxels: Radiance Fields without Neural Networks
Topic 17: Instant Neural Graphics Primitives with a Multiresolution Hash Encoding

Motion Estimation in 3D

Topic 18: FlowNet3D: Learning Scene Flow in 3D Point Clouds

Non-Rigid 3D Reconstruction

Topic 19: Unbiased 4D: Monocular 4D Reconstruction with a Neural Deformation Model

Light Estimation and Editing

Topic 20: Physically-Based Editing of Indoor Scene Lighting from a Single Image
Topic 21: NeRD: Neural Reflectance Decomposition from Image Collections

Object Reconstruction in the Wild

Topic 22: Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency