4D Gaussian Splatting (4DGS)
Traditional 3D capture methods such as photogrammetry have excelled at preserving spatial detail, offering accurate reconstructions of static objects and environments. These methods have become essential in industries such as fashion, cultural heritage, and product visualization. However, they do not account for the temporal dimension: the way scenes evolve over time.
We explored Gaussian Splatting in a previous article, which can be found here. The field has since moved forward: 4D Gaussian Splatting presents a new direction for scene reconstruction by combining spatial detail with temporal coherence. Rather than capturing only form, it also models motion, deformation and behavioral dynamics in real time. This technique has the potential to complement static photogrammetric workflows and broaden the scope of digital reconstruction. Imagine a video you can step into, moving around the scene and viewing it from any angle.
4DV.ai is a company actively exploring and publishing work in 4D Gaussian Splatting. Their recent release showcases high-fidelity dynamic scene capture that combines temporal consistency with photorealism. They provide interactive 4DGS demos directly on their site, highlighting applications for immersive environments and digital humans. 4DV.ai’s work contributes to advancing practical deployment of 4DGS for real-time graphics and streaming, making them a noteworthy reference in this field.
The Transition from Static to Dynamic Modeling
Photogrammetry uses multiple overlapping images to reconstruct dense surface geometry. It is ideal for high-resolution scanning of objects and is capable of capturing extremely fine detail with calibrated lighting and lenses. However, it assumes a static subject. Any movement introduces artifacts, limiting its applicability for dynamic content.
Gaussian Splatting takes a different approach by replacing traditional mesh structures with a dense field of three-dimensional Gaussian functions. These primitives each carry information such as position, scale, orientation, color and opacity. The scene is rendered by “splatting” these Gaussians onto an image plane using a fast, differentiable rasterizer.
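To make that concrete, the per-Gaussian parameters can be pictured as a small record. Below is a minimal sketch in Python; the field names and the covariance helper are our own illustration, not taken from any particular implementation:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Gaussian3D:
    position: np.ndarray   # (3,) center of the Gaussian in world space
    scale: np.ndarray      # (3,) per-axis extent of the ellipsoid
    rotation: np.ndarray   # (4,) orientation as a unit quaternion (w, x, y, z)
    color: np.ndarray      # (3,) RGB (implementations often use spherical harmonics)
    opacity: float         # blending weight used during splatting

    def covariance(self) -> np.ndarray:
        """Build the 3x3 covariance Sigma = R S S^T R^T that defines
        the ellipsoid the rasterizer projects onto the image plane."""
        w, x, y, z = self.rotation
        R = np.array([
            [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
        ])
        S = np.diag(self.scale)
        return R @ S @ S.T @ R.T
```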
4D Gaussian Splatting extends this method by incorporating the dimension of time. Each Gaussian is no longer static but is assigned a trajectory or deformation path over a sequence. This allows scenes to be reconstructed with continuous motion, rather than as a series of disconnected static frames.
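As a toy illustration of what assigning a trajectory means, the sketch below gives each Gaussian a simple polynomial motion path. Published 4DGS methods typically learn a deformation field (for example, a small network predicting per-time offsets to position, rotation and scale) rather than explicit polynomials, so treat this purely as a conceptual sketch:

```python
import numpy as np

class DynamicGaussian:
    def __init__(self, base_position, velocity, acceleration):
        # Toy trajectory model: x(t) = x0 + v*t + 0.5*a*t^2.
        # Real 4DGS pipelines learn far richer deformations.
        self.x0 = np.asarray(base_position, dtype=float)
        self.v = np.asarray(velocity, dtype=float)
        self.a = np.asarray(acceleration, dtype=float)

    def position_at(self, t: float) -> np.ndarray:
        """Evaluate the Gaussian's center at time t."""
        return self.x0 + self.v * t + 0.5 * self.a * t * t

# Rendering a frame at time t then reduces to splatting every Gaussian
# at its evaluated position, exactly as in the static case.
g = DynamicGaussian([0.0, 1.0, 0.0], velocity=[0.1, 0.0, 0.0],
                    acceleration=[0.0, -0.01, 0.0])
print(g.position_at(2.0))  # center after two time units
```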
Technical Foundations of 4DGS
At its core, 4D Gaussian Splatting is about bringing motion into 3D scene capture. Instead of using complex 3D meshes, it builds scenes from tiny blobs of color and light called Gaussians. These points can shift and change over time, allowing the representation to capture movement such as a person walking, dancing or talking. The system also removes unnecessary points that contribute little to the animation, keeping things efficient. With the help of modern graphics cards, these dynamic scenes can be viewed and explored in real time, looking impressively lifelike.
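That pruning step can be as simple as discarding Gaussians whose opacity is so low that they contribute almost nothing to any frame. A minimal sketch (the threshold is an illustrative value, not taken from a specific paper):

```python
from collections import namedtuple

Gaussian = namedtuple("Gaussian", ["position", "opacity"])

def prune_gaussians(gaussians, opacity_threshold=0.005):
    """Drop Gaussians that are nearly transparent and therefore
    contribute almost nothing to any rendered frame."""
    return [g for g in gaussians if g.opacity > opacity_threshold]

scene = [Gaussian((0, 0, 0), 0.9), Gaussian((1, 0, 0), 0.001)]
print(len(prune_gaussians(scene)))  # 1 (the faint point is removed)
```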
How to Capture 4DGS
There are several pathways to create 4DGS models, each with its own advantages and limitations. Understanding these is important for determining how such methods might integrate with existing photogrammetry-based pipelines.
First is the use of monocular or stereo video. A single handheld camera or a stereo setup can be used to capture dynamic scenes. These video sequences are used to train a model that learns how the scene changes over time. This approach is hardware-efficient and accessible, making it appealing for applications that require mobility. However, it is highly sensitive to camera pose errors, motion blur and occlusions. Accurate reconstruction often requires AI-based post-processing to infer geometry that was not directly observed.
Second is multi-camera capture using calibrated camera arrays. This method involves synchronizing several cameras around a moving subject to achieve dense, simultaneous views. It yields high-quality dynamic reconstructions and is well suited for performance capture. However, it requires expensive hardware and careful calibration. Processing time is longer, and its use is mostly limited to studio environments.
A third approach is synthetic capture using 3D animation software such as Blender. In this method, a virtual camera array is positioned around a digital scene. The animation is rendered out as a series of frames and used to train a 4DGS model. This synthetic approach provides complete control over geometry, lighting, and motion. It is also useful for evaluating and developing new capture pipelines. However, data size remains a bottleneck. Until compression techniques improve or real-time streaming is seamless, longer-duration scenes will likely remain constrained.
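For a sense of what this looks like in practice, a virtual camera ring can be scripted directly in Blender. Below is a rough sketch using Blender's Python API (bpy); the camera count, radius and output paths are arbitrary choices of ours:

```python
import math
import bpy

NUM_CAMERAS = 12
RADIUS = 4.0

# A single empty at the origin that every camera will track.
target = bpy.data.objects.new("ArrayTarget", None)
bpy.context.collection.objects.link(target)

# Place cameras in a ring around the scene, all aimed at the target.
for i in range(NUM_CAMERAS):
    angle = 2 * math.pi * i / NUM_CAMERAS
    bpy.ops.object.camera_add(location=(RADIUS * math.cos(angle),
                                        RADIUS * math.sin(angle), 1.5))
    cam = bpy.context.object
    cam.name = f"ArrayCam_{i:02d}"
    con = cam.constraints.new(type='TRACK_TO')
    con.target = target
    con.track_axis = 'TRACK_NEGATIVE_Z'  # cameras look down local -Z
    con.up_axis = 'UP_Y'

# Render the animation once per camera; together the passes form a
# synchronized multi-view dataset for training a 4DGS model.
scene = bpy.context.scene
for cam in [o for o in bpy.data.objects if o.name.startswith("ArrayCam_")]:
    scene.camera = cam
    scene.render.filepath = f"//renders/{cam.name}/frame_"
    bpy.ops.render.render(animation=True)
```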
botspot’s photogrammetry infrastructure could contribute to this area by enabling synthetic scan data generation and simulation of multi-angle capture using static models and rigged animations.
Compression, Storage and the Path Forward
Despite its advantages, 4DGS currently produces large files. Even with pruning and entropy-based compression, scenes of more than one minute in length become difficult to store and distribute efficiently. This limits the feasibility of 4DGS for web-based or mobile applications. Until more efficient encoding methods are developed, or cloud-based streaming becomes more viable, most content will be restricted to short segments.
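A quick back-of-the-envelope calculation shows why scene length is the bottleneck. The numbers below are illustrative assumptions (typical scenes hold on the order of a million Gaussians, each storing a few dozen bytes of attributes), not measurements:

```python
# Rough storage estimate for an uncompressed dynamic scene.
num_gaussians = 1_000_000    # static scenes already reach ~10^6 Gaussians
bytes_per_gaussian = 60      # position, scale, rotation, color, opacity as float32
fps = 30
duration_s = 60

# Naive per-frame storage, ignoring any temporal compression:
total_bytes = num_gaussians * bytes_per_gaussian * fps * duration_s
print(f"{total_bytes / 1e9:.0f} GB")  # ~108 GB for a single minute
```

Even aggressive pruning and entropy coding only shave this down by a constant factor, which is why exploiting temporal redundancy between frames is the focus of ongoing compression research.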
Looking ahead, a hybrid structure for scene assembly may emerge. Static backgrounds can be reconstructed using traditional photogrammetry, while moving foreground elements are captured or authored using 4DGS. AI-based tools will likely play a role in filling in missing geometry, refining motion paths, and enabling user-driven animation.
Conclusion: Toward 4D-Capable Capture Workflows
4D Gaussian Splatting represents a major step forward in digital reconstruction. It provides a framework for capturing not only how objects appear, but how they change. While our company remains focused on delivering photogrammetric models with exceptional geometric accuracy, we see 4DGS as a complementary technology. Its ability to represent time-varying geometry with real-time rendering could open up new possibilities in areas such as virtual production, AR/VR, and digital fashion.
As 4DGS matures, we expect to see more integrated workflows combining the strengths of photogrammetry with the flexibility of spatiotemporal modeling. This will allow for more immersive, accurate, and adaptive digital content creation.