4D Gaussian Splatting

4D Gaussian Splatting (4DGS) extends 3D Gaussian Splatting (3DGS) to dynamic scenes. Instead of representing a static scene as a set of 3D Gaussians, it uses Gaussian primitives whose position, shape, opacity, and appearance change over time.

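From 3DGS to 4DGS

In 3DGS, a scene is represented by a large number of 3D Gaussians, each with a mean $\boldsymbol{\mu}_i$, covariance $\Sigma_i$, opacity $\alpha_i$, and color $\mathbf{c}_i$.

$$G_i = (\boldsymbol{\mu}_i, \Sigma_i, \alpha_i, \mathbf{c}_i)$$

4DGS adds time dependence to these parameters.

$$G_i(t) = (\boldsymbol{\mu}_i(t), \Sigma_i(t), \alpha_i(t), \mathbf{c}_i(t))$$

Alternatively, each primitive can be a 4D Gaussian, that is, a blob in $(x, y, z, t)$ space that is optimized directly.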
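One way to read the native 4D view: fixing the time $t$ and conditioning a 4D Gaussian on it gives an ordinary 3D Gaussian, using the standard Gaussian conditioning formulas. The following is a minimal sketch in plain Python, not code from any specific 4DGS implementation. The function name `slice_4d_gaussian`, the argument layout, and the use of the temporal weight to scale opacity are assumptions made for illustration.

```python
import math

def slice_4d_gaussian(mean4, cov4, t):
    """Condition a 4D Gaussian over (x, y, z, t) on a fixed time t.

    mean4: [mx, my, mz, mt]; cov4: 4x4 symmetric covariance as nested lists.
    Returns (mean3, cov3, weight). weight in (0, 1] is the unnormalized
    temporal marginal at t, which could be used to scale opacity (assumption).
    """
    mu_t = mean4[3]
    var_t = cov4[3][3]
    cov_xt = [cov4[i][3] for i in range(3)]  # space-time cross covariance
    dt = t - mu_t
    # Conditional mean: mu_xyz + Sigma_xt * (t - mu_t) / Sigma_tt
    mean3 = [mean4[i] + cov_xt[i] * dt / var_t for i in range(3)]
    # Conditional covariance: Sigma_xyz - Sigma_xt Sigma_xt^T / Sigma_tt
    cov3 = [[cov4[i][j] - cov_xt[i] * cov_xt[j] / var_t for j in range(3)]
            for i in range(3)]
    weight = math.exp(-0.5 * dt * dt / var_t)
    return mean3, cov3, weight
```

A nonzero space-time cross covariance makes the conditional mean move linearly with $t$, so a single 4D primitive can already encode simple motion.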

Main design patterns

Pattern	Description
Canonical + deformation	Warps canonical 3D Gaussians to each time step.
Native 4D Gaussians	Treats each Gaussian itself as a spacetime primitive.
Motion scaffold	Controls deformation through control points or a motion graph.
Spacetime features	Attaches time-dependent features to each Gaussian.

Why is it fast?

NeRF-style methods render by ray marching, which requires evaluating many sample points along every ray. 3DGS and 4DGS instead rasterize Gaussian primitives, which makes them well suited to real-time rendering.
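To make the per-pixel cost concrete, the sketch below blends depth-sorted splats front to back for a single pixel and stops once almost no light passes through. It is a simplified illustration of splat compositing, not the tile-based GPU rasterizer used in practice. The name `composite_pixel` and the `min_transmittance` threshold are assumptions made for this example.

```python
def composite_pixel(splats, min_transmittance=1e-4):
    """Front-to-back alpha blending for a single pixel.

    splats: list of (depth, alpha, (r, g, b)), where alpha is the splat's
    opacity already multiplied by its 2D Gaussian falloff at this pixel.
    Returns (color, remaining_transmittance).
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed
    for _, alpha, rgb in sorted(splats, key=lambda s: s[0]):
        weight = transmittance * alpha
        for k in range(3):
            color[k] += weight * rgb[k]
        transmittance *= 1.0 - alpha
        if transmittance < min_transmittance:
            break  # remaining splats are effectively hidden
    return color, transmittance
```

Each pixel touches only the splats that overlap it, and the loop can terminate early, which is one reason this approach needs far less work per pixel than sampling a network along a ray.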

For dynamic scenes as well, 4DGS has the following advantages.

Rendering is fast.
The primitives are explicit, so the scene is easy to edit.
Dense appearance detail is easy to represent.

Deformable 3D Gaussians

Deformable 3D Gaussians combines 3D Gaussians in a canonical space with a deformation field to reconstruct dynamic scenes from monocular video. Presented at CVPR 2024, it drew attention as an approach that improves detail and rendering speed over implicit dynamic rendering methods.

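Challenges

4DGS also has open problems.

Memory usage tends to be high.
The number of Gaussians tends to grow for long videos.
Separating dynamic object motion from camera motion is difficult.
Handling topology changes and occlusions robustly is difficult.
Physical consistency is not guaranteed.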

Dynamic Gaussians in equations

In 4D Gaussian Splatting, each Gaussian's center, rotation, scale, opacity, and color depend on time. For example, the center can be written as follows.

$$\boldsymbol{\mu}_i(t)=\boldsymbol{\mu}_i^0+\Delta\boldsymbol{\mu}_i(t)$$

More generally, a deformation network $D_\theta$ maps the canonical Gaussian parameters to the parameters at time $t$.

$$(\boldsymbol{\mu}_i(t),\mathbf{R}_i(t),\mathbf{s}_i(t),\alpha_i(t),\mathbf{c}_i(t)) =D_\theta(\boldsymbol{\mu}_i^0,\mathbf{f}_i,t)$$

Here, $\mathbf{f}_i$ is a per-Gaussian latent feature. The intuition is that a static Gaussian cloud is deformed and recolored over time to represent a moving scene.
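As a rough illustration of the mapping $D_\theta(\boldsymbol{\mu}_i^0,\mathbf{f}_i,t)$, the sketch below implements a toy deformation function in plain Python. It is a hypothetical stand-in rather than any published architecture: it has one hidden layer, untrained random weights, a sinusoidal time encoding, and it predicts only the position offset $\Delta\boldsymbol{\mu}_i(t)$, leaving rotation, scale, opacity, and color out. The names `time_encoding` and `ToyDeformation` and all hyperparameters are assumptions.

```python
import math
import random

def time_encoding(t, num_freqs=4):
    """Sinusoidal encoding of a scalar time t (assumed normalized to [0, 1])."""
    out = [t]
    for k in range(num_freqs):
        out.append(math.sin((2 ** k) * math.pi * t))
        out.append(math.cos((2 ** k) * math.pi * t))
    return out

class ToyDeformation:
    """Hypothetical stand-in for D_theta: predicts a position offset only."""

    def __init__(self, feat_dim, hidden=16, num_freqs=4, seed=0):
        rng = random.Random(seed)
        self.num_freqs = num_freqs
        in_dim = 3 + feat_dim + 1 + 2 * num_freqs
        # Small random weights; a real model would learn these by training.
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
                   for _ in range(hidden)]
        self.w2 = [[rng.gauss(0.0, 0.1) for _ in range(hidden)]
                   for _ in range(3)]

    def __call__(self, mu0, feat, t):
        x = list(mu0) + list(feat) + time_encoding(t, self.num_freqs)
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.w1]
        delta = [sum(w * hi for w, hi in zip(row, h)) for row in self.w2]
        # mu_i(t) = mu_i^0 + delta_mu_i(t)
        return [m + d for m, d in zip(mu0, delta)]
```

Calling `ToyDeformation(feat_dim=2)([0.0, 0.0, 0.0], [0.1, 0.2], 0.3)` returns the deformed center of one Gaussian at $t = 0.3$. Because the weights are shared across all Gaussians, the per-Gaussian feature $\mathbf{f}_i$ is what lets different Gaussians move differently.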

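Training typically combines a rendering loss over every time step and camera with a temporal smoothness term.

$$\mathcal{L}=\sum_{t,v}\ell(\hat{I}_{t,v},I_{t,v})+ \lambda\sum_{i,t}\|\Delta\boldsymbol{\mu}_i(t+1)-\Delta\boldsymbol{\mu}_i(t)\|^2$$

Here, $\hat{I}_{t,v}$ and $I_{t,v}$ are the rendered and ground-truth images at time $t$ and view $v$, $\ell$ is a per-image rendering loss, and $\lambda$ weights the smoothness term.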
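The loss above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions: the per-image loss $\ell$ is taken to be mean squared error (the text does not specify it), images are flat lists of pixel values, and the name `total_loss` and the default `lam` are hypothetical.

```python
def total_loss(rendered, target, offsets, lam=0.01):
    """Rendering loss plus temporal smoothness on position offsets.

    rendered, target: dicts mapping (t, v) -> flat list of pixel values.
    offsets: offsets[t][i] is the 3D offset of Gaussian i at time index t.
    Assumes mean squared error as the per-image loss.
    """
    photometric = 0.0
    for key, pred in rendered.items():
        gt = target[key]
        photometric += sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(pred)
    smooth = 0.0
    for t in range(len(offsets) - 1):
        # Squared change of each Gaussian's offset between consecutive steps
        for cur, nxt in zip(offsets[t], offsets[t + 1]):
            smooth += sum((b - a) ** 2 for a, b in zip(cur, nxt))
    return photometric + lam * smooth
```

A larger `lam` suppresses frame-to-frame jitter but can also oversmooth fast motion, so it is usually treated as a tuning parameter.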

Main sources

Deformable 3D Gaussians, CVPR 2024: https://openaccess.thecvf.com/content/CVPR2024/html/Yang_Deformable_3D_Gaussians_for_High-Fidelity_Monocular_Dynamic_Scene_Reconstruction_CVPR_2024_paper.html
4D Gaussian Splatting: https://arxiv.org/abs/2402.03307
4D Gaussian Splatting with Native 4D Primitives: https://arxiv.org/abs/2412.20720
