DirectFisheye-GS: Enabling Native Fisheye Input in Gaussian Splatting with Cross-View Joint Optimization

Zhengxian Yang1,4,*,   Fei Xie1,*,   Xutao Xue2,   Rui Zhang1,   Taicheng Huang3,   Yang Liu3,  
Mengqi Ji2,   Tao Yu1,†
1BNRist, Tsinghua University   2Beihang University   3JD.com, Beijing, China   4Shanghai AI Laboratory  
* Equal contribution     † Corresponding author
{zx-yang23, xief22, zhangrui22}@mails.tsinghua.edu.cn    xuexutao@buaa.edu.cn    {huangtaicheng.1, liuyang1605}@jd.com    jimengqi@buaa.edu.cn    ytrock@mail.tsinghua.edu.cn
Teaser image

DirectFisheye-GS enables native fisheye image input for 3DGS training, avoiding the information loss caused by undistortion. It matches, and in many cases surpasses, state-of-the-art performance on various public datasets, demonstrating superior detail preservation and cross-view consistency in both geometry and global illumination.

Abstract

3D Gaussian Splatting (3DGS) has enabled efficient 3D scene reconstruction from everyday images with real-time, high-fidelity rendering, greatly advancing VR/AR applications. Fisheye cameras, with their wider field of view (FOV), promise high-quality reconstructions from fewer inputs and have recently attracted much attention. However, because 3DGS relies on rasterization, most works that use fisheye inputs undistort the images before training, which introduces two problems: 1) black borders at the image edges cause information loss and negate the fisheye's large-FOV advantage; 2) undistortion's stretch-and-interpolate resampling spreads each pixel's value over a larger area, diluting detail density and causing 3DGS to overfit these low-frequency zones, which produces blur and floating artifacts.
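The second problem can be made concrete with a small calculation. The sketch below assumes an ideal equidistant fisheye (image radius r = f·θ), which is a simplification and not necessarily the lens model of any dataset used here. Resampling such an image to a pinhole image (r = f·tan θ) stretches it radially by d(tan θ)/dθ = 1/cos²θ. The function name `undistortion_stretch` is hypothetical and introduced only for illustration.

```python
import numpy as np

def undistortion_stretch(theta):
    """Radial stretch factor at incidence angle `theta` (radians).

    Assumption: ideal equidistant fisheye (r = f * theta) resampled to a
    pinhole image (r = f * tan(theta)). The ratio of the two radial
    derivatives is 1 / cos(theta)^2.
    """
    return 1.0 / np.cos(theta) ** 2

# Stretch grows quickly toward the image periphery.
for deg in (0, 30, 60, 80):
    factor = float(undistortion_stretch(np.radians(deg)))
    print(f"{deg:2d} deg off-axis: radial stretch x{factor:.2f}")
```

Under this assumption, a pixel 60° off-axis is stretched by a factor of 4 in the radial direction, and the factor diverges as the angle approaches 90°, so rays near or beyond 90° cannot be represented in a finite pinhole image at all.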

In this work, we integrate a fisheye camera model into the original 3DGS framework, enabling native fisheye image input for training without preprocessing. Even with correct camera modeling, we observed that reconstructed scenes still exhibit floaters at image edges. Distortion increases toward the periphery, and the original 3DGS scheme of optimizing one randomly selected view per iteration ignores a Gaussian's cross-view correlations, leading to extreme shapes (e.g., oversized or elongated) that degrade reconstruction quality. To address this, we introduce a feature-overlap-driven cross-view joint optimization strategy that establishes consistent geometric and photometric constraints across views; the technique is equally applicable to existing pinhole-camera-based pipelines. Our DirectFisheye-GS matches or surpasses state-of-the-art performance on public datasets.

Video

Method

(a) Common fisheye camera projection models.

(b) Illustration of the 3DGS single-view training paradigm and our proposed cross-view joint optimization strategy.

We propose DirectFisheye-GS, a novel framework that enables native fisheye inputs in 3DGS while enhancing geometric and photometric consistency through cross-view joint optimization. We integrate the Kannala-Brandt fisheye projection model [1] into the 3DGS pipeline, eliminating the need for undistortion preprocessing and preserving the efficiency of rasterization-based rendering. This not only ensures full compatibility with existing 3DGS viewers and commercial tools, but also retains the original spatial relationships among neighboring rays, an essential condition for multi-view consistency constraints. We then introduce a cross-view joint optimization strategy that adaptively groups training views based on feature overlap and viewpoint divergence. By simultaneously optimizing Gaussians across correlated views, our method enforces geometric consistency and mitigates shape irregularities, significantly improving reconstruction quality without sacrificing efficiency.
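For readers unfamiliar with the Kannala-Brandt model, the following is a minimal NumPy sketch of its forward projection. It assumes the widely used OpenCV-style parameterization with four distortion coefficients; the parameterization actually used inside DirectFisheye-GS is not specified on this page, and the function name `project_kannala_brandt` and the example intrinsics are hypothetical.

```python
import numpy as np

def project_kannala_brandt(points_cam, fx, fy, cx, cy, k=(0.0, 0.0, 0.0, 0.0)):
    """Project Nx3 camera-space points to pixel coordinates.

    Assumption: OpenCV-style Kannala-Brandt parameterization,
    theta_d = theta * (1 + k1*theta^2 + k2*theta^4 + k3*theta^6 + k4*theta^8),
    where theta is the angle between the ray and the optical axis (+z).
    """
    pts = np.asarray(points_cam, dtype=np.float64)
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]
    r = np.hypot(x, y)            # distance from the optical axis
    theta = np.arctan2(r, z)      # incidence angle; well defined even for z <= 0
    t2 = theta * theta
    k1, k2, k3, k4 = k
    theta_d = theta * (1.0 + t2 * (k1 + t2 * (k2 + t2 * (k3 + t2 * k4))))
    # Points exactly on the optical axis (r == 0) map to the principal point.
    scale = np.divide(theta_d, r, out=np.zeros_like(r), where=r > 0)
    u = fx * scale * x + cx
    v = fy * scale * y + cy
    return np.stack([u, v], axis=1)

# Hypothetical intrinsics; all distortion coefficients zero (equidistant case).
points = np.array([
    [0.0, 0.0, 1.0],   # on the optical axis
    [1.0, 0.0, 0.0],   # 90 degrees off-axis: no finite pinhole projection exists
])
uv = project_kannala_brandt(points, fx=300.0, fy=300.0, cx=400.0, cy=300.0)
print(uv)
```

Because the image radius depends on the angle θ rather than on tan θ, the ray at 90° off-axis still lands at a finite pixel (here at a horizontal offset of fx·π/2 from the principal point), which is why such a model can keep the full fisheye FOV without undistortion.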

Results

Visual Comparisons

Ablation Study

Ablation study content coming soon.

VR Demo

VR demo content coming soon.

BibTeX

@misc{yang2026directfisheyegsenablingnativefisheye,
      title={DirectFisheye-GS: Enabling Native Fisheye Input in Gaussian Splatting with Cross-View Joint Optimization},
      author={Zhengxian Yang and Fei Xie and Xutao Xue and Rui Zhang and Taicheng Huang and Yang Liu and Mengqi Ji and Tao Yu},
      year={2026},
      eprint={2604.00648},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2604.00648},
}

References

[1] Kannala, Juho and Brandt, Sami S. A generic camera model and calibration method for conventional, wide-angle, and fish-eye lenses. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(8):1335-1340, 2006.

[2] Liao, Zimu, Chen, Siyan, Fu, Rong, Wang, Yi, Su, Zhongling, Luo, Hao, Ma, Li, Xu, Linning, Dai, Bo, Li, Hengjie, et al. Fisheye-GS: Lightweight and Extensible Gaussian Splatting Module for Fisheye Cameras. arXiv preprint arXiv:2409.04751, 2024.

[3] Wu, Qi, Esturo, Janick Martinez, Mirzaei, Ashkan, Moenne-Loccoz, Nicolas, and Gojcic, Zan. 3DGUT: Enabling Distorted Cameras and Secondary Rays in Gaussian Splatting. arXiv preprint arXiv:2412.12507, 2024.

[4] Deng, Youming, Xian, Wenqi, Yang, Guandao, Guibas, Leonidas, Wetzstein, Gordon, Marschner, Steve, and Debevec, Paul. Self-Calibrating Gaussian Splatting for Large Field of View Reconstruction. arXiv preprint arXiv:2502.09563, 2025.