Category: AI


Magic3D: High-Resolution Text-to-3D Content Creation

NeRF · Pretrained LDM · Text-to-3D

Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, Tsung-Yi Lin

NVIDIA

Portals
  • pdf
  • YouTube
  • Project
  • arXiv
  • Paperswithcode
  • CVF
Abstract

DreamFusion has recently demonstrated the utility of a pre-trained text-to-image diffusion model to optimize Neural Radiance Fields (NeRF), achieving remarkable text-to-3D synthesis results. However, the method has two inherent limitations: (a) extremely slow optimization of NeRF and (b) low-resolution image-space supervision on NeRF, leading to low-quality 3D models with a long processing time. In this paper, we address these limitations by utilizing a two-stage optimization framework. First, we obtain a coarse model using a low-resolution diffusion prior and accelerate with a sparse 3D hash grid structure. Using the coarse representation as the initialization, we further optimize a textured 3D mesh model with an efficient differentiable renderer interacting with a high-resolution latent diffusion model. Our method, dubbed Magic3D, can create high-quality 3D mesh models in 40 minutes, which is 2× faster than DreamFusion (reportedly taking 1.5 hours on average), while also achieving higher resolution. User studies show that 61.7% of raters prefer our approach over DreamFusion. Together with image-conditioned generation capabilities, we provide users with new ways to control 3D synthesis, opening up new avenues for various creative applications.
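The coarse stage of the two-stage loop can be sketched in miniature. Below is a minimal numpy illustration of score distillation sampling (SDS), the mechanism by which a frozen diffusion prior supervises a differentiable renderer; every name here (`render`, `denoiser_eps`, `sds_step`, the linear "renderer" `PROJ`, the toy noising and toy noise predictor) is a hypothetical stand-in, not the authors' code or model.

```python
import numpy as np

rng = np.random.default_rng(0)

D_PARAM, D_IMG = 8, 16
# Stand-in linear "renderer": maps scene parameters to a flat image.
# (The paper uses a NeRF in stage 1 and a textured mesh in stage 2.)
PROJ = rng.standard_normal((D_IMG, D_PARAM)) / np.sqrt(D_PARAM)

def render(params):
    return PROJ @ params

def denoiser_eps(noisy_img, t):
    # Stand-in frozen diffusion prior that predicts the injected noise.
    # A real model is text-conditioned; this toy predictor is for shape only.
    return noisy_img * t

def sds_step(params, lr=0.1):
    t = rng.uniform(0.02, 0.98)          # random diffusion timestep
    eps = rng.standard_normal(D_IMG)     # injected noise
    noisy = render(params) + eps         # toy noising schedule
    eps_hat = denoiser_eps(noisy, t)
    # SDS update: (eps_hat - eps) pulled back through the renderer's
    # Jacobian (here simply PROJ.T), skipping the denoiser's own Jacobian.
    return params - lr * (PROJ.T @ (eps_hat - eps))

params = rng.standard_normal(D_PARAM)    # stage-1 (coarse) initialization
init_norm = np.linalg.norm(render(params))
for _ in range(200):                     # stage 1: low-resolution prior
    params = sds_step(params)
final_norm = np.linalg.norm(render(params))
# Stage 2 would reuse `params` to initialize a textured mesh and run the
# same loop against a high-resolution latent diffusion prior.
```

The point of the two stages is that the cheap loop above converges quickly on a coarse shape, after which the expensive high-resolution prior only has to refine it.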

Related Works

Text-to-image generation; 3D generative models; Text-to-3D generation

Comparisons

DreamFusion

2023 CVPR

Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation

NeRF · Pretrained LDM · Text-to-3D

Haochen Wang, Xiaodan Du, Jiahao Li, Raymond A. Yeh, Greg Shakhnarovich

TTI-Chicago; Purdue University

Portals
  • pdf
  • Project
  • sjc
  • arXiv
  • Paperswithcode
  • CVF
Abstract

A diffusion model learns to predict a vector field of gradients. We propose to apply the chain rule to the learned gradients, back-propagating the score of a diffusion model through the Jacobian of a differentiable renderer, which we instantiate as a voxel radiance field. This setup aggregates 2D scores at multiple camera viewpoints into a 3D score, and repurposes a pretrained 2D model for 3D data generation. We identify a technical challenge of distribution mismatch that arises in this application, and propose a novel estimation mechanism to resolve it. We run our algorithm on several off-the-shelf diffusion image generative models, including the recently released Stable Diffusion trained on the large-scale LAION dataset.
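The chain-rule aggregation can be sketched as follows: each camera's 2D score is pulled back through that camera's renderer Jacobian and the per-view contributions are averaged into one 3D score. This is a minimal numpy sketch under simplifying assumptions (linear "renderers", a unit-Gaussian image prior standing in for the pretrained 2D score); all names are illustrative stubs, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(1)

N_VOX, N_PIX, N_CAMS = 27, 9, 4
# Stand-in linear "renderers", one per camera viewpoint; each matrix
# doubles as the Jacobian d(image)/d(voxels) that the chain rule needs.
CAMS = [rng.standard_normal((N_PIX, N_VOX)) / np.sqrt(N_VOX)
        for _ in range(N_CAMS)]

def score_2d(img):
    # Stand-in pretrained 2D score: the gradient of the log-density of a
    # unit Gaussian image prior, i.e. -img.  (A diffusion denoiser plays
    # this role in the paper.)
    return -img

def score_3d(voxels):
    # Aggregate per-view 2D scores into one 3D score via the chain rule:
    # render each view, score it in 2D, pull the score back through the
    # renderer's Jacobian, and average over cameras.
    g = np.zeros(N_VOX)
    for J in CAMS:
        g += J.T @ score_2d(J @ voxels)
    return g / N_CAMS

voxels = rng.standard_normal(N_VOX)
init_norm = np.linalg.norm(voxels)
for _ in range(100):
    voxels += 0.5 * score_3d(voxels)     # ascend the aggregated 3D score
final_norm = np.linalg.norm(voxels)
```

Under the toy Gaussian prior, ascending the aggregated score drives the voxels toward the prior's mode; with a real diffusion score the same loop drives the 3D scene toward configurations whose renderings look plausible from every viewpoint.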

Related Works

Neural radiance fields (NeRF); 2D-supervised 3D GANs; CLIP-guided, optimization-based 3D generative models; DreamFusion

Comparisons

DreamFusion

2023 CVPR

D-SDS: Debiasing Scores and Prompts of 2D Diffusion for View-consistent Text-to-3D Generation

NeRF · Pretrained LDM · Text-to-3D