ML-GSAI/PAD

PoseAwareDiff

This is the official PyTorch implementation of Pose-Aware Diffusion for 3D Generation.

Installation

1. Create Environment

conda create -n pad python=3.10 -y
conda activate pad

2. Install PyTorch

Install the PyTorch build that matches your CUDA version. For CUDA 12.1:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

3. Install Dependencies

pip install -r requirements.txt

4. Install torch-cluster

torch-cluster requires a version matching your PyTorch and CUDA. For PyTorch 2.5.1 + CUDA 12.1:

pip install torch-cluster -f https://data.pyg.org/whl/torch-2.5.1+cu121.html

For other versions, see the PyG installation guide.
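
The wheel index URL above follows a regular pattern, so it can be derived from the installed PyTorch and CUDA versions. The following is a minimal sketch, not part of this repository: the helper name `pyg_wheel_index` is hypothetical, and it assumes the URL pattern shown in the command above. PyG publishes indexes only for certain PyTorch/CUDA combinations, so the generated URL may not exist for every pair.

```python
from typing import Optional


def pyg_wheel_index(torch_version: str, cuda_version: Optional[str]) -> str:
    """Build a PyG wheel index URL for a PyTorch/CUDA pair (hypothetical helper)."""
    # Drop any local build suffix such as "+cu121" from the PyTorch version.
    base = torch_version.split("+", 1)[0]
    if cuda_version is None:
        # CPU-only PyTorch builds report no CUDA version.
        tag = "cpu"
    else:
        # "12.1" becomes "cu121".
        tag = "cu" + cuda_version.replace(".", "")
    return f"https://data.pyg.org/whl/torch-{base}+{tag}.html"
```

With PyTorch installed, `pyg_wheel_index(torch.__version__, torch.version.cuda)` gives a candidate value for the `-f` argument of the `pip install torch-cluster` command.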

5. Install MoGe

MoGe is used for monocular depth estimation to lift a 2D image into a 3D point cloud.

pip install git+https://github.com/microsoft/MoGe.git

Checkpoints

Automatically Downloaded

The following models are downloaded automatically from HuggingFace on first run (~17 GB total). No manual action is needed.

| Model | HuggingFace ID | Purpose | Cache Location |
|---|---|---|---|
| Hunyuan3D-2.1 (DiT + VAE) | tencent/Hunyuan3D-2.1 | Base 3D generation model | ~/.cache/hy3dgen/ |
| DINOv2-Large | facebook/dinov2-large | Image condition encoder (loaded via config YAML) | ~/.cache/huggingface/ |
| MoGe v2 | Ruicheng/moge-2-vitl-normal | Monocular depth estimation | ~/.cache/huggingface/ |
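
Because the automatic downloads total roughly 17 GB, it can help to check free disk space before the first run. This is a minimal sketch using only the standard library; the helper names `free_gb` and `has_room` are hypothetical, and the 17 GB default comes from the approximate figure above.

```python
import shutil
from pathlib import Path


def free_gb(path: Path) -> float:
    """Return free space in GB on the filesystem that holds `path`."""
    probe = Path(path).expanduser()
    # The cache directory may not exist yet, so walk up to the
    # nearest existing parent before querying the filesystem.
    while not probe.exists():
        probe = probe.parent
    return shutil.disk_usage(probe).free / 1e9


def has_room(path: Path, required_gb: float = 17.0) -> bool:
    """True if the filesystem holding `path` has at least `required_gb` free."""
    return free_gb(path) >= required_gb
```

For example, `has_room(Path("~/.cache"))` checks the filesystem that holds both cache locations listed above, assuming they have not been redirected elsewhere.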

Manual Download

The finetuned denoiser weights must be downloaded separately from zzh0000/PAD:

Option A: Using huggingface-cli (recommended)

huggingface-cli download zzh0000/PAD pytorch_model.bin --local-dir checkpoints/

Option B: Using Python

from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="zzh0000/PAD", filename="pytorch_model.bin", local_dir="checkpoints/")

Option C: Direct download

Download pytorch_model.bin from https://huggingface.co/zzh0000/PAD and place it under checkpoints/.
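
Whichever option is used, the file should end up at checkpoints/pytorch_model.bin. A quick sanity check before running inference might look like the sketch below; `checkpoint_ready` is a hypothetical helper, and it only confirms that the file exists and is non-empty, not that the download is complete or uncorrupted.

```python
from pathlib import Path


def checkpoint_ready(path="checkpoints/pytorch_model.bin") -> bool:
    """True if `path` is an existing, non-empty file (hypothetical helper)."""
    p = Path(path)
    return p.is_file() and p.stat().st_size > 0
```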

Usage

Quick Start

# Minimal example; the other models download automatically on first run
python inference.py \
    --image ./assets/example_images/052.png \
    --output ./results/ \
    --ckpt_config configs/objaverse_ptscond.yaml \
    --ckpt checkpoints/pytorch_model.bin

Batch Inference

# Process all images in a directory
python inference.py \
    --image ./assets/example_images/ \
    --output ./results/ \
    --ckpt_config configs/objaverse_ptscond.yaml \
    --ckpt checkpoints/pytorch_model.bin
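
The two invocations above differ only in whether --image points at a file or a directory. When driving inference.py from another script, the argument list can be assembled programmatically. This is a sketch under the assumption that the four flags shown above are the only ones needed; `build_inference_cmd` is a hypothetical helper, not part of the repository.

```python
import sys
from typing import List, Optional


def build_inference_cmd(
    image: str,
    output: str,
    ckpt_config: Optional[str] = None,
    ckpt: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for inference.py using the flags shown in this README."""
    cmd = [sys.executable, "inference.py", "--image", str(image), "--output", str(output)]
    # The checkpoint flags are optional; add each one only when given.
    if ckpt_config is not None:
        cmd += ["--ckpt_config", str(ckpt_config)]
    if ckpt is not None:
        cmd += ["--ckpt", str(ckpt)]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.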

Without Finetuned Weights

You can also run with the base Hunyuan3D-2.1 model only (no finetuning):

python inference.py --image ./assets/example_images/052.png --output ./results/

The output is one .glb 3D mesh file per input image.
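
A .glb file is a binary glTF container that starts with a 12-byte header: the ASCII magic `glTF`, a little-endian version number, and the total file length. The sketch below uses that header to sanity-check generated meshes without any 3D library. The helper names are hypothetical, and `check_outputs` assumes the .glb files are written directly into the --output directory, which this README does not state explicitly.

```python
import struct
from pathlib import Path

GLB_MAGIC = b"glTF"


def read_glb_header(path):
    """Return (version, declared_length) from a GLB file's 12-byte header."""
    with open(path, "rb") as f:
        header = f.read(12)
    if len(header) < 12 or header[:4] != GLB_MAGIC:
        raise ValueError(f"{path} is not a binary glTF (.glb) file")
    # Two little-endian uint32 values follow the magic bytes.
    version, length = struct.unpack("<II", header[4:12])
    return version, length


def check_outputs(output_dir):
    """Map each .glb name to True if its declared length equals its file size."""
    results = {}
    for glb in sorted(Path(output_dir).glob("*.glb")):
        try:
            _, length = read_glb_header(glb)
            results[glb.name] = length == glb.stat().st_size
        except ValueError:
            # Missing magic or truncated header.
            results[glb.name] = False
    return results
```

Calling `check_outputs("./results/")` after a run flags truncated or non-GLB files; it does not validate the mesh content itself.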

ToDo List

  • Release scene generation model weights and inference code
  • Release training code
  • Release data preprocessing code

Acknowledgement

This work builds on many amazing open-source projects. Thanks to all the authors!

License

This project builds upon Hunyuan3D-2.1 and is subject to the Tencent Hunyuan Community License Agreement. Non-commercial use only.
