Fix HunyuanVideo 1.5 I2V by preprocessing image at pixel resolution i…#13440

Merged
yiyixuxu merged 1 commit into huggingface:main from akshan-main:fix-hv15-i2v-image-conditioning
Apr 10, 2026

Conversation

@akshan-main (Contributor) commented Apr 10, 2026

…nstead of latent resolution

What does this PR do?

Fixes #13439

prepare_cond_latents_and_mask shadows the pixel height/width parameters with latent dims from latents.shape (line 614):

batch, channels, frames, height, width = latents.shape  # overwrites pixel h/w with latent h/w

This causes _get_image_latents to preprocess the conditioning image at latent resolution (~30x44) instead of pixel resolution (480x704).

Renamed to latent_height/latent_width so the original pixel dims are preserved for image preprocessing.
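
The shadowing can be reproduced in isolation. The sketch below is not the diffusers source: `FakeLatents`, `preprocess_size`, `prepare_buggy`, and `prepare_fixed` are hypothetical stand-ins, the channel and frame counts are arbitrary, and the 16x spatial compression factor is assumed only so that the numbers match the 480x704 and 30x44 figures quoted above.

```python
from typing import NamedTuple, Tuple


class FakeLatents(NamedTuple):
    """Hypothetical stand-in for a latent tensor; only .shape is used."""
    shape: Tuple[int, int, int, int, int]  # (batch, channels, frames, height, width)


def preprocess_size(height: int, width: int) -> Tuple[int, int]:
    """Stand-in for image preprocessing: reports the resize target it was given."""
    return (height, width)


def prepare_buggy(latents, height, width):
    # Unpacking rebinds height/width, so the pixel dims passed in are lost.
    batch, channels, frames, height, width = latents.shape
    return preprocess_size(height, width)


def prepare_fixed(latents, height, width):
    # Distinct names keep the pixel dims intact for image preprocessing.
    batch, channels, frames, latent_height, latent_width = latents.shape
    return preprocess_size(height, width)


# Assumed 16x spatial compression: 480x704 pixels -> 30x44 latents.
latents = FakeLatents(shape=(1, 32, 31, 480 // 16, 704 // 16))
buggy_size = prepare_buggy(latents, height=480, width=704)
fixed_size = prepare_fixed(latents, height=480, width=704)
print(buggy_size, fixed_size)
```

With these assumed shapes, the buggy variant hands the latent dims to the preprocessing step, while the renamed variant passes the caller's pixel dims through unchanged.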

The original Tencent implementation (HunyuanVideo-1.5) resizes at pixel resolution.

Found while working on the modular pipeline for HunyuanVideo 1.5 (#13389).

Before submitting

Who can review?

@yiyixuxu @DN6 @sayakpaul

github-actions bot added the pipelines and size/S (PR with diff < 50 LOC) labels Apr 10, 2026

@yiyixuxu (Collaborator) left a comment

thanks!

@HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@akshan-main (Contributor, Author) commented

These failures are not related to my PRs. Happened with #13406 too

yiyixuxu merged commit 87beae7 into huggingface:main Apr 10, 2026
10 of 14 checks passed
Labels

pipelines size/S PR with diff < 50 LOC

Development

Successfully merging this pull request may close these issues.

HunyuanVideo 1.5 I2V image conditioning preprocessed at latent resolution instead of pixel resolution

3 participants