Program Hang/Stuck after using F.interpolate? VAE Decode Step of HunyuanVideo model. #8470

radna0 · 2024-12-09T15:16:44Z

🐛 Bug

I'm trying out the HunyuanVideo model on TPUs. After adapting the code where needed for PyTorch/XLA, everything seems to work until the VAE decode at the end of the program, which gets stuck after calling F.interpolate.

To Reproduce

Steps to reproduce the behavior:

I tested generation at both 256x160x49 and 544x960x49; the program hangs in both cases.

Line that uses F.interpolate:

https://github.com/Tencent/HunyuanVideo/blob/c4a9d7708dac7c930181c9e147d0092dffa36f92/hyvideo/vae/unet_causal_3d_blocks.py#L159

LOGS:

Execution Analysis: ================================================================================
Execution Analysis: Execution Cause
Execution Analysis:   most likely user code trying to access tensor value before mark_step
Execution Analysis: Graph Info:
Execution Analysis:   Graph Hash: c27015a9b7ba81e4f844a13b392de491
Execution Analysis:   Number of Graph Inputs: 58
Execution Analysis:   Number of Graph Outputs: 1
Execution Analysis: Python Frame Triggered Execution:
Execution Analysis:   interpolate (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/functional.py:4646)
Execution Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/unet_causal_3d_blocks.py:159)
Execution Analysis:   _call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1747)
Execution Analysis:   _wrapped_call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1736)
Execution Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/unet_causal_3d_blocks.py:762)
Execution Analysis:   _call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1747)
Execution Analysis:   _wrapped_call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1736)
Execution Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/vae.py:299)
Execution Analysis:   ..........
Execution Analysis: --------------------------------------------------------------------------------
Execution Analysis: ================================================================================

Compilation Analysis: ================================================================================
Compilation Analysis: Compilation Cause
Compilation Analysis:   most likely user code trying to access tensor value before mark_step
Compilation Analysis: Graph Info:
Compilation Analysis:   Graph Hash: 3dfc3198fbefbf8c55c2d4483dbe39ad
Compilation Analysis:   Number of Graph Inputs: 85
Compilation Analysis:   Number of Graph Outputs: 1
Compilation Analysis: Python Frame Triggered Execution:
Compilation Analysis:   interpolate (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/functional.py:4646)
Compilation Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/unet_causal_3d_blocks.py:159)
Compilation Analysis:   _call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1747)
Compilation Analysis:   _wrapped_call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1736)
Compilation Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/unet_causal_3d_blocks.py:762)
Compilation Analysis:   _call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1747)
Compilation Analysis:   _wrapped_call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1736)
Compilation Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/vae.py:299)
Compilation Analysis:   ..........
Compilation Analysis: --------------------------------------------------------------------------------
Compilation Analysis: ================================================================================
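
To see how many distinct graphs get compiled and which Python frame triggers each one, the debug output above can be summarized with a small script. This is only a sketch that assumes the line layout shown in these logs; `summarize_compilations` and `SAMPLE` are hypothetical names, not part of torch_xla.

```python
from collections import Counter
import re

# Matches the "Graph Hash" line of a Compilation Analysis block only.
HASH_RE = re.compile(r"^Compilation Analysis:\s+Graph Hash: ([0-9a-f]+)")
FRAME_HEADER = "Compilation Analysis: Python Frame Triggered Execution:"


def summarize_compilations(log_text):
    """Return (compiled graph hashes in order, Counter of innermost trigger frames)."""
    hashes = []
    frames = Counter()
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        match = HASH_RE.match(line)
        if match:
            hashes.append(match.group(1))
        # The line right after the header names the innermost Python frame.
        if line.strip() == FRAME_HEADER and i + 1 < len(lines):
            frame_name = lines[i + 1].split(":", 1)[1].split()[0]
            frames[frame_name] += 1
    return hashes, frames


# A few lines copied from the logs in this issue.
SAMPLE = """\
Execution Analysis:   Graph Hash: c27015a9b7ba81e4f844a13b392de491
Execution Analysis: Python Frame Triggered Execution:
Execution Analysis:   interpolate (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/functional.py:4646)
Compilation Analysis:   Graph Hash: 3dfc3198fbefbf8c55c2d4483dbe39ad
Compilation Analysis: Python Frame Triggered Execution:
Compilation Analysis:   interpolate (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/functional.py:4646)
Compilation Analysis:   Graph Hash: d019d1645a98ecb2aec31244de79935a
Compilation Analysis: Python Frame Triggered Execution:
Compilation Analysis:   sync (/home/kojoe/.local/lib/python3.10/site-packages/torch_xla/torch_xla.py:69)
"""

hashes, frames = summarize_compilations(SAMPLE)
print(hashes)
print(frames)
```

Each new hash is a separate compilation, so a long list of hashes whose trigger frame is `interpolate` would point at that call as the place where the graph keeps getting cut.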

Expected behavior

Environment

Reproducible on XLA backend [CPU/TPU/CUDA]: TPU
torch_xla version: 2.6.0.dev20241105

pip3 install torch==2.6.0.dev20241105+cpu torchvision torchaudio==2.5.0.dev20241105+cpu --index-url https://download.pytorch.org/whl/nightly/cpu
pip3 install https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.6.0.dev20241105-cp310-cp310-linux_x86_64.whl

Additional context


radna0 · 2024-12-09T15:25:55Z

Update: after a very long time, the 256x160x49 run proceeded to its next iteration. I will keep this issue open, since this step still takes a very long time to run.

FULL LOGS:

Execution Analysis: ================================================================================
Execution Analysis: Execution Cause
Execution Analysis:   most likely user code trying to access tensor value before mark_step
Execution Analysis: Graph Info:
Execution Analysis:   Graph Hash: c27015a9b7ba81e4f844a13b392de491
Execution Analysis:   Number of Graph Inputs: 58
Execution Analysis:   Number of Graph Outputs: 1
Execution Analysis: Python Frame Triggered Execution:
Execution Analysis:   interpolate (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/functional.py:4646)
Execution Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/unet_causal_3d_blocks.py:159)
Execution Analysis:   _call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1747)
Execution Analysis:   _wrapped_call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1736)
Execution Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/unet_causal_3d_blocks.py:762)
Execution Analysis:   _call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1747)
Execution Analysis:   _wrapped_call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1736)
Execution Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/vae.py:299)
Execution Analysis:   ..........
Execution Analysis: --------------------------------------------------------------------------------
Execution Analysis: ================================================================================

Compilation Analysis: ================================================================================
Compilation Analysis: Compilation Cause
Compilation Analysis:   most likely user code trying to access tensor value before mark_step
Compilation Analysis: Graph Info:
Compilation Analysis:   Graph Hash: 3dfc3198fbefbf8c55c2d4483dbe39ad
Compilation Analysis:   Number of Graph Inputs: 85
Compilation Analysis:   Number of Graph Outputs: 1
Compilation Analysis: Python Frame Triggered Execution:
Compilation Analysis:   interpolate (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/functional.py:4646)
Compilation Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/unet_causal_3d_blocks.py:159)
Compilation Analysis:   _call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1747)
Compilation Analysis:   _wrapped_call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1736)
Compilation Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/unet_causal_3d_blocks.py:762)
Compilation Analysis:   _call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1747)
Compilation Analysis:   _wrapped_call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1736)
Compilation Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/vae.py:299)
Compilation Analysis:   ..........
Compilation Analysis: --------------------------------------------------------------------------------
Compilation Analysis: ================================================================================

Post Compilation Analysis: ================================================================================
Post Compilation Analysis: Graph input size: 0.087411 GB
Post Compilation Analysis: Graph output size: 0.014649 GB
Post Compilation Analysis: Aliased Input size: 0.000000 GB
Post Compilation Analysis: Intermediate tensor size: 0.398816 GB
Post Compilation Analysis: Compiled program size: 0.465443 GB
Post Compilation Analysis: --------------------------------------------------------------------------------
Post Compilation Analysis: ================================================================================
tcmalloc: large alloc 1610612736 bytes == 0x11e4ba4000 @  0x7f8d200a4680 0x7f8d200c5824 0x7f8c1bf0ad59 0x7f8c100217bf 0x7f8c1bb11898 0x7f8c1bb0f1d0 0x7f8c101471d7 0x7f8c1ad7aa4c 0x7f8c1ad7af69 0x7f8c1ad7b6ec 0x7f8c1bab72e0 0x7f8c1bab6421 0x7f8c10d3f101 0x7f8c10d3ee89 0x7f8c10d322d8 0x7f8c10d33782 0x7f8c10cf9076 0x7f8c10c66ca9 0x7f8c5b183d7a 0x7f8c5b17e90a 0x7f8c55a22bba 0x7f8c55761c68 0x7f8c557697f9 0x7f8c55776c14 0x7f8c55777889 0x7f8c554bcb75 0x7f8c554c2947 0x7f8c557b3c48 0x7f8d0aea48dc 0x7f8d0a58bcc6 0x7f8d0a58c42e

Execution Analysis: ================================================================================
Execution Analysis: Execution Cause
Execution Analysis:   most likely user code trying to access tensor value before mark_step
Execution Analysis: Graph Info:
Execution Analysis:   Graph Hash: 3dfc3198fbefbf8c55c2d4483dbe39ad
Execution Analysis:   Number of Graph Inputs: 85
Execution Analysis:   Number of Graph Outputs: 1
Execution Analysis: Python Frame Triggered Execution:
Execution Analysis:   interpolate (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/functional.py:4646)
Execution Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/unet_causal_3d_blocks.py:159)
Execution Analysis:   _call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1747)
Execution Analysis:   _wrapped_call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1736)
Execution Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/unet_causal_3d_blocks.py:762)
Execution Analysis:   _call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1747)
Execution Analysis:   _wrapped_call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1736)
Execution Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/vae.py:299)
Execution Analysis:   ..........
Execution Analysis: --------------------------------------------------------------------------------
Execution Analysis: ================================================================================

Compilation Analysis: ================================================================================
Compilation Analysis: Compilation Cause
Compilation Analysis:   most likely user code trying to access tensor value before mark_step
Compilation Analysis: Graph Info:
Compilation Analysis:   Graph Hash: 4e055b5c19f9dc32e51e9f89cef5192d
Compilation Analysis:   Number of Graph Inputs: 114
Compilation Analysis:   Number of Graph Outputs: 1
Compilation Analysis: Python Frame Triggered Execution:
Compilation Analysis:   interpolate (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/functional.py:4646)
Compilation Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/unet_causal_3d_blocks.py:159)
Compilation Analysis:   _call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1747)
Compilation Analysis:   _wrapped_call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1736)
Compilation Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/unet_causal_3d_blocks.py:762)
Compilation Analysis:   _call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1747)
Compilation Analysis:   _wrapped_call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1736)
Compilation Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/vae.py:299)
Compilation Analysis:   ..........
Compilation Analysis: --------------------------------------------------------------------------------
Compilation Analysis: ================================================================================

Post Compilation Analysis: ================================================================================
Post Compilation Analysis: Graph input size: 0.560742 GB
Post Compilation Analysis: Graph output size: 0.029298 GB
Post Compilation Analysis: Aliased Input size: 0.000000 GB
Post Compilation Analysis: Intermediate tensor size: 0.870249 GB
Post Compilation Analysis: Compiled program size: 0.476917 GB
Post Compilation Analysis: --------------------------------------------------------------------------------
Post Compilation Analysis: ================================================================================

Execution Analysis: ================================================================================
Execution Analysis: Execution Cause
Execution Analysis:   most likely user code trying to access tensor value before mark_step
Execution Analysis: Graph Info:
Execution Analysis:   Graph Hash: 4e055b5c19f9dc32e51e9f89cef5192d
Execution Analysis:   Number of Graph Inputs: 114
Execution Analysis:   Number of Graph Outputs: 1
Execution Analysis: Python Frame Triggered Execution:
Execution Analysis:   interpolate (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/functional.py:4646)
Execution Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/unet_causal_3d_blocks.py:159)
Execution Analysis:   _call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1747)
Execution Analysis:   _wrapped_call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1736)
Execution Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/unet_causal_3d_blocks.py:762)
Execution Analysis:   _call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1747)
Execution Analysis:   _wrapped_call_impl (/home/kojoe/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1736)
Execution Analysis:   forward (/home/kojoe/HunyuanVideo/hyvideo/vae/vae.py:299)
Execution Analysis:   ..........
Execution Analysis: --------------------------------------------------------------------------------
Execution Analysis: ================================================================================
tcmalloc: large alloc 2013265920 bytes == 0x1481d14000 @  0x7f8d200a4680 0x7f8d200c5824 0x7f8d200c5b8a 0x7f8d087f2194 0x7f8d087b4313 0x7f8d0a17cd59 0x7f8d0a177c04 0x7f8d0a177c4f 0x7f8d0a177cba 0x7f8d0a177db8 0x7f8d0b5d37cb 0x7f8d0b631412 0x7f8d0aa02885 0x7f8d0b5fec1f 0x7f8d0b74fd0d 0x7f8d0a58c5dd 0x7f8c554b992c 0x7f8d1e1d2ea8 0x7f8d0a1a399d 0x7f8d0ae7df57 0x7f8d0d3df1df 0x7f8d0d3df86f 0x7f8d0af0f243 0x7f8d0aa00250 0x7f8d0b99db3d 0x7f8d0b99dbc8 0x7f8d0af0bfcc 0x7f8d1dc18e37 0x53b8c9 0x62a834 0x5aed50
tcmalloc: large alloc 2013265920 bytes == 0x14f9d14000 @  0x7f8d200a4680 0x7f8d200c5824 0x7f8d200c5b8a 0x7f8d087f2194 0x7f8d087b4313 0x7f8d0a17ec1d 0x7f8d0a178053 0x7f8d0a1780a8 0x7f8d0a17812d 0x7f8d0a94c66a 0x7f8d0b5dd5a6 0x7f8d0b5dd610 0x7f8d0b1e29a3 0x7f8d0b5a26ee 0x7f8d0b22d429 0x7f8d0a557117 0x7f8d0a916cb6 0x7f8d0b77d94b 0x7f8d0ae60785 0x7f8d0b5a5803 0x7f8d0aef01eb 0x7f8d0a91226d 0x7f8d0b98f801 0x7f8d0b026814 0x7f8d0b5a593e 0x7f8d0b09fd00 0x7f8c554ba58f 0x7f8c555b2f5d 0x7f8c554cfb6d 0x7f8c554e6591 0x7f8c557b3c25
Finish vae decoding
Start cast

Compilation Analysis: ================================================================================
Compilation Analysis: Compilation Cause
Compilation Analysis:   most likely user code trying to access tensor value before mark_step
Compilation Analysis: Graph Info:
Compilation Analysis:   Graph Hash: 88ab009cf184aaa7472976f8ec05c45c
Compilation Analysis:   Number of Graph Inputs: 149
Compilation Analysis:   Number of Graph Outputs: 1
Compilation Analysis: Python Frame Triggered Execution:
Compilation Analysis:   __call__ (/home/kojoe/HunyuanVideo/hyvideo/diffusion/pipelines/pipeline_hunyuan_video.py:1153)
Compilation Analysis:   decorate_context (/home/kojoe/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:116)
Compilation Analysis:   predict (/home/kojoe/HunyuanVideo/hyvideo/inference.py:706)
Compilation Analysis:   decorate_context (/home/kojoe/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:116)
Compilation Analysis:   main (/home/kojoe/HunyuanVideo/sample_video.py:32)
Compilation Analysis:   <module> (/home/kojoe/HunyuanVideo/sample_video.py:57)
Compilation Analysis: --------------------------------------------------------------------------------
Compilation Analysis: ================================================================================

Post Compilation Analysis: ================================================================================
Post Compilation Analysis: Graph input size: 2.436933 GB
Post Compilation Analysis: Graph output size: 0.011216 GB
Post Compilation Analysis: Aliased Input size: 0.000000 GB
Post Compilation Analysis: Intermediate tensor size: 7.477266 GB
Post Compilation Analysis: Compiled program size: 0.491248 GB
Post Compilation Analysis: --------------------------------------------------------------------------------
Post Compilation Analysis: ================================================================================

Execution Analysis: ================================================================================
Execution Analysis: Execution Cause
Execution Analysis:   most likely user code trying to access tensor value before mark_step
Execution Analysis: Graph Info:
Execution Analysis:   Graph Hash: 88ab009cf184aaa7472976f8ec05c45c
Execution Analysis:   Number of Graph Inputs: 149
Execution Analysis:   Number of Graph Outputs: 1
Execution Analysis: Python Frame Triggered Execution:
Execution Analysis:   __call__ (/home/kojoe/HunyuanVideo/hyvideo/diffusion/pipelines/pipeline_hunyuan_video.py:1153)
Execution Analysis:   decorate_context (/home/kojoe/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:116)
Execution Analysis:   predict (/home/kojoe/HunyuanVideo/hyvideo/inference.py:706)
Execution Analysis:   decorate_context (/home/kojoe/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:116)
Execution Analysis:   main (/home/kojoe/HunyuanVideo/sample_video.py:32)
Execution Analysis:   <module> (/home/kojoe/HunyuanVideo/sample_video.py:57)
Execution Analysis: --------------------------------------------------------------------------------
Execution Analysis: ================================================================================
Finish cast

Compilation Analysis: ================================================================================
Compilation Analysis: Compilation Cause
Compilation Analysis:   user mark_step
Compilation Analysis: Graph Info:
Compilation Analysis:   Graph Hash: d019d1645a98ecb2aec31244de79935a
Compilation Analysis:   Number of Graph Inputs: 2
Compilation Analysis:   Number of Graph Outputs: 1
Compilation Analysis: Python Frame Triggered Execution:
Compilation Analysis:   sync (/home/kojoe/.local/lib/python3.10/site-packages/torch_xla/torch_xla.py:69)
Compilation Analysis:   __call__ (/home/kojoe/HunyuanVideo/hyvideo/diffusion/pipelines/pipeline_hunyuan_video.py:1156)
Compilation Analysis:   decorate_context (/home/kojoe/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:116)
Compilation Analysis:   predict (/home/kojoe/HunyuanVideo/hyvideo/inference.py:706)
Compilation Analysis:   decorate_context (/home/kojoe/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:116)
Compilation Analysis:   main (/home/kojoe/HunyuanVideo/sample_video.py:32)
Compilation Analysis:   <module> (/home/kojoe/HunyuanVideo/sample_video.py:57)
Compilation Analysis: --------------------------------------------------------------------------------
Compilation Analysis: ================================================================================

Post Compilation Analysis: ================================================================================
Post Compilation Analysis: Graph input size: 0.000398 GB
Post Compilation Analysis: Graph output size: 0.000398 GB
Post Compilation Analysis: Aliased Input size: 0.000000 GB
Post Compilation Analysis: Intermediate tensor size: 0.000000 GB
Post Compilation Analysis: Compiled program size: 0.000029 GB
Post Compilation Analysis: --------------------------------------------------------------------------------
Post Compilation Analysis: ================================================================================

Execution Analysis: ================================================================================
Execution Analysis: Execution Cause
Execution Analysis:   user mark_step
Execution Analysis: Graph Info:
Execution Analysis:   Graph Hash: d019d1645a98ecb2aec31244de79935a
Execution Analysis:   Number of Graph Inputs: 2
Execution Analysis:   Number of Graph Outputs: 1
Execution Analysis: Python Frame Triggered Execution:
Execution Analysis:   sync (/home/kojoe/.local/lib/python3.10/site-packages/torch_xla/torch_xla.py:69)
Execution Analysis:   __call__ (/home/kojoe/HunyuanVideo/hyvideo/diffusion/pipelines/pipeline_hunyuan_video.py:1156)
Execution Analysis:   decorate_context (/home/kojoe/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:116)
Execution Analysis:   predict (/home/kojoe/HunyuanVideo/hyvideo/inference.py:706)
Execution Analysis:   decorate_context (/home/kojoe/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:116)
Execution Analysis:   main (/home/kojoe/HunyuanVideo/sample_video.py:32)
Execution Analysis:   <module> (/home/kojoe/HunyuanVideo/sample_video.py:57)
Execution Analysis: --------------------------------------------------------------------------------
Execution Analysis: ================================================================================
pt-xla-profiler: CompileTime too slow: longest instance took 11m17s089ms586.550us. Please open a GitHub issue with the graph dump for our team to optimize.
pt-xla-profiler: Op(s) not lowered: aten::nonzero, aten::upsample_nearest3d,  Please open a GitHub issue with the above op lowering requests.
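
Since the profiler reports `aten::upsample_nearest3d` as not lowered, one possible workaround is to express the nearest-neighbour upsample with plain indexing, `expand` and `reshape`. The sketch below assumes the `F.interpolate` call uses `mode="nearest"` with integer scale factors on a 5-D `(B, C, T, H, W)` tensor; `nearest_upsample_3d` is a hypothetical helper, and whether it actually avoids the fallback on a given torch_xla build would need to be confirmed with the profiler. The example itself runs on CPU and only checks numerical equivalence.

```python
import torch
import torch.nn.functional as F


def nearest_upsample_3d(x, scale=(2, 2, 2)):
    """Nearest-neighbour upsample of a (B, C, T, H, W) tensor by integer factors.

    Built from indexing, expand and reshape only, so it does not call
    F.interpolate (and therefore not aten::upsample_nearest3d).
    """
    b, c, t, h, w = x.shape
    st, sh, sw = scale
    # Add a size-1 axis after each of T, H, W: (B, C, T, 1, H, 1, W, 1).
    x = x[:, :, :, None, :, None, :, None]
    # Broadcast each new axis to its scale factor (no data copy yet).
    x = x.expand(b, c, t, st, h, sh, w, sw)
    # Merge every (size, factor) pair; output index j maps to input j // factor.
    return x.reshape(b, c, t * st, h * sh, w * sw)


x = torch.arange(2 * 3 * 2 * 4 * 4, dtype=torch.float32).reshape(2, 3, 2, 4, 4)
reference = F.interpolate(x, scale_factor=(2.0, 2.0, 2.0), mode="nearest")
result = nearest_upsample_3d(x, (2, 2, 2))
print(result.shape, torch.equal(result, reference))
```

For non-integer scale factors or other interpolation modes this replacement does not apply.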

radna0 · 2024-12-11T07:53:01Z

WARNING:pt-xla-profiler:================================================================================
WARNING:pt-xla-profiler:Unlowered Op usage summary (more of these ops, lower performance)
WARNING:pt-xla-profiler:Note: _local_scalar_dense typically indicates CPU context access
WARNING:pt-xla-profiler:--------------------------------------------------------------------------------
WARNING:pt-xla-profiler:FRAME (count=2):
WARNING:pt-xla-profiler:Unlowered Op: "xla_fallback"
WARNING:pt-xla-profiler:
WARNING:pt-xla-profiler:
WARNING:pt-xla-profiler:FRAME (count=1):
WARNING:pt-xla-profiler:  index_for_timestep (/home/kojoe/HunyuanVideo/hyvideo/diffusion/schedulers/scheduling_flow_match_discrete.py:162)
WARNING:pt-xla-profiler:  _init_step_index (/home/kojoe/HunyuanVideo/hyvideo/diffusion/schedulers/scheduling_flow_match_discrete.py:176)
WARNING:pt-xla-profiler:  step (/home/kojoe/HunyuanVideo/hyvideo/diffusion/schedulers/scheduling_flow_match_discrete.py:234)
WARNING:pt-xla-profiler:  __call__ (/home/kojoe/HunyuanVideo/hyvideo/diffusion/pipelines/pipeline_hunyuan_video.py:1103)
WARNING:pt-xla-profiler:  decorate_context (/home/kojoe/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:116)
WARNING:pt-xla-profiler:  predict (/home/kojoe/HunyuanVideo/hyvideo/inference.py:702)
WARNING:pt-xla-profiler:  decorate_context (/home/kojoe/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:116)
WARNING:pt-xla-profiler:  main (/home/kojoe/HunyuanVideo/sample_video.py:32)
WARNING:pt-xla-profiler:  <module> (/home/kojoe/HunyuanVideo/sample_video.py:57)
WARNING:pt-xla-profiler:
WARNING:pt-xla-profiler:
WARNING:pt-xla-profiler:FRAME (count=1):
WARNING:pt-xla-profiler:  index_for_timestep (/home/kojoe/HunyuanVideo/hyvideo/diffusion/schedulers/scheduling_flow_match_discrete.py:170)
WARNING:pt-xla-profiler:  _init_step_index (/home/kojoe/HunyuanVideo/hyvideo/diffusion/schedulers/scheduling_flow_match_discrete.py:176)
WARNING:pt-xla-profiler:  step (/home/kojoe/HunyuanVideo/hyvideo/diffusion/schedulers/scheduling_flow_match_discrete.py:234)
WARNING:pt-xla-profiler:  __call__ (/home/kojoe/HunyuanVideo/hyvideo/diffusion/pipelines/pipeline_hunyuan_video.py:1103)
WARNING:pt-xla-profiler:  decorate_context (/home/kojoe/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:116)
WARNING:pt-xla-profiler:  predict (/home/kojoe/HunyuanVideo/hyvideo/inference.py:702)
WARNING:pt-xla-profiler:  decorate_context (/home/kojoe/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:116)
WARNING:pt-xla-profiler:  main (/home/kojoe/HunyuanVideo/sample_video.py:32)
WARNING:pt-xla-profiler:  <module> (/home/kojoe/HunyuanVideo/sample_video.py:57)
WARNING:pt-xla-profiler:
WARNING:pt-xla-profiler:
WARNING:pt-xla-profiler:================================================================================
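
The two `index_for_timestep` frames above, together with the unlowered `aten::nonzero`, suggest that the scheduler finds the step index with a `nonzero()` on a device tensor and then reads the result back. I have not verified that against the scheduler source. If so, one option is to build the timestep-to-index mapping once on the host and look it up in plain Python. A minimal sketch, where `build_index_lookup` is a hypothetical helper and the rule for repeated values (take the second match) is an assumption about the original lookup:

```python
def build_index_lookup(timesteps):
    """Map each timestep value to a schedule index, computed once on the host."""
    positions = {}
    for index, value in enumerate(timesteps):
        positions.setdefault(float(value), []).append(index)
    # Assumed rule: when a value repeats, use its second occurrence;
    # otherwise use the only occurrence.
    return {
        value: (found[1] if len(found) > 1 else found[0])
        for value, found in positions.items()
    }


# Host-side copy of the schedule (illustrative values, not from the model).
schedule = [1000.0, 750.0, 500.0, 250.0]
lookup = build_index_lookup(schedule)
print(lookup[500.0])
```

The lookup only helps if the denoising loop iterates over the same host-side values; converting a device tensor timestep to a Python float would still force a sync.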
