Changing video frame properties on the fly is not supported by all filters #3317
#3318 may resolve this, please review.
Hi @whoyao, thanks for the report and the detailed description. I agree that there is something not quite right about the GPU encoder. Questions: […]

The changes made to StreamWriter after 2.0.1 are mainly to allow users to pass custom filter graphs, which enables on-the-fly transformation. In terms of the fix, I think the proper one is to make the filter graph support CUDA, but just for the sake of resolving the warning, I think it is more appropriate to overwrite the pixel format for the […]

(audio/torchaudio/csrc/ffmpeg/stream_writer/encode_process.cpp, lines 660 to 662 at a8dc4de)

So I think just changing the […]
Hi @mthrok, thank you for your thoughtful response.

Q1: Yes, my old code no longer works. In version 2.0.1, the following code actually performed the padding work:

(audio/torchaudio/csrc/ffmpeg/stream_writer/stream_writer.cpp, lines 940 to 949 at 3b40834)

However, this code disappeared in the latest version, so I have to do the padding myself. This part is indeed incompatible: previously only 3 channels were required as input, and now 4 channels must be input.

Q2: Yes, I am trying to fix the issue. I agree with your approach. It is true that […] In my case, the actual […]
Thanks for the reply. According to my understanding then, allowing […]

Regarding passing an RGB24 Tensor (like in the previous version), the reason it was changed is so that the behavior of the StreamWriter is consistent across formats (though the family of YUV formats falls completely outside of this). I guess we can add back that logic. It's a difficult line to draw when designing a library, but asking client code to pad manually wastes more memory, which is not desirable.
#3428 will fix the regression. One should be able to pass a regular RGB tensor when the encoder format is RGB0.
Summary: StreamWriter's encoding pipeline looks like the following:

1. Convert the tensor to an AVFrame.
2. Pass the AVFrame to AVFilter.
3. Pass the resulting AVFrame to AVCodecContext (encoder) and AVFormatContext (muxer).

When dealing with a CUDA tensor, the AVFilter becomes a no-op, as we have not added support for CUDA-compatible filters. When a CUDA frame is passed, the existing solution passes the software pixel format to AVFilter, which later issues a warning because what AVFilter actually sees is AV_PIX_FMT_CUDA. Since the filter itself is a no-op, it functions as expected; this commit fixes the warning. See #3317

Pull Request resolved: #3426
Differential Revision: D46562370
Pulled By: mthrok
fbshipit-source-id: ce0131f1e50bcc826ee036fc0f35db2a5162b660
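For context, the GPU encoding path under discussion can be exercised with a minimal sketch like the one below. It assumes a torchaudio build with FFmpeg and NVENC support; the argument names follow the `torchaudio.io.StreamWriter` API, while the helper `encode_gpu` and its defaults are illustrative, not part of the library.

```python
# Sketch of GPU-side encoding with StreamWriter + h264_nvenc.
# Assumes torchaudio is built with FFmpeg and NVENC support.
import torch

def encode_gpu(path: str, frames: torch.Tensor, fps: float = 30.0) -> None:
    """Encode a uint8 CUDA tensor of shape (N, C, H, W) to H.264 on the GPU."""
    # Imported lazily so the sketch can be inspected without the FFmpeg extension.
    from torchaudio.io import StreamWriter

    n, c, h, w = frames.shape
    writer = StreamWriter(path)
    writer.add_video_stream(
        frame_rate=fps,
        height=h,
        width=w,
        format="rgb0",         # 4-channel input expected by the newer validation
        encoder="h264_nvenc",  # NVIDIA hardware encoder
        hw_accel="cuda:0",     # keep the frames on the GPU end to end
    )
    with writer.open():
        writer.write_video_chunk(0, frames)  # steps 1-3 of the pipeline above
```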
🐛 Describe the bug
I used NVIDIA hardware encoders (`h264_nvenc`) in the project. Compared with v2.0.1, the latest code performs the following validation on the input tensor: […]

Compared with the old version, the accepted channel count has changed from 3 to 4, which means that developers cannot input rgb24-format data and need to perform the padding externally.
However, after the conversion is completed, our input has become rgb0, and rgb0 is not a legal input (`get_src_pix_fmt` only accepts `AV_PIX_FMT_GRAY8`, `AV_PIX_FMT_RGB24`, `AV_PIX_FMT_BGR24`, and `AV_PIX_FMT_YUV444P` as legal inputs).

In order to pass the validation, I pretended that the input video format was rgb24, and the filter seemed to work. I dumped the structure of the filter: it was initialized with rgb24, but the actual input data was rgb0.
The dump result of the filter is as follows:
It seems that the output video is fine, but if you turn on the FFmpeg log, you can see the following warnings:
`fmt: 2` represents AV_PIX_FMT_RGB24, while `fmt: 119` represents AV_PIX_FMT_CUDA. The true pix_fmt is AV_PIX_FMT_CUDA, set in `configure_hw_accel`.

Therefore, I believe that this usage still has hidden problems, so I submitted a PR to add AV_PIX_FMT_CUDA as a valid format.
A snippet to reproduce the error is provided below.
Versions
05/04/23 nightly (1e48af0)
Collecting environment information...
PyTorch version: 2.1.0a0+git979c5b4
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.5 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: version 3.26.3
Libc version: glibc-2.31
Python version: 3.9.13 (main, Aug 25 2022, 23:26:10) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-5.4.0-126-generic-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration:
GPU 0: Tesla V100-SXM2-32GB
GPU 1: Tesla V100-SXM2-32GB
GPU 2: Tesla V100-SXM2-32GB
GPU 3: Tesla V100-SXM2-32GB
GPU 4: Tesla V100-SXM2-32GB
GPU 5: Tesla V100-SXM2-32GB
GPU 6: Tesla V100-SXM2-32GB
GPU 7: Tesla V100-SXM2-32GB
Nvidia driver version: 510.85.02
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.7.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.7.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.7.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.7.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.7.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.7.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.7.0
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 80
On-line CPU(s) list: 0-79
Thread(s) per core: 1
Core(s) per socket: 40
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz
Stepping: 5
CPU MHz: 2500.000
BogoMIPS: 5000.00
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 2.5 MiB
L1i cache: 2.5 MiB
L2 cache: 320 MiB
L3 cache: 71.5 MiB
NUMA node0 CPU(s): 0-39
NUMA node1 CPU(s): 40-79
Vulnerability Itlb multihit: KVM: Vulnerable
Vulnerability L1tf: Mitigation; PTE Inversion
Vulnerability Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Mmio stale data: Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, STIBP disabled, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 arat avx512_vnni
Versions of relevant libraries:
[pip3] flake8==5.0.4
[pip3] flake8-bugbear==22.9.11
[pip3] flake8-comprehensions==3.10.0
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.23.1
[pip3] pytorch-lightning==1.2.4
[pip3] pytorch-msssim==0.2.1
[pip3] pytorch3d==0.7.1
[pip3] torch==2.1.0a0+git979c5b4
[pip3] torchaudio==2.1.0a0+1e48af0
[pip3] torchfile==0.1.0
[pip3] torchvision==0.16.0a0+fc377d0
[pip3] triton==2.0.0
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.6.0 hecad31d_10 conda-forge
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py39h7f8727e_0
[conda] mkl_fft 1.3.1 py39hd3c417c_0
[conda] mkl_random 1.2.2 py39h51133e4_0
[conda] numpy 1.23.1 py39h6c91a56_0
[conda] numpy-base 1.23.1 py39ha15fc14_0
[conda] pytorch-lightning 1.2.4 pypi_0 pypi
[conda] pytorch-msssim 0.2.1 pypi_0 pypi
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] pytorch3d 0.7.1 pypi_0 pypi
[conda] torch 2.1.0a0+git979c5b4 dev_0
[conda] torchaudio 2.1.0a0+1e48af0 dev_0
[conda] torchfile 0.1.0 pypi_0 pypi
[conda] torchvision 0.16.0a0+fc377d0 dev_0
[conda] triton 2.0.0 pypi_0 pypi