Testing dependent workflows #357

Draft
wants to merge 71 commits into
base: main
71 commits
3ff6292
Added doc for nvdec
ahmadsharif1 Nov 5, 2024
1fd5a10
.
ahmadsharif1 Nov 5, 2024
fa3e3b9
.
ahmadsharif1 Nov 5, 2024
36a5420
.
ahmadsharif1 Nov 5, 2024
f49baca
.
ahmadsharif1 Nov 5, 2024
f087a91
.
ahmadsharif1 Nov 5, 2024
5092418
.
ahmadsharif1 Nov 5, 2024
243e2ca
.
ahmadsharif1 Nov 5, 2024
7c6c033
.
ahmadsharif1 Nov 5, 2024
e40ec7a
.
ahmadsharif1 Nov 5, 2024
bb4bff9
.
ahmadsharif1 Nov 5, 2024
e8a5b07
.
ahmadsharif1 Nov 5, 2024
c9d54a4
.
ahmadsharif1 Nov 5, 2024
fb633e4
.
ahmadsharif1 Nov 6, 2024
9e334cd
.
ahmadsharif1 Nov 6, 2024
c107e02
.
ahmadsharif1 Nov 6, 2024
885c43f
.
ahmadsharif1 Nov 6, 2024
dd937c6
.
ahmadsharif1 Nov 6, 2024
bab07db
.
ahmadsharif1 Nov 6, 2024
60b06e1
.
ahmadsharif1 Nov 6, 2024
904bfa3
.
ahmadsharif1 Nov 6, 2024
75e76ee
.
ahmadsharif1 Nov 6, 2024
16218ac
.
ahmadsharif1 Nov 6, 2024
e8f0128
.
ahmadsharif1 Nov 6, 2024
9c36f4e
.
ahmadsharif1 Nov 6, 2024
2406435
.
ahmadsharif1 Nov 6, 2024
7b78be3
.
ahmadsharif1 Nov 6, 2024
20c6fba
.
ahmadsharif1 Nov 6, 2024
7630fdd
.
ahmadsharif1 Nov 6, 2024
37bfa5c
.
ahmadsharif1 Nov 6, 2024
24f2843
.
ahmadsharif1 Nov 6, 2024
4cb95a2
.
ahmadsharif1 Nov 6, 2024
4055346
.
ahmadsharif1 Nov 6, 2024
63bbb9e
.
ahmadsharif1 Nov 6, 2024
51e2308
.
ahmadsharif1 Nov 6, 2024
a926934
.
ahmadsharif1 Nov 6, 2024
400001a
.
ahmadsharif1 Nov 6, 2024
ccf95da
.
ahmadsharif1 Nov 7, 2024
209e746
.
ahmadsharif1 Nov 7, 2024
8d66147
.
ahmadsharif1 Nov 7, 2024
0a8ae5f
.
ahmadsharif1 Nov 7, 2024
8864b30
.
ahmadsharif1 Nov 7, 2024
936cbd1
.
ahmadsharif1 Nov 7, 2024
49197b5
.
ahmadsharif1 Nov 7, 2024
8291aa6
.
ahmadsharif1 Nov 7, 2024
4e10d0b
.
ahmadsharif1 Nov 7, 2024
b90bc7f
.
ahmadsharif1 Nov 7, 2024
2ae49ac
.
ahmadsharif1 Nov 7, 2024
f0444d4
.
ahmadsharif1 Nov 7, 2024
3d95977
.
ahmadsharif1 Nov 7, 2024
5cbccd0
.
ahmadsharif1 Nov 8, 2024
bf81cbe
.
ahmadsharif1 Nov 8, 2024
0ca9469
.
ahmadsharif1 Nov 8, 2024
64a9ebd
.
ahmadsharif1 Nov 8, 2024
30d9be7
.
ahmadsharif1 Nov 8, 2024
c91e73c
.
ahmadsharif1 Nov 8, 2024
0f50210
.
ahmadsharif1 Nov 8, 2024
f8d5e69
.
ahmadsharif1 Nov 8, 2024
5a4291a
.
ahmadsharif1 Nov 8, 2024
af3f684
.
ahmadsharif1 Nov 8, 2024
891125b
.
ahmadsharif1 Nov 8, 2024
9809feb
.
ahmadsharif1 Nov 8, 2024
92e2aef
.
ahmadsharif1 Nov 8, 2024
8d206f4
Merge branch 'main' of https://github.com/pytorch/torchcodec into doc1
ahmadsharif1 Nov 8, 2024
893c490
.
ahmadsharif1 Nov 8, 2024
2a106ca
.
ahmadsharif1 Nov 9, 2024
39f4606
.
ahmadsharif1 Nov 9, 2024
a51dfbd
.
ahmadsharif1 Nov 10, 2024
f29b05c
.
ahmadsharif1 Nov 10, 2024
3f85afa
.
ahmadsharif1 Nov 10, 2024
a828d85
Merge branch 'main' of https://github.com/pytorch/torchcodec into doc3
ahmadsharif1 Nov 11, 2024
32 changes: 19 additions & 13 deletions .github/workflows/docs.yaml
@@ -1,17 +1,18 @@
 name: Docs

 on:
-  push:
-    branches: [ main ]
-  pull_request:
+  workflow_run:
+    workflows: [linux_cuda_wheels]
+    types:
+      - completed

 defaults:
   run:
     shell: bash -l -eo pipefail {0}

 jobs:
   build:
-    runs-on: ubuntu-latest
+    runs-on: linux.g5.4xlarge.nvidia.gpu
     strategy:
       fail-fast: false
     steps:
@@ -23,19 +24,24 @@ jobs:
           auto-update-conda: true
           miniconda-version: "latest"
           activate-environment: test
-          python-version: '3.12'
+          python-version: '3.9'
       - name: Update pip
         run: python -m pip install --upgrade pip
-      - name: Install dependencies and FFmpeg
+      - name: Download wheel
+        uses: actions/download-artifact@v3
+        with:
+          name: pytorch_torchcodec__3.9_cu124_x86_64
+          path: pytorch/torchcodec/dist/
+      - name: Install torchcodec from the wheel
         run: |
-          # TODO: torchvision and torchaudio shouldn't be needed. They were only added
-          # to silence an error as seen in https://github.com/pytorch/torchcodec/issues/203
-          python -m pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
-          conda install "ffmpeg=7.0.1" pkg-config -c conda-forge
-          ffmpeg -version
-      - name: Build and install torchcodec
+          wheel_path=`find pytorch/torchcodec/dist -type f -name "*.whl"`
+          echo Installing $wheel_path
+          ${CONDA_RUN} python -m pip install $wheel_path -vvv
+      - name: Install FFMPEG and other deps
         run: |
-          python -m pip install -e ".[dev]" --no-build-isolation -vvv
+          conda install cuda-nvrtc=12.4 libnpp -c nvidia
+          conda install ffmpeg=7 -c conda-forge
+          ffmpeg -version
       - name: Install doc dependencies
         run: |
           cd docs
2 changes: 1 addition & 1 deletion .github/workflows/linux_cuda_wheel.yaml
@@ -1,4 +1,4 @@
-name: Build and test Linux CUDA wheels
+name: linux_cuda_wheels

 on:
   pull_request:
8 changes: 8 additions & 0 deletions docs/source/index.rst
@@ -50,6 +50,14 @@ We achieve these capabilities through:

         How to sample video clips

+    .. grid-item-card:: :octicon:`file-code;1em`
+        GPU decoding using TorchCodec
+        :img-top: _static/img/card-background.svg
+        :link: generated_examples/basic_cuda_example.html
+        :link-type: url
+
+        A simple example demonstrating Nvidia GPU decoding
+
 .. toctree::
    :maxdepth: 1
    :caption: TorchCodec documentation
182 changes: 182 additions & 0 deletions examples/basic_cuda_example.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,182 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.
"""
Accelerated video decoding on GPUs with CUDA and NVDEC
================================================================

.. _ndecoderec_tutorial:

TorchCodec can use supported Nvidia hardware (see support matrix
`here <https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new>`_) to speed up
video decoding. This is called "CUDA Decoding". It uses Nvidia's
`NVDEC hardware decoder <https://developer.nvidia.com/video-codec-sdk>`_
to decompress the video and CUDA kernels to convert the decoded frames to RGB.
CUDA Decoding can be faster than CPU Decoding for the actual decoding step and also for
subsequent transform steps like scaling, cropping or rotating. This is because the decode step leaves
the decoded tensor in GPU memory, so the GPU doesn't have to fetch it from main memory before
running the transform steps. Encoded packets are often much smaller than decoded frames, so
CUDA Decoding also uses less PCI-e bandwidth.

CUDA Decoding can offer a speed-up over CPU Decoding in a few scenarios:

#. You are decoding a high-resolution video
#. You are decoding a large batch of videos that's saturating the CPU
#. You want to do whole-image transforms like scaling or convolutions on the decoded tensors
   after decoding
#. Your CPU is saturated and you want to free it up for other work


Here are situations where CUDA Decoding may not make sense:

#. You want bit-exact results compared to CPU Decoding
#. You have low-resolution videos and the PCI-e transfer latency is large
#. Your GPU is already busy and your CPU is not

It's best to experiment with CUDA Decoding to see if it improves your use case. With
TorchCodec you can simply pass a ``device`` parameter to the
:class:`~torchcodec.decoders.VideoDecoder` class to use CUDA Decoding.


To use CUDA Decoding, you will need the following installed in your environment:

#. An Nvidia GPU that supports decoding the video format you want to decode. See
   the support matrix `here <https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new>`_
#. `CUDA-enabled PyTorch <https://pytorch.org/get-started/locally/>`_
#. FFmpeg binaries that support NVDEC-enabled codecs
#. libnpp and nvrtc (these are usually installed when you install the full cuda-toolkit)


FFmpeg versions 5, 6 and 7 from conda-forge are built with NVDEC support and you can
install them with conda. For example, to install FFmpeg version 7:


.. code-block:: bash

   conda install ffmpeg=7 -c conda-forge
   conda install libnpp cuda-nvrtc -c nvidia


"""
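One rough way to check the FFmpeg prerequisite listed above is to ask the `ffmpeg` command-line tool which hardware acceleration methods it was built with. The sketch below is an assumption-laden proxy, not part of TorchCodec: the helper names `parse_hwaccels` and `ffmpeg_lists_cuda_hwaccel` are hypothetical, and it inspects the `ffmpeg` executable on `PATH`, which may differ from the FFmpeg libraries TorchCodec actually loads.

```python
import shutil
import subprocess


def parse_hwaccels(ffmpeg_output: str) -> list[str]:
    """Extract method names from the text printed by `ffmpeg -hwaccels`."""
    methods = []
    for line in ffmpeg_output.splitlines():
        name = line.strip()
        # Skip blank lines and the "Hardware acceleration methods:" header.
        if not name or name.endswith(":"):
            continue
        methods.append(name)
    return methods


def ffmpeg_lists_cuda_hwaccel() -> bool:
    """Return True if the `ffmpeg` CLI found on PATH reports a "cuda" hwaccel."""
    ffmpeg = shutil.which("ffmpeg")
    if ffmpeg is None:
        # No ffmpeg executable on PATH, so there is nothing to inspect.
        return False
    result = subprocess.run(
        [ffmpeg, "-hide_banner", "-hwaccels"],
        capture_output=True,
        text=True,
        check=False,
    )
    return "cuda" in parse_hwaccels(result.stdout)


print(f"{ffmpeg_lists_cuda_hwaccel()=}")
```

A `False` result here does not prove CUDA Decoding will fail, and `True` does not prove it will work; it only narrows down where to look when the decoder cannot be created.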

# %%
# Checking if PyTorch has CUDA enabled
# -------------------------------------
#
# .. note::
#
#    This tutorial requires FFmpeg libraries compiled with CUDA support.
#
#
import torch

print(f"{torch.__version__=}")
print(f"{torch.cuda.is_available()=}")
print(f"{torch.cuda.get_device_properties(0)=}")


# %%
# Downloading the video
# -------------------------------------
#
# We will use the following video, which has these properties:
#
# - Codec: H.264
# - Resolution: 960x540
# - FPS: 29.97
# - Pixel format: YUV420P
#
# .. raw:: html
#
#    <video style="max-width: 100%" controls>
#      <source src="https://download.pytorch.org/torchaudio/tutorial-assets/stream-api/NASAs_Most_Scientifically_Complex_Space_Observatory_Requires_Precision-MP4_small.mp4" type="video/mp4">
#    </video>
import urllib.request

video_file = "video.mp4"
urllib.request.urlretrieve(
    "https://download.pytorch.org/torchaudio/tutorial-assets/stream-api/NASAs_Most_Scientifically_Complex_Space_Observatory_Requires_Precision-MP4_small.mp4",
    video_file,
)
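To make the PCI-e bandwidth argument from the introduction concrete, here is a back-of-the-envelope sketch using the resolution and frame rate listed above. The encoded bitrate is a hypothetical round number chosen for illustration, not a measurement of this file, and the helper names are made up for this example.

```python
def rgb_frame_bytes(width: int, height: int) -> int:
    """Bytes in one decoded 8-bit RGB frame (3 bytes per pixel)."""
    return width * height * 3


def yuv420p_frame_bytes(width: int, height: int) -> int:
    """Bytes in one 8-bit YUV420P frame (1.5 bytes per pixel on average)."""
    return width * height * 3 // 2


def decoded_bytes_per_second(width: int, height: int, fps: float) -> float:
    """Data rate of decoded RGB frames at the given frame rate."""
    return rgb_frame_bytes(width, height) * fps


# Properties of the tutorial video, as listed above.
WIDTH, HEIGHT, FPS = 960, 540, 29.97

# HYPOTHETICAL encoded bitrate, for illustration only (not measured).
ASSUMED_ENCODED_BITS_PER_SECOND = 1_000_000

decoded_rate = decoded_bytes_per_second(WIDTH, HEIGHT, FPS)
encoded_rate = ASSUMED_ENCODED_BITS_PER_SECOND / 8

print(f"RGB frame: {rgb_frame_bytes(WIDTH, HEIGHT):,} bytes")
print(f"Decoded RGB stream: {decoded_rate / 1e6:.1f} MB/s")
print(f"Assumed encoded stream: {encoded_rate / 1e6:.3f} MB/s")
print(f"Ratio (decoded / encoded): {decoded_rate / encoded_rate:.0f}x")
```

With CPU decoding followed by GPU transforms, it is the decoded frames that cross the PCI-e bus. With CUDA Decoding, only the much smaller encoded packets do.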


# %%
# CUDA Decoding using VideoDecoder
# -------------------------------------
#
# To use the CUDA decoder, you need to pass a CUDA device to the decoder.
#
from torchcodec.decoders import VideoDecoder

decoder = VideoDecoder(video_file, device="cuda")
frame = decoder[0]

# %%
#
# The decoded frame is returned as a tensor in CHW format (channels, height, width).

print(frame.data.shape, frame.data.dtype)

# %%
#
# The decoded frame stays in GPU memory.

print(frame.data.device)


# %%
# Visualizing Frames
# -------------------------------------
#
# Let's look at the frames decoded by the CUDA decoder and compare them
# against the equivalent results from the CPU decoder.
import matplotlib.pyplot as plt


def get_frames(timestamps: list[float], device: str):
    decoder = VideoDecoder(video_file, device=device)
    return [decoder.get_frame_played_at(ts) for ts in timestamps]


def get_numpy_images(frames):
    numpy_images = []
    for frame in frames:
        # We transfer to the CPU so they can be visualized by matplotlib.
        numpy_image = frame.data.to("cpu").permute(1, 2, 0).numpy()
        numpy_images.append(numpy_image)
    return numpy_images


timestamps = [12, 19, 45, 131, 180]
cpu_frames = get_frames(timestamps, device="cpu")
cuda_frames = get_frames(timestamps, device="cuda:0")
cpu_numpy_images = get_numpy_images(cpu_frames)
cuda_numpy_images = get_numpy_images(cuda_frames)


def plot_cpu_and_cuda_images():
    n_rows = len(timestamps)
    fig, axes = plt.subplots(n_rows, 2, figsize=[12.8, 16.0])
    for i in range(n_rows):
        axes[i][0].imshow(cpu_numpy_images[i])
        axes[i][1].imshow(cuda_numpy_images[i])

    axes[0][0].set_title("CPU decoder")
    axes[0][1].set_title("CUDA decoder")
    plt.setp(axes, xticks=[], yticks=[])
    plt.tight_layout()


plot_cpu_and_cuda_images()

# %%
#
# The frames look similar to the human eye, but there may be subtle
# differences because CUDA math is not bit-exact with respect to CPU math.
#
first_cpu_frame = cpu_frames[0].data.to("cpu")
first_cuda_frame = cuda_frames[0].data.to("cpu")
frames_equal = torch.equal(first_cpu_frame, first_cuda_frame)
print(f"{frames_equal=}")
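`torch.equal` only reports whether two frames match exactly. When they do not, it can help to measure how far apart they are. The sketch below is one possible way to do that, using small synthetic tensors rather than real decoded frames so it runs without a GPU. The helper name `max_abs_diff` is hypothetical, and it assumes the frames are `uint8` tensors of the same shape.

```python
import torch


def max_abs_diff(a: torch.Tensor, b: torch.Tensor) -> int:
    """Largest per-element absolute difference between two uint8 frames."""
    # Cast to a signed type first: uint8 subtraction would wrap around.
    return int((a.to(torch.int16) - b.to(torch.int16)).abs().max())


# Synthetic stand-ins for a CPU-decoded and a CUDA-decoded frame (CHW, uint8).
cpu_like = torch.full((3, 4, 4), 100, dtype=torch.uint8)
cuda_like = cpu_like.clone()
cuda_like[0, 0, 0] = 98  # pretend a single value differs by 2

print(f"{torch.equal(cpu_like, cuda_like)=}")
print(f"{max_abs_diff(cpu_like, cuda_like)=}")
```

Applied to the tutorial's own tensors, the call would be `max_abs_diff(first_cpu_frame, first_cuda_frame)`. What counts as an acceptable difference depends on the use case.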