API Reference

Core Modules

videodataset: A GPU-accelerated library that enables random frame access and efficient video decoding for data loading.

class videodataset.VideoDecoder

Bases: pybind11_object

Video decoder with NvCodec acceleration.

codec(self: videodataset._decoder.VideoDecoder) → str

Video codec format being decoded.

decode_to_np(self: videodataset._decoder.VideoDecoder, video_path: str, frame_index: int) → numpy.ndarray[numpy.uint8]

Decode a single frame from a video file and return it as a NumPy array.

The caller supplies only the file path and the frame index.

Parameters:

video_path (str) – The path to the video file to be decoded.
frame_index (int) – The index of the frame to be decoded.

Returns:

A numpy.ndarray of dtype uint8 containing the decoded frame.

Raises:

RuntimeError – if the file cannot be opened, decoding fails, or the target frame is not found.
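Because decode_to_np reports failures through RuntimeError, a caller that walks many files may prefer to skip an unreadable frame rather than abort. The sketch below shows one way to do that. Both `try_decode_frame` and `StubDecoder` are hypothetical names introduced for illustration; the stub stands in for a real VideoDecoder, whose constructor arguments are not documented in this section, so that the snippet runs without a GPU.

```python
from typing import Any, Optional


def try_decode_frame(decoder: Any, video_path: str, frame_index: int) -> Optional[Any]:
    """Return the decoded frame, or None if the decoder raises RuntimeError."""
    try:
        return decoder.decode_to_np(video_path, frame_index)
    except RuntimeError as exc:
        print(f"skipping {video_path}[{frame_index}]: {exc}")
        return None


class StubDecoder:
    """Hypothetical stand-in that mimics only the decode_to_np call shape."""

    def __init__(self, num_frames: int) -> None:
        self.num_frames = num_frames

    def decode_to_np(self, video_path: str, frame_index: int) -> str:
        if not 0 <= frame_index < self.num_frames:
            raise RuntimeError("target frame not found")
        # Placeholder return value; a real decoder returns a numpy.ndarray.
        return f"frame {frame_index} of {video_path}"


decoder = StubDecoder(num_frames=3)
ok_frame = try_decode_frame(decoder, "clip.mp4", 1)
missing_frame = try_decode_frame(decoder, "clip.mp4", 7)  # logs and yields None
```

Returning None keeps the helper simple; a data loader could instead substitute a neighbouring frame or re-raise after logging.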

decode_to_nps(self: videodataset._decoder.VideoDecoder, video_path: str, frame_indices: list[int]) → list[numpy.ndarray[numpy.uint8]]

Decode multiple frames from a video file and return them as NumPy arrays.

The caller supplies only the file path and the list of frame indices.

Parameters:

video_path (str) – The path to the video file to be decoded.
frame_indices (list[int]) – The indices of the frames to be decoded.

Returns:

A list of numpy arrays representing the decoded frames.

Raises:

RuntimeError – if the file cannot be opened, decoding fails, or a target frame is not found.
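A common use of decode_to_nps is to sample a clip at regular intervals and fetch all sampled frames in one call. The sketch below builds such an index list. `evenly_spaced_indices` is a hypothetical helper, and it assumes the caller already knows the total frame count, since this section documents no way to query it.

```python
from typing import List


def evenly_spaced_indices(num_frames: int, num_samples: int) -> List[int]:
    """Return the first frame index of each of `num_samples` equal segments."""
    if not 0 < num_samples <= num_frames:
        raise ValueError("num_samples must be between 1 and num_frames")
    return [i * num_frames // num_samples for i in range(num_samples)]


indices = evenly_spaced_indices(num_frames=100, num_samples=4)
print(indices)  # [0, 25, 50, 75]

# With a real decoder, the whole batch would go through a single call (not run here):
# frames = decoder.decode_to_nps("clip.mp4", indices)
```

Integer arithmetic is used instead of rounding so that the indices are always in range and strictly increasing.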

decode_to_tensor(self: videodataset._decoder.VideoDecoder, video_path: str, frame_index: int) → torch.Tensor

Decode a single frame from a video file and return it as a PyTorch tensor.

The caller supplies only the file path and the frame index.

Parameters:

video_path (str) – The path to the video file to be decoded.
frame_index (int) – The index of the frame to be decoded.

Returns:

A torch.Tensor containing the decoded frame.

Raises:

RuntimeError – if the file cannot be opened, decoding fails, or the target frame is not found.

gpu_id(self: videodataset._decoder.VideoDecoder) → int

ID of the GPU being used for decoding.

Dataset Modules

Base Dataset

class videodataset.dataset.base_dataset.BaseVideoDataset

Bases: object

Decoder extension that defines decoder-specific functionality.

decode_video_frame(decoder, video_path, frame_idx, to_cpu=False)

Decode a specific frame from a video file using the provided decoder. Converts the decoded frame from NV12 format to RGB and optionally moves the tensor to the CPU.

Parameters:

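decoder (VideoDecoder) – The decoder used to decode the frame.
video_path (str | Path) – The path to the video file.
frame_idx (int) – The index of the frame to decode.
to_cpu (bool) – If True, move the decoded tensor to the CPU. Defaults to False.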

Return type:

Tensor

decode_video_frames(decoder, video_path, frame_indices, to_cpu=False)

Decode specific frames from a video file using the provided decoder. Converts the decoded frames from NV12 format to RGB and optionally moves the tensors to the CPU.

Parameters:

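decoder (VideoDecoder) – The decoder used to decode the frames.
video_path (str | Path) – The path to the video file.
frame_indices (list[int]) – The indices of the frames to decode.
to_cpu (bool) – If True, move the decoded tensors to the CPU. Defaults to False.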

Return type:

list[Tensor]
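To make the NV12-to-RGB step mentioned for decode_video_frame and decode_video_frames concrete, here is a pure-Python sketch of the conversion for a single frame. NV12 stores a full-resolution Y plane followed by one half-resolution plane of interleaved U, V pairs. The function name `nv12_to_rgb` is hypothetical, and the BT.601 limited-range coefficients are an assumption: this section does not state which color matrix the library applies, and the library presumably performs the conversion on the GPU rather than per pixel like this.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def _clamp(value: float) -> int:
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, round(value)))


def nv12_to_rgb(data: bytes, width: int, height: int) -> List[List[Pixel]]:
    """Convert one NV12 frame to rows of (R, G, B) tuples.

    Assumes even width and height and BT.601 limited-range coefficients.
    """
    if width % 2 or height % 2:
        raise ValueError("NV12 requires even width and height")
    if len(data) != width * height * 3 // 2:
        raise ValueError("buffer size does not match width * height * 3 / 2")

    uv_offset = width * height  # the interleaved UV plane follows the Y plane
    rows: List[List[Pixel]] = []
    for y in range(height):
        row: List[Pixel] = []
        for x in range(width):
            luma = data[y * width + x]
            # Each U, V pair is shared by a 2x2 block of luma samples.
            uv_index = uv_offset + (y // 2) * width + (x // 2) * 2
            u = data[uv_index] - 128
            v = data[uv_index + 1] - 128
            c = 1.164 * (luma - 16)
            row.append((
                _clamp(c + 1.596 * v),
                _clamp(c - 0.392 * u - 0.813 * v),
                _clamp(c + 2.017 * u),
            ))
        rows.append(row)
    return rows


# 2x2 frame: top row black (Y=16), bottom row white (Y=235), neutral chroma.
frame = bytes([16, 16, 235, 235, 128, 128])
rgb = nv12_to_rgb(frame, width=2, height=2)
```

The 2x2 test frame uses six bytes: four luma samples plus one U, V pair, which is the 1.5 bytes per pixel that NV12 requires.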

property device: int

Return the device ID where decoders are running.

get_decoder(decoder_key, codec)

Retrieve the VideoDecoder for a given key and codec. If no such decoder exists, a new one is created and its creation is logged.

Parameters:

decoder_key (str) – The key identifying the decoder to retrieve.
codec (str) – The codec the decoder is created for.

Return type:

VideoDecoder
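The get-or-create behaviour described for get_decoder can be sketched as a small cache. This is an illustration of the pattern, not the library's implementation: `DecoderCache` and its `factory` argument are hypothetical, a plain dict stands in for a real VideoDecoder, and keying the cache by decoder_key alone is an assumption.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)


class DecoderCache:
    """Hypothetical get-or-create cache, keyed by decoder_key alone (an assumption)."""

    def __init__(self, factory: Callable[[str], Any]) -> None:
        self._factory = factory  # builds a decoder for a given codec
        self._decoders: Dict[str, Any] = {}

    def get_decoder(self, decoder_key: str, codec: str) -> Any:
        decoder = self._decoders.get(decoder_key)
        if decoder is None:
            # First request for this key: create, store, and log the new decoder.
            decoder = self._factory(codec)
            self._decoders[decoder_key] = decoder
            logger.info("Created decoder %r for codec %s", decoder_key, codec)
        return decoder

    @property
    def num_decoders(self) -> int:
        return len(self._decoders)


# A plain dict stands in for a real VideoDecoder in this sketch.
cache = DecoderCache(factory=lambda codec: {"codec": codec})
first = cache.get_decoder("worker-0", "h264")
second = cache.get_decoder("worker-0", "h264")  # reuses the cached decoder
other = cache.get_decoder("worker-1", "hevc")
```

Reusing a decoder per key avoids paying the setup cost on every frame request, which is the usual motivation for this pattern.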

property num_decoders: int

Return the number of decoders currently managed by the dataset.