API Reference

Core Modules

Copyright (c) 2025 agibot. All rights reserved.

videodataset: A GPU-accelerated library that enables random frame access and efficient video decoding for data loading.

class videodataset.VideoDecoder

Bases: pybind11_object

Video decoder with NvCodec acceleration.

codec(self: videodataset._decoder.VideoDecoder) -> str

Video codec format being decoded.

decode_to_np(self: videodataset._decoder.VideoDecoder, video_path: str, frame_index: int) -> numpy.ndarray[numpy.uint8]

Decode a single frame from a video file.

The frame at frame_index is decoded from the file at video_path and returned as a NumPy array.

Parameters:
  • video_path (str) – The path to the video file to be decoded.

  • frame_index (int) – The index of the frame to be decoded.

Returns:

A numpy.ndarray of dtype uint8 representing the decoded frame.

Raises:

RuntimeError – if there are issues such as file opening failure, decoding failure, or target frame not found.
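Because decode_to_np reports every failure as a RuntimeError, callers that iterate over many files may want to skip bad frames rather than abort. The following is a minimal sketch of that pattern, not part of the library: try_decode_frame is a hypothetical helper, and the decoder argument is assumed to be an already-constructed VideoDecoder (its constructor is not documented in this section, so it is passed in rather than created).

```python
from typing import Any, Optional


def try_decode_frame(decoder: Any, video_path: str, frame_index: int) -> Optional[Any]:
    """Return the decoded frame, or None if the decoder raises RuntimeError.

    `decoder` is assumed to expose decode_to_np(video_path, frame_index)
    as documented above. This helper is hypothetical.
    """
    try:
        return decoder.decode_to_np(video_path, frame_index)
    except RuntimeError as exc:
        # Raised for file opening failure, decoding failure,
        # or when the target frame is not found.
        print(f"could not decode frame {frame_index} of {video_path}: {exc}")
        return None
```

Returning None keeps the calling loop simple; a data-loading pipeline could instead re-raise with extra context or substitute a neighbouring frame.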

decode_to_nps(self: videodataset._decoder.VideoDecoder, video_path: str, frame_indices: list[int]) -> list[numpy.ndarray[numpy.uint8]]

Decode multiple frames from a video file.

The frames at frame_indices are decoded from the file at video_path and returned as a list of NumPy arrays.

Parameters:
  • video_path (str) – The path to the video file to be decoded.

  • frame_indices (list[int]) – The indices of the frames to be decoded.

Returns:

A list of numpy arrays representing the decoded frames.

Raises:

RuntimeError – if there are issues such as file opening failure, decoding failure, or target frame not found.
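A common use of decode_to_nps is to sample a fixed number of frames spread across a clip in one call. The sketch below shows one way to build the index list. Both helpers are hypothetical, and num_frames is an assumption: this section does not document a way to query a video's frame count, so the caller must supply it.

```python
from typing import Any, List


def evenly_spaced_indices(num_frames: int, num_samples: int) -> List[int]:
    """Pick up to num_samples frame indices spread across [0, num_frames)."""
    if num_frames <= 0 or num_samples <= 0:
        return []
    # Never request more frames than the clip contains.
    num_samples = min(num_samples, num_frames)
    step = num_frames / num_samples
    return [int(i * step) for i in range(num_samples)]


def decode_samples(decoder: Any, video_path: str,
                   num_frames: int, num_samples: int) -> List[Any]:
    """Decode evenly spaced frames with a single decode_to_nps call.

    `decoder` is assumed to be an already-constructed VideoDecoder.
    """
    indices = evenly_spaced_indices(num_frames, num_samples)
    if not indices:
        # Avoid calling the decoder with an empty index list.
        return []
    return decoder.decode_to_nps(video_path, indices)
```

For example, evenly_spaced_indices(10, 5) yields [0, 2, 4, 6, 8].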

decode_to_tensor(self: videodataset._decoder.VideoDecoder, video_path: str, frame_index: int) -> torch.Tensor

Decode a single frame from a video file.

The frame at frame_index is decoded from the file at video_path and returned as a torch.Tensor.

Parameters:
  • video_path (str) – The path to the video file to be decoded.

  • frame_index (int) – The index of the frame to be decoded.

Returns:

A torch.Tensor representing the decoded frame.

Raises:

RuntimeError – if there are issues such as file opening failure, decoding failure, or target frame not found.

gpu_id(self: videodataset._decoder.VideoDecoder) -> int

ID of the GPU being used for decoding.

Dataset Modules

Base Dataset

class videodataset.dataset.base_dataset.BaseVideoDataset

Bases: object

Decoder extension that defines decoder-specific functionality.

decode_video_frame(decoder, video_path, frame_idx, to_cpu=False)

Decode a specific frame from a video file using the provided decoder. Converts the decoded frame from NV12 format to RGB and optionally moves the tensor to the CPU.

Parameters:
  • decoder (VideoDecoder)

  • video_path (str | Path)

  • frame_idx (int)

  • to_cpu (bool)

Return type:

Tensor
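For readers unfamiliar with the NV12 format mentioned above: it is a 4:2:0 layout with a full-resolution Y (luma) plane followed by a half-height plane of interleaved U and V samples, so a frame occupies width * height * 3 / 2 bytes. The sketch below computes where one pixel's samples sit in such a buffer. It assumes a tightly packed buffer with no row padding, which may not match the decoder's actual surface pitch, and the helper names are hypothetical rather than part of videodataset.

```python
from typing import NamedTuple


class Nv12Offsets(NamedTuple):
    y: int  # byte offset of the luma sample
    u: int  # byte offset of the shared U sample
    v: int  # byte offset of the shared V sample


def nv12_buffer_size(width: int, height: int) -> int:
    """Bytes in a tightly packed NV12 frame (no row padding assumed)."""
    if width % 2 or height % 2:
        raise ValueError("NV12 requires even width and height")
    return width * height * 3 // 2


def nv12_offsets(width: int, height: int, x: int, y: int) -> Nv12Offsets:
    """Locate the Y, U and V bytes for pixel (x, y)."""
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("pixel outside frame")
    y_offset = y * width + x
    # The UV plane starts after the Y plane. Each 2x2 pixel block
    # shares one interleaved (U, V) pair.
    uv_base = width * height
    u_offset = uv_base + (y // 2) * width + (x // 2) * 2
    return Nv12Offsets(y_offset, u_offset, u_offset + 1)
```

Converting to RGB then applies a YUV-to-RGB colour matrix to each (Y, U, V) triple; which matrix decode_video_frame uses is not stated in this reference.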

decode_video_frames(decoder, video_path, frame_indices, to_cpu=False)

Decode specific frames from a video file using the provided decoder. Converts the decoded frames from NV12 format to RGB and optionally moves the tensors to the CPU.

Parameters:
  • decoder (VideoDecoder)

  • video_path (str | Path)

  • frame_indices (list[int])

  • to_cpu (bool)

Return type:

list[Tensor]

property device: int

Return the device ID where decoders are running.

get_decoder(decoder_key, codec)

Retrieve a VideoDecoder for a specific key and codec. If the decoder does not exist, it creates a new one and logs the creation.

Parameters:
  • decoder_key (str)

  • codec (str)

Return type:

VideoDecoder
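The get-or-create behaviour described for get_decoder can be illustrated with a small keyed cache. This is a sketch of the general pattern only, not the library's implementation: DecoderCache and its factory argument are hypothetical, with factory standing in for whatever constructs a VideoDecoder for a given codec.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)


class DecoderCache:
    """Keyed get-or-create cache (hypothetical, for illustration only)."""

    def __init__(self, factory: Callable[[str], Any]) -> None:
        # `factory` is an assumed callable that builds a decoder for a codec.
        self._factory = factory
        self._decoders: Dict[str, Any] = {}

    def get(self, decoder_key: str, codec: str) -> Any:
        """Return the decoder stored under decoder_key, creating it on first use."""
        if decoder_key not in self._decoders:
            self._decoders[decoder_key] = self._factory(codec)
            logger.info("Created decoder %r for codec %s", decoder_key, codec)
        return self._decoders[decoder_key]

    def __len__(self) -> int:
        # Mirrors the idea behind the num_decoders property.
        return len(self._decoders)
```

In this sketch a repeated key returns the existing decoder even if a different codec is passed. How BaseVideoDataset handles that case is not documented here.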

property num_decoders: int

Return the number of decoders currently managed by the dataset.