Loading specific pre-determined frames from video #5540

Open
1 task done
JosselinSomervilleRoberts opened this issue Jun 27, 2024 · 2 comments
Assignees
Labels
enhancement New feature or request Video Video related feature/question

Comments

@JosselinSomervilleRoberts

Describe the question.

Hi, I just learned about DALI and wanted to ask if it was the correct tool for my use case.
I have a dataset of videos and I want to load them in a Dataloader in PyTorch.
I work on multiple GPUs.

My pipeline goes like this:

  • Get file_name by accessing fnames[index] (fnames: List[str])
  • Get the number of frames of the video stored at file_name. (the number of frames might be different for each video)
  • Compute an indexing of T: int frames I want to extract. This value T is a constant and so will be the same for each video but the indexing might differ. (If I want T=3 frames uniformly sampled in a video of 101 frames it would be [0,50,100] while it would be [0,100,200] in a video of 201 frames)
  • Extract the T frames in the video (hopefully without having to decode the entire video)
  • Convert this into a PyTorch tensor of shape T,C,H,W
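The index computation in the third step can be sketched as follows. This is a minimal illustration, assuming "uniformly sampled" means the first and last frames are always included and the rest are evenly spaced between them; the function name `uniform_frame_indices` is hypothetical and not part of any library.

```python
def uniform_frame_indices(num_frames: int, num_samples: int) -> list[int]:
    """Return num_samples frame indices spread evenly over [0, num_frames - 1].

    Assumption: endpoints are included, and fractional positions are
    rounded down with integer division.
    """
    if num_frames < 1 or num_samples < 1:
        raise ValueError("num_frames and num_samples must be positive")
    if num_samples == 1:
        return [0]
    last = num_frames - 1
    # Position i maps to i * last / (num_samples - 1), floored to an int.
    return [(i * last) // (num_samples - 1) for i in range(num_samples)]


print(uniform_frame_indices(101, 3))  # [0, 50, 100]
print(uniform_frame_indices(201, 3))  # [0, 100, 200]
```

The two printed results match the examples given above for videos of 101 and 201 frames. The list can be wrapped in `np.asarray(...)` if a `np.ndarray` is needed.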

Now what I want is the batch version of this in a distributed manner. So a pipeline that gives me some frames of shape B,T,C,H,W.
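As a rough sketch of what "batched and distributed" could mean here, the file list can be split across workers and then grouped into batches of size B. The helper names `shard_for_rank` and `batches` are hypothetical, and the round-robin split is only one possible assumption; in a PyTorch setup this role is usually filled by `torch.utils.data.DistributedSampler` instead.

```python
from typing import Iterator, List, Sequence


def shard_for_rank(items: Sequence[str], rank: int, world_size: int) -> List[str]:
    """Give each rank every world_size-th item, starting at its own rank."""
    return list(items[rank::world_size])


def batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of batch_size items (the last may be shorter)."""
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Hypothetical file names, for illustration only.
fnames = [f"video_{i}.mp4" for i in range(10)]
local = shard_for_rank(fnames, rank=1, world_size=4)
for batch in batches(local, batch_size=2):
    print(batch)
```

Each batch of B file names would then be decoded into T frames per video and stacked into a tensor of shape B,T,C,H,W.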

I am currently using a custom DataLoader, and in __getitem__(index: int) -> torch.Tensor I call load_video(fname: str, rel_indices: np.ndarray) -> torch.Tensor, which can be implemented with different engines (Decord, torchvision.io, ...), all of which are too slow.
If I understand correctly, the setup of DALI is different as it directly processes batches?

Do you think DALI could be useful in my use case, and if so, how could I implement this? Keep in mind that I am working in a distributed setup with multiple GPUs (and potentially multiple nodes later on) and that the number of frames extracted, T, is significantly smaller than the number of frames available.

Thanks!

Check for duplicates

  • I have searched the open bugs/issues and have found no duplicates for this bug report
@JosselinSomervilleRoberts JosselinSomervilleRoberts added the question Further information is requested label Jun 27, 2024
@JanuszL
Contributor

JanuszL commented Jun 27, 2024

Hi @JosselinSomervilleRoberts,

Thank you for reaching out. I'm afraid DALI doesn't support the sampling pattern you are asking for. What it can do is sample video with a constant step and stride, whereas in your case you are looking for a fixed number of samples distributed evenly across each video.
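The difference between the two patterns can be illustrated with plain Python. This sketch does not call DALI; the names `strided_sequence` and `stride_for_uniform` are hypothetical, and "stride" is assumed to mean the gap between consecutive frames within one extracted sequence.

```python
def strided_sequence(start: int, sequence_length: int, stride: int) -> list[int]:
    """Frame indices for a fixed-stride sequence beginning at start."""
    return [start + i * stride for i in range(sequence_length)]


def stride_for_uniform(num_frames: int, sequence_length: int) -> int:
    """Stride a given video would need so the sequence spans the whole video."""
    return (num_frames - 1) // (sequence_length - 1)


# With one global stride, a short and a long video yield the same indices,
# so the long video is only covered up to frame 100.
print(strided_sequence(0, 3, 50))  # [0, 50, 100]

# Even coverage needs a different stride for each video length.
print(stride_for_uniform(101, 3))  # 50
print(stride_for_uniform(201, 3))  # 100
```

Because the required stride depends on each video's frame count, a single stride fixed for the whole pipeline cannot reproduce the even distribution described in the question.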

@JanuszL JanuszL added enhancement New feature or request Video Video related feature/question and removed question Further information is requested labels Jun 27, 2024
@JanuszL JanuszL modified the milestone: Release_1.40.0 Jun 27, 2024
@JosselinSomervilleRoberts
Author

OK, thank you for letting me know! If this feature is added in the future, I would love to hear about it!

Projects
Status: ToDo
Development

No branches or pull requests

2 participants