Allow Stream's repeat option to cycle through entire dataset before repeating, when shuffle=True #521
Comments
@m-harmonic Does keeping
@karan6181 Yes exactly, we do have cases where we have multiple streams some of which have multiple repeats. Separately we are also experiencing a problem that is forcing us to train within a single epoch, so duplicating the data within one epoch is the workaround we're trying to use. Do you think there is a possible fix, or an easy solution?
Hey, @m-harmonic, thanks for the clarification. Unfortunately, we don't support that use case at the moment. I wonder why you care
@m-harmonic Can you also explain why you would want the repeated samples to show up after going through the original dataset? Can you please share your use case and what exactly you are trying to do? Thanks!
@m-harmonic Gentle reminder on the above question.
🚀 Feature Request
I am using the repeat option when creating a stream, i.e. Stream(repeat=2), in addition to random shuffling, i.e. StreamingDataset(shuffle=True). There appears to be no guarantee that every sample is surfaced once before any sample repeats. The ideal behavior for my use case is to go through every sample once, in shuffled order, before seeing any sample a second time. Is there already some way to achieve this behavior, and if not, would it be possible to add it? Thanks!
Motivation
For various reasons I am constructing datasets that should have samples duplicated a certain number of times, but each sample should be seen once before any are seen a second time.
[Optional] Implementation
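To make the requested ordering concrete, here is a minimal sketch of the behavior, not a proposal for how the library should implement it. It works on plain integer sample indices using only the Python standard library; the helper name `repeat_after_full_pass` and its parameters are hypothetical and are not part of the streaming package. It also ignores shards, multiple streams, and multiple workers, all of which a real implementation would need to handle.

```python
import random
from typing import List, Optional


def repeat_after_full_pass(
    num_samples: int, repeat: int, seed: Optional[int] = None
) -> List[int]:
    """Return a sample order made of `repeat` independently shuffled passes.

    Hypothetical helper for illustration only. Every index in
    range(num_samples) appears exactly once in each pass, so no sample
    is seen again until all samples have been seen in the current pass.
    """
    rng = random.Random(seed)
    order: List[int] = []
    for _ in range(repeat):
        one_pass = list(range(num_samples))
        rng.shuffle(one_pass)  # shuffle within the pass only
        order.extend(one_pass)
    return order


# Example: 5 samples, each seen twice, with a full pass before any repeat.
order = repeat_after_full_pass(num_samples=5, repeat=2, seed=0)
first_pass, second_pass = order[:5], order[5:]
```

The contrast with the behavior described above is that shuffling happens inside each pass rather than across the whole repeated index list, so the second copy of a sample can never appear before the first copy of another.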
Additional context