Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

H:M:S for Date_To in the Temporal function defaults to 0:0:0 #190

Closed
amfriesz opened this issue Dec 28, 2022 · 8 comments · Fixed by #546
Closed

H:M:S for Date_To in the Temporal function defaults to 0:0:0 #190

amfriesz opened this issue Dec 28, 2022 · 8 comments · Fixed by #546
Labels
bug Something isn't working question A question needs to be answered to proceed

Comments

@amfriesz
Copy link

The H:M:S for the Date_To parameter in the Temporal function defaults to 0:0:0 when only the date is supplied (e.g. 2020-01-01 defaults to 2020-01-01T00:00:00). I think the H:M:S for the Date_To parameter should default to 23:59:59 unless specified by the user. This would match the behavior of NASA's Earthdata Search.

@itcarroll
Copy link
Collaborator

itcarroll commented Mar 20, 2024

The current behavior in earthaccess (using a module that is one step "under-the-hood" compared to earthaccess.search_data for understanding) always rounds missing datetime elements down. For example:

query = earthaccess.search.DataGranules().temporal("2000-01-01", "2000-01-01")
query.params["temporal"]
# ['2000-01-01T00:00:00Z,2000-01-01T00:00:00Z']

The feature requested is for the second (date_to) parameter to make the same input give ['2000-01-01T00:00:00Z,2000-01-01T23:59:59Z']. In considering an implementation of this feature, I'm faced with the question of whether this "rounding-up" should occur for all incomplete date_to strings. For example:

query = earthaccess.search.DataGranules().temporal("2000-01", "2000-01")
query.params["temporal"]
# ['2000-01-01T00:00:00Z,2000-01-01T00:00:00Z']

Should an implementation of this feature also round up the missing day, giving ['2000-01-01T00:00:00Z,2000-01-31T23:59:59Z'] or would it be particular to time, giving ['2000-01-01T00:00:00Z,2000-01-01T23:59:59Z']?

It is not possible to match the behavior of Earthdata Search here, because their temporal filter uses a date-picker and autocorrects user input. Yes, it changes 00:00:00 to 23:59:59, but it does this in the form as you type. (Absurdly, it autocorrects a perfectly reasonable end date like 2000-01-02 00:00:00). We cannot autocorrect or use a date-picker. We could print a message showing the date_to we assumed, but I doubt that is of use. The CMR API dodges this concern entirely by requiring the request to be a complete datetime.

As an alternative, the user could provide a date_to understanding that it will be rounded down, so 2000-01-02 for the first example and 2000-02 for the second (if you meant all of January 2000). In both cases, the date_to would be off from Earthdata Search results by one second.

So my question(s):

  1. Would it be acceptable to document that the user must construct their date_to with consideration of the "rounding down" behavior?
  2. If not, what should we do in the second example above?

@mfisher87 mfisher87 added bug Something isn't working good first issue Good for newcomers question A question needs to be answered to proceed and removed good first issue Good for newcomers labels Mar 31, 2024
@mfisher87
Copy link
Member

mfisher87 commented Mar 31, 2024

Would it be acceptable to document that the user must construct their date_to with consideration of the "rounding down" behavior?

Personally, I think it's too surprising. When I put in an end date I always expect that whole date to be included.

That said, I don't know how to handle the other complexities you raised. I think the simplest approach is to round up both times and partial date strings for the end value to provide one uniform rule users need to remember -- the end of a temporal window is always rounded up at the level of granularity provided, i.e. a yyyy end date includes the whole year, a yyyy-mm end date includes the whole month, and a yyyy-mm-dd end date includes the whole day. Personal opinion though, if other tools familiar to our users have a standard behavior, let's use that. On that note:

We cannot autocorrect or use a date-picker. We could print a message showing the date_to we assumed, but I doubt that is of use. The CMR API dodges this concern entirely by requiring the request to be a complete datetime.

Perhaps this is the best approach for our library too, except using Python datetimes. Don't let the user pass a datetime.date, just datetime.datetime, because there's too much room for interpretation of that request otherwise? 🤔 Explicit is better than implicit!

Once those questions are answered, I think this issue is a good candidate for good first issue label.

@amfriesz
Copy link
Author

amfriesz commented Apr 1, 2024

I think the round down behavior is a bit counterintuitive for me. My expectation would be the round-up behavior. When I look at the second example ("2000-01", "2000-01") my expectation is the full month of January, 2000 (i.e., ['2000-01-01T00:00:00Z,2000-01-01T23:59:59Z']). If the example was ("2000-01", "2000-02"), I'd expect the full months for both January and February.

@jhkennedy
Copy link
Collaborator

jhkennedy commented Apr 1, 2024

@itcarroll and I had a bit of a discussion about rounding here: #486 (comment)

Perhaps this is the best approach for our library too, except using Python datetimes. Don't let the user pass a datetime.date, just datetime.datetime, because there's too much room for interpretation of that request otherwise? 🤔 Explicit is better than implicit!

I think this overall would be bad behavior from a user perspective. Earthacess temporal search parameters are date_from to date_to, so:

  • for incomplete date_from times, we obviously want to round down and that'd be the general user expectations.
  • For incomplete date_to times, I think the direction we round depends on if you view this as an "end" time (inclusive) or a "stop" time (exclusive), I think it could go either way and as long as we document the behavior, we should just pick one:
    • rounding down: rounding down for the stop times would be in line with the general python behavior -- remember range and other iterators typically are exclusive bounds (which is why the parameter is named stop, not end), so you would not expect the stop to be included in the range
    • rounding up: Slightly harder to implement and document, but is more in line with NASA's Earthdata Search GUI. I also think the current docstring implies inclusive: date_to: latest date of temporal range

For date_to, it is probably fine to just pick one as long as we document it well, and I lean towards round date_to up -- I initially expected rounding up, came around to rounding down, and think I'm now back to preferring round up.

It is not possible to match the behavior of Earthdata Search here, because their temporal filter uses a date-picker and autocorrects user input.

This isn't strictly true, as datutil behaves in a that could generally handle this. In terms of implementation, it's not a significant change, and amounts to adding a rounding parameter like:

#  up/down or floor/ceil would work for the round parameter values
def _normalize_datetime(raw: Union[None, str, dt.date], round: Literal['up', 'down'] = 'down') -> Union[None, dt.datetime]:
    if round == 'down':
        default_date = datetime.datetime(datetime.MINYEAR, 1, 1, 0, 0, 0, 0, timezone.utc)
    elif round == 'up':
        default_date = datetime.datetime(datetime.MAXYEAR, 12, 31, 23, 59, 59, 999999, timezone.utc)
    else:
        raise ValueError(...)
        
    ...

at least for my proposed implementation in #486 (wich requires roundtripping date/datetime -> str -> datetime).

@jhkennedy
Copy link
Collaborator

jhkennedy commented Apr 1, 2024

A few things of interest:

  1. The ISO_8601 specification for Calendar dates specifies impartial dates to be inclusive.

    The standard also allows for calendar dates to be written with reduced precision. For example, one may write "1981-04" to mean "1981 April". One may simply write "1981" to refer to that year, "198" to refer to the decade from 1980 to 1989 inclusive, or "19" to refer to the century from 1900 to 1999 inclusive.

  2. The ISO_8601 specification for time intervals specifies a start/end range and says:

    An interval denoted "2007-11-13/15" can start at any time on 2007-11-13 and end at any time on 2007-11-15, whereas "2007-11-13T09:00/15T17:00" includes the start and end times.

  3. CMR temporal range searches allow for ISO_8601 ranges or comma-separated start and stop times.

    • CMR additionally allows open-ended intervals: <start>, or ,<stop>

So, ISO_8601 is generally inclusive for lower precision date-times, CMR allows for open-ended intervals but does require a full date-time representation of any provided dates, which is inclusive in some ways but exclusive in others.

So, I think overall I'd prefer to follow the IOS_8601 for partial dates and be inclusive (rounding down date_from; rounding up date_to) as that matches expectations in this issue, Earthdata Search's GUI, and the specification better.

@jhkennedy
Copy link
Collaborator

And this may be the clencher, from the CMR API docs:
image

@amfriesz
Copy link
Author

amfriesz commented Apr 2, 2024

I’m on team @jhkennedy!

So, I think overall I'd prefer to follow the IOS_8601 for partial dates and be inclusive (rounding down date_from; rounding up date_to) as that matches expectations in this issue, Earthdata Search's GUI, and the specification better.

@itcarroll
Copy link
Collaborator

Thanks, all, for the feedback and clear direction! I'll be working on an implementation today, but FYI we're going to try to send this upstream to python-cmr. You'll see some references popping in below.

chuckwondo pushed a commit to chuckwondo/earthaccess that referenced this issue Apr 30, 2024
Bump `python_cmr` version to make use of additional logic in
the `temporal` methods that were pushed from `earthaccess`
up to `python_cmr` itself.

Fixes nsidc#190 nsidc#330
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working question A question needs to be answered to proceed
Projects
Status: Done
4 participants