Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise a GMTInvalidInput exception for only x,y input to blockmedian and blockmean #1447

Closed
maxrjones opened this issue Aug 18, 2021 · 8 comments · Fixed by #1478
Closed

Raise a GMTInvalidInput exception for only x,y input to blockmedian and blockmean #1447

maxrjones opened this issue Aug 18, 2021 · 8 comments · Fixed by #1478
Assignees
Labels
enhancement Improving an existing feature
Milestone

Comments

@maxrjones
Copy link
Member

Description of the desired feature

Currently, no exception is raised in the blockmedian and blockmean functions if the x and y parameters are used but not the z parameter. This should be disallowed because these functions require three columns. For example, GMT will raise an error if a textfile with only two columns is used.

The simplest method to fix this problem would be to add the equivalent of these lines of code (using table rather than data) to the start of the _blockm function:

kind = data_kind(data, x, y, z)
if kind == "vectors" and z is None:
raise GMTInvalidInput("Must provide z with x and y.")

Another option could be to extend the virtualfile_from_data function to (optionally) check the number of columns. This would limit redundancy relative to using the same code block for all functions operating on x,y,z data (e.g., blockm*, surface, nearneighbor). If the test were in virtualfile_from_data, we could also reduce the tests required because the exception would not need to be tested in each module that requires three columns.

Are you willing to help implement and maintain this feature? Glad to help someone else with this issue

@maxrjones maxrjones added the enhancement Improving an existing feature label Aug 18, 2021
@seisman
Copy link
Member

seisman commented Aug 18, 2021

Another option could be to extend the virtualfile_from_data function to (optionally) check the number of columns.

Sounds a better option to me.

@yohaimagen
Copy link
Contributor

I think I can try to do that, I can try to extend virtualfile_from_data. but I didn't understand how would I know if virtualfile_from_data is called from a function that required z values(e.g blockmean) or called from a function that required only x, y data like plot?

@maxrjones
Copy link
Member Author

I think I can try to do that, I can try to extend virtualfile_from_data. but I didn't understand how would I know if virtualfile_from_data is called from a function that required z values(e.g blockmean) or called from a function that required only x, y data like plot?

One option would be to add an optional keyword parameter z_required to virtualfile_from_data so that this can be specified for each function. This could be passed to data_kind so that the checking is done in the same block as the checking of x and y.

But, I am actually not sure if making this code specific to x, y and z is the best way to go or if we should instead have a more general function that can be used to check that if any one of a set of parameters is not None, all are not None. This way, the function could be used for other GMT parameters that need to be paired (relates to #256). While I think this would likely be more useful, I have not looked into the implementation.

@yohaimagen
Copy link
Contributor

yohaimagen commented Aug 28, 2021

I think those are separate issues.
to raise a descriptive error message we will still need to do something like the following in data_kind

if z_required and parameter_set_not_none(x, y, z):
    raise GMTInvalidInput("Must provide x, y and z.") 

parameter_set_not_none can be implemented with something like

def parameter_set_not_none(*args):
    return any(par is None for par in args)

@maxrjones
Copy link
Member Author

I think those are separate issues.
to raise a descriptive error message we will still need to do something like the following in data_kind

if z_required and parameter_set_not_none(x, y, z):
    raise GMTInvalidInput("Must provide x, y and z.") 

parameter_set_not_none can be implemented with something like

def parameter_set_not_none(*args):
    return any(par is None for par in args)

OK, do you want to address the first of these two separate issues using z_required? If so, I can assign the issue to you.

@yohaimagen
Copy link
Contributor

can do both if you think both of them are necessary, let's start with the first one I'll open a PR.

@seisman
Copy link
Member

seisman commented Sep 1, 2021

I'm wondering if we should use a minimum_cols parameter, rather than a required_z parameter. Note that sometimes functions may need more than 4 columns. For example, the blockmean modules can accept another column for weighting (not implemented in PyGMT yet).

@yohaimagen
Copy link
Contributor

I think minimum_cols is not practical for two reasons.

  1. As I mentioned we want to raise a descriptive error message rather than a vague(e.g Must provided both x, y, and z. rather than let's say Not all necessary data were provided.
  2. Let's say some function needs both x, y, z, and some other column a to know that it needs 4 columns are not enough to check that they are all provided (e.g to check a is not None) I need to know to check specifically a.

I think the z column is widely used in many functions so it is sensible to have a general propose code that checks for that, for other columns that are more function-specific we should check in the function itself(and raise a specific descriptive error). We could implement some helper function as @meghanrjones suggested

But, I am actually not sure if making this code specific to x, y and z is the best way to go or if we should instead have a more general function that can be used to check that if any one of a set of parameters is not None, all are not None. This way, the function could be used for other GMT parameters that need to be paired (relates to #256). While I think this would likely be more useful, I have not looked into the implementation.

@seisman seisman added this to the 0.5.0 milestone Sep 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improving an existing feature
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants