Extending StudyJob API by adding more trials to finished StudyJob #346

Closed
janvdvegt opened this issue Jan 28, 2019 · 11 comments

Comments

@janvdvegt
Contributor

For our use case, we would like to be able to extend a finished StudyJob by adding more trials to it. If we submit a new StudyJob instead, it is treated as a completely new instance, which makes it harder to compare the results and means it cannot leverage the previous results for a smarter hyperparameter search.

@gaocegege
Member

Thanks for the issue. Do you mean we need to support creating new trials for the finished studies?

@richardsliu
Contributor

This is interesting. I think @janvdvegt is asking for a way to resubmit a StudyJob that has already completed, to continue off the results from the previous trials. This brings up some additional questions:

  • Do we want to allow resubmitting study jobs with different parameter search spaces?
  • What about adding/removing hyperparameters?
  • Changing the suggestion algorithm/parameters?

@YujiOshima
Contributor

@janvdvegt I think my PR #352 is related to your use case. It may not be enough yet.

@janvdvegt
Contributor Author

@YujiOshima I'm not 100% sure how related it is. Unfortunately, my Go is not good enough to dissect your PR.

@gaocegege Yeah, I guess. Maybe my proposal is not the best way to do it, but I would like to be able to continue a StudyJob beyond the initially submitted number of suggestions.

@richardsliu In my case I would not necessarily need to allow the first two options but it would not hurt either. The third option would not be necessary but would be nice to have.

Most importantly, I would like to be able to start an experiment without having to say up front how many trials I want to run. That would let me start an experiment run and later continue searching automatically when there are idle resources, or continue the job if the results seem promising. In that sense your three questions don't apply directly, but supporting them would not hurt at all.

@hougangliu
Member

hougangliu commented Jan 31, 2019

#291 is about how we should handle updates to completed and failed StudyJobs. Maybe we can trigger new trials by updating the requestCount field of the StudyJob.
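
To make the requestCount idea concrete, here is a minimal Go sketch of how a controller might react when the field is raised on a finished job. Everything in it is an assumption for illustration: the `studyJob` struct, the `RequestsDone` status field, and the condition strings are hypothetical and are not the real Katib types or controller logic.

```go
package main

import "fmt"

// studyJob is a hypothetical, simplified stand-in for a StudyJob. It is not
// the real Katib type; only the fields needed for the illustration exist.
type studyJob struct {
	RequestCount int    // spec: how many suggestion rounds the user wants
	RequestsDone int    // status: rounds already processed (assumed field)
	Condition    string // status: "Running" or "Completed" (assumed values)
}

// remainingRequests returns how many more suggestion rounds are needed to
// reach the desired RequestCount. It never returns a negative number.
func remainingRequests(job studyJob) int {
	if job.RequestCount > job.RequestsDone {
		return job.RequestCount - job.RequestsDone
	}
	return 0
}

// reconcile moves a job back to "Running" when rounds are still outstanding
// (for example after the user raised RequestCount on a finished job), and
// marks it "Completed" once nothing remains.
func reconcile(job studyJob) studyJob {
	if remainingRequests(job) > 0 {
		job.Condition = "Running"
	} else {
		job.Condition = "Completed"
	}
	return job
}

func main() {
	job := studyJob{RequestCount: 3, RequestsDone: 3, Condition: "Completed"}
	job.RequestCount = 5 // the user edits the finished StudyJob
	job = reconcile(job)
	fmt.Println(job.Condition, remainingRequests(job))
}
```

The point of the sketch is only that "completed" would become a derived state rather than a terminal one, so raising the count is enough to resume the search.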

@gaocegege
Member

I am wondering whether we could implement Trial as a CRD. Then we would have two CRDs: StudyJob and Trial. The relationship between the two resources would be similar to that between Job and Pod. In this way, we could fix this issue natively. Besides this, there are some other benefits:

  • Make managing Katib resources easier
  • Easy to implement the UI (Just CRUD two Kubernetes CRs)
  • More K8s native
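
A rough Go sketch of the Job/Pod-style relationship, under stated assumptions: the `study` and `trial` structs and the `newTrial` and `trialsOf` helpers are hypothetical and use plain structs instead of real Kubernetes or Katib API types. The `Owner` field plays the role an owner reference would play between a Pod and its Job.

```go
package main

import "fmt"

// study and trial are hypothetical, simplified stand-ins for the two custom
// resources discussed above. They are not real Katib or Kubernetes types.
type study struct {
	Name     string
	Finished bool
}

type trial struct {
	Name       string
	Owner      string            // name of the owning study, similar to an owner reference
	Parameters map[string]string // hyperparameter assignments for this trial
}

// newTrial creates a trial owned by the given study. Nothing here checks
// s.Finished: in this sketch a finished study can still receive new trials.
func newTrial(s study, index int, params map[string]string) trial {
	return trial{
		Name:       fmt.Sprintf("%s-trial-%d", s.Name, index),
		Owner:      s.Name,
		Parameters: params,
	}
}

// trialsOf returns the trials owned by the named study, so that earlier and
// later trials of the same study can be compared side by side.
func trialsOf(all []trial, studyName string) []trial {
	var owned []trial
	for _, t := range all {
		if t.Owner == studyName {
			owned = append(owned, t)
		}
	}
	return owned
}

func main() {
	s := study{Name: "demo", Finished: true}
	all := []trial{newTrial(s, 0, map[string]string{"lr": "0.1"})}
	all = append(all, newTrial(s, 1, map[string]string{"lr": "0.01"}))
	for _, t := range trialsOf(all, s.Name) {
		fmt.Println(t.Name, t.Parameters["lr"])
	}
}
```

With trials as separate objects that point back at their study, adding a trial to a finished study is just one more create call, and a UI only needs to list the trials by owner.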

/cc @YujiOshima @johnugeorge @richardsliu @hougangliu @gyliu513

@gaocegege changed the title from "Extending a StudyJob" to "Extending StudyJob API by adding more trials to finished StudyJob" on Feb 15, 2019

@johnugeorge
Member

Interesting idea. Can you create a proposal with more details?

@gaocegege
Member

Trial is already a CRD. But we still cannot create trials for a finished experiment.

@johnugeorge
Member

This is a duplicate of #891.

@johnugeorge
Member

Closing this issue
/close

@k8s-ci-robot

@johnugeorge: Closing this issue.

In response to this:

Closing this issue
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
