Skip to content
This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Add support for Weights and Biases #3417

Merged
merged 7 commits into from
Feb 12, 2021
Merged

Add support for Weights and Biases #3417

merged 7 commits into from
Feb 12, 2021

Conversation

stephenroller
Copy link
Contributor

Patch description
Thanks to @annirudh for making the initial version of this a long time ago, and to @lavanyashukla for support internally. We've been playing with it internally and find W&B to be an excellent addition now that our metrics have grown.

Testing steps
Manual testing.

@lavanyashukla
Copy link

Thanks for making the PR @stephenroller! Really excited about this.

@annirudh
Copy link

Wow, glad to see this going in! Thank you, @stephenroller!

@stephenroller stephenroller requested review from emilydinan and jxmsML and removed request for klshuster January 27, 2021 03:22
"""
Add w&b CLI args.
"""
logger = parser.add_argument_group('Tensorboard Arguments')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "Wandb Arguments"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: still says Tensorboard

'--wandb-log',
type='bool',
default=False,
help="Enable W&B logging of metrics, default is %(default)s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is default false here? LOL maybe same question, why is tensorboard_log default False

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm good with keeping it False, since it requires additional overhead/setup that is not necessary to just train a model


def __init__(self, opt: Opt, model=None):
try:
# tensorboard is a very expensive thing to import. Wait until the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: tensorboard in the comment

raise ImportError('Please run `pip install wandb`.')

name = opt.get('wandb_name')
project = opt.get('wandb_project') or datetime.datetime.now().strftime(
Copy link
Contributor

@jxmsML jxmsML Jan 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. if it is set by timestamps, what if my job gets pre-empted, would it setup a new wandblogger then in this case?
  2. Just curious if we need to set reinit=False here in case of preempted jobs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. In our sweep setup, I have this parameter automatically set to the sweep name. So we should be good
  2. Good question. Let's try it and see what happens.

Copy link
Contributor

@jxmsML jxmsML left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

excited!! a few nits

Copy link
Contributor

@klshuster klshuster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super excited for this

type='bool',
default=False,
help="Enable W&B logging of metrics, default is %(default)s",
hidden=False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason we're specifying hidden=False? just for verbosity?

'--wandb-log',
type='bool',
default=False,
help="Enable W&B logging of metrics, default is %(default)s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm good with keeping it False, since it requires additional overhead/setup that is not necessary to just train a model


def log_metrics(self, setting, step, report):
"""
Add all metrics from tensorboard_metrics opt key.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: wandb_metrics

@stephenroller
Copy link
Contributor Author

Requesting a re-review for sanity.

I think I'd land this fast and start crowdsourcing bugs instead of manually testing myself.

Copy link
Contributor

@klshuster klshuster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

"""
Add w&b CLI args.
"""
logger = parser.add_argument_group('Tensorboard Arguments')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: still says Tensorboard

'--wandb-name',
type=str,
default=None,
help='W&B run name. If not set, WandB will randomly generate a name.',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does a randomly generated name look like?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like "peachy-armadillo-1" or sth like that

image

For our internal sweeps, it's set to the name of the directory (e.g. hashname)

for key, value in opt.items():
if value is None or isinstance(value, (str, numbers.Number, tuple)):
setattr(self.run.config, key, value)
if model is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when will model be None?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non torchagent models I believe

Copy link
Contributor

@jxmsML jxmsML left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@stephenroller
Copy link
Contributor Author

Wtf

@klshuster
Copy link
Contributor

Wtf

what?

@stephenroller stephenroller merged commit 4f9ee04 into master Feb 12, 2021
@stephenroller stephenroller deleted the wandb branch February 12, 2021 03:34
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants