Add demo for run_models experimental method.
PiperOrigin-RevId: 388982644
juanuribe28 authored and Tensorflow Cloud maintainers committed Aug 6, 2021
1 parent cb4108f commit 4922b2b
Showing 2 changed files with 343 additions and 0 deletions.
2 changes: 2 additions & 0 deletions g3doc/_book.yaml
@@ -22,6 +22,8 @@ upper_tabs:
path: /cloud/tutorials/distributed_training_nasnet_with_tensorflow_cloud
- title: Hyperparameter tuning on Google Cloud
path: /cloud/tutorials/hp_tuning_cifar10_using_google_cloud
- title: Running vision models from TF Model Garden on GCP with TF Cloud
path: /cloud/tutorials/experimental/running_vision_models_from_tf_model_garden_on_gcp_with_tf_cloud
- heading: "Guides"
- title: Cloud `run` guide
path: /cloud/guides/run_guide
@@ -0,0 +1,341 @@
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Running vision models from TF Model Garden on GCP with TF Cloud",
"provenance": [],
"collapsed_sections": [],
"toc_visible": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "ApxORpbFShVP"
},
"source": [
"##### Copyright 2021 The TensorFlow Cloud Authors."
]
},
{
"cell_type": "code",
"metadata": {
"id": "eR70XKMMmC8I",
"cellView": "form"
},
"source": [
"#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
"# you may not use this file except in compliance with the License.\n",
"# You may obtain a copy of the License at\n",
"#\n",
"# https://www.apache.org/licenses/LICENSE-2.0\n",
"#\n",
"# Unless required by applicable law or agreed to in writing, software\n",
"# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
"# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
"# See the License for the specific language governing permissions and\n",
"# limitations under the License."
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "wKcTRRxsAmDl"
},
"source": [
"# Running vision models from TF Model Garden on GCP with TF Cloud\n",
"\n",
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td>\n",
" <!-- <a target=\"_blank\" href=\"https://www.tensorflow.org/cloud/tutorials/overview.ipynb\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a> MISSING HREF -->\n",
" </td>\n",
" <td>\n",
" <!-- <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/cloud/blob/master/g3doc/tutorials/overview.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a> MISSING HREF -->\n",
" </td>\n",
" <td>\n",
" <!-- <a target=\"_blank\" href=\"https://github.com/tensorflow/cloud/blob/master/g3doc/tutorials/overview.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View on GitHub</a> MISSING HREF -->\n",
" </td>\n",
" <td>\n",
" <!-- <a href=\"https://storage.googleapis.com/tensorflow_docs/cloud/tutorials/overview.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a> MISSING HREF -->\n",
" </td>\n",
" <td>\n",
" <!-- <a href=\"https://kaggle.com/kernels/welcome?src=https://github.com/tensorflow/cloud/blob/master/g3doc/tutorials/overview.ipynb\" target=\"_blank\"> <img width=\"90\" src=\"https://www.kaggle.com/static/images/site-logo.png\" alt=\"Kaggle logo\" />Run in Kaggle</a> MISSING HREF -->\n",
" </td>\n",
"</table>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "FAUbwFuJB3bw"
},
"source": [
"In this example we will use [run_models](https://github.com/tensorflow/cloud/blob/690c3eee65dadee8af260a19341ff23f42f1f070/src/python/tensorflow_cloud/core/experimental/models.py#L30) from the experimental module of TF Cloud to train a ResNet model from [TF Model Garden](https://github.com/tensorflow/models/tree/master/official) on [imagenette from TFDS](https://www.tensorflow.org/datasets/catalog/imagenette)."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EFCSAVDbC8-W"
},
"source": [
"## Install Packages\n",
"\n",
"We need a development version of tensorflow-cloud installed directly from GitHub, the official release of tf-models-official, and keras 2.6.0rc0 for compatibility."
]
},
{
"cell_type": "code",
"metadata": {
"id": "r4sSs1azu-Ti"
},
"source": [
"!pip install -q 'git+https://github.com/tensorflow/cloud.git@refs/pull/352/head#egg=tensorflow-cloud&subdirectory=src/python' tf-models-official keras==2.6.0rc0"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "N3NC5vrDslsf"
},
"source": [
"## Import required modules"
]
},
{
"cell_type": "code",
"metadata": {
"id": "sdkgm_6PvHkk",
"colab": {
"base_uri": "https://localhost:8080/"
},
"outputId": "c17384b4-07f1-493c-edf8-5eadde79524f"
},
"source": [
"import os\n",
"import sys\n",
"\n",
"import tensorflow_cloud as tfc\n",
"from tensorflow_cloud.core.experimental.models import run_models\n",
"\n",
"print(tfc.__version__)"
],
"execution_count": 2,
"outputs": [
{
"output_type": "stream",
"text": [
"0.1.17.dev\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ka6MHtF-tTU1"
},
"source": [
"## Project Configurations\n",
"Set the project parameters. For more details on the Google Cloud specific parameters, refer to [Google Cloud Project Setup Instructions](https://www.kaggle.com/nitric/google-cloud-project-setup-instructions/)."
]
},
{
"cell_type": "code",
"metadata": {
"id": "OFPPSLF9vx4H"
},
"source": [
"# Set Google Cloud Specific parameters\n",
"\n",
"# TODO: Please set GCP_PROJECT_ID to your own Google Cloud project ID.\n",
"GCP_PROJECT_ID = 'YOUR_PROJECT_ID' #@param {type:\"string\"}\n",
"\n",
"# TODO: set GCS_BUCKET to your own Google Cloud Storage (GCS) bucket.\n",
"GCS_BUCKET = 'YOUR_GCS_BUCKET_NAME' #@param {type:\"string\"}\n",
"\n",
"# DO NOT CHANGE: Currently only the 'us-central1' region is supported.\n",
"REGION = 'us-central1'\n",
"\n",
"# OPTIONAL: You can change the job name to any string.\n",
"JOB_NAME = 'run_models_demo' #@param {type:\"string\"}"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "F1_shlH4tUM5"
},
"source": [
"## Authenticating the notebook to use your Google Cloud Project\n",
"\n",
"This code authenticates the notebook and checks that your Google Cloud credentials and identity are valid. It is inside the `if not tfc.remote()` block so that it runs only in the notebook and not when the notebook code is sent to Google Cloud.\n",
"\n",
"Note: For Kaggle Notebooks click on \"Add-ons\"->\"Google Cloud SDK\" before running the cell below."
]
},
{
"cell_type": "code",
"metadata": {
"id": "EeW7IHBgtPJD"
},
"source": [
"if not tfc.remote():\n",
"\n",
" # Authentication for Kaggle Notebooks\n",
" if \"kaggle_secrets\" in sys.modules:\n",
" from kaggle_secrets import UserSecretsClient\n",
" UserSecretsClient().set_gcloud_credentials(project=GCP_PROJECT_ID)\n",
"\n",
" # Authentication for Colab Notebooks\n",
" if \"google.colab\" in sys.modules:\n",
" from google.colab import auth\n",
" auth.authenticate_user()\n",
" os.environ[\"GOOGLE_CLOUD_PROJECT\"] = GCP_PROJECT_ID"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "EQrVntO2twh1"
},
"source": [
"## Set up TensorFlow Cloud run\n",
"\n",
"Set up parameters for `tfc.run()`. The `chief_config`, `worker_count` and `worker_config` parameters depend on the distribution strategy; this example sets only `chief_config`. For more details, refer to the [TensorFlow Cloud overview tutorial](https://colab.research.google.com/github/tensorflow/cloud/blob/master/g3doc/tutorials/overview.ipynb)."
]
},
{
"cell_type": "code",
"metadata": {
"id": "o539iLTKv9a3"
},
"source": [
"with open('requirements.txt','w') as f:\n",
" f.write('git+https://github.com/tensorflow/cloud.git@refs/pull/352/head#egg=tensorflow-cloud&subdirectory=src/python\\n'+\n",
" 'tf-models-official\\n'+\n",
" 'keras==2.6.0rc0')\n",
"\n",
"run_kwargs = dict(\n",
" requirements_txt = 'requirements.txt',\n",
" docker_config=tfc.DockerConfig(\n",
" parent_image=\"gcr.io/deeplearning-platform-release/tf2-gpu.2-5\",\n",
" image_build_bucket=GCS_BUCKET\n",
" ),\n",
" chief_config=tfc.COMMON_MACHINE_CONFIGS[\"P100_4X\"],\n",
" job_labels={'job': JOB_NAME}\n",
")"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "hd4luG7nt3_0"
},
"source": [
"## Run the training using run_models"
]
},
{
"cell_type": "code",
"metadata": {
"id": "_aVt71qpxHUe"
},
"source": [
"values = run_models(\n",
" 'imagenette',\n",
" 'resnet',\n",
" GCS_BUCKET,\n",
" 'train',\n",
" 'validation',\n",
" **run_kwargs,\n",
" )"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ku7oBH8iuc2X"
},
"source": [
"## Training Results\n",
"\n",
"### Reconnect your Colab instance\n",
"\n",
"Most remote training jobs are long-running. If you are using Colab, it may time out before the training results are available.\n",
"\n",
"In that case, **rerun the following sections in order** to reconnect and configure your Colab instance to access the training results.\n",
"\n",
"1. Import required modules\n",
"2. Project Configurations\n",
"3. Authenticating the notebook to use your Google Cloud Project\n",
"\n",
"**DO NOT** rerun the rest of the code.\n",
"\n",
"### Load TensorBoard\n",
"While the training is in progress, you can use TensorBoard to view the results. Note that the results appear only after your training has started, which may take a few minutes."
]
},
{
"cell_type": "code",
"metadata": {
"id": "rhVCh8x9upRY"
},
"source": [
"if not tfc.remote():\n",
" %load_ext tensorboard\n",
" tensorboard_logs_dir = values['tensorboard_logs']\n",
" %tensorboard --logdir $tensorboard_logs_dir"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "kOU5Gu4Ku1Qc"
},
"source": [
"### Load your trained model\n",
"\n",
"Once training is complete, you can retrieve your model from the GCS bucket you specified above."
]
},
{
"cell_type": "code",
"metadata": {
"id": "rHoQnqKhu2Y8"
},
"source": [
"import tensorflow as tf\n",
"if not tfc.remote():\n",
" trained_model = tf.keras.models.load_model(values['saved_model'])\n",
" trained_model.summary()"
],
"execution_count": null,
"outputs": []
}
]
}
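
The setup cell in the notebook above builds `requirements.txt` by concatenating string literals. As a minimal sketch, the same file could be generated from a list of requirement specifiers, which keeps one entry per line and ends the file with a newline. `REQUIREMENTS` and `write_requirements` are hypothetical names introduced here for illustration; they are not part of TensorFlow Cloud, and the specifiers are copied from the notebook.

```python
from pathlib import Path

# Requirement specifiers copied from the notebook's setup cell.
REQUIREMENTS = [
    "git+https://github.com/tensorflow/cloud.git@refs/pull/352/head"
    "#egg=tensorflow-cloud&subdirectory=src/python",
    "tf-models-official",
    "keras==2.6.0rc0",
]


def write_requirements(path, requirements):
    """Write one requirement per line.

    Hypothetical helper for illustration; not part of TensorFlow Cloud.
    Returns the text that was written so callers can inspect it.
    """
    text = "\n".join(requirements) + "\n"
    Path(path).write_text(text)
    return text


if __name__ == "__main__":
    # Writes requirements.txt into the current working directory.
    write_requirements("requirements.txt", REQUIREMENTS)
```

The resulting path can then be passed as `requirements_txt` in `run_kwargs`, as the notebook already does.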
