azurerm_virtual_machine_run_command timeout bug #27428

Open
1 task done
vangork opened this issue Sep 18, 2024 · 4 comments

vangork commented Sep 18, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave comments along the lines of "+1", "me too" or "any updates", they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment and review the contribution guide to help.

Terraform Version

1.7.5

AzureRM Provider Version

3.116.0

Affected Resource(s)/Data Source(s)

azurerm_virtual_machine_run_command

Terraform Configuration Files

resource "azurerm_linux_virtual_machine" "installer" {
  # Other required arguments omitted for brevity.
  custom_data = filebase64("${path.module}/any_job_longer_than_90mins.sh")
}

resource "azurerm_virtual_machine_run_command" "wait_pfmp_installation_status" {
  location           = var.location
  name               = "wait_pfmp_installation_status"
  virtual_machine_id = azurerm_linux_virtual_machine.installer.id
  source {
    script = "cloud-init status --wait"
  }
  timeouts {
    create = "180m"
  }
}

Debug Output/Panic Output

│ Error: running the command: polling failed: the Azure API returned the following error:
│
│ Status: "VMExtensionProvisioningTimeout"
│ Code: ""
│ Message: "Provisioning of VM extension wait_pfmp_installation_status has timed out. Extension provisioning has taken too long to complete. The extension last reported \"Plugin enabled\".\r\n\r\nMore information on troubleshooting is available at https://aka.ms/RunCommandManagedLinux"
│ Activity Id: ""
│
│ ---
│
│ API Response:
│
│ ----[start]----
│ {
│   "startTime": "2024-09-18T14:29:48.690995+00:00",
│   "endTime": "2024-09-18T15:59:58.6403596+00:00",
│   "status": "Failed",
│   "error": {
│     "code": "VMExtensionProvisioningTimeout",
│     "message": "Provisioning of VM extension wait_pfmp_installation_status has timed out. Extension provisioning has taken too long to complete. The extension last reported \"Plugin enabled\".\r\n\r\nMore information on troubleshooting is available at https://aka.ms/RunCommandManagedLinux"
│   },
│   "name": "4d63493f-b3c9-403d-a969-9831042ba6d2"
│ }
│ -----[end]-----
│
│
│   with module.installer_node.azurerm_virtual_machine_run_command.wait_pfmp_installation_status,
│   on ..\..\modules\installer\main.tf line 79, in resource "azurerm_virtual_machine_run_command" "wait_pfmp_installation_status":
│   79: resource "azurerm_virtual_machine_run_command" "wait_pfmp_installation_status" {

Expected Behaviour

My VM runs a "pfmp installation" job, set in custom_data, that takes around 2 to 2.5 hours. I want to use a managed run command with "cloud-init status --wait" to check whether the job is done before moving forward. According to https://learn.microsoft.com/en-us/azure/virtual-machines/linux/run-command-managed, managed run commands should support long-running (hours/days) scripts.

Actual Behaviour

However, the managed run command times out after 90 minutes, even though the create timeout is set to 180m.

Steps to Reproduce

No response

Important Factoids

No response

References

No response

@github-actions github-actions bot added the v/3.x label Sep 18, 2024
@Chambras
Contributor

@vangork Interesting. Have you tried with version 4.2.0?

@vangork
Author

vangork commented Sep 20, 2024

@Chambras I've checked the code; there is no change between 4.2.0 and 3.116.0 for azurerm_virtual_machine_run_command.
My guess is that extensions_time_budget of azurerm_linux_virtual_machine, which defaults to 90 minutes, causes the limitation, but this value can only be set within [15 mins, 120 mins].

Shall we allow a wider range for extensions_time_budget?
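
For reference, a minimal sketch of raising the extension time budget to its current maximum. It assumes the budget is what caps the run command, which is only a guess at this point, and it is a fragment rather than a complete resource. The provider expects an ISO 8601 duration for extensions_time_budget, with a documented default of PT1H30M. Even at the maximum of PT2H this still falls short of the 180 minutes needed here.

```hcl
resource "azurerm_linux_virtual_machine" "installer" {
  # Other required arguments omitted; this fragment only shows the budget setting.
  custom_data = filebase64("${path.module}/any_job_longer_than_90mins.sh")

  # ISO 8601 duration. PT2H (120 minutes) is the upper bound of the allowed range;
  # when unset, the value defaults to PT1H30M (90 minutes).
  extensions_time_budget = "PT2H"
}
```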

@teowa
Contributor

teowa commented Sep 20, 2024

Hi @vangork, there is a blog that might help. For now, the azurerm_virtual_machine_run_command resource only supports synchronous mode, according to a review comment. As for azurerm_linux_virtual_machine.extensions_time_budget, this seems to be an API limitation: the value must be between '15' and '120' minutes, and sending 180 returns this error message:

performing CreateOrUpdate: unexpected status 400 (400 Bad Request) with error: InvalidParameter: The value 180 of parameter 'extensionsTimeBudget' is out of range. The value must be between '15' and '120', inclusive.

@vangork
Author

vangork commented Sep 20, 2024

@teowa Thanks for the info. Even if azurerm_virtual_machine_run_command supports async mode later, it won't solve my problem. The custom-data of azurerm_linux_virtual_machine, which uses the cloud-init agent, can already start an async long-running job after VM creation. I just need a way to track the successful execution of the job, get the output for downstream resources, and move forward. So far I have only found custom-data, user-data, CustomScript and RunCommand for executing a command inside the VM, but unfortunately none of them supports a timeout of 3 hours. Do you see any other way to do this?

3 participants