Failure in UpgradeFromSpecificVersion.test_basic_upgrade #5417

Closed
mmaslankaprv opened this issue Jul 11, 2022 · 6 comments

Comments

@mmaslankaprv
Member

Failure:

Module: rptest.tests.upgrade_test
Class:  UpgradeFromSpecificVersion
Method: test_basic_upgrade
[INFO  - 2022-07-11 08:28:56,402 - runner_client - log - lineno:278]: RunnerClient: rptest.tests.upgrade_test.UpgradeFromSpecificVersion.test_basic_upgrade: FAIL: RemoteCommandError({'ssh_config': {'host': 'docker-rp-5', 'hostname': 'docker-rp-5', 'user': 'root', 'port': 22, 'password': 'UNUSED', 'identityfile': '/root/.ssh/id_rsa'}, 'hostname': 'docker-rp-5', 'ssh_hostname': 'docker-rp-5', 'user': 'root', 'externally_routable_ip': 'docker-rp-5', '_logger': <Logger rptest.tests.upgrade_test.UpgradeFromSpecificVersion.test_basic_upgrade-351 (DEBUG)>, 'os': 'linux', '_ssh_client': <paramiko.client.SSHClient object at 0x7fc18dd377c0>, '_sftp_client': <paramiko.sftp_client.SFTPClient object at 0x7fc18dd0d160>}, 'curl -fsSL https://packages.vectorized.io/qSZR7V26sJx7tCXe/redpanda/raw/names/redpanda-amd64/versions/22.1.3/redpanda-22.1.3-amd64.tar.gz --create-dir --output-dir /opt/redpanda_installs/v22.1.3 -o redpanda.tar.gz && gunzip -c /opt/redpanda_installs/v22.1.3/redpanda.tar.gz | tar -xf - -C /opt/redpanda_installs/v22.1.3 && rm /opt/redpanda_installs/v22.1.3/redpanda.tar.gz && unlink /opt/redpanda && ln -s /opt/redpanda_installs/v22.1.3 /opt/redpanda', 22, b'')
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", line 133, in run
    self.setup_test()
  File "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", line 218, in setup_test
    self.test.setup()
  File "/usr/local/lib/python3.9/dist-packages/ducktape/tests/test.py", line 91, in setup
    self.setUp()
  File "/root/tests/rptest/tests/upgrade_test.py", line 52, in setUp
    self.installer.install(self.redpanda.nodes, (22, 1, 3))
  File "/root/tests/rptest/services/redpanda_installer.py", line 199, in install
    self.wait_for_async_ssh(self._redpanda.logger, ssh_install_per_node,
  File "/root/tests/rptest/services/redpanda_installer.py", line 53, in wait_for_async_ssh
    for l in ssh_out_per_node[node]:
  File "/usr/local/lib/python3.9/dist-packages/ducktape/cluster/remoteaccount.py", line 652, in next
    return next(self.iter_obj)
  File "/usr/local/lib/python3.9/dist-packages/ducktape/cluster/remoteaccount.py", line 328, in output_generator
    raise RemoteCommandError(self, cmd, exit_status, stderr.read())
ducktape.cluster.remoteaccount.RemoteCommandError: root@docker-rp-5: Command 'curl -fsSL https://packages.vectorized.io/qSZR7V26sJx7tCXe/redpanda/raw/names/redpanda-amd64/versions/22.1.3/redpanda-22.1.3-amd64.tar.gz --create-dir --output-dir /opt/redpanda_installs/v22.1.3 -o redpanda.tar.gz && gunzip -c /opt/redpanda_installs/v22.1.3/redpanda.tar.gz | tar -xf - -C /opt/redpanda_installs/v22.1.3 && rm /opt/redpanda_installs/v22.1.3/redpanda.tar.gz && unlink /opt/redpanda && ln -s /opt/redpanda_installs/v22.1.3 /opt/redpanda' returned non-zero exit status 22.

https://buildkite.com/redpanda/redpanda/builds/12374#0181ebec-26ff-47d1-95e9-2c070d18e3f5

@bharathv
Contributor

A different stack trace from the same test in this build, dumping here for the record.

--------------------------------------------------------------------------------
test_id:    rptest.tests.upgrade_test.UpgradeFromSpecificVersion.test_basic_upgrade
status:     FAIL
run time:   41.606 seconds

    TimeoutError('Redpanda service docker-rp-24 failed to start within 20 sec')
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", line 133, in run
    self.setup_test()
  File "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", line 218, in setup_test
    self.test.setup()
  File "/usr/local/lib/python3.9/dist-packages/ducktape/tests/test.py", line 91, in setup
    self.setUp()
  File "/root/tests/rptest/tests/upgrade_test.py", line 53, in setUp
    super(UpgradeFromSpecificVersion, self).setUp()
  File "/root/tests/rptest/tests/redpanda_test.py", line 103, in setUp
    self.redpanda.start()
  File "/root/tests/rptest/services/redpanda.py", line 615, in start
    self.start_node(node)
  File "/root/tests/rptest/services/redpanda.py", line 837, in start_node
    self.start_service(node, start_rp)
  File "/root/tests/rptest/services/redpanda.py", line 857, in start_service
    start()
  File "/root/tests/rptest/services/redpanda.py", line 828, in start_rp
    wait_until(
  File "/usr/local/lib/python3.9/dist-packages/ducktape/utils/util.py", line 58, in wait_until
    raise TimeoutError(err_msg() if callable(err_msg) else err_msg) from last_exception
ducktape.errors.TimeoutError: Redpanda service docker-rp-24 failed to start within 20 sec

@andrwng
Contributor

andrwng commented Jul 13, 2022

Thanks @bharathv! That looks to be #5437

INFO  2022-07-12 18:41:34,909 [shard 0] redpanda::main - application.cc:205 - Failure during startup: std::invalid_argument (Unknown property partition_autobalancing_mode)

@andrwng
Contributor

andrwng commented Jul 13, 2022

Reposting discussion here for posterity: #5282 (comment)

The crux of the issue is that the upgrade tests currently pull some GBs of packages on each run of the full ducktape test suite. As we add more upgrade tests, this number will only go up, and we will hit our Cloudsmith limits. We need to either start downloading less in our tests or build out infrastructure that bypasses Cloudsmith entirely.

I have a PR in progress that does the former by sharing a bind mount across all test containers.
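
For illustration, the "download less" approach could look roughly like the sketch below. This is not the code from the PR. It assumes a cache directory that is bind-mounted into every test container on the same host, and `ensure_cached` and the `fetch` callable are hypothetical names standing in for whatever downloads and unpacks a release.

```python
import fcntl
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Tuple

Version = Tuple[int, int, int]


def ensure_cached(cache_root: Path, version: Version,
                  fetch: Callable[[Path], None]) -> Path:
    """Return the install dir for `version`, fetching it at most once.

    Assumptions (not taken from the real installer): `cache_root` is a
    directory shared by all test containers via a bind mount, and `fetch`
    downloads and unpacks the release into the directory it is given.
    """
    cache_root.mkdir(parents=True, exist_ok=True)
    target = cache_root / "v{}.{}.{}".format(*version)
    lock_path = cache_root / (target.name + ".lock")

    # An exclusive flock serializes containers racing on the same version.
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            if target.is_dir():
                # Another container (or an earlier test) already fetched it.
                return target
            # Unpack into a scratch dir first so that a failed download
            # never leaves a half-populated target behind.
            scratch = Path(
                tempfile.mkdtemp(dir=cache_root, prefix=target.name + "."))
            try:
                fetch(scratch)
                os.rename(scratch, target)
            except BaseException:
                shutil.rmtree(scratch, ignore_errors=True)
                raise
            return target
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

With something like this, each version would be pulled from the package host once per CI host instead of once per node per test, and a node would only need to point its `/opt/redpanda` symlink at the returned directory.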

@ajfabbri
Contributor

Failure in CI here.

@andrwng
Contributor

andrwng commented Jul 14, 2022

> Failure in CI here.

@ajfabbri that looks like #5437. You should be good to go if you rebase on dev.

@andrwng
Contributor

andrwng commented Jul 15, 2022

The immediate issue here has been fixed for a while (we are no longer seeing failures to download the package). #5459 helps mitigate this in the future by reusing downloads across containers.

@andrwng andrwng closed this as completed Jul 15, 2022