Network perf tests run on the Nutanix cloud error out after waiting for some time; the actual failure also needs to be reported as failed. #449

Open
SachinNinganure opened this issue Aug 2, 2022 · 7 comments

Comments

@SachinNinganure
Contributor

SachinNinganure commented Aug 2, 2022

Ran a network perf test for the Nutanix cloud and it ended with the error below:

08-01 23:27:55.713 [pod/uperf-client-10.131.0.33-23c6c6c9-jvqkk/benchmark] 2022-08-01T17:57:45Z - CRITICAL - MainProcess - process: After 3 attempts, unable to run command: ['uperf', '-v', '-a', '-R', '-i', '1', '-m', '/tmp/uperf-test/uperf-rr-udp-16384-16384-1', '-P', '30300']

08-01 23:27:55.713 [pod/uperf-client-10.131.0.33-23c6c6c9-jvqkk/benchmark] 2022-08-01T17:57:45Z - CRITICAL - MainProcess - uperf: Uperf failed to run! Got results: ProcessSample(expected_rc=0, success=False, attempts=3, timeout=None, failed=[ProcessRun(rc=1, stdout='Error getting SSL CTX:1\nAllocating shared memory of size 156624 bytes\nCompleted handshake phase 1\nStarting handshake phase 2\nHandshake phase 2 with 10.131.0.33\n Done preprocessing accepts\n Sent handshake header\n Sending workorder\n

The network perf test claims success while it has actually failed; the run above reported success despite the failure.
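
As a minimal sketch (not the actual e2e-benchmarking code; the benchmark name, namespace, and the .status.state path are assumptions for illustration), the wrapper script could poll the Benchmark CR and propagate a failure like this:

#!/usr/bin/env bash
# Sketch only: poll the Benchmark CR and exit non-zero when benchmark-operator
# marks it Failed, so the pipeline reports the run as failed.
BENCHMARK_NAME=${BENCHMARK_NAME:-uperf-benchmark}   # assumed name
NAMESPACE=${NAMESPACE:-benchmark-operator}          # assumed namespace

for attempt in $(seq 1 120); do
  state=$(oc get benchmark "${BENCHMARK_NAME}" -n "${NAMESPACE}" \
          -o jsonpath='{.status.state}' 2>/dev/null)
  case "${state}" in
    Complete)
      echo "Benchmark ${BENCHMARK_NAME} completed"
      exit 0
      ;;
    Failed)
      echo "Benchmark ${BENCHMARK_NAME} failed" >&2
      exit 1
      ;;
    *)
      sleep 30
      ;;
  esac
done
echo "Timed out waiting for benchmark ${BENCHMARK_NAME}" >&2
exit 1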

@jtaleric
Member

jtaleric commented Aug 2, 2022

This link seems to be an internal-only link.

The output leads me to think there are connectivity issues?

@jtaleric
Member

jtaleric commented Aug 2, 2022

08-01 17:57:55.710  uperf-client-10.131.0.33-23c6c6c9-jvqkk   0/1     Error               0          29m
08-01 17:57:55.710  uperf-client-10.131.0.33-23c6c6c9-r9phc   0/1     ContainerCreating   0          2s

From the internal-only link you provided, can you determine why the clients went into an error state for the 4p test?

@jtaleric
Member

jtaleric commented Aug 2, 2022

08-01 17:57:55.714  [pod/uperf-server-3-23c6c6c9-k8jz2/benchmark] Error creating ports: Address already in use

Seems like there might be a race condition?
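
If it is a race on the control port, one possible mitigation (just a sketch; it assumes the server container has a shell and the ss utility, and that 30300 is the port in question, taken from the client command above) would be to wait for the port to be released before starting the server:

# Sketch only: wait until nothing is listening on the uperf control port
# before launching the server, to avoid "Address already in use".
PORT=30300          # taken from the client log above; adjust as needed
for i in $(seq 1 30); do
  if ! ss -tln | grep -q ":${PORT} "; then
    break
  fi
  echo "port ${PORT} still in use, waiting..."
  sleep 2
done
# ...then start the uperf server on ${PORT}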

@SachinNinganure
Contributor Author

I will rerun and provide more info; taking the link from here.

@rsevilla87
Member

rsevilla87 commented Aug 3, 2022

I'm not sure whether this error is a consequence of #444 or whether it has been present for a while and cloud-bulldozer/benchmark-operator#782 surfaced it. I'm still trying to determine the cause of this issue; in my case it happened in the first sample of the stream-udp-16384-16384-1 scenario.

BTW @SachinNinganure, what do you mean by the benchmark claiming success when this error happens? I ran a pod2pod benchmark that failed in a similar way, and there the benchmark was marked as Failed and the exit code of the script was 1.

ubuntu@ip-10-0-24-166:~/e2e-benchmarking/workloads/network-perf$ oc get benchmark
NAME              TYPE    STATE    METADATA STATE   SYSTEM METRICS   UUID                                   AGE
uperf-pod2pod-4   uperf   Failed   not collected    Not collected    00f201d7-fea6-4fa6-8d40-f205e2728780   29m

Can you verify that, please?
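
For example, something along these lines would show both signals at once (the workload script name and the namespace below are placeholders, not necessarily the real ones):

# Placeholder names: run the workload, capture the script's exit code,
# and read the Benchmark CR state reported by benchmark-operator.
./run.sh
echo "script exit code: $?"
oc get benchmark -n benchmark-operator \
   -o custom-columns=NAME:.metadata.name,STATE:.status.state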

@rsevilla87
Member

There's a similar issue created some time ago: #247

@SachinNinganure
Contributor Author

@rsevilla87: In a few tests I ran, even though the test failed, the pipeline did not mark the failure. From your comments I am slightly unsure whether that happens every time. I will trigger some runs to reproduce the failure and see if the pipeline reports it.
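
A simple way to collect that data (again just a sketch; the script name is a placeholder) would be to trigger several runs and record each exit code:

# Sketch only: repeat the workload a few times and log the exit codes,
# to check whether the pipeline actually reports the uperf failures.
for i in $(seq 1 5); do
  ./run.sh
  echo "run ${i}: exit code $?" | tee -a results.log
done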
