Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection sometimes performs extra retry when it should transition to SUSPENDED state #1782

Open
lawrence-forooghian opened this issue Jul 5, 2023 · 2 comments
Labels
bug Something isn't working. It's clear that this does need to be fixed.

Comments

@lawrence-forooghian
Copy link
Collaborator

lawrence-forooghian commented Jul 5, 2023

Issue description

ARTRealtime will sometimes perform one more connection retry than it should — that is, it will perform a connection retry after connectionStateTtl has been exceeded, violating RTN14e.

Steps to reproduce

The steps vary depending on whether you're viewing the code before or after PR #1613, which made some test changes.

Observed behaviour

The test fails because one more retry than expected was made.

Underlying issue

By adding logging to ARTRealtime (see 6f738c5), we see the following messages:

2023-07-05 14:12:19.420393-0300 xctest[29178:1697374] VERBOSE: (ARTRealtime.m:674) RT:0x102d045a0 set connection lost time; expected suspension at 2023-07-05T17:12:24.320Z (ttl=4.900000)
2023-07-05 14:12:19.316145-0300 xctest[29178:1696902] VERBOSE: (ARTRealtime.m:1099) Scheduling timeout work in transportConnectForcingNewToken:newConnection:
2023-07-05 14:12:20.453626-0300 xctest[29178:1697372] VERBOSE: (ARTRealtime.m:1099) Scheduling timeout work in transportConnectForcingNewToken:newConnection:
2023-07-05 14:12:21.616243-0300 xctest[29178:1697374] VERBOSE: (ARTRealtime.m:1099) Scheduling timeout work in transportConnectForcingNewToken:newConnection:
2023-07-05 14:12:22.745132-0300 xctest[29178:1697374] VERBOSE: (ARTRealtime.m:1099) Scheduling timeout work in transportConnectForcingNewToken:newConnection:
2023-07-05 14:12:23.882761-0300 xctest[29178:1697374] VERBOSE: (ARTRealtime.m:1099) Scheduling timeout work in transportConnectForcingNewToken:newConnection:
2023-07-05 14:12:25.046635-0300 xctest[29178:1697372] VERBOSE: (ARTRealtime.m:1099) Scheduling timeout work in transportConnectForcingNewToken:newConnection:

Notice that the final "Scheduling timeout work" log message occurs after the date described by the "expected suspension at" log message. This suggests that the connection is only checking whether it should enter SUSPENDED in the moment that a timeout occurs, and not before scheduling work that might time out.

┆Issue is synchronized with this Jira Bug by Unito

@lawrence-forooghian lawrence-forooghian added the bug Something isn't working. It's clear that this does need to be fixed. label Jul 5, 2023
@sync-by-unito
Copy link

sync-by-unito bot commented Jul 5, 2023

➤ Automation for Jira commented:

The link to the corresponding Jira issue is https://ably.atlassian.net/browse/SDK-3721

lawrence-forooghian added a commit that referenced this issue Jul 5, 2023
This bug describes how the library will sometimes perform an extra retry
when it should transition to the SUSPENDED state. This can cause
test__059 to fail depending on the exact chosen value of
connectionStateTtl. In #1431 I will be changing the value of
connectionStateTtl that this test uses, and the value I want to use does
indeed cause the test to fail. So, adjust the test to handle the case
where an extra retry occurs. We should remove this workaround once we
fix #1782.
@lawrence-forooghian
Copy link
Collaborator Author

Oops — GitHub got confused by my commit message for 706f269 and thought that it fixed this issue. It doesn’t, reopening.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working. It's clear that this does need to be fixed.
Development

No branches or pull requests

1 participant