🐛 Source HubSpot: fix infinite loop when iterating through search results #44899
…n iterating through search results to avoid infinite loop fixes airbytehq/airbyte/airbytehq#43317
I'm putting this here as a comment in case somebody wants to understand the change in the future and has questions. Before:
After:
"filters": [
    {"value": int(self._state.timestamp() * 1000), "propertyName": self.last_modified_field, "operator": "GTE"},
    {"value": int(self._init_sync.timestamp() * 1000), "propertyName": self.last_modified_field, "operator": "LTE"},
    {"value": last_id, "propertyName": self.primary_key, "operator": "GTE"},
],
"sorts": [{"propertyName": self.primary_key, "direction": "ASCENDING"}],
@ehearty, would you mind adding some documentation here about why this is needed? It doesn't need to be as long as my comment in the PR, but a TL;DR on why we have a second filter would complement the one in the read records section that talks about the 10,000-result limit.
My only concern is that we don't have a specific test for this scenario. Would you mind looking into adding one? Please let me know what you think.
*documentation = comments
docs/integrations/sources/hubspot.md
@@ -331,7 +331,8 @@ The connector is restricted by normal HubSpot [rate limitations](https://legacyd
<summary>Expand to review</summary>

| Version | Date | Pull Request | Subject |
|:--------|:-----------|:---------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|:--------|:-----------| :------------------------------------------------------- |:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 4.2.19 | 2024-08-29 | [42688](https://github.com/airbytehq/airbyte/pull/44899) | Fix incremental search to use primary key as placeholder instead of lastModifiedDate |
We need to update the date here.
@ehearty LGTM. I left some comments, but I'm happy that the unit test is good and regression is returning good results.
Thanks @aldogonzalez8 - time permitting, I'll try to have the changes you requested in by the end of the week.
@ehearty If you make changes before the weekend, can you put the release date on Monday? I prefer not to ship on Thursday afternoon-Friday. If all tests pass and everything looks good, I will approve and later merge that day. Again, just if you do it before Friday. Thanks.
Sorry for the delay. I've specifically blocked off time this week to make those final updates and will have them submitted in the next couple of days.
@aldogonzalez8 Just letting you know that I've jumped back into the PR today and am working on a unit test for the new sort/filter parameters. Hoping to have the changes in by tomorrow, but will set the release date to Monday as requested. Thanks again for your patience.
@ehearty That makes sense to me, thanks!
🐛 Source HubSpot: fix infinite loop when iterating through search results
What
Fixes airbytehq/airbyte/#43317
How
I updated our custom HS connector to filter on both primary key AND timestamp, then to sort by the object's primary key. The timestamps now return out of order, but once we finish iterating through all the results we should have all the data (similar to full refresh processing).
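For anyone reading later, here is a minimal sketch of the idea, not the connector's actual code. It assumes integer primary keys, a hypothetical in-memory `fake_search` standing in for the HubSpot search endpoint, and illustrative field names (`id`, `lastmodifieddate`). Advancing `last_id` to the last returned id plus one is also an assumption made so that each page makes progress.

```python
from datetime import datetime
from typing import Any, Dict, Iterator, List

Record = Dict[str, Any]


def to_millis(moment: datetime) -> int:
    """Convert a datetime to epoch milliseconds, as the search filters expect."""
    return int(moment.timestamp() * 1000)


def build_payload(state: datetime, init_sync: datetime, last_id: int, limit: int) -> Dict[str, Any]:
    """Bounded timestamp window plus a primary-key cursor, sorted by primary key."""
    return {
        "filters": [
            {"value": to_millis(state), "propertyName": "lastmodifieddate", "operator": "GTE"},
            {"value": to_millis(init_sync), "propertyName": "lastmodifieddate", "operator": "LTE"},
            {"value": last_id, "propertyName": "id", "operator": "GTE"},
        ],
        "sorts": [{"propertyName": "id", "direction": "ASCENDING"}],
        "limit": limit,
    }


def fake_search(records: List[Record], payload: Dict[str, Any]) -> List[Record]:
    """Hypothetical stand-in for the search endpoint: filter, sort, truncate."""

    def matches(record: Record) -> bool:
        for flt in payload["filters"]:
            value = record[flt["propertyName"]]
            if flt["operator"] == "GTE" and value < flt["value"]:
                return False
            if flt["operator"] == "LTE" and value > flt["value"]:
                return False
        return True

    sort_key = payload["sorts"][0]["propertyName"]
    hits = sorted((r for r in records if matches(r)), key=lambda r: r[sort_key])
    return hits[: payload["limit"]]


def read_all(records: List[Record], state: datetime, init_sync: datetime, limit: int = 2) -> Iterator[Record]:
    """Page through results by primary key until a page comes back empty."""
    last_id = 0
    while True:
        page = fake_search(records, build_payload(state, init_sync, last_id, limit))
        if not page:
            return
        yield from page
        # Assumption: ids are integers, so moving past the last id guarantees progress.
        last_id = page[-1]["id"] + 1
```

Because the cursor is the primary key rather than the timestamp, a page full of records sharing the same modification time can no longer pin the cursor in place, which is what caused the loop.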
Review guide
airbyte-integrations/connectors/source-hubspot/source_hubspot/streams.py
note: I added this logic because I don't like "open ended queries" where the result set might change depending on whether new records were modified after the start of the pull.
User Impact
Can this PR be safely reverted and rolled back?