Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ref(source-amazon-ads): migrate to low code #48116

Merged
merged 44 commits into from
Nov 20, 2024

Conversation

artem1205
Copy link
Collaborator

@artem1205 artem1205 commented Nov 1, 2024

What

Resolving https://github.com/airbytehq/airbyte-internal-issues/issues/10429

How

  • bump CDK to v6
  • migrate REST-based streams to low-code
  • set concurrency to 1 (disable); will enable after migration of async reports

Profiles stream python implementation is left because it is used in reporting logic (will be also removed after migration of async reports)

Review guide

  1. airbyte-integrations/connectors/source-amazon-ads/source_amazon_ads/manifest.yaml

User Impact

Can this PR be safely reverted and rolled back?

  • YES 💚
  • NO ❌

Signed-off-by: Artem Inzhyyants <[email protected]>
Signed-off-by: Artem Inzhyyants <[email protected]>
Copy link

vercel bot commented Nov 1, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Nov 19, 2024 3:58pm

Signed-off-by: Artem Inzhyyants <[email protected]>
Signed-off-by: Artem Inzhyyants <[email protected]>
[skip ci]

Signed-off-by: Artem Inzhyyants <[email protected]>
@octavia-squidington-iii octavia-squidington-iii added the area/documentation Improvements or additions to documentation label Nov 4, 2024
…205/source-amazon-ads-migrate-low-code

# Conflicts:
#	airbyte-integrations/connectors/source-amazon-ads/metadata.yaml
#	airbyte-integrations/connectors/source-amazon-ads/pyproject.toml
#	docs/integrations/sources/amazon-ads.md
Signed-off-by: Artem Inzhyyants <[email protected]>
@octavia-squidington-iii octavia-squidington-iii removed the area/documentation Improvements or additions to documentation label Nov 5, 2024
Signed-off-by: Artem Inzhyyants <[email protected]>
[skip ci]

Signed-off-by: Artem Inzhyyants <[email protected]>
Signed-off-by: Artem Inzhyyants <[email protected]>
Signed-off-by: Artem Inzhyyants <[email protected]>
Signed-off-by: Artem Inzhyyants <[email protected]>
Signed-off-by: Artem Inzhyyants <[email protected]>
Signed-off-by: Artem Inzhyyants <[email protected]>
Signed-off-by: Artem Inzhyyants <[email protected]>
Signed-off-by: Artem Inzhyyants <[email protected]>
…azon-ads-migrate-low-code

# Conflicts:
#	airbyte-integrations/connectors/source-amazon-ads/metadata.yaml
#	airbyte-integrations/connectors/source-amazon-ads/pyproject.toml
Signed-off-by: Artem Inzhyyants <[email protected]>
Signed-off-by: Artem Inzhyyants <[email protected]>
Signed-off-by: Artem Inzhyyants <[email protected]>
@octavia-squidington-iii octavia-squidington-iii added the area/documentation Improvements or additions to documentation label Nov 8, 2024
@artem1205 artem1205 marked this pull request as ready for review November 8, 2024 15:24
@artem1205 artem1205 requested a review from a team November 8, 2024 15:28
Signed-off-by: Artem Inzhyyants <[email protected]>
Copy link
Contributor

@maxi297 maxi297 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code seems good. Overall, my concern is mostly to make sure we have the right safety net on this one. Can we test a couple of syncs using regression testing?

type: SimpleRetriever
requester:
$ref: "#/definitions/base_requester"
path: sp/budgetRules # WHY sp here???
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this comment? This was the previous behavior the connector had, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comment line from py version, still not sure about correct naming for this stream.

paginator:
type: NoPagination

sponsored_display_campaigns:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should Sponsored Display streams have an error handler too? It seems like the previous behavior was the default one so it might be fine here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed

type: DefaultErrorHandler
response_filters:
- http_codes: [400]
action: IGNORE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seem to only ignore this one if the message is This profileID is not authorized to use Amazon Attribution. Should we add this condition here too?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Signed-off-by: Artem Inzhyyants <[email protected]>
Signed-off-by: Artem Inzhyyants <[email protected]>
Signed-off-by: Artem Inzhyyants <[email protected]>
Signed-off-by: Artem Inzhyyants <[email protected]>
@artem1205
Copy link
Collaborator Author

The code seems good. Overall, my concern is mostly to make sure we have the right safety net on this one. Can we test a couple of syncs using regression testing?

Run in https://github.com/airbytehq/airbyte/actions/runs/11799238139/job/32867273755

…azon-ads-migrate-low-code

# Conflicts:
#	airbyte-integrations/connectors/source-amazon-ads/metadata.yaml
#	airbyte-integrations/connectors/source-amazon-ads/pyproject.toml
#	airbyte-integrations/connectors/source-amazon-ads/source_amazon_ads/streams/attribution_report.py
#	airbyte-integrations/connectors/source-amazon-ads/source_amazon_ads/streams/common.py
#	airbyte-integrations/connectors/source-amazon-ads/source_amazon_ads/streams/portfolios.py
#	docs/integrations/sources/amazon-ads.md
Signed-off-by: Artem Inzhyyants <[email protected]>
…azon-ads-migrate-low-code

# Conflicts:
#	airbyte-integrations/connectors/source-amazon-ads/metadata.yaml
#	airbyte-integrations/connectors/source-amazon-ads/pyproject.toml
#	airbyte-integrations/connectors/source-amazon-ads/unit_tests/integrations/test_report_streams.py
#	docs/integrations/sources/amazon-ads.md
Signed-off-by: Artem Inzhyyants <[email protected]>
@artem1205
Copy link
Collaborator Author

UPD: tested with regression tool locally; fix 2156e2c,

Results (1467.41s (0:24:27)):
      33 passed
       2 failed
         - regression_tests/test_discover.py:18 test_catalog_are_the_same
         - validation_tests/test_read.py:31 test_read
       6 skipped

test_catalog_are_the_same: streams outputted in different order. (expected)
test_read: No RUNNING status for empty streams (expected)

Copy link
Contributor

@maxi297 maxi297 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_catalog_are_the_same: streams outputted in different order. (expected)

I know the diff of the tests can show problems with the order of the streams (I've already been confused by this) but the test seem to remove ordering issue. Can we please validate this a second time knowing this?

test_read: No RUNNING status for empty streams (expected)

Does the test validates the records before failing on the stream statuses? Are there other things that were not validated by regression tests because it failed on the status?

)
).decode()
)
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably:

  • init_uncaught_exception_handler
  • raise here

Else, the error message won't be so clear if the SourceAmazonAds.init fails

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added init_uncaught_exception_handler

# TODO: change concurrency after migration of async based reports
concurrency_level:
type: ConcurrencyLevel
default_concurrency: 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting concurrency to 1 might be dangerous memory wise in some cases. What this does is that it will allow for only one thread to run at the same time. However, streams are processed in two different threads: one that generates the partitions and another that emits record from those partitions. If we were to have a case where there are a lot of partitions (thing about IssueComments in source-jira for example), we would generate all those partitions without a reader thread to consume them which could lead to memory issues.

So regarding source-amazon-ads, I would have two questions:

  • Do we have cases where we have a lot of partitions? I know we use a list partition router on the config profiles which should be small but do we have other cases?
  • Why are we afraid of enabling concurrency?

Copy link
Collaborator Author

@artem1205 artem1205 Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have cases where we have a lot of partitions? I know we use a list partition router on the config profiles which should be small but do we have other cases?

No, only profile stream is used as a Parent, usually up to 5 profiles.

Why are we afraid of enabling concurrency?

Not at all, I'll set this to default value.

UPD: with ignoring these lines test read is green.

Copy link
Contributor

@maxi297 maxi297 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving this change given the green regression testing, thanks Artem!

@artem1205 artem1205 merged commit 696b979 into master Nov 20, 2024
34 checks passed
@artem1205 artem1205 deleted the artem1205/source-amazon-ads-migrate-low-code branch November 20, 2024 13:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/amazon-ads
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants