How can I use http_client to download data only once? #2367

Closed
Someone894 opened this issue Feb 8, 2024 · 4 comments

Someone894 commented Feb 8, 2024

Hello,

Currently I am trying to use Benthos to regularly collect data from an open REST API and store it locally. To do so I am using the http_client component of Benthos, which works, but instead of downloading the data only once, it keeps downloading it indefinitely. Here is an MWE:

input:
  label: "rest_api_call"
  http_client:
    url: "https://api.db.nomics.world/v22/series/IMF/WEO:2023-10?dimensions=%7B%22weo-subject%22%3A%5B%22NGDP_RPCH%22%5D%7D&observations=true&metadata=true&format=json&limit=100&offset=0"
    verb: GET
    timeout: "10s"
    headers:
      Content-Type: application/json
    stream:
      enabled: false
      reconnect: true
      codec: lines

pipeline:
  processors: []

output:
  label: "file_write"
  file:
    path: /data/result.json
    codec: lines

How can I set up Benthos to download the data only once (i.e. so that result.json contains a single line)?

Someone894 changed the title from "How can i use http_client to download data only once?" to "How can I use http_client to download data only once?" on Feb 8, 2024

Jeffail (Collaborator) commented Feb 8, 2024

Hey @Someone894, either wrap it within a read_until input: https://www.benthos.dev/docs/components/inputs/read_until, or use a generate input with an http processor under it: https://www.benthos.dev/docs/components/inputs/generate. I'd probably go with the generate input myself.

Closing as per: #2026

Jeffail closed this as completed Feb 8, 2024

Someone894 (Author) commented Feb 8, 2024

I see now that I worded my issue badly. To me this behavior looks like a bug (or at least it is rather unexpected), and I was unsure whether it is a bug or a feature, which is why I opened an issue and not a discussion.
Can you confirm that this is the intended behavior?
If so, I can switch over to discussions and rephrase my issue there as a proper question. read_until forces me to check for some internal properties of the data, which does not feel very stable since that data could change over time, and generate is supposed to generate test data, so how could that possibly help me? So I am still looking for a solution.

mihaitodor (Collaborator) commented

It's intended and documented behaviour:

"Connects to a server and continuously performs requests for a single message."

Like Ash mentioned, you have two alternatives to achieve what you need:

  • Wrap the http_client input in a read_until input whose check uses the counter() function to stop the input after one message
  • Use the generate input with count: 1 and mapping: root = "", and attach an http processor to it in the pipeline
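
The second option could look roughly like the config below. This is a minimal sketch, not a tested setup: it reuses the URL and the file output from the original MWE, and it assumes the http processor accepts the same url, verb and timeout fields as the http_client input. With count: 1 the generate input emits a single empty message and then closes, so the http processor performs exactly one request and its response body replaces the message contents.

```yaml
input:
  label: "single_trigger"   # hypothetical label, chosen for this sketch
  generate:
    count: 1                # emit exactly one message, then close the input
    mapping: 'root = ""'    # the message is only a trigger, so it stays empty

pipeline:
  processors:
    # The request runs once per incoming message, i.e. once in total here.
    - http:
        url: "https://api.db.nomics.world/v22/series/IMF/WEO:2023-10?dimensions=%7B%22weo-subject%22%3A%5B%22NGDP_RPCH%22%5D%7D&observations=true&metadata=true&format=json&limit=100&offset=0"
        verb: GET
        timeout: "10s"

output:
  label: "file_write"
  file:
    path: /data/result.json
    codec: lines
```

Compared with read_until, this variant needs no check expression, because the number of requests is bounded by the number of generated messages.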

Someone894 (Author) commented Feb 12, 2024

With @mihaitodor's help I managed to figure it out. In case someone else stumbles across this issue, here is the working MWE:

input:
  label: "rest_api_call"
  read_until:
    input:
      http_client:
        url: "https://api.db.nomics.world/v22/series/IMF/WEO:2023-10?dimensions=%7B%22weo-subject%22%3A%5B%22NGDP_RPCH%22%5D%7D&observations=true&metadata=true&format=json&limit=1000&offset=0"
        verb: GET
        timeout: "10s"
        headers:
          Content-Type: application/json
        stream:
          enabled: false
          reconnect: true
          codec: lines
    check: "counter() == 1"
    idle_timeout: ""
    restart_input: false

pipeline:
  processors: []

output:
  label: "file_write"
  file:
    path: /data/result.json
    codec: lines
