Cavalcade Log delivery to cloudwatch fails if events span more than 24 hours #603

Open
4 tasks
roborourke opened this issue Apr 4, 2022 · 7 comments
Labels
must have Must be done, high priority

Comments

@roborourke
Contributor

roborourke commented Apr 4, 2022

During the Cavalcade healthcheck background task the following error can occur:

The batch of log events in a single PutLogEvents request cannot span more than 24 hours

The logger should chunk log delivery into batches that each span no more than 24 hours, based on the log event timestamps.

(as per Rob's comment)
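
The chunking described above could look roughly like the following. This is a minimal sketch in Python rather than the logger's own code: the function name `chunk_by_time_span` and the event shape (a dict with a millisecond `timestamp` and a `message`) are assumptions made for illustration. It starts a new batch as soon as an event is 24 hours or more past the first event in the current batch, which is a deliberately conservative reading of the limit.

```python
from typing import Dict, Iterable, List

# 24 hours expressed in milliseconds, matching CloudWatch's millisecond timestamps.
MAX_SPAN_MS = 24 * 60 * 60 * 1000


def chunk_by_time_span(events: Iterable[Dict], max_span_ms: int = MAX_SPAN_MS) -> List[List[Dict]]:
    """Split events into chronological batches that each span less than max_span_ms.

    Hypothetical helper: each event is assumed to be a dict with an integer
    "timestamp" key (milliseconds since the Unix epoch).
    """
    batches: List[List[Dict]] = []
    current: List[Dict] = []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        # Start a new batch once this event is max_span_ms or more past the batch's first event.
        if current and event["timestamp"] - current[0]["timestamp"] >= max_span_ms:
            batches.append(current)
            current = []
        current.append(event)
    if current:
        batches.append(current)
    return batches
```

Each returned batch would then be sent in its own PutLogEvents request.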

Acceptance criteria

  • Reproduce the issue
  • Bump maxbanton/cwh to v2.0.4:
    "maxbanton/cwh": "^2.0"
  • Check whether the issue still occurs
  • Create a new ticket for follow-up improvements to the system now that the rate limits are gone
@roborourke roborourke added the bug Existing functionality isn't behaving as expected label Apr 4, 2022
@roborourke
Contributor Author

Also seeing some rate limit exceeded warnings:

Fluent Bit Exception[Aws\CloudWatchLogs\Exception\CloudWatchLogsException]: Error executing "DescribeLogStreams" on "https://logs.eu-west-1.amazonaws.com"; AWS HTTP error: Client error: `POST https://logs.eu-west-1.amazonaws.com` resulted in a `400 Bad Request` response:
{"__type":"ThrottlingException","message":"Rate exceeded"}
 ThrottlingException (client): Rate exceeded - {"__type":"ThrottlingException","message":"Rate exceeded"} in /usr/src/app/vendor/aws/aws-sdk-php/src/WrappedHttpHandler.php:195

@roborourke
Contributor Author

This issue can now be resolved by updating the maxbanton/cwh package; see maxbanton/cwh#96.

cc @humanmade/altis-vulcan

@kovshenin kovshenin added the to refine Issues needing refine label Nov 8, 2022
@mikelittle
Contributor

Is there a way to reliably test this and confirm the fix?

@ferschubert-hm
Contributor

https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_PutLogEvents.html

  • The maximum batch size is 1,048,576 bytes. This size is calculated as the sum of all event messages in UTF-8, plus 26 bytes for each log event.
  • None of the log events in the batch can be more than 2 hours in the future.
  • None of the log events in the batch can be more than 14 days in the past. Also, none of the log events can be from earlier than the retention period of the log group.
  • The log events in the batch must be in chronological order by their timestamp. The timestamp is the time that the event occurred, expressed as the number of milliseconds after Jan 1, 1970 00:00:00 UTC. (In AWS Tools for PowerShell and the AWS SDK for .NET, the timestamp is specified in .NET format: yyyy-mm-ddThh:mm:ss. For example, 2017-09-15T13:45:30.)
    - A batch of log events in a single request cannot span more than 24 hours. Otherwise, the operation fails.
  • The maximum number of log events in a batch is 10,000.
    - There is a quota of five requests per second per log stream. Additional requests are throttled. This quota can't be changed.
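
One way to test against the limits quoted above is a small pre-flight check that reports which constraints a batch would violate. The sketch below is in Python and is only illustrative: `find_batch_violations`, the event dict shape, and the violation strings are all hypothetical names, and the retention-period and requests-per-second rules are not covered.

```python
from typing import Dict, List

# Limits taken from the PutLogEvents documentation quoted above.
MAX_BATCH_BYTES = 1_048_576
PER_EVENT_OVERHEAD_BYTES = 26
MAX_EVENTS_PER_BATCH = 10_000
HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS


def find_batch_violations(events: List[Dict], now_ms: int) -> List[str]:
    """Return a list of documented PutLogEvents limits that this batch breaks.

    Hypothetical helper: each event is assumed to be a dict with a "message"
    string and an integer "timestamp" in milliseconds since the Unix epoch.
    """
    violations: List[str] = []
    if len(events) > MAX_EVENTS_PER_BATCH:
        violations.append("too many events")
    # Batch size is the UTF-8 length of every message plus 26 bytes per event.
    size = sum(len(e["message"].encode("utf-8")) + PER_EVENT_OVERHEAD_BYTES for e in events)
    if size > MAX_BATCH_BYTES:
        violations.append("batch too large")
    timestamps = [e["timestamp"] for e in events]
    if timestamps != sorted(timestamps):
        violations.append("not chronological")
    if timestamps:
        if max(timestamps) - min(timestamps) > DAY_MS:
            violations.append("span exceeds 24 hours")
        if max(timestamps) > now_ms + 2 * HOUR_MS:
            violations.append("event too far in the future")
        if min(timestamps) < now_ms - 14 * DAY_MS:
            violations.append("event too old")
    return violations
```

An empty list means the batch passes these checks; it does not guarantee that the service will accept the request.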

@veselala veselala removed the to refine Issues needing refine label Jan 4, 2023
@rmccue
Member

rmccue commented Jan 9, 2023

AWS has just (Jan 4) removed their rate limiting for PutLogEvents: https://aws.amazon.com/about-aws/whats-new/2023/01/amazon-cloudwatch-logs-log-stream-transaction-quota-sequencetoken-requirement/

With this change we have removed the need for splitting your log ingestion across multiple log streams to prevent log stream throttling.

This should fix the rate exceeded issue I think, and possibly this means we now want to push events much more often?

@ferschubert-hm
Contributor

So the AC here would be to bump to v2.0.4:

"maxbanton/cwh": "^2.0"

@veselala veselala added must have Must be done, high priority and removed bug Existing functionality isn't behaving as expected labels Jan 11, 2023
@veselala
6 participants