[windows][CI/CD] ADOT collector delayed start #1767

Open
Kausik-A opened this issue Jan 19, 2023 · 5 comments

@Kausik-A
Contributor

Describe the bug
Set the ADOT collector agent to run as an Automatic (Delayed Start) service to mitigate a known Go issue on Windows, reported with Go 1.9.2: golang/go#23479

With the Automatic start type, the services would not reliably restart across reboots: they would time out before coming up, and the Service Control Manager would give up spawning them. This workaround will be reverted once the issue is believed to be addressed in a future Go release.

@Kausik-A
Contributor Author

Kausik-A commented Feb 2, 2023

Updates:

The ADOT collector agent has sometimes been crashing on Windows system reboot. Other projects have seen the same behaviour, and it appears to be a bug in Go. The issue is currently tracked in golang/go#23479.

Testing

This issue has been tough to replicate because it is a probabilistic failure that affects only some Windows instances in a large fleet of machines. The linked issue also mentions that the failure cannot be reproduced every time. So we need a testing strategy that sets up a mock environment with these conditions in order to hit these failures.
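Because the failure is probabilistic, it may help to estimate how many reboots a test run needs before it is likely to hit the failure at least once. The sketch below assumes each reboot fails independently with the same probability; that assumption, the function names, and any rate plugged in are hypothetical and not measured from the fleet.

```python
import math


def p_at_least_one_failure(failure_rate: float, reboots: int) -> float:
    """Chance of seeing one or more failed starts in `reboots` independent reboots."""
    return 1.0 - (1.0 - failure_rate) ** reboots


def reboots_needed(failure_rate: float, confidence: float) -> int:
    """Smallest number of reboots n with 1 - (1 - failure_rate)**n >= confidence.

    `failure_rate` is an assumed per-reboot failure probability, not a measured value.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))
```

For example, under an assumed 1% failure rate per reboot, `reboots_needed(0.01, 0.95)` gives 299, so a mock environment would need roughly 300 reboots (spread over one or many instances) to have a 95% chance of observing the failure.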

Proposed Solution

Of the workarounds proposed in the issue thread, one approach, delaying the start of the application, was found to decrease the probability of the agent failing to start (still not 0%, but better than before). This is implemented in #1788. Since this change only delays the start of the ADOT collector, it has a comparatively low blast radius and effect on our customers.
We should also investigate other possible solutions further.
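For anyone who wants to try the delayed-start workaround by hand on an existing install, here is a minimal sketch using the standard `sc.exe config ... start= delayed-auto` command. It is not necessarily how #1788 implements the change. The service name `ADOTCollector` and the helper names are assumptions for illustration, and the command needs an elevated prompt.

```python
import subprocess
import sys
from typing import List

# Assumed service name; check the actual name with `sc.exe query` before use.
SERVICE_NAME = "ADOTCollector"


def delayed_start_command(service_name: str) -> List[str]:
    """Build the sc.exe invocation that switches a service to Automatic (Delayed Start).

    sc.exe expects the option name ("start=") and its value ("delayed-auto")
    as separate tokens.
    """
    return ["sc.exe", "config", service_name, "start=", "delayed-auto"]


def apply_delayed_start(service_name: str) -> None:
    """Run the command; only meaningful on Windows with administrator rights."""
    if sys.platform != "win32":
        raise RuntimeError("sc.exe is only available on Windows")
    subprocess.run(delayed_start_command(service_name), check=True)
```

Calling `apply_delayed_start(SERVICE_NAME)` on a Windows host changes only the start type; the service binary and its configuration are left untouched, which matches the low blast radius described above.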

@github-actions
Contributor

github-actions bot commented Apr 9, 2023

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

@github-actions github-actions bot added the stale label Apr 9, 2023
@github-actions
Contributor

This issue was closed because it has been marked as stale for 30 days with no activity.

@github-actions github-actions bot closed this as not planned May 14, 2023
@bryan-aguilar
Contributor

Refer to PR #1788 for additional discussion.

@bryan-aguilar bryan-aguilar reopened this Jun 21, 2023
@bryan-aguilar bryan-aguilar added this to the ADOT Collector Backlog milestone Jun 21, 2023
@javasorn

I am using WinSW to manage the ADOTCollector service. It is working on windows-aws-ecs-fargate.

#1788 (comment)
