I wish fetch-config would not delete the .json config file #1218

Open
jedwards1211 opened this issue Jun 18, 2024 · 12 comments
Labels: bug, status/investigate

Comments

@jedwards1211

It's counterproductive that running fetch-config deletes the input .json config file.
When I'm debugging issues I want to just edit the file and rerun the fetch-config command. The fact that fetch-config deletes the file makes this more of a hassle.

To me, the configuration flow also seems overcomplicated: the input is converted to a different .toml format, and there is also a .yaml file there for some reason. It would be far more straightforward if we just specified a .json file, an SSM parameter, or whatever as the configuration source, and the CloudWatch agent left that as the source of truth, i.e. always read from that file or SSM parameter on startup instead of fetching it and storing it in some other format.

@okankoAMZ
Contributor

Hi

Using the fetch-config command should not delete the config file. Could you provide some logs and output demonstrating this issue?

Thank you!

okankoAMZ added the bug and question labels and removed the bug label on Jul 18, 2024
@jedwards1211
Author

jedwards1211 commented Jul 18, 2024

In the /opt/aws/amazon-cloudwatch-agent/etc directory:

[ec2-user@ip-172-31-44-197 etc]$ sudo cp amazon-cloudwatch-agent.json.bak amazon-cloudwatch-agent.json
[ec2-user@ip-172-31-44-197 etc]$ ls
amazon-cloudwatch-agent.d     amazon-cloudwatch-agent.json.bak  amazon-cloudwatch-agent.yaml  env-config.json
amazon-cloudwatch-agent.json  amazon-cloudwatch-agent.toml      common-config.toml            log-config.json
[ec2-user@ip-172-31-44-197 etc]$ sudo amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json -s
****** processing amazon-cloudwatch-agent ******
2024/07/18 18:31:17 I! imds retry client will retry 1 times
I! Trying to detect region from ec2 D! [EC2] Found active network interface Successfully fetched the config and saved in /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_amazon-cloudwatch-agent.json.tmp
Start configuration validation...
2024/07/18 18:31:17 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_amazon-cloudwatch-agent.json.tmp ...
2024/07/18 18:31:17 I! Valid Json input schema.
2024/07/18 18:31:17 I! imds retry client will retry 1 times
2024/07/18 18:31:17 D! ec2tagger processor required because append_dimensions is set
2024/07/18 18:31:17 D! pipeline hostDeltaMetrics has no receivers
2024/07/18 18:31:17 Configuration validation first phase succeeded
I! Detecting run_as_user...
I! Trying to detect region from ec2
D! [EC2] Found active network interface
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -schematest -config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml
Configuration validation second phase succeeded
Configuration validation succeeded
[ec2-user@ip-172-31-44-197 etc]$ ls
amazon-cloudwatch-agent.d         amazon-cloudwatch-agent.toml  common-config.toml  log-config.json
amazon-cloudwatch-agent.json.bak  amazon-cloudwatch-agent.yaml  env-config.json

You can see that amazon-cloudwatch-agent.json is gone in the output of the final ls.
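The before/after comparison above can also be automated. The following is only a minimal sketch, assuming bash with the standard ls, sort, and comm tools; removed_by is a hypothetical helper name, not part of the agent tooling.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the names that exist in DIR before running a command but are
# gone afterwards. removed_by is a hypothetical helper (an assumption).
removed_by() {
  local dir="$1"; shift
  local before after
  before="$(mktemp)"
  after="$(mktemp)"
  LC_ALL=C ls -1A "$dir" | LC_ALL=C sort > "$before"
  # Run the command under test; send its output to stderr and ignore its
  # exit code so that only the removed names reach stdout.
  "$@" >&2 || true
  LC_ALL=C ls -1A "$dir" | LC_ALL=C sort > "$after"
  # comm -23 keeps the lines that appear only in the "before" listing.
  LC_ALL=C comm -23 "$before" "$after"
  rm -f "$before" "$after"
}
```

For example, `removed_by /opt/aws/amazon-cloudwatch-agent/etc sudo amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json -s` would be expected to print amazon-cloudwatch-agent.json if the file is removed as shown here.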

amazon-cloudwatch-agent.log:

2024-07-18T18:31:17Z I! Profiler is stopped during shutdown
2024-07-18T18:31:17.681Z        info    otelcol/collector.go:227        Received signal from OS {"signal": "terminated"}
2024-07-18T18:31:17.682Z        info    service/service.go:157  Starting shutdown...
2024-07-18T18:31:17.692Z        info    extensions/extensions.go:44     Stopping extensions...
2024-07-18T18:31:17.693Z        info    service/service.go:171  Shutdown complete.
2024/07/18 18:31:19 I! Config has been translated into TOML /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml 
2024/07/18 18:31:19 D! config [agent]
  collection_jitter = "0s"
  debug = false
  flush_interval = "1s"
  flush_jitter = "0s"
  hostname = ""
  interval = "300s"
  logfile = "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log"
  logtarget = "lumberjack"
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  omit_hostname = false
  precision = ""
  quiet = false
  round_interval = false

[inputs]

  [[inputs.disk]]
    fieldpass = ["used_percent"]
    mount_points = ["/mnt/data-01"]
    tagexclude = ["mode"]

  [[inputs.logfile]]
    destination = "cloudwatchlogs"
    file_state_folder = "/opt/aws/amazon-cloudwatch-agent/logs/state"

    [[inputs.logfile.file_config]]
      file_path = "/var/log/cloud-init-output.log"
      from_beginning = true
      log_group_name = "clarity-2-db-syslog-r02"
      log_stream_name = "/var/log/cloud-init-output.log"
      pipe = false
      retention_in_days = 7

    [[inputs.logfile.file_config]]
      file_path = "/var/log/cfn-init.log"
      from_beginning = true
      log_group_name = "clarity-2-db-syslog-r02"
      log_stream_name = "/var/log/cfn-init.log"
      pipe = false
      retention_in_days = 7

    [[inputs.logfile.file_config]]
      file_path = "/var/log/cfn-init-cmd.log"
      from_beginning = true
      log_group_name = "clarity-2-db-syslog-r02"
      log_stream_name = "/var/log/cfn-init-cmd.log"
      pipe = false
      retention_in_days = 7

    [[inputs.logfile.file_config]]
      file_path = "/var/log/cfn-hup.log"
      from_beginning = true
      log_group_name = "clarity-2-db-syslog-r02"
      log_stream_name = "/var/log/cfn-hup.log"
      pipe = false
      retention_in_days = 7

    [[inputs.logfile.file_config]]
      file_path = "/var/log/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.log"
      from_beginning = true
      log_group_name = "clarity-2-db-syslog-r02"
      log_stream_name = "/var/log/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.log"
      pipe = false
      retention_in_days = 7

    [[inputs.logfile.file_config]]
      file_path = "/var/log/manage-db-reconf.log"
      from_beginning = true
      log_group_name = "clarity-2-db-syslog-r02"
      log_stream_name = "/var/log/manage-db-reconf.log"
      pipe = false
      retention_in_days = 7

  [[inputs.mem]]
    fieldpass = ["used_percent"]

[outputs]

  [[outputs.cloudwatch]]

  [[outputs.cloudwatchlogs]]
    force_flush_interval = "5s"
    log_stream_name = "i-07597f6c4d5733042"
    region = "us-west-2"
2024/07/18 18:31:19 I! Config has been translated into YAML /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.yaml 
2024/07/18 18:31:19 D! config connectors: {}
exporters:
    awscloudwatch:
        force_flush_interval: 1s
        max_datums_per_call: 1000
        max_values_per_datum: 150
        namespace: CWAgent
        region: us-west-2
        resource_to_telemetry_conversion:
            enabled: true
        rollup_dimensions:
            - - InstanceId
              - path
extensions: {}
processors:
    ec2tagger:
        ec2_instance_tag_keys: []
        ec2_metadata_tags:
            - InstanceId
        imds_retries: 1
        refresh_interval_seconds: 0s
receivers:
    telegraf_disk:
        collection_interval: 5m0s
        initial_delay: 1s
    telegraf_mem:
        collection_interval: 5m0s
        initial_delay: 1s
service:
    extensions: []
    pipelines:
        metrics/host:
            exporters:
                - awscloudwatch
            processors:
                - ec2tagger
            receivers:
                - telegraf_disk
                - telegraf_mem
    telemetry:
        logs:
            development: false
            disable_caller: false
            disable_stacktrace: false
            encoding: console
            error_output_paths: []
            initial_fields: {}
            level: info
            output_paths:
                - /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log
            sampling:
                initial: 2
                thereafter: 500
        metrics:
            address: ""
            level: None
            metric_readers: []
        resource: {}
        traces:
            propagators: []
2024/07/18 18:31:19 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json ...
2024/07/18 18:31:19 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_amazon-cloudwatch-agent.json ...
2024/07/18 18:31:19 I! Valid Json input schema.
2024/07/18 18:31:19 I! Detected runAsUser: root
2024/07/18 18:31:19 I! Changing ownership of [/opt/aws/amazon-cloudwatch-agent/logs /opt/aws/amazon-cloudwatch-agent/etc /opt/aws/amazon-cloudwatch-agent/var] to 0:0
2024-07-18T18:31:19Z I! Starting AmazonCloudWatchAgent CWAgent/1.300028.1 (go1.20.8; linux; amd64)
2024-07-18T18:31:19Z I! AWS SDK log level not set
2024-07-18T18:31:19Z I! creating new logs agent
2024-07-18T18:31:19Z I! [logagent] starting
2024-07-18T18:31:19Z I! [logagent] found plugin cloudwatchlogs is a log backend
2024-07-18T18:31:19Z I! [logagent] found plugin logfile is a log collection
2024-07-18T18:31:19Z I! [logagent] start logs plugin file paths [/var/log/cloud-init-output.log /var/log/cfn-init.log /var/log/cfn-init-cmd.log /var/log/cfn-hup.log /var/log/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.log /var/log/manage-db-reconf.log]
2024-07-18T18:31:19Z I! [inputs.logfile] turned on logs plugin
2024-07-18T18:31:19.552Z        info    service/telemetry.go:96 Skipping telemetry setup.       {"address": "", "level": "None"}
2024-07-18T18:31:19Z I! imds retry client will retry 1 times
2024-07-18T18:31:19.559Z        info    service/service.go:131  Starting CWAgent...     {"Version": "1.300028.1", "NumCPU": 2}
2024-07-18T18:31:19.559Z        info    extensions/extensions.go:30     Starting extensions...
2024-07-18T18:31:19Z I! cloudwatch: get unique roll up list [[InstanceId path]]
2024-07-18T18:31:19.572Z        info    ec2tagger/ec2tagger.go:435      ec2tagger: Check EC2 Metadata.  {"kind": "processor", "name": "ec2tagger", "pipeline": "metrics/host"}
2024-07-18T18:31:19Z I! cloudwatch: publish with ForceFlushInterval: 1s, Publish Jitter: 35.296087ms
2024-07-18T18:31:19.575Z        info    ec2tagger/ec2tagger.go:411      ec2tagger: EC2 tagger has started, finished initial retrieval of tags and Volumes      {"kind": "processor", "name": "ec2tagger", "pipeline": "metrics/host"}
2024-07-18T18:31:19.575Z        info    service/service.go:148  Everything is ready. Begin running and processing data.
2024-07-18T18:31:20Z I! [inputs.logfile] Reading from offset 51573 in /var/log/cloud-init-output.log
2024-07-18T18:31:20Z I! [inputs.logfile] Reading from offset 365 in /var/log/cfn-init.log
2024-07-18T18:31:20Z I! [inputs.logfile] Reading from offset 30110 in /var/log/cfn-hup.log
2024-07-18T18:31:20Z I! [inputs.logfile] Reading from offset 25243 in /var/log/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.log
2024-07-18T18:31:20Z I! First time setting retention for log group clarity-2-db-syslog-r02, update map to avoid setting twice
2024-07-18T18:31:20Z I! [logagent] piping log from clarity-2-db-syslog-r02//var/log/cloud-init-output.log(/var/log/cloud-init-output.log) to cloudwatchlogs with retention 7
2024-07-18T18:31:20Z I! [logagent] piping log from clarity-2-db-syslog-r02//var/log/cfn-init.log(/var/log/cfn-init.log) to cloudwatchlogs with retention -1
2024-07-18T18:31:20Z I! [logagent] piping log from clarity-2-db-syslog-r02//var/log/cfn-init-cmd.log(/var/log/cfn-init-cmd.log) to cloudwatchlogs with retention -1
2024-07-18T18:31:20Z I! [logagent] piping log from clarity-2-db-syslog-r02//var/log/cfn-hup.log(/var/log/cfn-hup.log) to cloudwatchlogs with retention -1
2024-07-18T18:31:20Z I! [logagent] piping log from clarity-2-db-syslog-r02//var/log/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.log(/var/log/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.log) to cloudwatchlogs with retention -1
2024-07-18T18:31:20Z I! [logagent] piping log from clarity-2-db-syslog-r02//var/log/manage-db-reconf.log(/var/log/manage-db-reconf.log) to cloudwatchlogs with retention -1

okankoAMZ added the bug label on Jul 19, 2024
@okankoAMZ
Contributor

Hi! Thank you for providing the logs. By design, fetch-config shouldn't delete the JSON file. I will try to reproduce this issue and get back to you as soon as possible.

okankoAMZ added the status/investigate label and removed the question label on Jul 19, 2024
@platymatt

Are there any updates on this issue? We are experiencing the same thing. Or is it expected that the config.json file gets translated into the .toml file and the .json file is then removed because it is no longer needed?

@jedwards1211
Author

@okankoAMZ I just noticed this in the journal after re-fetching the config... the main PID is logging:

/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json does not exist or cannot read. Skipping it.

I want to emphasize again: the number of different files and formats CWAgent seems to shuffle the config through doesn't inspire confidence. It seems like asking for bugs.

● amazon-cloudwatch-agent.service - Amazon CloudWatch Agent
     Loaded: loaded (/etc/systemd/system/amazon-cloudwatch-agent.service; enabled; preset: disabled)
     Active: active (running) since Tue 2024-09-17 00:48:53 UTC; 5s ago
   Main PID: 435744 (amazon-cloudwat)
      Tasks: 8 (limit: 2257)
     Memory: 105.1M
        CPU: 888ms
     CGroup: /system.slice/amazon-cloudwatch-agent.service
             └─435744 /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml -envconfig /opt/aws/amazon-cloudwatch-agent/etc/env-config.json -otelconfig /opt/aws/amazon-cloud>

Sep 17 00:48:54 ip-172-31-32-255.us-west-2.compute.internal start-amazon-cloudwatch-agent[435749]: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json does not exist or cannot read. Skipping it.
Sep 17 00:48:54 ip-172-31-32-255.us-west-2.compute.internal start-amazon-cloudwatch-agent[435749]: 2024/09/17 00:48:54 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_amazon-cloudwatch-agent.json ...
Sep 17 00:48:54 ip-172-31-32-255.us-west-2.compute.internal start-amazon-cloudwatch-agent[435749]: 2024/09/17 00:48:54 I! Valid Json input schema.
Sep 17 00:48:54 ip-172-31-32-255.us-west-2.compute.internal start-amazon-cloudwatch-agent[435749]: I! Detecting run_as_user...
Sep 17 00:48:54 ip-172-31-32-255.us-west-2.compute.internal start-amazon-cloudwatch-agent[435749]: I! Trying to detect region from ec2
Sep 17 00:48:54 ip-172-31-32-255.us-west-2.compute.internal start-amazon-cloudwatch-agent[435749]: 2024/09/17 00:48:54 D! ec2tagger processor required because append_dimensions is set
Sep 17 00:48:54 ip-172-31-32-255.us-west-2.compute.internal start-amazon-cloudwatch-agent[435749]: 2024/09/17 00:48:54 D! pipeline hostDeltaMetrics has no receivers
Sep 17 00:48:54 ip-172-31-32-255.us-west-2.compute.internal start-amazon-cloudwatch-agent[435749]: 2024/09/17 00:48:54 Configuration validation first phase succeeded
Sep 17 00:48:54 ip-172-31-32-255.us-west-2.compute.internal start-amazon-cloudwatch-agent[435744]: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json does not exist or cannot read. Skipping it.
Sep 17 00:48:54 ip-172-31-32-255.us-west-2.compute.internal start-amazon-cloudwatch-agent[435744]: I! Detecting run_as_user...
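To check whether other hosts hit the same message, the journal can be filtered for it. This is a rough sketch, assuming bash and a sed that supports -E; skipped_configs is a hypothetical helper name, and the message text is copied from the journal output above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Read log lines on stdin and print each distinct config path that the
# agent reported as missing. skipped_configs is a hypothetical helper.
skipped_configs() {
  # Capture the absolute path that directly precedes the message text,
  # whether or not the line carries a journald prefix.
  sed -nE 's#^(.*[[:space:]])?(/[^[:space:]]+) does not exist or cannot read\. Skipping it\.[[:space:]]*$#\2#p' \
    | sort -u
}
```

Usage, assuming the systemd unit name shown above: `journalctl -u amazon-cloudwatch-agent.service --no-pager | skipped_configs`.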

@solomongit3

Hi, is there any update on the file getting deleted?

@Riskcomplexx

This occurs on EC2 (Linux 2023) as well by default.

@umutsesen

This still happens on the 24.04 build; it cost me an hour.

@massimocode

This is happening to us too. As a first-time user of the CloudWatch agent, I found this incredibly confusing.

Here's our status so you can see the version number. Running on Ubuntu v22 LTS on EC2.

{
  "status": "running",
  "starttime": "2024-12-19T07:20:14+00:00",
  "configstatus": "configured",
  "version": "1.300049.1b929"
}

@massimocode

I ran the following:

ubuntu@REDACTED:/tmp$ sudo chattr +i /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json
ubuntu@REDACTED:/tmp$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -s -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json
****** processing amazon-cloudwatch-agent ******
I! Trying to detect region from ec2 D! [EC2] Found active network interface I! imds retry client will retry 1 timesSuccessfully fetched the config and saved in /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_amazon-cloudwatch-agent.json.tmp
Start configuration validation...
2024/12/19 07:34:40 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_amazon-cloudwatch-agent.json.tmp ...
2024/12/19 07:34:40 I! Valid Json input schema.
2024/12/19 07:34:40 D! ec2tagger processor required because append_dimensions is set
2024/12/19 07:34:40 Configuration validation first phase succeeded
I! Detecting run_as_user...
I! Trying to detect region from ec2
D! [EC2] Found active network interface
I! imds retry client will retry 1 times
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -schematest -config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml
Configuration validation second phase succeeded
Configuration validation succeeded
rm: cannot remove '/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json': Operation not permitted

Check out the last line: that fetch-config command is definitely trying to delete the config!

@jedwards1211
Author

The problem seems to be putting my config in exactly this file before running the import command:

readonly JSON="${CONFDIR}/amazon-cloudwatch-agent.json"

amazon-cloudwatch-agent-ctl deletes this file, so maybe it uses it for some other purpose in some cases?

In any case, using some other file path should work around this.
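That workaround could be sketched as follows. This is a rough sketch rather than a confirmed fix: it assumes that only etc/amazon-cloudwatch-agent.json gets removed, and my-config.json, SOURCE_CONFIG, RESERVED_CONFIG, and fetch_config_cmd are made-up names. The fetch-config flags are the ones used earlier in this thread.

```shell
#!/usr/bin/env bash
# Sketch: keep the hand-edited config at a path other than the one that
# fetch-config was observed to remove, and import it from there.
set -euo pipefail

ETC_DIR="/opt/aws/amazon-cloudwatch-agent/etc"
# Path that disappeared after fetch-config earlier in this thread.
RESERVED_CONFIG="${ETC_DIR}/amazon-cloudwatch-agent.json"
# Hypothetical location for the source-of-truth file (an assumption).
SOURCE_CONFIG="${SOURCE_CONFIG:-${ETC_DIR}/my-config.json}"

# Print the fetch-config command for a config path, refusing the reserved one.
fetch_config_cmd() {
  if [ "$1" = "$RESERVED_CONFIG" ]; then
    echo "refusing: $1 is removed by fetch-config" >&2
    return 1
  fi
  echo "amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file:$1 -s"
}

# Only print the command here; run it by hand (with sudo) after reviewing it.
fetch_config_cmd "$SOURCE_CONFIG"
```

Printing the command instead of running it keeps the sketch safe to try on a machine without the agent installed.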

Once again, this process of importing the config in one file format and outputting it in another seems like a mess. It would be much cleaner if the main CloudWatch agent process instead supported loading its config on startup directly from a TOML, YAML, or JSON file, so that we didn't need to run this import step at all.
