Add cache client to improve SNMP scraping #1121

Closed
wants to merge 1 commit into from

servak
Contributor

@servak servak commented Feb 21, 2024

This PR adds a cache client so that duplicate SNMP requests are answered from the cache instead of being sent to the device again.
This does not work well when multiple modules are specified and they define duplicate metrics.
Results may also not be cached if more than one worker is running.
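
To illustrate the idea (this is not the code from this PR), here is a minimal sketch of a caching wrapper around a walker. `PDU`, `Walker`, `cachingWalker`, and `countingWalker` are hypothetical names for this example; `PDU` stands in for `gosnmp.SnmpPDU` so the sketch has no external dependencies.

```go
package main

import (
	"fmt"
	"sync"
)

// PDU is a stand-in for gosnmp.SnmpPDU (assumption, keeps the sketch dependency-free).
type PDU struct {
	Name  string
	Value string
}

// Walker is a hypothetical, reduced scraper interface with a single walk method.
type Walker interface {
	WalkAll(oid string) ([]PDU, error)
}

// cachingWalker remembers the result of each walked OID and reuses it.
type cachingWalker struct {
	mu    sync.Mutex
	next  Walker
	store map[string][]PDU
}

func newCachingWalker(next Walker) *cachingWalker {
	return &cachingWalker{next: next, store: make(map[string][]PDU)}
}

// WalkAll returns the cached PDUs for oid, or walks the device once and stores the result.
// The lock is held during the walk, so concurrent walks are serialized in this sketch.
func (c *cachingWalker) WalkAll(oid string) ([]PDU, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if pdus, ok := c.store[oid]; ok {
		return pdus, nil
	}
	pdus, err := c.next.WalkAll(oid)
	if err != nil {
		return nil, err // errors are not cached
	}
	c.store[oid] = pdus
	return pdus, nil
}

// countingWalker is a fake device that counts how many walks reach it.
type countingWalker struct{ calls int }

func (f *countingWalker) WalkAll(oid string) ([]PDU, error) {
	f.calls++
	return []PDU{{Name: oid + ".1", Value: "eth0"}}, nil
}

func main() {
	dev := &countingWalker{}
	w := newCachingWalker(dev)
	// Two modules both look up ifName; only the first walk reaches the device.
	w.WalkAll("1.3.6.1.2.1.31.1.1.1.1")
	w.WalkAll("1.3.6.1.2.1.31.1.1.1.1")
	fmt.Println("walks sent to device:", dev.calls)
}
```

In this sketch the cache lives only as long as the wrapper, so sharing it between modules (or workers) means sharing the same wrapper instance.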

generator.yml

modules:
  if_mib:
    walk: [ifOperStatus]
    lookups:
      - source_indexes: [ifIndex]
        lookup: ifAlias
      - source_indexes: [ifIndex]
        lookup: 1.3.6.1.2.1.2.2.1.2 # ifDescr
      - source_indexes: [ifIndex]
        lookup: 1.3.6.1.2.1.31.1.1.1.1 # ifName
  if_mib2:
    walk: [ifAdminStatus]
    lookups:
      - source_indexes: [ifIndex]
        lookup: ifAlias
      - source_indexes: [ifIndex]
        lookup: 1.3.6.1.2.1.2.2.1.2 # ifDescr
      - source_indexes: [ifIndex]
        lookup: 1.3.6.1.2.1.31.1.1.1.1 # ifName

testing

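The output shows that snmp_scrape_packets_sent drops for the second module: if_mib sends 4 packets, while if_mib2 sends only 1, because its lookups are already cached.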

> curl 'localhost:9116/snmp?target=192.168.64.3&module=if_mib,if_mib2'
# HELP ifAdminStatus The desired state of the interface - 1.3.6.1.2.1.2.2.1.7
# TYPE ifAdminStatus gauge
ifAdminStatus{ifAlias="eth0",ifDescr="Intel Corporation 82540EM Gigabit Ethernet Controller",ifIndex="2",ifName="eth0"} 1
ifAdminStatus{ifAlias="lo",ifDescr="lo",ifIndex="1",ifName="lo"} 1
ifAdminStatus{ifAlias="swp1",ifDescr="Intel Corporation 82540EM Gigabit Ethernet Controller",ifIndex="3",ifName="swp1"} 2
# HELP ifOperStatus The current operational state of the interface - 1.3.6.1.2.1.2.2.1.8
# TYPE ifOperStatus gauge
ifOperStatus{ifAlias="eth0",ifDescr="Intel Corporation 82540EM Gigabit Ethernet Controller",ifIndex="2",ifName="eth0"} 1
ifOperStatus{ifAlias="lo",ifDescr="lo",ifIndex="1",ifName="lo"} 1
ifOperStatus{ifAlias="swp1",ifDescr="Intel Corporation 82540EM Gigabit Ethernet Controller",ifIndex="3",ifName="swp1"} 2
# HELP snmp_scrape_duration_seconds Total SNMP time scrape took (walk and processing).
# TYPE snmp_scrape_duration_seconds gauge
snmp_scrape_duration_seconds{module="if_mib"} 0.030786458
snmp_scrape_duration_seconds{module="if_mib2"} 0.002798083
# HELP snmp_scrape_packets_retried Packets retried for get, bulkget, and walk.
# TYPE snmp_scrape_packets_retried gauge
snmp_scrape_packets_retried{module="if_mib"} 0
snmp_scrape_packets_retried{module="if_mib2"} 0
# HELP snmp_scrape_packets_sent Packets sent for get, bulkget, and walk; including retries.
# TYPE snmp_scrape_packets_sent gauge
snmp_scrape_packets_sent{module="if_mib"} 4
snmp_scrape_packets_sent{module="if_mib2"} 1
# HELP snmp_scrape_pdus_returned PDUs returned from get, bulkget, and walk.
# TYPE snmp_scrape_pdus_returned gauge
snmp_scrape_pdus_returned{module="if_mib"} 12
snmp_scrape_pdus_returned{module="if_mib2"} 12
# HELP snmp_scrape_walk_duration_seconds Time SNMP walk/bulkwalk took.
# TYPE snmp_scrape_walk_duration_seconds gauge
snmp_scrape_walk_duration_seconds{module="if_mib"} 0.030733375
snmp_scrape_walk_duration_seconds{module="if_mib2"} 0.002757625

@SuperQ
Member

SuperQ commented Feb 21, 2024

While I like the idea of simplicity, I think we need to have a more robust cache design before we try and implement this.

@servak
Contributor Author

servak commented Feb 22, 2024

OK. Certainly, the following problem could be solved by providing some kind of cache service:

It may also not be cached if more than one Worker is running.

What about the following Cache components?

type CacheService interface {
	Get(string) []gosnmp.SnmpPDU
	Set(string, []gosnmp.SnmpPDU)
}

type cacheClient struct {
	scraper SNMPScraper
	cache   CacheService
}

An in-memory implementation is the obvious default, but if the cache is meant to cover a wider scope, memcached or Redis might be a good idea.
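
As a rough sketch of the in-memory option, assuming the `CacheService` interface above: `PDU` is a local stand-in for `gosnmp.SnmpPDU`, and `memoryCache` and `cacheKey` are hypothetical names. Keying entries by target and OID is an assumption, made so that different devices do not share entries.

```go
package main

import (
	"fmt"
	"sync"
)

// PDU is a stand-in for gosnmp.SnmpPDU (assumption, keeps the sketch dependency-free).
type PDU struct {
	Name  string
	Value string
}

// CacheService mirrors the interface proposed above, with PDU swapped in.
type CacheService interface {
	Get(key string) []PDU
	Set(key string, pdus []PDU)
}

// memoryCache is a hypothetical in-process implementation guarded by a RWMutex,
// so several workers in the same process can share it.
type memoryCache struct {
	mu      sync.RWMutex
	entries map[string][]PDU
}

// Compile-time check that memoryCache satisfies CacheService.
var _ CacheService = (*memoryCache)(nil)

func newMemoryCache() *memoryCache {
	return &memoryCache{entries: make(map[string][]PDU)}
}

// Get returns the stored PDUs, or nil on a cache miss.
func (m *memoryCache) Get(key string) []PDU {
	m.mu.RLock()
	defer m.mu.RUnlock()
	return m.entries[key]
}

// Set stores pdus under key, replacing any previous entry.
func (m *memoryCache) Set(key string, pdus []PDU) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.entries[key] = pdus
}

// cacheKey combines target and OID so entries from different devices never collide.
func cacheKey(target, oid string) string {
	return target + "|" + oid
}

func main() {
	var c CacheService = newMemoryCache()
	key := cacheKey("192.168.64.3", "1.3.6.1.2.1.31.1.1.1.1")
	fmt.Println("entries before Set:", len(c.Get(key)))
	c.Set(key, []PDU{{Name: "ifName.1", Value: "lo"}})
	fmt.Println("entries after Set:", len(c.Get(key)))
}
```

Because `Get` returns nil on a miss, a walk that legitimately returns no PDUs cannot be told apart from a missing entry; a `([]PDU, bool)` return would avoid that if it matters.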

@SuperQ
Member

SuperQ commented Feb 22, 2024

Caching should also have a TTL specification, which would be specified per module and probably per walk/get within each module.

I think we need to expand on this and write a bit of a design doc for what we want from cache behaviors.

@servak
Contributor Author

servak commented Feb 23, 2024

I implemented this feature because I thought it would be good to avoid wasted requests within a single scrape. But if you want greater functionality from a cache service, we should certainly design it properly, so I'm going to close this PR.

@servak servak closed this Feb 23, 2024