Slow Image Pull Despite Hitting Local Cache with dfdaemon #3455

Open
liuyuxuan0723 opened this issue Aug 21, 2024 · 1 comment
liuyuxuan0723 commented Aug 21, 2024

I've deployed Dragonfly in a Kubernetes cluster using the Helm chart. I've configured it to proxy a private image registry. Here are my steps:

  1. On an arbitrary node, pulling a 160MB image for the first time takes 10 seconds.
  2. In dfdaemon's /var/lib/dragonfly directory, I can see cached pieces. Monitoring shows the dragonfly_scheduler_traffic metric with both the backtosource and remotepeer traffic types.
  3. After manually running crictl rmi to remove the test image from the node and pulling it again, the pull still takes 10 seconds. The dfdaemon logs seem to indicate a cache hit, but the pull is no faster.
  4. Monitoring again shows no local_peer traffic type in the dragonfly_scheduler_traffic metric.
    The dfdaemon log, which seems to show a local cache hit:
    [image: dfdaemon log screenshot]
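
To put the numbers from the steps above in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes that "160MB" can be treated as 160 MiB and that perPeerRateLimit and totalRateLimit in the dfdaemon config below are expressed per second; neither assumption is confirmed anywhere in this report.

```python
# Figures from the report: a 160 MB image, about 10 s per pull (first and second).
IMAGE_SIZE_MIB = 160   # assumption: "160MB" treated as 160 MiB
PULL_SECONDS = 10

# Limits copied from the dfdaemon config; assumed to mean MiB per second.
PER_PEER_RATE_LIMIT_MIB = 512
TOTAL_RATE_LIMIT_MIB = 1024


def throughput_mib_s(size_mib: float, seconds: float) -> float:
    """Average throughput of a pull in MiB/s."""
    return size_mib / seconds


rate = throughput_mib_s(IMAGE_SIZE_MIB, PULL_SECONDS)
share_of_peer_limit = rate / PER_PEER_RATE_LIMIT_MIB
share_of_total_limit = rate / TOTAL_RATE_LIMIT_MIB

print(f"observed throughput: {rate:.1f} MiB/s")
print(f"share of perPeerRateLimit: {share_of_peer_limit:.1%}")
print(f"share of totalRateLimit: {share_of_total_limit:.1%}")
```

If those assumptions hold, both pulls run well below the configured limits, so the rate limits by themselves would not explain the 10 seconds.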

My Questions:

  1. Why is the pull speed slow despite hitting the cache? It doesn’t differ much from pulling directly.
  2. Does the dragonfly_scheduler_traffic metric collect local_peer traffic? Or is there another metric to monitor local_peer traffic? (I noticed the Rust version of the client exposes related metrics.)
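
For reference, this is roughly how I check which traffic types show up: a minimal Python sketch that sums samples of one metric by one label in Prometheus text-format output. The label name `type`, the spelling of the label values, and the sample text are illustrative assumptions, not output copied from my cluster.

```python
import re
from collections import defaultdict

# One sample line in the Prometheus text format: name, optional {labels}, value.
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
# One label pair inside the braces: key="value" (with escaped characters allowed).
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def traffic_by_type(metrics_text, metric="dragonfly_scheduler_traffic", label="type"):
    """Sum the samples of `metric`, grouped by the value of `label`."""
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if not match or match.group("name") != metric:
            continue
        labels = dict(LABEL.findall(match.group("labels") or ""))
        totals[labels.get(label, "<none>")] += float(match.group("value"))
    return dict(totals)


# Hypothetical sample text, invented for illustration only.
SAMPLE = """\
# TYPE dragonfly_scheduler_traffic counter
dragonfly_scheduler_traffic{type="BACK_TO_SOURCE"} 100
dragonfly_scheduler_traffic{type="REMOTE_PEER",task_tag="a"} 60
dragonfly_scheduler_traffic{type="REMOTE_PEER",task_tag="b"} 40
some_other_metric{type="LOCAL_PEER"} 5
"""

result = traffic_by_type(SAMPLE)
print(result)
print("LOCAL_PEER present:", "LOCAL_PEER" in result)
```

In my case the local peer type never appears in the grouped result, which is what prompted the second question.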

My Configuration:

  • containerd:
version = 2
disabled_plugins = []
imports = []
oom_score = -999
required_plugins = []
root = '/cce/containerd'
state = '/run/containerd'
[debug]
  address = '/run/containerd/debug.sock'
  level = 'info'
[plugins]
  [plugins.'io.containerd.grpc.v1.cri']
    enable_selinux = false
    enable_tls_streaming = false
    max_concurrent_downloads = 10
    sandbox_image = 'registry.baidubce.com/cce-public/pause:3.1'
    stream_server_address = '127.0.0.1'
    stream_server_port = '0'
    [plugins.'io.containerd.grpc.v1.cri'.cni]
      bin_dir = '/opt/cni/bin'
      conf_dir = '/etc/cni/net.d'
      conf_template = ''
    [plugins.'io.containerd.grpc.v1.cri'.containerd]
      default_runtime_name = 'runc'
      [plugins.'io.containerd.grpc.v1.cri'.containerd.runtimes]
        [plugins.'io.containerd.grpc.v1.cri'.containerd.runtimes.runc]
          container_annotations = []
          pod_annotations = []
          privileged_without_host_devices = false
          runtime_type = 'io.containerd.runc.v2'
    [plugins."io.containerd.grpc.v1.cri".registry]
      config_path = "/etc/containerd/certs.d"
  • dfdaemon:

    aliveTime: 0s
    gcInterval: 1m0s
    keepStorage: true
    workHome:
    logDir:
    cacheDir:
    pluginDir:
    dataDir: /var/lib/dragonfly
    console: true
    health:
      path: /server/ping
      tcpListen:
        port: 40901
    verbose: false
    metrics: ":8000"
    scheduler:
      manager:
        enable: true
        netAddrs:
        - type: tcp
          addr: dragonfly-manager.dragonfly-system.svc.cluster.local:65003
        refreshInterval: 10m
      netAddrs:
      scheduleTimeout: 30s
      disableAutoBackSource: false
      seedPeer:
        clusterID: 1
        enable: false
        type: super
    host:
      idc: ""
      location: ""
    download:
      calculateDigest: false
      downloadGRPC:
        security:
          insecure: true
          tlsVerify: true
        unixListen:
          socket: ""
      peerGRPC:
        security:
          insecure: true
        tcpListen:
          port: 65000
      perPeerRateLimit: 512Mi
      prefetch: false
      totalRateLimit: 1024Mi
    upload:
      rateLimit: 1024Mi
      security:
        insecure: true
        tlsVerify: false
      tcpListen:
        port: 65002
    objectStorage:
      enable: false
      filter: Expires&Signature&ns
      maxReplicas: 3
      security:
        insecure: true
        tlsVerify: true
      tcpListen:
        port: 65004
    storage:
      diskGCThreshold: 50Gi
      multiplex: true
      strategy: io.d7y.storage.v2.simple
      taskExpireTime: 24h
    proxy:
      defaultFilter: Expires&Signature&ns
      defaultTag:
      tcpListen:
        namespace: /run/dragonfly/net
        port: 65001
      security:
        insecure: true
        tlsVerify: false
      registryMirror:
        dynamic: true
        insecure: false
        url: https://index.docker.io
      proxies:
        - regx: blobs/sha256.*
    security:
      autoIssueCert: false
      caCert: ""
      certSpec:
        validityPeriod: 4320h
      tlsPolicy: prefer
      tlsVerify: false
    network:
      enableIPv6: false
    announcer:
      schedulerInterval: 30s
    networkTopology:
      enable: false
      probe:
        interval: 20m

Environment:

  • Kubernetes: 1.24
  • containerd: 1.6
  • Dragonfly Chart: 1.1.65

gaius-qi (Member) commented

@jim3ma