Keep OSV feed data current and updated #9

Open
pombredanne opened this issue Nov 17, 2024 · 10 comments

@pombredanne

It would be great to keep this CVE feed current and updated.

I discovered its existence in this discussion:

@andrewpollock (who contributes to OSV) wrote in aboutcode-org/vulnerablecode#1661 (comment)

I did a quick Google search and happened upon https://github.com/kubernetes-sigs/cve-feed-osv (which makes me wonder why we haven't got OSV.dev importing it, but it is the first I knew of it) @oliverchang FYI

But the repo is not in sync with the latest security feed.

For instance, as of today:

Questions:

  • What is the process and which tools do you use to keep this current?
  • How can we help?
@pombredanne
Author

@knqyf263 gentle ping, since you were the last to commit, and @chen-keinan since you wrote the code.

I think there is a permission issue in https://github.com/kubernetes-sigs/cve-feed-osv/actions/runs/11403286917/job/31730038462
and that the code in https://github.com/kubernetes-sigs/cve-feed-osv/tree/main/collector is not running anymore because of some auth issue.

Is this code actually parsing the plain text, unstructured JSON feeds, and merging that with an NVD CVE record? It feels a bit brittle.

What if the security team were to create a structured feed in the first place?

There is a process at https://github.com/kubernetes/committee-security-response/blob/main/cna-handbook.md#populate-cve-details-after-public-disclosure that also creates a CVE JSON, and ensuring that they also provide proper version ranges and packages (or even better PURLs) would be awesome.

@chen-keinan
Member

chen-keinan commented Nov 18, 2024

@pombredanne thanks for catching this. I see that the update job is failing due to a workflow bot permission issue; I'll have a look.

@chen-keinan
Member

@pombredanne after taking a look, I see that the GitHub Action can't create PRs (new CVEs for review) due to org permissions.

@oliverchang

Thanks @chen-keinan !

Should we expect the feed to get updated soon? There doesn't seem to have been any automatic updates since #10

Regarding the feed itself, there are some other small changes required before we can start ingesting this into OSV, which @andrewpollock pointed out in google/osv.dev#281 (comment).

Repeating them here:

  1. Either adding kubernetes as an ecosystem to https://github.com/ossf/osv-schema/blob/main/docs/schema.md#defined-ecosystems, or using an existing ecosystem. Would the existing "Go" ecosystem, which refers to Go modules, work here, or would a separate kubernetes ecosystem still be necessary?

  2. Prepending the CVE IDs with a unique prefix (e.g. "KUBE-CVE-") to distinguish the Kubernetes published CVEs.

Who would be the right point of contact for these changes?

@chen-keinan
Member

Thanks @chen-keinan !

Should we expect the feed to get updated soon? There doesn't seem to have been any automatic updates since #10

@oliverchang I'm working on fixing the feed update issue.

  • Thanks for pointing out google/osv.dev#281. I assume the feed update does not depend on it, though it would be good to have it added to the OSV spec.
  1. The purpose of this OSV feed is to report on Kubernetes core components (api-server, kubelet, kube-controller, etc.), meaning the application level rather than the library level (Go packages).
    Making it a Go ecosystem would not work in terms of AppSec matching.
  2. Using the kubernetes ecosystem would work.

You can contact me or @knqyf263.

@knqyf263
Contributor

Either adding kubernetes as an ecosystem to https://github.com/ossf/osv-schema/blob/main/docs/schema.md#defined-ecosystems, or using an existing ecosystem. Would the existing "Go" ecosystem, which refers to Go modules, work here, or would a separate kubernetes ecosystem still be necessary?

There are several reasons we chose kubernetes as the ecosystem.

First, Kubernetes has various distributions, such as EKS and GKE. These distributions often fix the same vulnerabilities as upstream. Since we were basing our thinking on PURLs, we thought a namespace would be a good way to represent these, like pkg:kubernetes/eks/k8s.io/apiserver and pkg:kubernetes/gke/k8s.io/apiserver. We literally saw Kubernetes as a single ecosystem.

Secondly, we were also concerned about the mismatch between the Go module versioning and the Kubernetes versioning. For example, the vulnerability describes the affected versions as follows:

Affected Versions
kube-apiserver v1.29.0 - v1.29.3
kube-apiserver v1.28.0 - v1.28.8
kube-apiserver <= v1.27.12

However, the version of k8s.io/apiserver as a Go module is v0.29.0, etc. We felt that defining the affected version as v1.29.0 for k8s.io/apiserver as the Go ecosystem would be inaccurate.

"name": "k8s.io/apiserver"
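
To make the mismatch concrete, here is a minimal Python sketch of the mapping between a Kubernetes release version and the corresponding staging Go module tag. It assumes the module follows the v0.X.Y for v1.X.Y tagging convention implied above; the function name is hypothetical and not part of this repository.

```python
import re


def staging_module_version(kubernetes_version: str) -> str:
    """Map a Kubernetes release version (v1.X.Y) to a staging module tag (v0.X.Y).

    Assumption: the Go module is tagged v0.X.Y for Kubernetes v1.X.Y.
    """
    match = re.fullmatch(r"v1\.(\d+)\.(\d+)", kubernetes_version)
    if match is None:
        raise ValueError(f"unsupported Kubernetes version: {kubernetes_version!r}")
    minor, patch = match.groups()
    return f"v0.{minor}.{patch}"


print(staging_module_version("v1.29.0"))  # v0.29.0
```

An advisory that says "affected: v1.29.0" therefore names a version that never exists for the Go module itself, which is why a direct Go-ecosystem match would be misleading.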

Lastly, there were cases where components could not be identified from the advisory.

In this case, the Go module name is unknown, but we are using the pseudo k8s.io/kubernetes to detect vulnerabilities using a cluster version of Kubernetes, which is not a Go module.

"name": "k8s.io/kubernetes"

We are using the Kubernetes ecosystem because using the Go ecosystem for these advisories would be inaccurate.

We could use the Go ecosystem if the OSV team has good ideas for these concerns. This project is still experimental in some aspects, and we are open to suggestions.

@oliverchang

Thanks for the context and the very detailed examples explaining this @knqyf263!

This makes perfect sense to me, if the Kubernetes ecosystem has both a different versioning scheme and a different namespace from the Kubernetes Go modules. Out of curiosity, do you have any pointers for how users would construct these identifiers for vulnerability scanning? e.g. how a vulnerability scanner would make use of this DB and what types of things it would be scanning?

Would you be able to help with contributing a definition of the "Kubernetes" ecosystem to the OSV schema? https://ossf.github.io/osv-schema/#affectedpackage-field

And also, would you be OK with prepending a "KUBE-" (or similar) ID prefix to the CVE- prefixes for all of these records, to distinguish these from non-Kubernetes sourced CVEs?

i.e.

{
  "id": "KUBE-CVE-2020-8564",
  "aliases": [ "CVE-2020-8564" ]
}
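
A minimal Python sketch of that renaming, assuming records are plain dictionaries in the OSV JSON shape; the helper name and the default prefix argument are hypothetical and only illustrate the proposal.

```python
import json


def with_home_prefix(record: dict, prefix: str = "KUBE-") -> dict:
    """Return a copy of an OSV-style record whose ID carries a database prefix.

    The original CVE ID is kept in "aliases" so the record can still be
    correlated with records from other databases.
    """
    original_id = record["id"]
    if original_id.startswith(prefix):
        return dict(record)  # already prefixed, nothing to change
    aliases = list(record.get("aliases", []))
    if original_id not in aliases:
        aliases.append(original_id)
    return {**record, "id": prefix + original_id, "aliases": aliases}


print(json.dumps(with_home_prefix({"id": "CVE-2020-8564"}), indent=2))
```

Applied to a record with the ID CVE-2020-8564, this yields the same "id" and "aliases" values as the snippet above.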

@knqyf263
Contributor

Out of curiosity, do you have any pointers for how users would construct these identifiers for vulnerability scanning? e.g. how a vulnerability scanner would make use of this DB and what types of things it would be scanning?

I'm not familiar with other scanners, but I can explain how this advisory is used in Trivy, which I maintain.

As general background, when Trivy finds a Go binary, it extracts the embedded information to obtain the Go main module name and its dependencies. This is equivalent to the information you get from go version -m, as you may already know. In these cases, Trivy uses OSV advisories from the Go ecosystem.

However, this method may not work well when scanning Kubernetes components for vulnerabilities for two reasons:

  1. First, we often don't have access to Go binaries. When scanning a Kubernetes cluster externally, for example, we cannot access the kubelet binary file.
  2. Second, even when we do have access to Go binaries, we often cannot obtain the main module version. Binaries that weren't installed via go install will show (devel) when using go version -m.

For these reasons, Trivy obtains component versions through the Kubernetes API as much as possible.

$ kubectl get nodes -o yaml | grep kubeletVersion
      kubeletVersion: v1.27.1

In the above example, we can determine that the Kubelet version is v1.27.1. We map Kubelet to k8s.io/kubelet and then search for advisories where ecosystem = "kubernetes" and name = "k8s.io/kubelet" affecting v1.27.1. If there's a match, we detect it as a vulnerability. @chen-keinan You know more about this, so please correct me if I'm wrong about anything.
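
The lookup described above can be sketched in Python as follows. This is an illustration only, not Trivy's implementation: the advisory ID is a placeholder, the kube-apiserver mapping is an assumption, and the ranges reuse the "Affected Versions" example quoted earlier in this thread. It handles plain vMAJOR.MINOR.PATCH versions only.

```python
# Component name -> advisory package name. The kubelet entry comes from the
# comment above; the kube-apiserver entry is an assumption.
COMPONENT_TO_PACKAGE = {
    "kubelet": "k8s.io/kubelet",
    "kube-apiserver": "k8s.io/apiserver",
}

# Hypothetical advisory: placeholder ID, inclusive (low, high) ranges.
# A low bound of None means "everything up to and including high".
ADVISORIES = [
    {
        "id": "EXAMPLE-ADVISORY",
        "ecosystem": "kubernetes",
        "name": "k8s.io/apiserver",
        "affected": [("v1.29.0", "v1.29.3"), ("v1.28.0", "v1.28.8"), (None, "v1.27.12")],
    },
]


def parse(version):
    """Turn 'v1.27.12' into (1, 27, 12) so versions compare numerically."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def is_affected(version, low, high):
    """Check whether version falls inside the inclusive range [low, high]."""
    v = parse(version)
    return (low is None or parse(low) <= v) and v <= parse(high)


def find_advisories(component, version):
    """Return IDs of advisories matching the component name and version."""
    package = COMPONENT_TO_PACKAGE[component]
    return [
        adv["id"]
        for adv in ADVISORIES
        if adv["ecosystem"] == "kubernetes"
        and adv["name"] == package
        and any(is_affected(version, low, high) for low, high in adv["affected"])
    ]


print(find_advisories("kube-apiserver", "v1.27.1"))  # ['EXAMPLE-ADVISORY']
print(find_advisories("kube-apiserver", "v1.29.4"))  # []
```

Versions are compared as integer tuples rather than strings, since a string comparison would rank v1.27.12 below v1.27.2.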

To elaborate on why we treat this as a Kubernetes ecosystem: This is what @chen-keinan meant by application-level rather than library-level. We're not obtaining Go module information, but rather getting the name and version of components within Kubernetes. We considered forcing it to work with pkg:golang/k8s.io/kubelet, but EKS, GKE and other distributions likely use different Kubelet binaries from upstream, so treating it as pkg:golang/k8s.io/kubelet would probably be incorrect. Therefore, we treat it as pkg:kubernetes/k8s.io/kubelet, and when we add GKE support in the future, we can handle it as pkg:kubernetes/gke/k8s.io/kubelet.
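
As a rough Python sketch of how such identifiers could be assembled: the helper below is hypothetical, treats the distribution as a leading namespace segment as in the examples above, and assumes the inputs need no percent-encoding.

```python
from typing import Optional


def kubernetes_purl(name: str, distribution: Optional[str] = None) -> str:
    """Build a PURL-style identifier in the pkg:kubernetes/... form used here.

    Assumption: the distribution (e.g. "gke", "eks") is a leading namespace
    segment; upstream components have no distribution segment.
    """
    segments = ["kubernetes"]
    if distribution:
        segments.append(distribution.lower())
    segments.append(name)
    return "pkg:" + "/".join(segments)


print(kubernetes_purl("k8s.io/kubelet"))         # pkg:kubernetes/k8s.io/kubelet
print(kubernetes_purl("k8s.io/kubelet", "gke"))  # pkg:kubernetes/gke/k8s.io/kubelet
```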

Would you be able to help with contributing a definition of the "Kubernetes" ecosystem to the OSV schema? https://ossf.github.io/osv-schema/#affectedpackage-field

Sure. I'll try it later.

And also, would you be OK with prepending a "KUBE-" (or similar) ID prefix to the CVE- prefixes for all of these records, to distinguish these from non-Kubernetes sourced CVEs?

While we have no objections to this, I'm curious about the reason for wanting to distinguish these, given that Kubernetes uses CVE-IDs, unlike vendor-specific IDs such as GHSA-ID or RHSA-ID.

@oliverchang

I'm not familiar with other scanners, but I can explain how this advisory is used in Trivy, which I maintain.

I thought your name looked familiar!

Thanks for the information on how the component versions are obtained and how Trivy works here.

While we have no objections to this, I'm curious about the reason for wanting to distinguish these, given that Kubernetes uses CVE-IDs, unlike vendor-specific IDs such as GHSA-ID or RHSA-ID.

This is because OSV has a concept of a home database. The same CVE may be expressed in multiple places (e.g. NVD/CVE list, and custom advisory databases), and we need to distinguish them because the contents might be different. To do this, records typically have a custom identifier prefix.

An example is https://osv.dev/vulnerability/UBUNTU-CVE-2023-25136 vs https://osv.dev/vulnerability/CVE-2023-25136, where a Linux distribution wants to publish distro-specific information about a CVE.

Another example is curl, which also publishes OSV records directly: https://osv.dev/vulnerability/CURL-CVE-2024-9681.
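
A small Python sketch of why the prefix matters: the record IDs below stay distinct even though two of them describe the same CVE. The grouping here is a heuristic based purely on the ID text and is an assumption for illustration; it is not how OSV itself correlates records.

```python
import re
from collections import defaultdict

CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}")


def underlying_cve(record_id):
    """Extract the CVE ID embedded in a record ID, or None if there is none."""
    match = CVE_PATTERN.search(record_id)
    return match.group(0) if match else None


def group_by_cve(record_ids):
    """Group record IDs from different home databases by the CVE they cover."""
    groups = defaultdict(list)
    for record_id in record_ids:
        cve = underlying_cve(record_id)
        if cve is not None:
            groups[cve].append(record_id)
    return dict(groups)


ids = ["UBUNTU-CVE-2023-25136", "CVE-2023-25136", "CURL-CVE-2024-9681"]
print(group_by_cve(ids))
```

Without the prefix, the Ubuntu record and the plain CVE record would share one ID and one of them would have to overwrite the other.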

@knqyf263
Contributor

knqyf263 commented Dec 4, 2024

This is because OSV has a concept of a home database. The same CVE may be expressed in multiple places (e.g. NVD/CVE list, and custom advisory databases), and we need to distinguish them because the contents might be different. To do this, records typically have a custom identifier prefix.

Thanks for explaining. It makes sense.

I opened a PR.
ossf/osv-schema#319


4 participants