Oss observability proposal draft #806

Open
wants to merge 13 commits into
base: main

Conversation

TheGrizzlyDev
Contributor

The goal of this proposal is to add observability to Buck2's OSS variant that is on par with similar build systems like Bazel.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 12, 2024
@aherrmann
Contributor

Related: #811 provides a draft implementation of BES support for Buck2.

@sluongng
Contributor

Let's move the discussion from Discord to this PR so folks can reference items in our discussion more easily. For context, I also raised bazelbuild/remote-apis#318 to the Remote API working group and got some pretty interesting feedback there.

First, let's give a high-level overview of Bazel's current BES/BEP state.

Currently, in Bazel, we have this PublishBuildEvent service with PublishBuildToolEventStream being the most interesting method inside.

service PublishBuildEvent {
  ...
  // Publish build tool events belonging to the same stream to a backend job
  // using bidirectional streaming.
  rpc PublishBuildToolEventStream(stream PublishBuildToolEventStreamRequest)
      returns (stream PublishBuildToolEventStreamResponse) {
      ...
  }
}

This establishes a bi-directional stream for the client to send events to the server.
Each event carries a stream ID and a sequence number (int64). The server sends the sequence number back to the client as an "ack" to confirm that the event was received. This send-ack mechanism keeps events in order and lets the client detect failed sends and retry them.
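To make the send-ack idea concrete, here is a minimal sketch of the bookkeeping a client might do. It is not Bazel's implementation: `AckTracker` and its methods are hypothetical names, and it assumes acks arrive in order, so acking sequence number N covers everything up to N.

```python
from dataclasses import dataclass, field


@dataclass
class AckTracker:
    """Client-side bookkeeping for one stream (illustrative only)."""

    stream_id: str
    next_seq: int = 1
    # Maps sequence number -> payload; dicts keep insertion order.
    pending: dict = field(default_factory=dict)

    def send(self, payload: bytes) -> int:
        """Assign the next sequence number and keep the event until it is acked."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending[seq] = payload
        return seq

    def ack(self, seq: int) -> None:
        """Drop every pending event up to and including seq (assumes in-order acks)."""
        for acked in [s for s in self.pending if s <= seq]:
            del self.pending[acked]

    def unacked(self) -> list:
        """Events to resend, in their original order, after a broken stream."""
        return list(self.pending.items())


tracker = AckTracker(stream_id="build-1")
for payload in (b"started", b"progress", b"finished"):
    tracker.send(payload)
tracker.ack(2)
print(tracker.unacked())  # [(3, b'finished')]
```

If the stream drops, the client can reconnect and resend whatever `unacked()` returns, which is what makes retries safe.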

I think this is a good design that we should keep in Buck2.

Next, let's talk about the build events. There are actually two "event" protos in Bazel: an outer generic one and an inner set of Bazel-specific events.

The first type of event is google.devtools.build.v1.BuildEvent.

message BuildEvent {
  ...
  // //////////////////////////////////////////////////////////////////////////
  // Events that indicate a state change of a build request in the build
  // queue.
  oneof event {
    // An invocation attempt has started.
    InvocationAttemptStarted invocation_attempt_started = 51;

    // An invocation attempt has finished.
    InvocationAttemptFinished invocation_attempt_finished = 52;

    // The build is enqueued.
    BuildEnqueued build_enqueued = 53;

    // The build has finished. Set when the build is terminated.
    BuildFinished build_finished = 55;

    // An event containing printed text.
    ConsoleOutput console_output = 56;

    // Indicates the end of a build event stream (with the same StreamId) from
    // a build component executing the requested build task.
    // *** This field does not indicate the WatchBuild RPC is finished. ***
    BuildComponentStreamFinished component_stream_finished = 59;

    // Structured build event generated by Bazel about its execution progress.
    google.protobuf.Any bazel_event = 60;

    // An event that contains supplemental tool-specific information about
    // build execution.
    google.protobuf.Any build_execution_event = 61;

    // An event that contains supplemental tool-specific information about
    // source fetching.
    google.protobuf.Any source_fetch_event = 62;
  }
}

This is what is sent over the bidirectional stream above. All Bazel-specific events are currently stuffed into the Any bazel_event = 60; field. Together, this generic outer event and the service can be referred to as Bazel's Build Event Service (BES).

Bazel's Build Event Protocol (BEP) is the set of inner events carried in the bazel_event field of BES. Confusingly, BEP is defined in a file called build_event_stream.proto.
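The outer/inner split can be sketched as an envelope around an opaque payload. The classes below are plain-Python stand-ins, not the real protobuf types: `AnyPayload` imitates google.protobuf.Any, JSON replaces protobuf serialization to keep the sketch dependency-free, and the type URL is my assumption of what the BEP message would use.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AnyPayload:
    """Stand-in for google.protobuf.Any: a type URL plus opaque bytes."""

    type_url: str
    value: bytes


@dataclass(frozen=True)
class OuterBuildEvent:
    """Stand-in for the generic BES event; only the bazel_event slot is modelled."""

    bazel_event: AnyPayload


def wrap_bep(bep_event: dict) -> OuterBuildEvent:
    """Pack a tool-specific (BEP-like) event into the generic outer event."""
    payload = AnyPayload(
        # Assumed type URL, shown for illustration only.
        type_url="type.googleapis.com/build_event_stream.BuildEvent",
        # The real protocol carries a serialized protobuf message here.
        value=json.dumps(bep_event, sort_keys=True).encode(),
    )
    return OuterBuildEvent(bazel_event=payload)


def unwrap_bep(event: OuterBuildEvent) -> dict:
    """Recover the inner event; a backend would dispatch on type_url first."""
    return json.loads(event.bazel_event.value)
```

The point of the envelope is that a BES backend can transport and acknowledge events without understanding the inner payload.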

In the Remote APIs meetings, folks were very much in favor of standardizing BES rather than BEP, since the latter is specific to Bazel. My earlier proposal in #685, which @aherrmann is building upon, introduced a much simpler version of BES that is specific to Buck2's BuckEvent:

message BuckEventRequest {
  // A trace-unique 64-bit integer identifying the stream.
  uint64 stream_id = 1;

  buck.data.BuckEvent event = 2;
};

message BuckEventResponse {
  // A trace-unique 64-bit integer identifying the stream.
  uint64 stream_id = 1;

  // The trace ID of the event that has been committed.
  uint64 trace_id = 2;
};

service BuckEventPublisher {
  rpc StreamBuckEvent(stream BuckEventRequest) returns (stream BuckEventResponse);
};

I picked this because I think it's the simplest interface that can get the job done.
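For illustration, the server side of such a stream could be as small as the sketch below. The dataclasses mirror the messages above, but the `BuckEvent` fields (an integer `trace_id` and a `data` string) and the list-backed `store` are assumptions made only for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class BuckEvent:
    trace_id: int  # assumed field, for illustration
    data: str      # assumed field, for illustration


@dataclass
class BuckEventRequest:
    stream_id: int
    event: BuckEvent


@dataclass
class BuckEventResponse:
    stream_id: int
    trace_id: int


def stream_buck_event(
    requests: Iterable[BuckEventRequest], store: List[BuckEvent]
) -> Iterator[BuckEventResponse]:
    """Commit each incoming event, then acknowledge it on the response stream."""
    for request in requests:
        store.append(request.event)  # "commit" the event to the backing store
        yield BuckEventResponse(request.stream_id, request.event.trace_id)
```

A real service would sit behind gRPC and persist events durably before acknowledging them; the generator only shows the request-to-response shape.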
I would also be OK with folks implementing BES instead of this interface, with a new field added to BuildEvent like this:

message BuildEvent {
  ...
  oneof event {
    ...

    // Structured build event generated by Bazel about its execution progress.
    google.protobuf.Any bazel_event = 60;

    ...

    // Structured build event generated by Buck2 about its execution progress.
    google.protobuf.Any buck2_event = 63;
  }
}

Just note that adopting the entire BES could add complexity to the implementation.


Protocol aside, BuckEvent needs to work with existing tooling today.
Most available UIs and tools work with Bazel's BEP rather than BuckEvent, and adding BuckEvent support to them would take a lot of effort. So I see a need for a conversion crate that transforms some BuckEvent messages into BEP messages.

However, I prefer to treat that as a separate problem from the protocol: doing so narrows the problem scope and increases the implementation's chance of success.

@sluongng
Contributor

sluongng commented Nov 15, 2024

Small correction: @aherrmann implemented BuckEvent -> Bazel BEP conversion in #811 instead of using my proposal 😅

@aherrmann
Contributor

> Small correction: @aherrmann implemented BuckEvent -> Bazel BEP conversion in #811 instead of using my proposal 😅

I opted for that route for now to keep the implementation simple while it is at a more exploratory stage. That said, it is structured in a way that should make it easy to adopt the pattern proposed in #685.
buck_to_bazel_events turns a stream of Buck events into a stream of Bazel events. This could be turned into an adapter component that serves a Buck2 event stream protocol as proposed in #685 and proxies to a Bazel event stream server.
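As a rough sketch of the adapter shape (not the code in #811), a stream-to-stream converter could look like this. The event kinds in `KIND_MAP` and the dict-based event layout are hypothetical placeholders; the real Buck2 and BEP events are protobuf messages with many more cases.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical mapping from Buck event kinds to BEP-like event kinds.
KIND_MAP = {
    "command_start": "started",
    "action_execution_end": "action_completed",
    "command_end": "finished",
}


def convert_events(buck_events: Iterable[dict]) -> Iterator[dict]:
    """Lazily translate Buck-style events, skipping kinds with no equivalent."""
    for event in buck_events:
        kind = KIND_MAP.get(event["kind"])
        if kind is None:
            continue  # no Bazel-side counterpart in this sketch
        yield {"kind": kind, "payload": event.get("data", {})}


def proxy(buck_events: Iterable[dict], send: Callable[[dict], None]) -> int:
    """Forward converted events to a Bazel-style sink; returns how many were sent."""
    sent = 0
    for bazel_event in convert_events(buck_events):
        send(bazel_event)
        sent += 1
    return sent
```

Because the converter is a generator, it can sit between a Buck2 event stream and a Bazel event stream server without buffering the whole build.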

@facebook-github-bot
Contributor

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
