Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(bigtable): first_response_latencies and connectivity_error_count metrics #10616

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

bhshkh
Copy link
Contributor

@bhshkh bhshkh commented Jul 31, 2024

@product-auto-label product-auto-label bot added the api: bigtable Issues related to the Bigtable API. label Jul 31, 2024
@bhshkh bhshkh marked this pull request as ready for review August 1, 2024 17:54
@bhshkh bhshkh requested review from a team as code owners August 1, 2024 17:55
@bhshkh bhshkh requested a review from gkevinzheng August 1, 2024 17:55
@bhshkh bhshkh enabled auto-merge (squash) August 1, 2024 18:14
@bhshkh bhshkh requested a review from igorbernstein2 August 5, 2024 16:45
@@ -397,6 +400,7 @@ func (t *Table) readRows(ctx context.Context, arg RowSet, f func(Row) bool, mt *
*headerMD, _ = stream.Header()
for {
res, err := stream.Recv()
mt.currOp.setFirstRespTime(time.Now())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when multiple responses are received?

what happens when there are no rows returned?

is the behavior consistent with java?

Does this work across retries?

Copy link
Contributor Author

@bhshkh bhshkh Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when multiple responses are received?
Updated the code to set only on the first response.

what happens when there are no rows returned?
It will still record first response time
Tested this by adding some logs:

func (o *opTracer) setFirstRespTime(t time.Time) {
	fmt.Println("Entering setFirstRespTime")
	o.firstRespTimeOnce.Do(func() {
		fmt.Println("Entering firstRespTimeOnce.Do")
		o.firstRespTime = t
	})
}

With the following client:

                fmt.Println("Reading row with key:", rowKey)
		row, err := tbl.ReadRow(ctx, rowKey, bigtable.RowFilter(bigtable.ColumnFilter(columnName)))
		if err != nil {
			fmt.Printf("Could not read row with key %s: %v\n", rowKey, err)
		} else {
			fmt.Println(time.Now(), " Successfully read row with key ", rowKey)
		}
		fmt.Printf("row: %+v\n", row)

It prints:

Reading row with key: row-00000-wrong
Entering setFirstRespTime
Entering firstRespTimeOnce.Do
2024-12-20 23:59:56.8720749 +0000 UTC m=+0.016615559  Successfully read row with key  row-00000-wrong
row: map[]

Does this work across retries?

Yes
Tested this by modifying the emulator to return Unavailable error on first 3 tries and returning successful response on 4th

Modified emulator code:

func streamRow(stream btpb.Bigtable_ReadRowsServer, r *row, f *btpb.RowFilter, s *btpb.ReadIterationStats, ff *btpb.FeatureFlags) error {
...
...
        numReadRowsReq++
	if numReadRowsReq < 4 {
		time.Sleep(5 * time.Second)
		fmt.Println(time.Now().String() + " Sending error")
		return status.Error(codes.Unavailable, "Mock unavailable error")
	} else {
		time.Sleep(5 * time.Second)
		fmt.Println(time.Now().String() + " Sending row response")
		return stream.Send(rrr)
	}
}

Logs added to client library:

func (o *opTracer) setFirstRespTime(t time.Time) {
	fmt.Println(time.Now().String() + " Entering setFirstRespTime")
	o.firstRespTimeOnce.Do(func() {
		fmt.Println(time.Now().String() + " Entering firstRespTimeOnce.Do")
		o.firstRespTime = t
	})
}

This prints emulator logs:

2024-12-23 06:52:02.09841592 +0000 UTC m=+12.796711302 Sending error
2024-12-23 06:52:07.170160688 +0000 UTC m=+17.868456072 Sending error
2024-12-23 06:52:12.274257693 +0000 UTC m=+22.972553076 Sending error
2024-12-23 06:52:17.295424927 +0000 UTC m=+27.993720311 Sending row response

and client logs:

2024-12-23 06:52:17.296566065 +0000 UTC m=+20.215521414 Entering setFirstRespTime
2024-12-23 06:52:17.296745978 +0000 UTC m=+20.215701327 Entering firstRespTimeOnce.Do
2024-12-23 06:52:17.296801894 +0000 UTC m=+20.215757239 Entering setFirstRespTime
2024-12-23 06:52:17.29681528 +0000 UTC m=+20.215770640  Successfully read row with key  row-00000

Notice the timestamp of 'Entering firstRespTimeOnce.Do' is after 'Sending row response' i.e. first response time is recorded at the first successful response.

is the behavior consistent with java?

Yes

serverLatAttrs, _ := mt.toOtelMetricAttrs(metricNameServerLatencies)
if mt.currOp.currAttempt.serverLatencyErr == nil {
mt.instrumentServerLatencies.Record(mt.ctx, mt.currOp.currAttempt.serverLatency, metric.WithAttributes(serverLatAttrs...))
}
if mt.currOp.currAttempt.serverLatencyErr == nil && mt.currOp.currAttempt.locationErr == nil {
mt.instrumentConnErrCount.Add(mt.ctx, 0, metric.WithAttributes(connErrCountAttrs...))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you need to add a zero?

Copy link
Contributor Author

@bhshkh bhshkh Sep 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bhshkh bhshkh requested a review from mutianf December 23, 2024 18:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigtable Issues related to the Bigtable API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants