Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Smithy-go's HTTP metrics are prone to data races #548

Open
alexandrupitis1 opened this issue Oct 27, 2024 · 3 comments
Open

Smithy-go's HTTP metrics are prone to data races #548

alexandrupitis1 opened this issue Oct 27, 2024 · 3 comments

Comments

@alexandrupitis1
Copy link

Smithy-go's HTTP metrics use httptrace.ClientTrace. To quote the docs:

ClientTrace is a set of hooks to run at various stages of an outgoing HTTP request. Any particular hook may be nil. Functions may be called concurrently from different goroutines and some may be called after the request has completed or failed.

However, current implementation is not safe to call concurrently, as there are write-write data races on various methods ( for example ConnectStart ). This in turn trips data race detection.

@abraithwaite
Copy link

Just ran into this myself:

==================
WARNING: DATA RACE
Write at 0x00c000578390 by goroutine 59:
  github.com/aws/smithy-go/transport/http.(*httpMetrics).ConnectStart()
      /Users/abraithwaite/Projects/golang/pkg/mod/github.com/aws/[email protected]/transport/http/metrics.go:122 +0x6c
  github.com/aws/smithy-go/transport/http.(*httpMetrics).ConnectStart-fm()
      <autogenerated>:1 +0x20
  net.(*sysDialer).dialSingle()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/dial.go:657 +0x108
  net.(*sysDialer).dialSerial()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/dial.go:635 +0x1bc
  net.(*sysDialer).dialParallel.func1()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/dial.go:555 +0xb4
  net.(*sysDialer).dialParallel.gowrap4()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/dial.go:581 +0x58

Previous write at 0x00c000578390 by goroutine 56:
  github.com/aws/smithy-go/transport/http.(*httpMetrics).ConnectStart()
      /Users/abraithwaite/Projects/golang/pkg/mod/github.com/aws/[email protected]/transport/http/metrics.go:122 +0x6c
  github.com/aws/smithy-go/transport/http.(*httpMetrics).ConnectStart-fm()
      <autogenerated>:1 +0x20
  net.(*sysDialer).dialSingle()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/dial.go:657 +0x108
  net.(*sysDialer).dialSerial()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/dial.go:635 +0x1bc
  net.(*sysDialer).dialParallel.func1()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/dial.go:555 +0xb4
  net.(*sysDialer).dialParallel.gowrap2()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/dial.go:570 +0x58

Goroutine 59 (running) created at:
  net.(*sysDialer).dialParallel()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/dial.go:581 +0x6dc
  net.(*Dialer).DialContext()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/dial.go:527 +0x878
  net.(*Dialer).DialContext-fm()
      <autogenerated>:1 +0x64
  github.com/aws/aws-sdk-go-v2/aws/transport/http.(*BuildableClient).GetTransport.defaultHTTPTransport.traceDialContext.func1()
      /Users/abraithwaite/Projects/golang/pkg/mod/github.com/aws/[email protected]/aws/transport/http/client.go:207 +0xb4
  net/http.(*Transport).dial()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/http/transport.go:1226 +0x90
  net/http.(*Transport).dialConn()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/http/transport.go:1728 +0xa3c
  net/http.(*Transport).dialConnFor()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/http/transport.go:1563 +0xb8
  net/http.(*Transport).startDialConnForLocked.func1()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/http/transport.go:1545 +0x38

Goroutine 56 (running) created at:
  net.(*sysDialer).dialParallel()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/dial.go:570 +0x310
  net.(*Dialer).DialContext()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/dial.go:527 +0x878
  net.(*Dialer).DialContext-fm()
      <autogenerated>:1 +0x64
  github.com/aws/aws-sdk-go-v2/aws/transport/http.(*BuildableClient).GetTransport.defaultHTTPTransport.traceDialContext.func1()
      /Users/abraithwaite/Projects/golang/pkg/mod/github.com/aws/[email protected]/aws/transport/http/client.go:207 +0xb4
  net/http.(*Transport).dial()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/http/transport.go:1226 +0x90
  net/http.(*Transport).dialConn()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/http/transport.go:1728 +0xa3c
  net/http.(*Transport).dialConnFor()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/http/transport.go:1563 +0xb8
  net/http.(*Transport).startDialConnForLocked.func1()
      /opt/homebrew/Cellar/go/1.23.3/libexec/src/net/http/transport.go:1545 +0x38
==================

@rjeczalik
Copy link

Just did a quick research, and found that this:

func (m *httpMetrics) ConnectStart(string, string) {
m.connectStart = now()
}

Can be called concurrently if transport is dual-stack:

https://github.com/golang/go/blob/6293a0683171e3a08e275884fddeaa2b57a861f6/src/internal/nettrace/nettrace.go#L36-L45

Due to this:

https://github.com/golang/go/blob/6293a0683171e3a08e275884fddeaa2b57a861f6/src/net/http/httptrace/trace.go#L43-L46

rjeczalik added a commit to rjeczalik/smithy-go that referenced this issue Dec 9, 2024
This PR fixes a race condition reported by aws#548,
by introducing a simple wrapper for time.Time.

While by default aws-sdk-go-v2 uses NopMeterProvider,
the transport/http internals are still instrumenting
the http requests (the withMetrics func).

Maybe there is a room for improvement, to not call
httptrace.WithClientTrace when the meter is nop,
however this is out of scope for this PR.
@rjeczalik
Copy link

#555

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants