[wip] [serverless] Add S3 span pointers #3083

Draft
wants to merge 42 commits into
base: main
Changes from 10 commits
Commits
c6bf758
create skeleton for S3 span pointer creation
nhulston Jan 13, 2025
ce4f56a
get key+bucket from S3 request and etag from S3 response
nhulston Jan 13, 2025
963dadd
calculate hash following span pointer rules
nhulston Jan 13, 2025
c5f0539
add span pointer to links and update `_dd.span_links` tag
nhulston Jan 13, 2025
6a2bb47
add copyright to span_pointers.go
nhulston Jan 13, 2025
98a1df8
add tests for `generatePointerHash`
nhulston Jan 14, 2025
32f31c7
simplify param checks
nhulston Jan 14, 2025
eb16fdd
finish aws span in deserialize, not in init.
nhulston Jan 14, 2025
ded7e8e
implement `SpanContextWithLinks` in `mockspancontext`
nhulston Jan 14, 2025
02da8e3
add TestHandleS3Operation
nhulston Jan 14, 2025
9790821
f
nhulston Jan 15, 2025
acb79a8
impl noop
nhulston Jan 15, 2025
1e8d8c1
temp impl civisibility
nhulston Jan 15, 2025
7fcb8d6
test
nhulston Jan 15, 2025
3020a61
test
nhulston Jan 15, 2025
c9163ef
test
nhulston Jan 15, 2025
ec580e1
test
nhulston Jan 15, 2025
2d77b2a
test
nhulston Jan 15, 2025
0f14d96
test
nhulston Jan 15, 2025
ec57e5c
test
nhulston Jan 15, 2025
7031c27
test
nhulston Jan 15, 2025
d9ae023
test
nhulston Jan 15, 2025
6656be0
test
nhulston Jan 15, 2025
397aaf9
test
nhulston Jan 15, 2025
7aa5eef
test
nhulston Jan 15, 2025
c228cc4
test
nhulston Jan 15, 2025
e9019bc
test
nhulston Jan 15, 2025
61dff10
test
nhulston Jan 15, 2025
2efd7ec
test
nhulston Jan 15, 2025
451c77a
test
nhulston Jan 15, 2025
66c37dd
test
nhulston Jan 15, 2025
c6b7011
test
nhulston Jan 15, 2025
097bbd9
test
nhulston Jan 15, 2025
ad9d8c6
test
nhulston Jan 15, 2025
1d5a2ac
test
nhulston Jan 15, 2025
4b23bd9
test
nhulston Jan 15, 2025
14e74b5
test
nhulston Jan 15, 2025
3b048bd
test
nhulston Jan 15, 2025
b510644
test
nhulston Jan 15, 2025
2963c53
test
nhulston Jan 15, 2025
7e8fb2b
test
nhulston Jan 15, 2025
a325cfb
implement mockspan
nhulston Jan 15, 2025
10 changes: 9 additions & 1 deletion contrib/aws/aws-sdk-go-v2/aws/aws.go
@@ -12,6 +12,7 @@ import (
"strings"
"time"

+"gopkg.in/DataDog/dd-trace-go.v1/contrib/aws/internal/span_pointers"
"gopkg.in/DataDog/dd-trace-go.v1/contrib/aws/internal/tags"
"gopkg.in/DataDog/dd-trace-go.v1/ddtrace"
"gopkg.in/DataDog/dd-trace-go.v1/ddtrace/ext"
@@ -127,7 +128,6 @@ func (mw *traceMiddleware) startTraceMiddleware(stack *middleware.Stack) error {
if err != nil && (mw.cfg.errCheck == nil || mw.cfg.errCheck(err)) {
span.SetTag(ext.Error, err)
}
-span.Finish()

return out, metadata, err
}), middleware.After)
@@ -357,6 +357,14 @@ func (mw *traceMiddleware) deserializeTraceMiddleware(stack *middleware.Stack) e
span.SetTag(tags.AWSRequestID, requestID)
}

+// Create S3 span pointer
+serviceID := awsmiddleware.GetServiceID(ctx)
+if serviceID == "S3" {
+span_pointers.HandleS3Operation(in, out, span)
+}
+
+span.Finish()
Contributor Author

The AWS span should stay open until we receive a response, rather than finishing when the request is sent. (This is how AWS spans work in other tracers.)

In my testing, this barely changes the span end time. Does anyone know why? I'd have expected a big difference.

Therefore, I'm moving `span.Finish()` to the deserialize middleware. I also had to do this to get the unit tests to pass.
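
One possible reason for the small difference, sketched with a hypothetical nested-handler model (plain Go functions, not the smithy-go API): if every step runs its post-processing only after its `next` handler returns, then code placed after `next` in the outermost step already runs once the response has come back. The names `handler`, `wrap`, and `run` are assumptions made for illustration.

```go
package main

import "fmt"

// handler is a hypothetical stand-in for a middleware step's "next" handler.
type handler func(events *[]string)

// wrap returns a handler that calls next first and then records that its own
// post-processing ran, mimicking code placed after the call to next.
func wrap(name string, next handler) handler {
	return func(events *[]string) {
		next(events) // every inner step completes before this returns
		*events = append(*events, name+": after next")
	}
}

// run builds initialize -> deserialize -> transport and returns the order in
// which the steps finished.
func run() []string {
	var events []string
	transport := func(events *[]string) {
		*events = append(*events, "transport: response received")
	}
	stack := wrap("initialize", wrap("deserialize", transport))
	stack(&events)
	return events
}

func main() {
	for _, e := range run() {
		fmt.Println(e)
	}
}
```

In this model both post-processing points come after the response, which would be consistent with the nearly identical end times; whether it matches the real middleware stack is worth confirming against the smithy-go documentation.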


return out, metadata, err
}), middleware.Before)
}
92 changes: 92 additions & 0 deletions contrib/aws/internal/span_pointers/span_pointers.go
@@ -0,0 +1,92 @@
// Unless explicitly stated otherwise all files in this repository are licensed
// under the Apache License Version 2.0.
// This product includes software developed at Datadog (https://www.datadoghq.com/).
// Copyright 2016 Datadog, Inc.

package span_pointers

import (
"crypto/sha256"
"encoding/hex"
"encoding/json"
"strings"

"github.com/aws/smithy-go/middleware"
smithyhttp "github.com/aws/smithy-go/transport/http"

"gopkg.in/DataDog/dd-trace-go.v1/ddtrace"
"gopkg.in/DataDog/dd-trace-go.v1/ddtrace/tracer"
"gopkg.in/DataDog/dd-trace-go.v1/internal/log"
)

const (
// SpanPointerHashLengthBytes is the length of the truncated hash: 16 bytes, i.e. 32 hex characters.
// See https://github.com/DataDog/dd-span-pointer-rules/blob/main/README.md#general-hashing-rules
SpanPointerHashLengthBytes = 16
PointerDownDirection = "d"
LinkKind = "span-pointer"
S3PointerKind = "aws.s3.object"
)

func HandleS3Operation(in middleware.DeserializeInput, out middleware.DeserializeOutput, span tracer.Span) {
req, ok := in.Request.(*smithyhttp.Request)
if !ok {
return
}
res, ok := out.RawResponse.(*smithyhttp.Response)
if !ok {
return
}

// URL format: https://BUCKETNAME.s3.REGION.amazonaws.com/KEYNAME?x-id=OPERATIONNAME
key := strings.TrimPrefix(req.URL.Path, "/")
bucket := strings.Split(req.URL.Host, ".")[0]
// the AWS SDK sometimes wraps the eTag in quotes
etag := strings.Trim(res.Header.Get("ETag"), "\"")
if key == "" || bucket == "" || etag == "" {
log.Debug("Unable to create S3 span pointer because the bucket, key, or ETag could not be found.")
return
}

// Hash calculation rules: https://github.com/DataDog/dd-span-pointer-rules/blob/main/AWS/S3/Object/README.md
components := []string{bucket, key, etag}
hash := generatePointerHash(components)

ctxWithLinks, ok := span.Context().(ddtrace.SpanContextWithLinks)
if !ok {
log.Debug("Span links could not be found. Unable to create S3 span pointer.")
return
}

links := ctxWithLinks.SpanLinks()
link := ddtrace.SpanLink{
// We leave trace_id, span_id, trace_id_high, tracestate, and flags as 0 or empty.
// The Datadog frontend will use `ptr.hash` to find the linked span.
Attributes: map[string]string{
"ptr.kind": S3PointerKind,
"ptr.dir": PointerDownDirection,
"ptr.hash": hash,
"link.kind": LinkKind,
},
}
links = append(links, link)
if spanLinksJsonBytes, err := json.Marshal(links); err == nil {
span.SetTag("_dd.span_links", string(spanLinksJsonBytes))
} else {
log.Debug("Span links could not be marshalled. Unable to create S3 span pointer.")
}
}

// generatePointerHash joins the components with "|", hashes the result with SHA-256,
// and returns the first SpanPointerHashLengthBytes bytes as a 32-character hex string.
// Used to identify AWS requests for span pointers.
func generatePointerHash(components []string) string {
h := sha256.New()
for i, component := range components {
if i > 0 {
h.Write([]byte("|"))
}
h.Write([]byte(component))
}

fullHash := h.Sum(nil)
return hex.EncodeToString(fullHash[:SpanPointerHashLengthBytes])
}
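
A minimal standalone sketch of the hashing rule described above, for checking values outside the tracer. `pointerHash` is a hypothetical helper that mirrors `generatePointerHash` using only the standard library; the literal `16` stands in for `SpanPointerHashLengthBytes`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// pointerHash is an illustrative copy of the rule: join the components with
// "|", take the SHA-256 digest, keep the first 16 bytes, and hex-encode them.
func pointerHash(components ...string) string {
	sum := sha256.Sum256([]byte(strings.Join(components, "|")))
	return hex.EncodeToString(sum[:16])
}

func main() {
	// Same inputs as the "basic values" case in the PR's tests.
	fmt.Println(pointerHash("some-bucket", "some-key.data", "ab12ef34"))
}
```

The "basic values" test case in this PR expects `e721375466d4116ab551213fdea08413` for these inputs.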
185 changes: 185 additions & 0 deletions contrib/aws/internal/span_pointers/span_pointers_test.go
@@ -0,0 +1,185 @@
package span_pointers

import (
"context"
"encoding/json"
"net/http"
"net/url"
"testing"

"github.com/aws/smithy-go/middleware"
smithyhttp "github.com/aws/smithy-go/transport/http"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"

"gopkg.in/DataDog/dd-trace-go.v1/ddtrace"
"gopkg.in/DataDog/dd-trace-go.v1/ddtrace/mocktracer"
"gopkg.in/DataDog/dd-trace-go.v1/ddtrace/tracer"
)

func TestGeneratePointerHash(t *testing.T) {
tests := []struct {
name string
components []string
expectedHash string
}{
{
name: "basic values",
components: []string{
"some-bucket",
"some-key.data",
"ab12ef34",
},
expectedHash: "e721375466d4116ab551213fdea08413",
},
{
name: "non-ascii key",
components: []string{
"some-bucket",
"some-key.你好",
"ab12ef34",
},
expectedHash: "d1333a04b9928ab462b5c6cadfa401f4",
},
{
name: "multipart-upload",
components: []string{
"some-bucket",
"some-key.data",
"ab12ef34-5",
},
expectedHash: "2b90dffc37ebc7bc610152c3dc72af9f",
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got := generatePointerHash(tt.components)
if got != tt.expectedHash {
t.Errorf("generatePointerHash() = %v, want %v", got, tt.expectedHash)
}
})
}
}

func TestHandleS3Operation(t *testing.T) {
mt := mocktracer.Start()
defer mt.Stop()

tests := []struct {
name string
bucket string
key string
etag string
expectedHash string
expectSuccess bool
}{
{
name: "basic operation",
bucket: "some-bucket",
key: "some-key.data",
etag: "ab12ef34",
expectedHash: "e721375466d4116ab551213fdea08413",
expectSuccess: true,
},
{
name: "quoted etag",
bucket: "some-bucket",
key: "some-key.data",
etag: "\"ab12ef34\"",
expectedHash: "e721375466d4116ab551213fdea08413",
expectSuccess: true,
},
{
name: "non-ascii key",
bucket: "some-bucket",
key: "some-key.你好",
etag: "ab12ef34",
expectedHash: "d1333a04b9928ab462b5c6cadfa401f4",
expectSuccess: true,
},
{
name: "empty bucket",
bucket: "",
key: "some_key",
etag: "some_etag",
expectSuccess: false,
},
{
name: "empty key",
bucket: "some_bucket",
key: "",
etag: "some_etag",
expectSuccess: false,
},
{
name: "empty etag",
bucket: "some_bucket",
key: "some_key",
etag: "",
expectSuccess: false,
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
ctx := context.Background()
span, ctx := tracer.StartSpanFromContext(ctx, "test.s3.operation")

// Create request
reqURL, _ := url.Parse("https://" + tt.bucket + ".s3.region.amazonaws.com/" + tt.key)
req := &smithyhttp.Request{
Request: &http.Request{
URL: reqURL,
},
}

// Create response
header := http.Header{}
header.Set("ETag", tt.etag)
res := &smithyhttp.Response{
Response: &http.Response{
Header: header,
},
}

// Create input/output
in := middleware.DeserializeInput{
Request: req,
}
out := middleware.DeserializeOutput{
RawResponse: res,
}

HandleS3Operation(in, out, span)
span.Finish()
spans := mt.FinishedSpans()
if tt.expectSuccess {
require.Len(t, spans, 1)
tags := spans[0].Tags()

spanLinks, exists := tags["_dd.span_links"]
assert.True(t, exists, "Expected span links to be set")
assert.NotEmpty(t, spanLinks, "Expected span links to not be empty")

spanLinksStr, ok := spanLinks.(string)
assert.True(t, ok, "Expected span links to be a string")

var links []ddtrace.SpanLink
err := json.Unmarshal([]byte(spanLinksStr), &links)
require.NoError(t, err)
require.Len(t, links, 1)

attributes := links[0].Attributes
assert.Equal(t, S3PointerKind, attributes["ptr.kind"])
assert.Equal(t, PointerDownDirection, attributes["ptr.dir"])
assert.Equal(t, LinkKind, attributes["link.kind"])
assert.Equal(t, tt.expectedHash, attributes["ptr.hash"])
} else {
require.Len(t, spans, 1)
tags := spans[0].Tags()
_, exists := tags["_dd.span_links"]
assert.False(t, exists, "Expected no span links to be set")
}
mt.Reset()
})
}
}
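
A minimal standalone sketch of the extraction rule these test cases exercise, assuming virtual-hosted-style URLs as in the comment in `HandleS3Operation`. `s3Parts` is a hypothetical helper, not part of the PR.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// s3Parts applies the same string rules as the PR: bucket is the first host
// label, key is the path without its leading slash, and the ETag has any
// surrounding double quotes removed. ok is false if any part is empty.
func s3Parts(rawURL, etagHeader string) (bucket, key, etag string, ok bool) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", "", "", false
	}
	bucket = strings.Split(u.Host, ".")[0]
	key = strings.TrimPrefix(u.Path, "/")
	etag = strings.Trim(etagHeader, "\"")
	ok = bucket != "" && key != "" && etag != ""
	return bucket, key, etag, ok
}

func main() {
	bucket, key, etag, ok := s3Parts(
		"https://some-bucket.s3.region.amazonaws.com/some-key.data?x-id=PutObject",
		"\"ab12ef34\"",
	)
	fmt.Println(bucket, key, etag, ok)
}
```

Under this rule a path-style URL such as `https://s3.region.amazonaws.com/some-bucket/some-key.data` would yield `s3` as the bucket, so the sketch only holds for virtual-hosted-style requests.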
9 changes: 9 additions & 0 deletions ddtrace/mocktracer/mockspancontext.go
@@ -23,12 +23,21 @@ type spanContext struct {
spanID uint64
traceID uint64
span *mockspan // context owner

+spanLinks []ddtrace.SpanLink
}

func (sc *spanContext) TraceID() uint64 { return sc.traceID }

func (sc *spanContext) SpanID() uint64 { return sc.spanID }

+// SpanLinks implements ddtrace.SpanContextWithLinks.
+func (sc *spanContext) SpanLinks() []ddtrace.SpanLink {
+cp := make([]ddtrace.SpanLink, len(sc.spanLinks))
+copy(cp, sc.spanLinks)
+return cp
+}

func (sc *spanContext) ForeachBaggageItem(handler func(k, v string) bool) {
sc.RLock()
defer sc.RUnlock()
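
A small sketch of why `SpanLinks` in the hunk above returns a copy rather than the underlying slice. The `link` and `spanContext` types here are simplified, hypothetical stand-ins for `ddtrace.SpanLink` and the mock span context.

```go
package main

import "fmt"

// link is a hypothetical stand-in for ddtrace.SpanLink.
type link struct {
	Attributes map[string]string
}

// spanContext is a simplified stand-in for the mock span context.
type spanContext struct {
	links []link
}

// Links returns a copy of the slice, so callers can append to or overwrite
// elements of the result without changing the owner's slice.
func (sc *spanContext) Links() []link {
	cp := make([]link, len(sc.links))
	copy(cp, sc.links)
	return cp
}

func main() {
	sc := &spanContext{links: []link{
		{Attributes: map[string]string{"link.kind": "span-pointer"}},
	}}

	got := sc.Links()
	got = append(got, link{}) // grows only the caller's copy
	got[0] = link{}           // overwrites only the caller's copy

	// prints "1 span-pointer"
	fmt.Println(len(sc.Links()), sc.Links()[0].Attributes["link.kind"])
}
```

The copy is shallow: each `Attributes` map is still shared, so mutating a map through the returned slice would be visible to the owner.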