Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[serverless] Add DynamoDB Span Pointers #4912

Merged
merged 35 commits into from
Dec 18, 2024
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
3f4ca97
Add span pointer support for updateItem and deleteItem
nhulston Nov 19, 2024
e1848ba
putItem support
nhulston Nov 19, 2024
6dc50ab
transactWriteItem support
nhulston Nov 19, 2024
81d17db
batchWriteItem support
nhulston Nov 19, 2024
90bfefa
Add unit+integration tests (very large commit)
nhulston Nov 19, 2024
535544a
Move `DD_AWS_SDK_DYNAMODB_TABLE_PRIMARY_KEYS` parsing logic to config.js
nhulston Dec 6, 2024
f5649ec
Code refactoring
nhulston Dec 6, 2024
035f9fb
Move util functions to packages/datadog-plugin-aws-sdk/
nhulston Dec 9, 2024
b733782
Merge branch 'master' into nicholas.hulston/dynamodb-span-pointers
nhulston Dec 9, 2024
e62ba37
lint
nhulston Dec 9, 2024
0d33cf2
log when encountering errors in `encodeValue`; fix test
nhulston Dec 9, 2024
af6ada3
Send config env var as string to telemetry; handle parsing logic in d…
nhulston Dec 9, 2024
2e910b7
Update config_norm_rules.json
nhulston Dec 9, 2024
1ce911d
fix test
nhulston Dec 9, 2024
94a2c72
Add unit tests for DynamoDB generatePointerHash
nhulston Dec 10, 2024
2e8ae20
Merge branch 'master' into nicholas.hulston/dynamodb-span-pointers
nhulston Dec 10, 2024
eb2b5ed
better logging + checks
nhulston Dec 10, 2024
8769785
Add span pointer support for updateItem and deleteItem
nhulston Nov 19, 2024
b565475
putItem support
nhulston Nov 19, 2024
b0d1c31
transactWriteItem support
nhulston Nov 19, 2024
de8a3f6
batchWriteItem support
nhulston Nov 19, 2024
95ebee6
Add unit+integration tests (very large commit)
nhulston Nov 19, 2024
268920c
Move `DD_AWS_SDK_DYNAMODB_TABLE_PRIMARY_KEYS` parsing logic to config.js
nhulston Dec 6, 2024
064367a
Code refactoring
nhulston Dec 6, 2024
529e108
Move util functions to packages/datadog-plugin-aws-sdk/
nhulston Dec 9, 2024
b1f0cf3
lint
nhulston Dec 9, 2024
93e34ef
log when encountering errors in `encodeValue`; fix test
nhulston Dec 9, 2024
fcaf519
Send config env var as string to telemetry; handle parsing logic in d…
nhulston Dec 9, 2024
035afd4
Update config_norm_rules.json
nhulston Dec 9, 2024
fe7e662
fix test
nhulston Dec 9, 2024
d16a68d
Add unit tests for DynamoDB generatePointerHash
nhulston Dec 10, 2024
707b9c0
better logging + checks
nhulston Dec 10, 2024
cc6b9d7
Merge remote-tracking branch 'origin/nicholas.hulston/dynamodb-span-p…
nhulston Dec 17, 2024
953ac98
Refactor `calculateHashWithKnownKeys` and `extractPrimaryKeys` for si…
nhulston Dec 17, 2024
bcc78fa
Merge branch 'master' into nicholas.hulston/dynamodb-span-pointers
nhulston Dec 18, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
152 changes: 152 additions & 0 deletions packages/datadog-plugin-aws-sdk/src/services/dynamodb.js
Original file line number Diff line number Diff line change
@@ -1,6 +1,9 @@
'use strict'

const BaseAwsSdkPlugin = require('../base')
const log = require('../../../dd-trace/src/log')
const { DYNAMODB_PTR_KIND, SPAN_POINTER_DIRECTION } = require('../../../dd-trace/src/constants')
const { extractPrimaryKeys, generatePointerHash } = require('../../../dd-trace/src/util')

class DynamoDb extends BaseAwsSdkPlugin {
static get id () { return 'dynamodb' }
Expand Down Expand Up @@ -48,6 +51,155 @@ class DynamoDb extends BaseAwsSdkPlugin {

return tags
}

addSpanPointers (span, response) {
const request = response?.request
const operationName = request?.operation
nhulston marked this conversation as resolved.
Show resolved Hide resolved

const hashes = []
switch (operationName) {
case 'putItem': {
const hash = DynamoDb.calculatePutItemHash(
request?.params?.TableName,
request?.params?.Item,
DynamoDb.getPrimaryKeyConfig()
)
if (hash) hashes.push(hash)
break
}
case 'updateItem':
case 'deleteItem': {
const hash = DynamoDb.calculateHashWithKnownKeys(request?.params?.TableName, request?.params?.Key)
if (hash) hashes.push(hash)
break
}
case 'transactWriteItems': {
const transactItems = request?.params?.TransactItems || []
for (const item of transactItems) {
if (item.Put) {
const hash =
DynamoDb.calculatePutItemHash(item.Put.TableName, item.Put.Item, DynamoDb.getPrimaryKeyConfig())
if (hash) hashes.push(hash)
} else if (item.Update || item.Delete) {
const operation = item.Update ? item.Update : item.Delete
const hash = DynamoDb.calculateHashWithKnownKeys(operation.TableName, operation.Key)
if (hash) hashes.push(hash)
}
}
break
}
case 'batchWriteItem': {
const requestItems = request?.params.RequestItems || {}
for (const [tableName, operations] of Object.entries(requestItems)) {
if (!Array.isArray(operations)) continue
for (const operation of operations) {
if (operation?.PutRequest) {
const hash =
DynamoDb.calculatePutItemHash(tableName, operation.PutRequest.Item, DynamoDb.getPrimaryKeyConfig())
if (hash) hashes.push(hash)
} else if (operation?.DeleteRequest) {
const hash = DynamoDb.calculateHashWithKnownKeys(tableName, operation.DeleteRequest.Key)
if (hash) hashes.push(hash)
}
}
}
break
}
}

for (const hash of hashes) {
span.addSpanPointer(DYNAMODB_PTR_KIND, SPAN_POINTER_DIRECTION.DOWNSTREAM, hash)
}
}

/**
* Loads primary key config from the `DD_AWS_SDK_DYNAMODB_TABLE_PRIMARY_KEYS` env var.
* Only runs when needed, and warns when missing or invalid config.
* @returns {Object|null} Parsed config from env var or null if empty/missing/invalid config.
*/
static getPrimaryKeyConfig () {
const config = DynamoDb.dynamoPrimaryKeyConfig || {}
// Return cached config if it exists
if (Object.keys(config).length > 0) {
return config
}

const primaryKeysEnvVar = process.env.DD_AWS_SDK_DYNAMODB_TABLE_PRIMARY_KEYS
nhulston marked this conversation as resolved.
Show resolved Hide resolved
if (!primaryKeysEnvVar) {
log.warn('Missing DD_AWS_SDK_DYNAMODB_TABLE_PRIMARY_KEYS env variable')
return null
}

try {
const parsedConfig = JSON.parse(primaryKeysEnvVar)
nhulston marked this conversation as resolved.
Show resolved Hide resolved
for (const [tableName, primaryKeys] of Object.entries(parsedConfig)) {
if (Array.isArray(primaryKeys) && primaryKeys.length > 0) {
config[tableName] = new Set(primaryKeys)
} else {
log.warn(`Invalid primary key configuration for table: ${tableName}`)
}
}

DynamoDb.dynamoPrimaryKeyConfig = config
return config
} catch (err) {
log.warn('Failed to parse DD_AWS_SDK_DYNAMODB_TABLE_PRIMARY_KEYS:', err)
return null
}
}

/**
* Calculates a hash for DynamoDB PutItem operations using table's configured primary keys.
* @param {string} tableName - Name of the DynamoDB table.
* @param {Object} item - Complete PutItem item parameter to be put.
* @param {Object.<string, Set<string>>} primaryKeyConfig - Mapping of table names to Sets of primary key names
* loaded from DD_AWS_SDK_DYNAMODB_TABLE_PRIMARY_KEYS.
* @returns {string|undefined} Hash combining table name and primary key/value pairs, or undefined if unable.
*
* @example
* // With env var DD_AWS_SDK_DYNAMODB_TABLE_PRIMARY_KEYS='{"UserTable":["userId","timestamp"]}'
* calculatePutItemHash(
* 'UserTable',
* { userId: { S: "user123" }, timestamp: { N: "1234567" }, name: { S: "John" } },
* { UserTable: new Set(['userId', 'timestamp']) }
* )
*/
static calculatePutItemHash (tableName, item, primaryKeyConfig) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UpdateItem and DeleteItem require the primary key(s) as parameters, which makes our lives much easier. However, for PutItem, we have no way of determining which keys are the primary keys, so the user must set an environment variable specifying the primary key(s).

if (!tableName || !item || !primaryKeyConfig) {
log.debug('Unable to calculate hash because missing required parameters')
return
}
const primaryKeySet = primaryKeyConfig[tableName]
if (!primaryKeySet || !(primaryKeySet instanceof Set) || primaryKeySet.size === 0 || primaryKeySet.size > 2) {
log.warn(`Invalid dynamo primary key config for table ${tableName}: ${JSON.stringify(primaryKeyConfig)}`)
nhulston marked this conversation as resolved.
Show resolved Hide resolved
nhulston marked this conversation as resolved.
Show resolved Hide resolved
return
}
const keyValues = extractPrimaryKeys(primaryKeySet, item)
if (keyValues) {
return generatePointerHash([tableName, ...keyValues])
}
}

/**
* Calculates a hash for DynamoDB operations that have keys provided (UpdateItem, DeleteItem).
* @param {string} tableName - Name of the DynamoDB table.
* @param {Object} keys - Object containing primary key/value attributes in DynamoDB format.
* (e.g., { userId: { S: "123" }, sortKey: { N: "456" } })
* @returns {string|undefined} Hash value combining table name and primary key/value pairs, or undefined if unable.
*
* @example
* calculateKeyBasedOperationsHash('UserTable', { userId: { S: "user123" }, timestamp: { N: "1234567" } })
*/
static calculateHashWithKnownKeys (tableName, keys) {
if (!tableName || !keys) {
log.debug('Unable to calculate hash because missing parameters')
return
}
const keyValues = extractPrimaryKeys(keys, keys)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we're treating the keys as a set and as a data object at the same time? this is a little bit surprising to read. maybe we create a helper for this sort of situation to make it less surprising?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this is a little confusing. We could do something like this:

const keyNamesSet = new Set(Object.keys(keysObject));
const keyValues = extractPrimaryKeys(keyNamesSet, keys)

which is definitely more readable. However, we're sacrificing a little bit of memory and runtime to do so, so I'm not sure this change is worth it. WDYT?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i generally lean towards readability over raw performance, but perhaps the tracer team has a different approach here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could also bury that logic (and not even do the set creation) under a new function name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do feel like performance is more important than readability, since this will be ran on every Lambda execution that updates S3

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would agree with readability in this case, it is a little confusing. I think we're missing an additional helper function prior to calling extractPrimaryKeys(keys, keys). Specifically, a function that prepares the key set to be passed into extractPrimaryKeys—rather than using the same object for both parameters WDYT?

if (keyValues) {
return generatePointerHash([tableName, ...keyValues])
}
}
}

module.exports = DynamoDb
Loading
Loading