Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(util-dynamodb): marshall to convert JavaScript object into DynamoDB Record #1531

Merged
merged 35 commits into from
Sep 28, 2020

Conversation

trivikr
Copy link
Member

@trivikr trivikr commented Sep 18, 2020

Issue #, if available:
Refs: #1223

Description of changes:
Adds utility function marshall to convert JavaScript object into DynamoDB Record

const { DynamoDB } = require("@aws-sdk/client-dynamodb");
const { marshall } = require("@aws-sdk/util-dynamodb");

const client = new DynamoDB(clientParams);
const params = {
  TableName: "Table",
  Item: marshall({
    HashKey: "hashKey",
    NumAttribute: 1,
    BoolAttribute: true,
    ListAttribute: [1, "two", false],
    MapAttribute: { foo: "bar" },
    NullAttribute: null,
  }),
};

await client.putItem(params);

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@codecov-commenter
Copy link

codecov-commenter commented Sep 18, 2020

Codecov Report

Merging #1531 into master will increase coverage by 0.03%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1531      +/-   ##
==========================================
+ Coverage   79.80%   79.84%   +0.03%     
==========================================
  Files         298      304       +6     
  Lines       11502    11753     +251     
  Branches     2475     2511      +36     
==========================================
+ Hits         9179     9384     +205     
- Misses       2323     2369      +46     
Impacted Files Coverage Δ
...l_tests/aws-json/commands/EmptyOperationCommand.ts 90.00% <0.00%> (-10.00%) ⬇️
...ts/aws-query/commands/NoInputAndNoOutputCommand.ts 90.00% <0.00%> (-10.00%) ⬇️
.../aws-restxml/commands/NoInputAndNoOutputCommand.ts 90.00% <0.00%> (-10.00%) ⬇️
...aws-restjson/commands/NoInputAndNoOutputCommand.ts 90.00% <0.00%> (-10.00%) ⬇️
protocol_tests/aws-ec2/commands/XmlBlobsCommand.ts 95.00% <0.00%> (-5.00%) ⬇️
protocol_tests/aws-ec2/commands/XmlEnumsCommand.ts 95.00% <0.00%> (-5.00%) ⬇️
protocol_tests/aws-ec2/commands/XmlListsCommand.ts 95.00% <0.00%> (-5.00%) ⬇️
...rotocol_tests/aws-query/commands/XmlMapsCommand.ts 95.00% <0.00%> (-5.00%) ⬇️
...otocol_tests/aws-query/commands/XmlBlobsCommand.ts 95.00% <0.00%> (-5.00%) ⬇️
...otocol_tests/aws-query/commands/XmlEnumsCommand.ts 95.00% <0.00%> (-5.00%) ⬇️
... and 130 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 17620d6...015fd9b. Read the comment docs.

@russell-dot-js
Copy link
Contributor

russell-dot-js commented Sep 18, 2020

It would be really nice to expose some way to expose custom marshalling/unmarshalling config. Even if it's just a function for custom marshalling. One of the big drawbacks of the current DocumentClient implementations is that JS Dates serialize to maps, and the data gets completely hosed (saves as {} IIRC). What if we had something like this?

premarshall: ({key: string, value: any}) => any;

new DocumentClient({
  premarshall: ({key, value}) => {
    if(value instanceof Date) {
      return value.toIsoString(); // or convert to number depending on use case
    }
    if(value instanceof moment) {
      return value.format();
    }

    if(key === 'someKeyIDontWantInDynamo') {
      return undefined;
    }
    return value;
  }
})

Then the library would call premarshall prior to converting to AttributeValues

IMO it's more important to have this on the way in to dynamo on the way out, because there's always some casting from DB values to runtime values anyway (e.g. converting iso string back to date, stripping database concerns, etc). But some way to inject some casting logic that could be applied to Item, Attributes, Items, etc based on endpoint is a secondary item on my wishlist.

Obviously this is something that consumers can do themselves before calling UpdateItem / PutItem, but it creates additional complexity if, for example, every consumer has to wrap the documentClient with their own functionality to marshall. Or, even worse, have one-off code like such:

documentClient.putItem({
  Key,
  Item: {
     ...value,
     createdAt: value.createdAt.toISOString(),
   }
})

Code like the above becomes highly error prone, because even with typescript, adding a new Date value to typeof T will not break any assumptions and consumers have to jump through additional hoops to keep their DB logic in line with their types, when in reality it should "just work" (without having to cross the line into a full blown ORM).

What are your thoughts?

@trivikr
Copy link
Member Author

trivikr commented Sep 18, 2020

@russell-dot-js Custom premarshaller is a good idea. Can you create a feature-request for it?
Link: https://github.com/aws/aws-sdk-js-v3/issues/new?assignees=&labels=feature-request&template=---feature-request.md&title=

Since v3 is is prerelease, we can still modify the marshaller (convertToAttr).
Is Date the only type you have concern about? Or are there generic JavaScript types which should be included in default implementation?

@russell-dot-js
Copy link
Contributor

@russell-dot-js Custom premarshaller is a good idea. Can you create a feature-request for it?
Link: https://github.com/aws/aws-sdk-js-v3/issues/new?assignees=&labels=feature-request&template=---feature-request.md&title=

Since v3 is is prerelease, we can still modify the marshaller (convertToAttr).
Is Date the only type you have concern about? Or are there generic JavaScript types which should be included in default implementation?

@trivikr I'll take the discussion to the issue, and I'm happy to implement the feature, but should there be a default implementation? I wouldn't want to break consumers who pass dates blindly and expect a number or a string... then again, right now, dates don't work at all. Maybe even logging a warning when an unmarshalled date makes it to convertToAttr would be good enough...

My concern about default implementation is that everyone has a different way of using dynamo. Some people may cram 4 attributes into a sort key, then convert back to whatever those fields were to begin with on the way out. So it might be tough for a default implementation to make an assumption on how other native JS types should be handled - but we can at least identify when we receive those and don't know how to marshall them.

Copy link
Contributor

@russell-dot-js russell-dot-js left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall LGTM, but may be regressions from current DocumentClient Converter's handling of objects

other misc. comments added

packages/util-dynamodb/src/convertToAttr.spec.ts Outdated Show resolved Hide resolved
packages/util-dynamodb/src/convertToAttr.ts Outdated Show resolved Hide resolved
return { L: data.map((item) => convertToAttr(item)) };
} else if (data?.constructor?.name === "Set") {
return convertToSetAttr(data as Set<any>, options?.convertEmptyValues);
} else if (data?.constructor?.name === "Object") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this code borrowed from the current (v2) DocumentClient? or is this a new implementation?

if I'm reading correctly, this will only serialize pure JS Objects to JSON.
Anyone who's using ES6 classes or other easily serializable objects would be hosed without custom premarshall?

Again, borrowing the rules from JSON serialization would be $

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this code borrowed from the current (v2) DocumentClient? or is this a new implementation?

This PR is new implementation in TypeScript based on the behavior of v2 DocumentClient.

Anyone who's using ES6 classes or other easily serializable objects would be hosed without custom premarshall?

Can you provide an example?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure
Here is the source, btw:
https://github.com/aws/aws-sdk-js/blob/5155fde557ecc1d166bb5040823289254bea1641/lib/dynamodb/converter.js#L27

The old DocumentClient defaulted to treating objects as map. Thus

class FooObj {
  constructor() {
    this.foo = 'foo';
  }
}

JSON.stringify(new FooObj())
"{"foo":"foo"}"

// new implementation needs constructor name to be Object or it errors
(new FooObj()).constructor.name
"FooObj"

// old implementation checked typeof
typeof new FooObj()
"object"

See https://github.com/aws/aws-sdk-js/blob/cc29728c1c4178969ebabe3bbe6b6f3159436394/lib/dynamodb/types.js#L3

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a miss.

However, using JSON.stringify() for serializing and JSON.parse() for deserializing is a norm in JavaScript world.

I created an issue to discuss this internally, and will post updates in #1535

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code doesn't check the prototype chain. Maybe if ( data instanceof Object){} works better here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not using data instanceof Object, as it applies to binary data as shown in code below:

const data = new ArrayBuffer();
data instanceof Object;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would checking in order of set, list, map, scalars help?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would checking in order of set, list, map, scalars help?

The check for binary happens under scalars, so map check should be moved after it.
Currently, check for scalars is in an else block, so check for map will have to be moved inside scalars.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you share an example where data?.constructor?.name === "Object" will fail for JS for maps because of prototype chain?

@trivikr
Copy link
Member Author

trivikr commented Sep 18, 2020

should there be a default implementation for Date?

The recommended way in DynamoDB is to store dates is ISO 8601 strings.
Documentation: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.NamingRulesDataTypes.html#HowItWorks.DataTypes.String

I'll store date by converting it to ISO 8601 string using toISOString() in convertToAttr function
MDN: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/toISOString

@russell-dot-js
Copy link
Contributor

russell-dot-js commented Sep 18, 2020

should there be a default implementation for Date?

The recommended way in DynamoDB is to store dates is ISO 8601 strings.
Documentation: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.NamingRulesDataTypes.html#HowItWorks.DataTypes.String

I'll store date by converting it to ISO 8601 string using toISOString() in convertToAttr function
MDN: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date/toISOString

Reading those docs, it doesn't say the recommended way is ISO 8601. It says you "can use ISO 8601". Immediately before that, in the numbers docs, it says you can use numbers also to represent date. It does not make a recommendation.

We use ISO 8601, but I don't think the default sdk should make assumptions on what you want to do. In other languages (e.g. Go, .NET, Java), the dynamo client uses JSON serialization by default. This is why I am in favor of a pre marshall config rather than baking assumptions into DocumentClient.

@trivikr
Copy link
Member Author

trivikr commented Sep 18, 2020

I created a feature request for supporting date #1534

This PR will continue throwing error for date - as it fixes the bug with v2 DynamoDB client which converts Date to an empty map as follows:

Code
const AWS = require("aws-sdk");

(async () => {
  const region = "us-west-2";

  const TableName = `test-doc-client`;
  const id = "id1";
  const date = new Date();

  const client = new AWS.DynamoDB.DocumentClient({ region });
  await client.put({ TableName, Item: { id, date } }).promise();
  console.log(await client.get({ TableName, Key: { id } }).promise());
})();
Output
{ Item: { id: 'id1', date: {} } }

packages/util-dynamodb/src/convertToAttr.ts Outdated Show resolved Hide resolved
return { L: data.map((item) => convertToAttr(item)) };
} else if (data?.constructor?.name === "Set") {
return convertToSetAttr(data as Set<any>, options?.convertEmptyValues);
} else if (data?.constructor?.name === "Object") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code doesn't check the prototype chain. Maybe if ( data instanceof Object){} works better here.

@trivikr
Copy link
Member Author

trivikr commented Sep 23, 2020

The build failure is related to intermittent test failure of NodeHttpHandler
Created issue at #1539

@trivikr trivikr changed the title feat(util-dynamodb): convertToAttr from JavaScript value to DynamoDB AttributeValue type feat(util-dynamodb): marshall to convert JavaScript object into DynamoDB Record Sep 23, 2020
Copy link
Contributor

@AllanZhengYP AllanZhengYP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a blocker, but having cross-platform test cases here is necessary. It will validate this works for Buffer as well as Blob and File

packages/util-dynamodb/package.json Outdated Show resolved Hide resolved
@trivikr
Copy link
Member Author

trivikr commented Sep 25, 2020

It's not a blocker, but having cross-platform test cases here is necessary. It will validate this works for Buffer as well as Blob and File

Created an issue for follow-up #1542

Copy link
Contributor

@AllanZhengYP AllanZhengYP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thanks for putting this together!

@trivikr trivikr merged commit 04b219a into aws:master Sep 28, 2020
@trivikr trivikr deleted the util-dynamodb-dev branch September 28, 2020 18:08
@github-actions
Copy link

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 16, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants