Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add support for lambda data client #2147

Draft
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

stocaaro
Copy link
Member

Problem

Users would like to use the Gen2 data client interface to access their APIs from application functions (lambda's).

Our docs offer instructions on this experience, which stop short of indicating how the customer should provide the model introspection schema needed to make the whole experience work.

This gap is because the model introspection schema hasn't been available at the time the lambda's get built, which forces customers into suboptimal work-arounds.

This change introduces a new target experience where customer handlers can be configured as follow.

Example: amplify/functions/todo-count/handler.ts

import type { Handler } from "aws-lambda";
import { Amplify } from "aws-amplify";
import { generateClient } from "aws-amplify/data";
import type { Schema } from "../../data/resource";
import { resourceConfig, libraryOptions } from "$amplify/client-config/todo-count"

Amplify.configure(resourceConfig, libraryOptions);

const client = generateClient<Schema>();

export const handler: Handler = async () => {
  const todos = (await client.models.Todo.list()).data;
  return todos.length;
};

Changes

  1. Make the data factory a provider by adding provides = 'DataResources' so it can be consumed by the function factory
  2. Build the model introspection schema (MIS) into the data container entry by using the generateModelsSync behavior
  3. Consume the MIS in the functions factory and build generated model-introspection-schema and client-config files for each function that consumes them in its handler entry file.

Corresponding docs PR, if applicable:

  1. TODO

Validation

  1. Manually validated using a test-project
  2. TODO Added complete unit testing
  3. TODO Added integration testing

Checklist

  • If this PR includes a functional change to the runtime behavior of the code, I have added or updated automated test coverage for this change.
  • If this PR requires a change to the Project Architecture README, I have included that update in this PR.
  • If this PR requires a docs update, I have linked to that docs PR above.
  • If this PR modifies E2E tests, makes changes to resource provisioning, or makes SDK calls, I have run the PR checks with the run-e2e label set.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Copy link

changeset-bot bot commented Oct 24, 2024

🦋 Changeset detected

Latest commit: 1b981eb

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
Name Type
@aws-amplify/backend Patch
@aws-amplify/backend-function Patch
@aws-amplify/backend-data Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

Copy link
Member Author

@stocaaro stocaaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open questions:

  • Is there a way to share code so I don't have to repeat the client-code over and over again?
  • Any concerns about pulling the file into memory?
  • Do we support multi-file handlers? If so, could a non-entry file be the import point for client-config?
  • I am considering adding a defineFunction param for force mis / client-config, to ensure the mutli-file case has a way to just work without using these imports at the top level. Thoughts?

Comment on lines 113 to 117
usedBy(scriptContent: string) {
// Matches an import of client-config
const clientConfigImportRegex = /import.*from.*client-config/g;
return clientConfigImportRegex.test(scriptContent);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've talked about only generating these files if the customer is using them. This does that, but leaves me with concerns.

  1. What happens if the customer has multiple files where the handler doesn't import the file we're looking for?
  2. What happens if the customer us using a different code consumption pattern such as require?

I considered doing something more complicated like loading the script AST, but settled on starting with this and following up with a discussion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And they can also use good old require :( .

I left comment somewhere else to use access grants for this purpose.

If we keep this approach one option is to run bundling twice and inspect the bundled file. then regenerate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talking with @josefaidt yesterday, the impact of generating these files for every function is minimal. I will experiment with resource permission based filtering this morning. This is discussed in a couple places and I'll use this thread for followups on this topic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the impact of generating these files for every function is minimal

performance wise perhaps yes. But it creates unwanted link in resource graph making the "cycling dependency problem" more likely to happen. This is not "minimal".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have context on the cycling dep problem. It may be related to the error we're chasing right now. I expect to followup on this in another update.

I think the goal is to use the resource auth rules to limit which functions we build files for, however these permissions are causing the current problem, so we need to overcome that before I can test any solution that would constrain based on these auth rules.

packages/backend-function/src/function_mis_generator.ts Outdated Show resolved Hide resolved
@@ -627,7 +627,7 @@ export type CustomClientConfig = {
export const DEFAULT_CLIENT_CONFIG_VERSION: ClientConfigVersion;

// @public
export const generateClientConfig: <T extends "1" | "1.1" | "1.2" | "1.3" | "0">(backendIdentifier: DeployedBackendIdentifier, version: T, awsClientProvider?: AWSClientProvider<{
export const generateClientConfig: <T extends "1.3" | "1.2" | "1.1" | "1" | "0">(backendIdentifier: DeployedBackendIdentifier, version: T, awsClientProvider?: AWSClientProvider<{
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what happened here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this happens sometimes, clean-build usually solves it.
or just remove this from diff please. Whatever makes GH actions happy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generated the update because the commit hook wouldn't let me push without rebuilding the API docs. I'll try a clean build locally and see if anything shakes loose.

packages/backend-function/src/factory.ts Outdated Show resolved Hide resolved
packages/backend-function/src/factory.ts Outdated Show resolved Hide resolved
packages/backend-function/src/factory.ts Outdated Show resolved Hide resolved
packages/backend-function/src/function_mis_generator.ts Outdated Show resolved Hide resolved
@josefaidt
Copy link
Contributor

Do we support multi-file handlers? If so, could a non-entry file be the import point for client-config?

yes, and non-entrypoint files will be bundled. Are we guarding imports for the env object today?

I am considering adding a defineFunction param for force mis / client-config, to ensure the mutli-file case has a way to just work without using these imports at the top level. Thoughts?

imports should work fine without the need for an additional prop across handler files

We've talked about only generating these files if the customer is using them.

I think it's okay to always generate if the function is granted access to the data schema or a specific model (TBD). It adds to the footprint of gitignored files on the local filesystem, but if it is not imported it will not needlessly bloat the bundle

package.json Outdated Show resolved Hide resolved
packages/backend-data/src/factory.ts Outdated Show resolved Hide resolved
packages/backend-function/src/factory.ts Outdated Show resolved Hide resolved
packages/backend-function/src/factory.ts Outdated Show resolved Hide resolved
Comment on lines 188 to 189
// TODO this should be conditional on some signal,
// for example that any access to data was granted for this function.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be solved eagerly.

Otherwise our "circular dependency problem" between auth-data-functions is going to be amplified by this. (We've been focusing on this problem since original poc of MIS in functions). @rtpascual please correct me if I'm wrong here.

We need some mechanism (implicit is best) to detect which function really needs MIS at runtime.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't experienced this issue yet (unless this is the auth issue we're troubleshooting right now), do you have instructions/examples of what is happening in these cases?

Comment on lines 113 to 117
usedBy(scriptContent: string) {
// Matches an import of client-config
const clientConfigImportRegex = /import.*from.*client-config/g;
return clientConfigImportRegex.test(scriptContent);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And they can also use good old require :( .

I left comment somewhere else to use access grants for this purpose.

If we keep this approach one option is to run bundling twice and inspect the bundled file. then regenerate.

packages/backend-function/src/function_mis_generator.ts Outdated Show resolved Hide resolved
packages/backend-function/src/factory.ts Outdated Show resolved Hide resolved
Comment on lines 165 to 167
type DataResources = ResourceProvider<Record<string, unknown>> & {
modelIntrospectionSchema?: string;
};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm testing this against a modified version of amplify-category-api. Its unclear to me whether the MIS should be passed through at the root object level or within .resources. Open to both, I left it at the top level to avoid additional plumbing, but I'm not sure what is best practice in this object.

Comment on lines 113 to 117
usedBy(scriptContent: string) {
// Matches an import of client-config
const clientConfigImportRegex = /import.*from.*client-config/g;
return clientConfigImportRegex.test(scriptContent);
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have context on the cycling dep problem. It may be related to the error we're chasing right now. I expect to followup on this in another update.

I think the goal is to use the resource auth rules to limit which functions we build files for, however these permissions are causing the current problem, so we need to overcome that before I can test any solution that would constrain based on these auth rules.

Comment on lines 7 to 10
) => `// This file is auto-generated by Amplify. Edits will be overwritten.
import { env } from "../env/${functionName}";

export const libraryOptions = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seem to have lost a comment from @sobolk suggesting this content be pushed down into a file to be imported.

Which package would this file be in? I had been advocating for amplify-js, however, that doesn't seem like its going to be possible.

Since this is essentially customer code, could this behavior be usable/consumable from the backend package? It seems safe to assume this will be present during the build.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not have to be "imported" in literal sense. See

/**
* This code concatenates the contents of the ssm resolver and invoker into a single line that can be used as the esbuild banner content
* This banner is responsible for resolving the customer's SSM parameters at runtime
*/
const bannerCode = readFileSync(ssmResolverFile, 'utf-8')
.concat(readFileSync(invokeSsmResolverFile, 'utf-8'))
.split(new RegExp(`${EOL}|\n|\r`, 'g'))
.map((line) => line.replace(/\/\/.*$/, '')) // strip out inline comments because the banner is going to be flattened into a single line
.join('');
.

The point is to get it under compiler and be able to write unit tests for it.

Copy link
Member Author

@stocaaro stocaaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have posted updates and resolved a bunch of conversations. The current active threads I'm tracking/following up on are:

  • Resource authorization causes build errors (maybe related to the import cycles problem between auth/data)
  • Confirm the resource interface and ensure we get the latest version from @aws-amplify/data-construct
  • We should not build client config / mis for every function as this will cause more interactions resulting in import cycles
  • Should we relocated the lambdaConfigTemplate code to somewhere we can import from?

packages/backend/src/internal/index.ts Outdated Show resolved Hide resolved
@@ -203,6 +213,7 @@ class FunctionFactory implements ConstructFactory<AmplifyFunction> {
schedule: this.resolveSchedule(),
bundling: this.resolveBundling(),
layers,
generateDataClientConfig: this.props.generateDataClientConfig ?? false,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this because the function factory can't communicate with the data factory, but it looks like there is some crosstalk between them at build using the SSM variables.

TODO: Revisit this and see if we can remove this param by using the presence of the SSM vars to trigger the file to be built.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by using the presence of the SSM vars to trigger the file to be built.

I'd rather always embed snippet from "data client config generator" and just make it no-op if env vars in function runtime are not populated.

This optimization should be deferred (seems premature) until:

  1. Generated snippet is large enough to cause visible impact on cold start.
  2. OR we pivot to "solution 2" where wiring between data and func factory becomes a must.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. I have 2 concerns with always generating the files.

  1. I think it might cause a 1MB build size diff, even if unused. I will test this today.
  2. If this is built conditionally, when it is build and permissions are missing, type feedback tells the developer things aren't working, where if we always built it with permissive types feedback for misconfiguration will be at runtime and the error message shows up in the lambda logs, which makes debug/troubleshooting cycle larger.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An idea:
maybe we should precompile a tree-shaken, minified config generator snippet when we build amplify-backend repo and have it as an assert in backend-function package we ship to npm.

as far as I can tell the content of snippet is constant and depends on env vars only.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I found a happy middleground. I'm using a typeGuard to return valid configuration inputs if all the env vars are present and returns unknown configuration inputs when they aren't.

I verified that a user who doesn't import the file doesn't see any additional lambda size, so I have no concern exporting these for every function.

I would like the invalid unknown configuration inputs to give a clear error message telling the dev to add resource authorization to get it to work, but I couldn't find a way to include this in the tooltip error message here.

I'm open to exploring the precompilation path if you think that lands somewhere better. Do you have an example of how that would work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an example of how that would work?

No. But this would be another layer of optimization if the middleground from above works.
This might be worth pursuing later, to reduce snippet+s3sdk bundle size later, in scenarios where these are supposed to be active.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw. we were considering this for SSM snippet in the past but we didn't pursue this because:

  1. SSM snippet can use lambda provided SSM SDK for its logic.
  2. SSM snippet is small. so this would be micro optimization.

Having said that however:
Have we considered including this logic in a same way we include SSM snippet? I.e. have this become part of banner and rely on Lambda provided SDK to make s3 call ? Instead of bundling ?

packages/backend-data/src/factory.ts Outdated Show resolved Hide resolved
@@ -203,6 +213,7 @@ class FunctionFactory implements ConstructFactory<AmplifyFunction> {
schedule: this.resolveSchedule(),
bundling: this.resolveBundling(),
layers,
generateDataClientConfig: this.props.generateDataClientConfig ?? false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by using the presence of the SSM vars to trigger the file to be built.

I'd rather always embed snippet from "data client config generator" and just make it no-op if env vars in function runtime are not populated.

This optimization should be deferred (seems premature) until:

  1. Generated snippet is large enough to cause visible impact on cold start.
  2. OR we pivot to "solution 2" where wiring between data and func factory becomes a must.

@@ -0,0 +1,61 @@
/* eslint-disable @typescript-eslint/naming-convention */

This comment was marked as outdated.

},
"./internal": {
"types": "./lib/internal/index.d.ts",
"import": "./lib/internal/index.js",
"require": "./lib/internal/index.js"
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be exporting this from backend. Customers won't be using this directly?
These are not necessary as long as backend-function is visible to bundler. In other words, having this in backend-function should be enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Customers apps will be using it directly via the generated files built to .amplify/generated/client-config/NAME.ts. I did this because of a concern @josefaidt raised about some build tools not supporting transitive dependencies, which might force some customers to depend on @aws-amplify/backend-function directly in their package.json to make things happy.

I can revert to consuming this from @aws-amplify/backend-function/internal for now if your confident this won't cause issues.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Customers apps will be using it directly via the generated files built to .amplify/generated/client-config/NAME.ts

that's not direct usage.

I did this because of a concern @josefaidt raised about some build tools not supporting transitive dependencies

we need a proof (poc app or user complaint) not speculation. until then we should defer.

Also, if this (https://github.com/aws-amplify/amplify-backend/pull/2147/files#r1833214679) is by any chance possible then perhaps we won't need exports at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This link isn't taking me to a specific place in the diff (maybe because I just posted an update). Can you point me at the code that means we won't need an export again?

Copy link
Member

@sobolk sobolk Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry wrong link, here is correct one #2147 (comment) .
It's leading to comment I left in a thread above.

packages/backend/src/internal/index.ts Outdated Show resolved Hide resolved
@stocaaro
Copy link
Member Author

Closing this change out, splitting it into two changes:

  1. Data factory shares the MIS via S3/SSM
  2. Function factory generates data config shim making use trivial

@stocaaro stocaaro force-pushed the stocaaro/client-schema/main branch from a21cdfe to 7a48cec Compare November 13, 2024 18:45
…runtime-config' into stocaaro/client-schema/function-runtime-config
commit d381f73
Author: Aaron S <[email protected]>
Date:   Thu Nov 14 09:37:08 2024 -0600

    feat: Add mis build to the data factory

commit 5d873a1
Author: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Date:   Wed Nov 13 16:56:18 2024 -0800

    Version Packages (#2229)

    Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

commit bc6dc69
Author: Kamil Sobol <[email protected]>
Date:   Wed Nov 13 16:13:16 2024 -0800

    Fix case where tool use does not have input while streaming (#2226)

commit 9fd0642
Author: Kamil Sobol <[email protected]>
Date:   Tue Nov 12 13:07:24 2024 -0800

    Add retry mechanism to create amplify e2e test (#2220)

    * Add retry mechanism to create amplify e2e test

    * Add retry mechanism to create amplify e2e test

    * this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants