Consider caching expensive functions #109

Open
mmalenic opened this issue Aug 3, 2022 · 4 comments
@mmalenic
Member

mmalenic commented Aug 3, 2022

We should consider caching expensive computations and functions that download large amounts of data. For example, we could cache S3 GetObject requests, or computations that search through entire index files. This could be done with Rust's cached crate, and may improve performance on successive queries with similar parameters.
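
As a rough illustration of the memoization idea, here is a minimal sketch that uses only the standard library's `HashMap` rather than the cached crate itself. The `Memo` type and its methods are hypothetical names introduced for this example and are not part of htsget-rs; the closure passed in stands in for an expensive step such as scanning an index file.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical helper: memoizes a one-argument function in an in-memory map.
pub struct Memo<K, V, F>
where
    F: Fn(&K) -> V,
{
    compute: F,
    cache: HashMap<K, V>,
    misses: usize,
}

impl<K, V, F> Memo<K, V, F>
where
    K: Hash + Eq + Clone,
    V: Clone,
    F: Fn(&K) -> V,
{
    pub fn new(compute: F) -> Self {
        Self {
            compute,
            cache: HashMap::new(),
            misses: 0,
        }
    }

    /// Returns the cached value for `key`, computing and storing it on a miss.
    pub fn get(&mut self, key: &K) -> V {
        if let Some(value) = self.cache.get(key) {
            return value.clone();
        }
        // Cache miss: run the expensive function once and remember the result.
        self.misses += 1;
        let value = (self.compute)(key);
        self.cache.insert(key.clone(), value.clone());
        value
    }

    /// Number of times the underlying function was actually called.
    pub fn misses(&self) -> usize {
        self.misses
    }
}
```

Calling `get` twice with the same key runs the wrapped function only once. This sketch never evicts entries, so a real implementation would likely want a size bound or expiry, which is the kind of bookkeeping a dedicated caching crate is meant to handle.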

@mmalenic mmalenic added the enhancement New feature or request label Aug 3, 2022
@brainstorm
Member

brainstorm commented Aug 3, 2022

Worth reviewing samtools/hts-specs#325 before tackling it.

It's a slightly different topic, though, as I guess your focus here is more on memoization than on network/query response payload caching?

@mmalenic
Member Author

mmalenic commented Aug 3, 2022

We could definitely explore HTTP caching. Maybe some data could even be cached client-side? I think the main benefit of caching would be to reduce delays between the Lambda functions and AWS S3 in the htsget-http-lambda crate, although there might be an AWS-specific solution for this.

The htsget-http-actix crate already has access to the file system, so it's not as big a deal there. Both crates would still benefit from memoization of expensive functions.
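
To make the idea of avoiding repeated round trips to S3 more concrete, here is a minimal sketch of a time-limited in-memory cache keyed by object key. Everything in it is an assumption for illustration: `TtlCache` and `get_or_fetch` are hypothetical names, the `fetch` closure stands in for a real S3 GetObject call, and the current time is passed in explicitly so the behaviour is easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical cache: object bytes are reused until they are older than `ttl`.
pub struct TtlCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, Vec<u8>)>,
}

impl TtlCache {
    pub fn new(ttl: Duration) -> Self {
        Self {
            ttl,
            entries: HashMap::new(),
        }
    }

    /// Returns the cached bytes for `key` if they are still fresh at `now`.
    /// Otherwise calls `fetch` (a stand-in for the real download) and stores the result.
    pub fn get_or_fetch<F>(&mut self, key: &str, now: Instant, fetch: F) -> Vec<u8>
    where
        F: FnOnce() -> Vec<u8>,
    {
        if let Some((stored_at, bytes)) = self.entries.get(key) {
            if now.saturating_duration_since(*stored_at) < self.ttl {
                return bytes.clone();
            }
        }
        // Missing or expired: download again and record when we stored it.
        let bytes = fetch();
        self.entries.insert(key.to_string(), (now, bytes.clone()));
        bytes
    }
}
```

In a Lambda function, a cache like this would only help across warm invocations that reuse the same execution environment, so whether it pays off is something profiling would need to confirm.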

@brainstorm
Member

brainstorm commented Aug 10, 2022

Before introducing those I would definitely profile first. @victorskl can help you out with the AWS X-Ray profiling tools. Local profiling can be done with cargo-instruments, flamegraphs, and similar tools.

@brainstorm
Member

Also, take into account the cache headers from S3. I just bumped into this: https://www.sam.today/blog/always-set-cache-control
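
As a small sketch of what honouring cache headers could look like, the function below reads the `max-age` directive from a `Cache-Control` header value. The function name `max_age_secs` is hypothetical, and the handling is deliberately conservative and simplified: it treats both `no-store` and `no-cache` as "do not reuse", and it ignores other directives and quoted values.

```rust
/// Hypothetical helper: returns the `max-age` value in seconds from a
/// `Cache-Control` header value, or `None` if the response should not be reused.
pub fn max_age_secs(cache_control: &str) -> Option<u64> {
    let mut max_age = None;
    for directive in cache_control.split(',') {
        // Directive names are case-insensitive, so normalise before comparing.
        let directive = directive.trim().to_ascii_lowercase();
        if directive == "no-store" || directive == "no-cache" {
            // Simplification: treat both as "never serve from the cache".
            return None;
        }
        if let Some(value) = directive.strip_prefix("max-age=") {
            max_age = value.trim().parse::<u64>().ok();
        }
    }
    max_age
}
```

A value such as `public, max-age=3600` yields `Some(3600)`, which could then be used as the lifetime of a cached entry, while a header without `max-age` yields `None`.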
