[hma] Test production deployment - Nice-to-haves and Need-to-haves #1440

Open
Dcallies opened this issue Oct 6, 2023 · 5 comments
Labels
hma Items related to the hasher-matcher-actioner system

Comments

@Dcallies
Contributor

Dcallies commented Oct 6, 2023

If you would consider deploying OMM to your production environment to evaluate it, please add a comment to this issue with the following.

If you would rather remain anonymous, contact me another way, send me your list, and indicate that it's okay for me to add it to this issue.

Need-to-haves

  1. You won't attempt deployment without these things
  2. Use numbered lists

Nice-to-haves

  1. These are items you would be willing to wait longer on or try without them
  2. Also helps if you use numbered lists!
@benoit-yubo
Contributor

Need-to-haves

  1. A way to rebuild or refresh the index in the background, without service interruption. In the current demo, any existing index is loaded at startup and replaced when a manual index rebuild is done. In production, we could have curator instances rebuild the index, but we need a way to either:
    • have the matchers reload newly built indices asynchronously while still serving requests
    • have the matchers restart in a rolling fashion (easy in k8s, just need a trigger)
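
The first option above (reloading while still serving requests) can be sketched roughly as follows. This is a minimal illustration, not HMA's implementation: `SwappableIndex`, `load_index`, and the dict-based index are all hypothetical stand-ins. The idea is to build the new index completely off to the side and then publish it with a single reference swap, so queries never see a half-built index.

```python
import threading
from typing import Callable, Optional


class SwappableIndex:
    """Holds the live index and lets a background thread replace it."""

    def __init__(self, load_index: Callable[[], dict]) -> None:
        # load_index is a hypothetical loader (e.g. reads the latest built index).
        self._load_index = load_index
        self._index: dict = load_index()  # initial load at startup
        self._lock = threading.Lock()     # serializes concurrent reloads
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def query(self, key: str) -> Optional[str]:
        # Take a local reference; an in-flight query keeps using the old
        # index object even if a reload publishes a new one meanwhile.
        index = self._index
        return index.get(key)

    def reload(self) -> None:
        # Build the replacement fully before publishing it.
        fresh = self._load_index()
        with self._lock:
            self._index = fresh

    def start(self, interval_seconds: float) -> None:
        # Periodically reload in a daemon thread until stop() is called.
        def loop() -> None:
            while not self._stop.wait(interval_seconds):
                self.reload()

        self._thread = threading.Thread(target=loop, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        if self._thread is not None:
            self._thread.join()
```

A matcher process would call `start(...)` once at startup and route every lookup through `query(...)`. Memory use briefly doubles during a reload because both indices exist at the same time.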

Nice-to-haves

  1. Have the loading from exchanges implemented and schedulable (NCMEC, IWF, GIFCT)
  2. Using FastAPI instead of Flask to obtain better performance would be appreciated, but it's my understanding that SQLAlchemy is not built with asyncio so that would probably mean using workarounds (or rewriting most of the Postgres queries) to really leverage the power of FastAPI.
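
One generic workaround of the kind mentioned in item 2 is to push blocking database calls onto worker threads so the event loop stays responsive. The sketch below uses only the standard library; `blocking_lookup` is a hypothetical stand-in for a synchronous query and is not taken from the HMA codebase.

```python
import asyncio
import time


def blocking_lookup(content_id: int) -> dict:
    # Hypothetical stand-in for a synchronous database query.
    time.sleep(0.01)
    return {"content_id": content_id, "matched": False}


async def handle_request(content_id: int) -> dict:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_lookup, content_id)


async def main() -> list:
    # Several requests proceed concurrently even though each lookup blocks.
    return await asyncio.gather(*(handle_request(i) for i in range(3)))


results = asyncio.run(main())
```

This keeps the existing queries untouched, at the cost of one thread per in-flight blocking call, so it would not deliver the full benefit of a natively async database layer.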

@sbruens
Collaborator

sbruens commented Jan 10, 2024

Must-Haves

  1. Load exchanges (e.g. GIFCT) and configure them at runtime (i.e. provide credentials).
  2. Image -> GIFCT matches, original source, metadata.

Nice-to-haves

  1. Provide feedback on match.
  2. Distance of matches (might be too hard)
  3. Create new bank and add hashes to it.
  4. Add ability to configure other classifiers

Will add more as we come across them.

@Dcallies
Contributor Author

Progress report 2/20/2024

@benoit-yubo

Need-to-haves

  1. [Solution ready, needs validation] The option TASK_INDEX_CACHE=True, which spawns local threads that pre-load and keep a live copy of the index in memory, should solve this. Let me know if it does.

Nice-to-haves

  1. [Solution ready, needs validation] Support for this is now pretty robust; all exchanges in py-tx should be supported.
  2. No progress.

@sbruens

Need-to-haves

  1. [Solution ready, needs validation] This should be ready, with CRUD for setting credentials at runtime
  2. Not started

Nice-to-haves

  1. Not started
  2. Not started
  3. [Solution ready, needs validation] Banking CRUD should handle this.
  4. [Solution ready, needs validation] SignalExchangeAPI is supported; which APIs are available just needs to be set at install time.

@juanmrad
Collaborator

juanmrad commented Feb 28, 2024

Need-to-haves

  • Know which hash the content was matched against
  • Distance between the hash in the bank and the image hash
  • Ability to configure distance per bank.
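
To make the three items above concrete, here is a rough sketch of a match result that reports the matched hash, its distance, and applies a per-bank threshold. It assumes bit-vector hashes compared by Hamming distance, as is typical for perceptual hashes such as PDQ. The names `Match`, `find_matches`, and `thresholds` are hypothetical and not part of the HMA API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Match:
    bank: str        # which bank the matched hash lives in
    bank_hash: str   # the exact hash that was matched against
    distance: int    # Hamming distance between query and bank hash


def hamming_distance(hex_a: str, hex_b: str) -> int:
    """Number of differing bits between two equal-length hex-encoded hashes."""
    if len(hex_a) != len(hex_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def find_matches(
    query_hash: str,
    banks: Dict[str, List[str]],
    thresholds: Dict[str, int],
    default_threshold: int = 0,
) -> List[Match]:
    """Linear scan that applies each bank's own distance threshold."""
    matches = []
    for bank, hashes in banks.items():
        limit = thresholds.get(bank, default_threshold)
        for bank_hash in hashes:
            distance = hamming_distance(query_hash, bank_hash)
            if distance <= limit:
                matches.append(Match(bank, bank_hash, distance))
    return sorted(matches, key=lambda m: m.distance)
```

A real index would not scan linearly. One way to support per-bank thresholds there would be to query with the largest configured threshold and filter the results per bank afterwards.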

Nice-to-haves

  • API to enable/disable signal types
  • API to enable/disable hashbanks
  • Ability to contribute hashes back to exchanges

@Dcallies Dcallies changed the title [omm] Test production deployment - Nice-to-haves and Need-to-haves [hma] Test production deployment - Nice-to-haves and Need-to-haves Mar 14, 2024
@Dcallies Dcallies added the hma Items related to the hasher-matcher-actioner system label Mar 14, 2024
@julietshen

Nice-to-haves

  1. Right-to-left (RTL) language support
  2. Ability to tell whether the same file is hashed in multiple banks
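
For item 2, one simple approach (a sketch only, assuming exact hash equality and a hypothetical in-memory mapping of bank name to hashes) is to invert the mapping and keep the hashes that appear in more than one bank:

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def banks_by_hash(banks: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Invert bank -> hashes into hash -> banks, keeping hashes in 2+ banks."""
    seen = defaultdict(set)
    for bank, hashes in banks.items():
        for h in hashes:
            seen[h].add(bank)
    return {h: names for h, names in seen.items() if len(names) > 1}
```

Near-duplicate files whose hashes differ by a few bits would need a distance-based comparison instead of exact equality.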

5 participants