
Backend reloads or crashes while loading multiple TrOCR models #431

Open
AkashDataScience opened this issue Feb 7, 2024 · 0 comments

I am using the ML backend for a multi-model project. I am creating two instances of the TrOCR model (handwritten and machine-printed). While loading the second model, Hugging Face makes changes in the cache folder. This triggers an app reload, which deletes all loaded models, or the app crashes. Loading a single model works fine.
[Screenshot: ML_Backend]

mp_trocr is for machine-printed text and hw_trocr is for handwritten text.
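
For context, the TrOCR models are loaded with the usual transformers API; the checkpoint names below are just examples, but the first from_pretrained call for each checkpoint is what downloads weights into the Hugging Face cache:

# Sketch of the TrOCR loading that triggers the cache writes.
# The checkpoint names are examples, not necessarily the ones I use.
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

mp_processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-printed")
mp_model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-printed")

hw_processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
hw_model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten")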

I have created a subclass of LabelStudioMLBase (called BaseClass in the snippet below) and a subclass of it for each model. I am loading the models in the __init__ of each model class.

from label_studio_ml.model import LabelStudioMLBase

# ClassificationType, Detection, Recognition and TextTrocrRecognition are
# project-specific wrappers around the underlying models.

class BaseClass(LabelStudioMLBase):
    def __init__(self, **kwargs):
        super(BaseClass, self).__init__(**kwargs)
        global CLASSIFICATION_MODEL, DETECTION_MODEL, RECOGNITION_MODEL, MP_TROCR_MODEL, HW_TROCR_MODEL
        self.config_path = kwargs['config_path']
        self.opt = self._get_config(self.config_path)
        # if not CLASSIFICATION_MODEL or not DETECTION_MODEL or not RECOGNITION_MODEL:
        self.classification_model, self.detection_model, self.recognition_model, self.mp_trocr_model, self.hw_trocr_model = self.load()
        CLASSIFICATION_MODEL = self.classification_model
        DETECTION_MODEL = self.detection_model
        RECOGNITION_MODEL = self.recognition_model
        MP_TROCR_MODEL = self.mp_trocr_model
        HW_TROCR_MODEL = self.hw_trocr_model

    def load(self, **kwargs):
        # Build each model from the parsed labeling config.
        classification_model = ClassificationType({'choice': self.parsed_label_config['choice']})
        detection_model = Detection({'label_rectangles': self.parsed_label_config['label_rectangles']})
        recognition_model = Recognition({'transcription': self.parsed_label_config['transcription']})
        mp_trocr_model = TextTrocrRecognition({'transcription': self.parsed_label_config['transcription']})
        hw_trocr_model = TextTrocrRecognition({'transcription': self.parsed_label_config['transcription']}, mp_trocr=mp_trocr_model)

        return classification_model, detection_model, recognition_model, mp_trocr_model, hw_trocr_model

This is a replica of the actual code. I am trying to create a config-driven project; config paths are passed when creating the objects.
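
Since the per-model subclasses themselves are not shown above, here is a minimal sketch of what I mean by passing a config path while creating the objects. The class names and config paths are placeholders, not the actual ones:

# Sketch only: one subclass per model, each supplying its own config path.
# Class names and config paths are placeholders.
class MachinePrintedTrocr(BaseClass):
    def __init__(self, **kwargs):
        kwargs.setdefault('config_path', 'configs/mp_trocr.yaml')
        super().__init__(**kwargs)

class HandWrittenTrocr(BaseClass):
    def __init__(self, **kwargs):
        kwargs.setdefault('config_path', 'configs/hw_trocr.yaml')
        super().__init__(**kwargs)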

Tried and Failed

  • Loading the handwritten model first and then the machine-printed one.
  • Reading the documentation on preventing caching while loading a model (found a way to change the cache path, shown in the sketch after this list, but no information on how to prevent caching).
  • Searching online for a way to restrict Flask to a specific folder when scanning for changes (found information limited to production environments, not debug mode).
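
For the cache-path part, the only thing I found was pointing the Hugging Face cache somewhere outside the backend's source tree via environment variables, set before transformers is imported (the target directory below is just an example):

# Sketch: relocate the Hugging Face cache before transformers is imported.
# The target directory is only an example.
import os
os.environ["HF_HOME"] = "/data/hf_cache"
os.environ["TRANSFORMERS_CACHE"] = "/data/hf_cache"

from transformers import TrOCRProcessor, VisionEncoderDecoderModel  # import after setting the env vars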

Is there any way to restrict this reloading to a particular folder?
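
I have not found a documented option in the ML backend for this, but if the Flask dev server is started directly and Werkzeug >= 2.0 is installed, it looks like the reloader can either be told to ignore the cache folder via exclude_patterns or be disabled entirely. A sketch under those assumptions (the module exposing app is hypothetical):

# Sketch, assuming Werkzeug >= 2.0 and direct access to the Flask app object.
from my_ml_backend import app  # hypothetical module exposing the Flask app

app.run(
    host="0.0.0.0",
    port=9090,
    debug=True,
    exclude_patterns=["*/huggingface/*", "*/.cache/*"],  # don't watch the HF cache
    # use_reloader=False,  # alternative: disable the auto-reloader entirely
)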

Label Studio version: 1.11.0
ML backend version: 1.0.8rc3
