
Backend reloads or crashes while loading multiple TrOCR models #431

Open
AkashDataScience opened this issue Feb 7, 2024 · 0 comments

I am using the ML backend for a multi-model project. I am creating two instances of the TrOCR model (handwritten and machine-printed). While loading the second model, Hugging Face makes changes in the cache folder. This triggers an app reload, which deletes all loaded models, or the app crashes. Loading a single model works fine.
[Screenshot: ML_Backend]

mp_trocr is for machine-printed text and hw_trocr is for handwritten text.
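
For context, the TrOCR models are loaded with the usual transformers API; the checkpoint names below are just examples, but the first from_pretrained call for each checkpoint is what downloads weights into the Hugging Face cache:

# Sketch of the TrOCR loading that triggers the cache writes.
# The checkpoint names are examples, not necessarily the ones I use.
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

mp_processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-printed")
mp_model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-printed")

hw_processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
hw_model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten")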

I have created a subclass of LabelStudioMLBase (called BaseClass in the snippet below) and a subclass of it for each model. I am loading the models in the __init__ of each model class.

from label_studio_ml.model import LabelStudioMLBase

# ClassificationType, Detection, Recognition and TextTrocrRecognition are
# project-specific wrappers around the underlying models.

class BaseClass(LabelStudioMLBase):
    def __init__(self, **kwargs):
        super(BaseClass, self).__init__(**kwargs)
        global CLASSIFICATION_MODEL, DETECTION_MODEL, RECOGNITION_MODEL, MP_TROCR_MODEL, HW_TROCR_MODEL
        self.config_path = kwargs['config_path']
        self.opt = self._get_config(self.config_path)
        # if not CLASSIFICATION_MODEL or not DETECTION_MODEL or not RECOGNITION_MODEL:
        self.classification_model, self.detection_model, self.recognition_model, self.mp_trocr_model, self.hw_trocr_model = self.load()
        CLASSIFICATION_MODEL = self.classification_model
        DETECTION_MODEL = self.detection_model
        RECOGNITION_MODEL = self.recognition_model
        MP_TROCR_MODEL = self.mp_trocr_model
        HW_TROCR_MODEL = self.hw_trocr_model

    def load(self, **kwargs):
        # Build each model from the parsed labeling config.
        classification_model = ClassificationType({'choice': self.parsed_label_config['choice']})
        detection_model = Detection({'label_rectangles': self.parsed_label_config['label_rectangles']})
        recognition_model = Recognition({'transcription': self.parsed_label_config['transcription']})
        mp_trocr_model = TextTrocrRecognition({'transcription': self.parsed_label_config['transcription']})
        hw_trocr_model = TextTrocrRecognition({'transcription': self.parsed_label_config['transcription']}, mp_trocr=mp_trocr_model)

        return classification_model, detection_model, recognition_model, mp_trocr_model, hw_trocr_model

This is a replica of the actual code. I am trying to create a config-driven project; config paths are passed when creating the objects.
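
Since the per-model subclasses themselves are not shown above, here is a minimal sketch of what I mean by passing a config path while creating the objects. The class names and config paths are placeholders, not the actual ones:

# Sketch only: one subclass per model, each supplying its own config path.
# Class names and config paths are placeholders.
class MachinePrintedTrocr(BaseClass):
    def __init__(self, **kwargs):
        kwargs.setdefault('config_path', 'configs/mp_trocr.yaml')
        super().__init__(**kwargs)

class HandWrittenTrocr(BaseClass):
    def __init__(self, **kwargs):
        kwargs.setdefault('config_path', 'configs/hw_trocr.yaml')
        super().__init__(**kwargs)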

Tried and Failed

  • Loading the handwritten model first and then the machine-printed one.
  • Reading the documentation on preventing caching while loading a model (found a way to change the cache path, shown in the sketch after this list, but no information on how to prevent caching).
  • Searching online for a way to restrict Flask to a specific folder when scanning for changes (found information limited to production environments, not debug mode).
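
For the cache-path part, the only thing I found was pointing the Hugging Face cache somewhere outside the backend's source tree via environment variables, set before transformers is imported (the target directory below is just an example):

# Sketch: relocate the Hugging Face cache before transformers is imported.
# The target directory is only an example.
import os
os.environ["HF_HOME"] = "/data/hf_cache"
os.environ["TRANSFORMERS_CACHE"] = "/data/hf_cache"

from transformers import TrOCRProcessor, VisionEncoderDecoderModel  # import after setting the env vars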

Is there any way to restrict this reloading to a particular folder?
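
I have not found a documented option in the ML backend for this, but if the Flask dev server is started directly and Werkzeug >= 2.0 is installed, it looks like the reloader can either be told to ignore the cache folder via exclude_patterns or be disabled entirely. A sketch under those assumptions (the module exposing app is hypothetical):

# Sketch, assuming Werkzeug >= 2.0 and direct access to the Flask app object.
from my_ml_backend import app  # hypothetical module exposing the Flask app

app.run(
    host="0.0.0.0",
    port=9090,
    debug=True,
    exclude_patterns=["*/huggingface/*", "*/.cache/*"],  # don't watch the HF cache
    # use_reloader=False,  # alternative: disable the auto-reloader entirely
)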

Label Studio version: 1.11.0
ML backend version: 1.0.8rc3
