v0.1.5

@dayyass dayyass released this 21 Oct 12:47
3353a53

Release v0.1.5 🥳🎉🍾

  • added pymorphy2 lemmatization (#81)
  • added token frequency support (#85)
  • added threshold selection for binary classification (#86)
  • added arbitrary save folder name (#83)

pymorphy2 lemmatization (config.yaml)

# preprocessing
# (included in resulting model pipeline, so preserved for inference)
preprocessing:
  lemmatization: pymorphy2

token frequency support

  • text_clf.token_frequency.get_token_frequency(path_to_config) -
    get token frequencies for the train dataset, computed according to the config file parameters
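
As a rough illustration of what a token frequency table contains, here is a minimal standalone sketch. It does not call text_clf and is not its implementation: the helper name count_token_frequency, the lowercasing, and the whitespace tokenization are all assumptions made for the example, whereas the real function takes its settings from the config file.

```python
from collections import Counter
from typing import Dict, Iterable


def count_token_frequency(texts: Iterable[str]) -> Dict[str, int]:
    """Count how often each token occurs across all texts (hypothetical helper)."""
    counter: Counter = Counter()
    for text in texts:
        # Assumption: lowercase the text and split on whitespace.
        counter.update(text.lower().split())
    return dict(counter)


# Toy training texts, made up for the example.
train_texts = ["good movie", "bad movie", "good good plot"]
token_frequency = count_token_frequency(train_texts)

# Print tokens from most to least frequent.
for token, count in sorted(token_frequency.items(), key=lambda kv: -kv[1]):
    print(token, count)
```

A table like this can help decide which rare or overly common tokens to filter out before training.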

threshold selection for binary classification

  • text_clf.pr_roc_curve.get_precision_recall_curve(path_to_model_folder) -
    get precision and recall metrics for the precision-recall curve
  • text_clf.pr_roc_curve.get_roc_curve(path_to_model_folder) -
    get false positive rate (fpr) and true positive rate (tpr) metrics for the ROC curve
  • text_clf.pr_roc_curve.plot_precision_recall_curve(precision, recall) -
    plot the precision-recall curve
  • text_clf.pr_roc_curve.plot_roc_curve(fpr, tpr) -
    plot the ROC curve
  • text_clf.pr_roc_curve.plot_precision_recall_f1_curves_for_thresholds(precision, recall, thresholds) -
    plot precision, recall, and F1-score curves for probability thresholds
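
One possible way to turn precision, recall, and thresholds into a single decision threshold is to pick the threshold with the highest F1-score. The sketch below is standalone and does not call text_clf. The helper names f1_scores and select_threshold and the sample values are hypothetical, and it assumes the arrays follow the scikit-learn convention, where precision and recall carry one extra trailing point that has no matching threshold. Whether the functions listed above return arrays in exactly this shape is an assumption.

```python
from typing import List, Sequence, Tuple


def f1_scores(precision: Sequence[float], recall: Sequence[float]) -> List[float]:
    """Compute the F1-score for each (precision, recall) pair."""
    scores = []
    for p, r in zip(precision, recall):
        # Guard against division by zero when both metrics are 0.
        scores.append(2 * p * r / (p + r) if (p + r) > 0 else 0.0)
    return scores


def select_threshold(
    precision: Sequence[float],
    recall: Sequence[float],
    thresholds: Sequence[float],
) -> Tuple[float, float]:
    """Return the (threshold, F1) pair with the highest F1 (hypothetical helper)."""
    # Assumption: any trailing precision/recall point without a threshold is dropped.
    scores = f1_scores(precision, recall)[: len(thresholds)]
    best = max(range(len(thresholds)), key=lambda i: scores[i])
    return thresholds[best], scores[best]


# Made-up values for illustration only.
precision = [0.5, 0.6, 0.8, 1.0, 1.0]
recall = [1.0, 1.0, 0.8, 0.5, 0.0]
thresholds = [0.2, 0.4, 0.6, 0.8]

best_threshold, best_f1 = select_threshold(precision, recall, thresholds)
print(best_threshold, round(best_f1, 3))
```

Maximizing F1 is only one criterion. When false positives and false negatives have different costs, a threshold chosen from a target precision or recall may fit better.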

arbitrary save folder name (config.yaml)

experiment_name: model