I want to be able to add a user-defined dictionary #66

lisongxi · 2023-11-14T02:01:15Z

It would be useful to be able to add a user-defined dictionary that indicates which words should be considered the same and which should be considered very different.

mjpieters · 2024-01-20T17:41:44Z

You can already do this, with either a custom scorer or a custom processor.

In both cases, a wrapper function can apply your similar-words dictionary lookup and then delegate to an existing scorer or processor.

For example, this wrapper function works with any existing scorer:

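import typing as t

def similar_words_scorer(similar_words: t.Mapping[str, str], scorer: t.Callable[[str, str], float]) -> t.Callable[[str, str], float]:
    """Wrap scorer so both inputs pass through similar_words before scoring."""
    def wrapper(s1, s2, *args, **kwargs):
        # Exact lookup on each whole string; strings without an entry pass through unchanged.
        s1 = similar_words.get(s1, s1)
        s2 = similar_words.get(s2, s2)
        # Forward any extra arguments to the wrapped scorer.
        return scorer(s1, s2, *args, **kwargs)
    return wrapper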

To use it with the default scorer:

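from thefuzz import process

# some_query and choices are placeholders for your own query and candidates.
similar_words = {"foo": "fooz"}  # ... plus any further entries
result = process.extractOne(some_query, choices, scorer=similar_words_scorer(similar_words, process.default_scorer))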
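
To see the effect of the scorer wrapper in isolation, here is a minimal, self-contained sketch. It assumes a stand-in scorer built on the standard library's difflib instead of thefuzz, and the names ratio_scorer and the "automobile"/"car" mapping are hypothetical, chosen only for illustration:

```python
import typing as t
from difflib import SequenceMatcher

Scorer = t.Callable[[str, str], float]


def ratio_scorer(s1: str, s2: str) -> float:
    """Stand-in scorer (an assumption, not part of thefuzz): similarity on a 0-100 scale."""
    return 100 * SequenceMatcher(None, s1, s2).ratio()


def similar_words_scorer(similar_words: t.Mapping[str, str], scorer: Scorer) -> Scorer:
    """Same wrapping technique as above: map both inputs, then score."""
    def wrapper(s1, s2, *args, **kwargs):
        s1 = similar_words.get(s1, s1)
        s2 = similar_words.get(s2, s2)
        return scorer(s1, s2, *args, **kwargs)
    return wrapper


# Hypothetical mapping: treat "automobile" as the canonical word "car".
similar_words = {"automobile": "car"}
scorer = similar_words_scorer(similar_words, ratio_scorer)

plain_score = ratio_scorer("automobile", "car")  # scored as two unrelated strings
mapped_score = scorer("automobile", "car")       # "automobile" is replaced by "car" first
unmapped_score = scorer("cart", "car")           # no entry for "cart", scored as usual

print(plain_score, mapped_score, unmapped_score)
```

Because the lookup is an exact match on the whole string, the mapping only helps when an entire query or choice appears as a key; matching individual words inside longer strings would need a different lookup.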

Alternatively, a processor can do the same. This example uses the same wrapping technique: it first runs the wrapped processor on the input, then maps the processed result through the similar-words mapping:

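import typing as t

def similar_word_processor(similar_words: t.Mapping[str, str], processor: t.Callable[[str], str]) -> t.Callable[[str], str]:
    """Wrap processor so its output passes through similar_words."""
    def wrapper(value):
        # Run the wrapped processor first, so lookups see the processed form.
        processed = processor(value)
        # Exact lookup; values without an entry pass through unchanged.
        return similar_words.get(processed, processed)
    return wrapper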

Then use it with, say, the default processor:

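from thefuzz import process

# some_query and choices are placeholders for your own query and candidates.
similar_words = {"foo": "fooz"}  # ... plus any further entries
result = process.extractOne(some_query, choices, processor=similar_word_processor(similar_words, process.default_processor))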
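
One detail of the processor approach is that the lookup happens after processing, so the mapping keys need to be written in their processed form. The sketch below illustrates this with a stand-in processor that only trims and lowercases; simple_processor and the example mapping are assumptions made for illustration, not thefuzz's own default processor:

```python
import typing as t

Processor = t.Callable[[str], str]


def simple_processor(value: str) -> str:
    """Stand-in processor (an assumption, not thefuzz's own): trim and lowercase."""
    return value.strip().lower()


def similar_word_processor(similar_words: t.Mapping[str, str], processor: Processor) -> Processor:
    """Same wrapping technique as above: process first, then map."""
    def wrapper(value):
        processed = processor(value)
        return similar_words.get(processed, processed)
    return wrapper


# Keys are written in processed (trimmed, lowercase) form,
# because the lookup sees the processor's output, not the raw input.
similar_words = {"automobile": "car"}
process_word = similar_word_processor(similar_words, simple_processor)

mapped = process_word("  Automobile ")  # processed to "automobile", then mapped
untouched = process_word("Bicycle")     # processed to "bicycle", no mapping entry

# A key in raw form never matches, since the processor lowercases first.
raw_keys = similar_word_processor({"Automobile": "car"}, simple_processor)
missed = raw_keys("Automobile")

print(mapped, untouched, missed)
```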
