
implement shap-feature-importance #16385

Open
wendycwong opened this issue Sep 11, 2024 · 0 comments
wendycwong (Contributor) commented Sep 11, 2024

We have SHAP summary plots, but the user wants to see the actual importance values. Here is the answer according to @tomasfryda:

AFAIK we don’t have a method/function to do that. Usually the mean absolute contribution is used for variable importance (https://christophm.github.io/interpretable-ml-book/shap.html#shap-feature-importance), but I don’t think there is just one correct way to do it.
Also, I would probably recommend the SHAP summary plot instead, as it shows more information without additional computation.
The calculation itself is quite trivial:
import matplotlib.pyplot as plt

# SHAP contributions for each row of the test frame
# (optionally pass background_frame=train; see the note below)
contr = model.predict_contributions(test)

# feature importance = mean absolute contribution per column
feature_importances = dict(zip(contr.names, contr.abs().mean()))

# horizontal bar chart, least to most important
fi = sorted(feature_importances.items(), key=lambda x: x[1])
plt.barh([x[0] for x in fi], [x[1] for x in fi])
plt.title("Feature Importances")
plt.show()
For tree models you don’t have to specify the background frame. Calculation with a background frame is usually much slower (IIRC the number of operations is the number of rows in the background frame times the number of operations without a background frame).
Generally, it’s recommended to use background_frame, as the choice of background frame influences the results. The problem with not using a background frame is that you don’t know how important individual splits in the trees are. For example, if the model denies a mortgage to people taller than 3 m (~10 ft), the contributions calculated without a background frame would treat this split as just as important as other splits, but with a background frame we would find out that there are no people that tall (or at least very few), so the contribution would end up lower.
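A minimal end-to-end sketch of the calculation above, with and without a background frame, following the snippet in this comment. The dataset, column choices, and GBM settings are illustrative assumptions only (any H2O tree model would work the same way); the importance computation reuses the zip/abs/mean pattern from the snippet, dropping the BiasTerm column that predict_contributions appends:

import h2o
from h2o.estimators import H2OGradientBoostingEstimator

h2o.init()

# illustrative dataset and model -- not part of the original comment
prostate = h2o.import_file("https://s3.amazonaws.com/h2o-public-test-data/smalldata/prostate/prostate.csv")
prostate["CAPSULE"] = prostate["CAPSULE"].asfactor()
train, test = prostate.split_frame(ratios=[0.8], seed=42)

model = H2OGradientBoostingEstimator(ntrees=50, seed=42)
model.train(y="CAPSULE", x=["AGE", "RACE", "DPROS", "DCAPS", "PSA", "VOL", "GLEASON"],
            training_frame=train)

# TreeSHAP without a background frame: fast, but splits on values that never
# occur in the data can still look important
contr = model.predict_contributions(test)

# Baseline SHAP against a background frame: slower, but importances reflect
# the data distribution in the background frame
contr_bg = model.predict_contributions(test, background_frame=train)

# mean absolute contribution per feature, dropping the BiasTerm column
fi = {c: v for c, v in zip(contr_bg.names, contr_bg.abs().mean()) if c != "BiasTerm"}
print(sorted(fi.items(), key=lambda x: -x[1]))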
[Attached screenshot: Screenshot 2024-09-11 at 8.51.44.png]


Implement this for R and Python clients.
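For reference, a rough sketch of what such a helper could look like on the Python client side, built around the snippet above. The name shap_feature_importance, its signature, and the plotting behavior are hypothetical illustrations, not a decided API; the R client would presumably wrap its predict_contributions call the same way:

# Hypothetical helper -- name, signature, and behavior are illustrative only;
# the actual API for this ticket is still to be decided.
def shap_feature_importance(model, frame, background_frame=None, plot=True):
    """Return the mean absolute SHAP contribution per feature, optionally plotted."""
    if background_frame is not None:
        contr = model.predict_contributions(frame, background_frame=background_frame)
    else:
        contr = model.predict_contributions(frame)
    importances = dict(zip(contr.names, contr.abs().mean()))
    importances.pop("BiasTerm", None)  # keep only feature columns
    if plot:
        import matplotlib.pyplot as plt
        fi = sorted(importances.items(), key=lambda x: x[1])
        plt.barh([x[0] for x in fi], [x[1] for x in fi])
        plt.title("SHAP Feature Importances")
        plt.show()
    return importances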
