Handling Correlations in Random Forests: which Impacts on Variable Importance and Model Interpretability?
Chavent, Marie; Lacaille, Jerome; Mourer, Alex; Olteanu, Madalina (2021), Handling Correlations in Random Forests: which Impacts on Variable Importance and Model Interpretability?, ESANN 2021 - Proceedings, i6doc.com, p. 569-574. 10.14428/esann/2021.ES2021-155
Type: Communication / Conference
Conference title: 29th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
Book title: ESANN 2021 - Proceedings
Number of pages: 675
Méthodes avancées d’apprentissage statistique et de contrôle [ASTRAL]
Safran Aircraft Engines
Statistique, Analyse et Modélisation Multidisciplinaire (SAmos-Marin Mersenne) [SAMM]
CEntre de REcherches en MAthématiques de la DEcision [CEREMADE]
Abstract (EN): The present manuscript tackles the issues of model interpretability and variable importance in random forests, in the presence of correlated input variables. Variable importance criteria based on random permutations are known to be sensitive when input variables are correlated, and may lead for instance to unreliability in the importance ranking. In order to overcome some of the problems raised by correlation, an original variable importance measure is introduced. The proposed measure builds upon an algorithm which clusters the input variables based on their correlations, and summarises each such cluster by a synthetic variable. The effectiveness of the proposed criterion is illustrated through simulations in a regression context, and compared with several existing variable importance measures.
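The clustering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the record does not specify. It assumes a simple stand-in: variables are grouped when their absolute Pearson correlation exceeds a hypothetical `threshold` (single-linkage grouping via union-find), and each cluster is summarised by the average of its standardised, sign-aligned members. All function names, the threshold value, and the simulated data are illustrative assumptions. The random forest and the importance computation on the synthetic variables are left out.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cluster_by_correlation(columns, threshold=0.7):
    """Group column indices whose |correlation| exceeds `threshold`.

    Assumption of this sketch: single-linkage grouping with union-find,
    used as a stand-in for the paper's clustering algorithm.
    """
    parent = list(range(len(columns)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            if abs(pearson(columns[i], columns[j])) > threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(columns)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def standardize(x):
    """Centre and scale a sequence to zero mean and unit variance."""
    m, s = statistics.fmean(x), statistics.pstdev(x)
    return [(a - m) / s for a in x]


def synthetic_variable(columns, members):
    """Average the standardised members, sign-aligned with the first one."""
    ref = columns[members[0]]
    total = [0.0] * len(ref)
    for idx in members:
        sign = 1.0 if pearson(ref, columns[idx]) >= 0 else -1.0
        for k, v in enumerate(standardize(columns[idx])):
            total[k] += sign * v
    return [t / len(members) for t in total]


# Simulated inputs (hypothetical): two correlated pairs and one independent variable.
rng = random.Random(0)
n = 200
z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rng.gauss(0, 1) for _ in range(n)]
columns = [
    [a + 0.1 * rng.gauss(0, 1) for a in z1],   # x0, driven by z1
    [a + 0.1 * rng.gauss(0, 1) for a in z1],   # x1, driven by z1
    [-a + 0.1 * rng.gauss(0, 1) for a in z2],  # x2, negatively driven by z2
    [a + 0.1 * rng.gauss(0, 1) for a in z2],   # x3, driven by z2
    [rng.gauss(0, 1) for _ in range(n)],       # x4, independent
]

clusters = cluster_by_correlation(columns)
synthetic = [synthetic_variable(columns, members) for members in clusters]
print(clusters)
```

In a full pipeline under these assumptions, the columns of `synthetic` would replace the original inputs when fitting the forest, so that an importance score is attributed to each cluster rather than split unpredictably among its correlated members.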