Harnessing the consensus of density functional approximations

Chem. Sci., 12(39), 13021-13036 (2021)

In chemical discovery, it is common to stick to a single density functional approximation (DFA) for QC data generation because of its simplicity. However, this practice may introduce systematic bias into the dataset, especially in challenging materials spaces.


By investigating a large TMC dataset with DFAs from different rungs of “Jacob’s ladder”, I found that properties obtained with different DFAs correlate well linearly, even though their absolute predictions differ [7]. ML can therefore be used to reveal “universal” design rules that are invariant to the DFA choice. I also found that the lead compounds identified can depend strongly on the DFA, which demonstrates the risk of relying on a single DFA to identify leads. To alleviate this risk, I first proposed an approach that uses the consensus among multiple DFAs to discover robust (i.e., DFA-insensitive) lead compounds. Lead compounds discovered from the DFA consensus agree much better with experimentally observed leads than those identified by any single DFA.
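
The consensus idea can be illustrated with a small sketch. This is not the published workflow: the function name `consensus_leads`, the rank-based top-fraction criterion, the `min_agreement` threshold, and the toy numbers are all assumptions made for illustration. Each column holds one DFA's prediction of a property for the same set of compounds. Ranking within each column sidesteps the differences in absolute predictions, and a compound counts as a robust lead only if enough DFAs place it among their top candidates.

```python
import numpy as np

def consensus_leads(values, top_fraction=0.1, min_agreement=1.0,
                    higher_is_better=True):
    """Hypothetical helper: pick compounds that are leads under most DFAs.

    values: array of shape (n_compounds, n_dfas), one column per DFA.
    Returns (indices of consensus leads, per-compound agreement fraction).
    """
    values = np.asarray(values, dtype=float)
    n_compounds, n_dfas = values.shape
    k = max(1, int(round(top_fraction * n_compounds)))

    signed = values if higher_is_better else -values
    # Rank compounds within each DFA column (0 = best). Ranks are unchanged
    # by per-DFA shifts or positive rescalings of the predicted property.
    order = np.argsort(-signed, axis=0)
    ranks = np.empty_like(order)
    ranks[order, np.arange(n_dfas)] = np.arange(n_compounds)[:, None]

    in_top = ranks < k                 # lead according to each single DFA
    agreement = in_top.mean(axis=1)    # fraction of DFAs that agree
    leads = np.flatnonzero(agreement >= min_agreement)
    return leads, agreement

# Toy data (made up): 5 compounds, 3 DFAs.
values = np.array([
    [1.0, 2.0, 0.5],
    [0.9, 1.5, 0.6],
    [0.2, 0.1, 0.1],
    [0.8, 1.9, 0.4],
    [0.1, 0.3, 0.2],
])

leads, agreement = consensus_leads(values, top_fraction=0.4)
print("consensus leads:", leads.tolist())
print("agreement per compound:", agreement.round(2).tolist())
```

In this toy example the single-DFA top-two sets differ from column to column, while the unanimous consensus keeps only the compound that every DFA ranks among its leads. Lowering `min_agreement` relaxes the criterion from unanimity to a majority-style vote.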