Machine learning accelerated descriptor design for catalyst discovery in CO2 to methanol conversion

In order to discover potential new catalysts for converting CO₂ into methanol, we present the workflow depicted in Fig. 1. The key steps are summarized here, with detailed implementation procedures and configurations available in the Methods section.

**Fig. 1: Schematic of the workflow for adsorption energy distributions (AEDs) generation.**

Search space selection

To effectively reduce the search space for potential catalyst materials for CO₂ thermal conversion, we first isolated the metallic elements that have undergone prior experimentation for this process, as documented by Bahri et al.³⁶. To maintain prediction accuracy, these elements also had to be part of the Open Catalyst 2020 (OC20) database³³. The elements shortlisted are the following: K, V, Mn, Fe, Co, Ni, Cu, Zn, Ga, Y, Ru, Rh, Pd, Ag, In, Ir, Pt, and Au. We then proceeded to search through the Materials Project database³⁷ for stable and experimentally observed crystal structures associated with these metals and their bimetallic alloys. We compiled 216 stable phase forms involving both single metals and bimetallic alloys corresponding to our set of 18 elements. A detailed listing of these materials is provided in Supplementary Tables S1 and S2 of the Supplementary Information. We performed bulk DFT optimization at the RPBE³⁸ level to align with OC20 for the obtained materials. Optimization of 22 materials was not successful, and therefore, they were excluded from the materials list, as detailed in Supplementary Table S2 in the Supplementary Information.
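As a rough illustration of the size of this search space, the unary and binary chemical systems spanned by the 18 shortlisted elements can be enumerated directly. This is a minimal sketch: the dash-joined labels (e.g. `Cu-Zn`) and the function name `chemical_systems` are assumptions made for illustration and do not reproduce the actual Materials Project query used by the authors.

```python
from itertools import combinations

# The 18 shortlisted elements, as listed in the text.
ELEMENTS = ["K", "V", "Mn", "Fe", "Co", "Ni", "Cu", "Zn", "Ga",
            "Y", "Ru", "Rh", "Pd", "Ag", "In", "Ir", "Pt", "Au"]

def chemical_systems(elements):
    """Return unary and binary chemical systems as dash-joined labels.

    The label format is a hypothetical convention chosen for this sketch.
    """
    unary = list(elements)
    binary = ["-".join(sorted(pair)) for pair in combinations(elements, 2)]
    return unary + binary

systems = chemical_systems(ELEMENTS)
# 18 single metals plus C(18, 2) = 153 element pairs.
print(len(ELEMENTS), len(systems))
```

Each chemical system can contain several stable phases or none at all, which is why the number of compiled materials differs from the number of systems.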

To identify the most relevant adsorbates for the AED calculations, we surveyed the existing literature. An experimental investigation by Amman et al.³⁰ highlighted the presence of surface-bound radicals such as *H (hydrogen atom), *OH (hydroxy group), *OCHO (formate), and *OCH₃ (methoxy) as essential reaction intermediates in the thermocatalytic reduction of CO₂ to methanol. Based on these findings, we selected these adsorbates for our AED calculations. Note that the notation for formate varies across the literature, e.g., *HCOO^30,39 and HCOO*⁴⁰. With the help of the fairchem repository tools by OCP⁴¹, we created surfaces with Miller indices ∈ {−2, −1, ..., 2} and calculated their total energy using the OCP MLFF. If we encountered multiple cuts for the same facet, we selected the one with the lowest energy for further calculations. We then constructed surface-adsorbate configurations for the most stable surface terminations across all facets within our defined Miller index range for the materials, as described in the Methods section, and optimized these configurations using the OCP MLFF. During this process, we found that seven materials had surface-adsorbate supercells so large that their calculations were infeasible on the available GPU resources, even with the efficient OCP MLFF. Consequently, they were excluded from our study.
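The facet enumeration and the selection of the lowest-energy cut can be sketched in plain Python. This is only an illustration under stated assumptions: it does not use the fairchem tools, it does not merge symmetry-equivalent facets (which depends on the crystal structure), and the function names and toy energies are hypothetical.

```python
from itertools import product
from math import gcd

def miller_indices(max_index=2):
    """All primitive (h, k, l) with components in [-max_index, max_index]."""
    values = range(-max_index, max_index + 1)
    facets = []
    for h, k, l in product(values, repeat=3):
        if (h, k, l) == (0, 0, 0):
            continue  # not a valid plane
        # Skip multiples such as (2, 2, 0), which describe the same plane as (1, 1, 0).
        if gcd(gcd(abs(h), abs(k)), abs(l)) == 1:
            facets.append((h, k, l))
    return facets

def lowest_energy_cuts(cuts):
    """For each facet, keep the cut (termination) with the lowest energy.

    `cuts` is an iterable of (miller, cut_id, energy_eV) tuples.
    """
    best = {}
    for miller, cut_id, energy in cuts:
        if miller not in best or energy < best[miller][1]:
            best[miller] = (cut_id, energy)
    return best

facets = miller_indices(2)

# Toy energies, invented for illustration only.
toy_cuts = [((1, 1, 1), "a", -3.2), ((1, 1, 1), "b", -3.5), ((1, 0, 0), "a", -2.9)]
selected = lowest_energy_cuts(toy_cuts)
print(len(facets), selected)
```

For a real material, many of these index triples are related by symmetry, so the number of distinct facets is usually much smaller than the raw count.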

Validation and data cleaning

In our work, we have employed the OCP equiformer_V2 MLFF. Its reported accuracy for the adsorption energy of small molecular fragments is 0.23 eV⁴². However, *OCHO was not included in the OC20 database used for training equiformer_V2, raising concerns about the accuracy of our adsorption energy predictions in this work. To benchmark equiformer_V2 for our use case, we chose Pt, Zn, and NiZn and performed explicit DFT calculations (see Methods section for details). The comparison between predicted and DFT-calculated adsorption energies can be found in Fig. 2 and Table 1. The predictions for Pt are precise, whereas the NiZn results show some outliers, and there is a noticeable degree of scatter for Zn. Despite this, the overall mean absolute error (MAE) for the adsorption energies of the selected materials is 0.16 eV, which falls within the reported accuracy of the employed MLFF.

Table 1 Mean absolute error (MAE) between OCP MLFF predictions and single-point DFT calculations for three selected materials – Pt, Zn and NiZn alloy

To affirm the reliability of our predicted AEDs across a broader range of materials along with maintaining computational practicality, we integrated a validation step within our analysis workflow. We sampled the minimum, maximum, and median adsorption energies for each adsorbate-material pair from the predicted AEDs. We performed single-point DFT calculations on these selected systems and compared with the adsorption energy predictions of the OCP MLFF. The difference is compiled in an ‘estimated MAE’ (EMAE). Comparisons between EMAE and the all-encompassing MAE for our complete test set are presented in Table 1. While the EMAE may differ from the actual MAE by up to a factor of three for specific adsorbates, it generally remains in close proximity to the actual MAE, thus serving as a reliable gauge of data quality.
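The sampling behind the EMAE can be sketched as follows. This is a minimal illustration, assuming that the median is taken as an actual configuration (the upper middle element for an even count) and that DFT reference energies are available for the sampled indices. The function names and all numbers are hypothetical.

```python
from statistics import mean

def sample_min_median_max(predicted):
    """Indices of the minimum, median, and maximum predicted energies."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    # A set avoids double counting when the list is very short.
    return sorted({order[0], order[len(order) // 2], order[-1]})

def estimated_mae(predicted, dft_by_index):
    """Mean absolute error evaluated on the sampled configurations only."""
    sampled = sample_min_median_max(predicted)
    return mean(abs(predicted[i] - dft_by_index[i]) for i in sampled)

# Toy MLFF predictions (eV) for one adsorbate-material pair, invented for illustration.
predicted = [-1.2, -0.4, -2.0, 0.3, -0.9]
# Toy single-point DFT energies (eV) for the three sampled configurations.
dft_by_index = {2: -2.1, 4: -0.8, 3: 0.6}

emae = estimated_mae(predicted, dft_by_index)
print(sample_min_median_max(predicted), round(emae, 3))
```

Only three DFT calculations per adsorbate-material pair are needed in this scheme, which is what keeps the validation step affordable across the whole dataset.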

The validation step feeds into the final data cleaning, in which we exclude any material with an EMAE above the threshold of 0.25 eV. Consequently, 29 materials were removed from our dataset, leaving 158 materials. Most materials flagged for large EMAEs are magnetic, for example MnCo, MnGa, or FeCo. Magnetism presents significant challenges for the non-spin-polarized DFT calculations used in OC20 and in this work. A complete list of estimated MAEs for the remaining 158 materials is accessible in ref. ⁴³.
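The threshold filter and the resulting material counts can be checked with a few lines of Python. The filter is a sketch that assumes a single EMAE value per material. The material labels and EMAE values are invented, while the counts are those reported in the text.

```python
EMAE_THRESHOLD_EV = 0.25  # threshold stated in the text

def clean(emae_by_material, threshold=EMAE_THRESHOLD_EV):
    """Split materials into kept and removed according to their EMAE (eV)."""
    kept = {m: e for m, e in emae_by_material.items() if e <= threshold}
    removed = {m: e for m, e in emae_by_material.items() if e > threshold}
    return kept, removed

# Toy EMAE values, invented for illustration only.
toy_emae = {"mat_a": 0.12, "mat_b": 0.31, "mat_c": 0.25}
kept, removed = clean(toy_emae)

# Bookkeeping with the counts reported in the text.
initial = 216
after_bulk_dft = initial - 22          # bulk optimization failed for 22 materials
after_size_limit = after_bulk_dft - 7  # 7 supercells were too large for the GPUs
final = after_size_limit - 29          # 29 materials exceeded the EMAE threshold
print(after_bulk_dft, after_size_limit, final)  # 194 187 158
```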

Adsorption energy distributions

Lastly, to compile the AEDs, we examined the relaxed configurations. When several distinct initial configurations of the same adsorbate, material, and facet converged to the same final structure, only one of them was counted in the AED. In our final compilation, we transformed all AEDs into histograms that depict the probability distribution of adsorption sites falling within 0.1 eV energy intervals. Each AED was normalized, ensuring that the aggregate probability of adsorption sites per adsorbate and material equaled one. This standardization facilitates direct comparisons across materials with different numbers of adsorption sites, which can range from several tens to nearly 10,000 for a single material, depending upon the complexity and symmetry of its bulk structure. For illustrative purposes, Fig. 3 displays examples of AEDs for selected materials. The AEDs for all investigated materials are shown in Supplementary Fig. S1 in the Supplementary Information.
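A minimal sketch of this compilation step is given below. It assumes that each relaxed configuration comes with some fingerprint identifying its final structure. How such a fingerprint is computed is not specified here, so the string labels, function names, and energies are purely hypothetical.

```python
from collections import Counter
from math import floor

def deduplicate(relaxed):
    """Keep one energy per final structure.

    `relaxed` is an iterable of (fingerprint, energy_eV) pairs; the
    fingerprint is a hypothetical identifier of the relaxed structure.
    """
    unique = {}
    for fingerprint, energy in relaxed:
        unique.setdefault(fingerprint, energy)
    return list(unique.values())

def aed_histogram(energies, bin_width=0.1):
    """Map bin index -> probability; bin i covers [i*w, (i+1)*w) eV."""
    # Rounding guards against floating-point noise at bin edges.
    counts = Counter(floor(round(e / bin_width, 9)) for e in energies)
    total = sum(counts.values())
    return {b: c / total for b, c in sorted(counts.items())}

# Toy relaxed configurations, invented for illustration only.
relaxed = [("fcc", -0.52), ("fcc", -0.52), ("top", -0.31),
           ("bridge", -0.47), ("hcp", -0.58)]
energies = deduplicate(relaxed)
hist = aed_histogram(energies)
print(hist)
```

Because every histogram sums to one, a material with a few dozen sites and one with thousands of sites end up on the same scale.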

**Fig. 3: AEDs for selected subset of materials.**

Inspection of Fig. 3 and Supplementary Fig. S1 reveals that adsorption energies span a wide interval from −7.42 eV to 2.40 eV. We included energies above zero, although positive adsorption energies are typically indicative of molecular desorption, for two reasons. First, the adsorption energies reported in this work do not include entropy and pressure terms, which could shift the energies to more negative values. Second, the adsorption energy of radicals is somewhat ill-defined if different desorption channels are conceivable. Since our objective is a qualitative comparison across materials, the precise energy zero is of no relevance, as long as it is chosen consistently.

The AEDs exhibit varying dispersion and forms, indicating fluctuations in adsorption energy and related activity levels across the material space. The adsorption energies of *OCH₃ (E_ads) are generally the lowest, followed by those of *OCHO, which are approximately 0.5–1 eV higher. However, certain materials, such as K (illustrated in Fig. 3b), show unique distribution overlaps for *OCHO and *OCH₃. Meanwhile, *H and *OH have comparatively higher E_ads values, although their order is inconsistent. For instance, in some cases, *H has the highest E_ads, particularly for K and Y₃In₅, whereas the opposite trend is observed for other materials like Ni. Single metal distributions are generally narrower and higher, as seen in the examples of K and Ni. Similarly, alloys composed of elements with high symmetry, such as CuZn, also exhibit narrow AEDs.

If the AEDs of a material predominantly align around the adsorption energy linked to maximal activity according to the Sabatier principle, the material is a strong candidate for a good catalyst. Conversely, complex alloys with low symmetry, such as Y₃In₅ (shown in the lower section of Fig. 3a and in Fig. 3c), display broad AED spreads. Extremely low adsorption energies can lead to catalyst poisoning, while excessively high energies can significantly reduce catalytic activity. Therefore, broad distributions are less desirable, as only a small portion of the material’s surface contributes effectively to catalytic processes.

Unsupervised learning: catalyst discovery

Although the ideal AEDs for the four adsorbates remain unknown, it is feasible to approximate their reactivity using AEDs based on their resemblance to previously identified, efficacious catalytic materials. In this context, our AEDs can be conceptualized as four-dimensional probability distributions. To quantify similarities across AEDs of different materials, we employ the Wasserstein distance as the metric³⁵. By computing Wasserstein distances for all possible material combinations, we construct a distance matrix. To interpret the distance matrix, we apply hierarchical agglomerative clustering with Ward linkage⁴⁴, which facilitates the identification of materials with similar AEDs. The outcomes of this clustering analysis are depicted in Fig. 4.
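To make the distance computation concrete, the sketch below builds a pairwise distance matrix from binned AEDs. It rests on a simplifying assumption that differs from the four-dimensional treatment described above: each adsorbate's histogram is compared separately with a one-dimensional Wasserstein distance on a shared energy grid, and the four values are summed. Function names and toy histograms are hypothetical.

```python
from itertools import accumulate

def wasserstein_1d(p, q, bin_width=0.1):
    """1D Wasserstein-1 distance between two histograms on the same bins.

    For a shared, equally spaced grid this equals the bin width times the
    summed absolute difference of the cumulative distributions.
    """
    cdf_p = list(accumulate(p))
    cdf_q = list(accumulate(q))
    return bin_width * sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))

def aed_distance(aed_a, aed_b, bin_width=0.1):
    """Sum of per-adsorbate 1D distances (a simplification, see lead-in)."""
    return sum(wasserstein_1d(aed_a[ads], aed_b[ads], bin_width) for ads in aed_a)

def distance_matrix(aeds, bin_width=0.1):
    """Return material names and the symmetric matrix of pairwise distances."""
    names = sorted(aeds)
    matrix = [[aed_distance(aeds[a], aeds[b], bin_width) for b in names]
              for a in names]
    return names, matrix

# Toy normalized histograms for two adsorbates, invented for illustration only.
aeds = {
    "A": {"*H": [1.0, 0.0, 0.0], "*OH": [0.5, 0.5, 0.0]},
    "B": {"*H": [0.0, 1.0, 0.0], "*OH": [0.5, 0.5, 0.0]},
    "C": {"*H": [0.0, 0.0, 1.0], "*OH": [0.0, 0.5, 0.5]},
}
names, matrix = distance_matrix(aeds)
print(names, matrix)
```

Unlike a bin-by-bin difference, this distance grows with how far probability mass has to be moved along the energy axis, so two narrow AEDs shifted by 0.2 eV are farther apart than two shifted by 0.1 eV.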

**Fig. 4: Hierarchical clustering of the materials, based on the Wasserstein distances.**

For a clustering threshold corresponding to a Wasserstein distance of 2.5 × 10⁻³, we arrive at a total of 19 distinct clusters, with potassium (K) forming its own, isolated, unnumbered cluster. The separation between clusters 11–19 and clusters 1–10 is considerable. The distinguishing feature is the broadness of the AEDs. The distributions in clusters 11–19 are noticeably broader than in clusters 1–10. Representative examples are depicted in Fig. 3a. The four materials at the bottom of the figure (Y₃Zn₁₁, V₆Ga₅, Y₃In₅, V₄Zn₅) pertain to cluster 18, whereas the rest belong to clusters 8–10. Further details are available in Supplementary Fig. S1 of the Supplementary Information, presenting the clustering of all considered materials. AEDs exhibit variability across distinct clusters (1–10) but show remarkable similarity within each individual cluster. For example, the AEDs for Ni, CuZn, NiZn, and ZnRh illustrated in Fig. 3d–g belong to the same cluster.
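Cutting a Ward-linkage dendrogram at a distance threshold can be sketched with SciPy. The distance matrix, material labels, and threshold below are toy values chosen for illustration, not the values of this study. Note also that Ward linkage is formally defined for Euclidean distances, so applying it to a precomputed Wasserstein distance matrix, as done here, is an assumption of the sketch.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Toy symmetric distance matrix for four hypothetical materials.
names = ["A", "B", "C", "D"]
D = np.array([[0.0, 0.1, 0.9, 1.0],
              [0.1, 0.0, 0.8, 0.9],
              [0.9, 0.8, 0.0, 0.1],
              [1.0, 0.9, 0.1, 0.0]])

# linkage expects the condensed (upper-triangle) form of the matrix.
Z = linkage(squareform(D), method="ward")

# Cut the dendrogram: merges above the threshold are not applied.
labels = fcluster(Z, t=0.5, criterion="distance")

clusters = {}
for name, label in zip(names, labels):
    clusters.setdefault(int(label), []).append(name)
print(clusters)
```

Lowering the threshold splits the dendrogram into more, tighter clusters, while raising it merges them into larger groups such as the macro-cluster discussed in the text.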

Clusters 8 through 10 aggregate into a larger cluster, henceforth called the macro-cluster, with relatively homogeneous AEDs. It encompasses materials such as Cu, a notably active component within known Cu/ZnO/Al₂O₃ catalysts^30,36. The clusters also contain non-Cu materials such as Zn-Pd, Pd-In, Pt-In, and Ni-Zn in different compositions that have been reported as catalytic converters of CO₂ to methanol^36,45. Different compositions of the bimetallic alloys Ga-Ag, In-Ag, K-In, Zn-Rh, and Zn-Pt, which, to our knowledge, have not been tested for CO₂ to methanol conversion, are also grouped with these materials. While most of the materials in the macro-cluster have either shown good catalytic performance or have not yet been tested, potassium (K) (the lone non-numbered cluster), as a pure metal, is likely to undergo rapid oxidation under reaction conditions. Therefore, we anticipate that the macro-cluster, consisting of clusters 8–10, is likely too diverse to pinpoint only catalytically active materials.

Upon closer inspection, ZnRh and ZnPt₃ stand out as new candidates. They are part of cluster 10, which also includes Ga₂Cu, NiZn, InPt₃, Ni and mainly CuZn, but have not been tested for CO₂ to methanol conversion. While the exact material composition of the Cu and ZnO-based catalysts during the reaction is still debated^46,47,48, some studies suggest that the formation of a Cu-Zn alloy enhances the activity. This Cu-Zn alloy, which is part of cluster 10, is believed to contribute to increased activity^30,47,49. Similarly, NiZn has also been identified as an effective CO₂ catalyst⁴⁵. Catalysts such as Cu/Ga/ZnO⁵⁰, Cu/Ga/SiO₂⁵¹ and Pt/In₂O₃⁵², known for their high methanol yield, may include Ga₂Cu and InPt₃ alloys, respectively. Finally, Ni is often part of catalysts for CO₂ transformation to methane^53,54. The strong catalytic activity of Ga₂Cu, NiZn, InPt₃, Ni, and mainly CuZn in this cluster suggests that ZnRh and ZnPt₃ should also have high activity.

To conclude this section, we reiterate that our approach groups catalyst materials according to their computed AED similarity. To ascribe meaning to certain similarities we observe, we currently rely on experimental reports of catalytically active materials. Our proposals for interesting candidates are based on the assumption that AED similarity with a known good catalyst is a meaningful indicator for promising catalytic activity. Since the catalyst composition and microstructure are often not reported or not known, “active sites” or details of the catalytic mechanisms also remain opaque^30,46,48,55. In this context, our AED descriptor remains an attempt to find proxies for complex processes. It goes beyond the common practice of focusing on single adsorption energies in “active sites”, but could certainly be extended in future work and in collaboration with more detailed experimental investigations.

Statistical analysis and discussion

AEDs could serve as a descriptor of activity; however, the large number of parameters (at least 388 bins in the distribution) makes it challenging to analyze them manually. To gain further insight into the generated data, we conducted a statistical analysis of AEDs (SAAEDs) that facilitates comparison with previous adsorption energy-based studies. An example can be seen in Table 2, where we present the minimum adsorption energies for a subset of materials featured in Fig. 3a.

Table 2 Extract of the statistical analysis on the AEDs

Our SAAED analysis revisits individual binding energies and connects to the Sabatier principle. For instance, the results for *OH, *OCHO, and *OCH₃ can be compared to the volcano plot in Studt et al.⁴⁹, which relates the catalytic activity of the studied materials to the oxygen adsorption energy. In line with our approach, their work compares potential catalyst materials to Cu, although their focus lies on single-facet surfaces. Following previous findings that the Cu(211) facet is more active than the close-packed Cu(111) surface⁴⁷, Studt et al. use Cu(211) as their reference. The catalytic activity of Cu is further enhanced when Zn is added to the Cu(211) surface (referred to as Cu+Zn in the article). The oxygen adsorption energy decreases upon Zn addition, which indicates that the optimal oxygen adsorption energy should be lower than its minimal adsorption energy on the Cu(211) surface. Our data are consistent with those findings for the majority of our promising candidate materials. The minimal adsorption energy (E_ads^min) for all the oxygen-containing adsorbates on the majority of the materials in cluster 10 (highlighted in Table 2), including ZnRh, lies below that of Cu (our Cu data also covers the (211) surface) and is closely aligned across the materials. The exceptions are InPt₃ and ZnPt₃, for which the minima lie slightly above those of Cu, while both materials exhibit similar E_ads^min for all other adsorbates. This difference suggests that InPt₃ and ZnPt₃ may feature slightly different CO₂ conversion mechanisms.

Using minimum adsorption energies derived from ML models is comparable to previously studied methods for identifying global minima^13,21. Although the techniques by Lan et al.²¹ and Chen et al.¹³ might be more appropriate for the straightforward application of the Sabatier principle, our approach excels in providing more comprehensive information on various facets of catalytic materials. We have compiled this information for selected materials in Table S1 in the Supplementary Information. For example, the AED spread across energies, which can be deduced from the standard deviation E_ads^std, provides information about the percentage of the surface area usable for catalytic conversion.
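Summary statistics of this kind can be computed per adsorbate and material with the Python standard library. The sketch below is an assumption about what such a summary might contain: only the minimum and the standard deviation are named in the text, and the remaining entries, the function name `saaed`, and the toy energies are illustrative choices.

```python
from statistics import mean, median, pstdev

def saaed(energies):
    """Summary statistics of one AED (all energies in eV)."""
    return {
        "min": min(energies),      # strongest binding site, cf. E_ads^min
        "max": max(energies),
        "mean": mean(energies),
        "median": median(energies),
        "std": pstdev(energies),   # spread of the AED, cf. E_ads^std
    }

# Toy adsorption energies for one adsorbate on one material, invented for illustration.
energies = [-1.0, -0.5, 0.0]
stats = saaed(energies)
print(stats)
```

A handful of such numbers per adsorbate is far easier to tabulate and compare across materials than several hundred histogram bins.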

Ultimately, both AEDs and SAAEDs, available on Zenodo⁴³, can serve as material fingerprints. The SAAED acts like a materials descriptor, similar to the Magpie descriptor¹⁶, but can be adapted to specific reactions through the choice of adsorbates, offering more detailed and relevant material information. Optionally, specific descriptors (AED, SAAED) and general descriptors (like Magpie¹⁶) may be combined to enhance the information that might be lacking in ML models from theoretical calculations.

Both catalyst descriptors are tailored to perform extensive searches for catalytically active candidates. They do not, however, include effects of the support, additives, preparation procedures and operando states that could change the morphology of the catalyst (e.g., nanoparticle sizes or areas of different facets). Our proposed AED descriptor does not take the facet area into account and is therefore insensitive to morphology changes of the catalysts under reaction conditions. In principle, a Wulff construction could introduce better facet information. However, it also cannot account for support effects, additives or preparation conditions.

Our methodology facilitates high-throughput screening of metallic catalyst candidates. At present, the effects of co-operating oxides such as ZnO, In₂O₃, and ZrO₂ that have been observed experimentally are not considered. Such oxides affect the electronic structure and adsorption energy landscape at the metal-oxide interface^46,55,56,57,58 and should be included in future versions of our descriptor. For instance, incorporating general descriptors (e.g., Magpie) for co-catalysts and support materials could provide additional information to decide which active material-support combination should be investigated further in experimental testing.

Additionally, the choice of adsorbates can also influence the effectiveness of the AED descriptor. Our study focuses on the four most relevant intermediates observed on the Cu(211) surface³⁰. Studies on different materials, such as Ni-ZrO₂^58,59, suggest that other intermediates or by-products like CO could play an important role in the hydrogenation mechanism. Thus, further investigations could extend the set of adsorbates to better capture various reaction paths and, therefore, material-specific activity.

Our workflow clusters materials with high CO₂ conversion efficiency, but the materials can vary in their selectivity towards methanol or methane^36,45,50,52,60. As the reaction conditions, preparation procedures, or the interaction with support materials seem to affect the selectivity^45,60, our proposed catalyst candidates should be tested under various conditions to investigate their optimal selectivity towards methanol.

To finalize the analysis of our results, the similarity of the SAAED and Wasserstein distances of ZnRh and ZnPt₃ to good catalysts in the literature suggests that they could be potential catalyst candidates. As Cu-based catalysts are known for their vulnerability to degradation⁵, it is therefore reasonable to pre-examine these materials also in terms of stability. Given the harsh reaction conditions, mainly temperatures around 800 K³⁶, the melting temperature of the catalyst is directly related to the stability of the catalyst. The melting temperature of both ZnRh and ZnPt₃ is higher than that of pure copper or CuZn³⁷, suggesting that our new candidates could also be more durable.

Summary

In summary, we have established a fast and reliable computational approach for discovering new catalyst candidates for the conversion of CO₂ to methanol utilizing data-driven methodologies such as MLFFs and hierarchical clustering. Beginning with a list of potential metallic elements, we extracted experimentally verified materials from the Materials Project database. By integrating tools from fairchem, mainly OCP MLFFs, we created an extensive database of adsorption energies for a wide range of material facets and possible adsorption sites. We compiled this information to obtain a novel material descriptor, AED, which offers a more effective representation of the complex nature of heterocatalysts compared to standard methods. By carefully choosing the adsorbates, the descriptor can be tailored to provide the most information for any heterocatalytic reaction under study. Through efficient sampling for validation, we were able to quantify the quality of our workflow with a minimal number of DFT calculations while ensuring the high quality of our database. We grouped the materials by their AED similarity using statistical methods and clustering. This allowed us to pinpoint promising new candidates, namely ZnRh and ZnPt₃, based on their resemblance to known effective catalysts. Our results indicate that AEDs, together with statistical analysis, can serve as material fingerprints, aiding in the prediction of catalyst activity and accelerating the discovery process.
