AI-Driven Screening and Prediction for Selected Advanced Energy Materials
This repository contains the database, documentation, Python library (coming soon), and notebooks used to build the Energy-GNoME database.
The purpose of this repository is to enable reproducibility and, more importantly, to let you contribute your own data points for model training: Energy-GNoME is designed as a living database that grows as new data are integrated.
For further details, refer to the associated article:
De Angelis P., Trezza G., Barletta G., Asinari P., Chiavazzo E. "Energy-GNoME: A Living Database of Selected Materials for Energy Applications". arXiv, November 15, 2024. doi: 10.48550/arXiv.2411.10125.
If you find this project valuable, please consider citing the following pre-print work:
De Angelis P., Trezza G., Barletta G., Asinari P., Chiavazzo E. "Energy-GNoME: A Living Database of Selected Materials for Energy Applications". arXiv, November 15, 2024. doi: 10.48550/arXiv.2411.10125.
@misc{deangelis_energy-gnome:_2024,
title = {Energy-{GNoME}: {A} {Living} {Database} of {Selected} {Materials} for {Energy} {Applications}},
shorttitle = {Energy-{GNoME}},
url = {http://arxiv.org/abs/2411.10125},
doi = {10.48550/arXiv.2411.10125},
abstract = {Artificial Intelligence (AI) in materials science is driving significant advancements in the discovery of advanced materials for energy applications. The recent GNoME protocol identifies over 380,000 novel stable crystals. From this, we identify over 33,000 materials with potential as energy materials forming the Energy-GNoME database. Leveraging Machine Learning (ML) and Deep Learning (DL) tools, our protocol mitigates cross-domain data bias using feature spaces to identify potential candidates for thermoelectric materials, novel battery cathodes, and novel perovskites. Classifiers with both structural and compositional features identify domains of applicability, where we expect enhanced accuracy of the regressors. Such regressors are trained to predict key materials properties like, thermoelectric figure of merit (zT), band gap (Eg), and cathode voltage ($\Delta V_c$). This method significantly narrows the pool of potential candidates, serving as an efficient guide for experimental and computational chemistry investigations and accelerating the discovery of materials suited for electricity generation, energy storage and conversion.},
urldate = {2024-12-03},
publisher = {arXiv},
author = {De Angelis, Paolo and Trezza, Giovanni and Barletta, Giulio and Asinari, Pietro and Chiavazzo, Eliodoro},
month = nov,
year = {2024},
note = {arXiv:2411.10125},
keywords = {Condensed Matter - Materials Science, Condensed Matter - Other Condensed Matter, Computer Science - Machine Learning},
}
Additional articles to cite:
- GNoME Database: please also consider citing the foundational GNoME database work:
  Merchant A., Batzner S., Schoenholz S.S. et al. "Scaling deep learning for materials discovery". Nature 624, 80-85, 2023. doi: 10.1038/s41586-023-06735-9.
- E(3)NN Model: and the E(3)NN graph neural network model:
  Chen Z., Andrejevic N., Smidt T. et al. "Direct Prediction of Phonon Density of States With Euclidean Neural Networks". Advanced Science 8 (12), 2004214, 2021. doi: 10.1002/advs.202004214.
- Databases:
  - Cathodes
  - Perovskites
  - Thermoelectrics
- Dashboards:
  - Cathodes
  - Perovskites
  - Thermoelectrics
- `energy-gnome` Python library:
  - Data handler objects
  - Model handler objects
  - CLI `e-gnome`
- Jupyter notebook tutorials:
  - Cathodes
  - Perovskites
  - Thermoelectrics
├── LICENSE <- Open-source license if one is chosen
├── Makefile <- Makefile with convenience commands like `make data` or `make train`
├── README.md <- The top-level README for developers using this project.
├── data
│ ├── external <- Data from third party sources.
│ ├── interim <- Intermediate data that has been transformed.
│ ├── processed <- The final, canonical data sets for modeling.
│ └── raw <- The original, immutable data dump.
│
├── docs <- A default mkdocs project; see www.mkdocs.org for details
│
├── models <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks <- Jupyter notebooks. Naming convention is a number (for ordering),
│ the creator's initials, and a short `-` delimited description, e.g.
│ `1.0-jqp-initial-data-exploration`.
│
├── pyproject.toml <- Project configuration file with package metadata for
│ energy_gnome and configuration for tools like black
│
├── references <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
│ └── figures <- Generated graphics and figures to be used in reporting
│
├── requirements.txt <- The requirements file for reproducing the analysis environment, e.g.
│ generated with `pip freeze > requirements.txt`
│
├── setup.cfg <- Configuration file for flake8
│
└── energy_gnome <- Source code for use in this project.
    │
    ├── __init__.py <- Makes energy_gnome a Python module
    │
    ├── config.py <- Store useful variables and configuration
    │
    ├── dataset.py <- Scripts to download or generate data
    │
    ├── features.py <- Code to create features for modeling
    │
    ├── modeling
    │   ├── __init__.py
    │   ├── predict.py <- Code to run model inference with trained models
    │   └── train.py <- Code to train models
    │
    └── plots.py <- Code to create visualizations
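
The notebook naming convention described in the tree above (an ordering number, the creator's initials, and a short `-` delimited description) can be checked with a small script. The following is only a minimal sketch: the regular expression and the `parse_notebook_name` helper are hypothetical and not part of the `energy-gnome` library, and the assumption that initials are two to four lowercase letters is ours.

```python
import re

# Assumed pattern: "<number>-<initials>-<description>",
# e.g. "1.0-jqp-initial-data-exploration".
NOTEBOOK_NAME = re.compile(
    r"^(?P<order>\d+(?:\.\d+)*)"  # ordering number such as "1.0" or "2"
    r"-(?P<initials>[a-z]{2,4})"  # creator's initials (length is an assumption)
    r"-(?P<description>[a-z0-9]+(?:-[a-z0-9]+)*)$"  # "-" delimited description
)


def parse_notebook_name(stem):
    """Return the parts of a notebook name as a dict, or None if it does not match."""
    match = NOTEBOOK_NAME.match(stem)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # The example name comes from the directory tree above.
    print(parse_notebook_name("1.0-jqp-initial-data-exploration"))
    # A name that ignores the convention yields None.
    print(parse_notebook_name("Untitled"))
```

Pass the file name without its `.ipynb` extension (for example `pathlib.Path(name).stem`); a `None` result flags a notebook that should be renamed before it is committed.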
The database, along with all associated dashboards and code, will be made available on November 30th.
For further information or to join the waiting list, please contact Paolo De Angelis.