RESEARCH STATEMENT
Broad Area of Research: Computational Metallurgy
Project Title: Computational, Statistical & Experimental studies of MPEAs for Dental Implants using Machine Learning & Deep Learning
Abstract: Multi-principal element alloys (MPEAs) are attracting growing attention in metallurgy and open up a wide design space to explore. Titanium, which is costly, is currently the most widely used material for dental implants. The aim of this project is to make MPEAs a viable and feasible option for dental implants, thereby improving efficiency, reducing cost and increasing biological feasibility. Designing MPEAs for dental applications involves several stages: phase analysis, structure analysis, biocompatibility checks, synthesis, economic optimization and an assessment of sociopolitical feasibility. Researchers have found that MPEAs are potentially better candidates than conventional alloys, but they remain comparatively unexplored. The purpose of the current project is to find alternatives to the titanium-based alloys from which dental implants are currently made. To establish the feasibility of MPEAs, each step of design and development must be carefully scrutinized. At present, machine learning algorithms are applied to data sets collected from published experimental work to predict phase and structure. Classifying the phases and structures helps to identify which combination of elements results in which phase or structure. Six algorithms from scikit-learn (a machine learning library) are used for both phase and structure prediction: Decision Tree, KNN, Logistic Regression, Gaussian Naïve Bayes, Random Forest and Support Vector Machine (SVM). The findings show that the SVM model is best suited to phase prediction and the Random Forest model to structure prediction. Using the same data sets, both machine learning models (scikit-learn) and deep learning models (artificial neural networks, ANNs, built in TensorFlow) were successfully developed.
Literature: Multi-principal element alloys (MPEAs) are a novel class of materials in which three or more principal elements are blended in equal or near-equal proportions, each at an atomic percentage of 5-35%. Minor elements may also be present, each at a proportion of less than 5%. The term "High Entropy Alloys" (HEAs) was coined because, when many elements are mixed in nearly equal proportions, the entropy of mixing is substantially higher. In other words, the random solid-solution or liquid states of these alloys have significantly higher mixing entropies than those of conventional alloys. The effect of entropy is therefore much more pronounced in HEAs, where it stabilizes the solid-solution phase.
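The entropy argument above can be made concrete with the ideal configurational entropy of mixing, ΔS_mix = -R Σ c_i ln c_i, which reduces to R ln n for an equiatomic alloy of n elements. The following is a minimal sketch in Python; the function name and the two example compositions are illustrative assumptions and are not taken from the project data.

```python
import math

R = 8.314  # molar gas constant, J/(mol*K)

def mixing_entropy(fractions):
    """Ideal configurational entropy of mixing, -R * sum(c_i * ln c_i), in J/(mol*K).

    `fractions` may be given in at.% or as mole fractions; they are
    normalised to sum to 1 before use.
    """
    total = sum(fractions)
    if total <= 0:
        raise ValueError("fractions must sum to a positive value")
    c = [f / total for f in fractions if f > 0]  # zero entries contribute nothing
    return -R * sum(ci * math.log(ci) for ci in c)

# Hypothetical compositions, for illustration only.
equiatomic_5 = mixing_entropy([20, 20, 20, 20, 20])  # equals R * ln(5)
dilute_binary = mixing_entropy([95, 5])              # conventional-alloy-like case

print(f"equiatomic 5-element alloy: {equiatomic_5:.2f} J/(mol*K)")
print(f"95/5 binary alloy:          {dilute_binary:.2f} J/(mol*K)")
```

The equiatomic five-element case gives R ln 5, several times the value for the dilute binary, which is the sense in which the mixing entropy of an MPEA is "substantially higher".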
In MPEAs, the random field of different elements unlocks multiple pathways for dislocation movement, a feature that is absent in conventional alloys. The multi-principal-element character of HEAs leads to some important effects that are much less pronounced in conventional alloys. These can be summarized as four core effects: the high-entropy effect, the sluggish-diffusion effect, the lattice-distortion effect and the cocktail effect.
Work Progress: This project is divided into several sections. The first is computational modelling, in which four models have been developed so far: an ML model for MPEA phase prediction using scikit-learn, an ML model for MPEA structure prediction using scikit-learn, a DL model for MPEA phase prediction using an ANN in TensorFlow, and a DL model for MPEA structure prediction using an ANN in TensorFlow. The models were compared to determine which is best suited to each prediction task. XGBoost and Random Forest classifiers were also studied. Accuracy, precision, recall and F1 score were calculated for each model.
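For reference, the four metrics named above can be computed for a multi-class problem as in the sketch below. It uses only the Python standard library. The macro (unweighted) averaging over classes is an assumption, since this statement does not specify which averaging was used, and the label vectors are made up for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, per-class (precision, recall, F1) and their macro averages."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    per_class = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in pairs)  # correctly predicted as c
        fp = sum(t != c and p == c for t, p in pairs)  # wrongly predicted as c
        fn = sum(t == c and p != c for t, p in pairs)  # class c that was missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[c] = (precision, recall, f1)

    # Macro average: every class counts equally, regardless of its size.
    macro = tuple(sum(m[i] for m in per_class.values()) / len(per_class)
                  for i in range(3))
    return accuracy, per_class, macro

# Made-up labels for three classes, for illustration only.
y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 2, 0]
accuracy, per_class, macro = classification_metrics(y_true, y_pred)
print(accuracy, per_class, macro)
```

Because accuracy is computed over all samples while precision, recall and F1 are computed per class and then averaged, the two kinds of figure can rank models differently, especially when classes are unevenly represented.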
Results & Findings: The data sets for phase prediction and structure prediction were collected from various research journals and repositories. The phase-prediction data set contains the alloy name, number of components, atomic fraction, atomic radius, atomic size difference, melting temperature, enthalpy of mixing, entropy of mixing, electronegativity, valence electron concentration, bulk modulus, omega parameter and Gibbs free energy. Several other features are planned for inclusion in the near future. The data was processed, cleaned and standardized to avoid errors. Pearson and Spearman correlation matrices were computed and the important features were selected for modelling. The output variable was encoded with numerical labels. The data was split into training and test sets, on which the models were trained and evaluated. Six machine learning algorithms were used: Decision Tree, KNN, Logistic Regression, Gaussian Naïve Bayes, Random Forest and Support Vector Classification (SVC). The Decision Tree algorithm achieved an accuracy of 79.56%, with precision 0.86, recall 0.86 and F1 score 0.86. The KNN algorithm achieved an accuracy of 83.42%, with precision 0.83, recall 0.90 and F1 score 0.86. The Logistic Regression algorithm achieved an accuracy of 81.77%, with precision 0.81, recall 0.86 and F1 score 0.83. For Logistic Regression, ROC curves were also plotted. The area under the curve (AUC) is highest for class 0 and class 4 (0.97), followed by class 2 and class 3 (both 0.94); class 1 has the lowest AUC (0.89). The classes are labelled as follows: class 0 = amorphous phase, class 1 = (amorphous + intermetallic) phase, class 2 = intermetallic phase, class 3 = (intermetallic + solid solution) phase, class 4 = solid solution phase. This indicates that the Logistic Regression model distinguishes the solid solution phase and the amorphous phase from all other phases more reliably than it distinguishes the remaining classes.
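A per-class AUC of the kind quoted above is usually obtained one-vs-rest: one class is treated as positive, all others as negative, and the AUC equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition; the function name, labels and scores are hypothetical, and the assumption that the reported values were computed one-vs-rest is ours.

```python
def one_vs_rest_auc(y_true, scores, positive_class):
    """AUC for `positive_class` against all other classes.

    `scores` holds the model's predicted probability (or any ranking score)
    for `positive_class`, one value per sample. Ties count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == positive_class]
    neg = [s for y, s in zip(y_true, scores) if y != positive_class]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up example: scores are hypothetical probabilities for class 4
# (solid solution); three samples truly belong to class 4.
y_true = [4, 4, 4, 0, 2, 3, 1]
p_class4 = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.1]
auc_class4 = one_vs_rest_auc(y_true, p_class4, positive_class=4)
print(f"one-vs-rest AUC for class 4: {auc_class4:.3f}")
```

Here 11 of the 12 positive/negative pairs are ranked correctly, so the AUC is 11/12. The pairwise loop is quadratic in the number of samples, which is acceptable for data sets of the size discussed here.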
The Gaussian Naïve Bayes algorithm achieved an accuracy of 70.16%, with precision 0.89, recall 0.84 and F1 score 0.86. The Random Forest algorithm achieved an accuracy of 83.98%, with precision 0.84, recall 0.86 and F1 score 0.85. The SVC algorithm achieved an accuracy of 85.63%, with precision 0.85, recall 0.87 and F1 score 0.86. Of the six machine learning algorithms, SVC was found to be the best classifier for the phases in the prepared data set and Gaussian Naïve Bayes the weakest.
The structure-prediction data set contains the alloy name, number of components, atomic fraction, atomic size difference, enthalpy of mixing, entropy of mixing, electronegativity (RMS value) and valence electron concentration. Several other features are planned for inclusion in the near future. The data was processed, cleaned and standardized to avoid errors. A Pearson correlation matrix was computed and the important features were selected for modelling. The output variable was encoded with numerical labels. The data was split into training and test sets, on which the models were trained and evaluated. The same six machine learning algorithms were used: Decision Tree, KNN, Logistic Regression, Gaussian Naïve Bayes, Random Forest and Support Vector Classification (SVC).
The Decision Tree algorithm achieved an accuracy of 89.13%, with precision 0.93, recall 0.93 and F1 score 0.93. The KNN algorithm achieved an accuracy of 93.48%, with precision 0.93, recall 0.93 and F1 score 0.93. The Logistic Regression algorithm achieved an accuracy of 93.47%, with precision 0.93, recall 1.00 and F1 score 0.97. For Logistic Regression, ROC curves were also plotted. The area under the curve (AUC) is highest for class 0 and class 3 (1.00); class 1 and class 2 share the same AUC (0.97). The classes are labelled as follows: class 0 = BCC structure, class 1 = FCC structure, class 2 = (FCC + BCC) structure, class 3 = HCP structure. This indicates that the Logistic Regression model distinguishes the BCC structure and the HCP structure from all other structures more reliably than it distinguishes the remaining classes. The Gaussian Naïve Bayes algorithm achieved an accuracy of 93.48%, with precision 0.88, recall 1.00 and F1 score 0.93. The Random Forest algorithm achieved an accuracy of 97.82%, with precision 0.93, recall 1.00 and F1 score 0.97. The SVC algorithm achieved an accuracy of 89.13%, with precision 0.82, recall 1.00 and F1 score 0.90. Of the six machine learning algorithms, Random Forest was found to be the best classifier for the structures in the prepared data set and SVC the weakest.
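The workflow described above (standardize, split, train six classifiers, compare) can be sketched with scikit-learn as follows. This is an illustrative outline only: the data are synthetic (generated with `make_classification` as a stand-in for the descriptor table), and the hyperparameters, split ratio and random seeds are assumptions, not the settings used in this project, so the printed scores bear no relation to the results reported here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the alloy descriptors (NOT the project's data):
# 8 numeric features, 4 classes (playing the role of BCC, FCC, FCC+BCC, HCP).
X, y = make_classification(n_samples=400, n_features=8, n_informative=6,
                           n_redundant=0, n_classes=4,
                           n_clusters_per_class=1, random_state=0)

# Stratified split keeps the class proportions the same in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Assumed, mostly default hyperparameters.
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gaussian Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVC": SVC(kernel="rbf"),
}

results = {}
for name, model in models.items():
    # The scaler is fitted on the training data only, inside the pipeline,
    # so no information from the test set leaks into standardization.
    pipeline = make_pipeline(StandardScaler(), model)
    pipeline.fit(X_train, y_train)
    y_pred = pipeline.predict(X_test)
    results[name] = (accuracy_score(y_test, y_pred),
                     f1_score(y_test, y_pred, average="macro"))

for name, (acc, f1) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{name:22s} accuracy={acc:.3f}  macro-F1={f1:.3f}")
```

With data sets as small as those described here, a single train/test split can make the ranking of models sensitive to the random seed; repeating the split or using cross-validation is one way to check how stable the comparison is.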
When deep learning models were developed using ANNs in TensorFlow, the following conclusions were drawn.
MPEA PHASE PREDICTION: The data set is highly imbalanced and the number of samples is small. Because of this imbalance, the deep learning model does not perform well (accuracy 41%). With so few samples, it is better to train a machine learning model such as Random Forest (accuracy obtained: 86%). From the t-SNE plot we conclude that a linear model cannot be used for this data set. Random Forest or other nonlinear ML models will give good results for phase prediction.
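The degree of imbalance can be quantified before any model is trained. The sketch below, using only the standard library, reports the class counts, the accuracy a trivial majority-class predictor would reach, and "balanced" class weights computed as n_samples / (n_classes * class_count), the heuristic scikit-learn applies for `class_weight="balanced"`. The label counts are made up, and class weighting is shown only as one possible mitigation; this statement does not say it was used.

```python
from collections import Counter

def imbalance_report(labels):
    """Return class counts, majority-class baseline accuracy and balanced weights."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    # Accuracy obtained by always predicting the most frequent class.
    majority_baseline = max(counts.values()) / n_samples
    # Rare classes get proportionally larger weights.
    weights = {c: n_samples / (n_classes * k) for c, k in counts.items()}
    return counts, majority_baseline, weights

# Hypothetical phase labels (0-4 as defined above); counts are illustrative only.
labels = [4] * 60 + [2] * 25 + [3] * 10 + [0] * 4 + [1] * 1
counts, baseline, weights = imbalance_report(labels)

print("class counts:", dict(counts))
print(f"majority-class baseline accuracy: {baseline:.2f}")
for c in sorted(weights):
    print(f"class {c}: weight {weights[c]:.2f}")
```

Comparing a model's accuracy against the majority-class baseline shows whether it has learned anything beyond the class frequencies, which is a useful check when a reported accuracy is low.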
MPEA STRUCTURE PREDICTION: The number of data points is very small. Both ML and DL models worked well. With more data points, the deep learning model would likely have given better results. Linear models also work reasonably well, but they are not able to separate class 1 and class 3. XGBoost gives the best result, with an accuracy of 94%. The deep learning model gives 84% and may reach a higher accuracy if more samples become available.
FUTURE PLANS:

Application of CALPHAD for phase prediction of MPEAs using Thermo-Calc

Application of AiiDA with Quantum ESPRESSO to MPEAs

Monte Carlo Simulation for Biomedical Multi-Principal Element Alloys (Data-Driven Approach)

Synthesis of MPEAs for Biomedical Applications (Elements to be decided later)

Development of a database for MPEAs using the Materials Genome Initiative (MGI).

To incorporate vibrational entropy, magnetic-dipole entropy and electronic-randomness effects, along with configurational entropy, in phase prediction of MPEAs.

Comparative and combined study of DFT computation, MD calculation, CALPHAD modelling and high-throughput methods of designing MPEAs.

Plotting stress-strain curves for MPEAs under the serration effect.

Developing a new original model for newly composed MPEAs and predicting their properties.

Experimental validation of the above model by synthesizing titanium-based MPEAs and studying their properties to assess their applicability in dental implants.