Application of Ensemble Learning Paradigms in Predicting Interfacial Tension of H2/Cushion Gas Systems and the Implications on Subsurface H2 Storage Conference Paper

abstract

  • The role of hydrogen geo-storage and production in addressing global warming and energy demand concurrently cannot be overstated. Diverse factors such as interfacial tension (IFT) and wettability influence safe and effective hydrogen geo-storage and production. The IFT controls the maximum H2 storage column height, capacity, and capillary entry pressure. Current laboratory experimental techniques for IFT determination of H2/cushion gas systems are resource-intensive. Nonetheless, the extensive experimental IFT data supports machine learning (ML) deployment to determine IFT time-efficiently and cost-effectively. Hence, this work evaluated the predictive capabilities of supervised ML paradigms including random forest, extra trees regression, gradient boosting regression (GBR), and light gradient boosting machine, wherein the novelty of the study lies. A comprehensive dataset comprising 2564 IFT instances was gathered from the literature, encompassing the independent variables pressure (0.10–45 MPa), temperature (20–176 °C), brine salinity (0–20 mol/kg), and hydrogen, methane, carbon dioxide, and nitrogen mole fractions (0–100 mol.%). The data was pre-processed and split into 70% for model training and 30% for testing. Statistical metrics and visual representations were utilized for quantitative and qualitative assessments of the models. The Leverage approach was subsequently applied to classify the different data categories and verify the statistical validity of the database and the reliability of the constructed paradigms. The impact of the independent variables on IFT prediction was evaluated using Spearman correlation, permutation importance, and Shapley Additive Explanations (SHAP). Nitrogen and CO2 mole fractions demonstrated the least and greatest impact, respectively, on H2/cushion gas/brine IFT based on correlation analysis, permutation importance, and SHAP.
Generally, the developed paradigms successfully captured the underlying relationships between the independent variables and IFT, recording an overall R2 > 0.97, MAE < 1.30 mN/m, RMSE < 2 mN/m, and AARD < 2.3%. Nonetheless, the GBR model demonstrated superior predictive performance, yielding the highest R2 and the lowest MAE, RMSE, and AARD of 0.987, 0.507 mN/m, 0.901 mN/m, and 0.906%, respectively. GBR also provided more accurate IFT results for pure H2/water and H2/cushion gas systems than empirical and molecular dynamics-based correlations developed by other scholars. Only 0.43–2.11% of the dataset was outside the validity range, underscoring the statistical validity of the database and the reliability of the models. The developed paradigms are beneficial tools in the toolbox of domain experts, which could fast-track workflows and minimize uncertainties surrounding conventional IFT determination techniques for aqueous H2 systems. This progress is promising for mitigating hydrogen loss and optimizing strategies in H2 geo-storage and production.
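
The abstract reports R2, MAE, RMSE, and AARD without defining them. The sketch below shows one common way such metrics are computed, as an illustration only: the function name `regression_metrics` is hypothetical, the AARD definition (mean of absolute relative deviations, in percent) is an assumption about the convention used, and the sample values are invented and are not taken from the paper's dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, RMSE and AARD (%) for paired measured/predicted values.

    Assumed definitions (not confirmed by the abstract):
      MAE  = mean(|pred - true|)
      RMSE = sqrt(mean((pred - true)^2))
      AARD = 100 * mean(|pred - true| / |true|)
      R2   = 1 - SS_res / SS_tot
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)  # residual sum of squares

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(ss_res / n)
    # Relative deviation is undefined for a measured value of zero.
    aard = 100.0 / n * sum(abs(e) / abs(t) for e, t in zip(errors, y_true))

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot

    return {"R2": r2, "MAE": mae, "RMSE": rmse, "AARD": aard}


# Invented IFT values in mN/m, for illustration only.
measured = [50.0, 60.0, 70.0]
predicted = [51.0, 59.0, 70.0]
metrics = regression_metrics(measured, predicted)
```

Because MAE and RMSE carry the units of IFT (mN/m) while R2 and AARD are dimensionless, reporting all four, as the abstract does, gives both absolute and relative views of model error.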

authors

  • Turkson, Joshua Nsiah
  • Md Yusof, Muhammad Aslam bin
  • Tackie-Otoo, Bennet Nii
  • Darkwah-Owusu, Victor
  • Sokama-Neuyam, Yen Adams
  • Fjelde, Ingebret

publication date

  • 2025

start page

  • D012S002R001