Cross-well machine learning prediction of sonic logs in Newfoundland and Labrador

Sci Rep. 2026 Jan 15. doi: 10.1038/s41598-026-36053-9. Online ahead of print.

Abstract

Predicting compressional slowness (DTCO) from non-sonic logs can reduce acquisition cost, fill data gaps, and support field planning. We evaluate blind cross-well DTCO prediction on two offshore Newfoundland and Labrador wells using a strictly leakage-free, features-only strategy: causal lag windows are built from past non-sonic logs, and all sonic and sonic-derived channels are excluded. The pipeline includes deterministic depth conditioning, relative-depth features, multi-scale depth derivatives, rank-aggregated feature selection, and time-aware validation on the training well. We compare three model families: Random Forest (RF), Extreme Gradient Boosting (XGBoost), and a BiLSTM. In this setting, tuned XGBoost with the top 20 predictors and a 10-sample lag attains blind cross-well performance of [Formula: see text], MAE [Formula: see text], and RMSE [Formula: see text] when trained on Well 1 and tested on Well 2; performance in the reverse direction is lower, indicating inter-well distribution shift. RF performs competitively in several configurations, whereas BiLSTM underperforms on these data. Overall, rigorous leakage control, depth-aware feature engineering, and principled feature selection are key drivers of performance, and tree-based ensembles provide strong, data-efficient baselines for cross-well pseudo-sonic prediction.
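
The causal lag-window idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that samples are ordered by increasing depth, that "past" means shallower samples, that the current sample (lag 0) is included alongside the lagged ones, and that rows without a full history are dropped. The function name `causal_lag_features`, the log mnemonics `GR` and `RHOB`, and the toy values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def causal_lag_features(
    logs: Dict[str, Sequence[float]],
    n_lags: int,
    exclude: Sequence[str] = ("DTCO",),
) -> Tuple[List[str], List[List[float]], List[int]]:
    """Build lagged features from the current and shallower samples only.

    Assumption: every log has the same length and is sorted by depth.
    Channels listed in `exclude` (the sonic target and anything derived
    from it) never enter the feature matrix.
    """
    excluded = set(exclude)
    names = [name for name in logs if name not in excluded]
    n_samples = len(next(iter(logs.values())))

    # One column per (log, lag) pair; lag 0 is the current sample.
    columns = [f"{name}_lag{k}" for name in names for k in range(n_lags + 1)]

    rows: List[List[float]] = []
    index: List[int] = []
    # Start at n_lags so every row has a complete window of past samples.
    for i in range(n_lags, n_samples):
        rows.append([logs[name][i - k] for name in names for k in range(n_lags + 1)])
        index.append(i)
    return columns, rows, index


# Toy example with hypothetical values (not data from the study).
toy_logs = {
    "GR": [10.0, 20.0, 30.0, 40.0],
    "RHOB": [2.1, 2.2, 2.3, 2.4],
    "DTCO": [90.0, 88.0, 86.0, 84.0],
}
columns, rows, index = causal_lag_features(toy_logs, n_lags=2)
target = [toy_logs["DTCO"][i] for i in index]  # aligned labels for training
```

Because each row looks only at indices `i - k` with `k >= 0`, no deeper sample and no sonic channel can leak into the predictors. With the abstract's configuration, `n_lags` would be 10 and the resulting columns would then be reduced to the top 20 by the feature-selection step.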