Opening the Black Box: How SubsurfaceAI Makes Machine Learning Predictions Explainable for Porosity Prediction

The Promise and Problem of Machine Learning in Geoscience

Machine learning (ML) has transformed the way geoscientists interpret subsurface data. From seismic attribute volumes to well logs and core data, algorithms can now uncover subtle patterns that might take humans months—or years—to detect manually.
One of the most exciting applications is predicting reservoir properties such as Porosity directly from seismic attributes. Accurate Porosity prediction helps map sweet spots, plan wells, and estimate reserves more confidently.

But there’s a problem: black boxes don’t build trust.

A neural network might deliver beautifully accurate predictions, but if no one understands why it works, few geoscientists will rely on it for decision-making. In geoscience, interpretability isn’t optional—it’s essential.

That’s why, in SubsurfaceAI, we’ve built a suite of explainable machine learning (X-ML) tools that turn opaque neural networks into transparent, scientifically grounded models. These tools help geoscientists see exactly how seismic attributes drive predictions and understand the relationships learned by each model.

This post takes you inside that explainability system, focusing on neural network–based Porosity prediction from seismic attributes. We’ll show how SubsurfaceAI helps users visualize, interpret, and validate their models—so that every prediction can be explained with confidence.

What Is SubsurfaceAI?

SubsurfaceAI is an intelligent subsurface analytics platform designed to help geoscientists, petrophysicists, and data scientists use AI to extract insight from seismic, well, and geological data.
The platform offers a rich family of ML algorithms—ranging from Random Forests and XGBoost to neural networks—for predicting reservoir properties and facies from seismic attributes.

What sets SubsurfaceAI apart is its commitment to transparency. Every model comes with a Performance window that shows not only how well it predicts (accuracy, correlation, errors) but also why it predicts the way it does—through explainability tabs such as:

  • Attribute Importance
  • Partial Dependence (PD) & Individual Conditional Expectation (ICE)
  • SHAP (SHapley Additive exPlanations)
  • Error Distribution and Sample Tables

These tools make complex models interpretable for geoscientists, bridging the gap between data science and geological reasoning.

Why Explainability Matters in Reservoir Modeling

In machine learning, explainability means understanding which inputs drive model outputs and how they interact.
For geoscientists, this translates to answering questions like:

  • Which seismic attributes influence Porosity the most?
  • Do those relationships make geological sense?
  • How consistent is the model’s behaviour across the field?
  • Why did it predict higher Porosity in one zone and lower in another?

Without clear answers, a neural network is just a number generator.
But with explainable tools, we can connect its internal logic back to geological concepts—like impedance contrasts, amplitude strength, or structural curvature.

When geoscientists can see that “increasing RMS amplitude tends to raise predicted Porosity up to a threshold”, or “coherence has a mild negative effect on Porosity in well-layered zones,” trust is built. And trusted models are the only ones that get used.

Inside SubsurfaceAI’s Explainability Framework

When a user trains a neural network model in SubsurfaceAI—say, to predict Porosity from a set of seismic attributes—the platform automatically computes a rich collection of diagnostic and interpretability outputs, grouped under the Performance window.

Let’s explore each component.

  1. The Performance Window — Seeing How Well the Model Works

The Performance window is your starting dashboard. It summarizes model accuracy and stability across data partitions: training, validation, and test.

It provides several key plots:

  • Cross-plots of predicted vs. actual Porosity for each data subset.
    Each cross-plot displays a correlation coefficient (R) so you can instantly assess model fit. A tight cluster along the 45-degree line means strong predictive performance.
  • Error distribution histogram, showing the spread of prediction errors (predicted − actual).
    A symmetric, narrow distribution around zero indicates unbiased predictions.
  • Sample tables listing the actual, predicted, and residual values for every data point.
    These help identify outliers—cases where the model may have extrapolated or encountered unseen conditions.
  • Summary metrics, such as RMSE, MAE, and R² for each partition, give a quantitative sense of model performance and generalization.

Together, these visuals answer the question: Does the model perform well—and consistently—across training, validation, and testing?

Only once the model passes this sanity check does it make sense to dig deeper into why it behaves as it does.
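
As a rough illustration of the kind of summary statistics reported here, the snippet below computes R, RMSE, MAE, and R² for a single data partition using scikit-learn. The array names y_true and y_pred are placeholders for measured and predicted Porosity, not SubsurfaceAI variables, and this is only a sketch of the standard metrics, not the platform's internal code.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

def partition_metrics(y_true, y_pred):
    """Summary statistics for one data partition (training, validation, or test)."""
    r = np.corrcoef(y_true, y_pred)[0, 1]            # correlation coefficient (R)
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    mae = mean_absolute_error(y_true, y_pred)
    r2 = r2_score(y_true, y_pred)
    residuals = y_pred - y_true                      # basis for the error histogram
    return {"R": r, "RMSE": rmse, "MAE": mae, "R2": r2, "residual_std": residuals.std()}
```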

  2. Attribute Importance — Which Seismic Inputs Matter Most?

Next, the Attribute Importance tab ranks input seismic attributes by their relative influence on the model’s predictions.

For neural networks, SubsurfaceAI computes permutation importance—it measures how much the model’s accuracy drops when each attribute’s values are randomly shuffled. The larger the drop, the more important that attribute.
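
The same idea can be reproduced outside the platform with scikit-learn's permutation_importance. This is a minimal sketch, assuming a fitted scikit-learn-style regressor named model, held-out arrays X_val and y_val, and a placeholder list attribute_names; it is not SubsurfaceAI's implementation.

```python
from sklearn.inspection import permutation_importance

# Shuffle each attribute in turn and measure the drop in R² on held-out data.
result = permutation_importance(model, X_val, y_val,
                                scoring="r2", n_repeats=10, random_state=0)

# Rank attributes by mean importance (largest drop in score first).
for idx in result.importances_mean.argsort()[::-1]:
    print(f"{attribute_names[idx]:<20s} {result.importances_mean[idx]:.3f}")
```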

For example, in a Porosity model trained with ten attributes, you might see results like:

Attribute             Relative Importance
RMS Amplitude         28 %
Acoustic Impedance    23 %
Envelope              16 %
Coherence             10 %
Curvature              8 %
Depth                  6 %
Others                < 5 %

This table immediately gives insight: Porosity is primarily driven by amplitude-related attributes and impedance contrasts—intuitively consistent with physical reasoning.

Attribute importance tells you what matters, but not how it matters. For that, we turn to partial dependence and ICE plots.

  3. Partial Dependence (PD) Plots — Understanding Average Effects

A Partial Dependence Plot (PDP) shows how the model’s predicted Porosity changes as one attribute varies, with the effects of all other attributes averaged out over the dataset.

It answers the question: “On average, how does this attribute affect the predicted Porosity?”

For instance, imagine a PDP for RMS amplitude:

  • At low amplitude values (0–15 dB), predicted Porosity increases sharply.
  • Between 15–40 dB, the rise slows and begins to plateau.
  • Beyond 40 dB, the effect flattens or slightly decreases.

This suggests that stronger amplitudes generally indicate higher Porosity—but only up to a point. Beyond that, the relationship saturates, perhaps because other lithologic factors take over.

In SubsurfaceAI, PDPs are interactive: you can select any attribute, zoom into value ranges, and compare across models.
They offer an intuitive, one-at-a-time look at how seismic attributes influence model output.
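
To make the definition concrete, here is a hand-rolled one-dimensional partial dependence calculation. It assumes a fitted model with a predict method and a NumPy feature matrix X; it is a sketch of the standard PD recipe, not SubsurfaceAI's implementation.

```python
import numpy as np

def partial_dependence_1d(model, X, feature_idx, n_grid=30):
    """Average model prediction as one attribute sweeps a grid of values."""
    grid = np.linspace(X[:, feature_idx].min(), X[:, feature_idx].max(), n_grid)
    pd_values = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature_idx] = value                   # force every sample to this value
        pd_values.append(model.predict(X_mod).mean())   # average prediction over the dataset
    return grid, np.array(pd_values)
```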

  4. ICE (Individual Conditional Expectation) Plots — Seeing Sample-Level Variability

While PD plots show the average effect, they can hide individual variations.
That’s where ICE plots come in.

Each line on an ICE plot represents one sample (for example, one well or trace). It shows how that specific sample’s predicted Porosity would change as a single attribute varies.

If all ICE lines move in the same direction, the relationship is stable across the dataset.
If they diverge widely, it means the effect of that attribute depends on other attributes—there are interactions or non-stationarity.

Example: In our Porosity model, ICE curves for RMS amplitude may cluster into two groups:

  • Cluster A: predictions rise steeply with amplitude (high-Porosity sands).
  • Cluster B: predictions flatten early (tight shales).

This tells the geoscientist there are distinct behavioural regimes within the data—valuable geological insight that might motivate building zonal or facies-specific models.
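
ICE curves are simply the partial dependence recipe without the final averaging step: one curve per sample. A minimal sketch, reusing the same assumptions as the PD example above (a fitted model and a NumPy feature matrix X):

```python
import numpy as np

def ice_curves_1d(model, X, feature_idx, n_grid=30):
    """One predicted-response curve per sample as a single attribute is varied."""
    grid = np.linspace(X[:, feature_idx].min(), X[:, feature_idx].max(), n_grid)
    curves = np.empty((X.shape[0], n_grid))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature_idx] = value
        curves[:, j] = model.predict(X_mod).ravel()   # no averaging: keep every sample
    return grid, curves   # averaging curves over axis 0 recovers the PD curve
```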

  5. SHAP (SHapley Additive exPlanations) — Opening the Black Box

The most powerful explainability feature in SubsurfaceAI is the SHAP tab family.

SHAP values come from cooperative game theory.
Imagine each seismic attribute as a “player” in a team game where the “score” is the model’s predicted Porosity.
The SHAP algorithm assigns each player a fair share of the total score based on its average contribution across all possible combinations (coalitions) of players.

Mathematically complex, yes—but the intuition is simple: each SHAP value tells you how much that attribute pushed the prediction up or down relative to the model’s baseline.

SubsurfaceAI displays several SHAP views:

  1. SHAP Summary Plot

A colourful scatter plot ranking attributes by their mean absolute SHAP value (global importance).
Each point represents one sample; colour encodes the actual attribute value.
You can immediately see which attributes drive the model most and whether high values push predictions up or down.

Example interpretation:
In a Porosity model,

  • High RMS amplitude (red dots) → positive SHAP values → increases predicted Porosity.
  • High coherence (red) → negative SHAP values → decreases predicted Porosity (consistent with fewer fractures).

This view links statistics with geology at a glance.
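
An equivalent summary view can be produced for any regressor with the open-source shap package. The sketch below assumes a fitted model with a predict method and a pandas DataFrame X of named seismic attributes; it is illustrative only and is not SubsurfaceAI's internal SHAP engine.

```python
import shap

def predict_fn(data):
    # Flatten possible (n, 1) network outputs to a 1-D vector for SHAP.
    return model.predict(data).ravel()

# KernelExplainer is model-agnostic: it only needs a prediction function and background data.
background = shap.sample(X, 100)
explainer = shap.KernelExplainer(predict_fn, background)
shap_values = explainer.shap_values(X.iloc[:500])    # one SHAP value per sample per attribute

# Beeswarm summary: attributes ranked by mean |SHAP|, points coloured by attribute value.
shap.summary_plot(shap_values, X.iloc[:500])
```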

  2. SHAP Dependence Plot

This plot dives into one attribute, showing how its value relates to its SHAP contribution.
It often reveals non-linear relationships or interactions—you can even colour by another attribute to see combined effects.
E.g., curvature may only increase Porosity when coherence is low—visible as colour gradients.
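
Continuing the shap-based sketch above (reusing shap_values and X from that snippet), a dependence plot for one attribute, coloured by a possible interacting attribute; the column names "Curvature" and "Coherence" are the article's example attributes and are assumed to exist in X.

```python
# SHAP contribution of Curvature vs. its value, coloured by Coherence to expose interactions.
shap.dependence_plot("Curvature", shap_values, X.iloc[:500], interaction_index="Coherence")
```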

  3. SHAP Waterfall or Force Plot (Per Sample)

For a single sample—say, one well—you can view a waterfall diagram showing how each attribute contributed to its specific Porosity prediction.

Example:
A prediction of 21 % Porosity might be explained as:

  • Baseline (mean) prediction: 17 %
  • + 3 % from RMS amplitude
  • + 2 % from impedance
  • − 1 % from depth
  • = 21 % final prediction

For geoscientists, this is transformative: you can finally answer “Why did the model predict this Porosity here?” in quantitative, attribute-based terms.
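
A per-sample breakdown like this can be drawn with shap's waterfall plot. The sketch below reuses predict_fn and X from the earlier SHAP snippet and uses the newer shap.Explainer interface, which returns Explanation objects carrying the baseline value the waterfall plot expects; again, this is an external illustration, not the platform's code.

```python
import shap

# The Explainer API wraps results in Explanation objects, including the baseline (expected value).
explainer = shap.Explainer(predict_fn, shap.sample(X, 100))
explanation = explainer(X.iloc[:500])

# One sample: bars show how each attribute pushes the prediction away from the baseline.
shap.plots.waterfall(explanation[0])
```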

  6. Error Distribution and Sample Listings

Even with good models, not every prediction will be perfect.
That’s why SubsurfaceAI includes detailed error analysis tools:

  • Error histogram: shows whether the model is biased high or low.
  • Residuals vs. predicted values: helps detect heteroscedasticity (error variance that changes with the predicted value).
  • Sample tables: allow sorting by highest errors to inspect outliers, which can reveal poor data quality, missing attributes, or extrapolation.

Interpreting errors geologically often leads to new understanding—sometimes what looks like “error” is actually real heterogeneity in the rock.
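
The same checks are easy to reproduce with matplotlib; a sketch assuming placeholder arrays y_true and y_pred of measured and predicted Porosity:

```python
import numpy as np
import matplotlib.pyplot as plt

residuals = y_pred - y_true

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.hist(residuals, bins=40)                       # bias shows up as an off-centre histogram
ax1.set_xlabel("Predicted − actual Porosity (%)")

ax2.scatter(y_pred, residuals, s=8, alpha=0.5)     # fanning out indicates heteroscedasticity
ax2.axhline(0.0, color="k", lw=1)
ax2.set_xlabel("Predicted Porosity (%)")
ax2.set_ylabel("Residual (%)")
plt.show()

# Largest absolute errors first, to inspect potential outliers in the sample table.
worst = np.argsort(-np.abs(residuals))[:10]
```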

🏗️ A Practical Example: Explaining Porosity Prediction

Let’s walk through a realistic case study using SubsurfaceAI’s explainability suite.

Step 1. Build an ML Model

A geoscientist selects a dataset with well-measured Porosity values and multiple seismic attributes:

  • RMS Amplitude
  • Envelope
  • Acoustic Impedance
  • Coherence
  • Curvature
  • Depth
  • AVO Gradient
  • Phase Attribute

The neural network architecture chosen is a 4-layer MLP with ReLU activations and dropout regularization.
The data is split 70 % for training, 15 % for validation, and 15 % for testing.
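
SubsurfaceAI configures this setup internally, but as a rough external equivalent, a four-layer MLP with ReLU and dropout plus a 70/15/15 split could look like the Keras sketch below. X and y are placeholders for the attribute matrix and measured Porosity, and the layer widths, dropout rate, and training settings are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow import keras

# X: samples x seismic attributes, y: measured Porosity (%); both placeholders here.
X_train, X_hold, y_train, y_hold = train_test_split(X, y, test_size=0.30, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_hold, y_hold, test_size=0.50, random_state=42)

model = keras.Sequential([
    keras.layers.Input(shape=(X.shape[1],)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1),                      # regression output: predicted Porosity
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
model.fit(X_train, y_train, validation_data=(X_val, y_val),
          epochs=200, batch_size=64, verbose=0)
```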

After training, the Performance window shows:

Metric             Training   Validation   Test
Correlation (R)    0.88       0.82         0.80
RMSE (%)           2.7        3.1          3.3

A clean cross-plot with points clustered tightly around the diagonal confirms strong performance, and test metrics close to the training metrics indicate good generalization.

Step 2. Inspect Attribute Importance

The importance ranking shows:

Attribute             Importance (%)
RMS Amplitude         27
Acoustic Impedance    23
Envelope              16
Coherence             10
Curvature              8
Depth                  6
Others                < 5

So amplitude-related attributes dominate, consistent with the physics of reflection strength responding to Porosity and lithology changes.

Step 3. Study PD and ICE Plots

The PDP for RMS amplitude shows a strong positive relationship up to about 40 dB, after which it levels off.
ICE curves reveal two behaviour clusters:

  • In high-Porosity sand zones, the slope remains positive longer.
  • In low-Porosity shales, it plateaus earlier.

This split tells us the model internally distinguishes between lithofacies—something that might not be explicit in the input data.

The PDP for depth shows a weak negative trend: deeper zones predict slightly lower Porosity, matching expected compaction effects.

Step 4. Dive into SHAP Plots

Global view

The SHAP summary plot mirrors the importance ranking: RMS amplitude and impedance have the largest contributions.
Colour gradients show that high RMS amplitude increases predicted Porosity (positive SHAP), while high coherence reduces it.

Local view

Inspecting a specific test sample (Well B at 2,400 m):

Attribute             Value            SHAP Contribution (%)
RMS Amplitude         42 dB            +3.2
Acoustic Impedance    7.4 km/s·g/cc    +1.5
Envelope              0.85             +0.6
Depth                 2,400 m          −1.0
Coherence             0.93             −0.8
Others                                 +0.1
Total (Prediction)                     +3.6 → 20.6 % Porosity

The waterfall plot makes this visual: red bars push prediction up, blue bars push it down.
For Well B, amplitude and impedance lift the prediction above baseline; coherence and depth slightly reduce it.

When the geoscientist compares this to geological interpretation, it aligns perfectly: strong amplitudes in this interval correspond to porous, unconsolidated sands, while high coherence indicates well-layered, less fractured rock.

The result: the model is not only accurate but geologically explainable.

Step 5. Review Error Distribution

The residual histogram is centred near zero with a small positive skew—occasional high predictions.
Investigating outliers using the sample table reveals they occur where the AVO attribute was missing or noisy.
This insight prompts cleaning that attribute or weighting it less in retraining.

Interpreting the Model Geologically

With these explainability tools, geoscientists can go far beyond statistical validation. They can link model behaviour to physical causes:

  • Amplitude → Porosity correlation: Higher amplitudes correspond to greater acoustic contrast, typically higher Porosity.
  • Coherence → Structure control: Lower coherence (discontinuities) corresponds to fractured or faulted zones, enhancing Porosity.
  • Depth trend: Deeper zones show reduced Porosity due to compaction.
  • Curvature interaction: Positive curvature regions (anticlines) sometimes correlate with higher Porosity due to stress-related fracturing.

By confirming that the ML model reproduces these known geological relationships—and quantifying them—SubsurfaceAI enables interpretable, trustworthy predictions.

💡 Best Practices for Explainable Modeling in SubsurfaceAI

  1. Start simple. Use a manageable number of seismic attributes to make interpretability easier.
  2. Validate every step. Don’t rely on a single metric—use cross-plots, PD, SHAP, and error analysis together.
  3. Look for geological sense. If the model suggests a physically implausible relationship, investigate—maybe data bias, maybe a discovery.
  4. Use ICE to detect interactions. Divergent curves often hint at facies mixing or structural influence.
  5. Interrogate outliers. Large residuals often reveal either data quality issues or unique geological features.
  6. Compare models. Train XGBoost or Random Forest on the same dataset and compare SHAP patterns to see if the neural net is discovering consistent behaviour.
  7. Document explainability. Export plots and include them in reports—stakeholders love seeing the “why” behind predictions.

The Bigger Picture — From Predictions to Insight

Explainable machine learning isn’t just about transparency. It’s about discovery.
By revealing the relationships hidden inside neural networks, geoscientists can gain new insight into subsurface processes.

For example:

  • A non-linear PD curve might reveal an amplitude threshold beyond which Porosity no longer increases—hinting at lithology change.
  • SHAP dependence plots might show curvature effects only matter when coherence is low—indicating fracture-dominated Porosity.
  • ICE clusters might correspond to different facies families—suggesting new classification workflows.

In this way, explainability turns ML from a prediction engine into a knowledge engine—bridging human geological reasoning and data-driven learning.

Conclusion — Transparency Builds Trust

The future of AI in geoscience isn’t just more accurate models—it’s more understandable ones.

SubsurfaceAI’s explainability framework brings clarity to complex neural networks through:

  • Interactive Performance dashboards
  • Visual Attribute Importance ranking
  • Partial Dependence and ICE plots for trend understanding
  • Comprehensive SHAP visualizations for both global and local insight
  • Detailed error and sample analysis

For Porosity prediction—and indeed any reservoir property—these tools let geoscientists see how and why the model works, aligning data science with geological intuition.

When you can show that a neural network’s predictions are not magic but measurable, consistent, and physically interpretable, you don’t just get better models—you get trust, adoption, and ultimately better decisions.

SubsurfaceAI: where machine learning meets geology—with transparency built in.