Creating Scatter Plots (Crossplots) of Well Log Data using matplotlib in Python
Use scatter plots to visualise the relationship between variables
Introduction
Scatter plots are a commonly used data visualisation tool. They allow us to determine whether there is a relationship (correlation) between two variables and how strong that relationship is.
Within petrophysics, scatter plots are commonly known as crossplots. They are routinely used as part of the interpretation workflow and can be used for:
- clay and shale end points identification for our clay or shale volume calculations
- outlier detection
- lithology identification
- hydrocarbon identification
- rock typing
- regression analysis
- and more
In this short tutorial we will see how to display scatter plots (crossplots) of data from one of the wells in the Volve dataset.
The notebook for this tutorial can be found here.
The accompanying video for this tutorial can be found on my new YouTube channel at:
Importing Libraries and Loading LAS Data
The first stage of any Python project or notebook is generally to import the required libraries. In this case we are going to be using lasio to load our LAS file, pandas for storing our well log data, and matplotlib for visualising our data.
import pandas as pd
import matplotlib.pyplot as plt
import lasio
The data we are using for this short tutorial comes from the publicly released Equinor Volve dataset, details of which can be found here.
To read the data we will use the lasio library which we explored in the previous notebook and video.
las = lasio.read("Data/15-9-19_SR_COMP.LAS")
The next step is to convert our LAS file into a pandas dataframe. This is quickly achieved by calling upon the .df() method from the lasio library.
To confirm we have the correct data, we can then call upon the .describe() method, which will give us summary information about the data contained within it.
df = las.df()
df.describe()
We can see that we have seven logging curves within this file.
- AC for acoustic compressional slowness
- CALI for borehole caliper
- DEN for bulk density
- GR for gamma ray
- NEU for neutron porosity
- RDEP for deep resistivity
- RMED for medium resistivity
We can also view the first 10 rows of the dataframe by calling upon df.head(10). In our example we can see that only one column, GR, contains valid values. All of the others contain NaN, or Not a Number. This is quite common within well logging datasets, especially at the top of the well where some measurements are not required.
df.head(10)
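To quantify how much of each curve is missing, rather than just inspecting the first few rows, we can count the NaN values per column. The sketch below uses a small made-up dataframe standing in for df; the values and the DEPT index name are illustrative assumptions, not taken from the Volve file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the well log dataframe; all values are made up
demo = pd.DataFrame(
    {
        "GR": [45.2, 50.1, 48.7, 60.3],
        "DEN": [np.nan, np.nan, 2.45, 2.50],
        "NEU": [np.nan, np.nan, np.nan, 22.0],
    },
    index=pd.Index([100.0, 100.5, 101.0, 101.5], name="DEPT"),
)

# Number of missing samples in each curve
missing_counts = demo.isna().sum()
print(missing_counts)

# Keep only the depths where both crossplot curves are present
complete = demo.dropna(subset=["DEN", "NEU"])
print(complete)
```

Note that matplotlib simply skips points where either coordinate is NaN, so dropping rows is optional for plotting, but the counts are a useful check on how much data will actually appear on the crossplot.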
Creating a Crossplot / Scatter plot
Now that we have our data loaded, we can begin creating our first scatter plot / crossplot of our logging data. In particular, we will use the density and neutron porosity measurements. These two measurements are often plotted together when carrying out a petrophysical workflow. From this data we can identify a number of different things about the intervals logged, including hydrocarbon presence, lithology, and bad data.
To create the scatter plot we can call upon the following code.
# Set up the scatter plot
plt.scatter(x='NEU', y='DEN', data=df)
plt.show()
We can see above that we now have a very simple, but not very informative, scatter plot / crossplot. Firstly, the values and the way the data is displayed are different to what we would expect. For a density neutron crossplot, we would expect the bulk density (DEN) on the y-axis to be inverted and go from 3.0 to 2.0 g/cc, and we would generally not expect to see neutron porosity (NEU) on the x-axis above 60%.
We need to reflect these scale ranges on our plot by using xlim and ylim.
Also, to make our plots easier to read, we can set the default plot size for our scatter plots using plt.rcParams.
plt.rcParams['figure.figsize'] = (8, 8)
# Set up the scatter plot
plt.scatter(x='NEU', y='DEN', data=df)
# Change the X and Y ranges
plt.xlim(-5, 60)
# For the y axis, we need to flip by passing in the scale values in reverse order
plt.ylim(3.0, 1.5)
plt.show()
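To check that passing the limits in reverse order really does flip the axis, we can query the axes object afterwards. This is a minimal sketch using three made-up NEU / DEN pairs and the object-oriented equivalents of plt.ylim; the Agg backend line is only there so the example runs without a display and can be dropped in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([10, 20, 30], [2.6, 2.4, 2.2])  # made-up NEU / DEN pairs

# Passing the larger value first flips the direction of the axis
ax.set_ylim(3.0, 1.5)

y_limits = ax.get_ylim()
is_inverted = ax.yaxis_inverted()
print(y_limits, is_inverted)
```

If the data range is not known in advance, ax.invert_yaxis() is an alternative that flips the axis without fixing its limits.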
Adding Labels to the Axes
The scatter plot above is not much use to anyone else, as there are no labels or units on the axes. The reader will not have any idea what each of the axes represents. So we need to tell the reader of the plot what is plotted against what.
We can add these in using plt.xlabel and plt.ylabel.
# Set up the scatter plot
plt.scatter(x='NEU', y='DEN', data=df)
# Change the X and Y ranges
plt.xlim(-5, 60)
# For the y axis, we need to flip by passing in the scale values in reverse order
plt.ylim(3.0, 1.5)
# Add in labels for the axes
plt.ylabel('Bulk Density (DEN) - g/cc', fontsize=14)
plt.xlabel('Neutron Porosity (NEU) - %', fontsize=14)
plt.show()
Excellent! We now know what data is plotted on our plot, and what units they are plotted in.
Adding Colour to the Scatter Plot
We can add a third variable onto our scatterplot through the use of colour. This will allow us to gain additional insights into our data.
For this plot, we will add in the c argument and pass it the Gamma Ray (GR) column from the dataframe.
To control the range of colours shown we need to pass in values to vmin and vmax. In this example, we will set these to 0 and 100.
# Set up the scatter plot
plt.scatter(x='NEU', y='DEN', data=df, c='GR', vmin=0, vmax=100)
# Change the X and Y ranges
plt.xlim(-5, 60)
# For the y axis, we need to flip by passing in the scale values in reverse order
plt.ylim(3.0, 1.5)
# Add in labels for the axes
plt.ylabel('Bulk Density (DEN) - g/cc', fontsize=14)
plt.xlabel('Neutron Porosity (NEU) - %', fontsize=14)
plt.show()
The plot is now colourful, but we do not know what the colours mean. Do the purple/blue colours represent high or low values for our third variable? Also the reader of the plot does not immediately know what the third variable means. To resolve this, we can add a colourbar.
Changing Colormap and Adding Colourbar
There are a few ways to add colorbars to our plot. As we are just using a single plt.scatter call on a single figure, we can call upon plt.colorbar() and then pass in the label we want to display alongside it.
To change the colour map we are using, we can set it to one of the ones at the webpage below using the cmap argument in plt.scatter(). For this example, we will use the rainbow colormap. This will allow low Gamma Ray values to appear in purple/blue and high values to appear in red.
See the colormap reference on matplotlib.org.
# Set up the scatter plot
plt.scatter(x='NEU', y='DEN', data=df, c='GR', vmin=0, vmax=100, cmap='rainbow')
# Change the X and Y ranges
plt.xlim(-5, 60)
# For the y axis, we need to flip by passing in the scale values in reverse order
plt.ylim(3.0, 1.5)
# Add in labels for the axes
plt.ylabel('Bulk Density (DEN) - g/cc', fontsize=14)
plt.xlabel('Neutron Porosity (NEU) - %', fontsize=14)
# Make the colorbar show
plt.colorbar(label='Gamma Ray - API')
plt.show()
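One of the other ways to add a colourbar is through matplotlib's object-oriented interface, where the scatter call returns a handle that is passed to fig.colorbar. The following is a sketch of that approach using randomly generated stand-in data (the demo dataframe is hypothetical, not the Volve well); it becomes useful once a figure contains more than one subplot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in data; random values in plausible log ranges
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    {
        "NEU": rng.uniform(0, 50, 200),
        "DEN": rng.uniform(1.9, 2.8, 200),
        "GR": rng.uniform(0, 150, 200),
    }
)

fig, ax = plt.subplots(figsize=(8, 8))

# Keep the handle returned by scatter so the colourbar knows what to describe
points = ax.scatter(x="NEU", y="DEN", c="GR", data=demo,
                    vmin=0, vmax=100, cmap="rainbow")

ax.set_xlim(-5, 60)
ax.set_ylim(3.0, 1.5)  # reversed order inverts the density axis
ax.set_xlabel("Neutron Porosity (NEU) - %", fontsize=14)
ax.set_ylabel("Bulk Density (DEN) - g/cc", fontsize=14)

# Attach the colourbar to this specific axes
cbar = fig.colorbar(points, ax=ax, label="Gamma Ray - API")
```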
Now, we have a much better looking plot. We have our axes labelled, and our colourbar plotted and labelled.
Next we will see how we can style it further by using style sheets.
Adding Gridlines & Plot Styling
Style sheets allow us to control the look and feel of the plots. You can find a full list of examples in the style sheets reference on the matplotlib website.
To set a style sheet we can use plt.style.use('bmh'). 'bmh' is one of the styles listed in that reference, and it switches on gridlines among other changes.
# Set the style sheet to bmh
plt.style.use('bmh')
# Set up the scatter plot
plt.scatter(x='NEU', y='DEN', data=df, c='GR', vmin=0, vmax=100, cmap='rainbow')
# Change the X and Y ranges
plt.xlim(-5, 60)
# For the y axis, we need to flip by passing in the scale values in reverse order
plt.ylim(3.0, 1.5)
# Add in labels for the axes
plt.ylabel('Bulk Density (DEN) - g/cc', fontsize=14)
plt.xlabel('Neutron Porosity (NEU) - %', fontsize=14)
plt.colorbar(label='Gamma Ray - API')
plt.show()
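Because plt.style.use changes the style for every plot that follows in the session, it can help to see which styles are installed and to apply one temporarily. The sketch below lists the available style names and uses plt.style.context to apply 'bmh' only inside a with-block; the plotted points are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Names of the style sheets bundled with the installed matplotlib
available_styles = sorted(plt.style.available)
print(available_styles)

# Apply 'bmh' only inside the with-block, leaving the global style untouched
with plt.style.context("bmh"):
    fig, ax = plt.subplots()
    ax.scatter([10, 20, 30], [2.6, 2.4, 2.2])  # made-up NEU / DEN pairs
    grid_on_inside = plt.rcParams["axes.grid"]

print(grid_on_inside)
```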
Changing the Data
If we wanted to view other curves on our plot, we can swap our variables in the plt.scatter line. In this example, we have switched the NEU data for AC (Acoustic Compressional Slowness). Once we have done this, we can quickly update the scales and the labels.
# Set the style sheet to bmh
plt.style.use('bmh')
# Set up the scatter plot
plt.scatter(x='AC', y='DEN', data=df, c='GR', vmin=0, vmax=100, cmap='rainbow')
# Change the X and Y ranges
plt.xlim(40, 240)
# For the y axis, we need to flip by passing in the scale values in reverse order
plt.ylim(3.0, 1.5)
# Add in labels for the axes
plt.ylabel('Bulk Density (DEN) - g/cc', fontsize=14)
plt.xlabel('Acoustic Compressional (AC) - us/ft', fontsize=14)
plt.colorbar(label='Gamma Ray - API')
plt.show()
This gives us a plot that is formatted in the same way as the density vs neutron porosity scatter plot. The process of maintaining a standard plot format for our data can bring together the look and feel of a report. Additionally, as we are re-using code, we could create a function that will take a few arguments and will save us time by eliminating repetition.
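As a rough illustration of that idea, the repeated plotting code could be wrapped in a function along the following lines. The function name make_crossplot, its parameters, and the randomly generated demo dataframe are all hypothetical choices for this sketch rather than part of the original notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def make_crossplot(data, x, y, c, xlim, ylim, xlabel, ylabel, clabel,
                   vmin=0, vmax=100, cmap="rainbow"):
    """Draw a colour-coded crossplot and return the figure and axes."""
    fig, ax = plt.subplots(figsize=(8, 8))
    points = ax.scatter(x=x, y=y, c=c, data=data,
                        vmin=vmin, vmax=vmax, cmap=cmap)
    ax.set_xlim(*xlim)
    ax.set_ylim(*ylim)  # pass (high, low) to invert the axis
    ax.set_xlabel(xlabel, fontsize=14)
    ax.set_ylabel(ylabel, fontsize=14)
    fig.colorbar(points, ax=ax, label=clabel)
    return fig, ax


# Hypothetical stand-in data; random values in plausible log ranges
rng = np.random.default_rng(1)
demo = pd.DataFrame(
    {
        "AC": rng.uniform(50, 200, 200),
        "DEN": rng.uniform(1.9, 2.8, 200),
        "GR": rng.uniform(0, 150, 200),
    }
)

fig, ax = make_crossplot(
    demo, x="AC", y="DEN", c="GR",
    xlim=(40, 240), ylim=(3.0, 1.5),
    xlabel="Acoustic Compressional (AC) - us/ft",
    ylabel="Bulk Density (DEN) - g/cc",
    clabel="Gamma Ray - API",
)
```

Switching back to the density vs neutron porosity plot is then a matter of changing the x column, the x limits, and the x label in the call.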
Summary
In this short tutorial, we have covered the basics of how to display a scatter plot / crossplot of well log data, how to improve it by adding labels, and by adding a colour bar to provide additional information. This gives us a consistent and visually appealing plot that we can present to others in the form of presentations or in technical reports.
Thanks for reading!
If you have found this article useful, please feel free to check out my other articles looking at various aspects of Python and well log data. You can also find my code used in this article and others at GitHub.
If you want to get in touch you can find me on LinkedIn or at my website.
Interested in learning more about python and well log data or petrophysics? Follow me on Medium.