Exploring Well Log Data Using the Welly Python Library

A Python library dedicated to loading and exploring well log LAS files

The welly library was developed by Agile Geoscience to help with loading, processing, and analysing well log data from a single well or multiple wells. The library allows exploration of the metadata found within the headers of las files and also contains a plotting function to display a typical well log. Additionally, the welly library contains tools for identifying and handling data quality issues.

The Welly library can be found at the Agile Geoscience GitHub at https://github.com/agile-geoscience/welly

In this short tutorial, we will see how to load a well from the Volve field and explore some of the functionality available within this library.

A video version of this article using different data can be found below.

The Dataset

The dataset we are using comes from the publicly available Equinor Volve Field dataset released in 2018. The file used in this tutorial is from well 15/9-F-1 B. Details on the Volve Dataset can be found here.

This tutorial forms part of my Python and Petrophysics series. Links to previous articles can be found here.

The notebook accompanying this article can be found within the GitHub Repository at: https://github.com/andymcdgeo/Petrophysics-Python-Series

Importing Libraries and Data

The first step in this tutorial will be to load in the required classes, Well and Curve, from the Welly library. These classes are used to work with well log data and with individual curves.

from welly import Well
from welly import Curve

import matplotlib.pyplot as plt

Our LAS file can be loaded in using the Well.from_las() method. This will create a new well object.

well = Well.from_las('Data/15_19_F1B_WLC_PETRO_COMPUTED_INPUT_1.LAS')

Data Exploration

File and Well Information

Now that our data has been loaded in, we can begin exploring the contents and metadata for the selected well. If we call upon our well object we will be presented with a summary table which contains the well name, location, coordinates, and a list of curve mnemonics.

well

We can also call upon specific functions to access the required information.

The first is the header which will return key header information, including the well name, Unique Well Identifier (UWI), the field name and company name.

well.header

This returns:

{'name': '15/9-F-1 B', 'uwi': '', 'field': 'VOLVE', 'company': 'Statoil Petroleum'}
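To give a feel for where these header values come from, here is a minimal sketch of how a line in the ~Well section of a LAS 2.0 file can be split into its parts. welly relies on its own LAS reader, so this is only an illustration of the line layout and not how welly does it. The sample text and its depth values are made up, and the function names are hypothetical.

```python
# Hypothetical excerpt of a LAS 2.0 ~Well section. The depth values are
# made up for illustration and are not taken from the Volve file.
SAMPLE_HEADER = """\
~Well Information
#MNEM.UNIT      DATA : DESCRIPTION
STRT.M         100.0 : START DEPTH
STOP.M         200.0 : STOP DEPTH
WELL.     15/9-F-1 B : WELL
FLD .          VOLVE : FIELD
"""


def parse_header_line(line):
    """Split one 'MNEM.UNIT DATA : DESCRIPTION' line into its four parts."""
    # The mnemonic ends at the first dot.
    mnemonic, _, rest = line.partition('.')
    # The unit, if any, follows the dot directly and ends at the first space.
    unit, _, remainder = rest.partition(' ')
    # The description follows the last colon, so values may contain colons.
    value, _, description = remainder.rpartition(':')
    return mnemonic.strip(), unit.strip(), value.strip(), description.strip()


def parse_well_section(text):
    """Return {mnemonic: value} for every data line in the section."""
    header = {}
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines, section titles (~) and comments (#).
        if not stripped or stripped[0] in '~#':
            continue
        mnemonic, unit, value, description = parse_header_line(stripped)
        header[mnemonic] = value
    return header


header = parse_well_section(SAMPLE_HEADER)
print(header['WELL'], '|', header['FLD'])
```

In practice there is no need to parse the file by hand, as well.header already exposes these values.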

Let’s now have a look at the location information for this well. To do so we can call upon the .location attribute of our well object.

well.location

This returns a location object in the form of a dictionary.

Location({'position': None, 'crs': CRS({}), 'location': 'Maersk Inspirer', 'country': '', 'province': '', 'county': '', 'latitude': '058 26\' 29.907" N    DMS', 'longitude': '001 53\' 14.708" E    DMS', 'api': '', 'td': None, 'deviation': None})

The file we are using does not contain much information about the location of the well, but we do have information about the latitude and longitude. These can be extracted by appending .latitude and .longitude to the location attribute and put into an easier to read format.

lati = well.location.latitude
long = well.location.longitude

print(lati)
print(long)

Using the print function for these attributes provides a nicer output to read.

058 26' 29.907" N    DMS
001 53' 14.708" E    DMS
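The latitude and longitude are stored as degrees, minutes and seconds (DMS) strings, which are awkward to use in calculations or on a map. The sketch below shows one possible way to convert them to decimal degrees using only the standard library. The function name dms_to_decimal is hypothetical, and the pattern assumes the strings always follow the layout shown above.

```python
import re

# Matches strings laid out like: 058 26' 29.907" N    DMS
DMS_PATTERN = re.compile(r'''\s*(\d+)\s+(\d+)'\s*([\d.]+)"\s*([NSEW])''')


def dms_to_decimal(dms):
    """Convert a degrees, minutes, seconds string to decimal degrees."""
    match = DMS_PATTERN.match(dms)
    if match is None:
        raise ValueError(f'Unrecognised DMS string: {dms!r}')
    degrees, minutes, seconds, hemisphere = match.groups()
    decimal = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # South and west are negative by convention.
    return -decimal if hemisphere in 'SW' else decimal


latitude_dms = '058 26\' 29.907" N    DMS'
longitude_dms = '001 53\' 14.708" E    DMS'

print(round(dms_to_decimal(latitude_dms), 6))
print(round(dms_to_decimal(longitude_dms), 6))
```

With the well loaded, the same function could be applied to well.location.latitude and well.location.longitude, provided they hold strings in this format.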

Exploring the Data

We saw in the previous section when looking at the well summary that we had a number of curves. We can get an idea of how many by calling upon the count_curves() function.

well.count_curves()

This returns a total count of 22 curves.

We can also obtain a list of the curve mnemonics within the las file using the method _get_curve_mnemonics().

well._get_curve_mnemonics()

This returns a list of all mnemonics within the las file.

['ABDCQF01',
 'ABDCQF02',
 'ABDCQF03',
 'ABDCQF04',
 'BS',
 'CALI',
 'DRHO',
 'DT',
 'DTS',
 'GR',
 'NBGRCFM',
 'NPHI',
 'PEF',
 'RACEHM',
 'RACELM',
 'RD',
 'RHOB',
 'RM',
 'ROP',
 'RPCEHM',
 'RPCELM',
 'RT']

Another way to view all of the curves is by calling upon .data. This returns a dictionary object containing each curve name, along with the first 3 and the last 3 values for that curve.

well.data

As seen in the example below, many of the first and last values are listed as nan, which stands for Not a Number.

{'ABDCQF01': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'ABDCQF02': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'ABDCQF03': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'ABDCQF04': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'BS': Curve([36. , 36. , 36. , ...,  8.5,  8.5,  8.5]),
 'CALI': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'DRHO': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'DT': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'DTS': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'GR': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'NBGRCFM': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'NPHI': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'PEF': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'RACEHM': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'RACELM': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'RD': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'RHOB': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'RM': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'ROP': Curve([    nan,     nan,     nan, ..., 29.9699, 29.9903,     nan]),
 'RPCEHM': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'RPCELM': Curve([nan, nan, nan, ..., nan, nan, nan]),
 'RT': Curve([nan, nan, nan, ..., nan, nan, nan])}

We can delve a little deeper into each of the curves within the las file by passing in the name of the curve like so:

well.data['GR']

This provides us with some summary statistics of the curve, such as:

what the null value is
the curve units
the curve data range
the step value of the data
the total number of samples
the total number of missing values (NaNs)
Min, Max and Mean of the curve
Curve description
A list of the first 3 and last 3 values
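To make the summary above more concrete, the following sketch computes a few of the same statistics by hand for a short list of values. It is not how welly calculates them. The function name summarise and the gamma ray values are made up for illustration.

```python
import math


def summarise(values):
    """Return basic statistics for a list of samples that may contain NaNs."""
    # NaNs are excluded from min, max and mean but counted as missing.
    valid = [v for v in values if not math.isnan(v)]
    return {
        'samples': len(values),
        'nulls': len(values) - len(valid),
        'min': min(valid) if valid else math.nan,
        'max': max(valid) if valid else math.nan,
        'mean': sum(valid) / len(valid) if valid else math.nan,
    }


# Hypothetical gamma ray samples with missing values at the top and bottom.
gr_samples = [math.nan, 45.0, 60.0, 75.0, math.nan]

stats = summarise(gr_samples)
print(stats)
```

The key point is that missing values have to be removed before calculating the minimum, maximum and mean, otherwise the results would themselves be nan.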

Data QC

Checking the quality of well log data is an important part of the petrophysics workflow.

The borehole environment can be a hostile place, with high temperatures, high pressures and irregular borehole shapes, all of which can have an impact on the logging measurements. This can result in numerous issues such as missing values, outliers, constant values and erroneous values.

The welly library comes with a number of quality control checks which will allow us to check all of the data or specific curves for issues.

The quality control checks include:

Checking for gaps / missing values: no_nans(curve)
Checking if the entire curve is empty or not: not_empty(curve)
Checking if the curve contains constant values: no_flat(curve)
Checking units: check_units(list_of_units)
Checking if values are all positive: all_positive(curve)
Checking if curve is within a range: all_between(min_value, max_value)

The full list of methods can be found within the Welly help documents at: https://code.agilescientific.com/welly/

Before we start we will need to import the quality module like so:

import welly.quality as wq

Before we run any quality checks we first need to create a list of what tests we want to run and on what data we want to run those tests.

To do this we can build up a dictionary, with the key being the curve(s) we want to run the checks on. If we want to run a check on all of the curves we need to use the key Each.

For every curve we will check if there are any flatline values, check for any gaps and make sure the curve is not empty. For the gamma ray (GR) and bulk density (RHOB) curves we are going to check that all of the values are positive, that they are between standard ranges and that the units are what we expect them to be.

tests = {'Each': [wq.no_flat,
                 wq.no_gaps,
                 wq.not_empty],
        'GR': [
                wq.all_positive,
                wq.all_between(0, 250),
                wq.check_units(['API', 'GAPI']),
        ],
        'RHOB': [
                wq.all_positive,
                wq.all_between(1.5, 3),
                wq.check_units(['G/CC', 'g/cm3']),
        ]}
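Notice that some entries in the dictionary are passed as plain function names (wq.all_positive), while others are called with arguments (wq.all_between(0, 250)). A likely reason is that the second kind builds and returns a check with the limits already baked in. The sketch below illustrates that pattern and how an Each key can be combined with curve specific tests. The functions are simplified stand-ins written for this illustration, not welly's implementations, and the curve values are made up.

```python
import math


def not_empty(values):
    """True if at least one sample is a real number."""
    return any(not math.isnan(v) for v in values)


def no_gaps(values):
    """True if there are no NaNs between the first and last valid samples."""
    valid_idx = [i for i, v in enumerate(values) if not math.isnan(v)]
    if not valid_idx:
        return True
    interior = values[valid_idx[0]:valid_idx[-1] + 1]
    return not any(math.isnan(v) for v in interior)


def all_between(lower, upper):
    """Build a check with the limits baked in (NaNs are ignored here)."""
    def check(values):
        return all(lower <= v <= upper for v in values if not math.isnan(v))
    check.__name__ = 'all_between'
    return check


def run_checks(curves, tests):
    """Apply the 'Each' tests to every curve, plus any curve specific tests."""
    results = {}
    for name, values in curves.items():
        checks = tests.get('Each', []) + tests.get(name, [])
        results[name] = {check.__name__: check(values) for check in checks}
    return results


# Hypothetical curve data for illustration only.
curves = {
    'GR': [math.nan, 40.0, math.nan, 300.0, math.nan],
    'BS': [8.5, 8.5, 8.5],
}
tests = {'Each': [not_empty, no_gaps], 'GR': [all_between(0, 250)]}

results = run_checks(curves, tests)
for name, outcome in results.items():
    print(name, outcome)
```

Here the GR curve fails no_gaps because of the missing value in the middle of the data, and fails all_between because of the 300.0 reading.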

We could run the tests as they are; however, the output is not easy to read. To make it easier and nicer to read, we will use the HTML function from IPython.display to make a pretty table.

Once the module is imported we can create a variable called data_qc_table to store the information in. Assigned to this variable will be well.qc_table_html(tests), which generates the table from the tests dictionary we created above.

from IPython.display import HTML
data_qc_table = well.qc_table_html(tests)
HTML(data_qc_table)

After running the tests we can see that we have a coloured HTML table returned. Anything highlighted in green is True and anything in red is False.

From the table we can see that the BS (BitSize) curve failed on one of the three tests. Under the no_flat column we have a False value flagged which suggests that this curve contains constant/flat values. This has been correctly flagged as the bitsize curve measures the drill bit diameter, which will be constant for a given run or series of runs.

We can also see that a number of curves have been flagged as containing gaps.

The tests that were run just for GR and RHOB can also be seen in the table. When we run specific tests on specific curves, the remainder of the results will be greyed out.

We can run another test to identify the fraction of the data that is not nan. For this we set up a new test and apply it to all curves using Each.

tests_nans = {'Each': [wq.fraction_not_nans]}

data_nans_qc_table = well.qc_table_html(tests_nans)
HTML(data_nans_qc_table)

Once we run these tests we are presented with a table similar to the one above. In the last column we have the total fraction of values for each curve that is not a nan. These values are in decimal, with a value of 1.0 representing 100% completeness. The Score column contains a rounded version of this number.

We can write a short loop and print the percentage values out for each curve. This provides a cleaner table to get an idea of missing data percentage for each curve.

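# Table header, with tabs expanded to a fixed column width
print('Curve \t % Complete'.expandtabs(10))
print('----- \t ----------'.expandtabs(10))

# Each key is a curve mnemonic; each value is a dictionary of test results
for k, v in well.qc_data(tests_nans).items():
    for i, j in v.items():
        # Convert the fraction to a percentage
        values = round(j*100, 2)
    print((f'{k} \t {values}%').expandtabs(10))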

This returns a nicely formatted table:

Curve      % Complete
-----      ----------
ABDCQF01   9.72%
ABDCQF02   9.72%
ABDCQF03   9.72%
ABDCQF04   9.72%
BS         100.0%
CALI       10.6%
DRHO       10.62%
DT         12.84%
DTS        11.48%
GR         97.91%
NBGRCFM    39.73%
NPHI       10.28%
PEF        10.37%
RACEHM     25.72%
RACELM     25.72%
RD         96.14%
RHOB       10.37%
RM         96.14%
ROP        97.57%
RPCEHM     25.72%
RPCELM     25.72%
RT         25.72%

From the results we can see that a number of curves have a high percentage of missing values. This could be attributable to some of the measurements not starting until deeper in the well. We will be able to determine this in the next section with plots.
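One simple way to check this without a plot is to find the depth of the first and last valid sample for a curve. The sketch below does this with plain Python lists. The function name valid_interval is hypothetical, the depths and density values are made up, and the depths are assumed to be sorted from shallow to deep.

```python
import math


def valid_interval(depths, values):
    """Return (top, base) depths of the first and last non-NaN samples."""
    valid = [d for d, v in zip(depths, values) if not math.isnan(v)]
    if not valid:
        # The curve has no valid samples at all.
        return None
    return valid[0], valid[-1]


# Hypothetical depths (m) and bulk density values for illustration only.
depths = [100.0, 100.5, 101.0, 101.5, 102.0]
rhob = [math.nan, math.nan, math.nan, 2.35, 2.40]

print(valid_interval(depths, rhob))
```

If the first valid depth is well below the top of the logged interval, the low completeness is likely down to where logging started rather than to gaps within the data.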

Data Plotting

Visualising well log data is at the heart of petrophysics, with log plots being one of the most common display formats. The welly library allows fast and easy generation of well log plots.

First we generate a list of data that we want to display in each track. If we want to display more than one curve in a track we can embed another list e.g. ['MD', ['DT', 'DTS']]. The curves within the inner list will be plotted on the same track and on the same scale.

Next, we can call upon the plot function and pass in the tracks list.

tracks = ['MD', 'GR', 'RHOB', 'NPHI', ['DT', 'DTS']]
well.plot(tracks=tracks)
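To make the nesting in the tracks list explicit, here is a small sketch that expands it into one list of curves per track. It only illustrates how the list is structured and is not how welly processes it internally. The function name describe_tracks is hypothetical.

```python
def describe_tracks(tracks):
    """Return a list of curve lists, one per track."""
    layout = []
    for item in tracks:
        # A nested list means several curves share one track.
        curves = list(item) if isinstance(item, (list, tuple)) else [item]
        layout.append(curves)
    return layout


tracks = ['MD', 'GR', 'RHOB', 'NPHI', ['DT', 'DTS']]

for number, curves in enumerate(describe_tracks(tracks), start=1):
    print(f'Track {number}: {", ".join(curves)}')
```

For the list above this gives five tracks, with DT and DTS sharing the last one.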

As discussed in the data quality section, we assumed that some of the logging curves do not extend all the way to the top of the well, and the plot confirms this. This is very common practice and avoids the need for and the cost of running tools from the top of the well to the bottom.

Let’s zoom in a little bit closer on the lower interval. To do this we can use a regular matplotlib function to set the y-axis limits. Note that we do need to reverse the numbers so that the deeper value is first, and the shallower one second.

tracks = ['MD', 'BS', 'CALI', 'GR', 'RHOB', 'NPHI', ['DT', 'DTS']]
well.plot(tracks=tracks)
plt.ylim(3500, 3000)

(3500.0, 3000.0)

We can see from the result that we now have a nice looking plot with very little effort.

However, control over the plot appearance is limited: the current implementation does not allow granular control over colours, scales, or displaying curves with reversed scales (e.g. Neutron & Density curves).

Well Log Data to Pandas Dataframe

In this final section, we will look at exporting the well log data from welly to pandas. Pandas is one of the most popular libraries for storing, analysing and manipulating data.

The conversion is a simple process and can be achieved by calling .df() on our well object.

df = well.df()

We can confirm the data has been converted by calling upon the .describe() method from pandas to view the summary statistics of the data.

df.describe()

Summary

The welly library, developed by Agile Geoscience, is a great tool for working with and exploring well log files. In this example we have seen how to load a single las file, explore the meta information about the well and the curve contents, and display the log data on a log plot.

Welly has significantly more functionality that can handle multiple well logs as well as creating synthetic seismograms from the well data.

You can find and explore the welly repository here.

Thanks for reading!

If you have found this article useful, please feel free to check out my other articles looking at various aspects of Python and well log data. You can also find my code used in this article and others at GitHub.

If you want to get in touch you can find me on LinkedIn or at my website.

Interested in learning more about python and well log data or petrophysics? Follow me on Medium.
