6 Python Libraries You Should Know About for Well Log Data & Petrophysics

Discover 6 great Python libraries that you can start using today with well log data.

One of the great things about Python is the vast number of open source libraries that have been developed to improve the way we work with data and make sense of it. Within the petrophysics and geoscience domains, there are a number of helpful libraries that can make things easier when working with well log data.

In this article, I will introduce you to 6 of my favourite and most-used Python libraries, which help with tasks ranging from loading data in specific file formats to visualising that data.

lasio — Loading, Editing and Creating .LAS Files

Log ASCII Standard (LAS) files are a common file format for storing and transferring well log data. To work with them, Kent Inverarity developed lasio, a dedicated Python library for reading and writing LAS files. Once data is loaded into lasio, it can easily be converted to other formats, including a pandas dataframe.

How to Use lasio:

To install lasio, you simply open up a command prompt or terminal and type pip install lasio. Once the library has been installed you can import it into your Jupyter Notebook by typing import lasio.

The example below illustrates how to load a las file and view the well header information.

First we need to import lasio, and then call upon lasio.read(). Within the function’s brackets we pass in the path to the las file.

import lasio
las = lasio.read("15-9-19_SR_COMP.LAS")

Once the file has been loaded, we can create a simple for loop, which will loop over each item/entry within the las file well header section and print out the description, the mnemonic and its value.

for item in las.well:
    print(f"{item.descr} ({item.mnemonic}): {item.value}")

This returns:

Top Depth (STRT): 102.1568
Bottom Depth (STOP): 4636.514
Depth Increment (STEP): 0.1524
Null Value (NULL): -999.25
Field Name (FLD): Q15
NAME (WELL): 15/9-19
WELLBORE (WBN): 15/9-19 SR
COUNTRY (NATI): NOR
COUNTRY (CTRY): NOR
OPERATOR (COMP): STATOIL
PERM DATUM (PDAT): MSL
RIG NAME (COUN): NORTH SEA
STATE (STAT): NORWAY
PB WELL ID (PBWE): 15/9-19
PB WELLBORE ID (APIN): 15/9-19 SR
PB WELL NAME SET (PBWS): ALL

If we want to transfer the well log data from a lasio object to a pandas dataframe, we can simply do the following:

well = las.df()
well.head()

Calling .head() on the dataframe returns the first five rows of the data.

Returned dataframe from las.df() showing the first five rows of well log data.

Find out more about lasio:

For a more in-depth look at lasio you can find my article on working with well log las files here.

dlisio — Working With .DLIS Files

Another common file format for storing and transferring well log data is DLIS, which stands for Digital Log Interchange Standard. These are binary files that are much more complex than LAS files and CSV files. They are capable of storing multi-dimensional array data, such as acoustic waveforms, borehole images, and nuclear magnetic resonance T2 distributions.

The dlisio Python library was developed by Equinor to make it easy to work with DLIS and older LIS files.

How to Use dlisio:

To install dlisio, you simply open up a command prompt or terminal and type pip install dlisio. Once the library has been installed you can import it into your Jupyter Notebook by typing import dlisio.

The example below illustrates how to load a dlis file and view the well header information.

When loading dlis files, it is worth bearing in mind that they can contain multiple sets of well log data, which are stored within logical files. This can include multiple wells, multiple datasets from the same well, and different levels of processed data. In order to account for this we need to use the syntax below, which assigns the first logical file to f and places any subsequent logical files into tail.

from dlisio import dlis
f, *tail = dlis.load('Data/NLOG_LIS_LAS_7857_FMS_DSI_MAIN_LOG.DLIS')
print(f)
print(tail)

When executed, this code returns:

LogicalFile(00001_AC_WORK)
[]
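
The f, *tail pattern is ordinary Python extended unpacking rather than anything specific to dlisio. The short sketch below shows how it behaves, using a plain list of made-up strings as a stand-in for the logical files that a load call would return.

```python
# Made-up names standing in for logical files (illustration only)
logical_files = [
    "LogicalFile(MAIN_PASS)",
    "LogicalFile(REPEAT_PASS)",
    "LogicalFile(CALIBRATION)",
]

# The first item goes to `first`; everything else is collected into a list
first, *rest = logical_files
print(first)
print(rest)

# With a single logical file, the starred name becomes an empty list
only, *nothing = ["LogicalFile(MAIN_PASS)"]
print(only)
print(nothing)
```

This is why tail is shown as an empty list above: the file contains only one logical file.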

To view the high-level contents of the file we can use the .describe() method. This returns information about the number of frames, channels, and objects within the Logical File. When we apply this to f we can see we have a file with 4 frames and 484 channels (logging curves), in addition to a number of known and unknown objects.

f.describe()

Which returns:

------------
Logical File
------------
Description : LogicalFile(FMS_DSI_138PUP)
Frames : 4
Channels : 484
Known objects
--
FILE-HEADER : 1
ORIGIN : 1
AXIS : 50
EQUIPMENT : 27
TOOL : 5
PARAMETER : 480
CALIBRATION-MEASUREMENT : 22
CALIBRATION-COEFFICIENT : 12
CALIBRATION : 341
PROCESS : 3
CHANNEL : 484
FRAME : 4
Unknown objects
--
440-CHANNEL : 538
440-PRESENTATION-DESCRIPTION : 1
440-OP-CHANNEL : 573

As seen above, we have 4 separate frames within our data. These frames can also represent different data types, different logging passes and different stages of processed data. Each frame has its own properties and we can print these into an easy to read format using the following code:

for frame in f.frames:

    # Search through the channels for the index and obtain the units
    for channel in frame.channels:
        if channel.name == frame.index:
            depth_units = channel.units

    print(f'Frame Name: \t\t {frame.name}')
    print(f'Index Type: \t\t {frame.index_type}')
    print(f'Depth Interval: \t {frame.index_min} - {frame.index_max} {depth_units}')
    print(f'Depth Spacing: \t\t {frame.spacing} {depth_units}')
    print(f'Direction: \t\t {frame.direction}')
    print(f'Num of Channels: \t {len(frame.channels)}')
    print(f'Channel Names: \t\t {str(frame.channels)}')
    print('\n\n')

This returns the following summary, which shows the four frames within this file: 60B (77 channels), 10B (4 channels), 1B (84 channels) and 15B (12 channels). Each frame is indexed by borehole depth and has its own depth spacing.

Frame Name: 		 60B
Index Type: BOREHOLE-DEPTH
Depth Interval: 0 - 0 0.1 in
Depth Spacing: -60 0.1 in
Direction: DECREASING
Num of Channels: 77
Channel Names: [Channel(TDEP), Channel(BS), Channel(CS), Channel(TENS), Channel(ETIM), Channel(DEVI), Channel(P1AZ_MEST), Channel(ANOR), Channel(FINC), Channel(HAZI), Channel(P1AZ), Channel(RB), Channel(SDEV), Channel(GAT), Channel(GMT), Channel(ECGR), Channel(ITT), Channel(SPHI), Channel(DCI2), Channel(DCI4), Channel(SOBS), Channel(DTCO), Channel(DTSM), Channel(PR), Channel(VPVS), Channel(CHR2), Channel(DT2R), Channel(DTRP), Channel(CHRP), Channel(DTRS), Channel(CHRS), Channel(DTTP), Channel(CHTP), Channel(DTTS), Channel(CHTS), Channel(DT2), Channel(DT4P), Channel(DT4S), Channel(SPCF), Channel(DPTR), Channel(DPAZ), Channel(QUAF), Channel(DDIP), Channel(DDA), Channel(FCD), Channel(HDAR), Channel(RGR), Channel(TIME), Channel(CVEL), Channel(MSW1), Channel(MSW2), Channel(FNOR), Channel(SAS2), Channel(SAS4), Channel(PWF2), Channel(PWN2), Channel(PWF4), Channel(PWN4), Channel(SVEL), Channel(SSVE), Channel(SPR2), Channel(SPR4), Channel(SPT4), Channel(DF), Channel(CDF), Channel(CLOS), Channel(ED), Channel(ND), Channel(TVDE), Channel(VSEC), Channel(CWEL), Channel(AREA), Channel(AFCD), Channel(ABS), Channel(IHV), Channel(ICV), Channel(GR)]
Frame Name: 		 10B
Index Type: BOREHOLE-DEPTH
Depth Interval: 0 - 0 0.1 in
Depth Spacing: -10 0.1 in
Direction: DECREASING
Num of Channels: 4
Channel Names: [Channel(TDEP), Channel(IDWD), Channel(TIME), Channel(SCD)]
Frame Name: 		 1B
Index Type: BOREHOLE-DEPTH
Depth Interval: 0 - 0 0.1 in
Depth Spacing: -1 0.1 in
Direction: DECREASING
Num of Channels: 84
Channel Names: [Channel(TDEP), Channel(TIME), Channel(EV), Channel(BA28), Channel(BA17), Channel(BB17), Channel(BC13), Channel(BD13), Channel(BB28), Channel(BA13), Channel(BB13), Channel(BC17), Channel(BD17), Channel(BA22), Channel(BA23), Channel(BA24), Channel(BC28), Channel(BA25), Channel(BA26), Channel(BA27), Channel(BA11), Channel(BA12), Channel(BA14), Channel(BA15), Channel(BA16), Channel(BA18), Channel(BA21), Channel(BC11), Channel(BC12), Channel(BC14), Channel(BC15), Channel(BC16), Channel(BC18), Channel(BC21), Channel(BC22), Channel(BC23), Channel(BC24), Channel(BC25), Channel(BC26), Channel(BC27), Channel(BB22), Channel(BB23), Channel(BB24), Channel(BD28), Channel(BB25), Channel(BB26), Channel(BB27), Channel(BB11), Channel(BB12), Channel(BB14), Channel(BB15), Channel(BB16), Channel(BB18), Channel(BB21), Channel(BD11), Channel(BD12), Channel(BD14), Channel(BD15), Channel(BD16), Channel(BD18), Channel(BD21), Channel(BD22), Channel(BD23), Channel(BD24), Channel(BD25), Channel(BD26), Channel(BD27), Channel(SB1), Channel(DB1), Channel(DB2), Channel(DB3A), Channel(DB4A), Channel(SB2), Channel(DB1A), Channel(DB2A), Channel(DB3), Channel(DB4), Channel(FCAX), Channel(FCAY), Channel(FCAZ), Channel(FTIM), Channel(AZSNG), Channel(AZS1G), Channel(AZS2G)]
Frame Name: 		 15B
Index Type: BOREHOLE-DEPTH
Depth Interval: 0 - 0 0.1 in
Depth Spacing: -15 0.1 in
Direction: DECREASING
Num of Channels: 12
Channel Names: [Channel(TDEP), Channel(TIME), Channel(C1), Channel(C2), Channel(U-MBAV), Channel(AX), Channel(AY), Channel(AZ), Channel(EI), Channel(FX), Channel(FY), Channel(FZ)]
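
The depth spacing values above are easier to interpret once converted to metres. The sketch below assumes that the unit string 0.1 in means tenths of an inch, and uses a hypothetical helper function, tenth_inch_to_metres, which is not part of dlisio.

```python
INCH_IN_METRES = 0.0254  # exact, by definition


def tenth_inch_to_metres(value):
    """Convert a value expressed in units of 0.1 in to metres."""
    return value * INCH_IN_METRES / 10


# Spacing values copied from the frame summary above
frame_spacing = {"60B": -60, "10B": -10, "1B": -1, "15B": -15}

for name, spacing in frame_spacing.items():
    print(f"{name}: {tenth_inch_to_metres(spacing):.5f} m")
```

Under that assumption, the 60B frame is sampled every 6 inches, or 0.1524 m, which is numerically the same as the STEP value seen in the LAS header earlier. The negative sign is consistent with the DECREASING direction reported for each frame.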

Find out more about dlisio:

For a more in-depth look at working with DLIS files and the dlisio library check out my article Loading Well Log Data From DLIS using Python

welly — Dedicated Python Library for Well Log Data

The welly library was developed by Agile Scientific to help with loading, processing, and analysing well log data from a single well or multiple wells.

The library allows exploration of the metadata found within the headers of las files and also contains a plotting function to display a typical well log. Additionally, the welly library contains tools for identifying and handling data quality issues.

The Welly library can be found at the Agile Geoscience GitHub at https://github.com/agile-geoscience/welly

How to Use welly:

To install welly, you simply open up a command prompt or terminal and type pip install welly. Once the library has been installed we can begin to import specific parts of the welly library. For this example, we will work with the Well and Curve classes, which are used to work with well log data and with individual curves.

from welly import Well
from welly import Curve

Our LAS file can be loaded in using the Well.from_las() method. This will create a new well object.

well = Well.from_las('Data/15_19_F1B_WLC_PETRO_COMPUTED_INPUT_1.LAS')

Now that our data has been loaded in we can begin exploring the contents and metadata for the selected well. If we call upon our well object we will be presented with a summary table which contains the well name, location, coordinates, and a list of curve mnemonics.

well
Well log header information generated by the welly python library.

If we want to have a closer look at one of the logging curves we can do so by passing in the name of the curve like so:

well.data['GR']
Individual well log curve header generated from the welly python library.

Find out more about welly:

For a more in-depth look at this library check out my article: Exploring Well Log Data Using the Welly Python Library

missingno — Identify Missing Data

Missing data within well log measurements is a very common issue faced by many petrophysicists and geoscientists when working with well log data. Data can be missing for a variety of reasons, including tool & data vintage, tool sensor problems, tool failures etc.

The missingno Python library is extremely useful for spotting these gaps, and very simple to use.

How to Use missingno:

To install missingno, you simply open up a command prompt or terminal and type pip install missingno.

Once the library has been installed we can import missingno along with pandas using the following conventions. We can also load in a CSV file to demonstrate the power of the missingno library.

import pandas as pd
import missingno as msno
df = pd.read_csv('xeek_train_subset.csv')

Within the missingno library, there are four types of plots for visualising data completeness: the barplot, the matrix plot, the heatmap, and the dendrogram plot. Each has its own advantages for identifying missing data.

For this article we will look at the barplot.

The barplot provides a simple plot where each bar represents a column within the dataframe. The height of the bar indicates how complete that column is, i.e., how many non-null values are present. It can be generated by calling upon:

msno.bar(df)

On the left side of the plot, the y-axis scale ranges from 0.0 to 1.0, where 1.0 represents 100% data completeness. If the bar is less than this, it indicates that we have missing values within that column.

On the right side of the plot, the scale is measured in index values, with the top right representing the maximum number of rows within the dataframe.

Along the top of the plot, there are a series of numbers that represent the total count of the non-null values within that column.

In this example we can see that a number of the columns (DTS, DCAL and RSHA) have a large number of missing values. Other columns (e.g. WELL, DEPTH_MD and GR) are complete and have the maximum number of values.
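
The numbers behind the barplot can also be computed directly with pandas. The sketch below is not how missingno is implemented. It simply reproduces the same two quantities, the non-null count and the completeness fraction per column, on a small made-up dataframe.

```python
import numpy as np
import pandas as pd

# Small made-up dataframe: DTS is mostly missing, the other columns are complete
df = pd.DataFrame({
    "WELL": ["A", "A", "A", "A"],
    "DEPTH_MD": [1000.0, 1000.5, 1001.0, 1001.5],
    "GR": [55.0, 60.2, 58.1, 61.7],
    "DTS": [np.nan, np.nan, 210.4, np.nan],
})

# Count of non-null values per column (the numbers along the top of the plot)
non_null_counts = df.notna().sum()

# Fraction of non-null values per column (the left-hand 0.0 to 1.0 scale)
completeness = df.notna().mean()

summary = pd.DataFrame({"non_null": non_null_counts, "fraction": completeness})
print(summary)
```

Here DTS has one value out of four, giving a completeness of 0.25, while the other columns sit at 1.0.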

Find out more about missingno:

For more information on the missingno library check out: Using the missingno Python library to Identify and Visualise Missing Data Prior to Machine Learning

pandas — Working With Tabular Data

The pandas library is one of the most famous Python libraries for working with data.

How to Use pandas:

To install pandas, you simply open up a command prompt or terminal and type pip install pandas.

Once the library has been installed we can import pandas using the following convention:

import pandas as pd

If we have a CSV file containing data, such as routine core analysis (RCA) or deviation survey data, we can simply load it in by doing:

df = pd.read_csv('data/spwla_volve_data.csv')

We can call upon .info() to provide a list of all of the columns within the dataframe, their data type (e.g. float, integer, string, etc.), and the number of non-null values contained within each column.

df.info()
RangeIndex: 27845 entries, 0 to 27844
Data columns (total 16 columns):
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----
 0   wellName  27845 non-null  object
 1   MD        27845 non-null  float64
 2   BS        27845 non-null  float64
 3   CALI      27845 non-null  float64
 4   DT        5493 non-null   float64
 5   DTS       5420 non-null   float64
 6   GR        27845 non-null  float64
 7   NPHI      27845 non-null  float64
 8   RACEHM    27845 non-null  float64
 9   RACELM    27845 non-null  float64
 10  RHOB      27845 non-null  float64
 11  RPCEHM    27845 non-null  float64
 12  RPCELM    27600 non-null  float64
 13  PHIF      27736 non-null  float64
 14  SW        27736 non-null  float64
 15  VSH       27844 non-null  float64
dtypes: float64(15), object(1)
memory usage: 3.4+ MB

The next useful pair of methods available to us is .head() and .tail(). These return the first and last five rows of the dataframe.

df.head()
The first five rows of the dataframe of well log measurements.
df.tail()
The last five rows of the dataframe of well log measurements.
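
As a self-contained variation on the steps above, the sketch below loads a small made-up CSV from memory instead of from disk. It assumes that missing samples are flagged with -999.25, as is common in well log exports, and uses the na_values argument of pd.read_csv to turn them into NaN on load. The data and the DEMO-1 well name are invented for illustration.

```python
import io

import pandas as pd

# Made-up CSV content; -999.25 is assumed to mark missing samples
CSV_TEXT = """wellName,MD,GR,DT
DEMO-1,1000.0,55.0,-999.25
DEMO-1,1000.5,60.2,85.3
DEMO-1,1001.0,-999.25,84.9
DEMO-1,1001.5,61.7,-999.25
"""

# Treat the null marker as NaN while reading
df = pd.read_csv(io.StringIO(CSV_TEXT), na_values=["-999.25"])

# head() and tail() accept the number of rows to return
print(df.head(2))
print(df.tail(2))

# Number of missing values per column
print(df.isna().sum())
```

Passing a number to .head() or .tail() is handy when five rows is more, or less, than you need.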

Find out more about pandas:

For a more in-depth look at a number of pandas related topics, you can check out my articles: Exploring Well Log Data Using Pandas, Matplotlib, and Seaborn

matplotlib — Data Visualisation

matplotlib is one of my favourite Python libraries for visualising well log data. Once the basics of how it works are understood it can be very powerful when working with well log data. Matplotlib is one of the most popular Python libraries for data visualisation and exploration, and is used by many data scientists, Python coders and machine learning enthusiasts.

The library can be used to generate well log plots, box plots and scatter plots (crossplots) with a few lines of code.

How to Use matplotlib:

To install matplotlib, you simply open up a command prompt or terminal and type pip install matplotlib.

The first stage of any Python project or notebook is to import the required libraries. In this case, we are going to be using lasio to load our las file, pandas for storing our well log data, and matplotlib for visualising our data.

import pandas as pd
import lasio
import matplotlib.pyplot as plt

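To read the data we will use the lasio library, which we explored earlier in this article. We then convert the las object to a pandas dataframe and copy the depth index into its own DEPTH column so that it can be plotted.

las = lasio.read("Data/15-9-19_SR_COMP.LAS")
df = las.df()
df['DEPTH'] = df.index

We can easily create a simple plot by calling upon df.plot() and passing in two of our columns.

df.plot('GR', 'DEPTH')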
Simple line plot of Gamma Ray versus Depth. 

With more code, we can convert the simple plot above into a more comprehensive log plot, like the one below.

Final well log plot showing Gamma Ray in track 1, Resistivity in track 2 on a logarithmic scale, and density/neutron in track 3, each with different scales.
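
A three-track plot of this kind can be sketched with plt.subplots(). The example below is a minimal sketch rather than the code behind the figure above. It uses synthetic curves, and the column names (GR, RDEP, RHOB, NPHI), scale limits and output file name are assumptions chosen for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic curves standing in for real well log data
depth = np.arange(1000.0, 1100.0, 0.5)
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "DEPTH": depth,
    "GR": 60 + 30 * np.sin(depth / 8) + rng.normal(0, 3, depth.size),
    "RDEP": 10 ** (1 + 0.8 * np.cos(depth / 12)),
    "RHOB": 2.45 + 0.15 * np.sin(depth / 10),
    "NPHI": 0.25 + 0.10 * np.cos(depth / 10),
})

# Three side-by-side tracks sharing the same depth axis
fig, axes = plt.subplots(1, 3, figsize=(9, 8), sharey=True)

# Track 1: gamma ray on a linear scale
axes[0].plot(df["GR"], df["DEPTH"], color="green")
axes[0].set_xlim(0, 150)
axes[0].set_xlabel("GR (API)")
axes[0].set_ylabel("Depth (m)")

# Track 2: resistivity on a logarithmic scale
axes[1].plot(df["RDEP"], df["DEPTH"], color="red")
axes[1].set_xscale("log")
axes[1].set_xlim(0.2, 2000)
axes[1].set_xlabel("Resistivity (ohm.m)")

# Track 3: density, with neutron porosity on a second, reversed x-axis
axes[2].plot(df["RHOB"], df["DEPTH"], color="red")
axes[2].set_xlim(1.95, 2.95)
axes[2].set_xlabel("RHOB (g/cc)")
nphi_ax = axes[2].twiny()
nphi_ax.plot(df["NPHI"], df["DEPTH"], color="blue")
nphi_ax.set_xlim(0.45, -0.15)
nphi_ax.set_xlabel("NPHI (v/v)")

# Depth increases downwards; the y-axis is shared, so all tracks flip together
axes[0].invert_yaxis()
for ax in axes:
    ax.grid(True)

fig.tight_layout()
fig.savefig("log_plot_sketch.png")
```

Sharing the y-axis keeps the tracks aligned in depth, and twiny() lets density and neutron sit in the same track with independent scales.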

Conclusion

There are many great libraries available for Python. The ones listed in this article (lasio, dlisio, welly, missingno, pandas and matplotlib) are a great starting point when you come to work with well log data in Python. I highly recommend checking them out and exploring their capabilities.
