Combining Formation Data With Well Log Measurements in Pandas
When working with subsurface data we often deal with datasets that have been sampled in different ways. For example, well log measurements are continuously recorded over intervals of the subsurface at regular increments (e.g. measurements every 0.1 m), whereas formation tops are single depth points.
Within this article, we will cover a way of integrating geological formation tops data with well log measurements. This will allow us to use that data for future machine learning processes and for data visualisation.
Video Tutorial
A video version of this tutorial can be found on my YouTube channel.
Data Used Within This Tutorial
The data used within this tutorial is a subset of the Volve Dataset that was released by Equinor in 2018. Full details of the dataset, including licence, can be found at the link below.
https://www.equinor.com/energy/volve-data-sharing
The Volve data licence is based on the CC BY 4.0 licence. Full details of the licence agreement can be found here:
Importing Libraries and Data
The first step of this tutorial is to import the libraries we are going to be working with. In this example, we will be using a combination of pandas for loading and storing our formation data, and lasio for loading our well log data from a las file.
import lasio
import pandas as pd
Next, we will begin importing our well log data. To do this we need to call lasio.read() and pass in the path and file name of the .las file.
As we are going to be working within pandas, we need to convert the lasio object to a dataframe and then reset the index. This takes the current index, which is set to depth, and places it in the dataframe as a new column.
df_19SR = lasio.read('Data/15-9-19_SR_COMP.las').df()
df_19SR.reset_index(inplace=True)
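To make the effect of resetting the index concrete, here is a minimal sketch that does not need a .las file. The small dataframe is a hypothetical stand-in for what lasio returns (curves indexed by depth), and the curve names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the dataframe returned by lasio's .df():
# curve values indexed by depth. All numbers here are invented.
logs = pd.DataFrame(
    {"GR": [45.2, 47.8, 51.3], "RHOB": [2.41, 2.43, 2.40]},
    index=pd.Index([4339.9, 4340.0, 4340.1], name="DEPT"),
)

# reset_index() moves the depth index into a regular column named DEPT
# and replaces the index with a default 0..n-1 integer range.
logs_flat = logs.reset_index()

print(logs_flat.columns.tolist())  # ['DEPT', 'GR', 'RHOB']
```

Having depth as an ordinary column is what later allows us to apply a function to it and filter on it like any other column.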
Loading Formation Data
Often the formation data is stored within a simple table in a csv file, with the formation name and associated depth. As a result, we can use pandas to load in the csv by using pd.read_csv() and passing in the file name.
In this example, the file does not have a header row, so we pass header=None and assign the column names ourselves by using the names argument.
df_19SR_formations = pd.read_csv('Data/Volve/15_9_19_SR_TOPS_NPD.csv', header=None, names=['Formation', 'DEPT'])
df_19SR_formations['DEPT'] = df_19SR_formations['DEPT'].astype(float)
df_19SR_formations
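The following sketch illustrates why header=None matters, using an in-memory file instead of the real csv. The two formation names appear later in this article, but the depths are made up for the example and are not the actual tops.

```python
import io
import pandas as pd

# Invented two-row, header-less file (depths are NOT the real formation tops).
csv_text = "Hugin Fm,4300\nSkagerrak Fm,4340\n"

# Without header=None, pandas treats the first data row as column names,
# so one formation is silently lost.
wrong = pd.read_csv(io.StringIO(csv_text))
print(len(wrong))  # 1

# With header=None and names=..., every row is kept as data.
tops = pd.read_csv(io.StringIO(csv_text), header=None, names=["Formation", "DEPT"])

# Cast depth to float so it matches the dtype of the well log depth column.
tops["DEPT"] = tops["DEPT"].astype(float)
print(len(tops))  # 2
```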
Once the formation data has been loaded into a dataframe, we can view that dataframe by calling upon its name: df_19SR_formations, which returns the following:
Merging Well Log and Formation Tops Data
Now that we have two dataframes containing the data we want to work with, we need to combine them. We can achieve this by creating a function, and then using the .apply() method to check each depth value and see what formation should occur at that depth level.
In the function below we are first creating lists of the formation depths and names.
Then we loop through each of these and check for three conditions:
- If we are at the last formation (item) within the list
- If we are at a depth before the first formation (item) in the list
- If we are between two formation depths
def add_formations_to_df(depth_value:float) -> str:
    formation_depths = df_19SR_formations['DEPT'].to_list()
    formation_names = df_19SR_formations['Formation'].to_list()

    for i, depth in enumerate(formation_depths):
        # Check if we are at last formation
        if i == len(formation_depths)-1:
            return formation_names[i]

        # Check if we are before first formation
        elif depth_value <= formation_depths[i]:
            return ''

        # Check if current depth between current and next formation
        elif depth_value >= formation_depths[i] and depth_value <= formation_depths[i+1]:
            return formation_names[i]
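The function above is called once per depth sample, which can become slow on long logs. As an alternative, and not the method used in this article, pandas' merge_asof can perform the same "most recent top at or above this depth" lookup in a single vectorised call. The sketch below assumes invented tops and log depths; only the column names follow the article.

```python
import pandas as pd

# Invented tops and log depths, for illustration only.
tops = pd.DataFrame(
    {"Formation": ["Hugin Fm", "Skagerrak Fm"], "DEPT": [4300.0, 4340.0]}
)
logs = pd.DataFrame({"DEPT": [4250.0, 4300.0, 4339.9, 4340.0, 4400.0]})

# merge_asof needs both frames sorted on the key, with matching key dtypes.
labelled = pd.merge_asof(
    logs.sort_values("DEPT"),
    tops.sort_values("DEPT"),
    on="DEPT",
    direction="backward",  # last formation top whose depth is <= the sample depth
)

# Samples shallower than the first top have no match; use '' as the function does.
labelled = labelled.rename(columns={"Formation": "FORMATION"})
labelled["FORMATION"] = labelled["FORMATION"].fillna("")

print(labelled)
```

One behavioural difference to be aware of: with merge_asof, a sample lying exactly on a formation top is assigned to the formation that starts there, whereas the function above assigns it to the formation above (or to an empty string at the first top).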
Once the function has been written, we can then create a new column in our dataframe called FORMATION, and apply the new function to the DEPT column.
df_19SR['FORMATION'] = df_19SR['DEPT'].apply(add_formations_to_df)
df_19SR
When we call upon the dataframe, we can see we have our new column with the formation data.
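A quick way to sanity-check the new column is to summarise the depth range covered by each formation. This is a minimal sketch on a small invented dataframe that mimics the DEPT and FORMATION columns; with real data the same groupby could be run on the full well log dataframe.

```python
import pandas as pd

# Invented samples mimicking the labelled well log dataframe.
labelled = pd.DataFrame({
    "DEPT": [4339.8, 4339.9, 4340.0, 4340.1, 4340.2],
    "FORMATION": ["Hugin Fm", "Hugin Fm", "Hugin Fm", "Skagerrak Fm", "Skagerrak Fm"],
})

# Shallowest depth, deepest depth and sample count per formation.
summary = labelled.groupby("FORMATION")["DEPT"].agg(["min", "max", "count"])
print(summary)
```

If the intervals overlap, or a formation is missing from the summary, that usually points to a problem with the tops file or the depth units.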
We can take a closer look at a specific depth range, 4339 to 4341 m, to see the change in formation name.
df_19SR.loc[(df_19SR['DEPT'] >= 4339) & (df_19SR['DEPT'] <= 4341)]
As we can see above, the Skagerrak Fm starts after 4340 m, and before that, we have the Hugin Fm.
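The same depth window can also be selected with Series.between(), which is inclusive at both ends by default. The sketch below uses invented samples and additionally pulls out the first sample of each formation inside the window, which is one simple way to locate where the name changes.

```python
import pandas as pd

# Invented samples around a formation boundary.
labelled = pd.DataFrame({
    "DEPT": [4338.9, 4339.8, 4339.9, 4340.0, 4340.1, 4340.2, 4341.5],
    "FORMATION": ["Hugin Fm"] * 4 + ["Skagerrak Fm"] * 3,
})

# Equivalent to (DEPT >= 4339) & (DEPT <= 4341).
window = labelled[labelled["DEPT"].between(4339, 4341)]

# First sample of each formation within the window.
first_samples = window.drop_duplicates(subset="FORMATION", keep="first")
print(first_samples)
```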
Summary
Integrating geological formation information into a well log dataset is a relatively straightforward process. Once we have this information within the main dataframe we can use it to visualise our data with respect to geology, and also utilise that information when we come to machine learning processes.