3  Well Data File Formats

3.1 Introduction

Well log and petrophysical data come in several formats. In many of my earlier articles we have mainly worked with CSV and LAS files. These formats are simple and easy to work with because of their flat structure. That flat structure works well for simple logs, but not for array data: in LAS and CSV files, arrays get split across multiple columns rather than stored as a single block. DLIS files were designed to handle this complexity.
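
To make the point about arrays concrete, here is a minimal sketch of what happens when an array measurement is forced into a flat file. The waveform values, the WF name and the flatten_array_curve helper are all made up for illustration; they do not come from any real file or library.

```python
# Hypothetical example: a 3-sample waveform recorded at two depths.
depths = [1000.0, 1000.5]
waveform = [
    [0.1, 0.4, 0.2],  # samples recorded at 1000.0 m
    [0.3, 0.5, 0.1],  # samples recorded at 1000.5 m
]


def flatten_array_curve(name, depths, samples):
    """Spread a 2D array curve across one flat column per sample."""
    n_samples = len(samples[0])
    # One column header per array element, e.g. WF[0], WF[1], ...
    header = ["DEPTH"] + [f"{name}[{i}]" for i in range(n_samples)]
    rows = [[d] + list(row) for d, row in zip(depths, samples)]
    return header, rows


header, rows = flatten_array_curve("WF", depths, waveform)
print(header)
for row in rows:
    print(row)
```

With three samples this is manageable. With hundreds of samples per depth level, the flat file ends up with hundreds of near-identical columns, which is the situation DLIS avoids.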

3.2 LAS Files

3.2.1 What is a LAS file?

At its core, a LAS file is simply a plain-text file designed to store well log data in a way that can be shared between companies, software packages, and decades of technology change.

The name comes from the Log ASCII Standard. That last bit matters: this isn’t a binary or fancy proprietary format. It’s raw text, which makes it human readable, scriptable, and durable.

This also means that they are flat files that can be opened in any simple text editor, so you can read the contents without any specialised software.

This simplicity is the reason LAS has survived for so long, and also the reason it sometimes feels a bit too simplistic, especially when it comes to working with multi-dimensional arrays.

A typical LAS file contains:

  • Metadata about the well, including field name, well name, location, and company
  • Sometimes more detailed well metadata including information about mud types and processing parameters
  • Information about each log curve, including units, descriptions and mnemonics
  • Depth or time indexed logging measurements

3.2.2 A short history of LAS

The LAS standard emerged in the late 1980s, championed by industry groups like the Canadian Well Logging Society. At the time, the problem was painfully simple: companies were exchanging well logs on floppy disks, magnetic tapes, even printed listings. Without a solid standard, data would get mangled in transit, curves renamed, units lost, depth references shifted.

So the industry codified something basic, but workable across a variety of platforms. The idea wasn’t to build a full data model. It was to reduce the risk of information loss every time data changed hands.

Over the years a few versions of LAS have appeared:

LAS 1.2: The oldest variant, released in 1989. It’s rigid and minimal. Many older files still in circulation are LAS 1.2.

LAS 2.0: Released in 1992, this is the version you will meet most often. It introduced more flexibility, better curve headers, clearer units and slightly more formal metadata handling, and it remains the dominant version in use today.

LAS 3.0: Released in 1999, this is a more ambitious and richer version of LAS, with support for objects and more complex metadata. In principle it modernises what LAS can express, but in practice adoption has been limited. Many tools and workflows still treat LAS 3.0 support as optional or partial.

In practice, the real world of LAS is messy: the standard sets the intended structure, but files from different vendors or vintage wells often bend those rules. Part of working with LAS (especially in code) is recognising that reality.

3.2.2.1 Why LAS endures

Despite its age and quirks, LAS is still the lingua franca of well logs for a few reasons:

  • It’s text. You don’t need special software to inspect it.
  • It’s ubiquitous. Virtually every subsurface package supports the basics.
  • It’s simple. There’s only so much you can do wrong before a human notices.

That doesn’t make it perfect, but it does make it practical. And that’s why libraries like lasio exist: to bridge between this old-school format and modern Python workflows.

3.2.3 The LAS 2.0 format

Most LAS files you’ll encounter today follow LAS version 2.0. Understanding what LAS 2.0 expects helps explain why files are structured the way they are.

At a high level, a LAS 2.0 file is a structured ASCII text file. It contains:

  • A header made up of multiple sections
  • A single block of log data that follows the header

LAS 2.0 is deliberately conservative about character encoding. It allows:

  • Carriage return (ASCII 13) and line feed (ASCII 10) for new lines
  • Standard printable ASCII characters (ASCII 32 to ASCII 126)

Having this standard matters, as LAS files were designed to survive being moved between operating systems and software packages, and to remain readable for decades. If a LAS file breaks, it’s usually because something ignored this rule.
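
As a rough illustration of the character rule described above, the sketch below scans raw bytes and reports anything outside the allowed set. The find_disallowed_bytes helper and the sample text are hypothetical, and the check is based only on the rule as stated here, not on any particular library.

```python
# Allowed bytes, following the rule described above:
# line feed (10), carriage return (13) and printable ASCII (32 to 126).
ALLOWED = {10, 13} | set(range(32, 127))


def find_disallowed_bytes(raw: bytes):
    """Return (line, column, byte_value) for each byte outside the allowed set."""
    problems = []
    line, col = 1, 1
    for b in raw:
        if b not in ALLOWED:
            problems.append((line, col, b))
        if b == 10:  # a line feed starts a new line
            line += 1
            col = 1
        else:
            col += 1
    return problems


# Hypothetical header text containing a degree symbol stored as a single
# non-ASCII byte (value 176 in latin-1).
sample = "~W\nCOMP. STATOIL : COMPANY\nUNIT. \u00b0C : TEMP\n".encode("latin-1")
print(find_disallowed_bytes(sample))
```

For a real file you would read the bytes with open(path, "rb").read() and pass them to the same function.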

3.2.3.1 One continuous interval per file

A key constraint in LAS 2.0 is that each file contains only one continuous data interval.

In practical terms:

  • A main pass and a repeat pass should be separate files
  • You shouldn’t expect multiple depth intervals stitched together in one ~A section

Real data doesn’t always behave, but this assumption is baked into many tools — including how libraries like lasio interpret the file.
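
One simple way to test the single-interval assumption is to look for places where the depth step changes. The sketch below is a minimal version of that idea; find_interval_breaks is a hypothetical helper, the depths are invented, and the tolerance is an arbitrary choice that you would tune to the precision of your own files.

```python
import math


def find_interval_breaks(depths, step, tol=1e-3):
    """Return positions where consecutive depths do not differ by `step`."""
    breaks = []
    for i in range(1, len(depths)):
        if not math.isclose(depths[i] - depths[i - 1], step, abs_tol=tol):
            breaks.append(i)
    return breaks


# Invented depths: a clean main pass, then the same pass with a second
# interval stitched onto the end.
main_pass = [3500.0, 3500.5, 3501.0, 3501.5]
stitched = main_pass + [3400.0, 3400.5]

print(find_interval_breaks(main_pass, step=0.5))
print(find_interval_breaks(stitched, step=0.5))
```

An empty list means the index looks like one continuous interval. Any reported position marks the first row of a suspected second interval.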

3.2.3.2 File naming and recognition

LAS files conventionally end with .las.

3.2.3.3 Sections and the tilde rule

A LAS file is divided into multiple sections, each introduced by a line beginning with a tilde (~) as the first non-space character.

Example LAS file header from the Volve dataset

The character immediately after the tilde identifies the section type. In LAS 2.0, the reserved section identifiers are:

  • ~V — Version information: This tells you what version the LAS file is written in and how the rest of the file should be interpreted.
  • ~W — Well information: This section contains metadata about the well and the logging run, not the logs themselves. This includes the location of the well, the start and stop depths of the file, and the field name.
  • ~C — Curve information: This section defines what each curve actually is and is split into: Curve Mnemonic, Units and a Short Description
  • ~P — Parameters: This section provides information about run-level parameters, tool settings, environmental corrections and processing constants. However, this section is not always present.
  • ~O — Other information: This section can include processing comments, notes from the logging engineer, remarks. This section may also be left blank or absent from the file.
  • ~A — ASCII data: This is the section that contains all of the measurements for the curves listed in the ~C section. Each row represents one depth level. In some instances data rows can be wrapped to reduce the amount of horizontal scrolling required.

Example of the ASCII data section from a LAS file

Each of these sections may appear only once per file.

Custom sections are allowed, but they must appear:

  • After the ~V section
  • Before the final ~A section

This ordering is important.

3.2.3.4 Comments and control characters

LAS uses two special characters at the start of a line:

  • # marks a comment
  • ~ marks the start of a section

Everything else is treated as content.

3.2.3.5 Header line structure

Several header sections (VERSION, WELL, CURVE, and PARAMETER) use a specific line structure built around delimiters.

Each line is split using:

  1. The first dot (.)
  2. The first space after that dot
  3. The final colon (:)

This gives you, in order:

  • A mnemonic
  • Units
  • A value
  • A description

You don’t need to memorise the delimiters, but it helps to know they exist. When headers look odd, or units go missing, it’s usually because one of these delimiters has been missed.
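
The three delimiters map neatly onto Python string methods. The sketch below is a simplified illustration of the splitting rule, not how lasio itself parses headers; parse_header_line is a hypothetical helper and the time-stamped DATE line is invented.

```python
def parse_header_line(line):
    """Split a LAS header line into mnemonic, unit, value and description."""
    # 1. Everything before the first dot is the mnemonic.
    mnemonic, _, rest = line.partition(".")
    # 2. Everything up to the first space after that dot is the unit.
    unit, _, rest = rest.partition(" ")
    # 3. The final colon separates the value from the description.
    value, _, description = rest.rpartition(":")
    return {
        "mnemonic": mnemonic.strip(),
        "unit": unit.strip(),
        "value": value.strip(),
        "description": description.strip(),
    }


print(parse_header_line("STRT.M      3500.0183 : START DEPTH"))
print(parse_header_line("NULL.       -999.25 : NULL VALUE"))
# Splitting on the final colon keeps a time-like value intact.
print(parse_header_line("DATE.  13:45:00 : LOG TIME"))
```

Note that the dot inside 3500.0183 causes no trouble because only the first dot on the line is used, and a value containing colons survives because only the last colon is used.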

3.2.4 Reading LAS Files with Python Using LASIO

There are a number of Python libraries available that can work with LAS files, but the most common is lasio.

lasio, a library developed by Kent Inverarity, lets you load a LAS file into Python and then explore its contents.

3.2.5 Installing and importing lasio

After understanding how LAS files are structured, the next step is to load the data into Python in a way that preserves this structure without flattening it into an anonymous table.

If you don’t already have lasio installed, it’s available via pip:

pip install lasio

Once installed, it is common to work with lasio alongside numpy, pandas and matplotlib:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import lasio

3.2.6 Reading a LAS file

Reading a LAS file is very simple, and can be done using the lasio.read() function:

las = lasio.read("path/to/well.las")

At this point lasio has simply parsed the file according to the LAS standard and exposed its contents. No resampling or reshaping of the data takes place.

If you are working with older files or vendor exports, you may occasionally need to specify an encoding explicitly:

las = lasio.read("path/to/well.las", encoding="latin-1")
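
If you are unsure which encoding to pass, one rough approach is to try a short list of candidates against the raw bytes and use the first one that decodes cleanly. The helper below is a hypothetical sketch using only the standard library, and the sample line is invented. Bear in mind that latin-1 can decode any byte sequence, so it belongs at the end of the list as a last resort and a successful decode does not prove the text is correct.

```python
def first_working_encoding(raw: bytes, candidates=("ascii", "utf-8", "latin-1")):
    """Return the first candidate encoding that decodes `raw` without error."""
    for encoding in candidates:
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next candidate
        return encoding
    return None


# Invented header line containing a degree symbol stored as one latin-1 byte.
sample = "TEMP.degC 85 : Temperature (\u00b0C)".encode("latin-1")
print(first_working_encoding(sample))
print(first_working_encoding(b"STRT.M 3500.0 : START DEPTH"))
```

The returned name can then be passed as the encoding argument shown above.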

3.2.7 A Quick Contents Check

Before touching the las data, it’s worth asking a basic question: what did I actually load?

You can do that in a few simple ways.

A simple print statement will return the lasio object:

print(las)
<lasio.las.LASFile object at 0x000001383BA7E100>

This doesn’t reveal very much, but it does show that a lasio LASFile object has been created.

To see what curves are available:

[c.mnemonic for c in las.curves]
['DEPTH',  
 'LFP_AI',  
 'LFP_AI_B',  
 'LFP_AI_G',  
 'LFP_AI_LOG',  
 'LFP_AI_O',  
 'LFP_AI_V',  
 'LFP_API',  
 'LFP_BADDATA',  
 'LFP_BVWE',  
 'LFP_BVWT',  
 'LFP_CALI',  
 'LFP_COAL',  
 'LFP_DT',  
 'LFP_DT_B',  
 'LFP_DT_G',  
 'LFP_DT_LOG',  
 'LFP_DT_O',  
 'LFP_DT_SYNT',  
 'LFP_DT_V',  
 'LFP_DTCORFSFLAG',  
 'LFP_DTLOGFLAG',  
 'LFP_DTS',  
 'LFP_DTS_B',  
 'LFP_DTS_G',  
...  
 'LFP_VSHDRY',  
 'LFP_VSHDRYC',  
 'LFP_VSHDRYWC',  
 'LFP_VSHGR',  
 'LFP_WATER']

And to inspect curve names, units, and descriptions together:

for c in las.curves:  
    print(f"{c.mnemonic:>8}  {c.unit:>8}  {c.descr}")
DEPTH  M  Measured Depth  
LFP_AI  kPa.s/m  v1  
LFP_AI_B  kPa.s/m  v1  
LFP_AI_G  kPa.s/m  v1  
LFP_AI_LOG  kPa.s/m  v1  
LFP_AI_O  kPa.s/m  v1  
LFP_AI_V  kPa.s/m  v1  
LFP_API  g/cm3  v1  
LFP_BADDATA  unitless  v1  
LFP_BVWE  v/v_decimal  v1  
LFP_BVWT  v/v_decimal  v1  
LFP_CALI  inches  v1  
LFP_COAL  unitless  v1  
LFP_DT  us/ft  v0 (auto-composite)  
LFP_DT_B  us/ft  v1  
LFP_DT_G  us/ft  v1  
LFP_DT_LOG  us/ft  v0 (auto-composite)  
LFP_DT_O  us/ft  v1  
LFP_DT_SYNT  us/ft  v1  
LFP_DT_V  us/ft  v1  
LFP_DTCORFSFLAG  unitless  v1  
LFP_DTLOGFLAG  unitless  v1  
LFP_DTS  us/ft  v0 (auto-composite)  
LFP_DTS_B  us/ft  v1  
LFP_DTS_G  us/ft  v1  
...  
LFP_VSHDRYC  v/v_decimal  v1  
LFP_VSHDRYWC  v/v_decimal  v1  
LFP_VSHGR  v/v_decimal  v1  
LFP_WATER  unitless  v1

This is often the first place you discover duplicated curves, unexpected units, or naming inconsistencies.
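
Grouping curves by unit is a quick way to make unexpected units stand out. The sketch below works on plain (mnemonic, unit) pairs, which you could build from las.curves using the mnemonic and unit attributes shown above. group_by_unit is a hypothetical helper, and the pairs here are a small subset copied from the listing above.

```python
from collections import defaultdict


def group_by_unit(curves):
    """Group curve mnemonics by their unit string."""
    groups = defaultdict(list)
    for mnemonic, unit in curves:
        # Curves with no unit are collected under a visible placeholder.
        groups[unit or "(none)"].append(mnemonic)
    return dict(groups)


curves = [
    ("DEPTH", "M"),
    ("LFP_AI", "kPa.s/m"),
    ("LFP_CALI", "inches"),
    ("LFP_DT", "us/ft"),
    ("LFP_DTS", "us/ft"),
    ("LFP_COAL", "unitless"),
]

for unit, mnemonics in group_by_unit(curves).items():
    print(f"{unit:>10}: {', '.join(mnemonics)}")
```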

3.2.8 Understanding the index curve

LAS files typically use depth (or time) as an index. lasio makes this explicit:

las.index  
las.index_unit
array([3500.0183, 3500.1707, 3500.323 , ..., 4094.6831, 4094.8354,  
       4094.9878])  
'M'

Knowing the index curve and its units early avoids subtle mistakes later, especially when combining data from multiple wells.
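
When wells use different index units, it helps to convert everything to a common unit before combining them. The sketch below is one minimal way to do that; index_to_metres is a hypothetical helper, the accepted unit spellings are an assumption about what you might find in headers, and the depths are invented.

```python
FT_TO_M = 0.3048  # exact definition of the international foot


def index_to_metres(index, unit):
    """Return the index values in metres, given the index unit string."""
    unit = unit.strip().upper()
    if unit in {"M", "METRES", "METERS"}:
        return list(index)
    if unit in {"F", "FT", "FEET"}:
        return [depth * FT_TO_M for depth in index]
    # Refuse to guess: an unknown unit is safer as an error than as a silent pass.
    raise ValueError(f"Unrecognised index unit: {unit!r}")


well_a = index_to_metres([3500.0183, 3500.1707], "M")   # already in metres
well_b = index_to_metres([11000.0, 11000.5], "FT")      # converted from feet
print(well_a)
print(well_b)
```

Raising an error for an unrecognised unit is a deliberate choice here, since a time-indexed file passed through unchanged would be hard to spot later.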

3.2.9 Inspecting Header Metadata

We can also inspect the header metadata from the LAS file, for example the well section (~W):

for item in las.well:  
    print(item.mnemonic, item.unit, item.value, "-", item.descr)
STRT M 3500.0183 - START DEPTH  
STOP M 4094.9878 - STOP DEPTH  
STEP M 0.1524 - STEP  
NULL  -999.25 - NULL VALUE  
COMP  STATOIL PETROLEUM AS - COMPANY  
WELL  15/9-19 - WELL  
FLD  VOLVE - FIELD  
LOC  UNKNOWN - LOCATION  
CNTY  UNKNOWN - COUNTY  
STAT  UNKNOWN - STATE  
CTRY  Norway - COUNTRY  
SRVC  UNKNOWN - SERVICE COMPANY  
DATE  UNKNOWN - LOG DATE  
UWI  NO 15/9-19 A - UNIQUE WELL ID  
XCOORD  1.928158 - SURFACE X  
YCOORD  58.435286 - SURFACE Y  
LAT  58.435286 - LATITUDE  
LON  1.928158 - LONGITUDE  
ELEV M 25.0 - SURFACE ELEV  
ELEV_TYPE  KB - ELEV TYPE

The header is where you’ll often find:

  • Start and stop depths
  • Step size
  • Field name
  • Company name
  • Well Location
  • The NULL value used in the file

That NULL value is particularly important:

null_item = las.well.get("NULL")  
null_item
HeaderItem(mnemonic="NULL", unit="", value="-999.25", descr="NULL VALUE")

This tells us that any -999.25 values in the data section should be treated as missing data.
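
The nan values that appear in the curve outputs in this chapter are the result of exactly this kind of substitution. If you ever handle values that have not been through a LAS reader, for example a CSV export that still contains the raw NULL value, a minimal sketch of the same idea looks like this. mask_nulls is a hypothetical helper and the sample values are illustrative.

```python
import math


def mask_nulls(values, null_value=-999.25):
    """Replace the file's NULL value with NaN so it is treated as missing."""
    return [math.nan if math.isclose(v, null_value) else v for v in values]


raw_gr = [36.621, 36.374, -999.25, -999.25]
clean_gr = mask_nulls(raw_gr)
print(clean_gr)

# NaN values can then be skipped explicitly in calculations.
valid = [v for v in clean_gr if not math.isnan(v)]
print(sum(valid) / len(valid))
```

Leaving the NULL value in place would drag any average or minimum towards -999.25, which is why it needs handling before analysis.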

3.2.10 Accessing Curve Data

You can access individual curves by mnemonic:

las["GR"]
array([36.621, 36.374, 30.748, ...,    nan,    nan,    nan])

Or pull multiple curves together into a numpy array:

curves = ["DEPTH", "GR", "DTS"]  
data = np.vstack([np.asarray(las[c]) for c in curves]).T
array([[3500.0183,   36.621 ,  157.1754],  
       [3500.1707,   36.374 ,  158.9566],  
       [3500.323 ,   30.748 ,  159.7642],  
       ...,  
       [4094.6831,       nan,  128.407 ],  
       [4094.8354,       nan,  127.217 ],  
       [4094.9878,       nan,  127.758 ]])

Depth values are available separately via the index:

depth = np.asarray(las.index)
array([3500.0183, 3500.1707, 3500.323 , ..., 4094.6831, 4094.8354,  
       4094.9878])

At this stage you are working with raw numerical arrays, but still tied back to curve definitions and metadata.

3.2.11 Displaying LAS File Data in Other Formats

3.2.11.1 Displaying curve data and information using pandas

For the majority of data analysis tasks, a pandas DataFrame is a common and often the most convenient way to represent tabular data.

lasio provides a way to convert the data section to a DataFrame directly:

df = las.df()  
df.head()

LAS Curve Data represented as a DataFrame using lasio.

By default:

  • The index is depth or time
  • Columns are curve mnemonics

If you prefer the index as a column:

df = df.reset_index()

Curve Data from a LAS file in a pandas DataFrame.

In addition to creating DataFrames of the curve data, we can quickly build them from the other metadata. For example, if we want to present the Curve Information section as a DataFrame:

curve_table = pd.DataFrame(  
    [{"mnemonic": c.mnemonic, "unit": c.unit, "description": c.descr}  
     for c in las.curves]  
)  
curve_table

Curve Information section from a LAS file as a pandas DataFrame

3.2.11.2 Displaying curve information as a rich table

Once you start inspecting LAS files regularly, printing curve metadata line by line gets old quickly. It works, but it’s not especially readable, particularly when you’re dealing with dozens of curves or comparing files.

This is where the rich library comes in handy. It lets you display structured information in the terminal in a way that’s clear, readable, and surprisingly effective for quick sanity checks.

If you don’t already have it installed:

pip install rich

Then import the bits we need:

from rich.console import Console  
from rich.table import Table

We can take the curve metadata already exposed by lasio and render it as a formatted table:

console = Console()  
table = Table(title="LAS Curve Summary")  
table.add_column("Mnemonic", style="cyan", no_wrap=True)  
table.add_column("Unit", style="green")  
table.add_column("Description", style="white")  
for c in las.curves:  
    table.add_row(  
        c.mnemonic,  
        c.unit or "",  
        c.descr or ""  
    )  
console.print(table)

What you get is a clean, scrollable summary of:

  • Curve mnemonics
  • Units
  • Descriptions

rich output showing the contents of the curve table

This puts all of the mnemonics and descriptions in one place, without having to mentally align columns of printed text.

3.3 DLIS Files

The Digital Log Interchange Standard (DLIS) is a structured binary format for storing well information and log data. It was developed by Schlumberger in the late 1980s and later published by the American Petroleum Institute in 1991 to provide a standardised format.

Working with DLIS can be awkward. The standard is decades old, and different vendors often add their own twists with extra data structures or object types.

A DLIS file typically holds large amounts of well metadata along with the actual log data. The data itself lives inside frames, which are table-like objects representing passes, runs, or processing stages (e.g. Raw or Interpreted). Each frame has columns called channels, which are the individual logging curves. Channels can be single- or multi-dimensional, depending on the tool and measurement.

DLIS Structure from Viggen, Hårstad, and Kvalsvik (2020)
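
The hierarchy described above can be pictured with a few small classes. This is purely a mental model written for this chapter: the class names, fields and the WF1 waveform channel are hypothetical and are not the classes that DLISIO exposes.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    name: str
    units: str
    dimension: tuple = (1,)  # (1,) means one value per index level


@dataclass
class Frame:
    name: str
    index_type: str
    channels: list = field(default_factory=list)

    def array_channels(self):
        """Names of channels that hold more than one value per index level."""
        return [c.name for c in self.channels if c.dimension != (1,)]


@dataclass
class LogicalFile:
    frames: list = field(default_factory=list)


# Hypothetical frame: two ordinary curves and one 512-sample waveform.
frame = Frame(
    name="60B",
    index_type="BOREHOLE-DEPTH",
    channels=[
        Channel("TDEP", "0.1 in"),
        Channel("GR", "gAPI"),
        Channel("WF1", "", dimension=(512,)),
    ],
)
logical_file = LogicalFile(frames=[frame])
print(frame.array_channels())
```

The key difference from LAS is visible in the dimension field: an array channel stays a single named object rather than being split into hundreds of columns.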

DLISIO is a Python library developed by Equinor ASA to read DLIS files and Log Information Standard 79 (LIS79) files. Details can be found in the library’s documentation.

3.3.1 Using DLISIO

The library can be installed by using the following command:

pip install dlisio

3.3.2 Opening a DLIS File

Like most binary formats, you can’t just open a DLIS file in a text editor and scroll through the contents as you can with LAS and CSV files. DLISIO handles the decoding of the binary file for you.

The first step when working with DLISIO is to load the file and check what’s inside.

from dlisio import dlis

# Unpack the first logical file into logical_files; any others go into tail
with dlis.load("NLOG_LIS_LAS_7857_FMS_DSI_MAIN_LOG.DLIS") as (logical_files, *tail):
    print(logical_files)
    print(logical_files.describe())

A DLIS file can contain one or more logical files, so it is always good practice to check what is inside each of them.

3.3.3 Exploring frames

Each logical file is organised into frames. You can think of a frame as a table that stores log data from a particular pass, run, or processing stage. For example, you might have one frame for the raw field data and another for an interpreted or processed version of the same run.

Let’s take the first logical file and look at its frames:

with dlis.load("NLOG_LIS_LAS_7857_FMS_DSI_MAIN_LOG.DLIS") as (logical_files, *tail):
    for frame in logical_files.frames:
        print(frame.describe())

This will list all the frames available. A file might only have one frame, but it’s common to see several, as in the example below. Understanding which frame you need is an important first step before pulling out data.

Calling the describe() method on each frame shows its contents, as below.

-----
Frame
-----
name   : 60B
origin : 41
copy   : 0
Channel indexing
--
Indexed by       : BOREHOLE-DEPTH
Index units      : 0.1 in
Index min        : 0 [0.1 in]
Index max        : 0 [0.1 in]
Direction        : DECREASING
Constant spacing : -60 [0.1 in]
Index channel    : Channel(TDEP)
Channels
--
TDEP      BS        CS        TENS      ETIM      DEVI      P1AZ_MEST ANOR
FINC      HAZI      P1AZ      RB        SDEV      GAT       GMT       ECGR
ITT       SPHI      DCI2      DCI4      SOBS      DTCO      DTSM      PR
VPVS      CHR2      DT2R      DTRP      CHRP      DTRS      CHRS      DTTP
CHTP      DTTS      CHTS      DT2       DT4P      DT4S      SPCF      DPTR
DPAZ      QUAF      DDIP      DDA       FCD       HDAR      RGR       TIME
CVEL      MSW1      MSW2      FNOR      SAS2      SAS4      PWF2      PWN2
PWF4      PWN4      SVEL      SSVE      SPR2      SPR4      SPT4      DF
CDF       CLOS      ED        ND        TVDE      VSEC      CWEL      AREA
AFCD      ABS       IHV       ICV       GR
-----
Frame
-----
name   : 10B
origin : 41
copy   : 0
Channel indexing
--
Indexed by       : BOREHOLE-DEPTH
Index units      : 0.1 in
Index min        : 0 [0.1 in]
Index max        : 0 [0.1 in]
Direction        : DECREASING
Constant spacing : -10 [0.1 in]
Index channel    : Channel(TDEP)
Channels
--
TDEP IDWD TIME SCD
-----
Frame
-----
name   : 1B
origin : 41
copy   : 0
Channel indexing
--
Indexed by       : BOREHOLE-DEPTH
Index units      : 0.1 in
Index min        : 0 [0.1 in]
Index max        : 0 [0.1 in]
Direction        : DECREASING
Constant spacing : -1 [0.1 in]
Index channel    : Channel(TDEP)
Channels
--
TDEP  TIME  EV    BA28  BA17  BB17  BC13  BD13  BB28  BA13  BB13  BC17  BD17
BA22  BA23  BA24  BC28  BA25  BA26  BA27  BA11  BA12  BA14  BA15  BA16  BA18
BA21  BC11  BC12  BC14  BC15  BC16  BC18  BC21  BC22  BC23  BC24  BC25  BC26
BC27  BB22  BB23  BB24  BD28  BB25  BB26  BB27  BB11  BB12  BB14  BB15  BB16
BB18  BB21  BD11  BD12  BD14  BD15  BD16  BD18  BD21  BD22  BD23  BD24  BD25
BD26  BD27  SB1   DB1   DB2   DB3A  DB4A  SB2   DB1A  DB2A  DB3   DB4   FCAX
FCAY  FCAZ  FTIM  AZSNG AZS1G AZS2G
-----
Frame
-----
name   : 15B
origin : 41
copy   : 0
Channel indexing
--
Indexed by       : BOREHOLE-DEPTH
Index units      : 0.1 in
Index min        : 0 [0.1 in]
Index max        : 0 [0.1 in]
Direction        : DECREASING
Constant spacing : -15 [0.1 in]
Index channel    : Channel(TDEP)
Channels
--
TDEP   TIME   C1     C2     U-MBAV AX     AY     AZ     EI     FX     FY     FZ
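
The spacings in the frame summaries above are reported in units of 0.1 in, which is not very intuitive. The short sketch below converts them to metres, assuming that "0.1 in" means tenths of an inch; spacing_to_metres is a hypothetical helper, and the frame names and spacing values are copied from the output above.

```python
INCH_TO_M = 0.0254  # exact definition of the inch


def spacing_to_metres(spacing, unit_scale_inches=0.1):
    """Convert a frame spacing given in multiples of 0.1 in to metres."""
    # The sign only encodes logging direction, so take the magnitude.
    return abs(spacing) * unit_scale_inches * INCH_TO_M


frame_spacings = {"60B": -60, "10B": -10, "15B": -15, "1B": -1}
for name, spacing in frame_spacings.items():
    print(f"{name:>4}: {spacing_to_metres(spacing):.5f} m")
```

Frame 60B works out to 6 inches, or 0.1524 m, the familiar half-foot sampling of conventional logs, while frame 1B is sampled every tenth of an inch, which fits the high-resolution button data it contains.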

3.3.4 Inspecting channels

Within each frame you’ll find the actual channels — the individual logging curves such as GR, RHOB, NPHI, and so on. Each channel is stored as a column in the frame’s table, with values indexed by depth or time.

Here’s how to list them:

with dlis.load("NLOG_LIS_LAS_7857_FMS_DSI_MAIN_LOG.DLIS") as (logical_files, *tail):
    for frame in logical_files.frames:
        for channel in frame.channels:
            print(channel.describe())

This will give you the names of the channels and their measurement units. Just seeing this list is useful: it tells you what curves are available before you start extracting values. It’s also a quick way to check if the file contains the curves you expect.

The output from the above can be quite lengthy, especially if you have several frames. The example below is an excerpt of what can be obtained using the above:

-------
Channel
-------
name   : BC12
origin : 41
copy   : 0
Description : CALIBRATED DATA BUTTON C12
Sample dimensions         : 1
Maximum sample dimensions : 1
Property indicators       : 440-CUSTOMER
Source                    : Tool(MESTB)

-------
Channel
-------
name   : BC14
origin : 41
copy   : 0
Description : CALIBRATED DATA BUTTON C14
Sample dimensions         : 1
Maximum sample dimensions : 1
Property indicators       : 440-CUSTOMER
Source                    : Tool(MESTB)