Publications

Below is a list of my journal and conference papers, with links to DOIs or PDFs (via ResearchGate) where available.

2024

  1. Fu, L.; Yu, Y.; Xu, C.; Ashby, M.; McDonald, A.; Pan, W.; Deng, T.; Szabo, I.

    Petrophysics: The SPWLA Journal of Formation Evaluation and Reservoir Description, 65(1), 108-127

    Abstract

    Well logs are processed and interpreted to estimate in-situ reservoir properties, which are essential for reservoir modeling, reserve estimation, and production forecasting. While the traditional methods are mostly based on multimineral physics or empirical formulae, machine learning provides an alternative data-driven approach that requires much less a priori geological or petrophysical information. From October 2021 to March 2022, the Petrophysical Data-Driven Analytics Special Interest Group (PDDA SIG) of the Society of Petrophysicists and Well Log Analysts (SPWLA) hosted a machine-learning contest aiming to develop data-driven models for estimating reservoir properties, including shale volume, porosity, and fluid saturation, based on a common set of well logs, including gamma ray, bulk density, neutron porosity, resistivity, and sonic. Log data from nine wells from the same field, together with the reservoir properties interpreted by petrophysicists, were provided as training data, and five additional wells were provided as blind test data. During the contest, the contestants developed various data-driven models to predict the three reservoir properties from the provided training data set. The top five performing models, on average, beat the performance of the benchmarked Random Forest model by 45% in the root-mean-square error (RMSE) score. In this paper, we review these top-performing solutions, including their preprocessing techniques, feature engineering, and machine-learning models, and summarize their advantages and applicable conditions.
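    As a rough illustration of the evaluation described above, the sketch below trains a Random Forest baseline of the kind benchmarked in the contest and scores it with RMSE. The file and curve names are placeholders, not the actual contest schema.

    ```python
    # Hypothetical sketch of a contest-style baseline: Random Forest per target,
    # scored with RMSE on blind-test wells. File and column names are assumed.
    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_squared_error

    LOGS = ["GR", "RHOB", "NPHI", "RT", "DT"]    # gamma ray, density, neutron, resistivity, sonic
    TARGETS = ["VSH", "PHIT", "SW"]              # shale volume, porosity, fluid saturation

    train = pd.read_csv("train_wells.csv")       # nine training wells (assumed file)
    test = pd.read_csv("blind_wells.csv")        # five blind-test wells (assumed file)

    for target in TARGETS:
        model = RandomForestRegressor(n_estimators=200, random_state=42)
        model.fit(train[LOGS], train[target])
        rmse = np.sqrt(mean_squared_error(test[target], model.predict(test[LOGS])))
        print(f"{target}: RMSE = {rmse:.4f}")
    ```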

2022

  1. McDonald, A.

    SPWLA 63rd Annual Logging Symposium

    Abstract

    Within the oil and gas industry, large volumes of data are gathered daily and will continue to grow as technology develops. The quality of the data gathered has wide-ranging consequences that can impact future exploration, development, reserves estimation, and key financial decisions. It is therefore crucial that the data used within petrophysical data-driven machine-learning models are of a high standard and free from invalid values. If poor-quality data are fed into an algorithm, the output can be severely impacted by statistical bias and by reduced accuracy and precision.

    Missing data is one of the most common issues faced when working with well-log data sets. Gaps within logging curves can occur for a multitude of reasons: intermittent tool failure, complete tool failure, tool offsets recording data at different depths, and bad data being manually removed by interpreters. It is common practice either to drop the depth levels that contain missing values (listwise deletion) or to impute the values from empirical relationships or from machine-learning models developed using offset wells. Removing depth levels that contain missing values reduces the amount of information available for training and validating data-driven machine-learning algorithms, while imputing values can introduce bias and distort the statistical distribution of the data.

    This study discusses the main causes of missing data within well logs and the potential solutions that have been widely adopted within the data science and machine learning domains. To evaluate the impact of missing data on machine-learning models, three commonly used algorithms, namely support vector regression, random forests, and artificial neural networks, were adopted for the prediction of bulk density. To understand the sensitivity of the selected algorithms to missing data, the models were evaluated on a fixed test dataset while the training dataset was reduced in 10% increments to simulate varying levels of missing data.
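    The experimental design lends itself to a short sketch: hold the test set fixed and retrain on progressively smaller fractions of the training data. The snippet below uses synthetic data and scikit-learn defaults purely for illustration; it is not the study's exact setup.

    ```python
    # Illustrative version of the training-set-reduction experiment on synthetic data.
    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_squared_error
    from sklearn.model_selection import train_test_split

    X, y = make_regression(n_samples=2000, n_features=5, noise=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    rng = np.random.default_rng(0)
    for keep in np.arange(1.0, 0.05, -0.1):      # 100%, 90%, ..., 10% of the training data
        idx = rng.choice(len(X_train), size=int(len(X_train) * keep), replace=False)
        model = RandomForestRegressor(random_state=0).fit(X_train[idx], y_train[idx])
        rmse = np.sqrt(mean_squared_error(y_test, model.predict(X_test)))
        print(f"{keep:.0%} of training data -> RMSE {rmse:.2f}")
    ```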

    As the training dataset size is reduced, the performance of all three algorithms worsens, with results from the artificial neural network model being the most impacted and most variable. Results from the random forest algorithm were the least impacted and remained the most stable under decreasing training dataset size.

2021

  1. Arkalgud, R.; McDonald, A.; Brackenridge, R.

    SPWLA 62nd Annual Logging Symposium

    Abstract

    Automation is becoming an integral part of our daily lives as technology and techniques rapidly develop, and many automation workflows are now routinely applied within the geoscience domain. The success of any automated modelling workflow fundamentally hinges on the appropriate choice of parameters and on processing speed, and the entire process demands that the data fed into any machine-learning model are of good quality. Decades of advances in well-logging technology have enabled the collection of vast amounts of data across wells and fields, which poses a major issue in automating petrophysical workflows: it must be ensured that the data being fed in are appropriate and fit for purpose. The selection of features (logging curves) and parameters for machine-learning algorithms has therefore become a topic at the forefront of related research. Inappropriate feature selection can lead to erroneous results and reduced precision, and has proved computationally expensive.

    Experienced Eye (EE) is a novel methodology, derived from Domain Transfer Analysis (DTA), which seeks to identify the optimum input curves for modelling. During the EE solution process, relationships between the input variables and target variables are developed based on the characteristics and attributes of the inputs rather than on statistical averages. The relationships so developed can then be ranked appropriately and carried forward to the modelling process.

    This paper focuses on three distinct petrophysical data scenarios where inputs are ranked prior to modelling: the prediction of continuous permeability from discrete core measurements, of porosity from multiple logging measurements, and of key geomechanical properties. Each input curve is ranked against a target feature, and for each case study the best-ranked features were carried forward to the modelling stage, with the results validated alongside conventional interpretation methods.
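    The EE ranking itself is proprietary, so the sketch below uses a generic stand-in score (mutual information) simply to illustrate the workflow of ranking each input curve against a target and carrying the best-ranked curves forward; the file and curve names are hypothetical.

    ```python
    # Illustrative feature ranking against a target curve; mutual information is a
    # stand-in for the proprietary EE score. File and curve names are assumed.
    import pandas as pd
    from sklearn.feature_selection import mutual_info_regression

    logs = pd.read_csv("well_logs.csv")          # assumed depth-indexed curve data
    inputs = ["GR", "RHOB", "NPHI", "DT", "RT"]
    target = "CORE_PERM"                         # e.g. core-derived permeability

    scores = mutual_info_regression(logs[inputs], logs[target])
    ranking = sorted(zip(inputs, scores), key=lambda t: t[1], reverse=True)
    for curve, score in ranking:
        print(f"{curve}: {score:.3f}")

    best = [curve for curve, _ in ranking[:3]]   # carry top-ranked curves to modelling
    ```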

    Ranked features were also compared between different machine-learning algorithms: DTA, neural networks, and multiple linear regression. Results were compared with the available data for the various case studies. The new feature selection method was shown to improve the accuracy and precision of prediction results from multiple modelling algorithms.

  2. McDonald, A.

    Petrophysics: The SPWLA Journal of Formation Evaluation and Reservoir Description, 62(6), 585-613

    Abstract

    Decades of subsurface exploration and characterization have led to the collation and storage of large volumes of well-related data. The amount of data gathered daily continues to grow rapidly as technology and recording methods improve. With the increasing adoption of machine-learning techniques in the subsurface domain, it is essential that the quality of the input data is carefully considered when working with these tools. If the input data are of poor quality, the impact on precision and accuracy of the prediction can be significant. Consequently, this can impact key decisions about the future of a well or a field.

    This study focuses on well-log data, which can be highly multidimensional, diverse, and stored in a variety of file formats. Well-log data exhibit the key characteristics of big data: volume, variety, velocity, veracity, and value. Well data can include numeric values, text values, waveform data, image arrays, maps, and volumes, all of which can be indexed by time or depth in a regular or irregular way. A significant portion of time can be spent gathering data and quality checking it prior to carrying out petrophysical interpretations and applying machine-learning models. Well-log data can be affected by numerous issues causing a degradation in data quality, including missing data ranging from single data points to entire curves, noisy data from tool-related issues, borehole washout, processing issues, incorrect environmental corrections, and mislabeled data.
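    As a minimal example of the kind of quality checking described, the sketch below loads a LAS file with the lasio library (which maps the conventional -999.25 null value to NaN on read) and summarises missing samples per curve; the file name is a placeholder.

    ```python
    # Quick missing-data audit of a well-log file; "example_well.las" is assumed.
    import lasio

    las = lasio.read("example_well.las")         # null values are converted to NaN
    df = las.df()                                # curves as a depth-indexed DataFrame

    print(df.isna().sum().sort_values(ascending=False))
    print(f"Depth levels with any missing curve: {df.isna().any(axis=1).mean():.1%}")
    ```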

    Having vast quantities of data does not mean it can all be passed into a machine-learning algorithm with the expectation that the resultant prediction is fit for purpose. It is essential that the most important and relevant data are passed into the model through appropriate feature selection techniques. Not only does this improve the quality of the prediction, but it also reduces computational time and can provide a better understanding of how the models reach their conclusion.

    This paper reviews data quality issues typically faced by petrophysicists when working with well-log data and deploying machine-learning models. This is achieved by first providing an overview of machine learning and big data within the petrophysical domain, followed by a review of the common well-log data issues, their impact on machine-learning algorithms, and methods for mitigating their influence.

  3. Banas, R.; McDonald, A.; Perkins, T. J.

    SPWLA 62nd Annual Logging Symposium

    Abstract

    Subsurface analysis-driven field development requires quality data as input into analysis, modelling, and planning. In the case of many conventional reservoirs, pay intervals are often well consolidated and maintain integrity under drilling and geological stresses providing an ideal logging environment. Consequently, editing well logs is often overlooked or dismissed entirely.

    Petrophysical analysis, however, is not always constrained to conventional pay intervals. When developing an unconventional reservoir, pay sections may be composed of shales, and edited and quality-checked logs become crucial to accurately assess storage volumes in place. Edited curves can also serve as inputs to engineering studies, geological and geophysical models, reservoir evaluation, and many machine learning models employed today.

    As an example, hydraulic fracturing model inputs may span over adjacent shale beds around a target reservoir, which are frequently washed out. These washed out sections may seriously impact logging measurements of interest, such as bulk density and acoustic compressional slowness, which are used to generate elastic properties and compute geomechanical curves.
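    To make that dependency concrete, the sketch below shows a standard computation of dynamic elastic moduli from bulk density and sonic slownesses; any washout-driven error in the inputs propagates directly into these outputs. Units and sample values are illustrative.

    ```python
    # Dynamic elastic moduli from bulk density (g/cc) and slownesses (us/ft).
    def dynamic_moduli(rhob_gcc: float, dtc_usft: float, dts_usft: float):
        vp = 304800.0 / dtc_usft                 # compressional velocity, m/s
        vs = 304800.0 / dts_usft                 # shear velocity, m/s
        rho = rhob_gcc * 1000.0                  # density, kg/m3
        g = rho * vs**2                          # shear modulus, Pa
        nu = (vp**2 - 2 * vs**2) / (2 * (vp**2 - vs**2))  # Poisson's ratio
        e = 2 * g * (1 + nu)                     # Young's modulus, Pa
        return e, nu

    e, nu = dynamic_moduli(2.45, 90.0, 160.0)    # plausible shale values
    print(f"E = {e / 1e9:.1f} GPa, nu = {nu:.2f}")
    ```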

    Two classifications of machine learning algorithms for identifying outliers and poor-quality data due to bad hole conditions are discussed: supervised and unsupervised learning. The first allows the expert to train a model from existing and categorized data, whereas unsupervised learning algorithms learn from a collection of unlabeled data. Each classification type has distinct advantages and disadvantages.

    Identifying outliers and conditioning well logs prior to a petrophysical analysis or machine learning model can be a time-consuming and laborious process, especially when large multi-well datasets are considered. In this study, a new supervised learning algorithm is presented that utilizes multiple-linear regression analysis to repair well log data in an iterative and automated routine. This technique allows outliers to be identified and repaired whilst improving the efficiency of the log data editing process without compromising accuracy. The algorithm uses sophisticated logic and curve predictions derived via multiple linear regression in order to systematically repair various well logs.
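    The published routine involves more sophisticated logic than can be shown here, but its core idea, predicting each curve from the others with multiple linear regression, flagging large residuals, and replacing them with the regression estimate, can be sketched as follows on synthetic data.

    ```python
    # Simplified sketch of iterative MLR-based log repair; not the published algorithm.
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LinearRegression

    def repair_curve(df, target, n_sigma=3.0, n_iter=3):
        inputs = [c for c in df.columns if c != target]
        curve = df[target].to_numpy(dtype=float).copy()
        for _ in range(n_iter):
            model = LinearRegression().fit(df[inputs], curve)
            pred = model.predict(df[inputs])
            resid = curve - pred
            bad = np.abs(resid) > n_sigma * resid.std()
            if not bad.any():
                break
            curve[bad] = pred[bad]               # repair flagged outliers
        return curve, bad

    # Synthetic logs with washout-like spikes injected into bulk density.
    rng = np.random.default_rng(1)
    gr = rng.normal(60, 20, 500)
    rhob = 2.7 - 0.004 * gr + rng.normal(0, 0.02, 500)
    logs = pd.DataFrame({"GR": gr, "NPHI": 0.45 - 0.15 * (rhob - 2.0),
                         "DT": 140.0 - 40.0 * (rhob - 2.0), "RHOB": rhob})
    logs.loc[rng.choice(500, 20, replace=False), "RHOB"] -= 0.4

    repaired, flagged = repair_curve(logs, "RHOB")
    print(f"Repaired {flagged.sum()} of 500 samples on the final pass")
    ```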

    A clear improvement in efficiency is observed when the algorithm is compared to other currently used methods. These include manual processing by a petrophysicist and unsupervised outlier detection methods. The algorithm can also be leveraged over multiple wells to produce more generalized predictions. Through a platform created to quickly identify and repair invalid log data, the results are controlled through input and supervision by the user. This methodology is not a direct replacement of an expert interpreter, but complementary by allowing the petrophysicist to leverage computing power, improve consistency, reduce error and improve turnaround time.

2020

  1. Arkalgud, R.; McDonald, A.; Brackenridge, R.

    Abu Dhabi International Petroleum Exhibition & Conference (ADIPEC)

    Abstract

    Automation has impacted our everyday lives through increased speed of operations and execution of decisions. However, these processes and decisions are wholly dependent on choices made during automation model creation. Quick selection of input variables is key to the predictive modelling process, allowing for optimization of the final model. Experienced Eye, the new methodology proposed here, aims to identify the optimum input variables for modelling by retaining the relevant inputs and removing those that are irrelevant.

    To enable the ranking of input variables with varied ranges of spectral distributions, a new methodology, Experienced Eye, is adopted, based on Domain Transfer Analysis (DTA) techniques, novel transfer functions, and advanced optimisation techniques. Once the dataset has been transformed into parametric space, Cholesky's method is used to solve the resulting matrix system. During the solution process, inter-relationships between the input variables and the target variable are developed in order to solve for the target variable. The comparison between variables can then be used to rate and rank them, forming the basic criterion for selection in the automation cycle.
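    While the DTA/EE formulation itself is proprietary, the linear-algebra step named above is standard: once the problem is cast as a symmetric positive-definite system Ax = b, a Cholesky factorisation solves it efficiently, as in this small sketch.

    ```python
    # Solving a symmetric positive-definite system via Cholesky factorisation.
    import numpy as np
    from scipy.linalg import cho_factor, cho_solve

    rng = np.random.default_rng(0)
    M = rng.normal(size=(5, 5))
    A = M @ M.T + 5 * np.eye(5)                  # symmetric positive definite
    b = rng.normal(size=5)

    c, low = cho_factor(A)                       # A = L L^T
    x = cho_solve((c, low), b)
    print(np.allclose(A @ x, b))                 # True
    ```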

    Ranking random variables has long been a challenging area of research, and various statistical measures, such as the mean, median, and standard deviation, have been developed to compare variables. Experienced Eye, a novel DTA derivative, instead ranks variables based on their physical characteristics and attributes rather than on arithmetic averages and data-pooling methods, which are discordant with the physics of the system. Traditional methods and procedures were applied to case studies with varied configurations, and to demonstrate the method a case study was carried out using cased-hole pulsed neutron data. Conclusions based on these numerical experiments show that Experienced Eye provides satisfactory results in comparison with other statistical and non-statistical methods, with results tabulated for comparison. Application of this new method successfully identified the optimum input variables for the selected target variable; once irrelevant inputs were removed, model accuracy and precision increased considerably due to the reduction of noise.

    Unique to DTA is the Trust Region Approach (TRA), which has been developed and incorporated into the solution technique. A trust region is defined for each input variable and the target variable in varied increments, and within any given trust region a unique solution is evolved. The solution produced is based upon the most appropriate conditionality defined by the system variables and is bound to satisfy the imposed constraints. This technique enables the DTA method to produce more relevant solutions than other prediction techniques.

2019

  1. Arkalgud, R.; McDonald, A.; Crombie, D.

    SPWLA 60th Annual Logging Symposium

    Abstract

    Today, many machine-learning techniques are regularly employed in petrophysical modelling, such as cluster analysis, neural networks, fuzzy logic, self-organising maps, genetic algorithms, and principal component analysis. While each of these methods has its strengths and weaknesses, one challenge common to most existing techniques is how best to handle the variety of dynamic ranges present in petrophysical input data. Mixing input data with logarithmic variation (such as resistivity) and linear variation (such as gamma ray) while effectively balancing the weight of each variable can be particularly difficult to manage.
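    A common remedy for this mixed-range problem, shown below purely as an illustration, is to log-transform the logarithmically varying curves (such as resistivity) before scaling, so that no single input dominates the model.

    ```python
    # Balancing a logarithmic curve (resistivity) against a linear one (gamma ray).
    import numpy as np
    import pandas as pd
    from sklearn.preprocessing import StandardScaler

    logs = pd.DataFrame({
        "GR": [45.0, 80.0, 120.0],               # linear variation, API units
        "RT": [2.0, 20.0, 2000.0],               # logarithmic variation, ohm.m
    })

    features = logs.copy()
    features["RT"] = np.log10(features["RT"])    # compress the dynamic range
    scaled = StandardScaler().fit_transform(features)
    print(scaled)
    ```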

    A novel method, Domain Transfer Analysis (DTA), has been developed that uses a non-linear partial differential equation solver to predict log curves, enabling more effective integration of disparate data types. DTA was conceived based on extensive research conducted in the field of computational fluid dynamics (CFD).

    This paper focuses on the application of DTA to petrophysics and its fundamental distinction from the various other statistical methods adopted in the industry. Case studies are shown predicting porosity and permeability for a variety of scenarios using the DTA method and other techniques; the results from the various methods are compared, and the robustness of DTA is illustrated. The example datasets are drawn from public databases within the Norwegian and Dutch sectors of the North Sea and Western Australia. Some have a rich set of input data, including logs, core, and reservoir characterisation, from which to build a model, while others have relatively sparse data, allowing the effectiveness of the method to be assessed when both rich and poor training data are available.

    The paper concludes with recommendations on the best way to use DTA in real time to predict porosity and permeability. Future and ongoing applications of DTA for petrophysical analysis encompass saturation, TOC, mineral volumes, and brittleness from the data available at varying stages of the drilling and completions process.

2018

  1. Law, S.; McDonald, A.; Castillo, E.; Mackay, E.; Fellows, S.

    SPE Europec featured at 80th EAGE Conference and Exhibition

    Abstract

    In the United Kingdom Continental Shelf (UKCS), a significant heavy oil prize of 9 billion barrels has previously been identified but not fully developed. In the shallow unconsolidated Eocene reservoirs of Quads 3 and 9, just under 3 billion barrels lie in the discovered but undeveloped fields of Bentley and Bressay. Discovered in the 1970s, they remain undeveloped due to the various technology challenges associated with heavy oil offshore and the presence of a basal aquifer. The Eocene reservoirs represent significant challenges to recovery due to the unconsolidated nature of the hydrocarbon-bearing layers. The traditional view has been that such a nature presents a risk to successful recovery due to sand mobility; reservoir and near-wellbore compaction; wormhole formation; and injectivity issues.

    We propose improving the ultimate oil recovery by a combination of aquifer water production and compaction drive. By interpreting public domain data from well logs, the range of geomechanical properties of the Eocene sands has been determined. A novel approach to producing the heavy oil unconsolidated reservoirs of the UKCS is proposed, producing the aquifer via dedicated water producers situated close to the oil-water contact (OWC); the location was determined by sensitivity analysis of water producer location and production rates. By locating water producers at the OWC with a production rate of 20,000 bbl/day of fluids, the incremental recovery at the end of the simulation is increased by 4.1% OOIP of the total model relative to the 'no aquifer production' case, suggesting a significant increase in recovery can be achieved by producing the aquifer. A rate of 30,000 bbl/day located at the OWC was found to increase incremental recovery by 5.8% OOIP relative to the 'no aquifer production' case. In all cases, as the reservoir fluid pressure is reduced, oil recovery increases via compaction and reduced water influx into the oil leg. This reduced pressure leads to a higher tendency towards reservoir compaction, expressed as a change in mean effective stress and a reduction in porosity.

2017

  1. Mulders, F.; Lemanczyk, R.; Johnstone, K.; Spencer, S.; Castillo, E.; McDonald, A.; Mawira, A.; Rubyanto, D.; Khadafi, A. Nugraha

    SPE/IATMI Asia Pacific Oil & Gas Conference and Exhibition, Jakarta, Indonesia

    Abstract

    The drilling of wells offshore West Madura, East Java, can be challenging. The geological structure of the area often requires drilling at high deviations with large stepouts, through formations consisting of carbonates, shales, and sands. As a result, wellbore stability issues are frequently encountered, such as total mud losses, stuck pipe, loss of bottomhole assemblies and associated sidetracking, leading to non-productive drilling time and unnecessary costs. In order to lower the associated risks, the operator commissioned a geomechanics study to identify the root causes of the wellbore stability issues and provide recommendations for improved drilling of future development wells.

    Numerous wells had been drilled within the area of interest over more than three decades, resulting in a large variation in the availability and quality of data. Recently acquired 3D seismic data were also available. Therefore, a multidisciplinary approach was employed with geomechanics at its core, accompanied by well log conditioning, generation of synthesized shear sonic logs, simultaneous seismic inversion, and drilling engineering. The integration of the different disciplines ensured the development of robust 1D and 3D geomechanical models, which were applied to develop mud weight recommendations for the planned development wells.
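    The abstract does not state how the shear sonic logs were synthesised; one widely used empirical option, shown here only as an illustration, is Castagna's mudrock line, Vs = 0.8621 Vp - 1.1724, with velocities in km/s.

    ```python
    # Castagna's mudrock line as an illustrative shear-velocity estimator.
    def castagna_vs(vp_kms: float) -> float:
        """Estimate shear velocity (km/s) from compressional velocity (km/s)."""
        return 0.8621 * vp_kms - 1.1724

    vp = 3.5                                     # km/s
    print(f"Vp = {vp} km/s -> Vs ~ {castagna_vs(vp):.2f} km/s")
    ```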

    Firstly, a 1D geomechanical model was constructed. Two recently drilled wells had excellent data sets: extended leakoff and minifrac test results showed very consistent fracture closure pressures. This, combined with the presence of borehole breakouts and direct rock strength measurements on core, allowed the determination of the minimum and maximum horizontal stresses with only small ranges of uncertainty. The 1D geomechanical model was further calibrated by a detailed comparison with critically reviewed drilling incidents. Simultaneously, well logs were conditioned and pseudologs were created, which were used for 3D simultaneous seismic inversion, from which rock property volumes (P-impedance, S-impedance, and Vp/Vs) were in turn derived. Gardner's relationship was used to transform the seismic velocity data to a density volume. The 1D geomechanical model was subsequently combined with the 3D seismic data via a structural model grid, resulting in a full 3D geomechanical model containing cubes of pore pressure, principal in-situ stresses, elastic rock properties, and rock strength. Finally, wellbore stability analyses were performed for the planned development wells, including a quantitative risk assessment to gauge the impact of uncertainties in various key variables on the overall potential drilling success. Well deviation and azimuth sometimes showed a counterintuitive effect on recommended mud weights, as illustrated by stereonet plots.
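    Gardner's relationship, cited above, is commonly written as rho = a * Vp^b with a = 0.31 and b = 0.25 for Vp in m/s and density in g/cc; the study may well have refit these coefficients locally, so the defaults below are illustrative only.

    ```python
    # Gardner's velocity-density transform with common default coefficients.
    def gardner_density(vp_ms: float, a: float = 0.31, b: float = 0.25) -> float:
        """Bulk density (g/cc) from P-wave velocity (m/s)."""
        return a * vp_ms**b

    print(f"Vp = 3000 m/s -> rho ~ {gardner_density(3000.0):.2f} g/cc")
    ```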

    A key factor in the execution of this project was the integration of data and expertise in petrophysics, seismic inversion, geomechanics, and drilling engineering over a relatively short timeframe to deliver a technically robust set of mud weight windows which, combined with recommendations based on a detailed review of past drilling practices, should enable the successful drilling of the wells in this very challenging environment.

2015

  1. Law, S.; McDonald, A.; Fellows, S.; Reed, J.; Sutcliffe, P.

    SPE Offshore Europe Conference and Exhibition, Aberdeen, Scotland, UK

    Abstract

    A high-level screening of UKCS oil fields has been performed to identify the most likely LSWF candidates, using screening criteria with a focus on kaolinite clay content. The screening results suggest that approximately 57% of the fields have a kaolinite clay content of 6% or higher. Of these fields, 26% were water-wet and 74% were mixed-wet. This suggests that a significant number of fields would be eligible for consideration for LSWF EOR, although their suitability will depend on field maturity (current recovery factor and facilities constraints).

    The difficulty in applying LSWF in tertiary mode, unlike secondary mode, is in obtaining a reasonable prediction of how the reservoir is likely to respond. The question of core availability and quality has been raised in a number of studies of LSWF and electrical property testing. We propose a methodology that can compensate for the lack of usable core based on petrophysical log response: the logs can be used to determine the clay types present (including fractions), from which the cation exchange capacity can be calculated. Selected compositions from anonymised field core data were used to quality control the log-derived values.

    The most likely recovery mechanism, multi-component ion exchange (MIE), requires the input of key electrical properties into the models (cation exchange capacity, reactive surface area, activation energy, and mineral fraction) in order to predict the response of the reservoir to LSWF. In this study, the effect of clay content on the reservoir response was modelled indirectly by altering the cation exchange capacity relative to the clay mineral fraction present in the reservoir. Using a mechanistic modelling approach, homogeneous Cartesian models were run in the compositional finite-difference reservoir simulator GEM to assess the impact on oil recovery.

    The simulated coreflood tests reveal that recovery under secondary LSWF was 68.4%, compared with 63.6% for high-salinity formation water; the conservative nature of the relative permeability curves limited the incremental recovery. An analysis of tertiary recovery using a coreflood based on Fjelde et al. (2012) revealed that cation exchange impacts the predicted recovery by up to 2.65% OOIP over the range of 5-30% clay content. Given that tertiary recovery is considered in the literature to be between 6 and 12%, this is significant and highlights that if idealised data are selected rather than real field data, there is significant potential to under- or over-predict the incremental recovery.
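    As a toy version of the log-based CEC workflow described above, the sketch below weights typical textbook per-mineral CEC values by clay-mineral fractions; the representative values are assumptions for illustration, not the study's calibrated inputs.

    ```python
    # Illustrative CEC estimate from clay-mineral weight fractions (meq/100 g).
    TYPICAL_CEC = {"kaolinite": 8.0, "illite": 25.0, "smectite": 100.0, "chlorite": 20.0}

    def cec_from_clays(fractions):
        """Weighted CEC (meq/100 g) from bulk clay-mineral weight fractions."""
        return sum(TYPICAL_CEC[mineral] * frac for mineral, frac in fractions.items())

    print(f"CEC ~ {cec_from_clays({'kaolinite': 0.10, 'illite': 0.05}):.1f} meq/100 g")
    ```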