Oilfield Lithology Prediction from Drilling Data with Machine Learning
Published:
Read this work in Towards Data Science
When drilling into the subsurface to search the oil, drilling engineers look for many ways to know the type of rock (lithology) of the formation they are drilling. They could do logging-while-drilling (LWD) to acquire real-time petrophysical data, but this is often very costly. Or they will coordinate with mud loggers who compiled the drill cutting samples and analyzed the lithology in the lab, but this takes time. In fact, drilling engineers need short time to make a fast decision. Therefore, I proposed to apply ML to predict lithology from drilling data. Drilling data is something that the engineers have. ML can be considered a viable solution for this need.
The data used in this work was from the open-source Volve oil field dataset in the North Sea. To achieve our objective, real-time drilling data and mud log were integrated, as later on this will be used for training dataset. The mud log data was imbalanced, i.e. there are far more records of sandstone than that of shale. Therefore, Synthetic Minority Oversampling Technique (SMOTE) was used to oversample the minority points before model training. The data was split into training and test data using stratified sampling. Next, AdaBoost model was trained on the data, which achieved 96% accuracy, 86% precision, and 89% recall.
Drilling engineers can use this model to predict lithology instantly whenever drilling measurements come to them in the rig site. The more data from cuttings analyzed from the mud logger come, the more the training data can be added. Further improvement after this work is to implement real-time APIs such as Apache Kafka to ingest data and perform ML prediction.