PART ONE...

Machine learning is a hot topic these days. It has permeated discussions about technology in many fields, the oil and gas industry included. It is transforming the way we process and interact with data and expanding what we can accomplish with it. But even as it becomes a prevalent technological force, it retains something of a mystical aura for some, and many people still don’t understand the basics of how it works.

A good place to start in understanding machine learning is understanding what it is used to accomplish. Machine learning can have a lot of different applications, but at its most basic it is a tool used to find patterns in large sets of data and predict outcomes. These data can be measurements of anything—well data or financials, for example.

In the past, statisticians and data scientists would write a number of candidate mathematical formulas by hand and then test each one to see which fit the data best. As computers have gotten faster, a new approach has become practical: the computer looks at the data and goes through a process of repetition, changing the values in a formula and checking how well the result matches the data. The formula it arrives at is usually far more complicated than one a human would ever consider, and it is not limited by a person’s preconceptions about which formula ought to fit, although it can still reflect biases present in the data itself.
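The process of repetition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the measurements are made up, the formula is a simple straight line, and the adjustment rule (gradient descent on the mean squared error) is one common choice among many. The names `fit_line`, `learning_rate`, and `steps` are hypothetical.

```python
# Hypothetical, made-up measurements that follow y = 2*x + 1 exactly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Repeatedly nudge slope and intercept to shrink the mean squared error."""
    slope, intercept = 0.0, 0.0  # start from an arbitrary guess
    n = len(xs)
    for _ in range(steps):
        # How far off the current formula is on every data point.
        errors = [(slope * x + intercept) - y for x, y in zip(xs, ys)]
        # Gradient of the mean squared error with respect to each value.
        grad_slope = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2.0 / n * sum(errors)
        # Move each value a small step in the direction that reduces the error.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


slope, intercept = fit_line(xs, ys)
print(f"learned formula: y = {slope:.3f} * x + {intercept:.3f}")
```

Nobody tells the computer that the slope is 2 and the intercept is 1; it arrives close to those values by trying, measuring the error, and adjusting, many times over. Real machine-learning models do the same thing with far more values to adjust.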

To begin the machine-learning process, the appropriate algorithms must be selected. An algorithm is the procedure the computer uses to work through the data iteratively. There are many types of algorithms, and the choice depends on the type of problem being solved and what you’re trying to measure. Once the problem has been identified and the algorithms have been chosen, the relevant data must be identified. Sometimes data are collected that are not relevant to the outcome of the problem being addressed, and these data have to be weeded out to avoid skewing the results. Then half of the data is used for “training”: it is run through the algorithm repeatedly. Once training is complete, the end product is a model. The model is the formulaic output of machine learning that results from iteratively running the data through the algorithm. Finally, the second half of the data, which was not used in training, is run through the model to determine whether it is viable, that is, whether it produces the expected results when the data are processed by it. If all expectations are met, then machine learning has helped you create a model that can predict outcomes better and faster than a human would be able to.
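The workflow above, with one half of the data used for training and the other half used to check the resulting model, might look like the following sketch. Everything here is an assumption made for illustration: the data are synthetic, the helper names (`make_data`, `train`, `mean_abs_error`) are hypothetical, the viability threshold of 1.0 is arbitrary, and a closed-form least-squares fit stands in for the iterative training a real algorithm would perform.

```python
import random


def make_data(n=100, seed=0):
    """Generate synthetic (x, y) pairs: y is about 3*x + 2 plus random noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        y = 3.0 * x + 2.0 + rng.gauss(0.0, 0.5)
        data.append((x, y))
    return data


def train(points):
    """Fit a straight line by ordinary least squares; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def mean_abs_error(model, points):
    """Average distance between the model's predictions and the known outcomes."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in points) / len(points)


data = make_data()
random.Random(1).shuffle(data)          # mix the rows before splitting
half = len(data) // 2
train_set, test_set = data[:half], data[half:]

model = train(train_set)                # first half: produce the model
test_error = mean_abs_error(model, test_set)  # second half: check the model
is_viable = test_error < 1.0            # hypothetical acceptance threshold

print(f"model: y = {model[0]:.2f} * x + {model[1]:.2f}")
print(f"error on held-out half: {test_error:.2f}, viable: {is_viable}")
```

The important design choice is that the second half of the data plays no part in training. A model that only does well on data it has already seen has memorized rather than learned, so the held-out half is what tells you whether the predictions can be trusted.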

There is, of course, a lot more going on behind the scenes that has not been included in this simplified explanation, but the most recent advances in machine learning mean that you don’t need to be a mathematician to use it. Machine learning is a practical tool that enables people to predict outcomes quickly and accurately, and it is becoming increasingly accessible to all kinds of users.