Model Scoring & Development
Updated: May 20
Click play on the video below or keep reading for a discussion between Data Scientist Ildar Abdrashitov and host Nicole Hutchison about machine learning model scoring and development.
Q1: What is Model Scoring? What does it mean and why is it so important?
Answer: To begin with, training a machine learning model requires three major decisions.
First, you need to choose the kind of model you are going to use and what its architecture will be.
Second, you have to decide on the hyperparameters for that model.
Third, to build a reliable model, you have to decide what kind of data, or in other words which features, you are going to select for it.
To arrive at these decisions, you have to score your model. Machine learning is a very iterative process: you train, retrain, try different hyperparameters and tune your model, and at every iteration you score it. You calculate your metric for that model, and by doing that you compare it against the other possible options, so you can arrive at the best model you can have for the data you are working with.
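The train-and-score loop described above can be sketched in a few lines. This is a minimal illustration rather than anything shown in the interview: the k-nearest-neighbours regressor, the synthetic data, and the candidate values of `k` are all assumptions chosen to keep the example self-contained.

```python
import random

def knn_predict(train, x, k):
    """Predict y for x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def score_candidates(train, valid, k_values):
    """Score one candidate model per hyperparameter value on held-out data."""
    y_true = [y for _, y in valid]
    scores = {}
    for k in k_values:
        y_pred = [knn_predict(train, x, k) for x, _ in valid]
        scores[k] = mean_absolute_error(y_true, y_pred)
    return scores

# Synthetic data (an assumption for illustration): y = 2x plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
data = [(x, 2 * x + rng.gauss(0, 1)) for x in xs]
train, valid = data[:150], data[150:]

# Each iteration trains a candidate and scores it; the lowest error wins.
scores = score_candidates(train, valid, k_values=[1, 5, 20, 100])
best_k = min(scores, key=scores.get)
for k, mae in scores.items():
    print(f"k={k:<3} validation MAE={mae:.3f}")
print("best k:", best_k)
```

The same pattern applies when the thing being varied is the model type or the feature set instead of a hyperparameter: every candidate is scored with the same metric on the same held-out data, so the scores are comparable.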
A few words about metrics
There are many metrics for machine learning models, and different metrics apply to different categories of machine learning problems. For classification problems, you may choose precision, recall, or the F-measure. For regression models, common practice in the industry is to choose R-squared, Mean Absolute Error (MAE), or Mean Absolute Percentage Error (MAPE).
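For readers who want to see what these metrics compute, here is a plain-Python sketch using their standard textbook definitions. The function names and the toy labels and values are assumptions made for illustration; in practice a library implementation would normally be used.

```python
def precision_recall_f1(y_true, y_pred):
    """Binary classification metrics; label 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error as a fraction; assumes no actual value is zero."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Toy classification labels (hypothetical).
print(precision_recall_f1([1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1]))

# Toy regression values (hypothetical).
actual = [100, 200, 300, 400]
predicted = [110, 190, 310, 380]
print(mae(actual, predicted), mape(actual, predicted), r_squared(actual, predicted))
```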
Q2: How does someone go about choosing the right metrics for their specific model?
Answer: Well, actually, there is no golden rule for choosing the metric. First, you need to realize that your metric should serve the business need that your machine learning model is going to solve. For instance, if you're training a model to predict power consumption and it is enough for you to have a model that, on average, predicts future power consumption quite accurately, you may just stop at mean absolute error. However, if you need your model to capture large deviations and so-called outliers in your data, such as large spikes in power consumption, then root mean square error will probably be the metric of your choice, because squaring the errors penalizes large misses much more heavily than small ones.
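A small worked example makes the difference between the two metrics concrete. The consumption figures and the two sets of predictions below are invented for illustration: one hypothetical model is off by the same amount everywhere, the other is close most of the time but badly misses a spike.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical power consumption: nine ordinary readings and one spike.
actual = [50] * 9 + [150]

# Model A is off by 10 on every reading, including the spike.
model_a = [60] * 9 + [160]
# Model B is off by only 2 on ordinary readings but misses the spike by 82.
model_b = [52] * 9 + [68]

print("A: MAE", mae(actual, model_a), "RMSE", rmse(actual, model_a))
print("B: MAE", mae(actual, model_b), "RMSE", rmse(actual, model_b))
```

Both models have a mean absolute error of 10, so that metric cannot tell them apart. The root mean square error is 10 for model A and 26 for model B, which exposes the missed spike.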
Q3: Can you shed some light on what model deployment is and what it looks like to deploy a model?
Answer: A model by itself is basically an object with a set of learned parameters, and it is of little use unless it is wrapped in a consumable format. By this, I mean that a model should serve business IT systems and users in a convenient way. This is why your machine learning model should be wrapped in some sort of application. In the industry, this is most commonly done by creating an application with a REST API that serves the model, so that IT systems, business systems, business intelligence systems and any other systems that consume the model can talk to it in a very efficient way. This, in essence, is model deployment.
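To give a feel for what wrapping a model in a REST API can look like, here is a minimal sketch using only Python's standard library. Everything specific in it is an assumption for illustration: the `LinearModel` stand-in and its parameters, the `/predict` path, the JSON request shape, and the port. A production service would typically use a web framework and load a real trained model.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class LinearModel:
    """Stand-in for a trained model: an object holding learned parameters."""
    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def predict(self, features):
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias

MODEL = LinearModel(weights=[0.5, 2.0], bias=1.0)  # hypothetical parameters

def handle_predict(body):
    """Turn a JSON request body into (HTTP status, JSON response bytes)."""
    try:
        features = json.loads(body)["features"]
        if len(features) != len(MODEL.weights):
            raise ValueError("wrong number of features")
        prediction = MODEL.predict(features)
    except (ValueError, KeyError, TypeError):
        error = {"error": 'expected a body like {"features": [x1, x2]}'}
        return 400, json.dumps(error).encode()
    return 200, json.dumps({"prediction": prediction}).encode()

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

def serve(host="127.0.0.1", port=8000):
    """Start the service; blocks until interrupted."""
    HTTPServer((host, port), PredictHandler).serve_forever()
```

Calling `serve()` starts the service, after which any consuming system can send a POST request to `/predict` with a body such as `{"features": [2, 3]}` and receive the prediction as JSON. Keeping `handle_predict` separate from the HTTP handler means the request logic can be checked without starting a server.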
Once your model is deployed, you have to track and monitor its quality metric and how that metric behaves over time. You do this because models tend to degrade: the business changes over time, and therefore so does the data that describes a given business process.
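One simple way to monitor a deployed model, sketched here under stated assumptions, is to keep a rolling window of recent prediction errors and compare it with the error measured at training time. The `MetricMonitor` class, the window size, and the tolerance factor are all hypothetical choices; the interview does not prescribe a particular monitoring method.

```python
from collections import deque

class MetricMonitor:
    """Tracks a rolling mean absolute error and flags degradation."""

    def __init__(self, baseline_mae, window=50, tolerance=1.5):
        self.baseline_mae = baseline_mae    # error measured when the model was trained
        self.tolerance = tolerance          # allowed multiple of the baseline
        self.errors = deque(maxlen=window)  # keeps only the most recent errors

    def record(self, actual, predicted):
        self.errors.append(abs(actual - predicted))

    def rolling_mae(self):
        if not self.errors:
            return None
        return sum(self.errors) / len(self.errors)

    def is_degraded(self):
        # Only judge once the window is full, to avoid noisy early alarms.
        if len(self.errors) < self.errors.maxlen:
            return False
        return self.rolling_mae() > self.baseline_mae * self.tolerance

monitor = MetricMonitor(baseline_mae=1.0, window=5, tolerance=1.5)

# Early on, predictions are off by about 1, in line with the baseline.
for actual in [10, 12, 11, 13, 12]:
    monitor.record(actual, actual + 1)
print(monitor.rolling_mae(), monitor.is_degraded())

# Later the data drifts and the same model is now off by about 3.
for actual in [20, 22, 21, 23, 22]:
    monitor.record(actual, actual - 3)
print(monitor.rolling_mae(), monitor.is_degraded())
```

When the flag is raised, the usual response is to investigate what changed in the data and retrain the model on more recent observations.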
Interview conducted by Nicole Hutchison
Expert: Ildar Abdrashitov, Data Scientist.