Two weeks ago, in part 3 of my ML.NET adventure, I wrote about AutoML through the Command Line Interface (CLI) and how I generated a model with it. In this post, I wish to expand on that.
What AutoML does and its coverage
As of the time of writing, three machine learning tasks have been incorporated into AutoML: binary classification, multiclass classification and regression.
I also understand from Microsoft Docs that more machine learning tasks will be incorporated in the future.
The various commands possible
> mlnet auto-train --task binary-classification --dataset "customer-feedback.tsv" --label-column-name Sentiment
> mlnet auto-train --task regression --dataset "cars.csv" --label-column-name Price
> mlnet auto-train --task multiclass-classification --dataset "Training.csv" --label-column-name "Risk" --max-exploration-time 600
Output from ML.NET
After running the respective commands for ML.NET, you will notice one folder, which consists of:
Logs: The log file contains full logs with information on all the iterations that ran while evaluating the algorithms.
ConsoleApp: This application, written in C#, allows you to run the model and make predictions like an end-user application.
It contains MLModel.zip, a serialized model that is ready to use for running predictions.
It also contains the code that was used to generate the model, which we can use for retraining purposes.
Quality of the generated model
Next, let us understand more about the quality of the model that was generated.
You will notice –
with Binary Classification – comes
with Multiclass Classification – comes
with Regression – comes
You can learn how to interpret the metrics via this link: https://docs.microsoft.com/en-us/dotnet/machine-learning/resources/metrics
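To make the binary classification metrics a little more concrete, here is a small Python sketch that works out accuracy, precision, recall and F1 score by hand. The labels and predictions are made-up values for illustration only, not output from ML.NET, and binary_metrics is a hypothetical helper name.

```python
# Made-up ground truth and predictions (1 = positive, 0 = negative).
actual = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]


def binary_metrics(actual, predicted):
    """Compute common binary classification metrics from two label lists."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


metrics = binary_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

A higher value is better for each of these; the link above explains which ranges are considered good for each task.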
In my last post, I covered the steps I took to set up the environment required for ML.NET to work. Recall that we need to load the data, prepare the data, train the model and lastly use the model. Today we are focusing on training and getting the model. There are 3 ways for us to get an ML.NET model.
In this post, I will focus on using the CLI (command line interface) to test and find the best algorithm for us. The sample data I am using is from https://archive.ics.uci.edu/ml/machine-learning-databases/00331/sentiment%20labelled%20sentences.zip , more specifically the Yelp file. If you are not able to get it anymore, you may download it from here: https://www.limguohong.com/wp-content/uploads/2019/06/yelp_labelled.txt
First, we need to understand the data that we downloaded and what it means. If you open it in Excel or a text editor, you will notice that every line is a piece of text followed by a digit at the end. The digit is binary: 1 or 0. You will further notice that positive lines are labelled with a 1 and negative lines are labelled with a 0, and that there are 1000 lines of text (reviews).
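As a quick sanity check, data in this shape can be inspected with a few lines of Python. This is a minimal sketch that assumes each line is a review, then a tab, then the 0 or 1 label; the three sample lines are invented stand-ins for the real file, and count_labels is a hypothetical helper.

```python
from collections import Counter

# Invented sample in the assumed "review<TAB>label" layout.
SAMPLE = (
    "The food was great.\t1\n"
    "Service was slow and rude.\t0\n"
    "Loved the place!\t1\n"
)


def count_labels(text):
    """Count how many lines carry each label, skipping anything else."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The label is whatever follows the last tab on the line.
        _review, _sep, label = line.rpartition("\t")
        if label not in ("0", "1"):
            continue  # e.g. a header row or a malformed line
        counts[label] += 1
    return counts


counts = count_labels(SAMPLE)
print("positive:", counts["1"], "negative:", counts["0"])
```

To try it on the real file, read the text with open("yelp_labelled.txt", encoding="utf-8").read() and pass it to count_labels; the two counts should then add up to the 1000 reviews mentioned above.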
Problem we are solving via machine learning
In this tutorial post, we are attempting to train a model to understand whether a review is positive or negative and return the result accordingly. We are using Yelp reviews to train the model via the AutoML CLI.
As you have probably noticed, predicting whether a new review is likely POSITIVE or NEGATIVE is a binary form of classification, and this sheds some light on which task we should use.
Which tasks should we use?
ML.NET has 7 tasks right now. Based on the problem we are solving, we need to choose which task (or, as I sometimes call it, classification of problem) it falls under.
For an explanation of what each task does, please check the following link: https://docs.microsoft.com/en-us/dotnet/machine-learning/resources/tasks
Now that we understand what tasks are available, we will leave it to the AutoML CLI to tell us which Trainer we should use. The concept of a Trainer is:
Trainer = Algorithm + Task.
In this specific tutorial, as the problem is a binary one, the best task to use is Binary Classification.
Tutorial on Binary Classification – AutoML CLI ML.NET
mlnet auto-train --task binary-classification --dataset "yelp_labelled.txt" --label-column-name sentiment_label --max-exploration-time 20
*For an explanation, please refer to the end of this post.
*You may also want to try a longer exploration time and see whether a better algorithm is suggested. In my screenshot below, I used 20 seconds and 60 seconds and the results were different.
In my next post, I will share how to make sense of the generated files.
ML.NET Command breakdown
mlnet auto-train --task binary-classification --dataset "yelp_labelled.txt" --label-column-name sentiment_label --max-exploration-time 20
Notes about the command.
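One way to read the command is as the tool name, the auto-train verb, and a set of flag and value pairs. The Python sketch below rebuilds the same command from those parts. The one-line descriptions in the comments are my own summary of each flag, so treat them as a guide and check the ML.NET CLI reference for the authoritative wording; build_command is a hypothetical helper.

```python
# Flag/value pairs taken from the command used in this post.
options = {
    "--task": "binary-classification",          # kind of ML problem to solve
    "--dataset": "yelp_labelled.txt",           # file holding the training data
    "--label-column-name": "sentiment_label",   # column the model should predict
    "--max-exploration-time": "20",             # seconds AutoML may spend exploring
}


def build_command(options):
    """Return the mlnet auto-train command as a list of arguments."""
    args = ["mlnet", "auto-train"]
    for flag, value in options.items():
        args += [flag, value]
    return args


command = build_command(options)
print(" ".join(command))
```

Passing the list to subprocess.run(command) would launch the tool, provided mlnet is installed and on the PATH. Changing the value of --max-exploration-time is then a one-line edit when you want to compare a 20 second run with a 60 second run.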
Last week, I wrote the first post on ML.NET, covering the basics and the various steps required to get a model up and use it. I will cover how to go about preparing, coding and using them in later posts.
The four key steps involved are
1. Loading the data
2. Transforming the data
3. Training and Generating the model
4. Using the trained model
In this tutorial, we will focus on getting the environment on your computer set up correctly so that we can start doing ML.NET. Kindly note that this tutorial is written for a Windows environment. As of the time of writing, I am on Windows 10 with Visual Studio 2017 Enterprise.
I started by calling the command (you can open Command Prompt and type it straight away)
but I was hit with the error:
'mlnet' is not recognized as an internal or external command, operable program or batch file.
I realized that I did not have mlnet installed. I then ran
dotnet tool install -g mlnet
and what? –
No executable found matching command "dotnet-tool"
After some searching, I concluded that this is because dotnet tool is only available from the .NET Core 2.1 SDK (version 2.1.300) onwards, and I was running 2.1.2.
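The version comparison behind that conclusion can be sketched in Python. This is only an illustration: the minimum of 2.1.300 is my assumption for the first SDK that ships dotnet tool, and parse_version and supports_dotnet_tool are hypothetical helpers, not part of any .NET tooling.

```python
def parse_version(text):
    """Turn a version string such as '2.1.300' into a comparable tuple."""
    # Drop any suffix such as '-preview5' before splitting on dots.
    core = text.strip().split("-")[0]
    return tuple(int(part) for part in core.split("."))


def supports_dotnet_tool(installed, minimum="2.1.300"):
    """Return True if the installed SDK is at least the assumed minimum."""
    return parse_version(installed) >= parse_version(minimum)


print(supports_dotnet_tool("2.1.2"))    # the version I was running
print(supports_dotnet_tool("2.2.100"))  # an example 2.2 SDK version
```

The installed version string can be obtained by running dotnet --version in Command Prompt.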
I went to https://dotnet.microsoft.com/download/dotnet-core/2.2 and downloaded .NET Core 2.2 (as of the time of writing, .NET Core 3 is in preview and hence I did not use it yet). Do note that the release was not compatible with VS 2017; if you are using VS 2017, there is another version for you to download.
After installing, restart your computer and try the install again by running the command:
dotnet tool install -g mlnet
Note that you have to wait. Nothing will happen for some time, and then it will just magically work!
Met YangLin, Pratibha, Eugene and Alex there while I attended the Windows 7 Development workshop at SMU organised by the SG Acad Team, MS Singapore. I have been following Jocelyn Villaraza's blog but never had a chance to see her in action, and today I finally did. She gave a pretty good presentation.
I am gonna rewrite one of her guides for my school's bootcamp. Hope to get it done when I am free next week (after Monday). I never knew that program was so cool.
Going to teach wushu tomorrow morning!