
ML.NET PART 4 – MACHINE LEARNING

Two weeks ago, in Part 3 of my ML.NET adventure, I wrote about AutoML through the command line interface (CLI) and how I generated a model with it. In this post I wish to expand on what AutoML covers and on the files that were generated.

What AutoML does and its coverage

As of time of writing, 3 machine learning tasks have been incorporated into AutoML:

  1. binary-classification
  2. multiclass-classification
  3. regression

I also understand from Microsoft Docs that more machine learning tasks may be incorporated in the future.

ML.NET CLI input and output

Image copied from https://docs.microsoft.com/en-us/dotnet/machine-learning/automate-training-with-cli

The various commands possible

> mlnet auto-train --task binary-classification --dataset "customer-feedback.tsv" --label-column-name Sentiment
> mlnet auto-train --task regression --dataset "cars.csv" --label-column-name Price
> mlnet auto-train --task multiclass-classification --dataset "Training.csv" --label-column-name "Risk" --max-exploration-time 600

Source – https://docs.microsoft.com/en-us/dotnet/machine-learning/reference/ml-net-cli-reference

Output from ML.NET

After running the respective command for ML.NET, you will notice 1 new folder that consists of

  1. logs folder
  2. ConsoleApp folder
  3. Model folder
  4. sln file (solution)

Logs – The logs folder contains the full log, with information on every iteration that ran while the algorithms were being evaluated.

ConsoleApp – This C# application allows you to run the model and make predictions like an end-user application.

Model –
It contains MLModel.zip, a serialized model that is ready to use for running predictions.
It also contains the code that was used to generate the model, which we can use for retraining.

Quality of the generated model

The CLI also reports metrics that describe the quality of the model that was generated. Which metrics you get depends on the task.

Binary Classification comes with –
1. Accuracy
2. AUC
3. AUPRC
4. F1-Score

Multiclass Classification comes with –
1. MicroAccuracy
2. MacroAccuracy

Regression comes with –
1. RSquared
2. Absolute-loss
3. Squared-loss
4. RMS-loss

You can read how to interpret the metrics via this link – https://docs.microsoft.com/en-us/dotnet/machine-learning/resources/metrics
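To get a feel for what some of these numbers mean, here is a minimal Python sketch that computes a few of them by hand from their textbook definitions. This is not the code ML.NET uses internally; the function names are my own, AUC and AUPRC are left out, and I assume the inputs are non-empty and do not lead to a division by zero.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Accuracy and F1 score from confusion-matrix counts (hypothetical helper)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1

def multiclass_accuracy(actual, predicted):
    """Micro accuracy pools all rows; macro accuracy averages the per-class accuracy."""
    micro = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
    per_class = []
    for label in sorted(set(actual)):
        rows = [p for a, p in zip(actual, predicted) if a == label]
        per_class.append(sum(p == label for p in rows) / len(rows))
    macro = sum(per_class) / len(per_class)
    return micro, macro

def regression_metrics(actual, predicted):
    """R-squared, mean absolute loss, mean squared loss and RMS loss."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    absolute_loss = sum(abs(e) for e in errors) / n
    squared_loss = sum(e * e for e in errors) / n
    rms_loss = math.sqrt(squared_loss)
    mean_actual = sum(actual) / n
    total_variation = sum((a - mean_actual) ** 2 for a in actual)
    r_squared = 1 - sum(e * e for e in errors) / total_variation
    return r_squared, absolute_loss, squared_loss, rms_loss
```

The multiclass helper shows why both numbers are reported: if one class dominates the data, micro accuracy can look good while macro accuracy reveals that a small class is always predicted wrongly.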

ML.NET PART 3 – MACHINE LEARNING

In my last post, I covered the steps I took to set up the environment required for ML.NET to work. Recall that we need to load the data, prepare the data, train the model and lastly use the model. Today we are focusing on training and getting the model. There are 3 ways for us to get a model in ML.NET.

  1. Coding it ourselves
  2. Using Model Builder Tool
  3. Using AutoML via CLI to perform model training and picking the best algorithm.

In this post, I will focus on using the CLI (command line interface) to test and get us the best algorithm. The sample data I am using is from https://archive.ics.uci.edu/ml/machine-learning-databases/00331/sentiment%20labelled%20sentences.zip , more specifically the Yelp file. If you are not able to get it anymore, you may download it from here https://www.limguohong.com/wp-content/uploads/2019/06/yelp_labelled.txt

Data
First, we need to understand the data that we downloaded and what it means. If you open it in Excel or a text editor, you will notice that every line is a piece of text followed by a digit at the end. The digit is binary, either 1 or 0. You will further notice that the positive lines are labelled with a 1 and the negative lines are labelled with a 0, and that there are 1000 lines of text (reviews).
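If you prefer to check the data with code instead of Excel, a small Python sketch like the one below can count the labels. I am assuming here that the text and the digit on each line are separated by a tab; the three sample lines are made up for illustration and are not taken from the Yelp file.

```python
from collections import Counter

# Hypothetical sample lines in the assumed "text<TAB>label" layout.
SAMPLE = "The food was great.\t1\nService was slow.\t0\nI would go back.\t1\n"

def parse_line(line):
    """Split one line into (review_text, label) on the last tab."""
    text, label = line.rstrip("\n").rsplit("\t", 1)
    return text, int(label)

def label_counts(lines):
    """Count how many reviews carry each label, skipping blank lines."""
    return Counter(parse_line(line)[1] for line in lines if line.strip())

counts = label_counts(SAMPLE.splitlines())
print(counts[1], "positive,", counts[0], "negative")
```

To run it on the real file, pass the lines of yelp_labelled.txt to label_counts instead of the sample.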

Problem we are solving via machine learning
In this tutorial post, we are attempting to train a model to understand whether a review is positive or negative and return the result accordingly. We are using the Yelp reviews to train the model via the AutoML CLI.

As you have probably noticed, predicting whether a new review is likely POSITIVE or NEGATIVE is a binary way of classifying, and this sheds some light on which task we should use.

Which tasks should we use?
ML.NET has 7 tasks right now. Based on the problem we are solving, we need to choose which task (I sometimes call it the classification of the problem) it falls under.

Tasks include

  1. Binary Classification
  2. Multiclass Classification
  3. Regression
  4. Clustering
  5. Anomaly Detection
  6. Ranking
  7. Recommendation

For an explanation of what each task does, please check the following link – https://docs.microsoft.com/en-us/dotnet/machine-learning/resources/tasks

Now that we understand which tasks are available, we will leave it to the AutoML CLI to tell us which Trainer we should use. The concept of a Trainer is
Trainer = Algorithm + Task.

In this tutorial, as the problem has only two possible outcomes, the task to use is Binary Classification.

Tutorial on Binary Classification – AutoML CLI ML.NET

  1. Create a new folder – I created “AutoML CLI Binary”.
  2. Place the txt file into the folder.
  3. We need to modify the data a bit, as it is missing the header that tells the system which column is the LABEL. Do note that the label must be of Boolean type (1 or 0, true or false). As you can see on https://docs.microsoft.com/en-us/dotnet/machine-learning/resources/tasks under Binary Classification, the label column is required to be Boolean. Open the txt file in Excel and add the header “sentiment_label” above the column of results.
  4. Go to the folder, open a command prompt there and run this command.
mlnet auto-train --task binary-classification --dataset "yelp_labelled.txt" --label-column-name sentiment_label --max-exploration-time 20

*For an explanation of the command, please refer to the end of this post.

  5. After running, you will notice that it reports how many iterations it ran and which trainer has the best accuracy.

*You may also want to try a longer exploration time and see whether it suggests a better algorithm. When I ran it with 20 seconds and with 60 seconds, the results were different.

  6. You will notice that a new folder has been generated.
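As a side note on step 3, the header can also be added with a few lines of Python instead of Excel. This is only a sketch: it assumes the file is tab-separated, and the column name “text” for the review column is my own placeholder (only “sentiment_label” is used by the command above).

```python
def add_header(lines, text_header="text", label_header="sentiment_label"):
    """Return the data lines with a tab-separated header row placed first."""
    header = f"{text_header}\t{label_header}"
    if lines and lines[0].rstrip("\r\n") == header:
        return list(lines)  # header already present, leave the data untouched
    return [header + "\n"] + list(lines)

# Made-up data lines, just to show the effect.
data = ["Loved this place.\t1\n", "Not coming back.\t0\n"]
print("".join(add_header(data)), end="")
```

For the real file, read yelp_labelled.txt into a list with readlines(), pass it through add_header and write the result back. Running it twice does no harm, because the function skips the work when the header is already there.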

In my next post, I will share how to make sense of the generated files.

ML.NET Command breakdown

mlnet auto-train --task binary-classification --dataset "yelp_labelled.txt" --label-column-name sentiment_label --max-exploration-time 20

Notes about the command.

  1. We are telling mlnet to run the auto-train command, which will “Create a new .NET project using ML.NET to train and run a model”.
  2. We then tell mlnet which task we want it to perform by passing --task. At time of writing, the ML.NET AutoML CLI supports the following.
    1. regression
    2. binary-classification
    3. multiclass-classification
  3. We then tell it via --dataset or -d which file mlnet is supposed to read. Since we are running the command in the folder that contains the file, we just input the file name “yelp_labelled.txt”.
  4. We can use either --label-column-name or --label-column-index to identify the column to predict. In this case, we used --label-column-name and told it to read the column named “sentiment_label”.
  5. We end the command by telling mlnet the maximum exploration time in seconds via --max-exploration-time. In this case we set it to 20 seconds.
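If you want to try several exploration times without retyping the command, the pieces above can be assembled in a short Python sketch. The helper name is my own, and it only uses the options already shown in this post; it builds and prints the command rather than running it.

```python
import shlex

def build_auto_train_command(task, dataset, label_column, max_seconds):
    """Assemble the mlnet auto-train arguments described in the notes above."""
    return [
        "mlnet", "auto-train",
        "--task", task,
        "--dataset", dataset,
        "--label-column-name", label_column,
        "--max-exploration-time", str(max_seconds),
    ]

command = build_auto_train_command(
    "binary-classification", "yelp_labelled.txt", "sentiment_label", 20
)
print(shlex.join(command))
```

Passing the same list to subprocess.run would execute it, provided mlnet is installed and the file is in the current folder.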

Posts-
ML.NET Introduction – Introduction
ML.NET Part 2 – Machine Learning – Environment setup
ML.NET Part 3 – Machine Learning – Generating Model via ML.NET CLI – Binary Classification

ML.NET Part 2 – Machine Learning

Last week, I wrote the first post on ML.NET, covering the basics and the various steps required to get a model up and use it. I will cover how to go about preparing, coding and using them in later posts.

The four key steps involved are
1. Loading the data
2. Transforming the data
3. Training and Generating the model
4. Using the trained model

In this tutorial, we will focus on getting the environment on your computer set up correctly so that we can start doing ML.NET. Kindly note that this tutorial is written for a Windows environment. As of time of writing, I am on Windows 10 with Visual Studio 2017 Enterprise.

Installing MLNET

I started by calling the command below (you can open Command Prompt and type it straight away)

mlnet

but I was greeted with the error –

'mlnet' is not recognized as an internal or external command, operable program or batch file. 

I realized that I did not have mlnet installed. I then ran

dotnet tool install -g mlnet

and what? –

No executable found matching command "dotnet-tool"

After some searching, I concluded that this is because dotnet tool is only available in the .NET Core SDK from version 2.1.300 onwards, and I was running 2.1.2.

dotnet version
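The version check itself is simple enough to sketch in Python. Note the assumption: I take the minimum to be SDK version 2.1.300, which is my reading of the version mentioned above, and the function names are made up for this example. The input is the text you get from running dotnet --version.

```python
MIN_SDK_FOR_TOOLS = (2, 1, 300)  # assumed threshold for the dotnet tool command

def parse_sdk_version(text):
    """Turn '2.2.107' or '3.0.100-preview5' into a tuple of integers."""
    core = text.strip().split("-", 1)[0]  # drop any '-preview...' suffix
    return tuple(int(part) for part in core.split("."))

def supports_global_tools(version_text):
    """True when the SDK version is at or above the assumed minimum."""
    return parse_sdk_version(version_text) >= MIN_SDK_FOR_TOOLS

print(supports_global_tools("2.1.2"))
print(supports_global_tools("2.2.107"))
```

Comparing tuples of integers avoids the trap of comparing the versions as plain text, where "2.1.2" would sort after "2.1.1000".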

I went to https://dotnet.microsoft.com/download/dotnet-core/2.2 and downloaded .NET Core 2.2 (as of time of writing, .NET Core 3 is in preview, so I did not use it yet). Do note that the latest release was not compatible with VS 2017; if you are using VS 2017, there is another version for you to download.

After installing, restart your computer and try the installation again by running the command.

dotnet tool install -g mlnet

Note that you have to wait. Nothing will happen for some time, and then it will just magically work!

mlnet installed!

Posts-
ML.NET Introduction – Introduction
ML.NET Part 2 – Machine Learning – Environment setup

ML.NET Introduction

Recently, my friend Maxx and I decided to embark on a quest to learn ML.NET. It came to our attention that ML.NET was released in Preview on 2 April 2019 and subsequently as a Stable Release on 3 May 2019.

As it is in a language that I am comfortable with, I decided to give it a try and see what capabilities are available and how we are able to build something.

Over the course of the next few weeks, we are going to try

  1. Coding it ourselves
  2. Using Model Builder Tool
  3. Using AutoML to perform model training and picking the best algorithm.

More information can be found on their site on
https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet

We are also going to follow the ML.NET tutorials released –
https://docs.microsoft.com/en-gb/dotnet/machine-learning/tutorials/ then attempt to run the sample –
https://github.com/dotnet/machinelearning-samples/blob/master/README.md

We have run through the Microsoft Docs detailing ML.NET and will be extracting the important points in the next few posts.

ML.NET will give developers the power to add in machine learning capabilities to .NET applications. With this ability, developers are now able to make predictions using the data presented.

The general steps required
1. Load the data
2. Prepare the data
3. Train and test the model
4. Get predictions / Using the model

ML.NET provides various functions and methods for these steps, and over the course of the next few weeks we will present them in our blog posts.

As of date of posting, ML.NET has the following Tasks.

  1. Binary Classification
  2. Multiclass Classification
  3. Regression
  4. Clustering
  5. Anomaly Detection
  6. Ranking
  7. Recommendation

Posts-
ML.NET Introduction – Introduction
ML.NET Part 2 – Machine Learning – Environment setup

Office Excel – TEXTJOIN Function

Microsoft releases new iterations of the Microsoft Office suite every three years for desktop, and Microsoft Office 365 is the subscription based cloud version of their Office software. Earlier in 2016, Microsoft released the 2016 edition of Office and some updates to Office 365 which added new features in Excel. Some very useful functions like CONCAT and TEXTJOIN were added, which make concatenating or joining text from multiple cells or strings in your spreadsheet much easier. These functions are only available in the latest Office 2016 desktop installation and Office 365 subscription. To show these new functions, here is the Excel CONCAT and TEXTJOIN function tutorial. I am breaking them into 2 different tutorials for ease of access.

The Excel TEXTJOIN function joins or combines text from multiple cells in your spreadsheet, with each string separated by a delimiter. The delimiter can be any string, for example a comma or a space. If the delimiter is empty, TEXTJOIN simply concatenates the strings, as in the previous tutorial. Here is how to use the Excel TEXTJOIN function:

The format of the Excel TEXTJOIN function is:
TEXTJOIN(delimiter, ignore_empty, text1, text2, … , textN)

Definition

  • “delimiter” is the character or string inserted between each string you want to join.
  • “ignore_empty” is either TRUE (exclude empty cells or strings) or FALSE (include empty cells or strings).
  • “text1” is the first string or cell and “textN” is the nth string or cell which you want to join.

Now here is how we use the Excel TEXTJOIN function:

  1. In your Excel spreadsheet, decide which cells or strings you want to join using TEXTJOIN.
  2. Then select the cell where you want to display the result of the TEXTJOIN function.
  3. In the following example spreadsheet, we want to join the strings in cells A2 through D2 and display the output in cell F2.
  4. Select cell F2 and enter the Excel TEXTJOIN function in the formula bar above the spreadsheet:
    =TEXTJOIN(" ",TRUE,A2,B2,C2,D2)

  5. After entering the function, you will see the result in cell F2:

    Office Excel – TEXTJOIN function

  6. As you can see in the example, the data in cells A2 through D2 is now joined in cell F2, and there are spaces between the strings since we entered a space as the delimiter in the function.
  7. You can do this with numbers as well, like in the result shown in cell F3:

    Office Excel – TEXTJOIN function

  8. The result in cell F4 is when TRUE is used:

    Office Excel – TEXTJOIN function

  9. The result in cell F5 is when FALSE is used, and you can see the difference between steps 8 and 9:

    Office Excel – TEXTJOIN function

  10. You can also add strings directly in the function as shown in the following examples:

    Office Excel – TEXTJOIN function
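For readers who think in code, the effect of the ignore_empty argument can be imitated with a rough Python sketch. This is not how Excel implements TEXTJOIN; it takes plain values instead of cell ranges, treats None and "" as empty cells, and the sample words are made up.

```python
def textjoin(delimiter, ignore_empty, *texts):
    """Rough imitation of TEXTJOIN for a flat list of values."""
    values = ["" if t is None else str(t) for t in texts]
    if ignore_empty:
        values = [v for v in values if v != ""]
    return delimiter.join(values)

# With TRUE the empty value is skipped; with FALSE it is kept,
# so the delimiter appears twice in a row.
print(textjoin(" ", True, "Office", "Excel", "", "TEXTJOIN"))
print(textjoin(" ", False, "Office", "Excel", "", "TEXTJOIN"))
```

The second line of output has a double space where the empty value sits, which is the same difference you see between the TRUE and FALSE results in the spreadsheet.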

—–
You can view more Office Excel Tutorials in the link too!