The current state of Machine Learning in .NET: an (un)exhaustive overview

28 March 2019

In this blog post, I’ll try to give a brief overview of the different forms of machine learning that are available to (.NET) developers on the Azure platform. Given the sheer size of Microsoft’s cloud offering and the speed at which our friends there are expanding it, I have no illusion of even nearing completeness. So please let me know if I missed something :)

Azure Cognitive Services

First stop is Azure Cognitive Services. This is a set of ready-to-use APIs that you can use to supercharge your applications with AI capabilities. No actual machine learning knowledge is required, so it’s a good place to start and be amazed by the power of machine learning.

Azure Cognitive Services consists of a number of AI services hosted in Azure, grouped into five subjects: Vision, Speech, Knowledge, Search and Language. Together they offer a broad spectrum of ready-to-use AI services, from emotion detection to sentiment analysis and knowledge interpretation. You can easily browse the services and give them a try to see how you can use them in your own applications.

Microsoft is constantly adding new services to this offering. For example, the Custom Vision service allows you to upload your own images and train the algorithm to recognize them. By uploading just a few images you are able to get some amazing results. But looks can be deceiving: you are not training a model from scratch. The service uses a deep neural network that was pre-trained on millions of images, and a technique called transfer learning adapts that existing model to recognize your specific images.

Another service that was added recently is Anomaly Detection. This service allows you to monitor time-series data and detect issues in it, so you can act accordingly. Or even better, act before a problem occurs.
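To make the idea of anomaly detection in time-series data a bit more concrete, here is a minimal sketch in Python. It is not the algorithm behind the Azure service (which is not described in this post); it is a simple rolling z-score check, and the function name, window size, threshold and sample readings are all assumptions made up for illustration.

```python
from statistics import mean, stdev


def detect_anomalies(values, window=5, threshold=3.0):
    """Return the indices of values that deviate strongly from recent history.

    A value is flagged when it lies more than `threshold` standard deviations
    away from the mean of the `window` values that precede it.
    This is an illustrative technique, not the Azure service's method.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change at all as an anomaly.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical sensor readings with one obvious spike at index 7.
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 10.3, 9.8, 25.0, 10.1, 10.0]
print(detect_anomalies(readings))  # prints [7]
```

A hosted service takes care of things this sketch ignores, such as seasonality and trends, which is exactly why it is convenient to start with the ready-made API.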

Azure ML Studio

Hopefully, Azure Cognitive Services has sparked your interest in machine learning. A logical next step would be to give Azure ML Studio a try. Azure ML Studio is a so-called MLaaS (Machine Learning as a Service) solution from Microsoft that allows you to set up machine learning experiments using a drag-and-drop editor. There is no need to configure any hardware or software; everything runs in your favorite web browser. Azure ML Studio allows you to model all steps in the machine learning process, from data preparation, splitting the data into training and test sets, and model and parameter selection, to actually training, scoring and evaluating your models. It’s a great way to get a better, visual understanding of the machine learning process and the different machine learning algorithms available, from decision trees to neural networks. The Azure AI Gallery has a great collection of sample experiments that you can use as a basis and then fine-tune for your projects. As the samples show, you can develop quite complex data science solutions using the Azure ML Studio building blocks.
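The steps in that process (split, train, score, evaluate) can also be sketched in a few lines of plain Python. This is only an illustration of the flow, not what Azure ML Studio runs internally: the data is randomly generated, the labels "low" and "high" are invented, and the model is a deliberately simple nearest-centroid classifier that I picked because it fits in a few lines.

```python
import random


def train_test_split(rows, test_ratio=0.25, seed=42):
    """Shuffle the rows and split them into a training and a test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]


def train_centroids(rows):
    """Train: compute the average feature vector (centroid) per label."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, x in enumerate(features):
            acc[j] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def score(model, features):
    """Score: predict the label whose centroid is closest to the features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


def evaluate(model, rows):
    """Evaluate: fraction of rows for which the prediction matches the label."""
    correct = sum(1 for features, label in rows if score(model, features) == label)
    return correct / len(rows)


# Made-up data: two well-separated clusters of 20 points each.
rng = random.Random(7)
rows = [([rng.uniform(0, 2), rng.uniform(0, 2)], "low") for _ in range(20)]
rows += [([rng.uniform(5, 7), rng.uniform(5, 7)], "high") for _ in range(20)]

train_rows, test_rows = train_test_split(rows)
model = train_centroids(train_rows)
accuracy = evaluate(model, test_rows)
print(f"accuracy on held-out data: {accuracy:.2f}")
```

The important design point is that the model is evaluated on rows it never saw during training; in the drag-and-drop editor, the split module plays the same role.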

Once you have trained your model and are satisfied with the prediction experiment, you can easily publish the trained model as a predictive web service. You can also automate retraining the experiment with new data, as we did in our product recommendation service (https://dotcontrol.com/blog/how-machine-learning-improves-product-recommendations-to-increase-webshop-conversion-). You can even publish and monetize your predictive web services in the marketplace.

Data Science Virtual Machine

On the other end of the spectrum, Microsoft offers the Data Science Virtual Machine. As its name suggests, this is a virtual machine that comes preinstalled and configured with all the data science and machine learning tools an aspiring data scientist could wish for. That saves you the time-consuming work of setting up and configuring a data science environment yourself. There are several editions of the Data Science Virtual Machine: Windows, Linux, and even a GPU-optimized one. Because this virtual machine is hosted in Azure, you also have the benefit of only paying for CPU/GPU computing power when you need it. So if you really want to get your hands dirty, you should use the Data Science Virtual Machine.

Azure Machine Learning Service

Somewhere in between Azure ML Studio and the Data Science Virtual Machine you have the Azure Machine Learning Service. Azure Machine Learning Service (the successor of Azure Machine Learning Workbench) is a cloud-based environment that allows you to prep data and to train, test, deploy, manage, and track models built with a wide range of open source machine learning frameworks, for example scikit-learn, TensorFlow, PyTorch, CNTK, and MXNet. The nice thing about Azure Machine Learning Service is that you can start on your local computer and then move to the cloud when needed. Both Visual Studio and Visual Studio Code offer plugins that allow you to connect to Azure Machine Learning Service, so you can use your favorite IDE for data science, along with other best practices “borrowed” from software development, such as version control and CI/CD-like setups.

Machine Learning Services in SQL Server

Microsoft did not forget our DBA friends either and added Machine Learning Services (R, Python) to SQL Server (2017). This can be useful because you’re basically “bringing the computing to the data”, so there is no need to move data that is already present in the database. This also brings an added security benefit, because the data never leaves the database. You can both train models and make predictions based on the trained model.

Microsoft Cognitive Toolkit

Microsoft's answer to Google’s TensorFlow is called Microsoft Cognitive Toolkit (CNTK, previously known as the Computational Network Toolkit). The Microsoft Cognitive Toolkit is a framework that allows you to train deep learning models. It comes preinstalled on the Data Science Virtual Machine. We used this toolkit for a client to develop a prototype that was able to perform object detection on images. At that time the object detection feature was not yet available in the Cognitive Services Custom Vision API, so we needed to move to the Data Science Virtual Machine and build a custom image object detection solution based on the Microsoft Cognitive Toolkit. It’s nice to see that Microsoft is gradually adding features like object and anomaly detection to the Cognitive Services offering, so they become available to everyone.

R or Python?

As you may have noticed, most data science toolkits and sample scripts are written in R or Python. So it makes sense to learn a bit of at least one of these languages. Learning a language like R when you come from a C# background can be a bit daunting at first, but in the end it pays off. There are lots of free online training resources. Microsoft’s closed-source days are long over, as you can see from the services above: they not only support open source toolkits, but Microsoft has also open sourced its own Cognitive Toolkit. Azure Notebooks is a nice way to start experimenting with your R or Python skills. Both Visual Studio and Visual Studio Code provide support for the R and Python languages.

ML.NET

By now you might be thinking: isn’t there any machine learning I can do in good old C#? Luckily, Microsoft comes to the rescue again with ML.NET. ML.NET is a machine learning framework built for .NET developers. This means you can use a language you’re comfortable with to integrate machine learning into your existing applications.

One of the original creators of Xamarin, Miguel de Icaza, also created TensorFlowSharp. TensorFlowSharp is a strongly-typed .NET API that allows you to use TensorFlow from C# and F#. And if you really want to understand how machine learning algorithms work, you can always try a basic implementation of one in C#.
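To give an idea of how small such a basic implementation can be, here is a sketch of linear regression trained with gradient descent. It is written in Python here for brevity, but it only uses loops and arithmetic, so it translates almost line for line to C#. The function name, learning rate, number of epochs and sample data are my own choices for illustration, not taken from any library.

```python
def fit_line(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y = w * x + b by minimizing the mean squared error.

    Plain batch gradient descent: in every epoch, compute the gradient of
    the error with respect to w and b over all points, then take a small
    step in the opposite direction.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every point, using the current w and b.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up training data that follows y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

w, b = fit_line(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")
```

With this data the fitted values end up very close to w = 2 and b = 1. Playing with the learning rate (try 0.1) is a quick way to see why parameter selection matters: too large a step makes the training diverge instead of converge.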

Data Lakes, Analytics and SQL Data Warehouse

In every data science process, having a central place to store all your data is a must. For more information on Data Lakes, Analytics and SQL Data Warehouse, please check one of our other blog posts: https://dotcontrol.com/blog/data-is-the-new-oil

Last but certainly not least: Statistics & Data Science concepts

Most machine learning algorithms depend heavily on statistical concepts. So in order to understand and judge the quality of the outcomes of your experiments, it makes sense to refresh what you learned in your high school or university statistics courses. It also doesn’t hurt to read up a little on how the different machine learning models actually work.
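A small example of why these concepts matter when judging outcomes: the sketch below computes accuracy, precision and recall for a binary classifier. The function name and the sample labels are invented for illustration; the formulas themselves are the standard definitions based on true/false positives and negatives.

```python
def classification_metrics(actual, predicted, positive=1):
    """Compute accuracy, precision and recall for a binary classifier."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Of everything predicted positive, how much was really positive?
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Of everything really positive, how much did the model find?
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up outcomes: only 2 of the 10 cases are actually positive.
actual = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
predicted = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

print(classification_metrics(actual, predicted))
# prints {'accuracy': 0.8, 'precision': 0.5, 'recall': 0.5}
```

An accuracy of 80% sounds decent, but the model found only half of the positive cases, and half of its positive predictions were wrong. When the classes are unbalanced, accuracy alone tells you very little.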

I hope this blog post will help you get started on your machine learning journey. Don’t get discouraged by the perceived complexity. Just remember that things have become a lot easier since the early days. Start small and simple and go from there. I promise it will be well worth the effort.
 

Rutger Buijzen
DotControl & RockBoost
https://www.linkedin.com/in/rutgerbuijzen
follow me on: @rutgerbuijzen
