Can Azure ML replace SSAS Data Mining?

The data mining tools in SSAS (multidimensional mode) have been available since SQL Server 2000, and the range of bundled data mining algorithms is generally considered sufficient for most requirements. Once a data source has been defined in SSAS, building, training, and testing a model is relatively straightforward because the data mining tools in the development environment are tailored to each algorithm. This makes data mining much more accessible to those of us who are not statisticians, but it is still a single platform for designing both data sources and data mining models, which call for very different skill sets.

 

My issue with SSAS is that the data source for a mining model is either a cube or an Analysis Services data source, which is effectively limited to relational databases because those are the only viable data sources for multidimensional databases. You can create a mining model in an SSAS project without creating an OLAP cube, but this just underscores the fact that Data Mining could have been implemented as a separate product.

 

In contrast, Azure ML just provides the development environment for building, training, testing, and deploying ‘predictive analysis models’ (the terminology recognises that its solutions can be broader than data mining models). It provides a limited set of tools for loading datasets for training and testing purposes. The data can be a local file uploaded and persisted as a dataset, data entered interactively, or an external data source imported using a Data Reader module (think of modules as tasks in a workflow). There are also a few modules for formatting and transforming the input data before it is processed; these can be encapsulated as part of the dataset definition or included in the published experiment.

 

The data sources supported by the Data Reader module are all cloud based, and are either direct connections to Azure data services or URIs (including OData endpoints). Bear in mind that these are data inputs for training and testing purposes only. Once an ML experiment (i.e. the combination of modules that define the predictive analysis model) has been trained and evaluated, it can be automatically published as a web service. Any client application can then supply input data from any source it can connect to, provided the data matches the defined dataset schema. To test your published web service, there is sample code and a downloadable Excel workbook for single output requests.
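To make the schema requirement concrete, here is a minimal Python sketch of how a client might prepare a request for a published experiment. It is an illustration under stated assumptions, not the generated sample code: the `Inputs` / `ColumnNames` / `Values` / `GlobalParameters` payload shape and the bearer-token header are what I understand the request/response service to expect, while the input name `input1`, the column names, and the function names are hypothetical. The service URL and API key would come from your own web service dashboard.

```python
import json
import urllib.request


def build_payload(column_names, rows, input_name="input1"):
    """Build a scoring request body, rejecting rows that do not match the schema.

    Assumption: the service accepts a table as column names plus rows of
    string values. "input1" is a hypothetical input port name.
    """
    for index, row in enumerate(rows):
        if len(row) != len(column_names):
            raise ValueError(
                f"row {index} has {len(row)} values, expected {len(column_names)}"
            )
    return {
        "Inputs": {
            input_name: {
                "ColumnNames": list(column_names),
                # Values are sent as strings in this sketch (an assumption).
                "Values": [[str(value) for value in row] for row in rows],
            }
        },
        "GlobalParameters": {},
    }


def build_request(url, api_key, payload):
    """Wrap the payload in an HTTP POST request without sending it."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    # Hypothetical two-column schema; the rows could come from any source
    # the client can read (a CSV file, a database query, a spreadsheet).
    payload = build_payload(["age", "income"], [[42, 55000], [31, 38000]])
    print(json.dumps(payload, indent=2))
    # To call a real service, pass your own URL and key to build_request()
    # and send the result with urllib.request.urlopen().
```

The point of the sketch is that the client only has to produce rows in the right shape; where those rows come from is entirely the client's business.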

 

The fact that experiments with trained models can be published as web services addresses the other concern I have with SSAS Data Mining models, namely the limited number of client applications that support them. The main options are using the data mining add-ins in Excel, developing custom applications that submit Data Mining Extensions (DMX) queries, or third-party products. The decision to expose Azure ML experiments as Azure web services is significant because the foundations have been laid to provide much wider client support than was possible with SSAS.

 

Azure Data Factory already supports Azure ML web services as a linked service for batch scoring, using blob storage to hold the input data and the output results. Excel add-ins are also starting to appear on CodePlex, and you can use Power Query to connect to published Azure ML web services available in the Azure Marketplace. I am expecting further integrations and client support from Microsoft, particularly with respect to Power BI.

 

 Azure ML Studio

 

My initial interest in Azure ML was to understand whether it provided a viable alternative to SSAS data mining and better support for the diverse data sources I am working with today. The decision to expose machine learning as a web service, which abstracts the model from the client’s choice of data source, addresses my concerns. If you are making the transition from SSAS then you will find that the user experience in Azure ML Studio is much better, and I predict that Azure web services combined with Excel add-ins or Power BI reports will provide equivalents to the data mining model algorithms and visualisations in SSAS.

 

Microsoft have just announced General Availability of Azure ML. If you haven’t tried it yet, I would recommend signing up for the free tier to experience it for yourself.

Written by Nicholas Revell at 00:00
