Some of the limitations we have with R: 1. Data “must” be in memory; 2. Processing “is not” executed in parallel.
There are workarounds for both the parallel and the in-memory limitations, but they require custom development.
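To give an idea of what that custom development can look like, here is a minimal sketch of one parallel workaround using the `parallel` package that ships with R. The data (the built-in `mtcars` set), the grouping column, the helper name `fit_group` and the choice of two workers are all assumptions made for illustration. Each worker still has to hold its share of the data in memory.

```r
library(parallel)  # bundled with base R

# Hypothetical example data: split the built-in mtcars set by cylinder count.
groups <- split(mtcars, mtcars$cyl)

# Hypothetical helper: fit one small linear model per group
# and return its coefficients.
fit_group <- function(df) {
  coef(lm(mpg ~ wt, data = df))
}

# Start two local worker processes (an arbitrary number for this sketch).
cl <- makeCluster(2)

# parLapply sends the groups to the workers and collects the results in order.
coefs <- parLapply(cl, groups, fit_group)

# Always release the workers when done.
stopCluster(cl)

print(coefs)
```

Even in this small case the cluster has to be created, fed and shut down by hand, which is the kind of plumbing meant by custom development.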
One example of how we deal with SQL Server data and R is to have RStudio read from SQL Server using RODBC or RJDBC, execute some R, and then move the results back to SQL Server. With some custom development we are able to build a model and then query the model but, as said, this means too much custom development and poor performance for real-time predictions. The in-memory and parallel limitations still apply.
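The round trip described above could look roughly like the sketch below, using RODBC. The connection string, the table names (`dbo.CustomerTraining`, `dbo.CustomerToScore`, `CustomerScores`), the column names and the choice of a logistic regression are all hypothetical placeholders, not the actual setup; it also assumes `Churned` arrives as 0/1.

```r
library(RODBC)

# Assumed connection string: driver, server and database are placeholders.
ch <- odbcDriverConnect(
  "driver={SQL Server};server=MYSERVER;database=MyDb;trusted_connection=yes"
)

# 1. Read the training data from SQL Server into an in-memory data frame.
train <- sqlQuery(ch, "SELECT Age, Income, Churned FROM dbo.CustomerTraining")

# 2. Execute some R: here, a logistic regression (an assumed model choice).
model <- glm(Churned ~ Age + Income, data = train, family = binomial())

# 3. Query the model: score a second (hypothetical) table.
to_score <- sqlQuery(ch, "SELECT CustomerId, Age, Income FROM dbo.CustomerToScore")
to_score$ChurnProbability <- predict(model, newdata = to_score, type = "response")

# 4. Move the results back to SQL Server.
sqlSave(ch, to_score[, c("CustomerId", "ChurnProbability")],
        tablename = "CustomerScores", append = TRUE, rownames = FALSE)

odbcClose(ch)
```

Every step here pulls the data across the network into a single R session, which is where both the memory limit and the performance cost for real-time predictions come from.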
Azure Machine Learning.
Azure ML is great for dealing with R. We can load data into Azure ML from different Microsoft products and external sources, execute R, and then move the results back into a repository. Querying for predictions also needs some custom development, but it is doable. Azure ML runs in the cloud, so it is not an option for on-premises customers.
SQL Server 2016
From what Microsoft presented at Ignite and what we can see in the CTP, this is like seeing a dream come true.
Part of this success is because of the acquisition of Revolution Analytics.
T-SQL executes R directly, with no RStudio or any other external piece. Using data stored in SQL Server, we can build a model, query a model, check accuracy, and store the results back in SQL Server. Applications can query a model by connecting to SQL Server, as we do with a linked server to SSAS for data mining.
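In the released SQL Server 2016, the R code is passed as text to the `sp_execute_external_script` stored procedure, which by default hands the script a data frame named `InputDataSet` and returns whatever is assigned to `OutputDataSet`; the CTP may differ in detail. The sketch below shows only the R side of the build, query and check-accuracy steps. It simulates `InputDataSet` with the built-in `mtcars` data so that it runs in plain R, and the columns and the linear model are assumptions for illustration.

```r
# Stand-in for the data frame SQL Server would supply from a T-SQL query.
# Inside the database engine this variable is provided automatically.
InputDataSet <- mtcars[, c("mpg", "wt", "hp")]

# Build a model (assumed: a simple linear regression).
model <- lm(mpg ~ wt + hp, data = InputDataSet)

# Serialize the model to a raw vector; a raw vector is the kind of value
# that could be stored in a varbinary(max) column for later use.
model_bytes <- serialize(model, connection = NULL)

# Query the model: restore it from the bytes and predict.
restored <- unserialize(model_bytes)
predicted <- predict(restored, newdata = InputDataSet)

# Check accuracy: root mean squared error on the same data.
rmse <- sqrt(mean((InputDataSet$mpg - predicted)^2))

# The data frame assigned here is what T-SQL would receive as a result set.
OutputDataSet <- data.frame(actual = InputDataSet$mpg, predicted = predicted)

print(rmse)
print(head(OutputDataSet))
```

Because the result comes back as an ordinary result set, it can be inserted into a table or returned to any application that can call T-SQL.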
Revolution R was able to deal with the in-memory limitation and to execute in parallel, so we can assume this will also happen with R within SQL Server.
SQL Server supporting R in the database engine brings a lot of power to our data mining capabilities: we can have the full functionality of R in our SQL Server environment. It is easy to process or query. It is on premises!!!
R is a great data science scripting language, library and execution environment, and R + SQL Server is a huge step. I am building BIML code to generate packages that call T-SQL for processing, querying and testing different models, with easy integration with Power BI, SSRS, Tabular DirectQuery predictions… Huge step.