Ray on Azure ML

Caio Moreno
2 min readMar 8, 2023

Hi all,

I would like to promote the work of the Ray on Azure ML team.

Below is the text that I extracted from the Microsoft Tech Community Blog.

Ray and Dask are two among the most popular frameworks to parallelize and scale Python computation. They are very helpful to speed up computing for data processing, hyperparameter tunning, reinforcement learning and model serving and many other scenarios.

For an Azure ML compute instance, we can easily install Ray and Dask to take advantage of parallel computing for all cores within the node. However, there is yet an easy way in Azure Machine Learning to extend this to a multi-node cluster when the computing and ML problems require the power of more than one nodes. One would need to setup a separate environment using VMs or K8s outside Azure ML to run multi-node Ray/Dask. This would mean losing all capabilities of Azure ML.

To address this gap, we have developed a library that can easily turn Azure ML compute instance and compute cluster into Ray and Dask cluster. The library does all the complex wirings and setup of a Ray cluster with Dask behind the scene while exposing a simple Ray context object for users perform parallel Python computing tasks. In addition, it is shipped with high performance Pyarrow APIs to access Azure storage and simple interface to install additional libraries.

The library also comes with support for both Interactive mode and job mode. Data scientist can perform fast interactive work with

--

--

Caio Moreno

Solutions Architect and Data Scientist @databricks | Adjunct Professor at @IEuniversity | PhD @unicomplutense (Opinions are my own)