Microsoft and Databricks say a vectorized query engine written in C++ speeds up Apache Spark workloads by up to 20 times
Microsoft has announced a preview of a C++-based vectorized query engine for Azure Databricks, its Apache Spark-based cloud analytics and AI service. Microsoft, in partnership with Databricks, introduced the Photon-powered Delta Engine for Azure Databricks on September 22.
Written in C++ and compatible with Spark APIs, Photon is a vectorized query engine that leverages modern CPU architecture and the open-source Delta Lake transactional storage layer to improve Apache Spark 3.0 performance by up to 20 times. Microsoft said that as organizations embrace data-driven decision-making, they now have a platform that can quickly analyze large volumes and many types of data.
Photon provides more parallel CPU processing at both the data level and the instruction level. Other components of Delta Engine include an improved query optimizer and a caching layer. This combination of technologies serves big data use cases including data engineering, machine learning, data science, and data analytics.
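As a rough illustration of what "vectorized" means in a query engine, the sketch below contrasts row-at-a-time evaluation with batch-at-a-time evaluation over columnar data. It is a conceptual toy, not Photon's implementation: the table, the function names, and the batch size are all hypothetical, and pure Python will not show the speedup. In a native engine, tight loops over contiguous typed arrays like these are the kind of code that can be mapped to SIMD instructions.

```python
from array import array

# Hypothetical toy table: each row is (price, quantity).
rows = [(10.0, 2), (25.0, 1), (5.0, 10), (40.0, 3)]


def revenue_row_at_a_time(rows, min_price):
    """Evaluate the filter and the aggregate once per row."""
    total = 0.0
    for price, quantity in rows:
        if price >= min_price:
            total += price * quantity
    return total


# Columnar layout: one contiguous typed array per column.
prices = array("d", (p for p, _ in rows))
quantities = array("q", (q for _, q in rows))


def revenue_batched(prices, quantities, min_price, batch_size=2):
    """Process the columns in fixed-size batches, one operator per pass."""
    total = 0.0
    for start in range(0, len(prices), batch_size):
        p = prices[start:start + batch_size]
        q = quantities[start:start + batch_size]
        # Operator 1: compute a selection mask for the whole batch.
        mask = [x >= min_price for x in p]
        # Operator 2: multiply and accumulate only the selected positions.
        total += sum(x * y for x, y, keep in zip(p, q, mask) if keep)
    return total


print(revenue_row_at_a_time(rows, 10.0))
print(revenue_batched(prices, quantities, 10.0))
```

Both functions compute the same result; the difference is that the batched version applies each operator to a whole column slice at a time, which is the access pattern that data-level parallelism relies on.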
Azure Databricks aims to let users quickly set up an optimized Apache Spark environment. It provides native integration with Azure Active Directory and other Azure cloud services such as Azure Synapse Analytics and Azure Machine Learning, so that customers can build end-to-end data warehousing, machine learning, and real-time analytics solutions. Users can request access to the Photon preview by filling out a questionnaire.