MemSQL Announces Significant Advances for Creating Real-Time Data Pipelines for Spark and Python in New Release

MemSQL, a provider of real-time databases for transactions and analytics, has announced significant advances for creating real-time data pipelines for Apache Spark, as well as support for the Python language and Non-Uniform Memory Access (NUMA).

MemSQL can now run Spark SQL queries inside the MemSQL database, provide in-browser Python programming, and automatically optimize NUMA deployments. The company says these features will deliver faster results and faster analytics for data scientists.

“The newest release of MemSQL Ops reinforces our commitment to the Spark community to deliver even faster access to real-time data and analytics,” said Eric Frenkiel, co-founder and CEO, MemSQL. “Our mission is to deliver technology that integrates advances across the open source ecosystem and that appeals to the programming community at large.”

As a transient processing framework, Spark is well suited for data analysis and model development, but it is not purpose-built for high-performance SQL. To that end, MemSQL now allows Spark SQL queries to run inside the MemSQL database, which the company says can improve performance by up to 50x on many workloads. By combining MemSQL with Spark, data scientists can tap a permanent, transactional datastore to feed the latest business data into their models for real-time analytics.

The combination of Spark and MemSQL further unifies in-memory processing with in-memory storage for “lightning fast” results. Users get a familiar SQL interface along with the performance and persistence needed to run real-time data pipelines successfully. Spark's data transformation capabilities can be used more fully when paired with a distributed, in-memory store like MemSQL than with a traditional disk-based store like HDFS.

Python Programming and NUMA Environments

The latest release of MemSQL Ops also features in-browser Python programming, which opens up Python’s vast library of analysis packages, such as NumPy, SciPy, and pandas, to users running MemSQL.

For users running MemSQL in a NUMA environment, MemSQL Ops now offers point-and-click installation. MemSQL Ops can intelligently map MemSQL instances to CPUs that share local memory. According to the company, the increased efficiency on large server deployments can accelerate queries by up to 40%. From ultra-fast query execution to efficient storage of business data, MemSQL aims to let users operate with maximum efficiency in fast-paced production environments.
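
To illustrate the general idea of mapping database instances to CPUs that share local memory, here is a minimal Python sketch. It is not MemSQL Ops code and does not reflect how MemSQL Ops is implemented. The function name `plan_numa_bindings`, the dictionary layout, and the two-socket topology are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def plan_numa_bindings(cpu_to_node):
    """Group CPU ids by NUMA node and plan one instance per node.

    cpu_to_node is a hypothetical mapping of CPU id -> NUMA node id.
    """
    cpus_by_node = defaultdict(list)
    for cpu, node in cpu_to_node.items():
        cpus_by_node[node].append(cpu)

    # One instance per node, restricted to the CPUs local to that node's memory.
    return [
        {"instance": index, "numa_node": node, "cpus": sorted(cpus_by_node[node])}
        for index, node in enumerate(sorted(cpus_by_node))
    ]


# Assumed topology: CPUs 0-3 on node 0, CPUs 4-7 on node 1.
topology = {cpu: (0 if cpu < 4 else 1) for cpu in range(8)}
bindings = plan_numa_bindings(topology)
for binding in bindings:
    print(binding)
```

On a real Linux server, a plan like this would typically be applied with a tool such as `numactl`, so that each process runs on one node's CPUs and allocates memory from that same node rather than reaching across sockets.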