1
/
of
1
Packt Publishing
In-Memory Analytics with Apache Arrow: Accelerate data analytics for efficient processing of flat and hierarchical data structures
In-Memory Analytics with Apache Arrow: Accelerate data analytics for efficient processing of flat and hierarchical data structures
Regular price
$41.99 USD
Regular price
Sale price
$41.99 USD
Quantity
Couldn't load pickup availability
Harness the power of Apache Arrow to optimize tabular data processing and develop robust, high-performance data systems with its standardized, language-independent columnar memory format
Key Features- Explore Apache Arrow's data types and integration with pandas, Polars, and Parquet
- Work with Arrow libraries such as Flight SQL, Acero compute engine, and Dataset APIs for tabular data
- Enhance and accelerate machine learning data pipelines using Apache Arrow and its subprojects
- Purchase of the print or Kindle book includes a free PDF eBook
- Use Apache Arrow libraries to access data files, both locally and in the cloud
- Understand the zero-copy elements of the Apache Arrow format
- Improve the read performance of data pipelines by memory-mapping Arrow files
- Produce and consume Apache Arrow data efficiently by sharing memory with the C API
- Leverage the Arrow compute engine, Acero, to perform complex operations
- Create Arrow Flight servers and clients for transferring data quickly
- Build the Arrow libraries locally and contribute to the community
This book is for developers, data engineers, and data scientists looking to explore the capabilities of Apache Arrow from the ground up. Whether you’re building utilities for data analytics and query engines, or building full pipelines with tabular data, this book can help you out regardless of your preferred programming language. A basic understanding of data analysis concepts is needed, but not necessary. Code examples are provided using C++, Python, and Go throughout the book.
Share
