Apache Pinot is a real-time distributed OLAP database designed for low-latency query execution even at extremely high throughput. Apache Pinot can ingest directly from streaming sources like Apache Kafka and make events available for querying immediately. Querying real-time events as they happen is at the core of Apache Pinot.

Apache Pinot was developed at LinkedIn in 2014 and currently serves over 200k queries per second there, powering many of the features users love, such as “who’s viewed my profile?”.

Founded by the original developers of Apache Pinot, StarTree provides a fully managed platform for Apache Pinot. By removing the burden of infrastructure management, companies can focus on delivering real-time insights to their end users.

Columnar Database

At its core, Apache Pinot is a column-oriented database that can be queried using a subset of SQL. Its pluggable indexing allows for rapid development of new index types that continuously push the boundaries of performance. OLAP databases are traditionally slow at query execution and inefficient with storage, but Apache Pinot executes most queries in under a second while requiring less storage than other databases.
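To make "column-oriented" concrete, here is a minimal Python sketch of the idea. It is a toy illustration only, not Pinot's actual storage format or query engine; the table contents, the column names (viewer_country, profile_id, views), and the function name are all hypothetical.

```python
from collections import defaultdict

# Row-oriented layout: all fields of a record are stored together.
rows = [
    {"viewer_country": "US", "profile_id": 1, "views": 3},
    {"viewer_country": "IN", "profile_id": 2, "views": 5},
    {"viewer_country": "US", "profile_id": 3, "views": 2},
]

# Column-oriented layout: each column is stored as its own array.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def sum_views_by_country(cols):
    """Roughly: SELECT viewer_country, SUM(views) FROM t GROUP BY viewer_country.

    Only the two referenced columns are read; profile_id is never touched.
    """
    totals = defaultdict(int)
    for country, views in zip(cols["viewer_country"], cols["views"]):
        totals[country] += views
    return dict(totals)


result = sum_views_by_country(columns)
print(result)
```

Because an aggregation reads only the columns it references, a columnar layout avoids scanning unrelated fields, and values of the same type stored side by side tend to compress well.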

User-facing Real-time Analytics

Apache Pinot was built for user-facing applications. These applications require low-latency responses to give users the best possible experience. Traditionally, analytical insights were reserved for internal users, who tolerate slower queries; end users are less forgiving. This is why low latency is critical to the mission.

Real-time Data Ingestion

Apache Pinot is capable of ingesting millions of events per second while allowing them to be queried immediately. The unique combination of ingestion and query availability unlocks new user experiences that could not previously be accomplished at scale without caching.

Highly Scalable and Fault Tolerant

Apache Pinot is a distributed system that can span thousands of nodes. These nodes act as a single system responding to query requests in unison, and the data is replicated between nodes for fault tolerance.
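The following Python sketch illustrates, in a deliberately simplified way, how replication lets a cluster keep answering queries when a node is unavailable. It is not Pinot's routing or replication logic; the server names, segment names, replication factor of 2, and the total_views helper are assumptions made for illustration.

```python
# Hypothetical cluster: each data segment is stored on two servers.
segment_replicas = {
    "seg_0": ["server_a", "server_b"],
    "seg_1": ["server_b", "server_c"],
    "seg_2": ["server_c", "server_a"],
}

# Toy segment contents: a list of view counts per segment.
segment_data = {"seg_0": [3, 5], "seg_1": [2], "seg_2": [7, 1]}


def total_views(live_servers):
    """Sum views across all segments, reading each from any live replica."""
    total = 0
    for segment, replicas in segment_replicas.items():
        # Pick the first replica that is currently reachable.
        server = next((s for s in replicas if s in live_servers), None)
        if server is None:
            raise RuntimeError(f"no live replica for {segment}")
        # Partial result from one server, merged into the overall answer.
        total += sum(segment_data[segment])
    return total


all_up = total_views({"server_a", "server_b", "server_c"})
one_down = total_views({"server_a", "server_c"})  # server_b is unavailable
print(all_up, one_down)
```

With every segment held by two servers, losing any single server still leaves one copy of each segment, so the query returns the same total. Losing both replicas of a segment raises an error in this sketch.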

Learn More

📖 Read the Docs
📺 Watch the Getting Started Videos
💬 Join Slack
🐦 Follow us on Twitter