Motivation: Supporting high data ingestion rates and performing frequent aggregate queries

In real-time analytics, a major requirement is the ability to ingest data at high rates while simultaneously computing aggregations over the real-time data. For instance, a common use case is ingesting data at high rates and computing KPIs or other metrics over the ingested data. Examples of this use case include performance monitoring, IoT, eAdvertisement, smart grids, and Industry 4.0. This kind of workload is troublesome for SQL databases because they are not efficient at ingesting data. Moreover, the aggregate analytical queries are very expensive because they need to traverse large amounts of data very…


Motivation: supporting high data ingestion rates and efficient SQL queries

The cost of monitoring solutions depends highly on the footprint required to ingest the monitoring data and to query these data. Today there is a duality in existing data management solutions. On one hand, NoSQL technology and, more particularly, key-value data stores are very efficient at ingesting data. However, the data structures they use to manage data impose a dramatic tradeoff: they make these stores very efficient at ingesting data, but very inefficient at querying it. On the other hand, SQL databases behave symmetrically. They are very efficient at querying data. However…


This post was written originally at https://leanxcale.com/blog

SUMMARY

New applications, such as geo-marketing tools, car-sharing, or mobility apps, require handling large data volumes. Traditionally, most applications that require geographical query capabilities use PostgreSQL as a database. However, because PostgreSQL is a centralized database, its scalability is limited. This post introduces LeanXcale as a scalable alternative for developers who process geographical data.

INTRODUCTION

LeanXcale’s GIS support, combined with its performance capacity, results in a high-performance GIS system that allows a large volume of data and a high number of parallel queries to be handled effectively. LeanXcale’s GIS support means it can process geographical operations as well as offer…


This post was written originally by José María Zaragoza at https://leanxcale.com/blog

A database is essentially “an organized collection of data.” Of course, this idea of being “organized” holds an entire world within, which is a primary reason why databases have become one of the most complex systems in computer science.

Moreover, the topology of the stored data and how they are used define the management strategy. A common pattern when generating and using data is the time series, where data are stored sequentially by the time they were recorded. Here, the data begin at a point in time and continue to be generated indefinitely.
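The time-series pattern described above can be sketched with a minimal, self-contained example: rows arrive in timestamp order, and a typical query aggregates over a recent time window. The table and column names below are illustrative only, not taken from the post, and SQLite merely stands in for any SQL engine.

```python
# Minimal sketch of the time-series pattern: append-only rows keyed by a
# monotonically increasing timestamp, queried over a time window.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ts INTEGER, value REAL)")

# Data "begin at a point in time and continue to be generated":
# inserts always append with increasing timestamps.
for ts in range(10):
    conn.execute("INSERT INTO readings VALUES (?, ?)", (ts, float(ts) * 2.0))

# A typical aggregate query over the most recent window.
(avg_recent,) = conn.execute(
    "SELECT AVG(value) FROM readings WHERE ts >= 5"
).fetchone()
print(avg_recent)  # average of 10.0, 12.0, 14.0, 16.0, 18.0 -> 14.0
```

Because timestamps only grow, such workloads reward storage layouts that keep recent data cheap to append to and to scan.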

As a database receiving data with times…


This post was written originally by José María Zaragoza at https://leanxcale.com/blog

In the world of databases, most people have encountered some “hidden” processes executed in the background that end up degrading performance. Each database technology has its usual suspects. With Oracle, it is the RMAN process, and with PostgreSQL, the troubles clearly come from the vacuum process. The vacuum process is a well-known drawback, and in recent years I have received many questions about it and about how new databases deal with the technical problems it tries to solve. This process is a common headache for integration engineers, and the first target of any performance-improvement task force working on a system…


This post was written originally by Jesús Gallego Romero at https://leanxcale.com/blog

In this post, I show how to set up a local LeanXcale environment using Docker and Docker Compose for development purposes. This is very useful in development environments when we do not want to spend our time setting up complex, production-like environments and instead want to have a fully functional testing environment. Also, I show the benefit of scaling up the database with Docker Compose by distributing a table into two nodes and loading it in parallel.

All the code and configuration files used in this post can be downloaded from our git repository.
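As a rough illustration of the kind of Compose file such a setup uses, here is a hypothetical single-service sketch. The image name, port, and volume path below are assumptions made for illustration; the actual values are in the post's git repository.

```yaml
# Hypothetical docker-compose.yml sketch for a local, single-node setup.
# Image name, port, and paths are placeholders, not the real values.
version: "3.8"
services:
  leanxcale:
    image: leanxcale/leanxcale:latest   # assumed image name
    ports:
      - "1529:1529"                     # assumed client port mapping
    volumes:
      - lx-data:/data                   # persist data across restarts
volumes:
  lx-data:
```

Scaling out with Compose then amounts to declaring additional service entries (or using `docker compose up --scale`) so that a table can be distributed across nodes and loaded in parallel, as the post describes.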

Prerequisites

The first prerequisite is to install…


This post was written originally by Diego Burgos Sancho and Jesús Gallego Romero at https://leanxcale.com/blog

Today, NoSQL databases are widely used with key-value datastores being one of the most used of all database varieties. The key-value database is designed for storing, retrieving, and managing associative arrays. Records are stored and retrieved using a uniquely identifying key to find data within the database quickly. These databases are used in many scenarios, such as AdTech for cache matching, TravelTech for real-time pricing, and IoT for enabling smart cities.
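The key-value model described above can be reduced to a toy sketch: every record is addressed by a unique key, so reads and writes are single lookups. The key naming and payload below are invented for illustration; a real key-value store adds persistence, replication, and partitioning on top of this idea.

```python
# Toy illustration of the key-value model: an associative array where a
# uniquely identifying key locates each record directly.
store = {}

def put(key, value):
    store[key] = value               # write: single lookup by key

def get(key, default=None):
    return store.get(key, default)   # read: the key uniquely identifies the record

put("user:42", {"name": "Ada", "segment": "adtech-cache"})
print(get("user:42")["name"])        # -> Ada
print(get("user:99"))                # missing key -> None
```

This direct key-to-record addressing is what makes ingestion and point reads so fast, and it is also why richer queries (scans, joins, aggregates) are where key-value stores struggle.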

The most used key-value database is likely Amazon DynamoDB, although alternatives exist, including Microsoft Azure Cosmos DB and Aerospike. Amazon DynamoDB is guaranteed to deliver single-digit millisecond performance…


This post was written originally by Diego Burgos Sancho at https://leanxcale.com/blog

INTRODUCTION

Today, companies are storing far more data than they did years ago, which creates a need for systems capable of storing and processing that much information. The data generated and stored by companies has been growing exponentially over the last few years. By 2025, it is estimated that 463 exabytes of data will be generated each day globally. The best-known technology to store and process data is a database. However, traditional databases cannot manage such huge amounts of data. An alternative exists, called NoSQL, but it brings multiple problems:

  • The interfaces of the NoSQL platforms are different from traditional SQL databases, which implies…


This post was written originally by Sandra Ebro Gómez at https://leanxcale.com/blog

Geospatial data refers to objects (in the broad sense of the word) that may have a representation on the surface of the earth. These representations can be locations (points), paths, areas, or any kind of information that can be represented on a map. As these objects may be static in the short term or dynamic in real time, geospatial data combines objects with metadata, such as specific attributes or temporal information, to build a complex environment ready to be analyzed. In this context, the concept of geospatial big data arises. …

LeanXcale

LeanXcale combines real-time NoSQL access with full-ACID SQL, linear scalability, and analytical capabilities.
