Parallel Processing, Replication & Database Grids
Parallel Processing - Transbase® Dynamic Multithreading
Transbase® Dynamic Multithreading was designed so that even a single compute-intensive query can take advantage of multiprocessor architectures, which are now available even in notebooks.
This is done by dynamically parallelizing suitable sub-calculations. Transbase® Dynamic Multithreading splits each selected sub-calculation into several similar threads, and the degree of parallelism can be varied almost arbitrarily.
The division takes place via a special operator tree node, ASYNC, which handles the following essential aspects of parallel processing:
- Providing a tuple buffer into which the underlying nodes place their tuples and from which the nodes above collect them
- Starting and stopping threads and synchronizing buffer access; the number of generated threads is dynamic and depends on the processing speed of the threads
If there is a 'traffic jam' at one point in the operator tree, further threads are created to reduce the congestion. The threads thus reach a dynamic balance in which they rarely have to wait because of full or empty buffers.
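The interplay of tuple buffer and dynamically started threads can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Transbase® code: the names run_async_node, transform, buffer_size and max_threads are hypothetical, and the rule "start another thread whenever the consumer finds the buffer empty" is a simplified stand-in for whatever heuristic the ASYNC node actually applies.

```python
import queue
import threading
import time


def run_async_node(rows, transform, buffer_size=4, max_threads=4):
    """Hypothetical ASYNC-like node, not Transbase code.

    Worker threads (the "underlying nodes") put transformed tuples into a
    bounded buffer; the caller (the "node above") collects them.
    Returns the collected tuples and the number of threads that were started.
    """
    pending = queue.Queue()
    for row in rows:
        pending.put(row)
    total = pending.qsize()
    buffer = queue.Queue(maxsize=buffer_size)  # the shared tuple buffer
    threads = []

    def worker():
        while True:
            try:
                row = pending.get_nowait()
            except queue.Empty:
                return  # no input left, the thread stops
            buffer.put(transform(row))  # blocks while the buffer is full

    def start_thread():
        thread = threading.Thread(target=worker)
        thread.start()
        threads.append(thread)

    start_thread()
    collected = []
    while len(collected) < total:
        try:
            collected.append(buffer.get(timeout=0.01))
        except queue.Empty:
            # Empty buffer: the producers are the bottleneck, so add a thread.
            if len(threads) < max_threads and not pending.empty():
                start_thread()
    for thread in threads:
        thread.join()
    return collected, len(threads)


def slow_square(x):
    time.sleep(0.02)  # simulate an expensive sub-calculation
    return x * x


if __name__ == "__main__":
    tuples, started = run_async_node(range(20), slow_square)
    print(sorted(tuples), "threads started:", started)
```

The bounded queue covers both waiting cases mentioned above: producers block on a full buffer, and the consumer's timeout on an empty buffer is the signal to start another thread. The order of the collected tuples is not deterministic, which is why the usage example sorts them.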
When many requests arrive concurrently, they are already distributed across different Transbase® kernel processes, which keeps the CPUs busy. When request concurrency is low, however, it is the task of a single process to utilize multiple CPUs simultaneously.
Using two different methods, the process can be broken down into multiple threads, which can then work in parallel on a single query:
- Decomposition into different threads, e.g. IO threads, restriction threads, sort threads, etc. Here, the parallelism is limited because the threads perform different functions.
- Breakdown into several similar threads. Here, the parallelism can be changed almost arbitrarily, provided the data can be processed in parallel. This decomposition is done as Dynamic Multithreading (see graphic), with the number of parallel threads being dynamically adjusted during execution.
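The second method, the breakdown into several similar threads, can be illustrated with a small Python sketch. It is an assumption-laden example rather than a description of the Transbase® kernel: parallel_filter, predicate and threads are hypothetical names, and the round-robin partitioning is just one simple way to obtain data parallelism.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_filter(rows, predicate, threads):
    """Apply the same restriction to disjoint partitions of rows in parallel."""
    # Round-robin partitioning: every thread gets a similar share of the data.
    chunks = [rows[i::threads] for i in range(threads)]

    def scan(chunk):
        # Every thread runs this identical function on its own partition.
        return [row for row in chunk if predicate(row)]

    with ThreadPoolExecutor(max_workers=threads) as pool:
        parts = list(pool.map(scan, chunks))
    # Merge the partial results; sorting restores a deterministic order.
    return sorted(row for part in parts for row in part)


if __name__ == "__main__":
    data = list(range(100))
    for n in (1, 2, 5):
        print(n, parallel_filter(data, lambda r: r % 7 == 0, n))
```

Because all threads do the same work on different data, the result does not depend on the thread count, so the count can be chosen freely. In standard CPython the global interpreter lock prevents pure-Python threads from running on several CPUs at once, so the sketch shows the structure of the decomposition rather than a real speed-up.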
Parallel processing also extends to the IO-relevant phases of query processing. This significantly increases IO throughput, so that large RAID systems can be optimally utilized. Particularly relevant for IO are the B-tree algorithms for leaf access (sequential scan and interval access), the hypercube algorithms (multidimensional access), and the sorting algorithms, which can generate a high IO load depending on the sorting volume.
Thus, existing hardware resources can be fully exploited not only for the parallel processing of different requests, but also for the acceleration of individual compute-intensive requests.
Replication & Database Grids
Transbase® databases can be replicated easily, robustly and quickly. Replication is used for:
- Setting up a standby mode
- Load distribution in a database grid
- Continuous or periodic distribution from a central database to autonomous offline databases
Technically, replication works by continuously transferring the log of the master database to the slave database(s) and processing it there. As a result, every slave database can take over the role of the master database immediately and without start-up time, should this become necessary. This is called standby operation, which is needed especially for highly available databases.
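The principle of log-based replication can be shown with a toy model. The sketch below makes several assumptions that are not taken from the Transbase® product: the class Database, its methods write, replicate_from and promote, and the representation of the log as a list of (key, value) records are all hypothetical and serve only to illustrate the idea.

```python
class Database:
    """Toy key-value database that records every change in an ordered log."""

    def __init__(self, read_only=False):
        self.read_only = read_only
        self.data = {}
        self.log = []  # ordered (key, value) change records

    def write(self, key, value):
        if self.read_only:
            raise PermissionError("a standby database changes only via replication")
        self.log.append((key, value))
        self.data[key] = value

    def replicate_from(self, source):
        """Fetch the log records not yet seen and apply them in order."""
        pending = source.log[len(self.log):]
        for key, value in pending:
            self.data[key] = value
        self.log.extend(pending)  # keep the log so others can replicate from us
        return len(pending)

    def promote(self):
        """Take over the master role: the data is already up to date."""
        self.read_only = False


if __name__ == "__main__":
    master = Database()
    standby = Database(read_only=True)
    master.write("part-1", "brake pad")
    master.write("part-2", "oil filter")
    standby.replicate_from(master)
    print(standby.data == master.data)
    standby.promote()
    standby.write("part-3", "spark plug")
```

Since the standby has already processed the complete log, promoting it requires no recovery step, which mirrors the takeover without start-up time described above. Keeping a copy of the log on the standby also lets further databases replicate from it in turn.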
Replication, like the communication between client and server, takes place via a fixed TCP/IP port. This allows a database to be replicated to a geographically remote computer.
Database Grids & Standby Databases:
Standby databases are pure read-only databases; they can only be changed via replication. Otherwise, they can be operated like normal databases, even while replication is active. All slave databases can be integrated into a database grid in which Transbase® ensures automatic dynamic load balancing of read-only requests.
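How read-only requests might be spread over the databases of a grid can be sketched as follows. The source does not say which balancing strategy Transbase® uses, so the "fewest active requests" rule below is purely an assumption, and ReadBalancer, acquire and release are hypothetical names.

```python
class ReadBalancer:
    """Route each read-only request to the database with the fewest active requests."""

    def __init__(self, databases):
        self.active = {name: 0 for name in databases}

    def acquire(self):
        # Pick the least-loaded database; ties go to the one listed first.
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        # Call this when the request has finished.
        self.active[name] -= 1


if __name__ == "__main__":
    grid = ReadBalancer(["standby-1", "standby-2", "standby-3"])
    first = grid.acquire()
    second = grid.acquire()
    grid.release(first)
    print(first, second, grid.acquire())
```

Tracking the active requests rather than handing them out in a fixed rotation lets the distribution adapt when some queries run much longer than others.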
Through replication, a central database can be distributed to many autonomous databases, which in turn are updated periodically or continuously. This is suitable, for example, in the automotive sector for supplying workshops from a central data source: after a morning update, garages can work autonomously on a database that is updated daily.
In this environment, arbitrary hierarchical structures are supported, in which a standby database can act as both master and slave.