What is Sharding?
Sharding is a database partitioning technique that splits a large database into smaller segments, which can be spread across two or more separate servers.
Each distinct segment is called a shard and is held on its own database server. Each shard serves as the sole location for its subset of the data. That’s all well, you might say, but what are the reasons for doing so?
There are various benefits to this horizontal segregation technique. Sharding reduces index size, improves search performance, and makes data easier to manage efficiently. A major payoff of this arrangement is scalability: it makes very large databases practical to create and manage.
The Different Types of Sharding
There are two basic categories of sharding: algorithmic and dynamic.
Algorithmic sharding allows clients to locate shards without requiring any help from an external service. A popular use case for algorithmic sharding is a multi-volume encyclopedia or a directory containing contact information, where the entry itself (for example, its first letter) tells you which volume to open.
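To make the idea concrete, here is a minimal sketch of algorithmic sharding in Python. It assumes a hash-and-modulo rule, which is only one possible algorithm; the shard names and the `shard_for` function are hypothetical and chosen for illustration.

```python
import hashlib

# Hypothetical shard names; a real deployment would list its own servers.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Pick a shard from the key alone, using a stable hash and a modulo."""
    # sha256 gives the same digest on every client and every run,
    # unlike Python's built-in hash(), which is randomized per process.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# Any client can compute the location without asking another service.
print(shard_for("alice@example.com"))
```

Because the location is derived purely from the key, every client that shares the same shard list arrives at the same answer. The trade-off of this simple rule is that changing the number of shards changes where most keys map.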
In dynamic sharding, an external service validates that data exists in a namespace and records which shard holds each object’s data.
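A rough sketch of the dynamic approach follows. The `ShardLocator` class is a hypothetical, in-memory stand-in for the external lookup service; a real one would run as a separate, persistent service.

```python
class ShardLocator:
    """Hypothetical in-memory stand-in for an external lookup service."""

    def __init__(self):
        self._locations = {}  # maps a key to the name of the shard holding it

    def assign(self, key, shard):
        """Record (or update) which shard holds the data for this key."""
        self._locations[key] = shard

    def exists(self, key):
        """Report whether any shard is recorded for this key."""
        return key in self._locations

    def locate(self, key):
        """Return the shard for this key, or raise KeyError if unknown."""
        if key not in self._locations:
            raise KeyError(f"no shard recorded for {key!r}")
        return self._locations[key]


locator = ShardLocator()
locator.assign("user:42", "shard-a")
print(locator.locate("user:42"))
```

Since the mapping lives in the service rather than in a formula, data can be moved to another shard by updating a single entry, at the cost of an extra lookup on each access.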
Another way to categorize sharding concerns how the data itself is split. The distinction between horizontal and vertical sharding stems from the tabular layout of the database.
- Vertical sharding
A database can be partitioned vertically, which means data from different columns is stored in separate shards.
- Horizontal sharding
In horizontal sharding, the rows of a table are distributed across shards. The table is divided by its rows rather than its columns, so every shard keeps all the columns but holds only some of the rows.
- Domain-specific sharding
The application data is split along logical lines, and each part is stored in a different database. Generally, this type of splitting is applied at the application level.
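The difference between vertical and horizontal splitting can be sketched with a small in-memory table. The sample rows, the function names, and the modulo rule used for the horizontal split are all assumptions made for illustration.

```python
# A tiny sample table; the rows and column names are made up for illustration.
rows = [
    {"id": 1, "name": "Ada", "email": "ada@example.com"},
    {"id": 2, "name": "Bob", "email": "bob@example.com"},
    {"id": 3, "name": "Cy", "email": "cy@example.com"},
    {"id": 4, "name": "Di", "email": "di@example.com"},
]


def split_vertically(rows, key, column_groups):
    """Each shard holds every row, but only the key plus one group of columns."""
    return [
        [{col: row[col] for col in (key, *group)} for row in rows]
        for group in column_groups
    ]


def split_horizontally(rows, key, shard_count):
    """Each shard holds all columns, but only the rows whose key maps to it."""
    shards = [[] for _ in range(shard_count)]
    for row in rows:
        shards[row[key] % shard_count].append(row)
    return shards


vertical = split_vertically(rows, "id", [("name",), ("email",)])
horizontal = split_horizontally(rows, "id", 2)
```

In the vertical split, the key column is repeated in every shard so the pieces of a row can be joined back together. In the horizontal split, each row lives in exactly one shard.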
Importance of Sharding
Sharding is important for several reasons, some of which were briefly mentioned earlier. Let’s reiterate the main advantages of this technique:
- Increased search performance. Because the rows of a table are held on separate shards, each shard has a smaller index. This results in faster searches.
- Reliability. Data reliability is a fundamental goal of sharding. Multiple servers usually translate to higher availability, since downtime on one server is less costly than when everything depends on a single back-end data source.
- Harmonizing. The system load is spread evenly across the servers, and the stored data can be managed more efficiently.
- Speed. Sharding lets a system draw on the computing power of more than one server, which brings noticeable performance improvements.