Understanding Object Storage through its performance
Nowadays, anyone who wants to store cool or cold data smartly will be guided toward an Object Storage solution. This cloud model has replaced many usages such as our old FTP servers, our backup storage, or static website hosting. The keywords here are "low price", "scalability", "unlimited". But as we can observe with compute instances, not all Object Storage offerings are equal: first in terms of price, then in performance.
What qualifies Object Storage performance?
Depending on your architecture, latency can be a key factor, particularly for workloads built around small blobs. A common example is static website hosting: the average file size won't exceed 1 MB, so you expect clients to receive files almost instantly. Keep in mind that an Object Storage service is (generally) a single point of service, so for inter-continental connections it's recommended to put a CDN in front. The table below describes the worldwide average time-to-first-byte toward storage services:
| Africa | Asia | China | Europe | N. America | Pacific | S. America |
|--------|------|-------|--------|------------|---------|------------|
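To see where your own clients stand relative to these averages, you can measure time-to-first-byte yourself. Below is a minimal sketch using only the Python standard library; the bucket URL shown in the comment is a hypothetical placeholder, not a real endpoint:

```python
import time
import urllib.request

def time_to_first_byte(url: str, timeout: float = 10.0) -> float:
    """Return seconds elapsed between sending the request and
    receiving the first byte of the response body."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read(1)  # blocks until the first body byte arrives
        return time.perf_counter() - start

# Hypothetical endpoint -- replace with your provider's bucket URL:
# print(f"TTFB: {time_to_first_byte('https://my-bucket.example.com/index.html') * 1000:.1f} ms")
```

Run it from the regions your users actually connect from: the same bucket can show very different numbers depending on the client's continent.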
If you work with large objects, bandwidth is a more interesting metric. This is especially visible in Big Data architectures: thanks to their low storage costs, Object Storage services are well suited to storing huge datasets, but between the remote and local storage, network bandwidth is the main bottleneck. As with latency, the factor is twofold: both the client and server networks count, and on that front cloud providers aren't equal. The server's bandwidth can be throttled at different layers:
- Per connection: a maximum bandwidth is set for each incoming request
- At the bucket layer: each bucket is limited
- For the whole service: the limitation is global to the tenant or to each deployed Object Storage service
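When the cap applies per connection, a common workaround is to split the object into byte ranges and download them in parallel, so each range gets its own throttled connection. Here is a minimal sketch with the standard library; it assumes the service supports HTTP Range requests (most S3-compatible APIs do) and that you already know the object size, e.g. from a HEAD request:

```python
import concurrent.futures
import urllib.request

def parallel_download(url: str, size: int, parts: int = 4) -> bytes:
    """Fetch an object as `parts` HTTP Range requests in parallel,
    one throttled connection per range, then reassemble the body."""
    step = -(-size // parts)  # ceiling division
    ranges = [(lo, min(lo + step, size) - 1) for lo in range(0, size, step)]

    def fetch(rng):
        req = urllib.request.Request(
            url, headers={"Range": f"bytes={rng[0]}-{rng[1]}"}
        )
        with urllib.request.urlopen(req) as resp:
            return resp.read()

    with concurrent.futures.ThreadPoolExecutor(max_workers=parts) as pool:
        # map() preserves order, so the chunks concatenate correctly
        return b"".join(pool.map(fetch, ranges))
```

Whether this helps depends on where the throttle sits: it defeats a per-connection cap, but does nothing against a bucket-level or service-level limit.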
While Object Storage often appears as a simple filesystem exposed over HTTP, under the hood many technical constraints appear on the cloud provider's side. Buckets are presented as near-unlimited flat blob containers, but several factors can make your performance vary:
- The total number of objects in your bucket
- The total size of objects in your bucket
- The name of your objects, especially the prefix
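The prefix matters because several providers partition buckets internally by key prefix, so sequential names (timestamps, incrementing IDs) can concentrate traffic on one partition. A common mitigation, sketched below, is to derive a short hash prefix for each key; the shard count and the `xx/` layout are arbitrary illustrative choices, not a provider API:

```python
import hashlib

def spread_key(key: str, shards: int = 16) -> str:
    """Prepend a deterministic hash-derived prefix so that keys spread
    evenly across a provider's prefix-based internal partitions."""
    digest = hashlib.md5(key.encode()).hexdigest()
    shard = int(digest, 16) % shards
    return f"{shard:02x}/{key}"

# Sequentially named keys such as "logs/2024/01/app.log",
# "logs/2024/02/app.log", ... end up under different prefixes.
```

The trade-off is that listing objects by their natural prefix becomes harder, so only apply this where hot-partition throughput is the actual bottleneck.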
Something never shown on the landing pages is the capacity to handle a high load of connections. Here again the market isn't homogeneous: some vendors sustain traffic spikes worthy of a DDoS, while others will degrade in performance or simply return HTTP 429 Too Many Requests. The solution may be to balance the load across services/buckets, or to use a CDN, which is more appropriate for intensive HTTP workloads.
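On the client side, a 429 should be handled with exponential backoff, honouring the Retry-After header when the service sends one. A minimal sketch with the standard library:

```python
import time
import urllib.error
import urllib.request

def get_with_backoff(url: str, retries: int = 5) -> bytes:
    """GET an object, retrying on HTTP 429 with exponential backoff.
    Uses the Retry-After header as the delay when the service sends one."""
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == retries - 1:
                raise
            delay = float(err.headers.get("Retry-After", 2 ** attempt))
            time.sleep(delay)
```

Official SDKs (boto3 and friends) ship similar retry policies out of the box, so only hand-roll this when you talk to the HTTP API directly.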
There's no rule of thumb to establish whether an Object Storage service performs well from its specification alone. Even if providers use standard software such as Ceph, the hardware and configuration create a genuine solution with its own constraints and advantages. That's why performance testing is always a requirement to understand the product's profile.
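A basic load test is easy to sketch: fire concurrent GETs at a test object and look at throughput and latency percentiles. The harness below uses only the Python standard library; the request count and concurrency are placeholders to adapt to your own benchmark, and dedicated tools (s3-benchmark, warp, etc.) go much further:

```python
import concurrent.futures
import time
import urllib.request

def benchmark(url: str, requests_total: int = 100, concurrency: int = 10) -> dict:
    """Issue `requests_total` GETs with `concurrency` workers and report
    overall throughput plus per-request latency percentiles."""
    def one_request(_):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as resp:
            resp.read()
        return time.perf_counter() - start

    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(one_request, range(requests_total)))
    elapsed = time.perf_counter() - start

    return {
        "req_per_s": requests_total / elapsed,
        "p50_ms": latencies[len(latencies) // 2] * 1000,
        "p95_ms": latencies[int(len(latencies) * 0.95)] * 1000,
    }
```

Run the same harness against each candidate provider, from the same client network, with the same object sizes: only then are the numbers comparable.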