Understand Object Storage by its performance

How to qualify Object Storage performance

Nowadays, anyone who wants to store cool or cold data smartly will be guided toward an Object Storage solution. This cloud model has replaced many older setups, such as FTP servers, backup storage or static website hosting. The keywords here are “Low price”, “Scalability” and “Unlimited”. But as we can observe with compute instances, not all Object Storage services are equal, first in terms of price, then in terms of performance.

What qualifies Object Storage performance?

Latency

Depending on your architecture, latency can be a key factor for workloads that deal with small blobs. A common example is static website hosting: the average file size won’t exceed 1 MB, so you can expect files to be received by clients almost instantly.

Keep in mind that an Object Storage is (generally) a single point of service, so for intercontinental connections it’s recommended to pair it with a CDN. The table below shows the worldwide average time-to-first-byte (TTFB) toward storage services:

             Africa   Asia    China   Europe  N. America  Pacific  S. America
Asia         1.393    1.264   1.065   0.812   0.899       1.233    1.272
Europe       0.874    0.820   0.957   0.214   0.490       0.996    0.768
N. America   1.343    0.934   1.164   0.635   0.325       0.870    0.652
Pacific      2.534    1.094   1.117   1.763   1.161       0.760    1.570

TTFB in seconds
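
To make the metric concrete, here is a minimal Python sketch of how a single TTFB sample could be taken. It is not the tooling behind the table above: the function name `measure_ttfb` is hypothetical, and the sketch assumes that TTFB means the time from opening the connection (DNS, TCP and TLS included) until the first byte of the response body is read. Other definitions stop at the first header byte.

```python
import http.client
import time
from urllib.parse import urlsplit


def measure_ttfb(url: str, timeout: float = 10.0) -> float:
    """Return the seconds elapsed until the first body byte of a GET arrives.

    Assumption: the timer covers connection setup (DNS, TCP, TLS), the
    request itself, and the wait for the first byte of the response body.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)

    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    try:
        start = time.perf_counter()
        conn.request("GET", path)      # connects on first use, then sends the request
        response = conn.getresponse()  # returns once status line and headers are read
        response.read(1)               # first byte of the body (empty if no body)
        return time.perf_counter() - start
    finally:
        conn.close()


# Usage (the URL is a placeholder, replace it with a small public object):
# print(f"{measure_ttfb('https://storage.example.com/bucket/1byte.txt'):.3f} s")
```

A single sample is noisy, so repeating the call and averaging gives a more representative value.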

Bandwidth

If you work with large objects, bandwidth is a more interesting metric. This is especially visible in Big Data architectures: thanks to its low storage cost, Object Storage is well suited to storing huge datasets, but between remote and local storage, network bandwidth is the main bottleneck.

As with latency, two sides are involved: both the client and the server networks count, and in this game clouds aren’t equal. Server-side bandwidth can be throttled at different layers:

  • Per connection: a maximum bandwidth is set for each incoming request
  • Per bucket: each bucket is limited
  • Per service: the limit is global for the tenant or for each deployed Object Storage service
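
The interplay of these layers can be illustrated with a small Python sketch. It is a deliberately simplified model, not a description of any provider: it assumes that parallel connections add up until the client link, the bucket cap or the service cap is reached. The function names and all figures in the example are hypothetical.

```python
def effective_bandwidth_mbps(client_link, per_connection_cap, bucket_cap,
                             service_cap, connections=1):
    """Return the achievable aggregate throughput in Mbit/s.

    Simplified assumption: each connection is capped individually, and the
    sum of all connections is then bounded by the client link, the bucket
    cap and the service-wide cap.
    """
    if connections < 1:
        raise ValueError("connections must be >= 1")
    return min(client_link, per_connection_cap * connections,
               bucket_cap, service_cap)


def transfer_seconds(size_gb, bandwidth_mbps):
    """Time to move size_gb (decimal gigabytes) at bandwidth_mbps (Mbit/s)."""
    return size_gb * 8000 / bandwidth_mbps  # 1 GB = 8000 Mbit


# Hypothetical limits: 10 Gbit/s client link, 800 Mbit/s per connection,
# 5 Gbit/s per bucket, 20 Gbit/s for the whole service.
for connections in (1, 4, 16):
    bandwidth = effective_bandwidth_mbps(10_000, 800, 5_000, 20_000, connections)
    print(connections, bandwidth, transfer_seconds(100, bandwidth))
```

In this model, adding connections helps only until another layer becomes the bottleneck, which is why a test should vary the level of parallelism.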

Bucket scalability

While Object Storage often looks like a simple filesystem available over HTTP, many technical constraints arise under the hood for the cloud provider. Buckets are presented as more or less unlimited flat blob containers, but several factors can make your performance vary:

  • The total number of objects in your bucket
  • The total size of objects in your bucket
  • The name of your objects, especially the prefix
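
As an illustration of the last point, one naming scheme sometimes used is to prepend a short hash to each key so that objects written together do not share the same leading characters. The Python sketch below shows the idea; `spread_key` is a hypothetical helper, and whether such a scheme helps at all is an assumption that depends on how a given provider indexes keys, so it should be validated by testing.

```python
import hashlib


def spread_key(key: str, width: int = 4) -> str:
    """Prefix an object key with the first hex characters of its SHA-256 hash.

    The prefix is deterministic, so the same logical key always maps to the
    same stored key. Trade-off: listing objects by their natural prefix
    (for example a date) is no longer possible on the stored keys.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:width]}/{key}"


# Sequential names end up under different leading characters.
for name in ("logs/2024-01-01.gz", "logs/2024-01-02.gz", "logs/2024-01-03.gz"):
    print(spread_key(name))
```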

Burst handling

Something never presented on landing pages is the capacity to handle a high load of connections. Here again the market isn’t homogeneous: some vendors withstand peaks worthy of a DDoS, while others show degraded performance or simply return an HTTP 429 Too Many Requests.

The solution may be simply to balance the load across services or buckets, or to use a CDN service, which is more appropriate for intensive HTTP workloads.
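
On the client side, a complementary technique (not one of the solutions listed above) is to retry with exponential backoff when a 429 is returned. The following Python sketch uses only the standard library. The name `get_with_backoff` and its parameters are hypothetical, and it assumes that the server either sends a numeric `Retry-After` header or none at all.

```python
import random
import time
import urllib.error
import urllib.request


def get_with_backoff(url, max_attempts=5, base_delay=0.5,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """GET a URL, retrying with exponential backoff on HTTP 429.

    `opener` and `sleep` are injectable so the logic can be exercised
    without network access or real waiting.
    """
    for attempt in range(max_attempts):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Only 429 is retried; any other status, or the last attempt, is re-raised.
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            retry_after = err.headers.get("Retry-After") if err.headers else None
            if retry_after is not None and retry_after.isdigit():
                delay = float(retry_after)  # honor the server's hint (in seconds)
            else:
                # Exponential backoff with jitter to avoid synchronized retries.
                delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

Backoff only smooths short bursts. If 429 responses are frequent, balancing the load or adding a CDN remains the structural answer.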

Conclusion

There’s no rule of thumb for establishing whether an Object Storage has good performance from its specification alone. Even if providers use standard software such as Ceph, the hardware and configuration make each solution unique, with its own constraints and advantages. That’s why performance testing is always required to understand a product’s profile.

Observe worldwide network latencies

Have you ever wondered which provider will give your users the best latency? Not a theoretical value, but an accurate metric representing a real end-to-end connection. At Cloud Mercato, our platform allows us to manage cloud components all around the globe. VPS, virtual machines, buckets or CDNs: we can easily set up worldwide client-server configurations and run network workloads. But this approach could be described as datacenter to datacenter: my client is an instance at provider X and it hits another machine at provider Y. Since providers are always expected to have low-latency connections, this scenario is not suited to testing a real end-user connection.

From our point of view, this performance test has to be done under the same conditions as an end user: from a 3G/4G/5G device, over WiFi, through ADSL or optical fiber. Instead of creating yet another Unix command, we decided to write Observer, a web application that lets you test more than 100 locations directly from your browser.

What does it do?

Observer displays performance from live tests run by your browser. We set up endpoints on a range of Object Storages and CDNs, and let you compare performance across the different solutions and providers.

Currently, this application requests from our CTP a list of available endpoints serving a 1-byte file. For each item, an AJAX request is launched and outputs the Time To First Byte (TTFB). This value is reported in milliseconds in the table on the left and as a temperature on the map.
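
The loop described above can be approximated outside the browser with a short Python sketch. This is not Observer's code: it measures from the machine running the script rather than from a browser, the endpoint names and URLs are placeholders, and `probe_seconds` and `rank_endpoints` are hypothetical helpers.

```python
import time
import urllib.request


def probe_seconds(url, timeout=10.0):
    """Seconds from issuing a GET until the first body byte is read."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read(1)
    return time.perf_counter() - start


def rank_endpoints(endpoints, probe=probe_seconds):
    """Probe each endpoint once and return (name, milliseconds) pairs.

    Results are sorted from fastest to slowest. Endpoints that fail with a
    network error are reported with None and placed last.
    """
    results = []
    for name, url in endpoints.items():
        try:
            results.append((name, round(probe(url) * 1000, 1)))
        except OSError:  # covers URLError, HTTPError and socket timeouts
            results.append((name, None))
    return sorted(results, key=lambda item: (item[1] is None, item[1] or 0.0))


# Usage with placeholder endpoints, each expected to serve a 1-byte file:
# endpoints = {
#     "provider-a-eu": "https://a.example.com/1byte",
#     "provider-b-us": "https://b.example.com/1byte",
# }
# for name, ms in rank_endpoints(endpoints):
#     print(name, ms)
```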

Some quick observations

  • If your target is regional, CDNs may not bring you an advantage in terms of latency
  • Even without a CDN, Google benefits a lot from its private worldwide network

What is the future of this application?

It is still a beta/PoC, but it clearly fulfills our ambition: testing TTFB from anywhere. From this seed we already imagine many uses:

  • Smart integration directly on provider websites for live testing
  • Better data visualization with charts
  • Global data visualization to understand latency by geographical area, provider and/or device
  • Bandwidth tests with upload and download
  • Integration of our pricing data
  • Yes, a new skin …

If you are a provider and would like to integrate your product into this application, do not hesitate to contact us. In any case, we invite you to test it and give us feedback; we love to see other insights.