Latency in Global Lock Servers

The file storage world has long been dominated by large, expensive devices tuned to serving files locally. This hardware was not built to expand access to files to multiple offices or mobile devices. But today’s end users need to be able to access files from any location and on any device, and they need to be able to collaborate easily with each other, editing and updating the same project files without introducing conflicts. This is the reason so many businesses are now embracing cloud-backed global file systems.

But a global file system still has to be fast. An engineer located in New York needs to be able to open a project file without delay, even if that file was just updated by her colleague in Los Angeles. Any recent changes need to appear in New York, too, and the global file system has to prevent editing conflicts between any users who might be working on the project. Otherwise you end up with an IT nightmare.

Global Lock Servers

Conflict prevention is the job of the global lock server, an essential part of the global file system. Unfortunately, some degree of latency in global lock servers is unavoidable. As I will explain in this post, the key is to minimize that latency so that the end user does not notice it.

There are two types of latency, lock-related and network-related, and the degree to which they impact performance depends both on the particular operation being carried out and on the overall design of the system. There are two design approaches to consider. The first is to place a lock server on the storage hardware at each location, then have these lock servers communicate via the corporate network. The second is to have a single, cloud-based lock server that communicates with each of those local devices. The choice between the two involves multiple trade-offs.

Lock-related Latency

Consider lock-related latency first. The device-centric approach to locking can reduce latency for local users since the system doesn’t have to go to the cloud for a lock. With the cloud-centric approach, the system does have to communicate with the cloud, but latency is consistently minimal: the difference between the two is on the order of milliseconds. This performance is also consistent across all locations because the lock server is a dedicated, web-scale server that can easily scale up as it needs more resources.


The device-centric approach to the lock server does not offer this level of consistency. Performance depends in part on the available resources of a constrained, physical device that is carrying out many other tasks, including serving user I/O, dealing with metadata and pushing data to the cloud. As device load increases, the latency introduced by the lock server itself can become a real problem, both for local users and for those in distant locations.
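To make the load effect concrete, here is a minimal sketch in Python. It assumes the lock server on a busy device behaves like a simple single-queue (M/M/1) system, which is a textbook simplification I am using for illustration and not a description of how any particular product works. The function name and the 2 ms service time are hypothetical placeholders, not measurements.

```python
def mean_lock_latency_ms(service_ms: float, utilization: float) -> float:
    """Mean time a lock request spends at the lock server (waiting plus service).

    Assumes an M/M/1 queue: mean response time = service time / (1 - utilization).
    `utilization` is the fraction of the device's capacity already in use.
    """
    if service_ms <= 0:
        raise ValueError("service_ms must be positive")
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_ms / (1.0 - utilization)


# Hypothetical 2 ms service time; watch how latency grows as the device gets busier.
for utilization in (0.1, 0.5, 0.9):
    latency = mean_lock_latency_ms(2.0, utilization)
    print(f"utilization {utilization:.0%}: about {latency:.1f} ms per lock request")
```

Under this assumed model, latency grows slowly at low load and then climbs steeply as the device approaches saturation, which is the pattern a constrained device juggling many tasks would be expected to show.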

Network-related Latency

Next you have to consider network-related slowdowns. With the cloud-based approach, latency can increase slightly as load on the local office's connection to the cloud grows, but the effect is consistently minimal. The device-centric approach introduces another sort of problem. Consider again the two hypothetical end users in New York and Los Angeles. Even if the user in New York has a locally cached version of the file in question, she could experience delays in opening the file because her local server needs to communicate with the lock server on the home device in Los Angeles. The lock request has to hop and push and wind its way through the corporate network to get to Los Angeles, then return. Even with MPLS, this could be a more difficult and time-consuming journey than a trip to the cloud, not to mention the load that many of these transactions add to the corporate network. If the inter-office communication runs through a VPN over the Internet, you are also introducing the latency of going across the Internet and through the VPN servers.

Generally, the total latency is equal to the sum of the network latency and the lock server latency, but it can vary greatly depending on the situation. The application in question, the sharing occurring, access patterns, transactions, and whether an open request points to one file or many nested under the covers can all impact latency. What the cloud-centric system gives you is consistently minimal latency across all locations and devices. The local, device-based approach, on the other hand, can vary significantly with load and, as discussed above, location. This means end users do not enjoy equal access to the same files, and providing equal access should really be one of the main functions of a global file system.
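The sum described above can be sketched in a few lines of Python. Everything named here (LockPath, open_latency_ms) is hypothetical, and every number is an illustrative placeholder I chose for the example, not a measurement of any real system. The sketch also assumes that when one open request touches several files, the lock requests are issued one after another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LockPath:
    """One route a lock request can take (hypothetical model)."""
    network_rtt_ms: float   # round trip from the local device to the lock server
    lock_server_ms: float   # time the lock server itself takes to answer

    def total_ms(self) -> float:
        # Total latency = network latency + lock server latency.
        return self.network_rtt_ms + self.lock_server_ms


def open_latency_ms(path: LockPath, lock_requests: int = 1) -> float:
    """Latency of an open that needs `lock_requests` sequential locks (assumption)."""
    if lock_requests < 1:
        raise ValueError("lock_requests must be at least 1")
    return lock_requests * path.total_ms()


# Placeholder values only: swap in your own measurements.
cloud = LockPath(network_rtt_ms=30.0, lock_server_ms=2.0)
device_local = LockPath(network_rtt_ms=1.0, lock_server_ms=5.0)
device_remote = LockPath(network_rtt_ms=70.0, lock_server_ms=5.0)

for name, path in [("cloud", cloud),
                   ("device, same office", device_local),
                   ("device, remote office", device_remote)]:
    print(f"{name}: {path.total_ms():.1f} ms per lock, "
          f"{open_latency_ms(path, lock_requests=3):.1f} ms for a 3-file open")
```

The point of the sketch is the shape of the comparison, not the specific values: with the device-based design the total depends heavily on which office holds the lock, and any per-request cost is multiplied when a single open involves many files.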

Nasuni has always taken the cloud-centric approach to storing, protecting, managing and extending access to files. That is part of the reason we are able to deliver a complete set of Enterprise File Services. We chose to design our system with a cloud-based global lock server because we wanted something that was built to scale – whether that growth relates to the number of offices, mobile users, files or all three.

If you would like to learn more, or discuss your particular use case, we’d be happy to talk.