Distributed cloud is an emerging cloud model in which an organization’s cloud strategy and deployment take into consideration where the underpinning infrastructure resources are physically located. This contrasts with the original cloud concept from which it emerged, which abstracted the notion of infrastructure location away from the minds of cloud users. In practice, the distributed cloud still abstracts the cloud away from the location of the underlying infrastructure. But now cloud engineers have greater flexibility to place infrastructure resources closer to where they are actually needed, in other words, pushing these resources out to the edge where compute and storage are used so that latency can be significantly reduced.
Edge sites are data centers closer to customers. “The edge” refers to the location where devices and local area networks interface with the internet. By placing resources near the edge, particularly where there are more customers and greater usage, cloud service providers (CSPs) can reduce latency and provide better quality of service for their users.
By combining multicloud models with distributed cloud, and emphasizing pushing resources to the edge, the concept of substations has also emerged to encompass the idea that CSPs are making available something akin to “shared cloud zones.” These zones, strategically distributed by CSPs, enable companies to customize a distributed cloud deployment aligned with their needs, while importantly retaining the key cloud value proposition, where the CSP assumes infrastructure risk and promises uptime, innovation, and support.
Distributed cloud systems offer the same storage features that other cloud solutions provide. Since distributed systems are technical strategies for improving availability and consistency, they are less of an active concern for consumers who use the cloud so that they can forget about IT in their day-to-day tasks. However, when entering into agreements where availability is critical, cloud consumers may want to ensure that their CSP offers the advantages of distributed cloud systems. Most distributed cloud storage systems have the following features:
- Elasticity and Scalability — Elasticity refers to the ability to adapt to a changing workload by adding or removing resources on demand, whereas scalability refers to the ability of a system to take on a larger workload as resources are added. Elasticity and scalability are key features of cloud-based systems.
- Partitioning — Partitioning allows IT teams to logically divide storage devices, which in turn allows different file systems to be installed on different logical divisions. When many environments are needed, partitioning lets a single set of hardware run multiple different operating systems at the same time, another key feature of clouds that enables service to multiple clients with unique IT requirements.
- Replication — Replication is an enabling feature for distributed systems. By replicating data across multiple clusters, more users can be served from wider regions. The basic challenge for cloud providers is ensuring replicated sets are consistent with each other so that all users are accessing relevant and timely information.
- Fault Tolerance — Systems that are fault-tolerant still suffer errors and system outages but remain available despite issues. These systems use an array of strategies to ensure uptime, including replication, redundant hardware, and additional connectivity.
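The interplay between replication, consistency, and fault tolerance described above can be illustrated with a minimal sketch. It assumes a simple majority-quorum scheme, which is only one of several strategies a provider might use; the `Replica` and `ReplicatedStore` names and the region labels are hypothetical and do not correspond to any particular CSP's API.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One storage node holding versioned key/value pairs (hypothetical)."""
    name: str
    online: bool = True
    data: dict = field(default_factory=dict)  # key -> (version, value)


class ReplicatedStore:
    """Toy quorum store: reads and writes need a majority of replicas."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.quorum = len(self.replicas) // 2 + 1

    def _reachable(self):
        live = [r for r in self.replicas if r.online]
        if len(live) < self.quorum:
            raise RuntimeError("quorum unavailable")
        return live

    def write(self, key, value):
        live = self._reachable()
        # New version is one higher than the newest version any live replica holds.
        version = 1 + max(r.data.get(key, (0, None))[0] for r in live)
        for r in live:
            r.data[key] = (version, value)
        return version

    def read(self, key):
        live = self._reachable()
        # Any two majorities overlap, so the highest version seen is the latest write.
        _, value = max((r.data.get(key, (0, None)) for r in live),
                       key=lambda pair: pair[0])
        return value
```

With three replicas, the store keeps serving reads and writes while any single replica is offline, and a replica that missed a write is never trusted over one holding a newer version. Once two replicas are down, the store refuses requests rather than risk returning stale data.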
The main benefit of distributed cloud is that it brings processing and storage capacity closer to the end user or device. While the primary impact is reduced latency and greater fault tolerance, it has also led to new innovations in combination with technologies like the Internet of Things. The following benefits directly impact cloud consumers:
- Improved Compliance — Data compliance and privacy remain heightened issues in the cloud space. The distributed cloud allows companies to isolate sensitive information, such as Personally Identifiable Information (PII) or Protected Health Information (PHI), onto servers in regions that comply with regulations, while other less sensitive and highly accessed data can remain on servers closer to users, even if those regions fall outside of compliance.
- Increased Uptime — The distributed cloud is a redundancy strategy that improves service failover and creates highly dependable services. However, geo-replication brings with it security challenges and complexity in maintaining, deploying, and troubleshooting these systems.
- Improved Delivery — For cloud providers, distributed cloud improves content delivery and, ultimately, the user experience for cloud consumers, because resources are closer to where they are used.
- Improved Scalability — Like public clouds, distributed clouds have the advantage of rapid scaling. Because new virtual machines are easy to add to a deployment, new resources can be provisioned on demand.
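The compliance benefit above amounts to a placement rule: regulated data is pinned to approved regions, while everything else follows the user. The sketch below shows one way such a rule might look. The region names, the `APPROVED_REGIONS` policy table, and the `choose_region` function are all illustrative assumptions, not any provider's actual configuration or API.

```python
from dataclasses import dataclass

# Hypothetical policy: regions approved for each regulated data class.
APPROVED_REGIONS = {
    "pii": {"eu-central"},
    "phi": {"us-east"},
}


@dataclass(frozen=True)
class Record:
    key: str
    classification: str  # "pii", "phi", or anything else for unregulated data


def choose_region(record, user_region, available_regions):
    """Pick a storage region for a record given where its user is located."""
    available = set(available_regions)
    approved = APPROVED_REGIONS.get(record.classification)
    # Regulated data may only use approved regions; other data may use any region.
    candidates = available if approved is None else available & approved
    if not candidates:
        raise ValueError(f"no permitted region for {record.classification!r} data")
    # Prefer the user's own region when permitted; otherwise pick deterministically.
    return user_region if user_region in candidates else sorted(candidates)[0]
```

Under this assumed policy, a PII record belonging to a user in `ap-south` is stored in `eu-central`, while that same user's unregulated data stays in `ap-south`, close to where it is read.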
Public cloud deployments allow consumers to offload the risks and costs of maintaining and operating their own IT infrastructure. Most cloud service providers will happily assume that risk under a pay-as-you-go model, provisioning only the storage and compute resources the consumer needs and charging for that amount. The CSP and consumer enter into a service level agreement (SLA) that establishes the quality of service that is expected. The responsibility then becomes the CSP's to deliver the agreed-upon quality of service, security, and compliance. For customers that have users in multiple locations, CSPs turn to distributed cloud technologies to fulfill their obligations.
The challenge is to deliver data and services from the cloud reliably and quickly to users around the globe. While cloud providers conceal from the consumer how this is done, to abstract away hardware concerns, the premise is simple: to deliver services reliably and quickly, place a copy of those services and data near where they are used most. In fact, make multiple copies and place them in multiple regions to serve the closest users and build a level of fault tolerance through redundancy. These locations are called points of presence (PoPs). A PoP is an access point or physical location where two devices or networks can communicate. For instance, an ISP line in a home is a PoP for the ISP and home network, and a cell tower is a PoP for the carrier service and a mobile phone. In the case of the CSP, a local or regional data substation is the PoP for local users of the cloud resources.
The net effect is to extend providers' reach and make the cloud experience for all users feel as if they were connected directly to the central system.
Cloud can refer to many different types of deployments; however, they are typically categorized by ownership and responsibility. A public cloud is one that is provided by a cloud service provider over the public Internet. It is exposed to all the connectivity and threats in that space. The CSP is responsible for maintaining the cloud infrastructure, and, depending on the type of service provided (IaaS, PaaS, or SaaS), the cloud consumer is responsible for their part. A private cloud is owned and maintained by one organization and secured behind its firewall, with access limited to a particular group, or a group within a group.
A distributed cloud is a cloud with the added consideration of where cloud infrastructure resources are situated geographically. While the cloud connotes infinite accessibility and availability, the physical limitations of technology mean it cannot faithfully deliver on that promise. To overcome real-world limitations, like latency, CSPs and enterprises develop cloud substations, where resources can be added to a cloud fabric but the infrastructure is actually geographically closer to users. The cloud appears the same, but the underlying infrastructure is more efficiently placed.
Distributed cloud and distributed cloud storage are nearly synonymous in functionality from the perspective of cloud consumers. Today, the cloud is moving further toward a distributed system to improve overall service delivery. One subtle way to distinguish distributed cloud storage is to compare it to cloud computing. Though they operate on the same techniques and hardware, cloud computing distributes workload across a data center's servers to be more efficient and redundant, whereas distributed cloud storage distributes storage and workload across a network of systems, most likely geographically distant from each other, to be more efficient and redundant.
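As a rough illustration of serving each user from the closest point of presence, the sketch below picks the nearest PoP by great-circle distance and falls back to the next nearest when a PoP is unhealthy. Geographic distance is only a proxy for latency, and real providers typically rely on network-level measurements and routing instead. The `POPS` table, its coordinates, and the function names are assumptions made for this example.

```python
import math

# Hypothetical PoPs: name -> (latitude, longitude) in degrees.
POPS = {
    "frankfurt": (50.11, 8.68),
    "virginia": (38.95, -77.45),
    "singapore": (1.35, 103.82),
}


def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in kilometers."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_pop(user_location, healthy=None):
    """Return the closest PoP name, considering only those listed in `healthy`."""
    if healthy is None:
        candidates = POPS
    else:
        candidates = {name: POPS[name] for name in healthy if name in POPS}
    if not candidates:
        raise LookupError("no healthy PoP available")
    return min(candidates,
               key=lambda name: haversine_km(user_location, candidates[name]))
```

A user near Paris would be routed to the `frankfurt` PoP; if that PoP is removed from the healthy set, the same call returns the next closest one, which is the redundancy benefit the paragraphs above describe.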
Edge computing is one of the latest concepts emerging in a cloud context. Though the term may be fresh, the technical idea is not: pushing compute and storage resources closer to “the edge” of the network has always been a conceptual solution to latency. The edge refers to a portion of the distributed system, namely the location where data processing incurs the least latency and the fewest bandwidth issues, but it is not a technology in itself. Google Distributed Cloud makes use of the distinction by dividing its offerings into several types of edges where they can run:
- Google’s network edge — A global network of sites where consumers can host their cloud.
- Operator edge — Working with communication services providers to utilize 5G/LTE services as the edge network.
- Customer edge — Supporting customer-owned edge or remote locations whether that be a store location, a plant floor, or branch office.
- Customer data centers — Supporting customer-owned data centers and colocation facilities.
Depending on the paradigm, the edge can take many forms.
Distributed cloud has several practical use cases, each intent on managing latency and bandwidth. The power of distributed systems to significantly reduce latency has enabled other technologies and changed how resources are delivered:
- Edge and IoT Compute and Storage Capacity — The Internet of Things (IoT) itself is an edge, where data processing is performed by small appliances at the point of use. For instance, in manufacturing industries, line robots connected to edge computing can make quick decisions without needing to ping a central server for an answer. Self-driving cars are another application of edge computing: the safety implications of sending large amounts of sensor data to a central server while driving are too great. Combined with AI and machine learning, distributed systems and edge systems can work together to deliver low-latency services that otherwise, for many limiting reasons, could not be delivered.
- Content Delivery Optimization — Streaming services are the poster child of content delivery networks (CDNs), and Netflix is arguably the model. CDNs are collections of servers geographically distributed throughout the world that provide content to users from the most available local servers. Netflix's traffic volume and user reach were growing so much that, to keep pace, the company developed its own proprietary CDN called Open Connect. In a smart move to distribute content, Open Connect boxes (servers in Netflix red) are issued to partner ISPs in regions throughout the world. These Open Connect boxes connect with Netflix servers and download regionally popular content to serve local users.
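The general idea of serving content from a local box instead of a distant origin can be sketched as a small cache. This is a generic pull-through cache with least-recently-used eviction, chosen for brevity; it is not a description of how Open Connect or any specific CDN is implemented, and the `EdgeCache` class and its origin catalog are hypothetical.

```python
from collections import OrderedDict


class EdgeCache:
    """Toy edge server: serves cached titles locally, fetches misses from origin."""

    def __init__(self, origin, capacity):
        self.origin = origin          # hypothetical origin catalog: title -> bytes
        self.capacity = capacity      # maximum number of titles held locally
        self.cache = OrderedDict()    # least recently used entry comes first
        self.origin_fetches = 0

    def get(self, title):
        if title in self.cache:
            self.cache.move_to_end(title)   # mark as most recently used
            return self.cache[title]
        content = self.origin[title]        # long-haul fetch on a cache miss
        self.origin_fetches += 1
        self.cache[title] = content
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used title
        return content
```

Repeated requests for a popular title reach the origin only once, so local users see lower latency and the long-haul link carries less traffic. A production CDN would add concerns this sketch ignores, such as expiry, prefetching of popular content, and cache invalidation.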