Building Resilience: Our Cloud Communications Infrastructure

We understand that our VoIP platform is vital to our customers; they depend on it to provide their products and services. Like security, availability and resilience can’t be ‘added later’ or only considered at the edges of the platform. They must be integral to its design.

Security and reliability have always been considerations when choosing new business-critical technology. However – in the wake of recent challenges for the telecommunications sector – they’re currently even more in the spotlight. As such, we wanted to take this opportunity to discuss our approach.


How we build resilience into our services

Rob Lesslie – Gradwell’s Head of Development – explains our platform architecture:

“The platform architecture is complex, comprising many different and interconnected elements. For performance and availability reasons each element is a cluster, built to scale independently and be resilient. The precise number of infrastructure nodes required in each cluster to deliver a high quality of service can’t be known in advance because it depends on usage, and differing patterns of usage drive different loads across the platform. We need the platform to be responsive enough that, as demand changes, it adjusts the mix of resources in use in real time.

“We are also cognisant that – because of events beyond our control – any individual node in a cluster could fail at any time without warning. This also drives the architecture toward one with no single points of failure, achieved through the use of homogeneous clusters.”
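To make the sizing idea concrete, here is a minimal Python sketch of how a cluster's node count could be derived from current load. It is an illustration only and not a description of the platform's actual scaling logic: the function name, the per-node capacity figure, the minimum of two nodes and the single spare node are all assumptions chosen for the example.

```python
import math


def desired_node_count(current_load: float,
                       capacity_per_node: float,
                       min_nodes: int = 2,
                       spare_nodes: int = 1) -> int:
    """Return how many identical nodes a cluster should run (hypothetical helper).

    current_load and capacity_per_node share a unit, e.g. concurrent calls.
    spare_nodes adds headroom so that losing one node does not overload the
    rest; min_nodes keeps the cluster free of a single point of failure even
    when it is idle. All default values are assumptions for this sketch.
    """
    if capacity_per_node <= 0:
        raise ValueError("capacity_per_node must be positive")
    if current_load < 0:
        raise ValueError("current_load must not be negative")

    # Nodes needed to carry the load, rounded up to a whole node.
    needed = math.ceil(current_load / capacity_per_node)

    # Add the spare capacity, then enforce the floor.
    return max(min_nodes, needed + spare_nodes)
```

With these assumed defaults, a load of 450 concurrent calls on nodes that each handle 100 gives six nodes: five to carry the load plus one spare. An idle cluster still keeps two nodes, so the failure of either one leaves the element running.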

Embracing the power of public cloud

“The best way to build a platform which effectively delivers against these requirements is to use the public cloud,” explains Rob. “However, treating the public cloud in the same way one would approach a traditional data centre deployment does not result in a modern, fit-for-purpose, cloud-native solution. The engineering approaches, platform architecture, tools, testing and monitoring methods must all adapt to get the best result. Continuous integration, continuous delivery, DevOps, and serverless computing are all necessary. We must also be aware of our regulatory responsibilities (especially with regard to data sovereignty and territoriality) while deploying in a way that gives us geographic resiliency.”

Protecting against DoS attacks and outages

“Following the principle of defence in depth, we have a number of layers of protection against DoS attacks and other load-based vectors,” says Rob. “Load often includes very high volumes of legitimate traffic from our customers, which we can distinguish from that of malicious actors and serve successfully by using the scalability inherent in the public cloud. Invalid traffic from third parties is evaluated and dropped as early as possible to reduce unwanted load.

“We also apply the zero-trust principle: all operations, including internal platform operations, require authentication over TLS, with each service and cluster obeying the principle of least privilege. As many network interfaces as possible are private, and all internal and external API operations are subject to rate limits and other protections.”
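As an illustration of what a rate limit on an API operation can look like, the sketch below implements a token bucket, one common rate-limiting technique. The article does not say which algorithm the platform uses, so treat this as an assumed example: the class name, parameters and figures are hypothetical.

```python
import time


class TokenBucket:
    """A token-bucket rate limiter (illustrative, not the platform's code).

    Tokens refill at rate_per_second up to a maximum of burst. Each request
    spends tokens; a request that cannot pay is rejected.
    """

    def __init__(self, rate_per_second: float, burst: float,
                 clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = burst
        self.tokens = float(burst)   # start with a full bucket
        self.clock = clock           # injectable so the limiter can be tested
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True if the request may proceed, False if it is limited."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now

        # Refill for the time that has passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)

        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A caller would typically keep one bucket per client or per API key, for example `TokenBucket(rate_per_second=5, burst=10)`, and reject or delay any request for which `allow()` returns `False`. Rejecting at this point is cheap, which fits the aim of dropping unwanted traffic as early as possible.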

What this means for Gradwell customers

“It is this inherent flexibility in the structure, coupled with a very high level of automation in platform orchestration and adherence to best architectural principles, which allows us to deliver a highly available service to our customers” concluded Rob.

In practical terms, some of the key benefits this approach makes possible include:

  • Uninterrupted service: even in the face of unforeseen challenges or heavy usage spikes, our services remain available and reliable.
  • Enhanced security: by following best practice, adhering to regulatory requirements and employing a zero-trust approach, we provide a highly secure environment and protect our customers from potential threats.
  • Improved flexibility: our cloud-native approach allows for seamless scaling of resources based on demand, ensuring that our customers can grow without constraints.
  • Efficiency: through automation we streamline operations, minimise response times and enhance the efficiency of our services, ultimately delivering a superior experience for our customers, their end-users and stakeholders.

