Interview Silk CTO: “Getting Out of the Storage Hardware Business Has Been a Liberating Experience”

Silk CTO Derek Swanson

Silk, a cloud data platform provider formerly known as Kaminario, with global offices located in North America, the UK, and EMEA, and its R&D in Israel, has been transforming its business from selling all-flash arrays to becoming a cloud block storage solutions provider. HostingJournalist sat down with Silk CTO Derek Swanson to talk about the company’s business transformation, Silk’s market position, and the market trends it is responding to.

Silk is a cloud data platform that enables companies to embrace hybrid cloud while optimizing performance and cost for their cloud infrastructures. The company decouples data from the underlying infrastructure making it possible to “quickly and easily” move data from on-prem to the cloud, from cloud to cloud, and back again without the need to refactor or rearchitect the applications.

Why this transformation from all-flash arrays to cloud block storage, from hardware to these software-defined storage solutions?

“As our customers increasingly began to adopt cloud solutions, we discovered that there was a gap in managing both their cloud and on-prem solutions coherently through a hybrid cloud strategy. In addition, as more companies become cloud-native, we realized they needed a way to ensure that they are getting the high performance that they need. With our software-defined solution, we were perfectly positioned to offer customers a private cloud, hybrid cloud, public cloud, and multi-cloud platform that enables them to easily and completely move and manage their data infrastructure, while offering superior performance for a reduced cost.”

“We are still developing and enhancing our all-flash arrays that run on-prem (private cloud), we’ve simply extended the platform to include public cloud. The block storage we provide in the public cloud is all-flash (no HDD or slower medium) and made for applications that want high performance and granular composable scalability for high efficiency.”

Silk claims that its clients can make cloud environments run smarter using Silk’s Cloud Data Platform. Can you explain?

“The Silk Cloud Data Platform leverages machine-learning analytics that help you monitor and optimize the performance of your workloads across your hybrid cloud and multi-cloud environments in real-time. You can easily orchestrate your data, moving workloads from one cloud environment to another, or on-prem to the cloud (and vice versa), as well as scale up and down to quickly meet your needs as they evolve. Silk also offers rules-based automation so you can put your infrastructure on cruise control and let Silk optimize your workloads for you.”

“Silk leverages the native cloud IaaS capabilities of adding and turning off compute resources as needed through an application we call Flex. Flex is our orchestration engine that can dynamically add and remove controller performance capacity, based on rules around typical SLOs such as throughput, latency, or IOPS. With a variety of different sizing and performance options, and with native NDU scale-out/in/up/down capabilities, Silk helps clients right-size the resources to the application to meet current load requirements.”
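To make the idea of rules-based scaling around SLOs more concrete, here is a minimal sketch of how such a rule could be evaluated. It is not Silk's Flex implementation: the class names, the `desired_cnodes` function, and every threshold are hypothetical, and only the 2-8 controller range is taken from the interview.

```python
from dataclasses import dataclass


@dataclass
class ScalingRule:
    """Hypothetical SLO rule; thresholds are illustrative, not Silk defaults."""
    max_latency_us: float        # scale out when observed latency exceeds this
    scale_out_util: float = 0.8  # scale out when IOPS utilization exceeds this
    scale_in_util: float = 0.5   # scale in only if a smaller cluster stays below this


@dataclass
class ClusterMetrics:
    cnodes: int            # current number of controller nodes
    latency_us: float      # observed latency in microseconds
    iops: float            # observed IOPS across the cluster
    iops_per_cnode: float  # assumed IOPS capacity of one controller node


def desired_cnodes(m, rule, min_nodes=2, max_nodes=8):
    """Return the controller count this simple rule would target next."""
    utilization = m.iops / (m.cnodes * m.iops_per_cnode)
    target = m.cnodes
    if m.latency_us > rule.max_latency_us or utilization > rule.scale_out_util:
        # SLO breached or cluster running hot: add one controller.
        target = m.cnodes + 1
    elif m.cnodes > min_nodes:
        # Remove a controller only if the remaining ones would stay well below capacity.
        smaller_util = m.iops / ((m.cnodes - 1) * m.iops_per_cnode)
        if smaller_util < rule.scale_in_util:
            target = m.cnodes - 1
    # Clamp to the allowed cluster size.
    return max(min_nodes, min(max_nodes, target))
```

Using a lower threshold for scaling in than for scaling out is a deliberate choice in this sketch: it keeps the cluster from oscillating between two sizes when load sits near a single threshold.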

How are scalability and performance in Silk’s Cloud Data Platform guaranteed from a technology perspective?

“Silk is a symmetric active-active architecture across both the front and back end and was originally designed as such. This means that a Silk cluster can scale out by adding additional controllers – c.nodes (typically today 2-8, but more are possible if someone needed that much performance) in a non-disruptive fashion. These new c.nodes insert into the active cluster and immediately begin processing IO, and are accessing the same resource pool, metadata database, host connections, etc. that the other c.nodes are accessing. When we add another c.node, we add that much more additional IOPS and throughput, and latency may also be lowered (if it was high before). The amount varies depending on the hardware platform, but typically a few hundred thousand IOPS and a few GB/s of throughput are added for each c.node. Latency of the platform under nominal load is typically about 300us for writes and 500us for reads.”
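As a rough illustration of the scale-out arithmetic described here, the sketch below estimates how many c.nodes a workload would need if each node adds a fixed amount of IOPS and throughput. The linear model and the per-node figures in the example call are assumptions made for illustration; the interview only says "a few hundred thousand IOPS and a few GB/s" per c.node, and real numbers depend on the hardware platform.

```python
import math


def cnodes_needed(target_iops, target_gb_s, iops_per_cnode, gb_s_per_cnode, min_nodes=2):
    """Estimate the c.node count for a workload, assuming linear scaling per node."""
    by_iops = math.ceil(target_iops / iops_per_cnode)       # nodes needed to meet IOPS
    by_throughput = math.ceil(target_gb_s / gb_s_per_cnode)  # nodes needed to meet GB/s
    # The stricter of the two requirements wins, with a floor of min_nodes.
    return max(min_nodes, by_iops, by_throughput)


# Hypothetical per-node figures: 300,000 IOPS and 3 GB/s per c.node.
print(cnodes_needed(1_000_000, 10, 300_000, 3))
```

With these assumed figures, a workload needing 1,000,000 IOPS and 10 GB/s lands on four c.nodes, since both requirements round up from about 3.3.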

“Once additional c.nodes are added to the cluster, the hosts can also be modified to add the additional IP targets in (for iSCSI load balancing) that improve performance even further by flattening out access patterns – as every c.node can service IO from any host for any read or any write across the entire capacity footprint of the cluster. This is not a hard requirement however, as only one node needs to have a host connected to it. That c.node will distribute IO to the other c.nodes in the cluster actively.”

“This makes us quite different from other products that are single-controller only, or dual-controller where the second controller is for failover only, commonly known as active-passive. It also differs from “dual-active” designs, in which two controllers both process IO but access different storage pools with different configurations and serve different hosts.”

“As long as there are enough compute resources available in the zone/region where the Silk data pod cluster is being instantiated to support adding additional c.nodes (or m.nodes in case additional capacity is required), Flex can scale out and/or up the cluster for additional performance or storage capacity, or Flex can create new data pod clusters. The only limitation is in available cloud IaaS hardware.”

Silk’s software stack is fully running on public cloud. Why is that?

“There are only two ways to get the same Tier 1 performance, rich data services, availability, and mobility that is available with on-prem solutions. One: physically co-locate hardware into cloud connected datacenters and cross connect hardware into them – which requires full partnership with the public cloud providers and is extremely expensive and inefficient. Two: leverage and aggregate existing cloud hyperscale infrastructure to create a data platform that delivers the same experience to the end user applications while keeping costs at parity or better compared to cloud native offerings. Silk has of course chosen the second path, which delivers maximum flexibility, total ease of mobility between zones, regions and cloud providers, while removing the dependence on proprietary hardware and keeping costs the same or lower than cloud native IaaS costs (this is done through rich data services and economies of shared scale, neither of which are possible with cloud-native IaaS).”

“Cloud native resources CAN be inefficient and expensive, which is what some businesses are experiencing, because the cloud is a shared nothing architecture, and the three primary resources of cloud (compute, network, and storage) have limits that are interdependent upon each other. To get enough of one resource, it is necessary to provision so much of another resource, even if you do not need it and don’t use it. This is called overprovisioning, and it is essentially wasted dollars. For certain applications and especially legacy architectures (not cloud-native applications using microservices or serverless computing paradigms), this waste and cost can be prohibitive. Silk reduces this interdependence by separating data capacity from data performance and allows for consolidation of resources onto a single platform for a many to one efficiency benefit. Cloud native is entirely one to one, which is inefficient. Silk can dramatically reduce overprovisioning and allow for businesses to truly bend the cost curve down as they expand their footprint; the environment gets more efficient, making business expansion more profitable.”
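A small worked example may help show why coupled limits lead to overprovisioning. The sketch assumes a hypothetical cloud volume type whose IOPS allowance grows with provisioned capacity; the 3 IOPS per GB ratio and the workload figures are illustrative assumptions, not numbers from the interview or from any specific provider's price list.

```python
def provisioned_gb(needed_gb, needed_iops, iops_per_gb):
    """Capacity that must be provisioned when IOPS are tied to volume size."""
    # Enough capacity for the data AND enough to unlock the required IOPS.
    return max(needed_gb, needed_iops / iops_per_gb)


def overprovisioned_gb(needed_gb, needed_iops, iops_per_gb):
    """Capacity paid for but not needed for data."""
    return provisioned_gb(needed_gb, needed_iops, iops_per_gb) - needed_gb


# Hypothetical workload: 2,000 GB of data that needs 30,000 IOPS,
# on a volume type that grants 3 IOPS per provisioned GB.
total = provisioned_gb(2_000, 30_000, 3)
wasted = overprovisioned_gb(2_000, 30_000, 3)
print(total, wasted, wasted / total)
```

Under these assumptions the workload must provision 10,000 GB to reach its IOPS target, leaving 8,000 GB (80%) unused. Decoupling performance from capacity, as described above, is what removes that forced purchase.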

What about the cost, wouldn’t it be more cost-effective to have it run on premises, in a private environment?

“The cost advantages of running in the cloud are around flexibility and agility – the ability to spin things up immediately and tear them back down when they are no longer needed. These are things you cannot do on-premise – along with the ability to move things to other locations rapidly, whereas on-premise this is much more difficult, time consuming, and costly. The advantage our customers have is that they can now leverage their Silk software license in both the public cloud and on-prem, as the license is portable, and just pay for the IaaS resources that are being deployed as needed.”

What about the security, how is it guaranteed?

“Silk provides AES 256-bit encryption at rest and has FIPS 140-2 level software security certification. Administrative access to the data pod configuration is through local account or secure LDAP directory authentication. Silk manages the keys internally for the entire cluster and provides for cryptographic erase capabilities.”

What impact will the following trends have on the adoption of Silk’s Cloud Data Platform?

  • 5G adoption? “5G will enable much more robust and richer applications to run in real-time/near real-time on edge devices. The requirements for localized hardware to provide the data for these applications will be significant, as the amount of application bandwidth available can be as much as 100x what was previously configured with 4G or WIFI5. Data locality will be extremely important in edge computing, as the time value of data decreases exponentially every second that passes. The need for extremely high-performance data platforms to empower these new applications at the edge is something that Silk plans on filling. As Silk is software defined and essentially just runs on a Linux OS, we can instantiate in a small footprint easily, and provide all the necessary services without complex or costly proprietary hardware.”
  • The use of AI applications? “We are already filling a need for HPC class applications that are leveraging AI. These applications require far more throughput and low latency than before, because the GPUs powering them are incredibly powerful. GPU technology has advanced at a phenomenal rate to the point where saturating multiple 100Gbps links is easily done, and the CPUs are used primarily just for scheduling and monitoring, instead of data processing.  To continually feed enough data into these systems is enormously challenging for cloud providers, as cloud native resources struggle with both the cost of overprovisioning and performance gaps resulting in GPU starvation. Silk is being used today to enable enough throughput at a low enough cost to make using these new GPUs cost effective. While object stores and data lakes are good enough to store the HPC data, when it comes time to actually mine the data for useful results, these new GPUs want something that can move 10GB/s sustained for a 5TB data set without suffering IO wait states. That’s very hard for public clouds to do today.”
  • IoT? “IoT footprints are more along the lines of big data, where the primary value is in large capacity, low-cost environments. Silk is designed for high performance and highly transactional data sets, so IoT isn’t a primary focus for us.”
  • Edge computing? “Same as 5G. WIFI6 and 5G require providers to push data analytics closer to the edge, where they have not been in the past. Tier 1 data platforms need to run smaller and closer to the customer than ever before, but with extremely high performance, and without proprietary hardware requirements as both footprint and low latency are a requirement for costs and customer experience.”
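The throughput figures quoted in the AI answer above can be put in perspective with some quick arithmetic. This sketch assumes decimal units (1 TB = 1,000 GB, 1 GB/s = 8 Gbps) and ignores protocol overhead; the function names are made up for illustration.

```python
def gbps_to_gb_per_s(gbps):
    """Convert a link speed in gigabits per second to gigabytes per second."""
    return gbps / 8


def scan_seconds(dataset_tb, throughput_gb_per_s):
    """Time to stream a whole data set once at a sustained throughput."""
    return dataset_tb * 1_000 / throughput_gb_per_s


# One 100 Gbps link carries at most 12.5 GB/s of raw bandwidth.
print(gbps_to_gb_per_s(100))
# Reading a 5 TB data set at a sustained 10 GB/s takes 500 seconds.
print(scan_seconds(5, 10))
```

In other words, a sustained 10 GB/s uses most of a single 100 Gbps link, and one full pass over a 5 TB data set takes a little over eight minutes at that rate.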

What role does Silk’s proprietary VisionOS software play in the overall proposition, and which features stand out?

“VisionOS is the backbone of the entire Silk Cloud Data Platform. VisionOS offers Tier 1 data services that make customers’ applications in the cloud run with the same level of performance as they would on-prem, with sub-millisecond latency. With real-time data reduction, thin provisioning, and zero-footprint clones, our customers are able to reduce how much capacity they are using – in turn, reducing their cloud costs and dramatically improving their performance.”

“The ability to symmetrically scale out in a non-disruptive fashion is also unique in that Silk can add or remove performance capacity on the fly and can move resources around from one data pod cluster to another within minutes. Silk’s orchestration enables the autonomous data center, with self-healing, self-provisioning and decommissioning, and significantly reduces the amount of management needed to operate the platform. Any server admin can learn the entirety of Silk management in 30 minutes. VisionOS is designed for simplicity, scalability, autonomy, and performance at low acquisition and maintenance costs. VisionOS is now an 8th generation OE.”

How does Silk’s technology stand out from the competition?

“Our traditional competitors on-prem are still largely offering the same products. Two of these only offer active/passive or dual-active style non-symmetric architectures that are scale-up only. They offer a federated move+manage layer that aggregates management and allows for some basic cloud mobility. The other offers two truly scalable symmetric active-active offerings but these are on-prem only and don’t run in the cloud at all.”

“The primary difference between Silk and these companies is in the scalability and software defined nature of our offerings that allow for us to provide the same feature/functionality in the cloud as we provide on-prem. Our competitors’ cloud offerings are dramatically reduced in feature/function and extremely limited in performance compared to us. One competitor has no cloud offering to speak of for Tier 1 block data storage.”

“We offer a true Tier 1 solution, while we view their solution as more of a dev/test non-production offering. The other companies are much further behind in every way for cloud-based technology.”

Silk is using a dedicated block store and different frontend. Why not unify your file and object offering? Can we expect changes in this area in the future?

“This is an ongoing request from customers that we solve by utilizing various 3rd party file system heads to front end our block store. Because of the amount of work and time required to build our own enterprise-class file head, we have focused on doing what we do best with our resources. Frankly, there are dozens of file head solutions out there from software defined to hardware proprietary, many of which allow you to put any block storage behind them.”

“We believe that creating a true enterprise-class file and object integrated offering is cost prohibitive at this point with so many other mature market offerings available – the market is heavily saturated. We don’t rule out a partnership at some point, but it is unlikely we’ll develop our own front end in the next 12-18 months.”

Some say Silk can grow at the expense of all-flash array vendors. Can you explain?

“Businesses are increasingly looking to move their mission-critical applications into the public clouds. The data services offerings of public clouds are today not robust or mature enough to enable this shift in a significant fashion, and growth for the Tier 1 space is not nearly what the cloud providers want it to be. While it is easy to move simple Tier 2 and Tier 3 applications to the cloud, where any architecture is almost always ‘good enough’, whether natively or by using containers, the proprietary Line of Business applications that require the power and cost benefit/ROI that on-prem solutions have provided for decades cannot be easily moved to the shared-nothing architecture of the public cloud.”

“Businesses do not want to pay massive premiums through overprovisioning, or have to completely refactor applications (into SOA, microservices, or serverless) to get them to work properly in the cloud, or have to give up 50% of their feature/functionality through some partially successful rewrite. It is a real problem.”

“This is where Silk comes in. Rather than buying new all-flash footprint on-premise to continue to run mission-critical applications, customers now can acquire Tier 1 all-flash shared footprint in the cloud with the same architecture and application enabling high performance, without refactoring – Silk enables a robust lift and shift. Some of the dollar spend for high performance data platforms will move from legacy on-prem all-flash vendors to new cloud data platforms as the ability of the public cloud to support mission-critical is now truly enabled in a simple fashion. Beyond this, clients gain the flexibility, agility, operational simplicity, and mobility of the cloud, which is largely what makes the cloud more attractive than on-premise operations. Businesses don’t WANT to spend more on legacy architecture on-prem, they want to move to OPEX models in the cloud. Silk helps enable that shift.”

From a personal point of view, do you miss the (fading) hardware part in your company’s proposition?

“Getting out of the hardware business has been a liberating experience for Silk. It has allowed us to take a fresh look at what our customers need, how they are managing their data, and how we can best serve them.”

“Moving the hardware supply chain, fulfillment, and break/fix functions to our partner, Tech Data, has freed up a lot of time and capital to focus on the software data services, which has been excellent for us. We will continue to evolve our software stack away from anything proprietary hardware-based as far as possible while still maintaining a Tier 1, 5 9’s level of enterprise performance and availability.”

To conclude, what news from Silk can we expect in the coming period?

“This year we went GA on all three public clouds. In the next 12 months look for us to be providing deeper automation and orchestration with cloud native APIs, and to further extend our support for Kubernetes, Terraform, Ansible, Powershell, and other workflow automation tools. The goal is to become as near as possible ‘push-button’ service catalog enabled with these different orchestration toolkits. Simplicity in delivering high value end-user outcomes is the goal for our solutions teams.”

About Silk and Derek Swanson

Silk’s Cloud Data Platform optimizes cloud infrastructure, allowing organizations to get ten times higher performance out of their existing cloud data while spending less. It aims to make cloud environments run smarter without changing a thing. With real-time data reduction, thin provisioning, and continuous resource optimization, Silk automatically matches cloud data spend to actual data needs at every moment.

Derek Swanson has 20 years of experience as a technology evangelist, systems architect, and data systems engineer. At Silk he manages the worldwide sales engineering organization and is the senior customer-facing technologist and product evangelist in the organization. Prior to Silk, Derek had a successful career at Sorenson, Code Communications, and Unisys. He holds a bachelor’s degree in Political Science and Government from Brigham Young University.

 
