During the summer of 2021, there was optimism that the worst supply chain shortages were behind us, and vendors' stockpiles of components and chips would be enough to work through the remaining issues. Instead, early October 2021 saw a worsening of component shortages. Some vendors saw component cancellations, and we are about to exit the year with 9+ month lead times. Many SPs were already moving towards disaggregated routing and the use of Fixed systems.
The current supply chain shortages will increase the rate of adoption as SPs mirror what hyperscalers are doing. The industry can benefit from a better model of Fixed systems.
Component Shortages Worsen Throughout 2021 – Vendors Work Through Supplies
The past several weeks have been turbulent for many system vendors. Everyone is burning through inventory faster than it can be replaced. In some cases, component suppliers cut deliveries to vendors, and secondary sources are drying up. Many distributors don't have components, and those that do tend to have higher pricing. This requires many vendors to change out parts and re-qualify systems based on what is available. Fewer SKUs and reduced complexity are a significant theme for supply chains in 4Q21.
Complex Modular Systems Have Too Many Components
Modular Router and Switch platforms have more components than newer Fixed alternatives. Historically, customers preferred these chassis because of higher redundancy (dual supervisors, redundant power supplies, etc.) that would guarantee almost 100% uptime. Some of the heritage also relates to application architectures that required this type of guaranteed service. One could make the analogy that the line card is equivalent to a Fixed 1RU box, and that the chassis and everything that goes with it are those additional components. However, networking is different now compared to 5/10/15 years ago, and many of these modular architectures no longer apply to current and future networking needs. These chassis, and their complexity, are now a liability amid so many component shortages.
Service Provider (SP) and Telco Networks Need Disaggregation in Today's Supply Chain World
Disaggregated Routing leverages the hyperscaler supply chain of smaller and easier-to-build Fixed platforms. It can step up and address the availability challenges of today and provide a cleaner and better model as we advance. Hyperscalers figured this out and are currently benefiting from disaggregated architectures during this supply shortage. Without moving to more fundamental building blocks, Service Providers and Telcos will not keep pace with hyperscalers or absorb supply chain shocks as robustly.
White Box Opportunity
White Box architectures in Routing, a 1-2 RU ‘pizza box’ based on Network Processor Units (NPUs) that can support multiple networking use cases like core, edge, or aggregation routing, are becoming enticing to many customers. While shifting from modular to fixed at a vendor level eliminates some complexity, the customer still depends on the vendor's supply chain. An SP dual-vendor strategy might end up reliant on the same manufacturing source or bottleneck. The white box is different. There are nearly a dozen white box vendors shipping today across multiple regions. While there are still some dependencies (the Jericho ASIC), the diversification is significant and less risky.
In addition, the same white boxes can be combined or used standalone, and support multiple networking use cases – core, edge, aggregation, etc. Stocking a few building blocks and using them across multiple needs provides tremendous flexibility in dealing with supply chain shortages.
2022 and Beyond
We expect an increase in Fixed Disaggregated Chassis announcements at both the customer and vendor level through the end of 2021 and into 2022. We believe this will help set up the next upgrade cycle to 800 Gbps and 1.6 Tbps: as customers learn and become comfortable with this new topology through the 400 Gbps solutions currently shipping, they will begin demanding more from vendors in future platform updates.
By Alan Weckel, Founder and Technology Analyst at 650 Group.
This blog post is the second and final part of an article about an interview we had with Dritan Bitincka, co-founder of Cribl, an Observability Infrastructure vendor. The interview proceeded on three tracks: (a) Mr. Bitincka’s Journey to Cribl, (b) Industry Changes, and (c) The Future of Observability and Cribl. This part of the article captures the last two tracks.
Industry Changes. We shifted gears to the topic of what changes he has seen in the industry. Dritan said that, top of mind, he is amazed at how fast observability tooling and solutions are being adopted. He explained that part of the reason is that now that most of them are cloud-based, organizations can validate the value of the solutions very quickly. Before the cloud was so popular, he said, it took months, with salespeople running bake-offs of solutions and customizing each premises-based environment. Now, with cloud-based, standardized systems, it takes days or weeks, allowing decisions to be made right away. The basic message is that with cloud-based systems, customers can now self-serve and self-evaluate. The second message was that the volume of data stored in observability systems is growing about 30% annually. But if storage were free and infinite, his customers tell him, they would store 5-10x more. This desire to store more is a big part of what Cribl is betting on in pursuing the Observability Infrastructure market. Third, Dritan sees many observability tools companies getting funded – the space is hot, he says (we agree, and AI applications will only push this further).
I asked a bit more about the cloud, and we discussed the cloud version of LogStream, which went to general availability in October. In this model, Cribl offers LogStream as a service for its customers. Mr. Bitincka said the interest in this service has been phenomenal and that there are a ton of proof-of-concept trials underway. There is particular interest from organizations that don’t have operations teams.
The Future of Observability and Cribl. Then we shifted to where the observability market is headed over the next 3-5 years. Dritan said that his customers want to instrument as much data as possible, including directly from applications as they run. He explained that Cribl offers AppScope as open source, allowing users to collect performance data from all sorts of applications. He said that only roughly 20% of applications at organizations are fully instrumented using Application Performance Management (APM), because most programs are closed-source and difficult to instrument. His view is that tooling will emerge for applications that goes well beyond simple agents residing on devices throughout the computing, networking, and security systems in use. Peering into applications this deeply will cause a deluge of data, which customers can currently use LogStream to handle, control, and route. Dritan calls the location where all this new data will reside an Observability Lake. Once all this data is saved to long-term storage, he says, all the teams in the organization can self-service access to it, potentially forever. A significant advantage of such an approach is that, in contrast to a database system, where you must know ahead of time how to structure the data, with an Observability Lake the teams can investigate incidents and data that were not expected and play these events back against recent or very old data. I was interested to learn how AppScope works. Dritan explained that AppScope is a black-box instrumentation technology that sits between the operating system and the application. It sees all the interactions between the application and the filesystem, the CPU, the network, etc. It captures all the metadata associated with this traffic and forwards it downstream.
And, he explained that it doesn’t matter what language the program is written in, whether it is Ruby, C, Java, etc., because AppScope is just intercepting syscalls. I got the sense from Mr. Bitincka that Cribl is betting that AppScope will challenge the agent-based approach that is so common these days.
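AppScope itself is native, black-box instrumentation loaded between the application and the operating system; the sketch below is only a loose conceptual analogy in Python, not Cribl's implementation. It interposes on a call boundary (here, the built-in `open`) so that metadata is captured while the "application" code runs unchanged; the `events` list and `instrumented_open` wrapper are illustrative names, and a real interposer like AppScope works at the syscall level and forwards metadata downstream rather than into a list.

```python
import builtins
import os
import time

events = []  # stand-in for the downstream metadata pipeline

_real_open = builtins.open

def instrumented_open(file, mode="r", *args, **kwargs):
    """Wrap open(): record metadata about each call, then delegate."""
    start = time.monotonic()
    f = _real_open(file, mode, *args, **kwargs)
    events.append({
        "op": "open",
        "path": str(file),
        "mode": mode,
        "latency_s": time.monotonic() - start,
    })
    return f

builtins.open = instrumented_open  # interpose; application code is untouched

# "Application" code, unaware it is being observed:
with open(os.devnull, "w") as fh:
    fh.write("ping")

builtins.open = _real_open  # restore the original call boundary

print(events[0]["op"], events[0]["path"])
```

The key property the sketch shares with syscall interposition is that the observed code needs no modification or recompilation: the wrapper sees every call, regardless of what language or framework sits above it.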
I challenged Mr. Bitincka and asked about Cribl’s plans for tackling new observability infrastructure challenges. What I learned was that the company plans to enhance both LogStream and AppScope further. For LogStream, there are 60 integrations with other systems – the number will grow, and there will be more protocol support and more device support, even including IoT systems. The company’s goal, according to Dritan, is to help customers with the “data generation” phase – how to unearth more data. Customers agree, with Cribl getting many more inbound requests from prospects as they drive towards pervasive and ubiquitous instrumentation. So, the company’s next focus is to generate data at large scale and in a standard way. He said the company will double down on AppScope and thereby develop a universal edge collection system. The idea behind this edge collection system is that it would remove the headaches customers get from collecting, processing, and managing observability data. Our take on this strategy is that Cribl will start competing with some more traditional observability vendors who have developed their own agents that reside on computing systems. But if Dritan is right that his customers use as many as a dozen tools, potentially all with different agents, then Cribl’s single-collection strategy could prove valuable to customers. This new data collection capability would allow for simplified data collection and consolidated Observability Lake storage, thereby allowing customers to use all the analytical tools they want.
Nokia’s Global Analyst Forum this week highlighted two main trends. First, the company says it has caught up to rivals in its 5G radio development. Furthermore, the company expects its wireless systems to become increasingly technologically differentiated from competitors. Second, the company emphasized its message that it is the “green partner of choice”; we read this to mean that the company is making more power-efficient communications equipment. Apart from its significant themes meant for headlines, the company also highlighted that: (a) it is experiencing strong private wireless growth, (b) its RAN systems are in the pilot phase with hyperscalers like AWS, GCP, and Azure, (c) it is embracing Open RAN faster than other established competitors, (d) it expects the Remote Radio Unit (RRU) to take an increasing fraction of total RAN spending, (e) it sees the RIC as a market expansion, and (f) it expects to differentiate in radio in 2022 with its growing Carrier Aggregation capabilities.
Nokia, which has significant revenue exposure to Mobile RAN, is in an interesting phase of its corporate development. Having recently brought on a new CEO, Pekka Lundmark, it abandoned its end-to-end product portfolio strategy. Yet, in recent times, the company’s non-radio portfolio has outperformed radio access network growth trends, which reinforces the idea that its broad portfolio serves it well. One of the company’s primary messages from the conference was that its RAN portfolio has caught up to competitors and that next year it will deliver significant improvements, including Carrier Aggregation and a broader portfolio of Massive MIMO systems. The company also said that it is working with a broad set of infrastructure providers and infrastructure software companies that will be able to support its RAN and core portfolio; examples include Anthos, Kubernetes, VMWare Tanzu, Amazon EKS, and OpenShift, operating on AWS, Azure, Google Cloud, or premises-based infrastructure. Nokia is investing in broadening the appeal of its RAN and core systems both by embracing these various non-Nokia systems and by supporting Open RAN. The company says it expects an increasing amount of value to accrue to the RRU and away from baseband, which we see as consistent with its support of so many different infrastructure systems that would run baseband. The company sees revenue upside in the RIC market, part of the Open RAN architecture. The company’s support of Open RAN will lead to the commercialization of Open RAN systems in about two years, according to Nokia.
Furthermore, the company’s telecom core business is experiencing an acceleration in business trends. As with the RAN architecture’s support for various cloud systems, Nokia is even further along in offering such support for its core systems, like the 5G Core. Management made two comments during its discussions that did an excellent job of explaining how far along the core market is in moving towards hyperscaler-based infrastructure. First, Nokia said that “50% of RFQs include an option to run on top of the Hyperscaler.” Second, Nokia explained that of its 82 engagements, 20 have serious public cloud investigations and dialogue underway.
We are also encouraged by the company’s leadership in Fixed Wireless Access (FWA) and 25G PON. In 5G FWA, the company has significant antenna and software algorithm capabilities, and we expect new, cutting-edge products in 2022. In 3GPP 5G FWA, the company holds a significant revenue market share lead as of 3Q21, illustrating its robust capabilities. The company made a bet on 25G PON and was a significant contributor to an MSA group called 25GS-PON. Additionally, Nokia developed its own semiconductors, called Quillion, to support 25G PON (backward compatible with lower 10G and 1G speeds).
Recently, we had the opportunity to speak with Dritan Bitincka, co-founder of Cribl, an Observability Infrastructure vendor. All three of Cribl’s co-founders were employees at Splunk, a leading observability vendor. It was exciting to hear how Dritan’s experience at Splunk led him and his co-founders to seek a new place in the value chain of the observability market. We also discussed the industry’s future. The interview proceeded on three tracks: (a) Mr. Bitincka’s Journey to Cribl, (b) Industry Changes, and (c) The Future of Observability and Cribl. A week after this post, we will publish the second part, covering Industry Changes and The Future of Observability.
Dritan Bitincka’s Journey to Cribl. Dritan is the VP of products, and I noticed that he was very active in posting blog articles about the company’s first product, LogStream, in 2018 and 2019. By the time 2020 came along, Dritan’s posts were occasional (my favorite is here because it explains how simple it is to connect LogStream to Azure Sentinel), and he’s only posted once in 2021. My takeaway, which Dritan confirmed, was that he has been very busy expanding his team and building new products, the typical next phase of a startup in growth mode.
I asked Mr. Bitincka what it was like moving from Splunk to becoming a co-founder at Cribl and found his response both interesting and informative. First, Dritan said that when he was at Splunk, he saw the observability market through the lens of Splunk only. What became clear is that many Splunk customers were using as many as a dozen other tools besides Splunk to perform observability. This insight, that there are dozens of other tools out there, is part of what drove the current products at Cribl, because Cribl’s LogStream product integrates with many “sources” and many “destinations,” with Splunk being just one of them. His second response was that Amazon S3 is one of the biggest of the dozen data destinations that customers use and that customers are building analytical solutions on top of S3. He is increasingly seeing his customers adopt S3 instead of local storage. He explained that in the past, organizations would place their data into Splunk or Elasticsearch or other analytics solutions, keep it there for 90 days, and then send it to archive. These systems tend to be costly, explains Mr. Bitincka, and hence the data must be sent to archive. But sending the data to an archive means those organizations cannot use analytical tools on it. S3, though not as responsive as local storage, which operates down in the millisecond range, is now quick enough, explains Dritan. He explained that since S3 is far more affordable, the economics favor using S3 for both current and old data and then making the data accessible for periods much greater than 90 days. And third, I asked Dritan to elaborate on the trends between local storage and S3 (or other cloud object storage), and what I learned was that the Cribl team is getting a lot more customer requests for S3, or object storage in general.
More specifically, customers are “reading” the S3 data, which means they use it to do functions such as “data replay.” Customer requests for object-based storage and for more object reading activity give Mr. Bitincka confidence that his customers will deploy in the cloud.
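To make "data replay" from object storage concrete, here is a minimal, hypothetical sketch. With real S3 you would list and fetch objects (e.g., via boto3's `list_objects_v2` and `get_object`); a plain dict stands in for the bucket here so the example is self-contained, and the event format and `replay` function are invented for illustration, not Cribl's API.

```python
import json
from datetime import datetime, timezone

# A dict standing in for an S3 bucket: object key -> JSON-lines content.
bucket = {
    "logs/2021-10-01.jsonl": '{"ts": "2021-10-01T12:00:00Z", "msg": "login ok"}\n'
                             '{"ts": "2021-10-01T12:05:00Z", "msg": "disk full"}',
    "logs/2021-10-02.jsonl": '{"ts": "2021-10-02T09:00:00Z", "msg": "login fail"}',
}

def replay(bucket, start, end, destination):
    """Re-read archived events and push those inside [start, end) downstream."""
    for key in sorted(bucket):
        for line in bucket[key].splitlines():
            event = json.loads(line)
            ts = datetime.fromisoformat(event["ts"].replace("Z", "+00:00"))
            if start <= ts < end:
                destination.append(event)  # stand-in for an analytics tool

replayed = []
replay(
    bucket,
    start=datetime(2021, 10, 1, tzinfo=timezone.utc),
    end=datetime(2021, 10, 2, tzinfo=timezone.utc),
    destination=replayed,
)
print(len(replayed))  # → 2 (only the Oct 1 events fall in the window)
```

The point of the pattern is that archived events stay queryable: any time window can be re-read from cheap storage and pushed back into an analytics destination long after a typical 90-day retention window has passed.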
Mr. Bitincka’s background is in deploying multi-terabyte distributed systems, so I asked him to explain the challenges in deploying these kinds of large-scale systems. I enjoyed this discussion because it shows that the soaring growth of the observability industry has caused new headaches. Dritan explained that he had deployed Splunk at somewhere around 150 customers in his years there. During that time, he learned that it became increasingly difficult to manage the systems effectively as they got larger, and often customers would need external tools like Chef or Puppet to handle configurations. The problem is that scaling these systems to higher capacities using these third-party tools became a large drain on administrative and development operations professionals. So, in Cribl’s system, these version-controlled deployment and configuration authoring capabilities are built in, making it easier for customers to deploy, maintain, and increase capacity. Additionally, he said Cribl’s products have built-in health monitoring and native cloud tooling that deal with user and machine roles.
Mavenir held its annual analyst event this week and shared important information highlighting its progress in transitioning to a maturing ecosystem player in the telecom equipment industry. The company highlighted its recent $500M investment from the Koch Brothers; existing investors include Intel/Nvidia and Siris Capital, who remain majority equity holders. The company highlighted that it grew revenues and bookings in the mid-20s percent year-over-year in its Fiscal 2020, an impressive figure. Two main themes came from the show. First, the company’s RAN portfolio is picking up steam. Second, the company’s portfolio now spans very wide, from telecom core to RAN.
The RAN portfolio has made significant progress. The company claims over 20 deployments in 14 countries. And Mavenir has demonstrated the capability to deploy on AWS, IBM Cloud, Microsoft Azure, Oracle Cloud, Google Cloud, and VMWare. The company spent a great deal of time reviewing definitions of various Open RAN terminology, to address confusion, spanning vRAN, O-RAN, C-RAN, Cloud-RAN, and Open vRAN. We’ve seen many public statements from Mavenir, its competitors, operators, and pundits alike espousing the various benefits of some or all of these systems. We think the point Mavenir was making at its conference is that Open vRAN is the most open, interoperable system. When operators enable open systems, of course, it allows Mavenir and other vendors to bid on deals for networks that have existing equipment from traditional vendors like Ericsson, Nokia, Huawei, and ZTE. We see Mavenir’s efforts to work with various infrastructure companies and systems like AWS and VMWare as a means of gaining a foothold with operators who are trialing or in the early stages of deploying these various infrastructure systems. Speaking of partners, the company claims relationships with nearly 15 Remote Radio Unit (RRU) players. The company says it can deliver Massive MIMO capabilities to customers, which means its RAN systems can satisfy what would be considered mainstream 5G use cases; this represents very significant progress over last year’s RAN capabilities.
Mavenir’s portfolio is extensive. Over three days, the company made separate presentations on the following topics: RAN, OSS, Radio, Packet Core, Mobile Core, BSS/Digital Enablement, Security, Private Networks, and Enterprise. With over 5,000 employees spanning the globe and exposure to the most relevant parts of the mobile infrastructure industry, Mavenir is a serious contender for deals. The company also highlighted that its telecom core technology uses modern programming techniques that enable it to operate on cloud infrastructure, among them a fully containerized microservices design. The company shared that most microservice file sizes are under 25 MB, evidence that the systems are genuinely designed as microservices (and can load quickly).
The fact that in April 2021 the well-known Koch Brothers made a $500M “strategic minority” equity investment in the company is an important validation of Mavenir’s place in the telecommunications industry. We see the investment as a reinforcement of the company’s balance sheet and an opening to new customers.
200 Gbps Active Electrical Cable (AEC) to Play Key Role in Current and Next Generation Server Connectivity
Server bandwidth continues to explode. Led by a confluence of next-generation processors, smaller process geometries, better software, and the push toward AI, the market is seeing a rapid transition to 50 Gbps, 100 Gbps, and even 200 Gbps ports on each server. AI workloads usher in enormous data sets, and additional accelerators in the server (SmartNICs, DPUs, FPGAs, etc.) continue to push AI traffic bandwidth to outpace the overall network growth of the past five years. Our workload projection has AI-based workloads driving nearly 100% bandwidth growth on the server through 2025, compared to 30-50% for traditional workloads.
With such large amounts of data moving into the server, reliability and power become more important to operators. Power budgets are not increasing as fast, and DACs can only achieve so much speed. AECs proved themselves at multiple cloud providers and hyperscalers earlier in 2021, and early 2022 designs indicate additional design wins for AECs. AECs offer high reliability, on par with DACs and far higher than traditional optics. At the same time, AECs have lower power consumption than fiber and can stay comfortably within most operators' power budgets.
Additionally, AECs are specializing. For example, Credo’s Y-Split SWITCH AEC is designed to support failover, the CLOS AEC supports front-panel connectivity in distributed disaggregated chassis, and the SHIFT AEC can shift between different SERDES signaling modes (56G PAM4 and 28G NRZ), which allows operators to qualify fewer variations of the cable, future-proofing some connections.
As we look around the OCP Summit floor today, virtually and in person, we see many demos of 200 Gbps. In particular, we spotted implementations at two locations using the 200G SWITCH AECs that Credo announced earlier today: one with NVIDIA and Wiwynn, and the other with Arrcus and UfiSpace.
The AEC cable will play a more significant role in server connectivity in 2022 and in the overall 56G SERDES connectivity generation to enable new and more powerful applications (AI/ML and existing cloud workloads) to take advantage of higher-speed links.
By Alan Weckel, Founder and Technology Analyst at 650 Group.