2020/11/19 Update: This is part 1 of a series of posts. Part 2 is available now as well.
TL;DR Summary
AvidThink (with assistance from technology experts) ran a basic real-world test suite on AWS Wavelength on Verizon’s network. While the tests achieved impressive throughput (more than 3Gbps download on UDP and comfortably more than 1Gbps on TCP), we did not discern any significant latency or throughput differences between a server in the SF Bay Area AWS Wavelength zone and a server in a nearby Amazon EC2 region.
Whether this was due to the test’s simplicity or to conditions specific to the SF Wavelength zone when we ran the test is unclear. Perhaps because the SF Bay Area already has a well-connected EC2 region close by, the performance difference between that region and the Wavelength site is minimal. In other geographic regions without an EC2 region in proximity, the Wavelength sites could demonstrate superior performance. Nevertheless, conversations with AWS and Verizon partners who are currently running pilots on Wavelength indicate that they see a significant reduction in latency and improvements in the quality of customer experience.
Regardless, we were impressed by the ease of extending EC2 workloads into Wavelength zones and believe the uniform EC2 interfaces and familiar services will serve developers well as they build edge-enabled applications. AvidThink aims to expand its test efforts to better characterize telco edge sites’ performance as more of them come online.
Introduction – the Rise of the Mobile Edge
5G and edge are two of the hottest trends today, spanning the worlds of both telco and cloud. The intersection where 5G and edge, telco and cloud, meet has attracted much interest from the entire ecosystem. Whether mobile network operators (MNOs), wireline operators, hyperscalers, network equipment providers (NEPs), or system integrators (SIs), everyone wants in on this land grab. A simple scan of news headlines shows many industry giants jumping in early to establish thought leadership and foster an ecosystem while formulating workable and scalable business models:
- Cloud players: Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), Baidu, Alibaba, Tencent
- Communication service providers (CSPs): AT&T, China Mobile, Telefonica, SK Telecom, Verizon, Vodafone
- NEPs and infrastructure software vendors: Ericsson, IBM/Red Hat, HPE, Nokia, VMware
It’s not clear whether this land grab will yield benefits in the future. Metaphorically: are there oil or mineral reserves hidden under that land? Is the land fertile? Is the location strategic? Ultimately, is there gold in ’em hills? Regardless, the fear of missing out (FOMO) has driven large early investments. Will the early entrants win? Or will fast followers grab the market after early entrants do the hard work but run out of funds and steam?
Regardless of how this plays out, this is an area of interest for many. There are many topics of exploration for the edge. For this post, we’ll discuss our initial findings from a recent series of experiments that we’ve run on AWS Wavelength — a joint edge offering from Verizon and AWS.
What is AWS Wavelength?
To begin, here’s a view of AWS Wavelength from Amazon’s site:
[Figure: view of AWS Wavelength. Source: Amazon Web Services]
AWS Wavelength consists of multiple racks of AWS Outposts running at an MNO’s cell aggregation sites: mobile switching offices (MSOs) or mobile switching centers (MSCs). By sitting right next to the packet gateways (PGWs) in 4G or 5G non-standalone networks, or just past the UPF (on the N6 interface) in 5G standalone networks, they offer substantially lower latency to mobile devices (user equipment or UEs) than any external site on the Internet. Each of these Wavelength Zones is paired with a neighboring AWS region (as seen above) over a private high-speed link to allow fast data transfers from the site to the region (incurring no extra EC2 charges since these are in-region data transfers). EC2 servers in the Wavelength zone and the associated parent region can all sit within a single virtual private cloud (VPC), allowing the Wavelength instances to communicate directly with other collaborating servers in the region.
Since space is a constraint in these Wavelength Zones, AWS offers a subset of EC2 instance types. The list currently includes t3.medium, t3.xlarge, r5.2xlarge for general workloads, and g4dn.2xlarge for heavy GPU-usage workloads. These are all EBS-based instances, with snapshots and AMIs hosted in the paired region (aka the parent region). There are only on-demand instances; AWS does not yet offer reserved instances on Wavelength.
AWS also offers a non-carrier flavor of edge sites called Local Zones. Amazon has situated these sites near metro areas, outside of carrier facilities. The only Local Zone available today is in Los Angeles, focused on the media and entertainment market in that region. Local Zones have a wider variety of supported instance types than Wavelength Zones and differ in other ways, but we’ll leave that discussion for another day.
Benefits of the Edge and AWS Wavelength
We realize that the edge brings many benefits, including reduced latency, a better quality of network service, lower backhaul bandwidth costs, improved resiliency and autonomy, and compliance with data jurisdiction mandates. One of the key promises of mobile edge offerings is improved latency and throughput, and lower jitter. For carriers, the mobile edge is essential in supporting their ability to offer differentiated network services, especially end-to-end quality-of-service (QoS) and network slicing.
AWS and Verizon have taken a bold step by launching a generally available (GA) solution, AWS Wavelength, in August of this year. There are currently five sites: the original two in the San Francisco Bay Area and Boston, plus New York, Washington DC, and Atlanta. Five more sites are promised by year’s end. The dynamic duo of AWS and Verizon has brought on application partners for the launch. Many innovative young companies are looking to leverage AWS Wavelength to provide improved application performance and experiences. AWS has also announced agreements with KDDI, SK Telecom, and Vodafone, and we expect to see some of those global launches by the end of 2020.
In AvidThink’s conversations with AWS partners, they’ve reported improved performance for their applications: impressive throughput (on Verizon’s mmWave offering, aka 5G Ultra Wideband), and lower latencies experienced as reduced start-up times for applications and streaming media plays.
Real-World Experience with AWS Wavelength
We wanted to experience this incredible edge technology directly, so we applied for access to AWS Wavelength on our AWS account by submitting a simple AWS Wavelength request form. We procured a 5G mmWave-capable phone, the Samsung S20 5G UW, which can access Verizon’s mmWave offerings (bands n260 and n261). Qualcomm’s Snapdragon X55 modem-RF system powers Samsung’s S20 family, enabling 5G in both the sub-6 GHz and the mmWave bands.
Our AWS setup is as follows:
- A t3.medium instance in the Oregon parent region (don’t ask us why the San Francisco Wavelength Zone is paired with Oregon and not Northern California; it’s the same reason the Local Zone in Los Angeles is parented by Oregon)
- A t3.medium instance inside the San Francisco Bay Area Wavelength zone
- A t3.medium instance in the Northern California region (used for comparison)
To make management easier, we set up a VPC that included the Oregon parent server and the Wavelength zone server, so that we could console into the Wavelength zone server from the parent region. We learned the hard way that carrier IPs on Wavelength zones were only accessible from the carrier network (yep, should have RTFM). Note that subsequent ad-hoc testing suggests that Verizon or AWS has since rectified this.
Simple AWS Wavelength Test Methodology
We set out to run a suite of simple tests. For our real-world experiment, we relied on trusty ICMP-based ping for round-trip times (RTT) and iperf3 for TCP and UDP throughput tests. Since we were looking at the differential performance between the phone + Wavelength and phone + EC2 region, we felt the simple tools sufficed for now.
Our automated test suite performed the following tasks:
- A series of pings (50 counts) against each of the three servers
- TCP upload and download tests in iperf3 against each of the three servers (TCP 1 stream upload and TCP 1, 3, 5 concurrent stream downloads)
- UDP upload and download tests in iperf3 against each of the three servers (UDP upload of 10Mbps, 30Mbps, 100Mbps; download of 10Mbps, 30Mbps, 100Mbps, 200Mbps, 500Mbps, 1Gbps, 2Gbps). We added 3Gbps and 4Gbps download tests when we realized 5G UW had more download capacity than our original scripts anticipated.
The suite was executed via a script running in a Termux (Android app) shell instance on the phone with all other apps shut off.
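As a rough illustration of the test matrix described above, here is a minimal Python sketch that only builds the ping and iperf3 command lines for each server. It is not the script we ran on the phone. The hostnames are hypothetical placeholders, and the flag choices (`-R` for download, `-P` for parallel streams, `-u` and `-b` for UDP at a target bitrate, `-J` for JSON output) are our assumption about one reasonable way to express the list with standard iperf3 options.

```python
# Hypothetical placeholder hostnames; the real server addresses are not published here.
SERVERS = {
    "wavelength": "wavelength.example.net",
    "norcal": "norcal.example.net",
    "oregon": "oregon.example.net",
}

TCP_DOWNLOAD_STREAMS = [1, 3, 5]
UDP_UPLOAD_RATES = ["10M", "30M", "100M"]
UDP_DOWNLOAD_RATES = ["10M", "30M", "100M", "200M", "500M", "1G", "2G", "3G", "4G"]


def build_commands(host):
    """Return the ping and iperf3 command lines (as argument lists) for one server."""
    cmds = [["ping", "-c", "50", host]]
    # TCP upload with a single stream: the iperf3 client sends by default.
    cmds.append(["iperf3", "-c", host, "-P", "1", "-J"])
    # TCP download: -R reverses the direction so the server sends to the phone.
    for streams in TCP_DOWNLOAD_STREAMS:
        cmds.append(["iperf3", "-c", host, "-R", "-P", str(streams), "-J"])
    # UDP upload at each target bitrate.
    for rate in UDP_UPLOAD_RATES:
        cmds.append(["iperf3", "-c", host, "-u", "-b", rate, "-J"])
    # UDP download at each target bitrate.
    for rate in UDP_DOWNLOAD_RATES:
        cmds.append(["iperf3", "-c", host, "-u", "-b", rate, "-R", "-J"])
    return cmds


def build_suite(servers=SERVERS):
    """Map each server label to its list of commands."""
    return {name: build_commands(host) for name, host in servers.items()}


if __name__ == "__main__":
    for name, cmds in build_suite().items():
        for cmd in cmds:
            print(name, " ".join(cmd))
```

Generating the commands as data first makes it easy to review the matrix before running anything; each argument list could then be handed to `subprocess.run` on a device where ping and iperf3 are installed.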
We first executed the test over a WiFi network tied to a wireline Internet link to get an idea of wired ping times to the two EC2 region instances. We then ran the benchmark on the Verizon 4G LTE network, the Verizon 5G (non-mmWave) network, and the Verizon 5G UW network.
Where’s Wally or where’s UW?
Verizon has a few 5G UW zones in the San Jose area (where we ran our first series of tests). But as many tech reviewers have attested, it’s hard to lock onto a UW band radio. And once locked, you don’t want to move: tree foliage can cause the tiny UW icon on the phone to disappear, let alone people walking past and blocking the signal. We got many strange looks while we tried to hold our test phone steady in line of sight of the mmWave radios while the tests ran.
Analysis of the Results with AWS Wavelength on Verizon and Our Learnings
We’ll start by stating that the download speeds over 5G mmWave blew us away! We were able to achieve over 3Gbps one early morning at a Verizon UW cell site. That was an impressive achievement on Verizon’s part.
However, we could not see any significant reduction in round-trip latency by using the San Francisco Bay Area Wavelength zone, compared to the AWS NorCal region (US-West-1). Both RTTs were lower than RTTs to the Oregon region, but the difference between ping times to the closest region and to Wavelength was negligible. The results were the same for both 4G and 5G UW. Further, there were limited throughput differences between running traffic between the phone and the EC2 regions and between the phone and the Wavelength zone. In both the 4G LTE and 5G tests, the Wavelength zone did show higher TCP upload rates, but there was no appreciable difference in download rates.
We noticed large variability in ping times from the phone to the EC2 servers across the regions and the Wavelength zone. By contrast, ping times between the servers themselves (including the Wavelength zone server) were consistent, which points to the carrier network as the likely primary source of that variability.
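One simple way to put a number on that kind of variability is to compare the spread of RTT samples relative to their mean. The sketch below is illustrative only: the function name is ours, and the sample values are made up for the example and are not measurements from our test runs.

```python
import statistics


def summarize_rtts(samples_ms):
    """Summarize a list of RTT samples (in milliseconds)."""
    mean = statistics.mean(samples_ms)
    stdev = statistics.stdev(samples_ms)  # sample standard deviation
    return {
        "min": min(samples_ms),
        "avg": mean,
        "max": max(samples_ms),
        "stdev": stdev,
        # Coefficient of variation: spread relative to the mean, unitless.
        "cv": stdev / mean,
    }


# Made-up illustrative samples, NOT results from our tests.
phone_to_server = [38.0, 95.0, 52.0, 110.0, 41.0]
server_to_server = [20.0, 21.0, 20.0, 21.0, 20.0]

if __name__ == "__main__":
    print("phone to server: ", summarize_rtts(phone_to_server))
    print("server to server:", summarize_rtts(server_to_server))
```

A much higher coefficient of variation on the phone-to-server path than on the server-to-server path is the pattern that would suggest the mobile network, rather than the cloud side, is contributing most of the jitter.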
Based on prior conversations with AWS Wavelength partners, we expected to see a reduction in latency and a marked increase in throughput across the board. Perhaps due to the simplistic nature of our tests, we didn’t experience those benefits. It’s also possible that some of these AWS and Verizon partners were testing at the Chicago 5G Edge pilot site, which is not a public Wavelength site today, and may have a different setup than the production network in the SF Bay Area.
Here are the tables that summarize our findings across the test runs. We’ve primarily focused on the 5G UW results here but also provide ping test results from when the phone was in 4G LTE mode for comparison.
Table 1. 5G UW Ping Tests
Table 2. 4G LTE Ping Tests
Table 3. 5G UW Throughput Tests – TCP
Table 4. 5G UW Throughput Tests – UDP
More detailed statistics are available at the end of this post. Feel free to check the results out. We did further testing beyond our initial data capture but were still unable to observe reduced latency or higher throughput when compared to using the closest EC2 region (NorCal in this case, versus the Oregon parent region). We also repeated the tests on sub-6GHz 5G on Verizon’s network, but the performance was in line with the 4G LTE results. Again, there were no discernible differences in latency or throughput.
Potential Sources of Errors
Given the simple test setup we used, multiple sources of variability and errors could have affected the findings. We tried to be diligent with possible errors, even in our simple tests. For instance, when we didn’t see lower RTT times as expected, we triple-verified that our Wavelength server was, in fact, in the Wavelength zone and had a Carrier IP assigned.
Regardless, there are possible sources of errors in our test, including:
- UE (phone) application and OS impact: while we closed out all other applications on the phone, there are still system services and potential background tasks that could affect the measurement. It’s possible, though unlikely, that this skewed the results for one specific server target and not the others; we did several test runs, and they returned consistent results (at least at one particular time of day).
- Use of ICMP echo: we realize that ICMP pings do not necessarily indicate the total latency that an application might experience, given that the response comes from the network stack in the OS and not a user-space application. Nevertheless, since we’re comparing just the relative changes between server targets, it’s a valid relative comparison. Further, AWS and Verizon, during their launch, indicated in their technical Q&A video that ICMP pings were likely a decent interim approach to measure latency, at least until they build a better tool/API for application developers. Some technical experts have suggested using different instance types and measuring application-level RTT for more accurate characterization. We’ll look into both suggestions for future tests; however, we stand behind our current simple methodology as a relative comparison measure for RTT and throughput.
- Small sample size: we realize that for the results to have any statistical significance, we would need to run extended testing under a broader set of conditions across more locations over a longer period (days/weeks). We designed our test to reflect a simple single-user experience of the 5G mmWave and AWS Wavelength combination at a single point in time. As resources allow, we’ll be expanding to collect more test data.
Observations and Hypotheses
Based on our direct experience with 5G mmWave and AWS Wavelength, we’d like to make the following observations and a couple of hypotheses:
- 5G mmWave is impressive but finicky. Having experienced the technology, we’re looking forward to the day when mmWave coverage provides us gigabit services in our favorite spots: coffee shops, transit locations, the office as part of a private 5G network, and even at home. Note that the >3Gbps performance was for blasting UDP packets at high speed from server to UE, without regard for loss. The TCP performance results will be more indicative of what real-world applications will experience. The numbers there are still impressive, with most >1Gbps. However, we think that mid-band 5G, with a couple of hundred Mbps that’s more reliable and has wider coverage, might be the middle path for now.
- Differentiating network latency vs. throughput. While the results were inconclusive as to whether AWS Wavelength reduces RTT, our testing indicates that faster application performance can be attributed to lower latency when it is really due to increased throughput. For example, AWS and Verizon’s partners indicated faster streaming media start times and improved application performance. What they experienced could be primarily due to running on Verizon UW with Gbps throughput that transfers data faster, as opposed to a lower RTT.
- Limited information on the underlying network. As for the Verizon 5G UW network, it’s unclear from our testing whether we were dealing with a 5G SA core (5GC) or a 5G NSA setup with a 4G EPC that has higher latency. Likewise, it wasn’t clear to us how many network hops separated the UE (phone) from the actual AWS Wavelength servers (though the tracepath output we’ve provided below shows quite a few, we’re not sure how accurate that may be).
- The edge should not just be about latency. While our primary goal in this test did focus on latency (and throughput), the edge provides more benefits than latency. If you want to learn more about our views on the edge, check out AvidThink’s recent 2020 Edge and Beyond report. We’ll have more to say about this in future blogs.
- Quantifying ROI on edge premium not possible yet. We’re not sure yet how best to quantify the ROI of paying a 20-30% premium over regular cloud instances for an edge instance. We realize that the bulk of latency and performance guarantees may have to come from the carriers like Verizon who own the last mile. At this point, it’s unlikely the carriers who are experimenting with these services will provide any hard SLAs. We suspect that carriers will have to offer SLAs before large-scale edge application deployment by major enterprises.
- Make an informed decision between regional clouds and the edge. Our tests were run in the SF Bay Area, which happens to have a nearby AWS EC2 region (NorCal). Given that AWS usually ensures strong connectivity from their regions to major carriers and other significant internet sites, the performance differential between the Wavelength site and the EC2 region will be less significant than in geographic regions without a nearby EC2 region. For application developers, it might be worth making the call between using a nearby region versus paying a premium for Wavelength in these locations. It’ll boil down to application orchestration complexity (having different models of deployment in different locations) versus paying the edge premium.
- Developers benefit from an edge that’s an extension of the cloud. For application developers, the seamless extension of a cloud platform is a convenient route to edge applications. For those familiar with Amazon EC2 (or Azure or GCP), it’s convenient to be able to pick a deployment zone and magically start an instance in a carrier’s switching center or central office with its attendant performance benefits. This frictionless onboarding into the edge will help in encouraging the ecosystem to experiment and innovate.
- Accessible edge testing ground is important to innovation. It’s essential to have a service like AWS Wavelength available and accessible. Without the wider ecosystem’s ability to experiment and try out new ideas, we can’t embark on the journey to discover the 5G killer apps that the collective industry is seeking. Just as 4G LTE allowed innovations like Uber and Lyft, and the public cloud enabled Zynga, Netflix, and many other services to scale, the public edge cloud can do the same for a new generation of mobile applications. We understand it’s an early investment on both AWS’s and Verizon’s part that may not yield returns for some time. It may be a “build it but they still won’t come” situation, but we’re hopeful that it’s not.
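To make the latency-versus-throughput observation above concrete, here is a back-of-the-envelope Python sketch. It models start-up time as a few round trips plus the time to transfer an initial payload. The model is a deliberate simplification (it ignores TCP slow start, server think time, and radio scheduling), and the payload size, round-trip count, RTTs, and throughputs are hypothetical values chosen for illustration, not figures from our tests.

```python
def startup_time_s(payload_bytes, rtt_ms, throughput_mbps, round_trips=3):
    """Rough start-up time: handshake-style round trips plus payload transfer time."""
    latency_part = round_trips * rtt_ms / 1000.0
    transfer_part = payload_bytes * 8 / (throughput_mbps * 1_000_000)
    return latency_part + transfer_part


# Hypothetical scenario: a 5 MB initial payload (for example, the first media segments).
PAYLOAD = 5_000_000

baseline = startup_time_s(PAYLOAD, rtt_ms=40, throughput_mbps=50)
lower_rtt = startup_time_s(PAYLOAD, rtt_ms=20, throughput_mbps=50)
higher_throughput = startup_time_s(PAYLOAD, rtt_ms=40, throughput_mbps=1000)

if __name__ == "__main__":
    print(f"baseline:          {baseline:.2f} s")
    print(f"half the RTT:      {lower_rtt:.2f} s")
    print(f"20x the bandwidth: {higher_throughput:.2f} s")
```

Under these assumed numbers, halving the RTT shaves only a small fraction off the start-up time, while the jump in throughput removes most of it. That is why a faster-feeling application does not by itself show that the edge site lowered latency.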
Interim Conclusions and Future Steps
We’re disappointed we were unable to achieve low latencies on Wavelength in the SF Bay Area and will continue to try. We’ll be expanding our tests to other Wavelength zones as we recruit other volunteers (if you want to help, drop us a note at [email protected]).
Furthermore, we’re expecting Microsoft Azure and AT&T (and other carriers) to offer their version of a telco edge cloud in the next few months. We’d love to test that once it’s available. And we’re keeping our fingers crossed that Google will have something from its Google Global Mobile Edge Cloud (GMEC) in the market early next year.
If you have ideas on what tests we should perform next or want to chat with us about the edge, drop us a line. We are optimistic on the edge but realistic in terms of what’s achievable today. Nevertheless, we believe it’s up to us to keep pushing the boundaries to drive our collective learning. We hope you’ve found this article and our early findings useful. You can reach the AvidThink research team at [email protected].
We look forward to hearing from you!
2020/11/19 Update: Part 2 of this series is available now.
Acknowledgements
AvidThink would like to thank our SE friends at WWT for suggestions and assistance with our Amazon EC2 setup. We would also like to thank Nithin Michael, founder/CTO at Mode.net, for his early feedback and suggestions. The automation scripts we wrote for the AWS Wavelength setup were based on information gleaned from public posts on AWS’ websites. All errors and omissions are the sole responsibility of AvidThink.
Disclosure
As an analyst and advisory firm, AvidThink may have past or ongoing engagements with companies covered by our research. This test was conducted independently, without additional assistance from Verizon or AWS beyond what a normal paying customer of either would receive.
Additional Data
The testing tool we used, iperf3, provides both TCP and UDP test modes. The TCP test mode is likely to better reflect application performance on the network, but doing a raw UDP blast can help determine the packet carrying capacity of the link — it’s a pretty brute force test on the network though. And as you’ve seen in the article above, when we try to force too many packets through, the network just ends up dropping them.
You will also note that we use multiple streams for the TCP tests. Under some conditions, using multiple streams might allow the application to achieve higher throughput (especially if there’s some loss on the link). However, TCP implementations are getting more sophisticated than the original simple slow-start and congestion avoidance algorithm. Without trying to figure out the actual TCP implementation in the underlying test systems, we simply ran tests with 1, 3 and 5 streams.
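For intuition on why parallel streams can help on a lossy link, the sketch below uses the well-known Mathis approximation for a single Reno-style TCP flow, throughput ≈ (MSS / RTT) × (C / √p) with C ≈ √(3/2), and scales it by the number of streams up to the link capacity. This is a textbook simplification and an assumption on our part: it does not describe whatever congestion control the test systems actually ran, and the MSS, RTT, loss rate, and capacity below are hypothetical values.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(3 / 2)):
    """Approximate steady-state throughput of one Reno-style TCP flow (Mathis model)."""
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


def aggregate_bps(streams, link_capacity_bps, mss_bytes, rtt_s, loss_rate):
    """N independent flows scale roughly linearly until the link capacity caps them."""
    per_flow = mathis_throughput_bps(mss_bytes, rtt_s, loss_rate)
    return min(streams * per_flow, link_capacity_bps)


if __name__ == "__main__":
    # Hypothetical values: 1388-byte MSS, 40 ms RTT, 0.01% loss, 1 Gbps link.
    for n in (1, 3, 5):
        bps = aggregate_bps(n, 1e9, mss_bytes=1388, rtt_s=0.040, loss_rate=1e-4)
        print(f"{n} stream(s): {bps / 1e6:.0f} Mbps")
```

In this model a single flow is limited by loss and RTT rather than by the link, so adding streams raises the aggregate. Modern congestion control algorithms behave differently, which is one reason we simply measured with 1, 3, and 5 streams rather than trying to predict the outcome.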
As additional data, we present our tests on the 4G network. Note that Wavelength also works on Verizon’s 4G LTE network. In this test, we did see improved upload speed in the Wavelength zone compared to the other regions. However, the download tests were not conclusive, with the NorCal region outperforming Wavelength in 2 of the download tests.
Table 5. 4G LTE Throughput Tests – TCP
In the UDP tests, we again saw no appreciable difference in upload or download performance. However, you’ll likely notice the high loss rates once we pushed past the bandwidth available on the LTE network.
Table 6. 4G LTE Throughput Tests – UDP
Next, we present ping tests (50 count) we ran between the Wavelength zone and the Oregon server, and between the NorCal server and the Oregon server. You will note that the ping results between these servers are highly stable when compared to those taken over the mobile network.
Table 7. Ping Tests between AWS Servers
Beyond this, we tried to understand why the variability was so high between the UE and the Wavelength server. We attempted to do a traceroute from the phone to the Wavelength server. Since we didn’t want to root our Android phone, we used the tracepath command-line tool instead, which is a non-privileged implementation using UDP instead of ICMP. Of course, all these tools depend on intermediate nodes returning TTL time exceeded messages, which may not always happen. In any case, the following results are what we found, which, if accurate, show quite a few hops between the UE and the Wavelength server (at least 7 hops before we get no further information). Unfortunately, because not all routers on the path provided visibility and returned TTL time exceeded messages, we can’t draw any definitive conclusions.
TRACEPATH FROM UE TO WAVELENGTH SERVER
1?: [LOCALHOST] pmtu 1428
1: 225.sub-66-174-219.myvzw.com 281.019ms
1: 225.sub-66-174-219.myvzw.com 38.830ms
2: no reply
3: 146.sub-69-83-165.myvzw.com 93.850ms
4: no reply
5: 234.sub-69-83-160.myvzw.com 108.814ms
6: 63.sub-69-82-83.myvzw.com 52.057ms asymm 9
7: 136.sub-69-83-161.myvzw.com 37.591ms
8: 10.210.188.96 39.563ms
9: no reply
[REPEATED LINES]
30: no reply
Too many hops: pmtu 1428
Resume: pmtu 1428
TRACEPATH FROM UE TO US-WEST-2 (OREGON) SERVER
1?: [LOCALHOST] pmtu 1428
1: 225.sub-66-174-219.myvzw.com 74.949ms
1: 225.sub-66-174-219.myvzw.com 36.926ms
2: no reply
3: 146.sub-69-83-165.myvzw.com 117.884ms
4: no reply
5: 234.sub-69-83-160.myvzw.com 81.078ms
6: 132.sub-69-83-161.myvzw.com 63.572ms
7: 0.csi1.SNVACANX-MSE01-BB-SU1.ALTER.NET 57.439ms asymm 10
8: no reply
9: no reply
10: 0.ae28.GW7.SJC7.ALTER.NET 155.213ms
11: 204.148.55.30 56.318ms asymm 10
12: 54.240.242.213 52.257ms
13: 54.240.242.57 54.305ms asymm 11
14: 150.222.97.38 74.601ms asymm 27
15: 52.93.132.96 108.857ms asymm 23
16: no reply
[REPEATED LINES]
30: no reply
Too many hops: pmtu 1428
Resume: pmtu 1428
TRACEPATH FROM UE TO US-WEST-1 (NORCAL) SERVER
1?: [LOCALHOST] pmtu 1428
1: 225.sub-66-174-219.myvzw.com 373.190ms
1: 225.sub-66-174-219.myvzw.com 40.198ms
2: no reply
3: 146.sub-69-83-165.myvzw.com 99.005ms
4: no reply
5: 234.sub-69-83-160.myvzw.com 210.506ms
6: 132.sub-69-83-161.myvzw.com 45.561ms
7: 0.csi1.SNVACANX-MSE01-BB-SU1.ALTER.NET 126.453ms asymm 10
8: no reply
9: no reply
10: 0.ae28.GW7.SJC7.ALTER.NET 89.164ms
11: 204.148.55.30 57.521ms asymm 10
12: 54.240.242.219 48.806ms asymm 13
13: 52.93.47.34 45.542ms asymm 12
14: 52.93.47.243 92.404ms
15: 54.240.242.83 47.062ms asymm 12
16: 52.93.47.98 48.604ms asymm 11
17: no reply
18: no reply
19: no reply
20: no reply
21: no reply
22: no reply
23: ec2-54-177-203-112.us-west-1.compute.amazonaws.com 150.431ms reached
Resume: pmtu 1428 hops 23 back 18
The only trace that reached its destination was the one to the NorCal server, though multiple hops in the middle gave no response. Regardless, we provide this more as background information than as any deep analysis.
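For readers who want to tally traces like the ones above, here is a small Python sketch that counts responding and silent hops in tracepath-style output. The parsing rules are our assumption based on the line shapes shown in this post (a hop number, a host, a time in ms, and optional `asymm` or `reached` notes); other tracepath versions may format lines differently. The sample is an excerpt of the Wavelength trace above.

```python
import re

# A hop that answered: "<hop>: <host> <time>ms [asymm N] [reached]"
HOP_RE = re.compile(r"^\s*(\d+):\s+(\S+)\s+([\d.]+)ms(.*)$")
# A hop that stayed silent: "<hop>: no reply"
NO_REPLY_RE = re.compile(r"^\s*(\d+):\s+no reply\s*$")


def summarize_tracepath(text):
    """Return responding hops, silent hops, the last responder, and whether we arrived."""
    replied = {}
    silent = set()
    reached = False
    for line in text.splitlines():
        match = HOP_RE.match(line)
        if match:
            # A hop may be probed more than once; keep the latest host seen.
            replied[int(match.group(1))] = match.group(2)
            if "reached" in match.group(4):
                reached = True
            continue
        match = NO_REPLY_RE.match(line)
        if match:
            silent.add(int(match.group(1)))
    return {
        "responding_hops": sorted(replied),
        "silent_hops": sorted(silent - set(replied)),
        "last_responder": replied[max(replied)] if replied else None,
        "reached": reached,
    }


# Excerpt of the UE-to-Wavelength trace shown above.
SAMPLE = """\
1?: [LOCALHOST] pmtu 1428
1: 225.sub-66-174-219.myvzw.com 281.019ms
1: 225.sub-66-174-219.myvzw.com 38.830ms
2: no reply
3: 146.sub-69-83-165.myvzw.com 93.850ms
4: no reply
5: 234.sub-69-83-160.myvzw.com 108.814ms
6: 63.sub-69-82-83.myvzw.com 52.057ms asymm 9
7: 136.sub-69-83-161.myvzw.com 37.591ms
8: 10.210.188.96 39.563ms
9: no reply
30: no reply
Too many hops: pmtu 1428
"""

if __name__ == "__main__":
    print(summarize_tracepath(SAMPLE))
```

A summary like this only reflects the routers that chose to answer, so it gives a lower bound on visible hops rather than a reliable hop count.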