Experience: An Open Platform for Experimentation with
Commercial Mobile Broadband Networks
Özgü Alay1 , Andra Lutu1 , Miguel Peón-Quirós2 , Vincenzo Mancuso2 , Thomas Hirsch3 ,
Kristian Evensen3 , Audun Hansen3 , Stefan Alfredsson4 , Jonas Karlsson4 , Anna Brunstrom4 ,
Ali Safari Khatouni5 , Marco Mellia5 , Marco Ajmone Marsan2,5
1 Simula Research Laboratory, Norway
2 IMDEA Networks Institute, Spain
3 Celerway Communications, Norway
4 Karlstad University, Sweden
5 Politecnico di Torino, Italy
ABSTRACT
Open experimentation with operational Mobile Broadband (MBB) networks in the wild is currently a fundamental requirement of the research community in its endeavor to address the need for innovative solutions for mobile communications. Even more, there is a strong need for objective data about the stability and performance of MBB (e.g., 3G/4G) networks, and for tools that rigorously and scientifically assess their status. In this paper, we introduce the MONROE measurement platform: an open-access and flexible hardware-based platform for measurements and custom experimentation on operational MBB networks. The MONROE platform enables accurate, realistic and meaningful assessment of the performance and reliability of 11 MBB networks in Europe. We report on our experience designing, implementing and testing the solution we propose for the platform. We detail the challenges we overcame while building and testing the MONROE testbed and argue our design and implementation choices accordingly. We describe and exemplify the capabilities of the platform and the wide variety of experiments that external users already perform using the system.
1 INTRODUCTION
Mobile broadband (MBB) networks have become the key infrastructure for people to stay connected everywhere they go and while on
the move. Society’s increased reliance on MBB networks motivates
researchers and engineers to enhance the capabilities of mobile
networks by designing new technologies to cater for a plethora
of new applications and services, growth in traffic volume and a
wide variety of user devices. In this dynamic ecosystem, there is
a strong need for both open objective data about the performance
and reliability of commercial operators, as well as open platforms
for experimentation with operational MBB providers.
In this paper, we introduce MONROE: the first open-access hardware-based platform for independent, multihomed, large-scale experimentation in heterogeneous MBB environments. MONROE comprises a large set of custom hardware devices, both mobile (e.g., via hardware operating aboard public transport vehicles) and stationary
(e.g., volunteers hosting the equipment in their homes), all multihomed to three operators using commercial-grade subscriptions.
Thorough, systematic and repeatable end-to-end measurements are essential for evaluating network performance, assessing the quality experienced by end users and experimenting with novel protocols. While existing experimental platforms, such as PlanetLab [29], RIPE Atlas [31] or CAIDA Ark [3], meet these requirements, they are limited to fixed broadband networks and are not multihomed. MONROE is a one-of-a-kind platform that enables controlled experimentation with different commercial mobile carriers. It enables users to run custom experiments and to schedule experimental campaigns to collect data from operational MBB and WiFi networks, together with full context information (metadata). For example, MONROE can accommodate performance evaluation of different applications (e.g., web and video) over different networks, or testing of different protocols and solutions under the same conditions.
Objective performance data is essential for regulators to ensure transparency and the general quality level of the basic Internet access service [24]. Several regulators responded to this need with ongoing nationwide efforts [6]. However, they often neither open their solutions to the research community to allow for custom experimentation, nor grant free access to the measurement results and methodology. MONROE aims to fill this gap and offers free access to custom experimentation. The MONROE project selected 27 different external users to deploy their own custom experiments on the MONROE system, with the purpose of testing and further improving the platform based on their feedback.
A common alternative to using controlled testbeds such as MONROE is to rely on end users and their devices to run tests by visiting a website [26] or running a special application [13]. The main advantage of such crowdsourcing techniques is scalability: they can collect millions of measurements from different regions, networks and user equipment types [10]. However, repeatability is challenging, and one can only collect measurements at users' own will, with no possibility of either monitoring or controlling the measurement process. Mostly due to privacy reasons, crowd measurements do not always provide important context information (e.g., location, type of user equipment, type of subscription, and connection status (2G/3G/4G and WiFi)). MONROE is complementary to crowdsourcing approaches, and its control over the measurement environment tackles the shortcomings of crowd data, though at the cost of a smaller geographical footprint [8]. Furthermore, MONROE supports the deployment of different applications and protocols, and enables benchmarking of tools and methodologies.
In the rest of the paper, we report on our experience designing, implementing and using the platform. We detail the design
considerations and demonstrate the versatility of our approach
(Section 2). We explain how we cater for the requirements of experimenters and enable them to deploy myriad measurements on
operational commercial MBB networks. The MONROE measurement node (hereinafter, the node or the MONROE node) sits at the center of the system and is its most important element, conditioning the proper functioning of the measurement system. We
describe our experience with the MONROE system implementation
and detail the hardware selection for the MONROE measurement
node (Section 3). We forged the node to be flexible and powerful
enough to run a wide range of measurement and experimental tasks,
including demanding applications like adaptive video streaming. At the same time, we ensured that the node software design translates into a robust implementation (Section 4) that can also easily evolve and be upgraded to keep pace with the most recent technological
innovations. We further present the user access and scheduling solution we offer experimenters for exploiting the available resources
of the platform in a fair manner (Section 5). Finally, we demonstrate that the MONROE system is a fitting solution to conduct a
wide range of experiments over commercial cellular networks. To
showcase its capabilities, we describe the main categories of experiments that MONROE supports and that MONROE users are conducting at the time of writing (Section 6). Additionally, we expand on our experience interacting with the external users (Section 7).
2 SYSTEM DESIGN
Figure 1: The MONROE platform: MONROE nodes operate in trains, buses or inside homes, and each connects to three commercial mobile operators in each country with MONROE presence. Users access the available resources and deploy their experiments via the User Access and Scheduling system. Measurement results synchronize to external repositories operating in the back-end.
Throughout the design process of MONROE, we interacted with
the users of the platform (e.g., universities, research centers, industry and SMEs) and collected their feedback on requirements for
platform functionality. This allowed us to gauge experimenters’
expectations and use them to sketch the platform specifications.
2.1 Requirements
We summarize the main requirements as follows.
Large scale and diversity: To give a representative view of the characteristics of an entire network, we need to collect measurements from a large number of vantage points. Furthermore, we should strive to collect measurements in diverse geographical settings, from major cities to remote islands.
Mobility: Mobility is what makes MBB networks unique compared to other wireless networks. To provide insight into the mobility dimension of MBB networks, it is imperative that the platform integrates a deployment under realistic mobility scenarios.
Fully programmable nodes: To accommodate the wide range of experiments users contemplate running on the platform, we need measurement devices that are flexible, powerful and robust.
Multihoming support: To compare different mobile operators and/or different wireless technologies under the same conditions, the same node should connect to multiple providers at the same time (multihoming support). This further makes the platform particularly well suited for experimentation with methods that exploit the aggregation of multiple connections.
Rich context information: When analyzing the measurements, context information is crucial. The platform should monitor the network conditions, the time and location of each experiment, as well as the metadata from the modems, including, for example, cell ID, signal strength and connection mode.
Easy-to-use platform: It is crucial to make it easy for users to access the system and deploy experiments on all or a selected subset of nodes. This requires a user-friendly interface together with a well-managed and fair scheduling system.
2.2 Design Overview
We shaped the main building blocks of the MONROE platform
such that we can meet the above-mentioned requirements. Note
that while implementing different components of the platform,
operational aspects also impacted the design choices, which we
will discuss in detail in Sections 4-5. Next, we give an overview of
the purpose and functionality of the main building blocks of the
MONROE system, which we illustrate in Figure 1. All the software
components of the MONROE system are open source [17].
MONROE Node: MONROE operates 150 nodes in 4 countries
in Europe (Spain, Italy, Sweden and Norway). The measurement
node resides at the core of our platform. Its design comprises two main parts, namely the hardware configuration and the software
ecosystem. In terms of hardware, each node has a main board that is
a small programmable computer and supports (at least) 4 interfaces:
three 3G/4G modems and one WiFi modem. To cover a diverse set of
mobility scenarios, we customize a portion of the nodes (i.e., 95 out
of 150 total nodes) to operate on public transport vehicles (buses
and trains) and also in delivery trucks. In Section 3, we detail the
choices for the node hardware implementation and our experience
with running two node prototypes.
The node software is based on a Linux Debian “stretch” distribution to ensure compatibility with multiple hardware configurations
and to enable a large set of experiments. Furthermore, especially
considering the experimentation on protocols, Linux is the only
operating system with sufficient hardware support for research and
implementation of transport protocols due to the accessibility of
the source code, flexibility and community maintenance to ensure
operability with other systems. On top of the operating system, the
nodes run: (i) the management software that performs the normal
jobs expected on any mobile device, (ii) the maintenance software
that monitors the operational status of the nodes and diminishes
manual maintenance intervention, and (iii) the experimentation enablers, which enable experiment deployment (via the scheduler client)
and feed rich context information to the experiments. To provide
agile reconfiguration and access for the experimenter to different
software components, the experiments run in the Docker lightweight virtualized environment. This also ensures containment of
external actions in the node system. We periodically transfer the results of the experiments from the nodes to a remote repository. We
further detail in Section 4 the node software ecosystem and present
our evaluation of potential node internal performance overheads.
User access and scheduling: MONROE enables access to platform resources through a user-friendly web portal [19] that allows
authenticated users to use the MONROE scheduler to deploy their
experiments. The MONROE Scheduler facilitates exclusive access
to the nodes (i.e., no two experiments run on the node at the same
time) while ensuring fairness among users by accounting for data quotas. We provide the details and the implementation choices for the
user access and scheduling policies in Section 5.
3 HARDWARE IMPLEMENTATION
Given the requirements we drew from MONROE stakeholders (Section 2), the measurement device needs to be small, able to function
in different environments (buses, trains, homes), affordable, robust,
sufficiently powerful and should support the mainline Linux kernel.
The size and price constraints led us to evaluate different Single Board Computers (SBCs). A large number of different SBCs are available to the consumer public, with different CPU architectures and hardware configurations. However, most contain hardware
requiring the use of proprietary drivers, thus restricting us to old
kernels or making it impossible to compile custom kernels. We evaluated several options, including popular ones such as Raspberry
Pi [30], Odroid [25] and Beaglebone [1], and we selected the PC Engines APU [28]. We chose the APU because it provides sufficient processing power, storage and memory for the foreseeable future at a
reasonable cost. APUs integrate a 1 GHz 64-bit quad-core processor, 4 GB of RAM and a 16 GB HDD. APUs have 3 miniPCI express slots,
two of which support 3G/4G modems.
Modem Selection: To multihome to three mobile operators and
a WiFi hotspot, we initially equipped the PC Engines APU board
with a Yepkit self-powered USB hub [38], three USB-based CAT4 MF910 MiFis [42] and one WiFi card [4]. We chose the MF910 MiFi because, at the time we selected the hardware, it was the most modern device sold by the operators we measured.
In the prototype validation phase, this implementation presented
some major obstacles. While the APUs proved to be very stable, the
MiFis proved more challenging than expected. First of all, in the
last quarter of 2016, the MiFis’ vendor issued a forced update to the
firmware. The update was applied despite the fact that we took special care to configure the devices not to receive automatic updates.
As a result of the forced update, all our MiFis became inaccessible to the MONROE system. Furthermore, the MiFis themselves were prone to resets or to entering a working state (transparent PPP) from which we could only restore them to normal operation by draining their batteries or performing a manual reboot by pushing the power button. Finally, after 6 months of operation, some of the MiFis showed clear signs of swollen batteries. This problem raised serious safety concerns for the nodes operating in places other than our own (controlled) premises (e.g., public transport vehicles). We thus modified the hardware configuration to use internal modems operating in the miniPCIe slots of the APU board.
Current Node Configuration: We decided to increase our control over the MONROE node and base its implementation on a dual-APU system. One of the two APUs in each node has two MC7455 miniPCI express (USB 3.0) modems [33], while the other has one MC7455 modem and a WiFi card. We chose the Sierra Wireless MC7455 as our 4G modem since, at the time of the upgrade, it supported the most recent category (CAT6) an industrial-grade modem could provide. This design eliminates the risk brought on by the use of batteries, avoids any forced updates (the new modems are not routers), simplifies resets (no draining of batteries) and increases our overall control over the system.
Takeaways: The APUs showed very stable performance, while repurposing the MiFis to behave as simple modems presented major challenges (e.g., forced updates and swollen batteries). We thus brought forward a more compact and robust node configuration that relies on internal modems operating in miniPCIe slots. This also simplifies the node, since we avoid potential NAT and routing issues the MiFis might trigger.
4 NODE SOFTWARE IMPLEMENTATION
In this section, we describe in detail the node software ecosystem
and present the justification for our implementation choices.
4.1 Software Ecosystem
Figure 2 presents the elements that coexist in the MONROE node
software ecosystem, namely the node management software, the
node maintenance software and the experimentation enablers.
Figure 2: Node Software Ecosystem.
The node management software integrates a set of core components that run continuously in the background. They perform
low-level work in line with the normal jobs expected on any mobile device or computer. These include (i) a Device Listener, which
detects, configures and connects network devices, (ii) a Routing
Daemon, which acquires an IP address through DHCP and sets up routing tables, and (iii) a Network Monitor, which monitors interface state,
checks the connectivity of the different interfaces and configures
default routes. The node operates behind a firewall, which we configure with strict rules to increase node security.
The node maintenance software integrates components that
monitor the node status and trigger actions to repair or reinstall
when malfunctioning. A system-wide watchdog ensures that all core
components (node management) are running. However, during
the first few months, we experienced loss of connection to nodes
because of problems that watchdogs could not tackle, such as file
system corruptions which can occur due to frequent sudden power
loss in mobile nodes. Thus, we defined and implemented a robust
node recovery method, called BootOS, that enables a hard restart of the node (i.e., a reinstallation of the operating system to a known working baseline). This method allows us to recover both from file system errors that prevent system boot-ups and from software configurations that may lead to loss of connectivity. To achieve this goal, we trigger a two-stage boot loader process at node start-up. In the first stage, we start the BootOS, which resides entirely in RAM and only uses read-only hard-drive access for its normal operation. The BootOS verifies that the filesystem of the APU is not corrupt, and that no forced reinstallation has been requested. It then proceeds to boot the MainOS, which contains the MONROE system software. If the filesystem is corrupt, or in case of a forced reinstallation, the BootOS reinstalls an image of a known working installation.
The experimentation enablers include the scheduling client, the default experiments, and the services for external experiments. Within the node software ecosystem, we differentiate between the user experiments and the management and maintenance software by configuring a separate monroe network namespace where experiments run. This increases our control over the ecosystem and limits the impact external users can have on the node. This separation further allows us to account (as part of the scheduling system) for the traffic volume each user consumes. We require that each experiment runs inside a virtualized environment (a Docker container) to ensure separation and containment of processes. The Scheduling Client communicates with the Scheduler to enable experiment deployment per user request. It periodically checks for new experiment containers to run on the node and deploys them in advance of their scheduled execution time. Section 5 offers more details on the scheduling system. The metadata broadcasting service runs continuously in the background and relays metadata through ZeroMQ [41] in JSON [12] format to experiment containers. The nodes periodically run connectivity measurements (e.g., ping), and these, together with the metadata, allow us to monitor the node's state and the overall health of the platform. Furthermore, the Tstat [9] passive probe provides insights on the traffic patterns at both the network and the transport levels, offering additional information on the traffic each interface exchanged during an experiment.
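To illustrate how an experiment can consume this metadata stream, the following is a minimal subscriber sketch assuming pyzmq inside the container; the broker endpoint and topic prefix are illustrative placeholders rather than the platform's documented values.

```python
import json
import zmq

# Hypothetical endpoint and topic prefix; the actual values are
# documented in the MONROE user manual and may differ.
ENDPOINT = "tcp://172.17.0.1:5556"
TOPIC = "MONROE.META"

ctx = zmq.Context()
sock = ctx.socket(zmq.SUB)
sock.connect(ENDPOINT)
sock.setsockopt_string(zmq.SUBSCRIBE, TOPIC)  # filter metadata messages

while True:
    # Messages arrive as "<topic> <json payload>"
    topic, _, payload = sock.recv_string().partition(" ")
    msg = json.loads(payload)
    # e.g., cell ID, signal strength and connection mode, as described above
    print(topic, msg)
```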
Takeaways: Containment of user activity on the node is paramount to avoid security risks, node malfunctioning events, unreliable results and, more severely, node loss. We prevent foreign unauthorized access to the node with a strict firewall. Then, continuous monitoring of the platform is crucial, and we enable it by implementing monitoring functions in the node management software. Node maintenance is expensive, so it is important to forge the node as a self-healing system. We implement this functionality in the node maintenance software, which takes automatic actions when the node malfunctions.
4.2 Experiment Containment
Docker Virtualization. The node design we propose mandates
that MONROE users execute their experiments inside Docker containers, which provide isolation from the host node. This is true
both for default monitoring measurements and external users' experiments. Docker containers are based on a layered file system,
where a container can reuse layers shared with other containers.
MONROE provides the default base image for the experiment
containers, which integrates the base operating system installation with default tools that are potentially useful for many experiments. The lightweight containers provide just the contents that
are unique for the particular experiment, significantly reducing the
download and deployment time overhead and accountable traffic
volume. Experiments running inside a container have access to the
experimental network interfaces. They can read and write on their
own file system, overlaid over that of the base MONROE image.
Finally, there are specific paths (e.g., /MONROE/results/) where the
experiments can write their results and that the node automatically
transfers to the MONROE servers. Our public software repositories
contain all the files necessary to build new user experiments, as
well as experiment templates and examples.
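As an illustration of what runs inside such a container, below is a minimal experiment sketch in Python. The /MONROE/results/ path is the one named above; the interface names and fping invocation are illustrative assumptions, not the platform's prescribed template.

```python
import json
import subprocess
import time

# Assumed operator interface names; a real node exposes one per modem.
INTERFACES = ["op0", "op1", "op2"]

def ping_once(iface, target="8.8.8.8"):
    # fping -I binds to an interface; -C 3 collects three RTT samples,
    # which -q prints on stderr as "target : rtt1 rtt2 rtt3".
    proc = subprocess.run(["fping", "-I", iface, "-C", "3", "-q", target],
                          capture_output=True, text=True)
    return proc.stderr.strip()

results = {"ts": time.time(),
           "rtt": {iface: ping_once(iface) for iface in INTERFACES}}

# Files under /MONROE/results/ are picked up and transferred by the node.
with open("/MONROE/results/example.json", "w") as fh:
    json.dump(results, fh)
```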
Internal NAT Function. To ensure the minimum impact of user
experiments gone wrong, we define the monroe network namespace
where experiment containers run. For each physical interface that
the network listener detects as available, we create a virtual Ethernet (veth) interface pair and move one end to the monroe
namespace. We then add routing rules in the network namespace
to allow routing by interface. In order to allow the network devices
in the host namespace to communicate with the ones in the monroe network namespace, we define an internal Network Address
Translation (NAT) function. We use iptables NAT masquerading
rules in the host namespace to configure the NAT function. Finally,
we add the corresponding routing rules to map each veth interface
to the correct physical interface.
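A minimal sketch of this plumbing for a single interface follows, expressed as the underlying ip/iptables invocations driven from Python; the namespace name and modem interface match the narrative above, while the veth names and addresses are placeholders.

```python
import subprocess

def sh(cmd):
    # Helper: run a shell command, raising on failure.
    subprocess.run(cmd, shell=True, check=True)

# Create the experiment namespace and a veth pair bridging into it.
sh("ip netns add monroe")
sh("ip link add veth0 type veth peer name veth1")
sh("ip link set veth1 netns monroe")                 # one end into monroe ns
sh("ip addr add 192.168.10.1/24 dev veth0 && ip link set veth0 up")
sh("ip netns exec monroe ip addr add 192.168.10.2/24 dev veth1")
sh("ip netns exec monroe ip link set veth1 up")
# Route traffic from the monroe namespace towards the host end of the pair.
sh("ip netns exec monroe ip route add default via 192.168.10.1 dev veth1")
# NAT masquerading in the host namespace maps veth traffic onto the
# physical modem interface (wwan0 here).
sh("iptables -t nat -A POSTROUTING -s 192.168.10.0/24 -o wwan0 -j MASQUERADE")
```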
Overhead Quantification. The internal network design introduces two potential overheads that might impact performance measurements: (i) the internal NAT function that connects the network
devices in the host namespace with their corresponding duplicates
in the monroe namespace, and (ii) the Docker containers we use to
separate the processes that correspond to a certain experiment that
runs inside the container. Thus, prior to detailing the measurement
results of different commercial MBB operators, we focus here on
these two design overheads and aim to quantify their impact (if
any) on performance measurement results. More specifically, we
quantify the delay overhead by running ICMP ping measurements,
and the impact on throughput by running HTTP downloads.
To instrument our system benchmarking measurements we use
a single APU node running the Debian “stretch” MONROE image
with a local Fast Ethernet link. Using a local link allows us to
minimize the impact of the network on our measurements, and
focus on the impact of the system overheads. We run http download
measurements with curl and ICMP ping measurements with fping
to quantify the impact of the internal NAT function and of the Docker virtualization. We focus on four configurations for our testing setup, namely: no NAT and no Docker (experiments run in the host namespace); no NAT but Docker (experiments run inside a Docker container in the host namespace); internal NAT and no Docker (experiments run in the monroe namespace); and internal NAT and Docker (experiments run inside a Docker container in the monroe namespace).
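The harness below sketches one way to collect the two metrics with the tools named above; the sample counts match the text, while the target host and the local download URL are placeholders.

```python
import subprocess

TARGET = "8.8.8.8"                    # delay target used in the text
URL = "http://192.168.1.2/1GB.bin"    # placeholder local-link server

def rtt_samples(n=1000):
    # fping -C sends n probes and reports per-probe RTTs on stderr.
    res = subprocess.run(["fping", "-C", str(n), "-q", TARGET],
                         capture_output=True, text=True)
    return res.stderr.strip()

def download_speed_mbps():
    # curl -w %{speed_download} prints the average speed in bytes/s.
    res = subprocess.run(["curl", "-s", "-o", "/dev/null",
                          "-w", "%{speed_download}", URL],
                         capture_output=True, text=True)
    return float(res.stdout) * 8 / 1e6   # convert to Mbps

print(rtt_samples(10))
print(f"{download_speed_mbps():.1f} Mbps")
```

Running this once per configuration (host namespace, Docker, monroe namespace, and Docker inside the monroe namespace) reproduces the comparison described next.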
To quantify the delay overhead, we collect 1,000 RTT samples against the Google DNS server 8.8.8.8 over the Ethernet connection for all four configurations. Figure 3 shows the results of the measurements. We conclude that the overhead of the NAT function internal to the node is insignificant. On average, we see a penalty on the order of 0.1 ms (i.e., in the range of the clock granularity in Linux systems). We note that the Docker and NAT combination introduces a slight delay, which is not overwhelming.
Figure 3: CDFs of ICMP RTTs [ms] measured against 8.8.8.8 per testing configuration over a Fast Ethernet link.
For the throughput measurements, we download 1 GB of data from a server configured in the local network. We collect 30 samples for each testing configuration. In Figure 4 we show the cumulative distribution of the download speed per testing configuration. We find that using the internal NAT function and the Docker virtualization introduces a 1% performance penalty on average. We report no direct impact from using the Docker containers alone, which we expected, since the purpose of the Docker virtualization is purely experiment containment.
Figure 4: CDFs of download speed [Mbps] measured per testing configuration over a Fast Ethernet link.
Takeaways: Our priority in the node software implementation phase is keeping the nodes within normal functioning parameters for as long as possible and limiting direct maintenance intervention, while allowing external users to run a wide range of complex measurements with minimum interference. To achieve this, we separate the network namespace where users can run their experiments from the host namespace, where the monitoring and management software runs. This introduces two potential overheads in the system, which we quantify and show to have little or no impact.
5 USER ACCESS AND SCHEDULING
We provide access to the MONROE platform through a user-friendly
interface consisting of an AngularJS-based web portal [19]. As part
of the MONROE federation with the Fed4FIRE [7] initiative, the user
access follows the Fed4FIRE specifications in terms of authentication and resource provisioning. Through the portal, experimenters
interact with the scheduler and deploy their experiments without directly accessing the nodes. The scheduler API is accessible to
enable experiment deployment automation. The scheduler prevents
conflicts between experiments (i.e., only one user can run an experiment on a certain node at a time) and assigns resources to each
user based on their requirements and resource availability.
Given the challenging scenarios we aim to cover in our testbed,
nodes in MONROE have potentially unreliable connectivity and
low bandwidth. This is the norm for nodes in buses, trains and trucks, which follow the schedule of the host vehicle. Experiment scheduling therefore accounts for two factors: (i) the node may not have connectivity at the time of the experiment, and (ii) the high lead time when deploying containers means that experiments should be deployed early. Furthermore, experimenters may need to run synchronous measurements on multiple nodes. The common approach
to task scheduling and decentralized computing, which deploys
jobs to registered nodes based on their availability, struggles with
these constraints. Therefore, for the MONROE scheduler, we follow
a calendar-based approach, assigning time slots to experiments.
Deployment of experiments takes place up to 24 hours in advance,
as soon as the node retrieves information about the assigned task.
This allows both immediate scheduling on nodes that are not otherwise occupied, and scheduling synchronous experiments on low
availability nodes well in advance. It also allows synchronizing
experiment runtime with vehicle schedules when available.
In addition to managing the time resource, the scheduler handles
data quotas assigned by the contracts with the MBB operators. We
assign each experimenter a fixed data quota. In addition, we may assign users a quota on computing time (i.e., the maximum time the users
can run experiments on the node). We designed the quota system to
provide fair usage of the available resources. An important factor to
ensure fairness in day-to-day usage is that a certain data quota is
reserved by the experimenter in advance, and subtracted from the
user quota for the duration of the experiment. Experimenters may
subsequently refund the remaining quota. Hence, it is not possible
to block large quantities of resources without having been assigned
the necessary budget, even if the resources are not actually used.
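As a sketch of this reserve-then-refund accounting, the class below captures the invariant in a few lines; the data structure is purely illustrative and not the scheduler's actual code.

```python
class QuotaError(Exception):
    pass

class UserQuota:
    """Illustrative reserve/refund accounting for a per-user data quota."""
    def __init__(self, total_bytes):
        self.remaining = total_bytes

    def reserve(self, requested_bytes):
        # The full reservation is subtracted up front, so a user cannot
        # block resources beyond their assigned budget.
        if requested_bytes > self.remaining:
            raise QuotaError("insufficient quota for this reservation")
        self.remaining -= requested_bytes
        return requested_bytes

    def refund(self, reserved_bytes, used_bytes):
        # After the experiment, the unused part of the reservation is
        # returned to the user's budget.
        self.remaining += max(reserved_bytes - used_bytes, 0)

quota = UserQuota(10 * 10**9)             # e.g., a 10 GB budget
r = quota.reserve(2 * 10**9)              # reserve 2 GB for an experiment
quota.refund(r, used_bytes=500 * 10**6)   # refund the unused 1.5 GB
```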
From March 2016 until March 2017, the MONROE scheduler has
been actively used by 30 users. A total of 75,002 experiments ran successfully on the platform, while 7,972 scheduled experiments failed. There are many different reasons for failed experiments; for example, the container exits unexpectedly or the data
quota is exceeded. Note that these failures are expected, especially for new users who are trying to familiarize themselves with
the platform. We are running an open conversation with our users,
gathering feedback from them and updating the user access and
scheduling policies accordingly.
Takeaways: Resource allocation and experiment scheduling on
MONROE is challenging because nodes have potentially unreliable
connectivity (e.g., nodes in mobility scenarios) and limited data
quota due to commercial-grade subscriptions. A calendar-based
approach for scheduling addresses these requirements by taking
into account per-user and per-node data quotas, and synchronized experiment start times.
6 OPEN EXPERIMENTATION
Starting from the platform design phase, we have been working
together with our stakeholders to understand their requirements
from the MONROE system and which experiments have the highest
appeal (Section 2). We then took this process further and, throughout 2016 and 2017, the MONROE consortium organized two open
calls for external experimenters. After a thorough selection process
based on peer-review, we funded 27 different projects (12 projects in
May 2016 from the first open call (OC1) and 15 projects in February
2017 from the second open call (OC2)) to be among the first users
of the MONROE system.
The experiments that these projects proposed are very diverse
and cover a wide range of scenarios, from simple network performance measurements, to the evaluation of innovative protocols, to application performance assessment. All experimenters were encouraged
to propose SW extensions to the platform (e.g., measurement packages that can be offered to the MONROE community) as well as HW
extensions to the infrastructure (e.g., deploying MONROE nodes in
locations with no previous MONROE coverage, or increasing the
density of MONROE nodes in locations with MONROE coverage).
6.1 MONROE Experiments
We report next on our experience accommodating the 12 OC1 measurement campaigns on the platform¹, together with the base experiments deployed by the consortium. We are currently offering to the community a series of experiments [18], which any external user can deploy on their own. This goes toward achieving our goal of shaping MONROE into an Experimentation as a Service (EaaS) platform. We group all these experiments in three main categories: Mobile Broadband Performance, Service Oriented QoE, and Innovative Protocols and Services. These categories also fit the range of measurements that our users are currently curating and have already been actively deploying. The distribution of experiment runs on the MONROE platform among these categories at the time of writing is: Mobile Broadband Performance (19%), Service Oriented QoE (36%) and Innovative Protocols and Services (45%). The volume of data that experiments in different categories consume varies, with Service Oriented QoE taking the largest quota (60%), while Innovative Protocols and Services are the least demanding (10%), despite registering the largest number of experiment runs. We further detail each category and provide examples of experiments and analysis one can perform using MONROE.
¹At the time of writing, only the 12 projects from OC1 are actively using the MONROE platform. Though already approved, the additional 15 projects from OC2 have not started actively using the MONROE system.
6.1.1 Mobile Broadband Performance. To measure a mobile network in a reliable and fair way, it is important to identify the metrics that accurately capture its performance. Different stakeholders
have different metrics of interest and we argue that MONROE is
able to cater to all of them. For example, regulators need connectivity, coverage and speed information to monitor whether operators
meet their advertised services. Operators are interested in spatiotemporal data reporting the operational connectivity information
to further identify instability and anomalies.
One important feature of the MONROE platform is that its deployment in public transportation vehicles allows users to evaluate
MBB performance in diverse and complex urban mobility environments. A unique characteristic of this deployment is the repeatability of measurements obtained by many runs on the same route,
at different hours. For example, Figure 5 shows RTT (ICMP ping)
measurements for an operator in Sweden, as measured by the node
operating aboard the same bus during several working days. In the
figure, dot colors encode the range of values for the measured RTTs
and we observe variations in RTT among different trips through
the same location. Repeated measurements provide high confidence
and diminish noise in the data, whereas measurement samples at
the same location but at different hours allow for the analysis on
the time-of-the-day effect (e.g., rush hour versus normal hours).
Figure 5: 3D graph of average RTT for an operator in Sweden. Multiple laps are shown using a Y-axis offset based on relative timestamps to visually separate the different trips.
6.1.2 Service Oriented Quality of Experience. An important measurement dimension to explore comes from the great interest in
how users perceive individual services and applications over different terminals (e.g., mobile phones, tablets, and computers). The
recent proliferation of user-centric measurement tools (such as Netalyzr [13]) to complement available network centric measurements
validates the increasing interest in integrating the end user layer
in network performance optimization. MONROE enables experimentation with essential services and applications, including video
streaming, web browsing, real-time voice and video, and file transfer services. The service oriented measurements give a good basis
for investigating the mapping from Quality of Service to Quality
of Experience. With such a mapping, operators can gain better understanding of how their customers perceive the services delivered
by their network. From the end users' and service providers' perspective, they could acquire more knowledge of the performance over different MBB networks and then choose the network that delivers the
best quality for services that are of interest to them. Furthermore,
application developers (e.g., YouTube, Netflix and Spotify) heavily
rely on the underlying network characteristics while optimizing
their services for the best user experience.
To showcase the capabilities of the platform, Figure 6 reports the
page load time (PLT) measured in the MONROE platform using a
headless [21] browser to fetch two popular websites (www.bbc.com
and www.ebay.com) from 37 nodes in four countries with MONROE
coverage. If we focus on the PLT as an objective indicator for the
quality of experience and track it in comparison with metadata
information, we further enable the analysis of the mapping between QoS metrics and the end-user experience.
Figure 6: Page download time for 11 operators, using data from 37 MONROE nodes in Spain, Italy, Sweden and Norway while fetching www.bbc.com and www.ebay.com; each subplot corresponds to a country-target pair, and each boxplot to a unique operator in each country.
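To make the PLT measurement concrete, the sketch below collects one sample through the W3C Navigation Timing API; headless Chromium driven via Selenium is a stand-in here for the headless browser the platform ships [21], and the target URL matches the text.

```python
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless")
driver = webdriver.Chrome(options=options)

driver.get("https://www.bbc.com")
# Navigation Timing: PLT = loadEventEnd - navigationStart (milliseconds)
plt_ms = driver.execute_script(
    "const t = window.performance.timing;"
    "return t.loadEventEnd - t.navigationStart;")
print(f"page load time: {plt_ms} ms")
driver.quit()
```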
6.1.3 Innovative Protocols and Services. Another significant use
case for MONROE is investigating the impact of middleboxes in
the current Internet ecosystem. These range from address and port
translators (NATs) to security devices to performance enhancing
TCP proxies. Middleboxes are known to introduce a series of issues
and hinder the evolution of protocols such as TCP. Since middleboxes are ubiquitous in MBB networks [34, 36, 37], in collaboration
with the H2020 MAMI project [16] we aim to observe and characterize middlebox operations in the context of real-world MBB
deployments. MONROE further enables assessment of existing protocols, paving the way for protocol innovation.
As an example in this category, we investigated whether the
operators measured with MONROE use Performance Enhancing Proxies (PEPs) [2] to improve end users' quality of experience. These
proxies provide higher performance and faster error recovery [5,
11, 23]. We ran throughput tests (HTTP downloads) from all the MONROE nodes on different ports against the same responder, where we also ran an instance of Tstat on the server. We then cross-compared the Tstat analysis logs on the nodes (client side) with the
server-side logs to examine if the proxy splits the TCP connection.
Table 1 shows the global view of the operators. Yes, Yes∗ and No in the table mean always, sometimes, and never, respectively. The third column indicates
the usage of NAT in the operator network. For instance, op2 in Italy
always uses NAT, and sometimes connections are routed through a
PEP. On the contrary, op1 sometimes assigns public IP addresses, but
HTTP traffic always goes through a PEP device. The fourth column indicates whether the performance seen on the client and server sides differs; a mismatch hints at the presence of a PEP. The fifth column reports the number of public IPs seen in server-side traces (i.e., the "size" of the PEP boxes). The last column shows whether the PEP changes the TCP headers (e.g., removing/adding/changing options), and on which ports. Overall, the picture varies, with different PEP configurations for different operators.
Table 1: The summary of the operators and their settings.

OP      NAT    PEP    # IP   L4 mangling
IT op0  No     Yes    262    80
IT op1  Yes∗   Yes    129    80,443,8080
IT op2  Yes    Yes∗   1484   No
ES op0  Yes    No     272    No
ES op1  Yes    No     244    No
ES op2  No     Yes    1652   80
SE op0  Yes∗   Yes∗   3486   No
SE op1  No     Yes    4679   No
SE op2  No     Yes∗   472    No
NO op0  No     Yes∗   46     No
NO op1  Yes∗   Yes∗   —      No
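The cross-comparison step can be sketched as follows, assuming the client- and server-side Tstat logs have first been reduced to CSV files with hypothetical columns (flow_id, win_scale); the real Tstat log schema differs, so this only illustrates the matching logic.

```python
import csv

def load(path):
    with open(path) as f:
        return {row["flow_id"]: row for row in csv.DictReader(f)}

client, server = load("client_log.csv"), load("server_log.csv")

for fid, c in client.items():
    s = server.get(fid)
    if s is None:
        continue
    # If a PEP splits the connection, TCP-level parameters observed at
    # the two ends of the same flow (e.g., window scale) will not match.
    if c["win_scale"] != s["win_scale"]:
        print(f"flow {fid}: client/server mismatch -> possible split proxy")
```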
6.2 External Projects Overview
The OC1 experiments focus mainly on Service Oriented Quality of Experience and on assessing the QoE of popular applications, such as interactive video-conferencing (e.g., webRTC) or popular video-on-demand (VoD) applications (e.g., YouTube), across multiple mobile carriers. There are 5 out of 12 such projects currently funded from OC1. Their number decreased to only one among the projects funded in OC2. A notable example in this category is a project that integrated the YoMoApp tool [35] with MONROE to monitor the QoE for YouTube. The resulting Yomo-Monroe [32] has two components: Yomo-docker and Yomo-browser-plugin. The Yomo-docker [40] container provides features to estimate YouTube's "Quality of Experience." For this purpose, the Docker container independently performs experiments and monitors the quality at the end-user side. Yomo-browser-plugin [39] monitors the quality of YouTube video streaming in the browser. This experiment is a software extension of MONROE and will be available as MONROE EaaS.
Three OC1 projects leverage MONROE data (from base experiments and metadata collection) to build and run anomaly detection algorithms or LTE performance benchmarking across multiple carriers. One particular OC1 project focuses on emergency communications and tests different protocol innovations. The project customized MONROE nodes with mobile and Ethernet connections, fixed point-to-point wireless links, low-band radio interfaces and satellite broadband – all classes of links that are used routinely in disaster situations to provide communications. The project proposes an application use case for emergency communications in disaster situations. This is based on IP multihoming support that enables resilient differentiated services and allows an application to select the best available transport path. The users produced PATHspider-monroe [14, 27], a version of the PATHspider tool [15] adapted for running on MONROE nodes. PATHspider is a tool for A/B testing of path transparency to certain features in the Internet. We note that this experiment is also available as a MONROE software extension and will be available as MONROE EaaS.
In OC2, the funded projects shifted mainly towards cloud and smart-city experimentation, with 8 out of 15 proposals active in these topics (e.g., smart city security monitoring with MONROE,
analysis of latency-critical connected vehicle applications). The OC2
external users have strong proposals for understanding the synergies between mobile carriers and popular cloud service providers
(e.g., characterizing mobile content providers in the wild or tackling
net neutrality in MBB networks). A couple of OC2 users aim to use the MONROE platform and the data we produce to devise machine learning algorithms for informing self-organizing networks (SONs).
7 EXTERNAL USERS EXPERIENCE
In this section, we present a summary of our interaction with the
external users while supporting them in the process of using the
platform for experimentation. The OC1 experimenters had exclusive access to the prototype platform. Thus, we relied upon their
feedback to refine the platform to its current version. We collected
this feedback in the form of a report the users provided, which integrated a grading system for different components of the platform.
The experimenters we selected through OC2 have access to a more
mature version of the MONROE platform, which we aim to further
improve pending their additional feedback.
7.1 External Users Feedback
Users reported that the documentation we provided in the MONROE User Manual [20] is very useful, receiving a score of 4.5/5
and that the MONROE experiment templates and examples are
easy to reuse (4.2/5). The virtualization method based on Docker
containers was very well received, with a mark of 5/5. However,
our users strongly suggested that we create more multimedia material
to complement the written user manual, particularly showing the
complete life cycle of an experiment from container creation to
scheduling and retrieval of results. Access to metadata was also
seen as easy and useful (4.5/5).
As expected when accessing a prototype platform, our users
saw some issues at the beginning, giving a mark for ease of use and usability of 3.6/5. Regarding the scheduling of experiments, in general
our users had some trouble understanding all the details of the scheduling process, pointing towards the need for more step-by-step instructions, such as the requested multimedia material. Some
topics that were particularly troubling for the platform users were
the binding to specific interfaces in a multihomed platform and the
optimization of the size of the Docker containers.
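The interface-binding issue has a compact illustration: on Linux, a measurement process can pin its traffic to one operator by binding its socket to a device, as sketched below (the interface name is a placeholder, and the option requires root privileges).

```python
import socket

# SO_BINDTODEVICE pins all traffic from this socket to one interface,
# which is how an experiment can select a specific operator on a
# multihomed node ("op0" is a placeholder name; requires CAP_NET_RAW).
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE, b"op0")
s.connect(("example.org", 80))
s.sendall(b"HEAD / HTTP/1.0\r\nHost: example.org\r\n\r\n")
print(s.recv(200).decode(errors="replace"))
s.close()
```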
We provided important additions based on user feedback. First,
by opening the scheduler API for command-line tools, we enable
submission of experiments programmatically (indeed, a tool for
that purpose was released openly by one of the users). Second, we
enable SSH access to containers in testing nodes for debugging
purposes. The users reported that debugging in batch mode was
otherwise very complex and tedious since every debug run had to
wait for scheduling, execution and retrieval of results. Finally, the
user access and scheduler system now supports rescheduling of past experiments on the same or different nodes.
7.2 Lessons Learned
To achieve the goals of the external users, we learned that a large pool of available experimental resources in MONROE is mandatory, while still giving experimenters strong control of the testing environment. For this reason, each user had access to a series of development nodes (for building the measurement tools in the lab) and testing nodes (for testing the measurement tools in a limited portion of the actual platform). We further decided to guarantee our external users exclusive usage of reserved resources.
Apart from offering experimenters a system ready to accommodate their measurements, we also encouraged external users to propose hardware extensions and enhancements to the platform. This allows us to grow the platform, increase its geographical footprint and engage with the stakeholders to create a relevant product. Fostering a community around the MONROE platform also means producing a rich variety of software measurement tools. Apart from the experiments we maintain within the consortium, we also encouraged external users to bring software extensions to the platform and contribute to the MONROE EaaS initiative. Many of our external users responded positively to this initiative and integrated new measurement software packages with MONROE, which they offer openly to the community. Finally, the MONROE platform aims to be complementary to other measurement infrastructures. Thus, it has been an important goal for us to be able to deploy measurement tools that also run on other hardware-based platforms or in crowdsourcing platforms. Furthermore, we collaborate with other publicly funded projects in need of mobile measurement infrastructure. An example is the Horizon 2020 project NEAT [22], which is planning to use MONROE to evaluate its software and API for optimized transport protocol and network selection.
8 CONCLUSIONS
In this paper, we reported on our experience designing an open
large-scale measurement platform for experimentation with commercial MBB networks. MONROE is a completely open system
allowing authenticated users to deploy their own custom experiments and conduct their research in the wild. The platform is crucial to understand, validate and ultimately improve how current operational MBB networks perform, towards providing guidelines
to the design of future 5G architectures. We described our experience with the MONROE system implementation and detailed the
hardware selection for the MONROE measurement node, its software ecosystem and the user access and scheduling solution. We
emphasized the versatility of the design we propose, both for the
overall platform and, more specifically, for the measurement nodes.
In fact, the node software design is compatible with a number of
different hardware implementations, given that it can run on any
Linux-compatible multihomed system. Our current hardware solution is the most fitting for the set of requirements and the predicted
usage of MONROE, which we evaluated based on our discussions
and interaction with the platform’s users.
ACKNOWLEDGMENTS
This work is funded by the EU H2020 research and innovation
programme under grant agreement No. 644399 (MONROE), and by
the Norwegian Research Council RFF project No. 245698 (NIMBUS).
For more information, visit https://www.monroe-project.eu/. The
authors would like to express their gratitude to the reviewers and,
particularly, to Aruna Balasubramanian, for their invaluable advice on improving this work.
REFERENCES
[1] Beagleboard.org. http://beagleboard.org.
[2] J. Border, M. Kojo, J. Griner, G. Montenegro, and Z. Shelby. 2001. Performance Enhancing Proxies Intended to Mitigate Link-Related Degradations. RFC 3135. (2001).
[3] CAIDA. Archipelago (Ark) Measurement Infrastructure. http://www.caida.org/projects/ark/.
[4] Compex. WLE600VX. http://www.pcengines.ch/wle600vx.htm.
[5] Viktor Farkas, Balázs Héder, and Szabolcs Nováczki. 2012. A Split Connection TCP Proxy in LTE Networks. In Information and Communication Technologies. Springer, 263–274.
[6] FCC. 2013. 2013 Measuring Broadband America February Report. Technical Report. FCC's Office of Engineering and Technology and Consumer and Governmental Affairs Bureau.
[7] FED4FIRE. http://www.fed4fire.eu/.
[8] Mah-Rukh Fida, Andra Lutu, Mahesh Marina, and Özgü Alay. 2017. ZipWeave: Towards Efficient and Reliable Measurement based Mobile Coverage Maps. In Proc. IEEE INFOCOM (May 2017).
[9] A. Finamore, M. Mellia, M. Meo, M. M. Munafo, P. D. Torino, and D. Rossi. 2011. Experiences of Internet traffic monitoring with Tstat. IEEE Network 25, 3 (May 2011), 8–14. DOI:https://doi.org/10.1109/MNET.2011.5772055.
[10] Matthias Hirth, Tobias Hoßfeld, Marco Mellia, Christian Schwartz, and Frank Lehrieder. 2015. Crowdsourced network measurements: Benefits and best practices. Computer Networks 90 (2015), 85–98.
[11] M. Ivanovich, P. W. Bickerdike, and J. C. Li. 2008. On TCP performance enhancing proxies in a wireless environment. IEEE Communications Magazine 46, 9 (September 2008), 76–83. DOI:https://doi.org/10.1109/MCOM.2008.4623710.
[12] JSON. http://www.json.org/.
[13] Christian Kreibich, Nicholas Weaver, Boris Nechaev, and Vern Paxson. 2010. Netalyzr: Illuminating the edge network. In Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement. ACM, 246–259.
[14] Iain R. Learmonth, Andra Lutu, Gorry Fairhurst, David Ros, and Özgü Alay. 2017. Path Transparency Measurements from the Mobile Edge with PATHspider. In Proc. of the IEEE/IFIP Workshop on Mobile Network Measurement.
[15] Iain R. Learmonth, Brian Trammell, Mirja Kühlewind, and Gorry Fairhurst. 2016. PATHspider: A tool for active measurement of path transparency. In Proceedings of the 2016 Applied Networking Research Workshop. 62–64.
[16] H2020 MAMI Project. Measurement and Architecture for a Middleboxed Internet. https://mami-project.eu/.
[17] MONROE. Open Source Code. https://github.com/MONROE-PROJECT.
[18] MONROE. Open-Source Experiments. https://github.com/MONROE-PROJECT/Experiments.
[19] MONROE. User Access Portal. https://www.monroe-system.eu.
[20] MONROE. User Manual. https://github.com/MONROE-PROJECT/UserManual.
[21] MONROE. WebWorks Experiment. https://github.com/MONROE-PROJECT/Experiments/tree/master/experiments/WebWorks.
[22] H2020 NEAT Project. A New, Evolutive API and Transport-Layer Architecture for the Internet. https://www.neat-project.org/.
[23] Marc C. Necker, Michael Scharf, and Andreas Weber. 2005. Performance of Different Proxy Concepts in UMTS Networks. Springer Berlin Heidelberg, Berlin, Heidelberg, 36–51. DOI:https://doi.org/10.1007/978-3-540-31963-4_4.
[24] Networld2020. 2016. Service Level Awareness and open multi-service internetworking - Principles and potentials of an evolved Internet ecosystem. (2016).
[25] Odroid. http://www.hardkernel.com.
[26] OOKLA. http://www.speedtest.net/.
[27] PATHspider-monroe. https://github.com/mami-project/pathspider-monroe.
[28] PC Engines. APU2C4. https://www.pcengines.ch/apu2c4.htm.
[29] PlanetLab. https://www.planet-lab.org/.
[30] Raspberry Pi. http://www.raspberrypi.org.
[31] RIPE Atlas. https://atlas.ripe.net/.
[32] Anika Schwind, Michael Seufert, Özgü Alay, Pedro Casas, Phuoc Tran-Gia, and Florian Wamser. 2017. Concept and Implementation of Video QoE Measurements in a Mobile Broadband Testbed. In Proc. of the IEEE/IFIP Workshop on Mobile Network Measurement.
[33] Sierra Wireless. MC7455 miniPCI express (USB 3.0) modem. https://www.sierrawireless.com/products-and-solutions/embedded-solutions/products/mc7455/.
[34] Narseo Vallina-Rodriguez, Srikanth Sundaresan, Christian Kreibich, Nicholas Weaver, and Vern Paxson. 2015. Beyond the radio: Illuminating the higher layers of mobile networks. In Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services. ACM, 375–387.
[35] Florian Wamser, Michael Seufert, Pedro Casas, Ralf Irmer, Phuoc Tran-Gia, and Raimund Schatz. 2015. YoMoApp: A Tool for Analyzing QoE of YouTube HTTP Adaptive Streaming in Mobile Networks. In European Conference on Networks and Communications (EuCNC).
[36] Zhaoguang Wang, Zhiyun Qian, Qiang Xu, Zhuoqing Mao, and Ming Zhang. 2011. An untold story of middleboxes in cellular networks. In Proc. of SIGCOMM.
[37] Xing Xu, Yurong Jiang, Tobias Flach, Ethan Katz-Bassett, David Choffnes, and Ramesh Govindan. 2015. Investigating Transparent Web Proxies in Cellular Networks. In Proc. of Passive and Active Measurement.
[38] Yepkit. YKUSH Switchable USB Hub. https://www.yepkit.com/products/ykush.
[39] Yomo-browser-plugin. https://github.com/lsinfo3/yomo-browser-plugin.git.
[40] Yomo-docker. https://github.com/lsinfo3/yomo-docker.git.
[41] ZeroMQ. http://zeromq.org/.
[42] ZTE. USB-based CAT4 MF910 MiFi, product specification. http://www.ztemobiles.com.au/downloads/User_guides/MF910_Help1.0.pdf.