Datatonic + Vodafone: AI-Powered 5G
How to Forecast Mobile Network Traffic Using Machine Learning
5G technology is a game-changer. It will enable unprecedented high-speed internet on the mobile network. For telecom providers, this new technology represents both a huge business opportunity and a challenge in network operations. The increase in flexibility and in complexity for network operations creates favourable grounds for AI. In this post, we present the work done by Datatonic and Vodafone Group R&D to build traffic forecasting ML models, which is a key stepping stone towards an AI-powered 5G network.
How 5G is a game-changer
5G is here, and it is a game-changer in the telco industry and beyond. 5G technology will deliver mobile high-speed internet and very low latency, and allow seamless connection to a huge number of devices on networks. This means high-speed internet on the move, for all users – both human and machine.
For telecom providers, 5G technology offers both great opportunities and great challenges to deliver a service that exceeds customers' expectations. In terms of opportunities, 5G will unlock value across a wide range of new use cases, beyond current imagination. To mention just a few examples: massive IoT, connected vehicles, and mission-critical health [1,2]. On the other hand, the challenges are to leverage the flexibility of 5G virtual radio and core networks, while managing the complexity brought by the predicted huge number of devices and the mix of physical and virtual network assets. Operating and optimising such complex networks becomes a superhuman task, and this is where AI becomes a critical part of 5G.
Vodafone has embraced the opportunities presented by 5G and telco digital transformation, building on four pillars: 1. Next-generation network infrastructure, 2. Open digital architecture, 3. Big Data & AI and 4. Cloud technology.
Such pillars can eventually drive a transformation in business objectives:
- Deepening customer engagement: customer-centric operations and the highest quality of service on the network will lead to high customer satisfaction and high retention
- Efficient and autonomous networks: the software-based 5G network, powered by AI, will reduce the effort required to operate the network, prevent issues, shorten the time to detect and fix anomalies, and improve asset utilisation, overall reducing operating costs
- Growing offerings in B2B and B2C: network virtualisation and massive IoT connectivity will unlock new services and business models, generating new revenue streams.
“Companies like Vodafone are already unleashing the power of Big Data, Analytics and Cloud in many areas. However, much more is to come. It will make us more customer-centric and the business more efficient.”
– Luke Ibbetson, Director of Vodafone Group R&D.
AI-powered traffic forecasting model
In the context of Vodafone’s introduction of an open digital infrastructure, Datatonic built and delivered an ML solution to forecast internet traffic on a mobile network of a large European city, using anonymised open-source data.
The project aimed at the following main objectives:
- Assess the feasibility of building a single ML model to forecast traffic across a network of 10,000 location points,
- Demonstrate how Google Cloud Technology can accelerate the development of ML solutions,
- Explore further architectural advances and steps towards the realization of different use cases.
The benefit of using Google Cloud Platform is the power of managed and scalable services to accelerate development. Those services enable data scientists to focus on the essentials when prototyping new ML solutions: the approach and the models, instead of setting up resources and environments.
The following Google Cloud Platform components were used in our work. The raw historical data was uploaded to Google Cloud Storage, a scalable blob storage service. Dataflow processed and transferred the raw data at scale into BigQuery. BigQuery was used to explore and transform the data using SQL, in preparation for ingestion by an ML model. AI Platform Notebooks provided a scalable, managed Jupyter notebook service for prototyping ML models using Python and TensorFlow, enabling us to easily switch between CPUs and GPUs depending on the workload. The ML models were trained and served using AI Platform training and prediction, respectively.
Finally, the predictions of the model were stored into BigQuery to enable deeper analysis of predictions and model performance across the network.
Step 1: Data exploration and business insights
We started with a thorough exploration of the data. With unsupervised ML techniques, we can identify various usage patterns across the different locations in the network. For example, some zones exhibited peaks of traffic just before and just after local business hours, corresponding to workers commuting to and from work. Other zones showed high traffic during local office hours that dropped significantly outside those hours and on the weekends, identifying those regions as business districts. Similar methods helped identify regions as residential areas, suburban areas, and nightlife districts.
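The kind of clustering described above can be sketched with a minimal k-means over average 24-hour traffic profiles. This is an illustrative sketch on synthetic data (the profile shapes, location counts and noise level are all assumptions, not the Telecom Italia dataset), but it shows how a midday-peaking business district and a late-evening nightlife district end up in different clusters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 24-hour traffic profiles for two kinds of zones (illustrative data):
# business districts peak around midday, nightlife districts peak late evening.
hours = np.arange(24)
business = np.exp(-0.5 * ((hours - 13) / 3.0) ** 2)
nightlife = np.exp(-0.5 * ((hours - 22) / 2.0) ** 2)
profiles = np.vstack(
    [business + 0.05 * rng.standard_normal(24) for _ in range(5)]
    + [nightlife + 0.05 * rng.standard_normal(24) for _ in range(5)]
)

def kmeans(X, k=2, iters=20):
    """Minimal k-means; initial centroids seeded deterministically
    from the first and last profiles for reproducibility."""
    centroids = X[[0, len(X) - 1]].copy()
    for _ in range(iters):
        # Assign each profile to its nearest centroid (squared distance).
        labels = np.argmin(((X[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
        # Move each centroid to the mean of its assigned profiles.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels

labels = kmeans(profiles)
```

With clearly separated daily shapes, the two groups of locations receive distinct cluster labels, which is the behavioural segmentation the exploration step relies on.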
Step 2: Development of ML forecasting models
After identifying areas that exhibit different behavioural patterns of network usage, the next natural task is to address forecasting.
It is good practice to start development by defining a baseline. The baseline helps put the performance of complex machine learning models into context. We decided to use a persistence model, also called a naive forecast. The persistence model uses the value at the previous time step to predict the value at the next time step. It is often used as a baseline in time series forecasting: it is simple to implement, fast at inference time, and performs well as a first approximation.
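A persistence baseline, together with the MAE and MSE metrics used to score the models, fits in a few lines. The traffic values below are made up for illustration; only the mechanics match the baseline described above.

```python
import numpy as np

def persistence_forecast(series):
    """Naive forecast: predict y[t] with the observed value y[t-1]."""
    return series[:-1]

# Toy hourly traffic series (illustrative values, not real network data).
traffic = np.array([10.0, 12.0, 11.0, 15.0, 14.0])
preds = persistence_forecast(traffic)   # forecasts for t = 1..4
actuals = traffic[1:]

# The two summary metrics used to compare models in this project.
mae = np.mean(np.abs(actuals - preds))
mse = np.mean((actuals - preds) ** 2)
```

Any learned model worth deploying has to beat these numbers; that is what makes the 25% MAE improvement reported later meaningful.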
An extensive comparison of model performance for our use case is not straightforward, because the data has both spatial and temporal dimensions. The errors for each model were calculated using the predictions over a test set representing one week of unseen data, across the whole network. For simplicity, we decided to reduce the performance of a model to two values: the average (over one week, and over 10,000 locations) of the mean absolute error (MAE), and the average of the mean squared error (MSE).
The first model built was a multilayer perceptron (MLP) neural network. This approach relies on data scientists to engineer features that capture specific insights observed in the data and to feed these features directly as input to the model. Time series tend to have seasonality (monthly, weekly, daily patterns) which, if captured in features, improves model performance. We thus engineered features related to the date (day of the week) and time (hour of the day) and created embeddings for these features. Then we created lag features: we used the historical values of the target variable (internet usage) three hours before the prediction time, one day before, and one week before. Finally, we defined a categorical feature to represent each location in the network. Leveraging those features, the trained model predicts the internet usage at the location and time of day defined by the input features.
The second model we explored was a densenet neural network. The approach, in this case, is to consider the network as an “image” and try to predict how the next frame (i.e. the values at each node of the network in 1 hour) will look. With this approach, the only inputs are sequences of network internet usage. The data scientists rely entirely on the model to find hidden patterns in the data and are not required to create features (which can introduce insights as well as bias) to feed to the model.
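The data preparation for this “next frame” framing can be sketched with numpy. The grid size and sequence length below are assumptions for illustration (10,000 locations happen to tile a 100 × 100 grid), and the frames are random stand-ins for traffic snapshots.

```python
import numpy as np

# Illustrative setup: 10,000 locations arranged on a 100 x 100 grid, with one
# hourly traffic snapshot per frame (synthetic values, not real data).
n_hours, grid = 30, 100
frames = np.random.default_rng(0).random((n_hours, grid, grid)).astype("float32")

def make_frame_samples(frames, seq_len=6):
    """Turn a frame sequence into (input frames, next frame) training pairs,
    the way video prediction models are fed."""
    X, y = [], []
    for t in range(len(frames) - seq_len):
        X.append(frames[t:t + seq_len])   # the last `seq_len` hours
        y.append(frames[t + seq_len])     # the frame one hour ahead
    return np.stack(X), np.stack(y)

X, y = make_frame_samples(frames)
```

A convolutional model such as a densenet then learns spatial patterns across the grid directly, with no hand-crafted features.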
Both approaches led to similar performance, and both exhibited a significant improvement over standard time series forecasting baselines (we observed a 25% decrease in mean absolute error compared to the naive persistence model). They also reached performance similar to SARIMA, a well-known statistical method for time series forecasting. However, the benefit of our models over SARIMA is scalability: a single model generates predictions for all 10,000 locations, whereas SARIMA requires one model per location (i.e. 10,000 models to cover the whole network!).
Step 3: Leverage ML model insights to take actions
Beyond model performance considerations, it is important to realise that the prediction of internet traffic on the network, as provided by our model, is not an end in itself, but rather a key stepping stone to leveraging 5G technology. It is crucial to define the use cases those predictions will support, and the impact and value that will result from them. In other words, traffic prediction does not have a business impact on its own. It is how you act on these predictions that defines the impact and the value for your business.
“The telecoms industry is envisaging impact in multiple areas, including closed loop network automation. This is not an easy feat to achieve: from network telemetry data to predictions to automated decision making concerning how to adjust the right knobs in the network. Once this works, it will be magic.”
– Guenter Klas, who leads the AI Research Cluster in Vodafone Group R&D.
“Working closely with the experts from Datatonic, we have been able to achieve our targets very fast and as desired.”
The goal for the traffic forecasting solution is to use AI to enable optimal provisioning of resources on the network at large scale, and thus minimise operating costs and energy consumption for a complex network. For this specific use case, the model's predictions will be used to suggest when network resources (e.g. bandwidth) should be upscaled or downscaled, striking the right balance between giving customers excellent, lag-free access to their services and avoiding the waste of providing far more network capacity than needed.
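A minimal sketch of such a scaling suggestion might look like the following. The thresholds, function name and the idea of a simple hysteresis band are our own illustrative assumptions, not the production decision logic.

```python
def scaling_action(forecast_load, capacity, high=0.8, low=0.3):
    """Suggest a capacity action from a forecast load (hypothetical thresholds).

    Upscale when the forecast would push utilisation above `high`,
    downscale when it falls below `low`, otherwise hold steady. The gap
    between the thresholds avoids flip-flopping on small fluctuations.
    """
    utilisation = forecast_load / capacity
    if utilisation > high:
        return "upscale"
    if utilisation < low:
        return "downscale"
    return "hold"
```

Fed with the hour-ahead forecasts from the models above, a rule of this shape turns predictions into the concrete up/downscaling suggestions the text describes.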
While this solution initially aims to act as an assistant for human decision-making, the next step is to feed its insights into a second layer that automates the decisions on how much to upscale, downscale or otherwise adjust resources according to the forecast. Such solutions will enable a transition to closed-loop, zero-touch network operation, where humans monitor the system with minimal required intervention.
Datatonic and Vodafone developed a 100% Google Cloud-native set of ML models for cellular internet traffic forecasting. This AI-enabled traffic forecasting capability is the beginning of the journey to closed-loop autonomous operation. One step at a time, this path will lead to automated, highly optimised network operations, where humans can rely on AI to assist them in running incredibly complex systems, helping Vodafone “connect for a better future”. Humans will then have more time to create new services, new business models and new ecosystems to make the most of the new technology and to offer new and unique customer experiences, beyond our current imagination.
References
- The Telecom Italia Big Data Challenge, Link to dataset
- Citywide Cellular Traffic Prediction Based on a Hybrid Spatiotemporal Network