Unlocking Inmarsat’s Data Potential with Innovative Aircraft Matching

Client

Inmarsat

Tech stack

Google Cloud

Solution

AI for operations

Service

AI + Machine Learning

Inmarsat, the world leader in global mobile satellite communication, works closely with Datatonic to enhance its data, analytics and machine learning capabilities. One of the main projects in this partnership is an innovative aircraft matching solution to maximise the value of Inmarsat’s aircraft data from more than 17,000 aircraft. The solution is a highly scalable, automated pipeline on Google Cloud to match Inmarsat’s unique hardware IDs to external satellite aircraft data. This enables Inmarsat to get accurate and up-to-date aircraft tail numbers and, as a result, Inmarsat has gained key business insights and a deeper understanding of their product offering.

Our impact

  • Achieved a greater than 93% match rate between unique hardware IDs and aircraft, with Inmarsat Aviation’s first solution of this kind.
  • Implemented and productionised a highly scalable and fully automated processing pipeline in Google Cloud that runs on a daily basis.
  • Pioneered the use of Workflows to orchestrate Google Cloud pipelines.

 

The challenge

Inmarsat, a world-leading British satellite telecommunications company, provides telephone and data services to users worldwide using portable or mobile terminals, which communicate with ground stations through satellites.

Historically, Inmarsat has been reliant on manually-entered and maintained data for the aircraft using its products and services. When aircraft are transferred, sold or re-registered, these changes aren’t always captured, resulting in a database of customer information that becomes outdated over time. Inmarsat recognised the value of using extensive internal data to improve insights and its product offering. However, this was only possible once the unique ID of their onboard hardware could be matched against the correct aircraft.

Once matched, Inmarsat can enrich its dataset to extract powerful insights and improve its product offerings.

“We received positive feedback from Inmarsat’s Maritime team about Datatonic, which had worked with them on a similar project to match vessel names and IDs. We were keen to see if this success could be repeated for the Aviation business unit.

The Datatonic team worked seamlessly with Inmarsat’s own internal teams, particularly IT and data engineering. They adapted to Inmarsat’s complex security and system requirements, demonstrating Datatonic’s knowledge as subject matter experts, and delivered an outcome that surpassed expectations.” – Jeremy Jackson, Lead Insight Partner, Inmarsat

 

Our solution

Over the course of 10 weeks, Datatonic designed and built a scalable data processing pipeline on Google Cloud Platform. The platform has allowed Inmarsat to match its unique hardware IDs to commercially registered aircraft with high accuracy through satellite geolocation data on a daily basis.

The project was split into two phases: first, a data science proof-of-concept to assess the feasibility and design the matching algorithm, followed by a production phase, to orchestrate and automate the end-to-end solution.

As part of the solution, our team of data scientists and data engineers built the following:

  1. An SQL pipeline in BigQuery to clean, process and match the internal and external data sources. It takes the raw datasets through to final tables that business users can query and use in reporting.
  2. A matching algorithm tailored to the particularities of aircraft, which identifies aircraft and unique hardware IDs that were co-located at a given minute. Custom confidence metrics were designed to assess the quality of all candidate matches, filter out inaccurate matches, and report on match performance.
  3. Tableau dashboards, built on top of the reporting tables, to analyse and monitor matching performance over time.
  4. An end-to-end data processing pipeline that runs in BigQuery and is orchestrated with Google’s new Workflows tool to execute daily, upon receiving new data. Continuous monitoring, alerting and testing were also incorporated into the solution for robustness.
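The co-location idea in item 2 can be illustrated with a minimal Python sketch. It is not the production algorithm, which ran as SQL in BigQuery: the 10 km distance threshold, the 0.8 confidence cut-off, the confidence definition (share of a terminal's observed minutes spent next to the candidate aircraft) and all function and identifier names are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def to_minute(ts):
    """Truncate a timestamp to the minute so pings can be bucketed."""
    return ts.replace(second=0, microsecond=0)


def match_hardware_to_aircraft(terminal_pings, aircraft_pings,
                               max_km=10.0, min_confidence=0.8):
    """Return {hardware_id: (tail_number, confidence)} for confident matches.

    Every ping is a tuple (identifier, timestamp, lat, lon).
    Thresholds and the confidence metric are hypothetical.
    """
    # Index the external aircraft positions by minute.
    aircraft_by_minute = defaultdict(list)
    for tail, ts, lat, lon in aircraft_pings:
        aircraft_by_minute[to_minute(ts)].append((tail, lat, lon))

    observed = defaultdict(set)                        # hw_id -> minutes seen
    colocated = defaultdict(lambda: defaultdict(set))  # hw_id -> tail -> minutes
    for hw_id, ts, lat, lon in terminal_pings:
        minute = to_minute(ts)
        observed[hw_id].add(minute)
        for tail, a_lat, a_lon in aircraft_by_minute.get(minute, []):
            if haversine_km(lat, lon, a_lat, a_lon) <= max_km:
                colocated[hw_id][tail].add(minute)

    matches = {}
    for hw_id, candidates in colocated.items():
        # Best candidate: the aircraft sharing the most minutes with the terminal.
        tail, minutes = max(candidates.items(), key=lambda kv: len(kv[1]))
        confidence = len(minutes) / len(observed[hw_id])
        if confidence >= min_confidence:  # filter out weak matches
            matches[hw_id] = (tail, confidence)
    return matches


# Made-up example data: HW1 follows TAIL-A for three minutes, HW2 is only
# briefly near TAIL-B and is therefore filtered out.
t0 = datetime(2021, 1, 1, 12, 0, 30)
t1 = datetime(2021, 1, 1, 12, 1, 10)
t2 = datetime(2021, 1, 1, 12, 2, 45)
example = match_hardware_to_aircraft(
    terminal_pings=[("HW1", t0, 51.0, 0.0), ("HW1", t1, 51.1, 0.1),
                    ("HW1", t2, 51.2, 0.2), ("HW2", t0, 40.0, 10.0),
                    ("HW2", t1, 45.0, 15.0)],
    aircraft_pings=[("TAIL-A", t0, 51.01, 0.0), ("TAIL-A", t1, 51.1, 0.11),
                    ("TAIL-A", t2, 51.2, 0.2), ("TAIL-B", t0, 40.0, 10.0),
                    ("TAIL-B", t1, 40.1, 10.1)],
)
```

Counting distinct co-located minutes, rather than raw pings, keeps a terminal that reports many times per minute from inflating its own score.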

Cloud Workflows was chosen instead of Cloud Composer as the orchestrator for two reasons: 1) Workflows is a fully serverless technology, which makes it easy to deploy, and it includes all the necessary monitoring options; 2) the Workflows billing model is usage-based (per step executed), making it a good fit for a solution that only needs to run when new data arrives.

Matching terminal data to commercially registered aircraft daily and with high accuracy, using satellite geolocation data, has enabled Inmarsat to enrich its dataset, extract powerful business insights and improve its product offerings.

“The data output from this project has been integrated into all of our key reports and dashboards. It has facilitated a number of planned projects and could enable future projects that haven’t yet been conceived.” – Jeremy Jackson, Lead Insight Partner, Inmarsat

 

About Cloud Workflows

Workflows (see more here) is a fully managed serverless product that calls a sequence of services in a stateful, durable execution. It enables the orchestration and automation of Google Cloud and HTTP-based API services.

Workflows allows us to define the flow of the business logic in a YAML definition file, with the capacity to automate complex processes, including batch and event-driven jobs, error handling and sequences of operations between services: take the outputs of one and route them into the inputs of another, define conditions, have steps that wait or poll, perform retries, and more. Workflows is particularly helpful with Google Cloud services that perform long-running operations, as Workflows will wait for them to complete, even if they take hours or days.
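To give a feel for what such a definition looks like, here is a hedged Python sketch that assembles one as a plain dictionary and serialises it to JSON, which Workflows accepts as an alternative to YAML. The step names, the `run_query` sub-workflow and the example queries are hypothetical and are not taken from Inmarsat's pipeline; the BigQuery connector call is shown in its simplest form.

```python
import json


def build_daily_workflow(project_id, queries):
    """Build a Workflows definition as a dict.

    `queries` maps a short, hypothetical stage name to the SQL it runs.
    Each stage becomes one step in `main` that calls a shared sub-workflow.
    """
    main_steps = []
    for name, sql in queries.items():
        main_steps.append({
            f"run_{name}": {
                "call": "run_query",             # sub-workflow defined below
                "args": {"project_id": project_id, "sql": sql},
                "result": f"{name}_job",         # output kept for later steps
            }
        })
    main_steps.append({"finish": {"return": "pipeline finished"}})

    return {
        "main": {"steps": main_steps},
        "run_query": {
            "params": ["project_id", "sql"],
            "steps": [
                {"insert_job": {
                    # BigQuery connector: waits for the query job to complete.
                    "call": "googleapis.bigquery.v2.jobs.insert",
                    "args": {
                        "projectId": "${project_id}",
                        "body": {"configuration": {"query": {
                            "query": "${sql}",
                            "useLegacySql": False,
                        }}},
                    },
                    "result": "job",
                }},
                {"return_job": {"return": "${job}"}},
            ],
        },
    }


definition = build_daily_workflow(
    "my-project",  # placeholder project ID
    {"clean": "SELECT 1", "match": "SELECT 2", "report": "SELECT 3"},
)
definition_json = json.dumps(definition, indent=2)
```

Generating the steps from a mapping keeps step names consistent, which matters once a definition grows to many stages.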

Interacting with Google services has become even easier thanks to connectors. Connectors simplify calling services because they handle the formatting of requests for us and behave like a function call in which the details are abstracted away, so we don’t need to know the details of a Google Cloud API or worry about retries, polling, authentication, message formatting, etc. In practice, we only need to know whether the operation has completed and whether it succeeded or failed.
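For comparison, the sketch below shows the kind of polling loop we would otherwise have to write ourselves when waiting on a long-running operation. It is a generic Python illustration, not how connectors are implemented internally; `get_status`, the status strings and the backoff parameters are all assumptions.

```python
import time


def wait_for_operation(get_status, poll_interval=1.0, backoff=2.0,
                       max_interval=60.0, timeout=3600.0, sleep=time.sleep):
    """Poll until an operation finishes; return True on success, False on failure.

    `get_status` is a hypothetical callable returning "RUNNING", "DONE" or
    "FAILED". The wait between polls grows exponentially up to `max_interval`.
    """
    waited = 0.0
    interval = poll_interval
    while True:
        status = get_status()
        if status == "DONE":
            return True
        if status == "FAILED":
            return False
        if waited >= timeout:
            raise TimeoutError("operation did not finish in time")
        sleep(interval)
        waited += interval
        interval = min(interval * backoff, max_interval)


# Simulated operation that completes on the third poll; sleeps are recorded
# instead of actually waiting.
statuses = iter(["RUNNING", "RUNNING", "DONE"])
recorded_sleeps = []
succeeded = wait_for_operation(lambda: next(statuses), sleep=recorded_sleeps.append)
```

With a connector, all of this collapses into a single step whose result is available once the operation has finished.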

However, scaling Workflows can become complex when working with a declarative YAML syntax. With an organised and structured Terraform implementation based on workflows and sub-workflows, and an intuitive naming convention, it is possible to define a scalable framework that accommodates the future growth of solutions.

If you need help with your Workflows implementation, reach out to our team here.