A brief introduction to Federated Learning — FL Series Part 1

Dec 14, 2021

Federated learning was first introduced by Google in 2017 (1) to improve text prediction on mobile keyboards using machine learning models trained on data spread across many devices. This branch of machine learning has been sought after ever since because it doesn’t require uploading personal data to a central server to train the models, a breakthrough over traditional machine learning in addressing data privacy issues.

In the illustration (Figure 1), different mobile devices train their local models using private data residing on the devices (A), and the parameters or updates generated by each trained model are then aggregated (B) to improve the global model (C). The process repeats until a model of the desired quality is attained. Federated learning helps resolve the conflict between data privacy concerns and data sharing needs, as it sends the models to the data rather than the other way around.
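The train, aggregate, and repeat loop described above can be sketched in a few lines of Python. This is a minimal illustration in the spirit of federated averaging, not the production algorithm: the tiny linear model, the synthetic data following y = 2x + 1, and all function names (`local_update`, `aggregate`, `federated_training`, `make_device_data`) are assumptions made for this example.

```python
import random

def local_update(weights, data, lr=0.05, epochs=5):
    """Step A: train a 1-D linear model y = w*x + b on one device's private data."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)

def aggregate(updates, sizes):
    """Step B: average the local parameters, weighted by local sample count."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)

def federated_training(device_data, rounds=50):
    """Step C: repeat local training and aggregation to improve the global model."""
    global_model = (0.0, 0.0)
    for _ in range(rounds):
        # Only model parameters leave each device; the raw (x, y) pairs never do.
        updates = [local_update(global_model, d) for d in device_data]
        global_model = aggregate(updates, [len(d) for d in device_data])
    return global_model

def make_device_data(rng, n):
    """Hypothetical private data following y = 2x + 1 plus a little noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        data.append((x, 2 * x + 1 + rng.gauss(0, 0.01)))
    return data

rng = random.Random(0)
devices = [make_device_data(rng, n) for n in (20, 30, 50)]
w, b = federated_training(devices)
print(round(w, 2), round(b, 2))
```

The server in this sketch only ever sees the `(w, b)` pairs returned by each device, which is the point of the approach: the model travels to the data.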

Federated learning was initially used by Google to solve business-to-customer problems, but was later developed further by other industry pioneers for a wider range of scenarios. WeBank played an important role in advancing the technology by categorizing federated learning and introducing it to China in 2018. Since then, we have been exploring its potential in business-to-business settings, especially financial applications (2).

Yang Qiang, the Chief AI Officer of WeBank, uses an analogy to explain how federated learning works (Figure 2). If machine learning models are sheep, the data is grass. A sheep farmer can buy tons of hay and move it to a central spot to feed the sheep. He can raise the livestock that way, but it’s not ideal, because accumulating and moving the feed costs money and carries risk, just as collecting and transferring data to a central server poses security risks. Privacy concerns and regulations restrict data movement, so in the analogy the farmer may no longer be able to bring the feed to one place. However, if the farmer instead grazes the sheep in multiple fields, he no longer needs to bear the risk of moving the feed. In the same way, user privacy protection is improved by training models where the data sources are, rather than feeding the raw data to a centralized server.

A team of researchers and AI engineers at WeBank has devoted itself to advancing vertical federated learning, applying it to support financial inclusion, and to practicing federated transfer learning in B2B applications (3). Let’s look at the basic types of federated learning before taking a deep dive into real business use cases.

Based on how features and sample IDs are distributed across the isolated ("island") datasets, federated learning is categorized into horizontal federated learning, vertical federated learning, and federated transfer learning. You may see other types of federated learning defined by other advocates, to name a few: cross-silo, cross-device, model-centric, and data-centric. Some say the terms matter little in the art of engineering solutions. However, the names reflect the demands of real-world use cases and the various approaches to applying the new technology.
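To make the categorization concrete, here is a small sketch that looks at how much two parties overlap in features and in sample IDs and suggests which of the three types fits. The function names, the Jaccard overlap measure, and the 0.5 threshold are all assumptions chosen for illustration; in practice the decision is a judgment call rather than a fixed cutoff.

```python
def overlap(a, b):
    """Jaccard overlap between two collections (0.0 = disjoint, 1.0 = identical)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def suggest_fl_type(features_a, ids_a, features_b, ids_b, threshold=0.5):
    """Suggest a federated learning type from feature and sample-ID overlap.

    The threshold is an arbitrary illustrative value, not a standard.
    """
    feature_overlap = overlap(features_a, features_b)
    sample_overlap = overlap(ids_a, ids_b)
    if feature_overlap >= threshold and sample_overlap < threshold:
        return "horizontal"   # same features, different samples
    if sample_overlap >= threshold and feature_overlap < threshold:
        return "vertical"     # same samples, different features
    if feature_overlap < threshold and sample_overlap < threshold:
        return "transfer"     # little overlap in either dimension
    return "horizontal or vertical"  # both overlaps are high

bank_features = {"income", "expenses", "credit_history"}
retail_features = {"browsing_history", "purchase_history"}

# Two banks in different cities: same features, almost no shared customers.
two_banks = suggest_fl_type(bank_features, range(0, 100), bank_features, range(95, 200))
# A bank and a local retailer: different features, mostly shared customers.
bank_and_local_retailer = suggest_fl_type(bank_features, range(0, 100), retail_features, range(10, 110))
# A bank and a global retailer: different features, almost no shared customers.
bank_and_global_retailer = suggest_fl_type(bank_features, range(0, 100), retail_features, range(98, 300))

print(two_banks, bank_and_local_retailer, bank_and_global_retailer)
```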

*Figure 3, Horizontal Federated Learning*

Horizontal federated learning (Figure 3) applies, unsurprisingly, to horizontally partitioned data: the available datasets share a consistent set of features but differ in samples. For instance, one bank in Shanghai and another in Singapore that offer similar financial services online may have different user groups because of where they operate. The data features are almost identical, but the intersection of their user sets is very small. In this case, each bank trains its model locally and sends encrypted gradients to the server, which aggregates them to train a shared model. Each bank then receives the updated model from the server (3).
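As a rough illustration of how a server can aggregate gradients without seeing any single bank's values, the sketch below uses pairwise additive masking as a simple stand-in for encryption. This is a swapped-in technique chosen for brevity, not the scheme described in (3); real deployments rely on methods such as homomorphic encryption or secure aggregation protocols. The third bank, the gradient values, and the function names are hypothetical.

```python
import random

def pairwise_masks(num_parties, dim, rng):
    """Build one mask vector per party such that all masks sum to zero.

    Each pair of parties (i, j) agrees on a random vector; i adds it and
    j subtracts it, so the masks cancel once the server sums everything.
    """
    masks = [[0.0] * dim for _ in range(num_parties)]
    for i in range(num_parties):
        for j in range(i + 1, num_parties):
            shared = [rng.uniform(-100, 100) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] += shared[k]
                masks[j][k] -= shared[k]
    return masks

def server_average(masked_gradients):
    """The server averages the masked vectors; the masks cancel in the sum."""
    n = len(masked_gradients)
    return [sum(column) / n for column in zip(*masked_gradients)]

rng = random.Random(42)
# Hypothetical local gradients from three banks with the same model layout.
gradients = [[0.2, -0.1], [0.4, 0.3], [0.6, 0.1]]
masks = pairwise_masks(len(gradients), 2, rng)
# Each bank uploads only its masked gradient, never the raw one.
masked = [[g + m for g, m in zip(grad, mask)] for grad, mask in zip(gradients, masks)]
avg = server_average(masked)
print([round(v, 6) for v in avg])
```

Each uploaded vector looks like noise on its own, yet the average matches what the server would have computed from the raw gradients, up to floating-point rounding.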

Vertical federated learning (Figure 4) is very exciting for heavily scrutinized banks, since it allows them to collaborate with non-banking firms to offer better-personalized services without compromising privacy. Vertical federated learning applies to cases where the datasets cover the same samples but have very different features. A bank in Shanghai seeking data collaboration with a local e-commerce platform can use vertical federated learning to create a prediction model for financial product purchases. The bank records customers’ incomes, expenses, and credit histories, while the online retailer holds the same clients’ browsing and purchasing histories, all of which are sensitive personal data. In this case, vertical federated learning is applied to aggregate the different features and compute the training loss and gradients in a privacy-preserving manner (3). As illustrated in Figure 4, encrypted user IDs are first aligned to identify the clients that the bank and the e-commerce company in Shanghai have in common. A collaborator is needed in the system to handle the encryption. The bank and the online retailer never share raw data with each other; that’s the essence of privacy-enhancing technology in the data-hungry age. Instead, they send encrypted, masked gradients to the collaborator, who returns the decrypted gradients and loss to the two parties so they can update their models.
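The ID alignment step can be pictured with the simplified sketch below, in which both parties blind their customer IDs with a salted hash and compare only the digests. Please treat this as a teaching aid under loud assumptions: salted hashing is not a secure private set intersection (low-entropy IDs can be guessed by brute force), and real systems use dedicated cryptographic protocols for this step. The salt, the IDs, and the function names are hypothetical.

```python
import hashlib

def blind_ids(ids, shared_salt):
    """Map each raw ID to a salted SHA-256 digest so raw IDs are not exchanged.

    NOTE: illustrative only; this is not a secure private set intersection.
    """
    return {
        hashlib.sha256((shared_salt + uid).encode("utf-8")).hexdigest(): uid
        for uid in ids
    }

def align(own_blinded, other_digests):
    """Return the raw IDs this party holds whose digests the other party also holds."""
    return sorted(uid for digest, uid in own_blinded.items() if digest in other_digests)

salt = "hypothetical-shared-secret"
bank_ids = ["u001", "u002", "u003", "u004"]
retailer_ids = ["u003", "u004", "u005"]

bank_blinded = blind_ids(bank_ids, salt)
retailer_blinded = blind_ids(retailer_ids, salt)

# Each side receives only the other side's digests, not the raw IDs.
bank_common = align(bank_blinded, set(retailer_blinded))
retailer_common = align(retailer_blinded, set(bank_blinded))
print(bank_common, retailer_common)
```

Both parties end up with the same list of shared customers, which defines the rows on which the vertically split features can then be trained together.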

What if companies from different industries in different geographical areas seek data collaboration, but the datasets they can contribute have limited overlap in both features and users? Neither horizontal nor vertical federated learning is helpful under such circumstances, and that’s where federated transfer learning comes in (Figure 5). Suppose we have a global online retailer and a bank in Shanghai. They have almost no overlapping customers, meaning almost completely different samples, because they operate in different regions. The intersection of their customer features is also extremely small, since they are in different industries. Federated transfer learning is helpful in this case because it uses knowledge from a pre-trained model to fill in missing labels, expanding the scale of the available data. Federated transfer learning is the combination of vertical federated learning and transfer learning.

As Figure 5 shows, the data from the global retailer (A) and the bank (B) share only a small overlap in samples. Federated transfer learning optimizes the prediction model for the target-domain party (B in this case) by leveraging knowledge from the source-domain party (A in this case), which it does by learning a common feature representation between A and B.

This article has touched upon what federated learning is and its three fundamental types. We will talk about why federated learning will become the key to unlocking the value of data in the digital era and its potential applications in different industries in the next article of this federated learning series.

Story: Sookie Tao

Editor: Lilith Hu

References:

1. Communication-Efficient Learning of Deep Networks from Decentralized Data, https://arxiv.org/pdf/1602.05629.pdf

2. 2021 Global Federal Learning Research and Application, http://blog.kurokoz.com/reports/2021-federal-learning-global-research-and-application-trends-report

3. Federated Machine Learning: Concept and Applications, https://www.arxiv-vanity.com/papers/1902.04885/

4. Federated Learning, https://fate.fedai.org/2020/03/10/the-webank-ai-group-present-the-first-monograph-on-federated-learning-2/

5. What is Federated AI?, https://www.digfingroup.com/what-is-federated-ai/

Written by OPEN ZONE