With every new start come new challenges. And trust the beauty of new challenges: they are here to help us grow.
So let's discuss these challenges.
1. Some data is very specific to a particular user, and that can drag down overall model performance. We don't want the model to memorize rare data from a particular user.
Solution: a) Devise a mechanism to control the amount an individual user can contribute towards the overall result. b) Add noise to the more user-specific data. This is also referred to as Differential Privacy. I found this article quite intuitive.
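The two mechanisms above can be sketched together: clip each user's update so no single user dominates, then add noise before aggregation. This is a minimal illustration, not a full differential-privacy analysis; the function name and parameters are my own.

```python
import numpy as np

def clip_and_noise(update, clip_norm=1.0, noise_std=0.1, rng=None):
    """Clip a user's model update to a maximum L2 norm (bounding that
    user's contribution), then add Gaussian noise before aggregation."""
    rng = np.random.default_rng(0) if rng is None else rng
    update = np.asarray(update, dtype=float)
    norm = np.linalg.norm(update)
    if norm > clip_norm:
        update = update * (clip_norm / norm)  # scale down oversized updates
    return update + rng.normal(0.0, noise_std, size=update.shape)

# An outsized update from one user is clipped to the norm bound,
# so a single user cannot pull the aggregate arbitrarily far.
noisy = clip_and_noise([10.0, 0.0, 0.0])
print(np.linalg.norm(noisy))
```

The clipping bound is what limits how much any one user can contribute; the noise is what makes the released update less specific to that user's data.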
2. Now we have our new model formed from the aggregated results. But how can we see how the model performs on new data before rolling out the update?
Solution: Simple! We can apply the same train/validation split concept that we all know, except here the users themselves are our experiment! Sounds interesting, huh? We split users into training and validation groups: from the universal set of smartphone users, a small proportion validates the result while the rest train the model. So the model is tested on real, live data.
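The user-level split described above can be sketched in a few lines: instead of splitting samples, we split user IDs into a training pool and a small validation pool. The function name and fraction are illustrative assumptions.

```python
import random

def split_users(user_ids, val_fraction=0.1, seed=42):
    """Split users (not samples) into a training pool and a smaller
    validation pool, mirroring the user-level train/validation idea."""
    users = list(user_ids)
    random.Random(seed).shuffle(users)  # deterministic shuffle for the split
    n_val = max(1, int(len(users) * val_fraction))
    return users[n_val:], users[:n_val]  # (train_users, val_users)

# From 1000 users, 5% validate the aggregated model on their live data.
train_users, val_users = split_users(range(1000), val_fraction=0.05)
print(len(train_users), len(val_users))  # 950 50
```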
3. Can simple averaging work as the aggregation step for all algorithms? Let's explore this with two examples:
a) Let's take Normal Bayes (OpenCV). The mean vector and the covariance matrix are highly influenced by the number of samples per class. Now let's say we have two smartphones and a binary classification problem with classes A and B.
Where xk,i(j) represents the value of the i-th feature attribute of the j-th sample belonging to class k in the training sample, and the final n-dimensional (a total of n feature attributes) mean vector μk of class k is estimated as:

μk = (1/Nk) · Σ(j=1..Nk) xk(j)

where Nk is the number of training samples belonging to class k.
So the mean vector's values are highly influenced by the number of samples per class. Say user 1 holds 75% of class A's samples and user 2 holds the remaining 25%, so μA(1) is estimated from three times as much data as μA(2). When we merge them by taking the simple average 1/2(μA(1) + μA(2)), are we able to retain the information specific to the classes?
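A small numerical check makes the problem concrete. The per-user means and sample counts below are hypothetical: the simple average ignores how many samples backed each local mean, while a sample-count-weighted average reproduces the mean of the pooled data.

```python
import numpy as np

# Hypothetical class-A statistics: user 1 holds 75 class-A samples,
# user 2 holds 25, and their local means differ.
mu_1, n_1 = np.array([2.0, 2.0]), 75
mu_2, n_2 = np.array([6.0, 6.0]), 25

simple = (mu_1 + mu_2) / 2                           # ignores sample counts
weighted = (n_1 * mu_1 + n_2 * mu_2) / (n_1 + n_2)   # matches pooled data

print(simple)    # [4. 4.]
print(weighted)  # [3. 3.]
```

The pooled mean is (75·2 + 25·6)/100 = 3 per feature, so the simple average of 4 overweights the smaller user's data; weighting by sample count recovers the correct statistic.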
b) For algorithms like SVM, whose "weights" are nothing but the support vectors, which depend on the number of samples in the dataset, can we have a weight matrix of constant size across all users' results? We might need to devise aggregation algorithms specific to the machine learning task at hand.
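To see why a constant-size weight matrix matters, here is a sketch of FedAvg-style aggregation, which weights each user's parameter vector by its sample count. It only works because every user's vector has the same shape, which is exactly what a variable-size set of SVM support vectors does not give us. The function name and inputs are illustrative.

```python
import numpy as np

def fed_avg(weight_list, sample_counts):
    """FedAvg-style aggregation: average fixed-size weight vectors,
    weighting each user's vector by its number of training samples."""
    w = np.asarray(weight_list, dtype=float)   # shape: (n_users, n_params)
    n = np.asarray(sample_counts, dtype=float)
    return (w * n[:, None]).sum(axis=0) / n.sum()

# Two users with same-shaped weight vectors but different data sizes.
# np.asarray above would fail for ragged (SVM-like) inputs.
print(fed_avg([[1.0, 3.0], [3.0, 5.0]], [10, 30]))  # [2.5 4.5]
```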
4. Trade-off between Privacy and Accuracy:
Sometimes, to increase the privacy of users' data, noise is added, which makes the data deviate from its actual behaviour and thus results in some accuracy drop.
Federated Learning can solve a lot of problems related to user privacy while improving model performance for better recommendations. This is a fairly new domain, and further research can solve many of the challenges it still faces.