Federated analytics and the future
Sebastian Ifeanyi Obeta
Cambridge University, United Kingdom
J Comput Eng Inf Technol
Abstract
It is ironic that the geometric increase in data and big data creates a bottleneck for researchers in some fields, such as genomics, where complaints centre on a lack of data. This suggests that big data needs to be pulled into one central location, or that an extensive data storage infrastructure is required. In many applications, the data generated sit in different silos, creating significant concerns about privacy and confidentiality. The lack of data leads to strategies such as data augmentation and synthetic data generation to improve the performance and results of machine learning models. Machine learning models generally perform better and are more accurate when the dataset is rich and sufficient. On the other hand, machine learning frequently displays unexpected and startling behaviours, and bias creates unfairness in machine-learned models. If the original dataset has biases, the data supplemented from it will also have biases. As a result, determining the best data augmentation approach is critical. Regarding the practical applicability of decentralised machine learning schemes, many significant algorithmic questions still need to be answered. Data are increasing geometrically through the Internet of Things and edge computing technologies, which cut across different applications and devices and exhibit the 5 V's (velocity, variety, veracity, value, volume). In an edge-cloud computing environment, data are generated, transported, and analysed in large quantities. In many applications, the edge devices and the data generated at the edge belong to heterogeneous owners, which raises data privacy issues. Bringing analytics to the different data silos raises further concerns about data privacy. Federated analytics brings analytics to the data, extracting insights without any data being stored in one location.
Federated learning trains machine learning models remotely, where the data reside, and feeds the aggregated prediction results back into the federated learning model. Federated learning is a subset of federated analytics. In this research, we examined the different ways federated analytics and federated learning address algorithmic challenges, data partitioning, and privacy concerns and their mitigation strategies, and how third-party participation in the training process can be resolved through blockchain technology.
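To make the distinction concrete, the following is a minimal sketch, not the method used in this research. It assumes two silos that never share raw records: for federated analytics, each silo reports only a sum and a count; for federated learning, each silo reports locally trained weights and its sample count, which are combined by a sample-weighted average (the commonly used federated averaging rule, chosen here as an assumption). All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def local_summary(values: Sequence[float]) -> Tuple[float, int]:
    """Runs inside one silo: only the sum and the count leave, never the raw rows."""
    return (sum(values), len(values))


def federated_mean(summaries: List[Tuple[float, int]]) -> float:
    """Federated analytics: combine per-silo (sum, count) pairs into a global mean."""
    total = sum(s for s, _ in summaries)
    count = sum(n for _, n in summaries)
    if count == 0:
        raise ValueError("no records reported by any silo")
    return total / count


def federated_average(updates: List[Tuple[List[float], int]]) -> List[float]:
    """Federated learning: average locally trained weights, weighted by sample count.

    Assumption: every silo reports a weight vector of the same length.
    """
    total = sum(n for _, n in updates)
    if total == 0:
        raise ValueError("no training samples reported by any silo")
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


# Hypothetical data held by two separate owners.
silo_a = [1.0, 2.0, 3.0]
silo_b = [10.0]
global_mean = federated_mean([local_summary(silo_a), local_summary(silo_b)])

# Hypothetical locally trained weights, with 1 and 3 training samples respectively.
global_weights = federated_average([([1.0, 0.0], 1), ([3.0, 2.0], 3)])
```

In both cases the coordinator sees only aggregates. This alone does not guarantee privacy, since aggregates can still leak information, which is why the mitigation strategies discussed in this work matter.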