Presentation of a new method based on modern multivariate approaches for big data replication in distributed environments

PLoS One. 2021 Jul 9;16(7):e0254210. doi: 10.1371/journal.pone.0254210. eCollection 2021.

Abstract

As data volumes and the use of distributed systems for storage and processing have grown, reducing the number of replications has become a crucial requirement in these systems and has been addressed by a large body of research. This paper proposes an algorithm that reduces the number of replications in big data transfer and, in turn, lowers the traffic load over the grid. It does so by classifying data according to the types of data sent and by using VIKOR, a multivariate decision-making method, to rank replication sites. Because VIKOR considers several variables at once, all the parameters that affect a site's rank can be taken into account in the assessment. According to the results and evaluations, the proposed method shows an improvement of about thirty percent on average over the LRU, LFU, BHR, and Without Rep. algorithms. It also improves on existing multivariate methods that take different approaches to replication by thirty percent, because it considers parameters such as time, the number of replications, and the replication site, so that replication occurs when it can improve access.
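To make the site-ranking step concrete, the following is a minimal sketch of the standard VIKOR procedure in Python. It is not the paper's implementation: the abstract does not give the exact criteria, weights, or data, so the three criteria (bandwidth, access latency, free storage), the weights, the site values, and the function name `vikor_rank` are all hypothetical and chosen only for illustration.

```python
def vikor_rank(matrix, weights, benefit, v=0.5):
    """Rank alternatives with standard VIKOR (lower Q is better).

    matrix  : rows are alternatives (candidate sites), columns are criteria
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, False if smaller is better
    v       : weight of the group-utility term S versus the individual-regret term R
    """
    n_crit = len(weights)
    cols = list(zip(*matrix))
    # Best and worst observed value for each criterion
    best = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    worst = [min(c) if b else max(c) for c, b in zip(cols, benefit)]

    def norm_gap(x, j):
        # Normalized distance from the best value, in [0, 1]
        span = best[j] - worst[j]
        return 0.0 if span == 0 else (best[j] - x) / span

    S, R = [], []
    for row in matrix:
        gaps = [weights[j] * norm_gap(row[j], j) for j in range(n_crit)]
        S.append(sum(gaps))  # group utility
        R.append(max(gaps))  # individual regret

    def scale(vals):
        # Min-max scale to [0, 1]; all zeros if every value is equal
        lo, hi = min(vals), max(vals)
        return [0.0 if hi == lo else (x - lo) / (hi - lo) for x in vals]

    Q = [v * s + (1 - v) * r for s, r in zip(scale(S), scale(R))]
    order = sorted(range(len(matrix)), key=lambda i: Q[i])
    return Q, order


# Hypothetical candidate sites: [bandwidth, access latency, free storage]
sites = [
    [100, 20, 500],  # site 0
    [80, 10, 300],   # site 1
    [50, 40, 800],   # site 2
]
weights = [0.4, 0.4, 0.2]       # assumed weights, not from the paper
benefit = [True, False, True]   # latency is a cost criterion

Q, order = vikor_rank(sites, weights, benefit)
print(order)  # indices of sites from best to worst
```

In this sketch the site with the lowest Q value would be the preferred replication target. A full VIKOR analysis also checks the "acceptable advantage" and "acceptable stability" conditions before declaring a single best alternative; those checks are omitted here for brevity.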

MeSH terms

  • Algorithms*
  • Big Data*
  • Humans
  • Multivariate Analysis

Grants and funding

The author received no specific funding for this work, and the publication fee was waived.