Communications - Scientific Letters of the University of Zilina 2021, 23(4):A223-A232 | DOI: 10.26552/com.C.2021.4.A223-A232

Various Approaches Proposed for Eliminating Duplicate Data in a System

Roman Čerešňák1, Karol Matiaško1, Adam Dudáš2
1 Department of Informatics, Faculty of Management Science and Informatics, University of Zilina, Zilina, Slovakia
2 Department of Computer Science, Faculty of Natural Sciences, Matej Bel University, Banska Bystrica, Slovakia

The growth of the big data processing market has increased the load on computational data centers and changed the methods used to store data, the communication between computing units, and the computational time needed to process or edit data. Methods of distributed or parallel data processing have brought new problems related to computations with data, and these problems need to be examined. Unlike conventional cloud services, big data services are characterized by a tight connection between the data and the computations: computational tasks can be carried out only if the relevant data are available. Three factors influence the speed and efficiency of data processing: data duplication, data integrity, and data security. We are motivated to study the problems related to the growing time needed for data processing by optimizing these three factors in geographically distributed data centers.

Keywords: duplication data; distributed data processing; software architecture

Received: November 27, 2020; Accepted: January 26, 2021; Prepublished online: June 23, 2021; Published: October 1, 2021

Čerešňák, R., Matiaško, K., & Dudáš, A. (2021). Various Approaches Proposed for Eliminating Duplicate Data in a System. Communications - Scientific Letters of the University of Zilina, 23(4), A223-A232. doi: 10.26552/com.C.2021.4.A223-A232

This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, distribution, and reproduction in any medium, provided the original publication is properly cited. No use, distribution or reproduction is permitted which does not comply with these terms.