Challenges of big data storage and management

Rajeev Agrawal, Christopher Nyamful

Abstract

The amount of data generated daily by industries, large organizations, and research institutes is increasing at a very fast rate. These huge volumes of data need to be kept not just for analytic purposes, but also in compliance with laws and service level agreements to protect and preserve data. Storage and management are major concerns in this era of big data. The ability of storage devices to scale to meet the rate of data growth and to improve access time and data transfer rate is equally challenging. These factors, to a considerable extent, determine the overall performance of data storage and management. Big data storage requirements are complex and thus need a holistic approach to mitigate their challenges. This paper examines the challenges of big data storage and management. In addition, we examine existing big data storage and management platforms and provide useful suggestions for mitigating these challenges.


Keywords: big data, storage systems, challenges, performance.

