Automatic NoSQL Schema Design: A Workload-Driven Schema Design Approach for NoSQL Wide Column Stores

Document Type: Persian Original Article

Authors

1 Department of Computer Engineering, Gonbad Kavoos Branch, Islamic Azad University, Gonbad Kavoos, Iran

2 Faculty of Science and Computer Engineering, Shahid Beheshti University, Tehran, Iran

Abstract

NoSQL systems are well suited to big data projects and offer a high degree of flexibility in design. An efficient schema for a NoSQL wide column store depends not only on the application's conceptual data model but also on the queries defined in the application's workload. In these databases, manual schema design relies on rules of thumb. Applying these rules without practical experience is a major challenge, because they are vague and generic and must be adapted to each application. One way researchers address this challenge is automated schema design, which is the main contribution of this paper. This research proposes a workload-driven approach that maps the application's conceptual data model to a database schema with the goal of optimizing query performance. The approach uses workload information to produce an optimized schema by minimizing the number of requests sent to the wide column store. The experimental results show that the schema generated automatically by the proposed approach yields good workload performance.
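The idea of choosing a schema by minimizing the number of requests can be illustrated with a small sketch. This is not the paper's algorithm: the workload, the candidate column families, the greedy request estimate, and every name below are hypothetical and chosen only to show how query frequencies can drive the choice among candidate schemas.

```python
from itertools import combinations

# Hypothetical workload: query name -> (attributes read, relative frequency).
WORKLOAD = {
    "user_by_id": ({"user.id", "user.name"}, 0.6),
    "orders_by_user": ({"user.id", "order.id", "order.total"}, 0.3),
    "order_detail": ({"order.id", "order.total", "order.date"}, 0.1),
}

# Hypothetical candidate column families and the attributes each one stores.
CANDIDATES = {
    "cf_user": {"user.id", "user.name"},
    "cf_order": {"order.id", "order.total", "order.date"},
    "cf_user_orders": {"user.id", "order.id", "order.total"},
}


def requests_needed(attrs, schema):
    """Greedy estimate of how many column families must be read to cover attrs.

    Returns None when the schema cannot answer the query at all.
    """
    remaining, count = set(attrs), 0
    while remaining:
        # Pick the column family that covers the most still-missing attributes.
        best = max(schema, key=lambda cf: len(CANDIDATES[cf] & remaining))
        covered = CANDIDATES[best] & remaining
        if not covered:
            return None
        remaining -= covered
        count += 1
    return count


def workload_cost(schema):
    """Frequency-weighted number of requests over the whole workload."""
    total = 0.0
    for attrs, weight in WORKLOAD.values():
        n = requests_needed(attrs, schema)
        if n is None:
            return float("inf")  # schema cannot serve this query
        total += weight * n
    return total


def best_schema(max_families):
    """Exhaustively search schemas with at most max_families column families."""
    schemas = (
        combo
        for k in range(1, max_families + 1)
        for combo in combinations(sorted(CANDIDATES), k)
    )
    return min(schemas, key=workload_cost)


if __name__ == "__main__":
    for limit in (2, 3):
        schema = best_schema(limit)
        print(limit, schema, round(workload_cost(schema), 2))
```

In this toy setting, allowing a third, denormalized column family lets every query be answered with a single request, at the price of storing some attributes twice. The sketch considers only read requests; storage and update costs, which a real design tool would also need to weigh, are left out.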

Keywords

