گروه کامپیوتر، واحد میانه، دانشگاه آزاد اسلامی، میانه، ایران
چکیده
امروزه، حفظ حریم خصوصی داده های منتشر شده، از اهمیت زیادی برخوردار است. الگوریتمهای مختلفی برای حفظ حریم خصوصی داده ها ابداع شده است. مهمترین مشکلات این الگوریتمها، غیر قابل اثبات بودن حریم خصوصی آنها، اهمیت کم در نظر گرفته شده برای دانش پیش زمینه فرد متخاصم و عدم کاهش ابعاد داده های اولیه است. در این مقاله، برای اثبات حریم خصوصی داده ها از مفهوم حریم خصوصی تفاضلی در انتشار داده های توریع شده عمودی و رده بندی آنها استفاده شده است. در حریم خصوصی تفاضلی، دانش پیش زمینه فرد متخاصم در مورد داده های منتشر شده، اهمیت ندارد. همچنین، با استفاده از تبدیل موجک هار، ابعاد داده ها کاهش داده شده است. در نهایت، حریم خصوصی داده های منتشر شده به صورت ریاضی اثبات شده است، همچنین، دقت داده ها، با استفاده از الگوریتم رده بند «K-نزدیکترین همسایه» نیز اندازه گیری شده است و به صورت ریاضی اثبات شده است که الگوریتم پیشنهادی، خدشه هایی با انحراف معیار کمتری را نسبت به الگوریتم مورد مقایسه، به داده ها اضافه میکند و در نتیجه دقت رده بندی بالاتری دارد. نتایج به دست آمده نشان دهنده امن تر بودن الگوریتم ارائه شده نسبت به الگوریتمهای مورد مقایسه است.
C. Aggarwal and P.S Yu, “Privacy-Preserving Data Mining: Models and Algorithms” Springer-Verlag, New York, 2008.
C.M. Fung, K. Wang, R. Chen and P.S. Yu, “Privacy-Preserving Data Publishing: A Survey on Recent Developments,” ACM Computing Surveys 42(4), Article 14, 2010.
Agawam, R. Srikant, “Privacy-Preserving Data Mining,” Proceedings of ACM SIGMOD Conference, pp. 439-450, 2000.
Sweeney “k-Anonymity: A Model for Protecting Privacy,” Int. J. Unc. Fuzz. Knowl. Based Syst., 10 (5), pp. 557-570, 2002.
Samarati, L. Sweeney, “Protecting Privacy when Disclosing Information: k-Anonymity and its Enforcement Through Generalization and Suppression,” IEEE Symp. on Security and Privacy, pp. 188-199, 1998.
J. Bayardo, R. Agrawal, “Data Privacy through Optimal K-Anonymization,” Proceedings of ICDE Conference, pp. 217-228, 2005.
Machanavajjhala, J. Gehrke, and D. Kifer, “L-diversity: Privacy beyond k-anonymity,” IEEE ICDE Conference, pp. 24, 2006.
Gu, M. Li, L. Xiong and Y. Cao, "Providing Input-Discriminative Protection for Local Differential Privacy," 2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 505-516, 2020.
Dwork, “Differential Privacy,“ Proceedings of the 33rd International Colloquium on Automata, Languages and Programming (ICALP’06), pp. 1–12, 2006.
C.M. Fung, K. Wang, A.W.C Fu and P.S. Yu, “Introduction to Privacy-Preserving Data Publishing Concepts and Techniques,” Taylor & Francis Group, 2011.
Dwork, F. McSherry, K. Nissim and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in TCC, pp. 265–284 2006.
Kenthapadi, A. Korolova, I. Mironov and N. Mishra,”Privacy via the Johnson-Lindenstrauss Transform,” CoRR, abs/1204.2606, 2012.
Xiaokui, W. Guozhang, G.Johannes, “Differential Privacy via Wavelet Transforms,” IEEE Transactions on Knowledge and Data Engineering 23(8), pp. 1200-1214, 2011.
G. Divanis, V.S. Verykios,” Association Rule Hiding for Data Mining,” Springer, 2010.
Cui, W.K. Wong, and D.W. Cheung, “Privacy-Preserving Clustering with High Accuracy and Low Time Complexity,” In Proceedings of the 14th International Conference on Database Systems for Advanced Applications, DASFAA 2009, Brisbane, Australia, 2009.
R.M. Oliveira, and O.R. Za¨iane, “ Privacy preserving clustering by data transformation,” Proc. of the 18th Brazilian Symposium on Databases, Manaus, Amazonas, Brazil, pp. 304–318, 2003.
Du and Z. Zhan, “Using randomized response techniques for privacy-preserving data mining,” In Proc. of the 9th International Conf. on Knowledge Discovery and Data Mining (KDD’03), pp. 505–510, 2003.
Huang, W. Du, and B. Chen, “Deriving Private Information from Randomized Data,” In Proceedings of the ACM SIGMOD Conference on Management of Data, Baltimore, Maryland, USA, pp. 37-48, 2005.
Vaidya and C. Clifton, “Privacy-Preserving K-means Clustering over Vertically Partitioned Data,” In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 206 – 215, 2003.
B. Johnson and J. Lindenstrauss, “Extensions of Lipschitz Maps into a Hilbert Space,” Contemporary Mathematics 26, 189–206, 1984.
Bingham and H. Mannila, “Random Projection In Dimensionality Reduction: Applications To Image And Text Data,” In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 245–250, 2001.
Z. Fern, and C.E. Brodley, “Random projection for high dimensional data clustering: a cluster ensemble approach,” In Proc. of the 20th international conf. on machine learning, Washington DC, USA, pp. 102-110, 2003.
R.M. Oliveira and O.R. Za¨ıane, “A Privacy-Preserving Clustering Approach Toward Secure and Effective Data Analysis for Business Collaboration,” Computers & Security 26(1), 81-93, 2007.
Blum, C. Dwork, F. McSherry, and K. Nissim, “Practical Privacy: The SuLQ Framework,” In: Proceedings of the 2005 SIGMOD/PODS Conference, pp. 128–138, 2005.
C. Dwork, and N. Nissim, “Privacy-Preserving Data mining on Vertically Partitioned Databases,” In: Proceedings of the 24th Annual International Cryptology Conference Advances in Cryptology (CRYPTO’04), Santa Barbara, California, USA, pp. 528-544, 2004.
Nissim, S. Raskhodnikova, and A. Smith, “Smooth Sensitivity and Sampling in Private Data Analysis,” In: Proceedings of the 39th ACM Symposium on the Theory of Computing (STOC’07), pp. 75-84, 2007.
McSherry, “Privacy Integrated Queries: An Extensible Platform for Privacy-Preserving Data Analysis,” In: Proceedings of the 2009 SIGMOD/PODS Conference, pp. 19-30, 2009.
Feldman, A. Fiat, H. Kaplan, and K. Nissim, “Private Coresets.” In: Proceedings of the 41st Symposium on Theory of Computing (STOC), pp. 361–370, 2009.
Xu, X. Cheng, S. Su, et al., Differentially private frequent sequence mining, IEEE Trans. Knowl. Data Eng. 28 (11), 2016.
Zhou, S. Qin, H. Zhou, et al., A differential privacy noise dynamic allocation algorithm for big multimedia data, Multimedia Tools Appl. (8), pp. 1-19, 2018.
Shen, Z. Lu, A new lower bound of privacy budget for distributed differential privacy, in: proceedings of International Conference on IEEE Parallel and Distributed Computing, Applications and Technologies. pp. 25–32, 2018.
Wang, Y. Zhang, X. X. Lu, Z. Wang, Real-time and spatiotemporal crowd-sourced social network data publishing with differential privacy, IEEE Trans. Dependable Secure Comput. 15 (4), 2018.
Ou, Z. Qin, S. Liao, T. Li, and D. Zhang, “Singular spectrum analysis for local differential privacy of classifications in the smart grid,” IEEE Internet of Things Journal, 2020.
Sun, X., Xu, R., Wu, L. et al. A differentially private distributed data mining scheme with high efficiency for edge computing. J Cloud Comp 10, 7, 2021.
Mukherjee, et al, “A privacy preserving technique for distance-based classification with worst case privacy guarantees,” Data & Knowledge Engineering, 66, pp. 264–288, 2008.
Fan, J. He, M. Guo, P. Li, Z. Han, R. Wang, “Privacy preserving classification on local differential privacy in data centers,” Journal of Parallel and Distributed Computing, Vol. 135, pp. 70-82, 2020.
Li, M. Sun, “Privacy-Preserving Classification of Personal Data with Fully Homomorphic Encryption: An Application to High-Quality Ionospheric Data Prediction,” International Conference on Machine Learning for Cyber Security, pp. 437-446, 2020.
Li, J. Li, Z. Huang, CZ. Gao, W.B. Chen, K. Chen, “Privacy-preserving outsourced classification in cloud computing,” Cluster Computing, Vol. 21, pp. 277–286, 2018.
R.E. Dishabi, M.A. Azgomi, "Differential privacy preserving clustering in distributed datasets using Haar wavelet transform," Intelligent Data Analysis, vol. 19, no. 6, pp. 1323-1353, 2015.
Canetti, U. Feige, O. Goldreich, and M. Naor, “Adaptively Secure Multi-Party Computation,” Proceedings of the 28th Annual Symposium on Theory of Computing (STOC’96), ACM Press, pp. 639–648, 1996.
S. Burrus, R.A. Gopinath, and H. Guo, “Introduction to Wavelets and Wavelet Transforms,” A Primer Prentice Hall, 1997.
Zhang, T.B. Ho, Y. Zhang and M.S. Lin, “Unsupervised Feature Extraction for Time Series Clustering Using Orthogonal Wavelet Transform,” Journal Informatica 30(3), pp. 305-319, 2006.
F. Vysochanskij, and Y.I. Petunin, “Justification of the Three Sigma Rule for Unimodal Distributions,” Theory of Probability and Mathematical Statistics 21, 25–36, 1980.