Automatic Test Data Set Generation to Improve Fault Localization Based on Causal-Statistical Analysis

Article type: Persian research article

Authors

School of Computer Engineering, Iran University of Science and Technology, Tehran, Iran.

Abstract

Statistical fault localization methods depend heavily on the program's input data and become unstable when that data changes. Generating suitable test data therefore plays a key role in the quality of the software fault localization process. This paper presents a method that improves fault localization by generating new test data. The test set is generated in a minimal, targeted manner, first to identify the faulty branch and then the suspicious statements inside that branch. First, the range of suspicious statements along an execution path is determined. To do this, the conditions on the failing execution path are negated from the last to the first, and the Z3 solver is used to generate test data for the resulting path. The program is then re-executed on the generated test data using dynamic symbolic execution. Depending on whether each execution passes or fails, we determine which branch is suspected of containing the fault. In this way, the range of statements to which the causal-statistical method is applied is reduced to the minimum possible. The proposed method is evaluated on four projects from the Defects4J benchmark. The results show that 75% of the faults are found by examining at most one percent of the code of these programs, an improvement of 98.17% over existing work. In addition, the average number of statements examined to find a fault decreased by 78.16% in the worst case.
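The branch-negation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy program `buggy_abs_diff`, the oracle `passes`, and the helpers `negate_from_end` and `solve` are hypothetical names introduced here, and a brute-force search over a small integer domain stands in for the Z3 solver and the dynamic symbolic execution engine that the paper uses.

```python
from itertools import product

def buggy_abs_diff(x, y):
    """Toy program under test (hypothetical); a fault is seeded in the y > 10 branch."""
    if x > y:
        return x - y
    if y > 10:
        return y + x  # seeded fault: should be y - x
    return y - x

def passes(env):
    """Oracle (assumed available): compare against the intended result |x - y|."""
    return buggy_abs_diff(env["x"], env["y"]) == abs(env["x"] - env["y"])

def negate_from_end(path):
    """Yield (index, constraints): the prefix before branch i plus branch i negated,
    starting from the last branch of the failing path and moving to the first."""
    for i in range(len(path) - 1, -1, -1):
        flipped = lambda env, p=path[i]: not p(env)
        yield i, path[:i] + [flipped]

def solve(constraints, names=("x", "y"), domain=range(-20, 21)):
    """Brute-force stand-in for an SMT solver: return the first satisfying input."""
    for values in product(domain, repeat=len(names)):
        env = dict(zip(names, values))
        if all(c(env) for c in constraints):
            return env
    return None

# Branch conditions that held on the failing run x=1, y=12.
failing_path = [
    lambda e: not (e["x"] > e["y"]),
    lambda e: e["y"] > 10,
]

# For each flipped branch, generate an input and record whether the run passes.
results = []
for index, constraints in negate_from_end(failing_path):
    env = solve(constraints)
    if env is not None:
        results.append((index, env, passes(env)))
```

In this toy case the run generated by flipping the last condition (`y > 10`) passes, which is the kind of evidence that would direct suspicion at the statements guarded by that branch. How the paper combines such pass/fail outcomes with the causal-statistical analysis is not specified in the abstract, so that interpretation is an assumption of this sketch.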

Keywords

