Unsupervised Domain Adaptation via Bregman Divergence Minimization and Adaptive Classifier Learning

Document Type: Persian Original Article

Authors

1 Faculty of IT & Computer Engineering, Urmia University of Technology, Urmia, Iran

2 Faculty of IT & Computer Engineering, Urmia University of Technology, Urmia, Iran

Abstract

In pattern recognition and image classification, the common assumption that the training set (source domain) and the test set (target domain) share the same distribution is often violated in real-world applications. In such cases, traditional learning models may not generalize well to the test set. To tackle this problem, domain adaptation exploits labeled training data from a related source domain to build a model that generalizes to the target domain. This paper presents a domain adaptation method that learns to adapt the data distribution of the source domain to that of the target domain when no labeled data from the target domain are available. Our method jointly learns a low-dimensional representation space and an adaptive classifier; that is, we seek a representation space, and an adaptive classifier on that space, such that the distribution gap between the two domains is minimized and the risk of the adaptive classifier is minimized as well.
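To make the joint learning described above concrete, it can be summarized by an objective of the following general form. The notation here is illustrative rather than the paper's exact formulation: W denotes the learned projection, f the adaptive classifier, \mathcal{L} the empirical classification loss on the labeled source data (X_s, Y_s), D_F a Bregman divergence between the projected source and target distributions, R_M a manifold-consistency regularizer, and \lambda, \gamma trade-off parameters.

\min_{W,\, f} \; \mathcal{L}\big(f(W^{\top} X_s),\, Y_s\big) \;+\; \lambda\, D_F\big(P_s(W^{\top} x),\, P_t(W^{\top} x)\big) \;+\; \gamma\, R_M(f)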
In this paper, we propose a novel solution to unsupervised domain adaptation for classification. In the unsupervised scenario, where no labeled samples from the target domain are available, our model transforms the data such that the source and target distributions become similar. To compare the two distributions, our approach uses the Bregman divergence. However, this alone does not suffice to generalize the model, so we combine model matching with representation learning to tackle the distribution mismatch across domains. The framework extends the classification model with an adaptive classifier, which generalizes the target classifier beyond the source data. The framework thus ensures that the target classifier minimizes the empirical risk on the target domain while maximizing manifold consistency with the structure of the source data. Our empirical study on multiple open datasets validates that the proposed approach consistently improves classification accuracy compared to baseline machine learning methods and state-of-the-art transfer learning methods.
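As a rough illustration of the distribution-matching ingredient, the sketch below computes a Bregman divergence between the empirical means of projected source and target samples, using the squared Euclidean generator F(x) = ||x||^2, for which the divergence reduces to a squared distance. The projection W, the mean-based comparison, and all function names are assumptions made for this example; they are not the paper's exact estimator.

import numpy as np

def bregman_divergence(p, q, F=lambda x: np.sum(x ** 2), grad_F=lambda x: 2 * x):
    # D_F(p, q) = F(p) - F(q) - <grad F(q), p - q>;
    # with the default F(x) = ||x||^2 this equals ||p - q||^2.
    return F(p) - F(q) - np.dot(grad_F(q), p - q)

def distribution_gap(Xs, Xt, W):
    # Gap between domains in the projected space, measured as the
    # Bregman divergence between the empirical means of W^T x.
    mu_s = (Xs @ W).mean(axis=0)
    mu_t = (Xt @ W).mean(axis=0)
    return bregman_divergence(mu_s, mu_t)

# Toy usage with random data and a random 5-D projection.
rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, size=(100, 20))   # source samples
Xt = rng.normal(0.5, 1.0, size=(80, 20))    # target samples (shifted)
W = rng.normal(size=(20, 5))
print(distribution_gap(Xs, Xt, W))

In a full method, this quantity would be minimized jointly with the classifier's empirical risk over W, rather than evaluated for a fixed random projection.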

Keywords

