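computation of the value, which makes it convenient to apply in practice. Aitkin & Rubin (1985) instead work within a Bayesian framework and place a prior distribution on the component weights, which likewise simplifies estimation of the null distribution. The work above concentrates on how to choose a suitable number of Gaussian components. There is, in fact, another line of thinking: adjust the number of Gaussian components dynamically while the EM iterations are running. Ueda et al. (2000) proposed split and merge EM (SMEM), which, during the iterative EM updates, dynamically merges several Gaussian components into one or replaces one Gaussian component with several. In this method the total number of Gaussian components stays fixed. Building on this, Zhang et al. (2004) proposed a Competitive EM algorithm. It uses the likelihood function as its yardstick during the EM iterations and can likewise split or merge Gaussian components, but it places no restriction on the number of components, so it can automatically select the number of Gaussian components that maximises the current value of the likelihood function.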
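The split and merge moves described above can be illustrated with a minimal sketch. It assumes one-dimensional components and uses moment matching for both operations, which is a simplification chosen here for illustration. The `Component` container, the function names, and the split parameter `d` are all hypothetical. Ueda et al. (2000) define their own initialisation and their own criteria for choosing which components to merge or split, and they rerun EM after each proposal; none of that is reproduced here.

```python
import math
from typing import List, NamedTuple, Tuple


class Component(NamedTuple):
    """One 1-D Gaussian component of a mixture (hypothetical container)."""
    weight: float
    mean: float
    var: float


def merge(a: Component, b: Component) -> Component:
    """Merge two components into one with the same total weight, mean and variance."""
    w = a.weight + b.weight
    mean = (a.weight * a.mean + b.weight * b.mean) / w
    # Variance of the two-component sub-mixture about the merged mean.
    var = (a.weight * (a.var + (a.mean - mean) ** 2)
           + b.weight * (b.var + (b.mean - mean) ** 2)) / w
    return Component(w, mean, var)


def split(c: Component, d: float = 0.5) -> Tuple[Component, Component]:
    """Split one component into two equal-weight halves, d standard deviations
    either side of the original mean (assumed rule, requires 0 < d < 1)."""
    if not 0.0 < d < 1.0:
        raise ValueError("d must lie in (0, 1)")
    offset = d * math.sqrt(c.var)
    # Shrink the child variance so the pair keeps the parent's overall variance.
    child_var = c.var * (1.0 - d * d)
    half = c.weight / 2.0
    return (Component(half, c.mean - offset, child_var),
            Component(half, c.mean + offset, child_var))


def split_and_merge(components: List[Component], i: int, j: int, k: int) -> List[Component]:
    """Propose a new mixture: merge components i and j, split component k.
    One merge plus one split leaves the number of components unchanged."""
    if len({i, j, k}) != 3:
        raise ValueError("i, j and k must be distinct")
    kept = [c for idx, c in enumerate(components) if idx not in (i, j, k)]
    return kept + [merge(components[i], components[j]), *split(components[k])]


mixture = [Component(0.2, -1.0, 1.0), Component(0.3, -0.5, 0.5), Component(0.5, 4.0, 2.0)]
proposal = split_and_merge(mixture, i=0, j=1, k=2)
```

Because one merge is always paired with one split, `proposal` has as many components as `mixture`, which mirrors the fixed component count of SMEM. A method in the spirit of Competitive EM would instead apply `merge` or `split` on its own and keep whichever proposal yields the higher likelihood after further EM steps.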
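Finally, since the Gaussian distribution can serve as the basic component of a mixture model, can other distributions be used to build mixture models as well? The answer is yes. The Weibull distribution, the t distribution, the log-normal distribution, the Gamma distribution, and even arbitrary distribution functions of unrestricted form can all be used in mixture models. For these more general mixture distributions, see the review of finite mixture models by McLachlan et al. (2019). Although many distributional forms can be used in finite mixture modelling, Gaussian mixture models are undoubtedly the most common and most practical class of mixture models. This is owing to the favourable mathematical properties of the Gaussian distribution. Li & Barron (1999) point out that, if one considers the set of all Gaussian mixture distribution functions and the set of all distribution functions, then under the L1 metric the former is dense in the latter. Hence any unknown density function can be approximated arbitrarily closely, in the L1 sense, by a mixture of Gaussian components. These favourable properties, together with its computational convenience, have secured the Gaussian mixture model a position that is hard to shake in all kinds of data analysis and data mining tasks.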
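The two statements above can be written compactly. The notation is an assumption made for this sketch: K components with weights π_k, component densities f_k with parameters θ_k, the Gaussian density φ, and a target density g.

```latex
% A general finite mixture: each f_k may be Gaussian, Weibull, t, log-normal, Gamma, ...
f(x) = \sum_{k=1}^{K} \pi_k \, f_k(x; \theta_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1 .

% Density in L1: for every density g and every \varepsilon > 0 there exist
% K, weights \pi_k, means \mu_k and variances \sigma_k^2 such that
\int \Bigl| \, g(x) - \sum_{k=1}^{K} \pi_k \, \varphi(x; \mu_k, \sigma_k^2) \Bigr| \, dx < \varepsilon .
```

The second statement says nothing about how large K has to be for a given ε, so in practice the number of components still has to be chosen by one of the methods discussed earlier.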
References
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1), 1-22.
Singh, R., Pal, B. C., & Jabr, R. A. (2009). Statistical representation of distribution system loads using Gaussian mixture model. IEEE Transactions on Power Systems, 25(1), 29-37.
Wang, Y., Chen, W., Zhang, J., Dong, T., Shan, G., & Chi, X. (2011). Efficient volume exploration using the Gaussian mixture model. IEEE Transactions on Visualization and Computer Graphics, 17(11), 1560-1573.
Zivkovic, Z. (2004). Improved adaptive Gaussian mixture model for background subtraction. In Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004. (Vol. 2, pp. 28-31). IEEE.
Singh, N., Arya, R., & Agrawal, R. K. (2016). A convex hull approach in conjunction with Gaussian mixture model for salient object detection. Digital Signal Processing, 55, 22-31.
Cho, J., Jung, Y., Kim, D. S., Lee, S., & Jung, Y. (2019). Moving object detection based on optical flow estimation and a Gaussian mixture model for advanced driver assistance systems. Sensors, 19(14), 3217.
Melnykov, V., & Maitra, R. (2010). Finite mixture models and model-based clustering. Statistics Surveys, 4, 80-116.
Scrucca, L., Fop, M., Murphy, T. B., & Raftery, A. E. (2016). Mclust 5: clustering, classification and density estimation using Gaussian finite mixture models. The R Journal, 8(1), 289.
Robert, C. P. (1996). Mixtures of distributions: inference and estimation. In Markov Chain Monte Carlo in Practice (pp. 441-464).
Mengersen, K. L., Robert, C., & Titterington, M. (2011). Mixtures: Estimation and Applications (Vol. 896). John Wiley & Sons.
Rasmussen, C. E. (1999, November). The infinite Gaussian mixture model. In NIPS (Vol. 12, pp. 554-560).
Ueda, N., & Nakano, R. (1998). Deterministic annealing EM algorithm. Neural Networks, 11(2), 271-282.
Chen, J. (1995). Optimal rate of convergence for finite mixture models. The Annals of Statistics, 221-233.
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 461-464.
Hartigan, J. A. (1985). Statistical theory in clustering. Journal of Classification, 2(1), 63-76.
McLachlan, G. J. (1987). On bootstrapping the likelihood ratio test statistic for the number of components in a normal mixture. Journal of the Royal Statistical Society: Series C (Applied Statistics), 36(3), 318-324.
Tibshirani, R., Walther, G., & Hastie, T. (2001). Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2), 411-423.
Chen, H., Chen, J., & Kalbfleisch, J. D. (2004). Testing for a finite mixture model with two components. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 66(1), 95-115.
Aitkin, M., & Rubin, D. B. (1985). Estimation and hypothesis testing in finite mixture models. Journal of the Royal Statistical Society: Series B (Methodological), 47(1), 67-75.
Ueda, N., Nakano, R., Ghahramani, Z., & Hinton, G. E. (2000). SMEM algorithm for mixture models. Neural Computation, 12(9), 2109-2128.
Zhang, B., Zhang, C., & Yi, X. (2004). Competitive EM algorithm for finite mixture models. Pattern Recognition, 37(1), 131-144.
Li, J. Q., & Barron, A. R. (1999). Mixture Density Estimation. In NIPS (Vol. 12, pp. 279-285).
McLachlan, G. J., Lee, S. X., & Rathnayake, S. I. (2019). Finite mixture models. Annual Review of Statistics and Its Application, 6, 355-378.