Data heterogeneity and variable sparsity are two important problems that cannot be ignored in the big data era. This paper proposes a new Logit model for data that are heterogeneous and sparse and whose dependent variable is binary. The results show that the method can effectively identify the redundant variables in different groups. In addition, G-means and other evaluation indicators show that the model predicts well. Finally, the method is applied to financial early warning for listed companies and yields some meaningful results, which indicates that the method has practical value.
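The abstract does not define the G-means indicator. As a minimal sketch, assuming it refers to the commonly used geometric mean of sensitivity and specificity for binary classifiers, it could be computed as follows. The function name `g_mean` and the sample labels are hypothetical and are not taken from the paper.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for 0/1 labels.

    Assumption: this is the usual G-mean definition; the paper may use
    a different variant.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives

    # Guard against a class that is absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical labels: 1 = firm in financial distress, 0 = healthy firm.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
score = g_mean(y_true, y_pred)
```

Under this definition the score drops to zero whenever one class is entirely misclassified, which is why the measure is often preferred to plain accuracy when distressed firms are much rarer than healthy ones.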
|Translated title of the contribution||The Study of Variable Selection in Logit Model Based on Heterogeneous Data|
|Publication status||Published - 2017|