Abstract:
© 2020 IEEE. We consider the problem of classification for imbalanced samples with rare classes. A common problem for machine learning methods in such setting is that a rare class would have extremely high classification error compared to more widespread classes. In general, this problem could be mitigated with re-sampling or fitting additional weights to control the classification errors in classes, though those methods are computationally expensive for large datasets and sometimes fail to attain appropriate results. It this paper we present cost-efficient modifications of Linear Discriminant Analysis allowing to mitigate the problem by minimizing maximal classification error among the classes. For example, this allows achieving more robust machinery malfunction detection algorithms where our expectations on recall would be more consistent among different malfunction types.