Probabilistic learning for pulsar classification

In this work, we explore the possibility of using probabilistic learning to identify pulsar candidates. We make use of Deep Gaussian Process (DGP) and Deep Kernel Learning (DKL). Trained on a balanced training set in order to avoid the effect of class imbalance, the performance of the models, achieving relatively high probability of differentiating the positive class from the negative one (roc-auc∼0.98), is very promising overall. We estimate the predictive entropy of each model predictions and find that DKL is more confident than DGP in its predictions and provides better uncertainty calibration. Upon investigating the effect of training with imbalanced dataset on the models, results show that each model performance decreases with an increasing number of the majority class in the training set. Interestingly, with a number of negative class 10× that of positive class, the models still provide reasonably well calibrated uncertainty, i.e. an expected Uncertainty Calibration Error (UCE) less than 6%. We also show in this study how, in the case of relatively small amount of training dataset, a convolutional neural network based classifier trained via Bayesian Active Learning by Disagreement (BALD) performs. We find that, with an optimized number of training examples, the model — being the most confident in its predictions — generalizes relatively well and produces the best uncertainty calibration which corresponds to UCE = 3.118%.

Reference:
Probabilistic learning for pulsar classification, Sambatra Andrianomena, arXiv:2205.05765