These scripts are associated with the following publication: “Classification and Ranking of Fermi LAT Gamma-ray Sources from the 3FGL Catalog Using Machine Learning Techniques”, Saz Parkinson, P. M. (HKU/LSR, SCIPP), Xu, H. (HKU), Yu, P. L. H. (HKU), Salvetti, D. (INAF-Milan), Marelli, M. (INAF-Milan), and Falcone, A. D. (Penn State), The Astrophysical Journal, 2016, in press (http://arxiv.org/abs/1602.00385)
NB: You are welcome to use these scripts for your own purposes, but if you do so, we kindly ask that you cite the above publication.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
load(file = "FGL3_results.rdata")
The results are stored in FGL3_results, which the load() call above restores into the workspace. Its columns are:
AGN vs PSR:
LR_P -> Logistic Regression (LR) results (for PSR vs AGN classification)
LR_Pred -> Category predicted by LR (using best threshold): “PSR” or “AGN”
RF_P -> Random Forest (RF) with 10-fold cross-validation results (for PSR vs AGN classification)
RF_Pred -> Category predicted by RF with 10-fold cross-validation (using best threshold): “PSR” or “AGN”

YNG vs MSP:
BLR_PSR_P -> Boosted Logistic Regression (BLR) results (for ‘Young’ (YNG) vs ‘Millisecond’ (MSP) pulsars)
BLR_PSR_Pred -> Category of pulsar predicted by BLR
RF_PSR_P -> Random Forest (RF) prediction (for ‘Young’ (YNG) vs ‘Millisecond’ (MSP) pulsars)
RF_PSR_Pred -> Category of pulsar predicted by RF
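As a quick illustration of how the two prediction columns can be compared, the sketch below cross-tabulates LR_Pred against RF_Pred. It runs on a small made-up data frame: the `toy` table, its `Source` column and all of its values are invented for illustration and are not taken from FGL3_results.

```r
# Toy stand-in for FGL3_results; every value here is invented for illustration.
toy <- data.frame(
  Source  = c("src_a", "src_b", "src_c", "src_d", "src_e"),  # hypothetical names
  LR_Pred = c("PSR", "AGN", "PSR", "AGN", "PSR"),
  RF_Pred = c("PSR", "AGN", "AGN", "AGN", "PSR"),
  stringsAsFactors = FALSE
)

# Cross-tabulate the categories predicted by the two classifiers.
agreement <- table(LR = toy$LR_Pred, RF = toy$RF_Pred)
print(agreement)

# Number of sources on which both classifiers agree.
n_agree <- sum(toy$LR_Pred == toy$RF_Pred)
print(n_agree)
```

With the real data, replacing `toy` by FGL3_results should give the same kind of table for the full catalog.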
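# Unassociated sources (empty CLASS1) with Signif > 10 that both LR and RF
# classify as "PSR", sorted by decreasing significance.
unassoc_results <- FGL3_results %>%
  filter(CLASS1 == "") %>%
  filter(Signif > 10) %>%
  filter(LR_Pred == RF_Pred) %>%
  filter(LR_Pred == "PSR") %>%
  arrange(desc(Signif))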
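# Sources classed as supernova remnants or pulsar wind nebulae.
# The original filter compared against "PWN " (with a trailing space);
# trimws() matches that label with or without the stray whitespace.
SNR_PWN_results <- FGL3_results %>%
  filter(trimws(CLASS1) %in% c("SNR", "snr", "spp", "PWN", "pwn"))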
# Sources classed as binaries.
BIN_results <- FGL3_results %>%
  filter(CLASS1 %in% c("HMB", "BIN"))
# Sources whose PSR_Out and AGN_Out values both exceed 75.
FGL3_results_outlyingness <- FGL3_results %>%
  filter(PSR_Out > 75 & AGN_Out > 75)
# Sources with a known pulsar or AGN-type class; pulsarness labels each
# one as "PSR" (CLASS1 is "PSR" or "psr") or "AGN" (everything else kept).
FGL3_results_pulsarness <- FGL3_results %>%
  filter(CLASS1 %in% c("PSR", "psr",
                       "BCU", "bcu",
                       "BLL", "bll",
                       "FSRQ", "fsrq",
                       "rdg", "RDG",
                       "nlsy1", "NLSY1",
                       "agn", "ssrq",
                       "sey")) %>%
  mutate(pulsarness = factor(CLASS1 %in% c("PSR", "psr"), labels = c("AGN", "PSR")))
# Number of sources on which LR and RF agree (answer: 1825).
sum(FGL3_results_pulsarness$LR_Pred == FGL3_results_pulsarness$RF_Pred)
## [1] 1825
# Sources on which LR and RF agree with each other but disagree with the known class.
misclassifications <- FGL3_results_pulsarness %>%
  filter(LR_Pred == RF_Pred, RF_Pred != pulsarness)
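The selection of misclassified sources above can be tried out on a toy table. The sketch below uses base R only and a made-up data frame (`toy` and all of its rows are invented for illustration, not drawn from the catalog). It keeps the sources on which both classifiers agree and tabulates the agreed prediction against the known class.

```r
# Toy stand-in for FGL3_results; every row here is invented for illustration.
toy <- data.frame(
  CLASS1  = c("PSR", "psr", "bll", "FSRQ", "bcu", "PSR"),
  LR_Pred = c("PSR", "AGN", "AGN", "PSR", "AGN", "PSR"),
  RF_Pred = c("PSR", "AGN", "AGN", "PSR", "PSR", "PSR"),
  stringsAsFactors = FALSE
)

# Known class: "PSR" for pulsars, "AGN" for every other class in this toy table.
toy$pulsarness <- ifelse(toy$CLASS1 %in% c("PSR", "psr"), "PSR", "AGN")

# Keep only the sources on which LR and RF agree.
agreed <- toy[toy$LR_Pred == toy$RF_Pred, ]

# Confusion table of the agreed prediction against the known class.
confusion <- table(Predicted = agreed$RF_Pred, True = agreed$pulsarness)
print(confusion)

# Sources where the agreed prediction differs from the known class.
n_misclassified <- sum(agreed$RF_Pred != agreed$pulsarness)
print(n_misclassified)
```

The off-diagonal cells of `confusion` correspond to the rows that the `misclassifications` filter would select.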
Table 5: Below are the results of the best models, as applied to the 3FGL catalog (all 3021 sources for which predictor parameters are available). Note that the table is searchable and sortable (by clicking on the arrows by each column).