TY - JOUR
T1 - SBI++
T2 - Flexible, Ultra-fast Likelihood-free Inference Customized for Astronomical Applications
AU - Wang, Bingjie
AU - Leja, Joel
AU - Villar, V. Ashley
AU - Speagle, Joshua S.
N1 - Funding Information:
We thank John Weaver and Kate Whitaker for providing us with the reduced JWST photometry prior to the public release. B.W. is supported by the Institute for Gravitation and the Cosmos through the Eberly College of Science. This research received funding from the Pennsylvania State University’s Institute for Computational and Data Sciences through the ICDS Seed Grant Program. Computations for this research were performed on the Pennsylvania State University’s Institute for Computational and Data Sciences’ Roar supercomputer. This publication made use of the NASA Astrophysical Data System for bibliographic information.
Publisher Copyright:
© 2023. The Author(s). Published by the American Astronomical Society.
PY - 2023/7/1
Y1 - 2023/7/1
N2 - Flagship near-future surveys targeting 10^8-10^9 galaxies across cosmic time will soon reveal the processes of galaxy assembly in unprecedented resolution. This creates an immediate computational challenge on effective analyses of the full data set. With simulation-based inference (SBI), it is possible to attain complex posterior distributions with the accuracy of traditional methods but with a >10^4 increase in speed. However, it comes with a major limitation. Standard SBI requires the simulated data to have characteristics identical to those of the observed data, which is often violated in astronomical surveys due to inhomogeneous coverage and/or fluctuating sky and telescope conditions. In this work, we present a complete SBI-based methodology, SBI++, for treating out-of-distribution measurement errors and missing data. We show that out-of-distribution errors can be approximated by using standard SBI evaluations and that missing data can be marginalized over using SBI evaluations over nearby data realizations in the training set. In addition to the validation set, we apply SBI++ to galaxies identified in extragalactic images acquired by the James Webb Space Telescope, and show that SBI++ can infer photometric redshifts at least as accurately as traditional sampling methods, and crucially, better than the original SBI algorithm using training data with a wide range of observational errors. SBI++ retains the fast inference speed of ∼1 s for objects in the observational training set distribution, and additionally permits parameter inference outside of the trained noise and data at ∼1 minute per object. This expanded regime has broad implications for future applications to astronomical surveys. (Code and a Jupyter tutorial are made publicly available at https://github.com/wangbingjie/sbi_pp.)
AB - Flagship near-future surveys targeting 10^8-10^9 galaxies across cosmic time will soon reveal the processes of galaxy assembly in unprecedented resolution. This creates an immediate computational challenge on effective analyses of the full data set. With simulation-based inference (SBI), it is possible to attain complex posterior distributions with the accuracy of traditional methods but with a >10^4 increase in speed. However, it comes with a major limitation. Standard SBI requires the simulated data to have characteristics identical to those of the observed data, which is often violated in astronomical surveys due to inhomogeneous coverage and/or fluctuating sky and telescope conditions. In this work, we present a complete SBI-based methodology, SBI++, for treating out-of-distribution measurement errors and missing data. We show that out-of-distribution errors can be approximated by using standard SBI evaluations and that missing data can be marginalized over using SBI evaluations over nearby data realizations in the training set. In addition to the validation set, we apply SBI++ to galaxies identified in extragalactic images acquired by the James Webb Space Telescope, and show that SBI++ can infer photometric redshifts at least as accurately as traditional sampling methods, and crucially, better than the original SBI algorithm using training data with a wide range of observational errors. SBI++ retains the fast inference speed of ∼1 s for objects in the observational training set distribution, and additionally permits parameter inference outside of the trained noise and data at ∼1 minute per object. This expanded regime has broad implications for future applications to astronomical surveys. (Code and a Jupyter tutorial are made publicly available at https://github.com/wangbingjie/sbi_pp.)
UR - http://www.scopus.com/inward/record.url?scp=85165712202&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85165712202&partnerID=8YFLogxK
U2 - 10.3847/2041-8213/ace361
DO - 10.3847/2041-8213/ace361
M3 - Article
AN - SCOPUS:85165712202
SN - 2041-8205
VL - 952
JO - Astrophysical Journal Letters
JF - Astrophysical Journal Letters
IS - 1
M1 - L10
ER -