
INARA: A Machine Learning Retrieval Framework with a Data Set of 3 Million Simulated Exoplanet Atmospheric Spectra

Molly D. O’Beirne, Michael D. Himes, Frank Soboczenski, Simone Zorzan, Adam Cobb, Atılım Güneş Baydin, Yarin Gal, Daniel Angerhausen, Massimo Mascaro, Giada N. Arney and Shawn D. Domagal-Goldman

Abstract

Traditional approaches for determining the atmospheres of exoplanets from telescopic spectral data (i.e., atmospheric retrievals) involve time-consuming and compute-intensive Bayesian sampling methods, requiring a compromise between physical and chemical realism and overall computational feasibility. For rocky, terrestrial exoplanets, the retrieved atmospheric composition can give insight into the surface fluxes of gaseous species necessary to maintain the stability of that atmosphere, which may in turn provide insight into the geological and/or biological processes active on the planet. Machine learning (ML) offers a feasible and reliable approach to expedite the process of atmospheric retrievals; however, ML models require a large data set to train on. Here we present a data set of 3,000,000 simulated atmospheric spectra of rocky, terrestrial exoplanets generated across a broad parameter space of stellar and planetary properties, including 12 molecular species relevant for determining extant life. We then introduce INARA (Intelligent exoplaNet Atmospheric RetrievAl), our ML-based atmospheric retrieval framework. In a matter of seconds, INARA is capable of retrieving accurate concentrations of 12 molecular atmospheric constituents when given an observed spectrum. Our work represents the first large-scale simulated spectral data set and first atmospheric retrieval ML model for rocky, terrestrial exoplanets.

Book Title
Astrobiology Science Conference (AbSciCon 2019), Bellevue, Washington, June 24–28, 2019
Year
2019