Jacob, J. C., Armstrong, E. M., Bourassa, M. A., Cram, T., Elya, J. L., Greguska, F. R., III, et al. (2018). OceanWorks: Enabling Interactive Oceanographic Analysis in the Cloud with Multivariate Data. In American Geophysical Union (Vol. Fall Meeting).
Abstract: NASA's Advanced Information System Technology (AIST) Program sponsors the OceanWorks project to establish an integrated data analytics center at the Physical Oceanography Distributed Active Archive Center (PO.DAAC). OceanWorks provides a series of interoperable capabilities that are essential for cloud-scale oceanographic research. These include big data analytics, data search with subsecond response, intelligent ranking of search results, subsetting based on data quality metrics, and rapid spatiotemporal matchup of satellite measurements with distributed in situ data. The software behind OceanWorks is being developed as an open source project in the Apache Incubator Science Data Analytics Platform (SDAP – http://sdap.apache.org). In this presentation we describe how OceanWorks enables efficient, scalable, interactive and interdisciplinary oceanographic analysis with multivariate data.
Interactivity is enabled by several SDAP features. First, SDAP provides Representational State Transfer (REST) interfaces to built-in cloud analytics that compute time series, time-averaged maps, correlation maps, climatological maps, Hovmöller maps, and more. To access these, users simply navigate to a properly constructed parameterized URL in their web browser or issue web service calls from a variety of programming languages or from a Jupyter notebook. Alternatively, Python clients can make function calls via the NEXUS Command Line Interface (CLI). Authenticated users can even inject their own custom code via REST calls or the CLI.
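As a rough illustration of what a "properly constructed parameterized URL" might look like, the sketch below assembles one with Python's standard library. The host, the endpoint name `timeSeries`, the dataset identifier, and every query parameter name are hypothetical placeholders chosen for this example; they are not taken from the SDAP documentation, and the code only builds the URL without contacting any service.

```python
from urllib.parse import urlencode

# Hypothetical host and path, for illustration only (not a real SDAP deployment).
BASE_URL = "https://example.org/analytics"


def build_request_url(base_url, endpoint, **params):
    """Return a parameterized URL for a REST-style analytics request."""
    # Sort the parameters so the same inputs always give the same URL.
    query = urlencode(sorted(params.items()))
    return f"{base_url.rstrip('/')}/{endpoint}?{query}"


# Endpoint and parameter names below are assumptions made for this sketch.
url = build_request_url(
    BASE_URL,
    "timeSeries",
    ds="EXAMPLE_SST_DATASET",
    minLon=-150,
    maxLon=-120,
    minLat=20,
    maxLat=40,
    startTime="2010-01-01T00:00:00Z",
    endTime="2010-12-31T23:59:59Z",
)
print(url)
```

A URL built this way could be pasted into a browser or passed to any HTTP client, which is the access pattern the abstract describes.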
To enable interdisciplinary science, OceanWorks provides access to a rich collection of multivariate satellite and in situ measurements of the oceans (e.g., sea surface temperature, height and salinity, chlorophyll and circulation) and other Earth science data (e.g., aerosol optical depth and wind speed), coupled with on-demand processing capabilities close to the data. We partition the data across space or time into tiles and store them in cloud-aware databases that are collocated with the computations. We will provide examples of scientific studies directly enabled by OceanWorks' multivariate data and cloud analytics.
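The idea of partitioning data across space or time into tiles can be sketched in a few lines of Python. This is a minimal illustration under assumed values: the grid dimensions, the tile shape, and the helper name `tile_slices` are invented for the example, and the abstract does not specify how OceanWorks actually sizes or indexes its tiles.

```python
from itertools import product


def tile_slices(shape, tile_shape):
    """Yield tuples of slices that cover an N-d grid with fixed-size tiles.

    Tiles on the upper edge of an axis are truncated when the axis length
    is not a multiple of the tile length.
    """
    per_axis = [
        [slice(start, min(start + step, size)) for start in range(0, size, step)]
        for size, step in zip(shape, tile_shape)
    ]
    yield from product(*per_axis)


# Assumed example: a daily, 1-degree global grid ordered as (time, lat, lon).
grid_shape = (365, 180, 360)
# Assumed tile shape: one time step and a 30 x 30 cell spatial block per tile.
tile_shape = (1, 30, 30)

tiles = list(tile_slices(grid_shape, tile_shape))
print(len(tiles))  # 365 * 6 * 12 = 26280 tiles
```

Each tuple of slices identifies one tile that could be stored and processed independently, which is what allows the computation to run close to the data.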