Jacob, J. C., Armstrong, E. M., Bourassa, M. A., Cram, T., Elya, J. L., Greguska, F. R., III, et al. (2018). OceanWorks: Enabling Interactive Oceanographic Analysis in the Cloud with Multivariate Data. In American Geophysical Union (Vol. Fall Meeting).
Abstract: NASA's Advanced Information System Technology (AIST) Program sponsors the OceanWorks project to establish an integrated data analytics center at the Physical Oceanography Distributed Active Archive Center (PO.DAAC). OceanWorks provides a series of interoperable capabilities that are essential for cloud-scale oceanographic research. These include big data analytics, data search with subsecond response, intelligent ranking of search results, subsetting based on data quality metrics, and rapid spatiotemporal matchup of satellite measurements with distributed in situ data. The software behind OceanWorks is being developed as an open source project in the Apache Incubator Science Data Analytics Platform (SDAP – http://sdap.apache.org). In this presentation we describe how OceanWorks enables efficient, scalable, interactive and interdisciplinary oceanographic analysis with multivariate data.
Interactivity is enabled by a number of SDAP features. First, SDAP provides Representational State Transfer (REST) interfaces to a number of built-in cloud analytics to compute time series, time-averaged maps, correlation maps, climatological maps, Hovmöller maps, and more. To access these, users simply navigate to a properly constructed parameterized URL in their web browser or issue web services calls in a variety of programming languages or in a Jupyter notebook. Alternatively, Python clients can make function calls via the NEXUS Command Line Interface (CLI). Authenticated users can even inject their own custom code via REST calls or the CLI.
To enable interdisciplinary science, OceanWorks provides access to a rich collection of multivariate satellite and in situ measurements of the oceans (e.g., sea surface temperature, height and salinity, chlorophyll and circulation) and other Earth science data (e.g., aerosol optical depth and wind speed), coupled with on-demand processing capabilities close to the data. We partition the data across space or time into tiles and store them in cloud-aware databases that are collocated with the computations. We will provide examples of scientific studies directly enabled by OceanWorks' multivariate data and cloud analytics.
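As a rough illustration of the "properly constructed parameterized URL" mentioned in the abstract above, the sketch below assembles such a URL with Python's standard library. The host, the endpoint name `timeSeries`, the dataset name, and the parameter names (`ds`, `b`, `startTime`, `endTime`) are hypothetical placeholders chosen for illustration, not taken from the abstract or from SDAP documentation. No request is sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_analytics_url(base_url, endpoint, **params):
    """Return base_url/endpoint?key=value&... with URL-encoded parameters."""
    # Sort parameters so the resulting URL is deterministic.
    query = urlencode(sorted(params.items()))
    return f"{base_url.rstrip('/')}/{endpoint.lstrip('/')}?{query}"


# All values below are hypothetical placeholders.
url = build_analytics_url(
    "https://example.org/sdap",        # assumed deployment address
    "timeSeries",                      # assumed analytics endpoint name
    ds="EXAMPLE_SST_DATASET",          # assumed dataset identifier
    b="-150,40,-120,55",               # assumed bounding box: west,south,east,north
    startTime="2016-01-01T00:00:00Z",
    endTime="2016-12-31T23:59:59Z",
)

# Decode the query string again to confirm the round trip.
decoded = parse_qs(urlsplit(url).query)
print(url)
print(decoded["b"][0])
```

The resulting string could be pasted into a browser or passed to any HTTP client; the real endpoint and parameter names would have to come from the deployment's own documentation.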
Armstrong, E. M., Bourassa, M. A., Cram, T., Elya, J. L., Greguska, F. R., III, Huang, T., et al. (2018). An information technology foundation for fostering interdisciplinary oceanographic research and analysis. In American Geophysical Union (Vol. Fall Meeting).
Abstract: Before complex analysis of oceanographic or any Earth science data can occur, the data must be placed in the proper domain of computing and software resources. In the past this was nearly always the scientist's personal computer or institutional computer servers. The problem with this approach is that the data products must be brought directly to these compute resources, leading to large data transfers and storage requirements, especially for high-volume satellite or model datasets. In this presentation we describe a new technological solution under development and implementation at the NASA Jet Propulsion Laboratory for conducting oceanographic and related research based on satellite data and other sources. Fundamentally, our approach for satellite resources is to tile (partition) the data inputs into cloud-optimized and computation-friendly databases that allow distributed computing resources to perform on-demand, server-side computation and data analytics. This technology, known as NEXUS, has already been implemented in several existing NASA data portals to support oceanographic, sea-level, and gravity data time series analysis with capabilities to output time-averaged maps, correlation maps, Hovmöller plots, climatological averages and more. A further extension of this technology will integrate ocean in situ observations, event-based data discovery (e.g., natural disasters), data quality screening and additional capabilities. This particular activity is an open source project known as the Apache Science Data Analytics Platform (SDAP) (https://sdap.apache.org), and colloquially as OceanWorks, and is funded by the NASA AIST program. It harmonizes data, tools and computational resources for researchers, allowing them to focus on research results and hypothesis testing rather than on security, data preparation and management.
We will present a few oceanographic and interdisciplinary use cases demonstrating the capabilities for characterizing regional sea-level rise, sea surface temperature anomalies, and ocean hurricane responses.
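The tiling idea described in the abstract above can be sketched in a few lines. This is a minimal, assumption-laden illustration and not the NEXUS implementation: the grid is a small synthetic list of lists, the tile sizes are arbitrary, and the helper names (`tile_ranges`, `partition`, `tile_summary`) are invented for this example. It shows how an area average can be assembled from per-tile partial results, which is what makes the computation easy to distribute.

```python
from itertools import product


def tile_ranges(n, size):
    """Split range(n) into consecutive (start, stop) chunks of at most `size`."""
    return [(start, min(start + size, n)) for start in range(0, n, size)]


def partition(grid, tile_rows, tile_cols):
    """Yield ((row_start, col_start), tile) pairs for a 2-D list-of-lists grid."""
    n_rows, n_cols = len(grid), len(grid[0])
    row_chunks = tile_ranges(n_rows, tile_rows)
    col_chunks = tile_ranges(n_cols, tile_cols)
    for (r0, r1), (c0, c1) in product(row_chunks, col_chunks):
        yield (r0, c0), [row[c0:c1] for row in grid[r0:r1]]


def tile_summary(tile):
    """Return (sum, count) for one tile; each tile can be processed independently."""
    values = [v for row in tile for v in row]
    return sum(values), len(values)


# Synthetic 4 x 6 grid standing in for one time step of a gridded variable.
grid = [[float(r * 6 + c) for c in range(6)] for r in range(4)]

# Partition into 2 x 3 tiles, keyed by the tile's upper-left grid index.
tiles = dict(partition(grid, tile_rows=2, tile_cols=3))

# Combine per-tile partial sums into a global mean.
partials = [tile_summary(tile) for tile in tiles.values()]
total, count = map(sum, zip(*partials))
mean_from_tiles = total / count
print(len(tiles), mean_from_tiles)
```

Because each `tile_summary` call depends only on its own tile, the per-tile step could run on separate workers located near the stored tiles, with only the small `(sum, count)` pairs sent back for the final combination.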