Smith, S. R., Bourassa, M. A., & Jackson, D. L. (2012). Supporting Satellite Research With Data Collected by Vessels: Automated Meteorological Sampling From SAMOS Project Gathers Data Required to Test Satellite Algorithms, Validate Products.
Sea Technology, 53(6), 21–24.
Sullivan, D., Rosenfeld, L., Smith, S., & Murphree, T. (2010).
Oceanographic instrumentation technician, Knowledge and Skill Guidelines for Marine Science and Technology. Monterey, CA: Marine Advanced Technology Education Center.
Huang, T., Armstrong, E. M., Bourassa, M. A., Cram, T. A., Elya, J., Greguska, F., et al. (2019). An Integrated Data Analytics Platform.
Front. Mar. Sci., 6.
Abstract: An Integrated Science Data Analytics Platform is an environment that enables the confluence of resources for scientific investigation. It harmonizes data, tools and computational resources to enable the research community to focus on the investigation rather than spending time on security, data preparation, management, etc. OceanWorks is a NASA technology integration project to establish a cloud-based Integrated Ocean Science Data Analytics Platform for big ocean science at NASA's Physical Oceanography Distributed Active Archive Center (PO.DAAC). It focuses on advancement and maturity by bringing together several NASA open-source, big data projects for parallel analytics, anomaly detection, in situ to satellite data matchup, quality-screened data subsetting, search relevancy, and data discovery.
Our communities are relying on data available through distributed data centers to conduct their research. In typical investigations, scientists would (1) search for data, (2) evaluate the relevance of that data, (3) download it, and (4) then apply algorithms to identify trends, anomalies, or other attributes of the data. Such a workflow cannot scale if the research involves a massive amount of data or multi-variate measurements. With the upcoming NASA Surface Water and Ocean Topography (SWOT) mission expected to produce over 20PB of observational data during its 3-year nominal mission, the volume of data will challenge all existing Earth Science data archival, distribution and analysis paradigms. This paper discusses how OceanWorks enhances the analysis of physical ocean data where the computation is done on an elastic cloud platform next to the archive to deliver fast, web-accessible services for working with oceanographic measurements.
Kent, E. C., Rayner, N. A., Berry, D. I., Eastman, R., Grigorieva, V. G., Huang, B., et al. (2019). Observing Requirements for Long-Term Climate Records at the Ocean Surface.
Front. Mar. Sci., 6, 441.
Abstract: Observations of conditions at the ocean surface have been made for centuries, contributing to some of the longest instrumental records of climate change. Most prominent is the climate data record (CDR) of sea surface temperature (SST), which is itself essential to the majority of activities in climate science and climate service provision. A much wider range of surface marine observations is available however, providing a rich source of data on past climate. We present a general error model describing the characteristics of observations used for the construction of climate records, illustrating the importance of multi-variate records with rich metadata for reducing uncertainty in CDRs. We describe the data and metadata requirements for the construction of stable, multi-century marine CDRs for variables important for describing the changing climate: SST, mean sea level pressure, air temperature, humidity, winds, clouds, and waves. Available sources of surface marine data are reviewed in the context of the error model. We outline the need for a range of complementary observations, including very high quality observations at a limited number of locations and also observations that sample more broadly but with greater uncertainty. We describe how high-resolution modern records, particularly those of high-quality, can help to improve the quality of observations throughout the historical record. We recommend the extension of internationally-coordinated data management and curation to observation types that do not have a primary focus of the construction of climate records. Also recommended is reprocessing the existing surface marine climate archive to improve and quantify data and metadata quality and homogeneity. We also recommend the expansion of observations from research vessels and high quality moorings, routine observations from ships and from data and metadata rescue. 
Other priorities include: field evaluation of sensors; resources for the process of establishing user requirements and determining whether requirements are being met; and research to estimate uncertainty, quantify biases and to improve methods of construction of CDRs. The requirements developed in this paper encompass specific actions involving a variety of stakeholders, including funding agencies, scientists, data managers, observing network operators, satellite agencies, and international co-ordination bodies.
Freeman, E., Kent, E. C., Brohan, P., Cram, T., Gates, L., Huang, B., et al. (2019). The International Comprehensive Ocean-Atmosphere Data Set – Meeting Users Needs and Future Priorities.
Front. Mar. Sci., 6, 435.
Abstract: The International Comprehensive Ocean-Atmosphere Data Set (ICOADS) is a collection and archive of in situ marine observations, which has been developed over several decades as an international project and recently guided by formal international partnerships and the ICOADS Steering Committee. ICOADS contains observations from many different observing systems encompassing the evolution of measurement technology since the 18th century. ICOADS provides an integrated source of observations for a range of applications including research and climate monitoring, and forms the main marine in situ surface data source, e.g., near-surface ocean observations and lower atmospheric marine-meteorological observations from buoys, ships, coastal stations, and oceanographic sensors, for oceanic and atmospheric research and reanalysis. ICOADS has developed ways to incorporate user and reanalyses feedback information associated with permanent unique identifiers and is also the main repository for data that have been rescued from ships’ logbooks and other marine data digitization activities. ICOADS has been adopted widely because it provides convenient access to a range of observation types, globally, and through the entire marine instrumental record. ICOADS has provided a secure home for such observations for decades. Because of the increased volume of observations, particularly those available in near-real-time, and an expansion of their diversity, the ICOADS processing system now requires extensive modernization. Based on user feedback, we will outline the improvements that are required, the challenges to their implementation, and the benefits of upgrading this important and diverse marine archive and distribution activity.
Smith, S. R., Alory, G., Andersson, A., Asher, W., Baker, A., Berry, D. I., et al. (2019). Ship-Based Contributions to Global Ocean, Weather, and Climate Observing Systems.
Front. Mar. Sci., 6, 434.
Abstract: The role ships play in atmospheric, oceanic, and biogeochemical observations is described with a focus on measurements made near the ocean surface. Ships include merchant and research vessels; cruise liners and ferries; fishing vessels; coast guard, military, and other government-operated ships; yachts; and a growing fleet of automated surface vessels. The present capabilities of ships to measure essential climate/ocean variables and the requirements from a broad community to address operational, commercial, and scientific needs are described. The authors provide a vision to expand observations needed from ships to understand and forecast the exchanges across the ocean–atmosphere interface. The vision addresses (1) recruiting vessels to improve both spatial and temporal sampling, (2) conducting multivariate sampling on ships, (3) raising technology readiness levels of automated shipboard sensors and ship-to-shore data communications, (4) advancing quality evaluation of observations, and (5) developing a unified data management approach for observations and metadata that meet the needs of a diverse user community. Recommendations are made focusing on integrating private and autonomous vessels into the observing system, investing in sensor and communications technology development, developing an integrated data management structure that includes all types of ships, and moving toward a quality evaluation process that will result in a subset of ships being defined as mobile reference ships that will support climate studies. We envision a future where commercial, research, and privately owned vessels are making multivariate observations using a combination of automated and human-observed measurements. All data and metadata will be documented, tracked, evaluated, distributed, and archived to benefit users of marine data. 
This vision looks at ships as a holistic network, not a set of disparate commercial, research, and/or third-party activities working in isolation, to bring these communities together for the mutual benefit of all.
Armstrong, E. M., Bourassa, M. A., Cram, T. A., DeBellis, M., Elya, J., Greguska III, F. R., et al. (2019). An Integrated Data Analytics Platform.
Front. Mar. Sci., 6, 354.
Abstract: An Integrated Science Data Analytics Platform is an environment that enables the confluence of resources for scientific investigation. It harmonizes data, tools and computational resources to enable the research community to focus on the investigation rather than spending time on security, data preparation, management, etc. OceanWorks is a NASA technology integration project to establish a cloud-based Integrated Ocean Science Data Analytics Platform for big ocean science at NASA’s Physical Oceanography Distributed Active Archive Center (PO.DAAC). It focuses on advancement and maturity by bringing together several NASA open-source, big data projects for parallel analytics, anomaly detection, in situ to satellite data matchup, quality-screened data subsetting, search relevancy, and data discovery. Our communities are relying on data available through distributed data centers to conduct their research. In typical investigations, scientists would (1) search for data, (2) evaluate the relevance of that data, (3) download it, and (4) then apply algorithms to identify trends, anomalies, or other attributes of the data. Such a workflow cannot scale if the research involves a massive amount of data or multi-variate measurements. With the upcoming NASA Surface Water and Ocean Topography (SWOT) mission expected to produce over 20PB of observational data during its 3-year nominal mission, the volume of data will challenge all existing Earth Science data archival, distribution and analysis paradigms. This paper discusses how OceanWorks enhances the analysis of physical ocean data where the computation is done on an elastic cloud platform next to the archive to deliver fast, web-accessible services for working with oceanographic measurements.
Armstrong, E. M., Bourassa, M. A., Cram, T., Elya, J. L., Greguska, F. R., III, Huang, T., et al. (2018). An information technology foundation for fostering interdisciplinary oceanographic research and analysis. In
American Geophysical Union (Vol. Fall Meeting).
Abstract: Before complex analysis of oceanographic or any earth science data can occur, it must be placed in the proper domain of computing and software resources. In the past this was nearly always the scientist's personal computer or institutional computer servers. The problem with this approach is that it is necessary to bring the data products directly to these compute resources leading to large data transfers and storage requirements especially for high volume satellite or model datasets. In this presentation we will present a new technological solution under development and implementation at the NASA Jet Propulsion Laboratory for conducting oceanographic and related research based on satellite data and other sources. Fundamentally, our approach for satellite resources is to tile (partition) the data inputs into cloud-optimized and computation friendly databases that allow distributed computing resources to perform on demand and server-side computation and data analytics. This technology, known as NEXUS, has already been implemented in several existing NASA data portals to support oceanographic, sea-level, and gravity data time series analysis with capabilities to output time-average maps, correlation maps, Hovmöller plots, climatological averages and more. A further extension of this technology will integrate ocean in situ observations, event-based data discovery (e.g., natural disasters), data quality screening and additional capabilities. This particular activity is an open source project known as the Apache Science Data Analytics Platform (SDAP) (https://sdap.apache.org), and colloquially as OceanWorks, and is funded by the NASA AIST program. It harmonizes data, tools and computational resources for the researcher allowing them to focus on research results and hypothesis testing, and not be concerned with security, data preparation and management. 
We will present a few oceanographic and interdisciplinary use cases demonstrating the capabilities for characterizing regional sea-level rise, sea surface temperature anomalies, and ocean hurricane responses.
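Illustrative note: the tiling idea described in the abstract above (partition the gridded input into tiles so that each tile can be processed independently, then combine the partial results) can be sketched as follows. This is a minimal sketch under stated assumptions, not the NEXUS implementation: the function names, the tile size, and the use of None for missing cells are all hypothetical choices made for illustration.

```python
def make_tiles(grid, tile_rows, tile_cols):
    """Split a 2-D grid (list of equal-length rows) into rectangular tiles.

    Yields (row_offset, col_offset, tile). Edge tiles may be smaller.
    Hypothetical helper, not part of any NEXUS or SDAP API.
    """
    n_rows = len(grid)
    n_cols = len(grid[0]) if grid else 0
    for r in range(0, n_rows, tile_rows):
        for c in range(0, n_cols, tile_cols):
            tile = [row[c:c + tile_cols] for row in grid[r:r + tile_rows]]
            yield r, c, tile


def tile_partial(tile):
    """Per-tile partial result: (sum, count) over non-missing cells."""
    values = [v for row in tile for v in row if v is not None]
    return sum(values), len(values)


def area_mean(grid, tile_rows=2, tile_cols=2):
    """Combine per-tile partial sums into a single area mean.

    Each tile_partial call is independent of the others, so in a
    distributed setting the calls could run on separate workers.
    """
    total, count = 0.0, 0
    for _, _, tile in make_tiles(grid, tile_rows, tile_cols):
        s, n = tile_partial(tile)
        total += s
        count += n
    return total / count if count else None


# Toy 3x3 field with one missing cell (None), values chosen arbitrarily.
grid = [
    [10.0, 11.0, 12.0],
    [13.0, None, 15.0],
    [16.0, 17.0, 18.0],
]
tiles = list(make_tiles(grid, 2, 2))
mean = area_mean(grid)
print(len(tiles), mean)
```

Keeping a (sum, count) pair per tile, rather than a per-tile mean, lets tiles of unequal size or with missing cells be combined without bias.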
O'Hara, S. H., Arko, R. A., Clark, D., Chandler, C. L., Elya, J. L., Ferrini, V. L., et al. (2018). Rolling Deck to Repository (R2R) Program Data Services for the Oceanographic Research Community. In
American Geophysical Union (Vol. American Geophysical Union, Fall Meeting 2018).
Abstract: Research vessels supported by NSF are critical platforms contributing to academic oceanographic research in the US. The “underway” data sets obtained from the continuously operating geophysical, water column, and meteorological sensors aboard these vessels provide characterization of basic environmental conditions for the oceans and are of high scientific value for building global syntheses, climatologies, and historical time series of ocean properties (e.g., the World Ocean Atlas, the GMRT bathymetric synthesis, ICOADS). The Rolling Deck to Repository program (www.rvdata.us) provides a central shore-side data gateway that ensures the basic documentation, assessment and submission of all environmental data from ship operators to the NOAA long-term archives for these data. R2R provides a set of data services for the oceanographic research community, including: publishing an online, searchable and browsable master cruise catalog, supported by cruise and data set DOIs; organizing, archiving, and disseminating original underway data and documents; assessing data quality on select data types; creating select post-field data products; and supporting at-sea event logging. In this presentation we will discuss new developments in R2R data services and challenges associated with ship-based data management. A significant challenge is the dramatic increase in data volumes associated with new sensors (e.g., the EK80 sonar systems) whereby individual cruise distributions can be several terabytes. Ship operators, R2R and NCEI must design a way to move and store these growing volumes. R2R is also working to make information more accessible and complete. A new website has been launched along with API web services that allow users to find and use data more easily. R2R is working to improve device metadata, including working to identify the time sources for all environmental sensors to support accurate comparison and merging of data sets.
Jacob, J. C., Armstrong, E. M., Bourassa, M. A., Cram, T., Elya, J. L., Greguska, F. R., III, et al. (2018). OceanWorks: Enabling Interactive Oceanographic Analysis in the Cloud with Multivariate Data. In
American Geophysical Union (Vol. Fall Meeting).
Abstract: NASA's Advanced Information System Technology (AIST) Program sponsors the OceanWorks project to establish an integrated data analytics center at the Physical Oceanography Distributed Active Archive Center (PO.DAAC). OceanWorks provides a series of interoperable capabilities that are essential for cloud-scale oceanographic research. These include big data analytics, data search with subsecond response, intelligent ranking of search results, subsetting based on data quality metrics, and rapid spatiotemporal matchup of satellite measurements with distributed in situ data. The software behind OceanWorks is being developed as an open source project in the Apache Incubator Science Data Analytics Platform (SDAP – http://sdap.apache.org). In this presentation we describe how OceanWorks enables efficient, scalable, interactive and interdisciplinary oceanographic analysis with multivariate data.
Interactivity is enabled by a number of SDAP features. First, SDAP provides Representational State Transfer (REST) interfaces to a number of built-in cloud analytics to compute time series, time-averaged maps, correlation maps, climatological maps, Hovmöller maps, and more. To access these, users simply navigate to a properly constructed parameterized URL in their web browser or issue web services calls in a variety of programming languages or in a Jupyter notebook. Alternatively, Python clients can make function calls via the NEXUS Command Line Interface (CLI). Authenticated users can even inject their own custom code via REST calls or the CLI.
To enable interdisciplinary science, OceanWorks provides access to a rich collection of multivariate satellite and in situ measurements of the oceans (e.g., sea surface temperature, height and salinity, chlorophyll and circulation) and other Earth science data (e.g., aerosol optical depth and wind speed), coupled with on-demand processing capabilities close to the data. We partition the data across space or time into tiles and store them into cloud-aware databases that are collocated with the computations. We will provide examples of scientific studies directly enabled by OceanWorks' multivariate data and cloud analytics.
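Illustrative note: the abstract above describes reaching the built-in analytics through "a properly constructed parameterized URL". A minimal sketch of building such a URL is given below. The host, the endpoint path, the dataset name, and every parameter name here are assumptions chosen for illustration only; they are not taken from the SDAP or NEXUS documentation, which should be consulted for the real interface.

```python
from urllib.parse import urlencode

# Hypothetical base URL; a real deployment would publish its own.
BASE_URL = "https://example.org/analytics"


def build_time_series_url(dataset, bbox, start, end, base_url=BASE_URL):
    """Return a parameterized URL requesting an area-averaged time series.

    bbox is (min_lon, min_lat, max_lon, max_lat) in degrees.
    The endpoint and parameter names are illustrative assumptions.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    if min_lon > max_lon or min_lat > max_lat:
        raise ValueError("bounding box minimums must not exceed maximums")
    params = {
        "dataset": dataset,
        "bbox": f"{min_lon},{min_lat},{max_lon},{max_lat}",
        "start": start,
        "end": end,
    }
    # urlencode percent-escapes commas and colons so the query is URL-safe.
    return f"{base_url}/timeSeries?{urlencode(params)}"


url = build_time_series_url(
    "EXAMPLE_SST",                      # hypothetical dataset identifier
    (-150.0, 40.0, -120.0, 55.0),       # arbitrary region
    "2015-01-01T00:00:00Z",
    "2015-12-31T23:59:59Z",
)
print(url)
```

The resulting string could be pasted into a browser or passed to any HTTP client, for example from a Jupyter notebook, which matches the access pattern the abstract describes.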