BCO-DMO Data Processing Conventions

BCO-DMO curates and processes data to ensure they are findable, accessible, interoperable, and reusable. Our typical processing procedures are described below. Some datasets require more or fewer processing steps than others, depending on the format and organization of the data files we receive. Refer to each dataset's metadata page for a description of the specific processing applied.

Parameter Naming
BCO-DMO uses the term "parameter" to mean "column name", "column header", or "field". To support data reuse in a wide variety of applications, we routinely adjust the parameter names provided to us so that they meet the following naming conventions:

  • The only allowed characters in parameter names are A-Z, a-z, 0-9, and underscores (no spaces, hyphens, commas, parentheses, or Greek letters).
  • Parameter names must begin with a letter (not a number).
  • Units are typically not included in parameter names because they are provided as part of the metadata.
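
The first two rules above can be checked mechanically. The following is a minimal Python sketch, not BCO-DMO's own tooling; the function name is_valid_parameter_name is hypothetical and chosen only for illustration.

```python
import re

# A letter first, then any mix of letters, digits, and underscores.
# The explicit A-Z/a-z ranges exclude Greek and other non-ASCII letters.
_PARAMETER_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_parameter_name(name: str) -> bool:
    """Hypothetical helper: True if `name` follows the character rules above."""
    return _PARAMETER_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["chl_a", "Temp (C)", "2nd_cast", "σ_theta"]:
        print(candidate, is_valid_parameter_name(candidate))
```

The check covers only the allowed characters and the leading letter. Whether a name embeds units is a judgment call that a pattern like this cannot make.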

Missing Data Identifiers
A "missing data identifier" is the notation used to indicate that there is no data value. Commonly used notations for missing data include -999, NA, nd, NaN, or blank cells. BCO-DMO's default missing data identifier is "nd" (meaning "no data"), but it is rendered differently depending on how you view the data. For example, missing data are shown as blank (null) values when you download the data as a .csv file. In MATLAB .mat files, they are displayed as NaN. When you view data online at BCO-DMO, missing values are shown as "nd".
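
When reading such data in your own code, it can help to map every missing-value notation you expect onto a single in-memory representation. The sketch below uses only the Python standard library. The set of tokens, the sample data, and the helper name parse_value are assumptions for illustration, not part of BCO-DMO's processing.

```python
import csv
import io
import math

# Assumption: the notations this reader should treat as "no data".
MISSING_TOKENS = {"nd", ""}


def parse_value(text: str) -> float:
    """Hypothetical helper: return NaN for a missing token, else the number."""
    text = text.strip()
    return math.nan if text in MISSING_TOKENS else float(text)


# Illustrative data only: one "nd" cell and one blank cell in the temp column.
sample = "depth,temp\n5,12.1\n10,nd\n15,\n"
rows = list(csv.DictReader(io.StringIO(sample)))
temps = [parse_value(row["temp"]) for row in rows]
```

Here the "nd" cell and the blank cell both become NaN, so downstream calculations see a single missing-value representation.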

Latitude and Longitude Conversions
Latitudes are typically reported in decimal degrees north, meaning coordinates in the Southern Hemisphere are negative values. Longitudes are typically reported in decimal degrees east, meaning coordinates in the Western Hemisphere are negative values. BCO-DMO will convert latitude and longitude values to conform with these conventions. If the data provider prefers to provide latitude and longitude in another format, such as 0-360 degrees, we will provide both formats in separate columns in the dataset.
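
As an illustration of the longitude convention, a value on a 0-360 degrees east scale can be mapped to the signed convention with modular arithmetic. This is a minimal Python sketch; the function name to_signed_longitude is hypothetical, and it is not a description of BCO-DMO's own conversion code.

```python
def to_signed_longitude(lon_0_360: float) -> float:
    """Hypothetical helper: map degrees east on a 0-360 scale to [-180, 180)."""
    # Shift by 180, wrap into [0, 360), then shift back.
    return ((lon_0_360 + 180.0) % 360.0) - 180.0


if __name__ == "__main__":
    for lon in (10.0, 200.0, 359.5):
        print(lon, "->", to_signed_longitude(lon))
```

With this formula the antimeridian (180) maps to -180 rather than +180. Both describe the same meridian, so choose whichever your downstream tools expect.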

Date and Time Conversions
Whenever possible, dates and times are provided in UTC in ISO 8601 format: YYYY-MM-DDThh:mm:ssZ. Dates and times provided in other formats or time zones will be converted to this convention. In cases where the local time is valuable for understanding the data, both local and UTC columns will be provided in the dataset.
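
A conversion of this kind can be sketched with the Python standard library. The example assumes the local timestamp arrives as text in a known format with a known fixed offset from UTC. The function name to_iso_utc, the default input format, and the sample value are assumptions for illustration. Time zones with daylight saving changes would need a proper time zone database rather than a fixed offset.

```python
from datetime import datetime, timedelta, timezone


def to_iso_utc(local_text: str, utc_offset_hours: float,
               fmt: str = "%m/%d/%Y %H:%M:%S") -> str:
    """Hypothetical helper: local timestamp text to YYYY-MM-DDThh:mm:ssZ."""
    # Attach the assumed fixed offset to the parsed (naive) local time.
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    local_dt = datetime.strptime(local_text, fmt).replace(tzinfo=local_tz)
    # Convert to UTC and format with the trailing "Z" designator.
    return local_dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    # 20:30 at UTC-4 falls on the next calendar day in UTC.
    print(to_iso_utc("07/04/2019 20:30:00", -4))
```

Note that the calendar date can change during conversion, which is one reason a dataset may carry both local and UTC columns.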