Putative taxonomic information for ARISA bins as generated from clone libraries, 2014 (Bacterial, Archaeal, and Protistan Biodiversity project, Marine Viral Dynamics project)

Website: https://www.bco-dmo.org/dataset/535507
Version: 2014-11-05
Version Date: 2014-11-03

Project
» Pattern and Process in Marine Bacterial, Archaeal, and Protistan Biodiversity, and Effects of Human Impacts (Bacterial, Archaeal, and Protistan Biodiversity)
» Marine viral dynamics and incorporation into microbial association networks (Marine Viral Dynamics)

Program
» Dimensions of Biodiversity (Dimensions of Biodiversity)
Contributors | Affiliation | Role
Fuhrman, Jed A. | University of Southern California (USC) | Principal Investigator
Cram, Jacob A. | University of Southern California (USC) | Contact
Copley, Nancy | Woods Hole Oceanographic Institution (WHOI BCO-DMO) | BCO-DMO Data Manager


Dataset Description

This dataset contains putative taxonomic information for the ARISA bins, as generated from clone libraries. The protocol for assigning these IDs is described in Cram et al. (in press).

This file also contains summary statistics (e.g., mean, standard deviation) for every variable measured.

Related Datasets:
SPOT environmental data
ARISA Relative Abundances
SPOT cruises


Methods & Sampling

Detailed information on the Methodology (pdf), including:
    Satellite measurements
    Assigning Taxonomic Identities to ARISA peaks
    Environmental parameter variability
    Seasonal variability of microbial community structure
    Mantel test approach
    Interannual variability of microbial community structure
    Alpha diversity: 
        Variability between depths
        Relation to season
        Relation to community similarity between depths
        Relation to community change
        Environmental parameters and community structure: Mantel tests
    Temporal dynamics of microbial taxa over time
        Transformations
        Taxonomic Groups
        OTUs

Relevant References:

Beman JM, Steele JA, Fuhrman JA. (2011). Co-occurrence patterns for abundant marine archaeal and bacterial lineages in the deep chlorophyll maximum of coastal California. ISME J 5: 1077-1085.

Cram JA, Chow C-ET, Sachdeva R, et al. (2014) Seasonal and interannual variability of the marine bacterioplankton community throughout the water column over ten years. The ISME Journal. doi: 10.1038/ismej.2014.153.

Frouin R, Franz BA, Werdell PJ (2003) The SeaWiFS PAR product. Algorithm updates for the fourth SeaWiFS data reprocessing 46-50.

Fuhrman J, Azam F (1982) Thymidine incorporation as a measure of heterotrophic bacterioplankton production in marine surface waters - evaluation and field results. Marine Biology 66:109-120.

Kirchman D, K’nees E, Hodson R (1985) Leucine incorporation and its potential as a measure of protein synthesis by bacteria in natural aquatic systems. Appl Environ Microbiol 49:599-607.

Morel A, Gentili B (2009) A simple band ratio technique to quantify the colored dissolved and detrital organic material from ocean color remotely sensed data. Remote Sensing of Environment 113:998-1011. doi: 10.1016/j.rse.2009.01.008

Noble RT, Fuhrman JA. (1998). Use of SYBR Green I for rapid epifluorescence counts of marine viruses and bacteria. Aquat. Microb. Ecol 14: 113-118.

Parsons TR (1984) A manual of chemical and biological methods for seawater analysis, 1st ed. Pergamon Press, Oxford [Oxfordshire]; New York

Patel A, Noble RT, Steele JA, Schwalbach MS, Hewson I, Fuhrman JA. (2007). Virus and prokaryote enumeration from planktonic aquatic environments by epifluorescence microscopy with SYBR Green I. Nat Protoc 2: 269-276.

Stramski D, Reynolds RA, Babin M, et al. (2008) Relationships between the surface concentration of particulate organic carbon and optical properties in the eastern South Pacific and eastern Atlantic Oceans. Biogeosciences 5:171-201.


Data Processing Description

BCO-DMO Processing:

- added conventional header with dataset name, PI name, version date
- moved columns amoA through bcsim89 before the ARISA_# columns
- transformed ARISA_#.# columns to rows with new column of arisa_frag for the arisa name and rel_abund for the values
- replaced NA with nd
- reformatted date from m/d/yyyy to yyyy-mm-dd
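
The reshaping steps listed above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the actual BCO-DMO processing code. The column names `date` and `depth`, the sample values, and the function name `wide_to_long` are assumptions made for the example; only the `ARISA_` column prefix, the `arisa_frag` and `rel_abund` column names, the NA-to-nd replacement, and the date formats come from the list.

```python
import csv
import io
from datetime import datetime


def na_to_nd(value):
    """Replace the literal string 'NA' with 'nd' (no data)."""
    return "nd" if value == "NA" else value


def wide_to_long(rows, id_columns, date_column="date", prefix="ARISA_"):
    """Turn every ARISA_* column into its own row with arisa_frag and rel_abund."""
    long_rows = []
    for row in rows:
        ids = {name: na_to_nd(row[name]) for name in id_columns}
        # Dates arrive as m/d/yyyy and leave as yyyy-mm-dd.
        parsed = datetime.strptime(row[date_column], "%m/%d/%Y")
        ids[date_column] = parsed.strftime("%Y-%m-%d")
        for column, value in row.items():
            if column.startswith(prefix):
                long_rows.append(
                    {**ids, "arisa_frag": column, "rel_abund": na_to_nd(value)}
                )
    return long_rows


# Hypothetical two-row input in the wide layout (values are invented).
sample = io.StringIO(
    "date,depth,ARISA_400.5,ARISA_435.2\n"
    "3/7/2005,5,0.012,NA\n"
    "10/21/2005,5,NA,0.004\n"
)
long_rows = wide_to_long(list(csv.DictReader(sample)), id_columns=["date", "depth"])
for record in long_rows:
    print(record)
```

Each input row yields one output row per ARISA column, so the two-row sample produces four long-format records.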



Data Files

File
bins_taxonomy.csv
(Comma Separated Values (.csv), 468.83 KB)
MD5:02c5186272a709b178e107c73ebfbab7
Primary data file for dataset ID 535507
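
A downloaded copy of the file can be checked against the MD5 value listed above. The sketch below is one way to do that with Python's `hashlib`; the function names and the local file path are assumptions for illustration.

```python
import hashlib

# MD5 as published in the Data Files listing for bins_taxonomy.csv.
EXPECTED_MD5 = "02c5186272a709b178e107c73ebfbab7"


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, expected=EXPECTED_MD5):
    """True if the file at `path` has the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_published_md5("bins_taxonomy.csv")` would return `True` for an intact download saved under that (assumed) local name.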


Parameters

Parameter | Description | Units
nodeIDs | Full name of variable; for an ARISA fragment the format is depth_ARISA_ITS-length; for an environmental variable, depth_variable-name | unitless
nodeType | ARISA = ARISA fragment bin; Env = other measurement | unitless
nodeDepths | Depth to which the measurement corresponds | meters
mean | Mean value of parameter at a given depth - ARISA node or environmental parameter as described in nodeIDs | various
sd | Standard deviation of parameter - ARISA node or environmental parameter as described in nodeIDs | various
median | Median value of parameter - ARISA node or environmental parameter as described in nodeIDs | various
mad | Median adjusted deviation - ARISA node or environmental parameter as described in nodeIDs | various
count1000 | Number of occurrences in the data set in which the organism shows up with greater than 0.1% relative abundance | unitless
countALL | Number of occurrences in the data set in which the organism shows up with greater than 0.01% relative abundance | unitless
LB | Lower bound of ARISA fragment bin | unitless
UB | Upper bound of ARISA fragment bin | unitless
Project | Identity of project from which the clone (used to identify the bin) was isolated | unitless
Location | Location from which the clone (used to identify the bin) was isolated | unitless
DepthCat | Category of depth from which that clone was isolated: surf =500m; nd = unknown | unitless
Clone1Hits | Number of occurrences in library of the most frequently found clone of this ARISA fragment size | unitless
Clone2Hits | Number of occurrences in library of the second most frequently found clone of this ARISA fragment size | unitless
OtherHits | Number of occurrences in library of other clones of this ARISA fragment size | unitless
Clone1ID | Internal ID number for Clone 1 | unitless
Clone2ID | Internal ID number for Clone 2 | unitless
Accession | NCBI accession number of Clone 1 | unitless
SecAccession | NCBI accession number of Clone 2 | unitless
Domain | Clone 1 Greengenes Domain ID | unitless
Phylum | Clone 1 Greengenes Phylum | unitless
Class | Clone 1 Greengenes Class | unitless
Order | Clone 1 Greengenes Order | unitless
Family | Clone 1 Greengenes Family | unitless
Genus | Clone 1 Greengenes Genus | unitless
Species | Clone 1 Greengenes Species | unitless
SilvaTag | Clone 1 Silva Taxonomy finest level identifier | unitless
RDP_Clade | Clone 1 RDP Taxonomy clade level identifier | unitless
Ecotype | Ecotype as assigned by Chow et al 2013 | unitless
SecDomain | Clone 2 Greengenes Domain ID | unitless
SecPhylum | Clone 2 Greengenes Phylum | unitless
SecClass | Clone 2 Greengenes Class | unitless
SecOrder | Clone 2 Greengenes Order | unitless
SecFamily | Clone 2 Greengenes Family | unitless
SecGenus | Clone 2 Greengenes Genus | unitless
SecSpecies | Clone 2 Greengenes Species | unitless
SecSilvaTag | Clone 2 Silva Taxonomy finest level identifier | unitless
SecRDP_Clade | Clone 2 RDP Taxonomy clade level identifier | unitless
SecEcotype | Clone 2 Ecotype as assigned by Chow et al 2013 | unitless
Letter6 | Identifier of 6 letters or fewer plus ARISA fragment size | unitless
amean | Mean value of ARISA fragment | various
asd | Standard deviation of ARISA fragment | various
amedian | Median value of ARISA fragment | various
amad | Median adjusted deviation of ARISA fragment | various
acount1000 | Number of times the ARISA fragment is seen with greater than 0.1% relative abundance | unitless
acountall | Number of times the ARISA fragment is seen with greater than 0.01% relative abundance | unitless
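
The summary columns above (mean, sd, median, mad, count1000, countALL) could be derived from a series of relative abundances along these lines. This is a hedged sketch, not the authors' code, and it rests on several assumptions that the table does not confirm: relative abundances are stored as fractions (so 0.1% is 0.001), `sd` is the sample standard deviation, and `mad` is computed as the unscaled median absolute deviation. The function name `summarize` and the sample values are hypothetical.

```python
import statistics


def summarize(rel_abund):
    """Summary statistics for one node; rel_abund holds fractions (assumption)."""
    med = statistics.median(rel_abund)
    return {
        "mean": statistics.mean(rel_abund),
        "sd": statistics.stdev(rel_abund),  # sample standard deviation (assumption)
        "median": med,
        # Unscaled median absolute deviation, assumed to correspond to "mad".
        "mad": statistics.median(abs(x - med) for x in rel_abund),
        # Occurrences above 0.1% and above 0.01% relative abundance.
        "count1000": sum(x > 0.001 for x in rel_abund),
        "countALL": sum(x > 0.0001 for x in rel_abund),
    }


# Invented values for a single ARISA bin across five sampling dates.
sample = [0.0, 0.00005, 0.0005, 0.002, 0.01]
stats = summarize(sample)
print(stats["count1000"], stats["countALL"])
```

If the file stores abundances as percentages rather than fractions, the two thresholds would be 0.1 and 0.01 instead.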


Deployments

lab_Fuhrman_2014

Platform: USC
Start Date: 2014-10-17
End Date: 2014-10-17
Description: Microbial diversity laboratory studies. Monthly cruises to collect water samples in the Los Angeles, California area.



Project Information

Pattern and Process in Marine Bacterial, Archaeal, and Protistan Biodiversity, and Effects of Human Impacts (Bacterial, Archaeal, and Protistan Biodiversity)


Coverage: San Pedro Ocean Time Series; approx. 33N, 118W


Description from NSF award abstract:
Bacteria, Archaea, and Protists dominate global elemental cycling and are immensely diverse genetically, taxonomically, and functionally. Yet the extent of marine microbial diversity, its patterns, and relationships among genetic, taxonomic, and functional diversity are very poorly characterized, even though the ocean covers 70% of the planet's surface. Among the least well known variables is the effect of human impacts on native marine microbial systems, although it is recognized that impacted systems are more prone to events like harmful algal blooms. Knowledge of these relationships and impacts are necessary to anticipate the responses of biota to global changes and feedback mechanisms that may alter the extents, rates, and even pathways of such changes. This project will expand upon an existing NSF-funded 10+-year monthly ocean time series (Microbial Observatory) that has focused on a single site midway between Los Angeles and Santa Catalina Island, to also include quarterly sampling adjacent to the impacted LA Harbor region to the barely-impacted Catalina coast. USC already runs facilities in LA Harbor and Catalina, with daily boats between (no cost). Measurements include (1) Genetic diversity: high throughput DNA sequences of "housekeeping" and functional genes. (2) Taxonomic diversity: high throughput tag sequences of small subunit ribosomal RNA genes, flow cytometry, automated image analysis (3) Functional Diversity: (a) Functional measurements (carbon fixation and respiration rates, microbial growth and grazing rates, cell size, morphology, and biomass variations), (b) distribution and expression of particular target functional genes involved with processes central to the cycles of carbon, nitrogen, and sulfur, (c) exploratory metatranscriptomics to explore functionalities that were not anticipated. (4) Integrating these: Multivariate statistical and network approaches including newly developed techniques (e.g. Bayesian networks to examine cause-effect relationships), and high speed computational approaches to assess the relationships among the genetic, taxonomic, and functional aspects of biodiversity observed. The PIs will also examine the collected data for signatures and specific effects (on organism identity and functions) associated with human impacted harbor site vs. the relatively pristine one.

The PIs will use network and time series analysis, along with other statistical tools to integrate "classical" microbial and oceanographic rate process measurements, flow cytometric and microscopic characterizations of communities, along with targeted as well as untargeted metagenomics and metatranscriptomics to relate genetic and taxonomic diversity with specific functions (at organismal, food web, and system levels). For example, they should be able to determine how different variants of particular taxa (e.g. at resolution levels ranging from what might be considered near the subspecies to genus levels) would differ in their association with particular measured functions, functional genes, or particular other taxa - or they might see how particular clusters of related organisms behave similarly or differently in their associations. This project offers an unprecedented and potentially transformative opportunity to combine and integrate measurements of genetic, taxonomic, and functional diversity along with direct measurements of system function in a well studied marine system that includes a gradient from one of the world's busiest harbors to a largely pristine ocean habitat. Far beyond just describing the distributions of organisms and functions (itself a necessary first step), they will specifically link spatial and temporal variations in a variety of functions with variations in genetic and taxonomic community composition.


Marine viral dynamics and incorporation into microbial association networks (Marine Viral Dynamics)


Coverage: Southern California between Los Angeles and Santa Catalina Island; Approx. 33.5N, 118.5 W


Description from NSF award abstract:
Marine microbes are tremendously abundant and are major players and driving forces in global biogeochemical cycles of carbon, nitrogen, phosphorus, and iron. We learned over the past two decades that viruses are pervasive elements in marine systems, with significant ecological, biogeochemical, genetic, and evolutionary effects on cellular marine organisms, but we have remarkably little information about the dynamics of marine viral community structure and how it relates to the community structure of their hosts (largely bacteria and phytoplankton). Such information is critical for developing proper conceptual and practical models of the roles of viruses and how these change over time and space. The goals of this project are:
(1) primarily, to characterize a significant subset of the natural virus community and its dynamics, along with bacterial host communities, as they change over daily to monthly time scales at the USC well-studied marine Microbial Observatory site (midway between Los Angeles and Santa Catalina Island), testing hypotheses regarding repeating patterns, host range effects, and taxa-time relationships, and
(2) secondarily, to incorporate these viruses into microbial association networks by statistically connecting particular types of viruses to specific potential hosts.

Approaches for this study include:
(a) nested daily, weekly, and monthly collection of bacteria and viruses for nucleic acid samples,
(b) amplification of conserved genes, as proxy phylogenetic markers, from a few moderately-well-characterized broad viral groups previously readily found in seawater (i.e. the T4-like myoviruses, T7-like podoviruses), as well as bacterial rRNA genes,
(c) extensive sequencing, after screening by community fingerprinting, from the mixed amplified products,
(d) binning of the sequences or fingerprint fragments into operational taxonomic units (OTUs) at different levels of resolution,
(e) evaluation of the results with statistical approaches to examine temporal patterns, relationships (including time-lagged ones) with other viral OTUs, bacteria, protists (monthly only), and environmental parameters,
(f) incorporating the viral OTUs mathematically into microbial association networks.

Data on environmental parameters, bacteria, and protists are already being collected monthly for an existing Microbial Observatory, so the viral work is complementary to this project, providing a major value-added component. Similarly, this project will add selected daily and weekly microbial data to the Microbial Observatory. Data from the literature and from the PI's preliminary results show they have the technology and capability to meet the first goal, and to our knowledge this would be the first such data set of its scope and kind. The investigators have already published in 2006 that the bacterial communities at the 5m depth of this site show a predictable repeating annual cycle in bacterial community composition, so the expectation of a predictable repeating viral community is not unreasonable. They also have some preliminary data showing some repeated viral occurrences. The second goal requires that there are indeed significant statistical relationships between the viruses and other measured parameters, which the PI anticipates to be the case, but of course cannot predict; if they cannot be demonstrated, this result itself would be informative and would constrain the possible modes of microbial/viral interactions.




Program Information

Dimensions of Biodiversity (Dimensions of Biodiversity)


Coverage: global


(adapted from the NSF Synopsis of Program)
Dimensions of Biodiversity is a program solicitation from the NSF Directorate for Biological Sciences. FY 2010 was year one of the program.

The NSF Dimensions of Biodiversity program seeks to characterize biodiversity on Earth by using integrative, innovative approaches to fill rapidly the most substantial gaps in our understanding. The program will take a broad view of biodiversity, and in its initial phase will focus on the integration of genetic, taxonomic, and functional dimensions of biodiversity. Project investigators are encouraged to integrate these three dimensions to understand the interactions and feedbacks among them. While this focus complements several core NSF programs, it differs by requiring that multiple dimensions of biodiversity be addressed simultaneously, to understand the roles of biodiversity in critical ecological and evolutionary processes.




Funding

Funding Source | Award
NSF Division of Ocean Sciences (NSF OCE)
NSF Division of Molecular and Cellular Biosciences (NSF MCB)
