| Contributors | Affiliation | Role |
|---|---|---|
| Hamilton, Trinity | University of Minnesota (UMN) | Principal Investigator |
| Sauer, Hailey | University of Minnesota (UMN) | Principal Investigator |
Site description
For this study, we selected twenty lakes within Minnesota’s Sentinel Lakes in a Changing Environment (SLICE) program. SLICE is a collaborative research initiative that provides long-term data on a representative sub-sampling of Minnesota’s lakes spanning the diverse geographic, land-use, and climatic gradients present in Minnesota (Fig. 1 in Sauer et al., 2022). The lakes span four of the seven Environmental Protection Agency/Commission for Environmental Cooperation (Level III) ecological regions. These regions are characterized by differences in underlying geology, soils, vegetation, and land use (Table S1 in Sauer et al., 2022). This is the first comprehensive sediment bacterial survey of these lakes.
Water Sample Collection & Analysis
At each site, we collected water profile measurements of temperature, pH, conductivity, turbidity, and dissolved oxygen using a YSI EXO2 multi-parameter sonde (YSI, Inc.). We also collected an integrated epilimnetic water sample (0–2 m) and, when thermal stratification was present, a hypolimnetic water sample from 1 m above the maximum lake depth. All samples were stored on ice in the field and at either 4°C or −20°C in the laboratory, depending on methodology, until processed.
Samples for soluble reactive phosphorus (SRP), dissolved organic carbon (DOC), and dissolved inorganic carbon (DIC) were filtered, processed, and analyzed within 36 hours of sampling using standard methods for SRP (4500-P) on a SmartChem 170 (Unity Scientific, Inc.) and for DIC/DOC (Method 5310-C) using a Torch Combustion TOC Analyzer (Teledyne Tekmar, Inc.) (American Public Health Association, 2012). Samples for total nitrogen (TN) and total phosphorus (TP) were frozen and analyzed using standard methods for TN (4500-N) and TP (4500-P). Samples for ammonia (NH₃) and nitrate (NO₃) were filtered and frozen prior to analysis following methods 4500-NH₃ and 4500-NO₃. All TP, TN, NH₃, and NO₃ samples were analyzed within six months of sampling on a SmartChem 170 (Unity Scientific, Inc.) discrete analyzer (APHA, 2012).
Additionally, samples for chlorophyll-a were filtered, frozen, and analyzed via fluorometry following EPA Method 445.0 (Arar et al., 1997). A complete summary of aqueous chemistry results, including sampling dates, is provided in Table S2 (Sauer et al., 2022).
Sediment Sample Collection & DNA Isolation
Sediment cores were collected from July 2018 through June 2019 using a rod-driven piston corer with a 7 cm diameter polycarbonate tube (Wright, 1997). Coring locations (i.e., flat areas near the deepest basin) were determined using publicly available bathymetric maps (https://www.dnr.state.mn.us/lakefind/index.html), while avoiding steep-sided “holes” where sediment focusing may be high. Following retrieval, core tops were stabilized in the field using a gelling agent (e.g., Zorbitrol), and intact cores were returned to the laboratory, where they were stored vertically at 4°C for no more than seven days prior to processing. In cases where the upper sediments were extremely flocculent, the uppermost sections (~0–30 cm) were immediately sectioned in the field to prevent mixing during transport.
Cores were vertically extruded in the laboratory at 1–2 cm intervals, depending on lake productivity, and subsamples from two intervals were collected for DNA analysis. Subsamples were collected from the 0–2 cm interval (hereafter referred to as shallow) and from either the 3–4 cm or 4–6 cm interval (hereafter referred to as deep). Subsamples were frozen under nitrogen for up to three months prior to DNA extraction (Table S3 in Sauer et al., 2022).
DNA was extracted from 0.25 g of wet sediment from each subsample using a PowerSoil DNA Isolation Kit (Qiagen, Inc.) following the manufacturer’s protocols. Negative controls were performed by carrying out extractions on blanks containing only reagents and no sample. Final bulk DNA concentrations were determined using a Qubit™ dsDNA HS Assay Kit (Molecular Probes, Eugene, OR, USA) and a Qubit™ Fluorometer (Invitrogen, Carlsbad, CA, USA). The detection limit of the Qubit™ dsDNA HS Assay Kit is 10 pg μL⁻¹. All samples that yielded detectable amounts of DNA were submitted for sequencing (Table S3 in Sauer et al., 2022). Although DNA was not detected in negative controls, these samples were submitted for sequencing; they failed quality control performed by the University of Minnesota Genomics Center (UMGC), and no sequencing data were obtained.
Nucleic acid preparation, amplification, and sequencing
DNA samples were submitted to the University of Minnesota Genomics Center (UMGC), where library preparation for Illumina high-throughput sequencing was performed using a Nextera XT workflow with 2 × 300 bp chemistry. This workflow utilizes transposome-based shearing, which fragments DNA and adds adapter sequences in a single step. DNA was amplified and dual-indexed with adapter sequences through PCR using primers 515F (5′-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGTGCCAGCMGCCGCGGTAA-3′) and 806R (5′-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGGACTACHVGGGTWTCTAAT-3′), targeting the V4 hypervariable region of the bacterial 16S SSU rRNA gene.
The amplicon library preparation methods developed and employed by the UMGC have been shown to be more quantitatively accurate and qualitatively complete than existing methods, detecting taxonomic groups that often go undetected (Gohl et al., 2016). Indexed samples were sequenced once using an Illumina MiSeq at the UMGC. A total of 3.29 million (3,290,170) raw reads were obtained from 40 samples.
Temporal bounds within the dataset
The date range associated with this dataset represents the sediment core collection dates.
Data Availability in SRA
All relevant data are reported in the results paper, Sauer et al. (2022). All 16S rRNA amplicon data are available from the Sequence Read Archive (SRA) under BioProject accession PRJNA763898.
Following sequencing and initial quality control performed by the University of Minnesota Genomics Center, we conducted all downstream sequence processing and analyses using established bioinformatic and statistical workflows.
We conducted post-sequence processing in Mothur (v1.43.0) following the MiSeq SOP (Schloss et al., 2009; Kozich et al., 2013). Briefly, we merged forward and reverse reads, then screened and trimmed the merged sequences and removed those containing ambiguous bases. We aligned reads to references in the SILVA database (v.132) and identified and removed chimeras using vsearch (v2.13.3) (Quast et al., 2013; Edgar et al., 2011). Finally, given the nature of the study (i.e., broad-scale patterns of diversity), we clustered sequences into operational taxonomic units (OTUs) using a 97% similarity threshold and assigned taxonomy using the SILVA database (Stackebrandt et al., 1994; Glassman et al., 2018).
Unless otherwise stated, all statistical analyses were conducted in R (v4.0.0) (R Core Team, 2018; Wickham et al., 2019). Both environmental and community data were loaded into R using Phyloseq (v1.32.0) (McMurdie et al., 2013), and reads classified as mitochondrial or chloroplast were removed. The final dataset after all post-processing contained 2,181,132 reads assigned to 53,854 taxa across 40 samples (two sediment depths per lake).
Alpha diversity
All singletons (OTUs observed only once across all 40 samples) were removed prior to calculating alpha diversity statistics. Given the observed correlation between richness and sample read depth across sequencing batches (Fig. S1 in Sauer et al., 2022), the data were rarefied to 90% of the read depth of the lowest-depth samples (15,771 reads; Fig. S2 and Table S3 in Sauer et al., 2022). The final dataset used for alpha diversity analyses included 630,840 read counts representing 25,563 taxa across 40 samples.
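The rarefaction step can be illustrated with a minimal Python sketch that subsamples one sample's OTU counts to a fixed depth without replacement. This is not the Phyloseq routine used in the study; the function name `rarefy` and the toy counts are hypothetical. The last line only checks the arithmetic reported above: 40 samples at 15,771 reads each.

```python
import random
from collections import Counter

def rarefy(counts, depth, seed=0):
    """Subsample OTU counts to `depth` reads without replacement."""
    total = sum(counts.values())
    if depth > total:
        raise ValueError("depth exceeds the number of reads in the sample")
    # Expand the counts into one label per read, then draw `depth` reads.
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(reads, depth)))

# Hypothetical toy sample (100 reads), not data from the study.
toy = {"OTU1": 50, "OTU2": 30, "OTU3": 20}
sub = rarefy(toy, 60)

# Reads in the rarefied dataset: 40 samples at 15,771 reads each.
total_reads = 40 * 15771
```

Every sample ends up with the same number of reads, which is what removes the dependence of richness on sequencing depth.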
Alpha diversity metrics were calculated using the Phyloseq package in R (Fig. S3 and Table S4 in Sauer et al., 2022) (McMurdie et al., 2013). Richness (observed number of OTUs) and evenness (Shannon index) were compared between sediment depths (shallow n = 20, deep n = 19) using a Wilcoxon test, and among trophic status categories (hypereutrophic n = 4, eutrophic n = 16, mesotrophic n = 16, oligotrophic n = 3) and ecological regions (Western Cornbelt Plains n = 12, North Central Hardwood Forests n = 14, Northern Lakes & Forests n = 8, Canadian Shield n = 5) using Kruskal–Wallis tests with Dunn post hoc tests and Bonferroni correction. In all analyses, one outlying sample (Trout, Deep) was excluded due to uncharacteristically low diversity.
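For readers unfamiliar with the two metrics, the sketch below computes observed richness and the Shannon index for a single sample in plain Python. It is an illustration only, not the Phyloseq code used in the study; the function names and toy counts are hypothetical, and the natural logarithm is assumed.

```python
import math

def observed_richness(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for n in counts if n > 0)

def shannon_index(counts):
    """Shannon index H = -sum(p * ln p) over OTUs with nonzero counts."""
    total = sum(counts)
    props = [n / total for n in counts if n > 0]
    return -sum(p * math.log(p) for p in props)

# Hypothetical toy samples: same richness, different evenness.
even = [25, 25, 25, 25]
skewed = [97, 1, 1, 1]

h_even = shannon_index(even)      # equals ln(4) for four equally abundant OTUs
h_skewed = shannon_index(skewed)  # lower, because one OTU dominates
```

Both toy samples contain four OTUs, so richness is identical, while the Shannon index drops when reads are concentrated in one OTU.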
The predictive relationships between environmental parameters measured at the time of sampling (Table S2 in Sauer et al., 2022) and alpha diversity metrics were evaluated using multiple regression. The significance and variance explained by each predictor were assessed using the relaimpo (v.2.2.3) and vegan (v.2.5-6) packages in R (Groemping et al., 2006; Oksanen et al., 2009). Final models for richness (observed) and evenness (Shannon) were selected based on AIC scores.
Beta diversity
Prior to beta diversity analyses, OTUs were filtered by removing those with fewer than two total counts and occurring in fewer than 10% of samples. Following filtering, the average number of reads per sample was 47,605, with a minimum read depth of 15,150 and a maximum read depth of 99,561. Because OTU count data exhibit strong positive skew, a variance-stabilizing transformation (VST) was applied to reduce heteroscedasticity (Love et al., 2020). Log-like transformations such as VST have been shown to transform count data toward near-normal distributions and produce larger eigengap values, resulting in more consistent correlation estimates that influence downstream analyses (Badri et al., 2020). After filtering and transformation, the final dataset for beta diversity analyses included 5,512 taxa across 40 samples.
Sample dissimilarity was visualized using principal component analysis (PCA) with the ordinate function in Phyloseq (McMurdie et al., 2013). Differences in community composition among ecological regions were assessed using permutational multivariate analysis of variance (PERMANOVA) with the adonis function in vegan (Oksanen et al., 2009), based on Bray–Curtis dissimilarity. Dispersion within groups was evaluated using permutation tests with the betadisper and permutest functions in vegan. Prior to calculating the dissimilarity matrix, negative VST values were converted to zero, as these values likely represent zero or near-zero counts and were considered negligible for distance calculations and hypothesis testing. Cluster analysis was performed using Ward’s (D2) method with the same dissimilarity matrix used for PERMANOVA.
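The handling of negative transformed values before the distance calculation can be sketched as follows. This is a minimal Python illustration of Bray–Curtis dissimilarity between two samples, not the vegan implementation used in the study; the function names and the toy values are hypothetical.

```python
def clamp_negatives(values):
    """Replace negative transformed abundances with zero."""
    return [max(v, 0.0) for v in values]

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    if den == 0:
        raise ValueError("both samples are empty")
    return num / den

# Hypothetical transformed abundances for three OTUs in two samples.
s1 = clamp_negatives([2.0, -0.5, 1.0])  # becomes [2.0, 0.0, 1.0]
s2 = clamp_negatives([1.0, 0.0, 3.0])
d = bray_curtis(s1, s2)                 # (1 + 0 + 2) / (3 + 0 + 4) = 3/7
```

Bray–Curtis is defined for non-negative abundances, which is why negative values need to be set to zero before the matrix is computed.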
* The primary data file of this dataset has been converted from its original format (.tsv) to .csv.
* Within the primary data file (filename: 986587_v1_sediment_bacteria_in_MN_lakes.csv), lat and lon values have been split into two separate columns. Originally, both values were provided in a column named "lat_lon." The published file has both a "lat" column and a "lon" column.
* Unit values (grams, represented by "g") have been removed from the "samp_size" column so the data within this column can be rendered accurately as numeric values.
* Country, state, lake name and sample depth range values have been parsed from the column "geo_loc_name" into individual columns. The original "geo_loc_name" column has been retained within the data file.
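Two of the changes listed above can be sketched in Python. This is a hypothetical illustration, not the script used to prepare the published file. It assumes the original "lat_lon" values follow the NCBI BioSample convention of decimal degrees followed by a hemisphere letter (for example "46.5 N 93.25 W") and that sample sizes were written as a number followed by "g"; neither format is confirmed by the source. The function names and example values are invented for the sketch.

```python
def split_lat_lon(lat_lon):
    """Split an assumed 'DD.DD N DD.DD W' string into signed decimal degrees."""
    lat_val, lat_hem, lon_val, lon_hem = lat_lon.split()
    # Northern and eastern coordinates are positive; southern and western negative.
    lat = float(lat_val) * (1 if lat_hem.upper() == "N" else -1)
    lon = float(lon_val) * (1 if lon_hem.upper() == "E" else -1)
    return lat, lon

def strip_grams(value):
    """Drop a trailing 'g' unit so the value can be stored as a number."""
    return float(value.strip().rstrip("g").strip())

# Hypothetical example values, not taken from the dataset.
lat, lon = split_lat_lon("46.5 N 93.25 W")
size = strip_grams("0.25 g")
```

The sign convention matches the "lat" and "lon" parameter descriptions: positive values are northern and negative values are western.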
| Parameter | Description | Units |
|---|---|---|
| accession | NCBI Sequence Read Archive (SRA) run accession identifying a unique sequencing run. | unitless |
| message | Status message from NCBI; should indicate that the record loaded successfully. | unitless |
| sample_name | Unique name of the environmental or biological sample. | unitless |
| sample_title | Sample title (this is an optional field within the dataset and may not apply to every sample). | unitless |
| organism | DNA source - metagenome for environmental sample. | unitless |
| host | Host organism (if relevant, can be left blank). | unitless |
| isolation_source | Type of substrate DNA was extracted from (e.g. sediment, soil, tissue). | unitless |
| collection_date | Date of sample collection. | unitless |
| geo_loc_name | Geographic location of sample collection, including country, state, lake name, and sample depth range. | unitless |
| lat | Latitude of sample collection location in decimal degrees; a positive value indicates a northern coordinate. | decimal degrees |
| lon | Longitude of sample collection location in decimal degrees; a negative value indicates a western coordinate. | decimal degrees |
| ref_biomaterial | DNA source. | unitless |
| samp_collect_device | Device used for sample collection (core, pump, etc.). | unitless |
| samp_size | Size of sample collected. | g |
| country | Country of sample collection derived from geo_loc_name. | unitless |
| state | State of sample collection derived from geo_loc_name. | unitless |
| lake_name | Name of lake where sample was collected derived from geo_loc_name. | unitless |
| sample_depth_range_min | Minimum depth of the sampled sediment core interval, derived from geo_loc_name. | cm |
| sample_depth_range_max | Maximum depth of the sampled sediment core interval, derived from geo_loc_name. | cm |
NSF Award Abstract:
The Great Lakes hold about 20% of the freshwater on Earth and have been increasingly impacted by human activities in recent decades. Lake Erie suffers from large, annually recurring, toxic cyanobacterial blooms in summer, whereas Lake Superior experiences smaller, localized cyanobacterial blooms after storm events. Cyanobacterial blooms have harmful ecological, human health, and economic implications. These blooms are a global phenomenon, observed in lakes and oceans, and can lead to low oxygen conditions and the production of toxins, both of which can be harmful for ecosystems. Understanding how different types of cyanobacteria influence nutrient cycling remains a major knowledge gap. This project aims to provide a deeper understanding of the long-term state of the Great Lakes ecosystem. The research approach combines new and established methods. Project results and implications will be shared with local and regional water interests in partnership with the Pittsburgh Collaboratory for Water Research, Education, and Outreach, the Great Lakes Commission Harmful Algal Blooms Collaborative, and the Lake Erie Area Research Network. Education is a central part of this project, and training opportunities target the next generation of scientists, including postdoctoral, graduate, and undergraduate students. The students and postdoc will receive state-of-the-art training in the rapidly developing fields of biogeochemistry and geomicrobiology, while working with an interdisciplinary team of scientists.
This study will examine nitrogen cycling, phytoplankton community composition, and the nitrogen isotopic composition of chloropigments in order to evaluate cyanobacterial productivity in the modern Laurentian Great Lakes as well as the historical record of cyanobacterial blooms over the past several hundred years. The nitrogen isotope composition of chloropigments is expected to provide a powerful new proxy for understanding primary productivity and the relative importance of cyanobacteria to export production and nitrogen cycling. This proxy would be valuable not only for management of modern systems but has important implications for increasing our understanding of the role of cyanobacteria throughout Earth history. This project would test this molecular isotopic proxy in contemporary aquatic ecosystems to assess its efficacy for: (1) determining the relative contributions of cyanobacteria vs eukaryotic algae (e.g., diatoms) to primary production; (2) evaluating export production of cyanobacterial productivity (including blooms); and (3) constraining historical cyanobacteria productivity in the sedimentary record. Comparison of a system characterized by eutrophication and seasonal cyanobacterial blooms (Lake Erie) with one characterized by picocyanobacteria productivity, but the near-absence of large-scale cyanobacterial blooms (Lake Superior), will provide information about the range of impacts that cyanobacteria can have on carbon and nitrogen cycling. Further information regarding nitrogen cycling will be derived from analysis of solid and dissolved nitrogen species throughout the annual cycle, as well as seasonal studies of sediment processes to measure associated sediment nitrogen removal rates through different processes.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
| Funding Source | Award |
|---|---|
| NSF Division of Ocean Sciences (NSF OCE) |