All seafloor basalts were stored frozen at -80°C until X-ray diffraction (XRD) analysis and DNA extraction. Bulk mineralogy, i.e. the quantitative abundance of rock-forming minerals and total clay minerals, was determined for all three seafloor basalts by XRD at KT GeoServices, Inc. Detection limits were 1-5 wt%. For the two Lō’ihi seafloor basalts, the average of the two measurements was used.
Raw sequence reads were evaluated with FastQC version 0.11.3 (Schmieder and Edwards, 2011a), quality-trimmed (minimum quality score: 25; maximum length: 450 bp; maximum homopolymer length: 9 bp; maximum N-tail: 1 bp), and filtered (removal of technical duplicates; minimum length: 60 bp) with Prinseq 0.20.4 (Schmieder and Edwards, 2011b) and MG-RAST (Meyer et al., 2008). We obtained 1,191,651 sequences in the EPR dataset, 1,102,191 sequences in the Lō’ihi dataset, and 58,188 sequences in the negative control dataset.
Quality-filtered reads were assembled de novo using the standard 454 settings in MIRA 3.4.1.1 (Chevreux et al., 1999). Padded contigs (i.e. contigs including potential gaps) > 500 bp were filtered using the convert_project utility of MIRA 3.4.1.1 (Chevreux et al., 1999). Seafloor basalt contigs were screened for contamination using a combination of BBMap (bbduk.sh with parameters mcf=0.25, k=31) and the BLASTN algorithm (Altschul et al., 1990). BBMap identified 4 potentially contaminant contigs in the EPR metagenome dataset (total of 4,290 bp) and 10 potentially contaminant contigs in the Lō’ihi metagenome dataset (total of 12,423 bp).
Community richness was estimated using the Chao1 index and diversity was calculated using the Shannon index in QIIME 1.9.1 (alpha_diversity.py), based on BLASTX assignments of contigs. PhyloSift was used to assess community diversity using its core set of molecular marker genes, which includes ~40 three-domain protein-coding genes, single-copy eukaryote-specific nuclear orthologs, ribosomal RNA genes (16S/18S), mitochondrial genes (mtDNA markers), and plastid and viral markers identified through Markov clustering algorithms applied to genome datasets (Darling et al., 2014).
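To make the two alpha-diversity measures concrete, the following is a minimal sketch of how Chao1 richness and Shannon diversity can be computed from per-taxon contig counts. It is not the QIIME implementation: the function names and the example counts are hypothetical, the bias-corrected form of Chao1 is assumed, and the Shannon index is assumed to use log base 2.

```python
import math
from collections import Counter


def counts_from_assignments(assignments):
    """Turn a list of taxon labels (one per contig) into a list of counts."""
    return list(Counter(assignments).values())


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    S_obs is the number of observed taxa, F1 the number of taxa seen
    exactly once, and F2 the number of taxa seen exactly twice.
    """
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def shannon(counts, base=2):
    """Shannon index H = -sum(p_i * log(p_i)) over taxa with nonzero counts."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in proportions)


if __name__ == "__main__":
    # Hypothetical taxon labels, standing in for BLASTX assignments of contigs.
    labels = ["A"] * 10 + ["B"] * 5 + ["C"] * 2 + ["D"] * 2 + ["E", "F", "G"]
    counts = counts_from_assignments(labels)
    print("Chao1 richness:", chao1(counts))
    print("Shannon diversity (bits):", shannon(counts))
```

Chao1 adds to the observed richness a term driven by rare taxa (singletons and doubletons), so it estimates how many taxa were likely missed, whereas the Shannon index also accounts for how evenly the counts are distributed among taxa.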
BCO-DMO Processing:
version 2015-11-06: replaced version 2015-10-23. Added site, lat, lon, and date columns.