<div><p><strong>Raw cDNA transcriptome sequence reads:</strong></p>
<p>The raw sequence data are deposited in the NCBI Sequence Read Archive (SRA) under accession numbers SRR3990241-SRR3990248, associated with BioProject PRJNA330848 and BioSamples SAMN05427525-SAMN05427532.</p>
<p>Species: Atlantic silverside (Menidia menidia)<br />
Sample type: mRNA from mix of tissues<br />
RNA extraction method: Qiagen RNeasy Plus Universal Tissue Mini Kit<br />
Library preparation: Illumina’s TruSeq RNA sample prep kit v2<br />
Sequencing instrument: Illumina HiSeq 2000</p>
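<p>The accession ranges above can be expanded into individual identifiers before looking them up at NCBI. The following is a minimal sketch, assuming each range is contiguous (the record gives only the endpoints); the function name <code>expand_accession_range</code> is hypothetical and not part of any NCBI tool.</p>

```python
import re


def expand_accession_range(first, last):
    """Expand an inclusive accession range such as SRR3990241-SRR3990248.

    Assumes both endpoints share the same letter prefix and the same
    number of digits, and that every accession in between exists.
    """
    pattern = r"([A-Za-z]+)(\d+)"
    m_first = re.fullmatch(pattern, first)
    m_last = re.fullmatch(pattern, last)
    if m_first is None or m_last is None:
        raise ValueError("accessions must be letters followed by digits")
    if m_first.group(1) != m_last.group(1):
        raise ValueError("accession prefixes differ")
    if len(m_first.group(2)) != len(m_last.group(2)):
        raise ValueError("accession digit widths differ")

    prefix = m_first.group(1)
    width = len(m_first.group(2))  # preserve leading zeros, e.g. SAMN05427525
    start, end = int(m_first.group(2)), int(m_last.group(2))
    if start > end:
        raise ValueError("range start is after range end")
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


sra_runs = expand_accession_range("SRR3990241", "SRR3990248")
biosamples = expand_accession_range("SAMN05427525", "SAMN05427532")
print(len(sra_runs), sra_runs[0], sra_runs[-1])
print(len(biosamples), biosamples[0], biosamples[-1])
```

<p>Both ranges expand to eight identifiers. The sketch does not pair runs with BioSamples, because the record does not state which run belongs to which sample.</p>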
<p><strong>Assembled Atlantic silverside transcriptome:</strong></p>
<p>An assembled Atlantic silverside transcriptome is deposited in the NCBI GenBank Transcriptome Shotgun Assembly Sequence Database (TSA). This version of the project (01) has the accession number GEVY01000000 and consists of sequences GEVY01000001-GEVY01020998.</p>
<p>The cleaned RNA-seq reads from all samples were assembled de novo with two different programs: CLC Genomics Workbench v6.0.2 (run twice, once with an automatically optimized word size of 25 and once with a longer word size of 40) and Trinity v. r20131110 (with default settings, but retaining only the isoform with the highest mapped read depth within each subcomponent). Each assembly contained a substantial set of unique transcripts not present in the other assemblies, so we merged all three to maximize the gene space coverage in our final contig set. To reduce redundancy, we used cd-hit-est v4.5.4 to collapse the contig set into the longest representative for each unique sequence, and CAP3 v12/21/07 to meta-assemble partial assemblies of the same transcript. Following these procedures, we broke up likely chimeric contigs with the method of Yang and Smith (2013, BMC Genomics 14:328).</p>
<p>Because we wanted to reduce our contig set to a single representative transcript for each silverside gene, we used a reciprocal best hit BLAST approach to extract non-redundant putative orthologs to the gene sets of three related species: platyfish (Xiphophorus maculatus), medaka (Oryzias latipes), and Nile tilapia (Oreochromis niloticus). We compared our contig set against the full peptide set for each reference species (downloaded from Ensembl release 75) with blastx, and then compared the peptide sequences for each species to our contig set with tblastn, in both cases using soft masking and an e-value cut-off of 10e-4. For each reference species, we recorded reciprocal best hits (RBHs) when a contig and a protein had a best match to each other.</p>
<p>We used a sequential approach to select putative orthologs. We first extracted the contigs that were RBHs to platyfish proteins (since this species yielded the highest number of RBHs). We also added contigs that had a best hit to a portion of an RBH protein not covered by the RBH contig (secondary hits, with a maximum overlap of 10 amino acids allowed), under the assumption that these contigs represented transcript fragments. We then added contigs that were RBHs (and the associated secondary non-overlapping hits to the same proteins) to medaka proteins that were non-redundant to the platyfish proteins. Medaka proteins were considered non-redundant if they neither had an RBH to the previously extracted RBH platyfish protein set (in a direct blastp comparison of the two protein sets) nor were annotated to the same zebrafish gene (ZFIN ID) as an RBH platyfish protein. We similarly added contigs that were RBHs or associated secondary hits to tilapia proteins that were non-redundant to the proteins included from the other species.</p>
<p>To recover additional high-quality non-redundant transcripts, we used TransDecoder to predict coding regions in our redundancy-reduced contig set on the basis of nucleotide composition, open reading frame (ORF) length, and Pfam domain content. Of the contigs predicted to contain a complete ORF, we retained the subset that did not have a significant (e-value &lt; 10e-2) blastn hit to the RBH contig set (and were therefore non-redundant).</p>
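<p>The reciprocal best hit step described above can be illustrated with a short sketch. It is not the authors' code. It assumes BLAST results in the standard 12-column tabular layout (query ID in column 1, subject ID in column 2, e-value in column 11, bit score in column 12), picks the best hit per query by highest bit score, and applies the e-value cut-off as literally written in the text (10e-4). The function names, the <code>make_row</code> helper, and the toy contig and protein IDs are all hypothetical.</p>

```python
def best_hits(rows, max_evalue=10e-4):
    """Return {query_id: subject_id} keeping the highest bit score per query.

    rows: iterable of 12-column BLAST tabular records (lists of strings).
    max_evalue: cut-off taken literally from the text (10e-4 == 1e-3);
                this reading of the notation is an assumption.
    Ties on bit score keep the first hit encountered.
    """
    best = {}
    for row in rows:
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(blastx_rows, tblastn_rows, max_evalue=10e-4):
    """Return sorted (contig, protein) pairs that are each other's best hit."""
    contig_to_protein = best_hits(blastx_rows, max_evalue)   # contigs vs proteins
    protein_to_contig = best_hits(tblastn_rows, max_evalue)  # proteins vs contigs
    return sorted(
        (contig, protein)
        for contig, protein in contig_to_protein.items()
        if protein_to_contig.get(protein) == contig
    )


def make_row(query, subject, evalue, bitscore):
    """Hypothetical helper: build a 12-column record, padding unused fields."""
    return [query, subject] + ["0"] * 8 + [str(evalue), str(bitscore)]


# Toy data (invented IDs and scores, for illustration only).
blastx_rows = [
    make_row("contig1", "protA", 1e-50, 200),
    make_row("contig1", "protB", 1e-20, 90),
    make_row("contig2", "protA", 1e-30, 120),
    make_row("contig3", "protC", 1e-2, 30),   # fails the e-value cut-off
]
tblastn_rows = [
    make_row("protA", "contig1", 1e-50, 200),
    make_row("protA", "contig2", 1e-30, 120),
    make_row("protB", "contig1", 1e-20, 90),
    make_row("protC", "contig3", 1e-2, 30),   # fails the e-value cut-off
]

rbh_pairs = reciprocal_best_hits(blastx_rows, tblastn_rows)
print(rbh_pairs)  # [('contig1', 'protA')]
```

<p>In the toy data, contig2 has protA as its best hit, but protA's best hit is contig1, so only the contig1 and protA pair is reciprocal. The secondary-hit and cross-species redundancy filters described above are not covered by this sketch.</p>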
<p><strong>Methods are also published in:</strong></p>
<p>Therkildsen, N. O., and S. R. Palumbi. 2016. Practical low-coverage genomewide sequencing of hundreds of individually barcoded samples for population and evolutionary genomics in nonmodel species. Molecular Ecology Resources. doi: <a href="https://doi.org/10.1111/1755-0998.12593" target="_blank">10.1111/1755-0998.12593</a></p></div>
Atlantic silverside (Menidia menidia) cDNA transcriptome and TSA accessions
<div><p>The data include Atlantic silverside (Menidia menidia) genetic accession information at the National Center for Biotechnology Information (NCBI). Accessions in this dataset are for either cDNA transcriptome sequence reads or a Transcriptome Shotgun Assembly (TSA). Links to each accession at NCBI are provided along with the accession number and accession type. Atlantic silversides were collected at Poquott Beach, New York on June 20 and 21, 2013.</p></div>
<div><p>Assembly Method: CLC Genomics Workbench 6.0.2; Trinity r20131110; CAP3 v12/21/07</p>
<p>BCO-DMO Data Manager Processing Notes:<br />
* added a conventional header with dataset name, PI name, version date<br />
* modified parameter names to conform with BCO-DMO naming conventions<br />
* broke sampling location description into sample_location and sample_state<br />
* added lat/lon of sample site to dataset</p></div>
686981
2017-04-06T13:55:29-04:00
2023-07-07T16:10:26-04:00
urn:bcodmo:dataset:686981
Atlantic silverside (Menidia menidia) cDNA transcriptome and TSA accessions from specimens collected at Poquott Beach, New York in June of 2013 (Fishery Genome Changes project)
false
Palumbi, S. R. (2017) Atlantic silverside (Menidia menidia) cDNA transcriptome and TSA accessions from specimens collected at Poquott Beach, New York in June of 2013 (Fishery Genome Changes project). Biological and Chemical Oceanography Data Management Office (BCO-DMO). Version Date 2017-04-06 [if applicable, indicate subset used]. http://lod.bco-dmo.org/id/dataset/686981 [access date]
true
false
2017-04-06
HTML
https://www.bco-dmo.org/dataset/686981
text/html
Datapackage.json
Frictionless Data Package
https://www.bco-dmo.org/dataset/686981/datapackage.json
application/vnd.datapackage+json
PDF
https://www.bco-dmo.org/dataset/686981/Dataset_description.pdf
application/pdf
JSON-LD
https://www.bco-dmo.org/dataset/686981.json
application/ld+json
Turtle
https://www.bco-dmo.org/dataset/686981.ttl
text/turtle
RDF/XML
https://www.bco-dmo.org/dataset/686981.rdf
application/rdf+xml
ISO 19115-2 (NOAA Profile)
https://www.bco-dmo.org/dataset/686981/iso
application/xml
http://www.isotc211.org/2005/gmd-noaa
Dublin Core
https://www.bco-dmo.org/dataset/686981/dublin-core
application/xml
http://purl.org/dc/elements/1.1/
http://lod.bco-dmo.org/id/dataset/686981
OSPREY
http://www.opengis.net/def/crs/OGC/1.3/CRS84
<http://www.opengis.net/def/crs/OGC/1.3/CRS84> POINT (-73.1025 40.9475)
40.9475
-73.1025
40.947500000000
-73.102500000000
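The sample-site geometry above is given as a WKT point prefixed with its coordinate reference system. CRS84 uses longitude, latitude axis order, so the first number is the longitude. The following is a minimal sketch for reading that string; the function name parse_crs_wkt_point is hypothetical, and the regular expression handles only simple two-dimensional points in plain decimal notation.

```python
import re


def parse_crs_wkt_point(text):
    """Parse '<crs-uri> POINT (x y)' into a dict with crs, lon, and lat.

    Assumes x is longitude and y is latitude, as in OGC CRS84.
    The CRS prefix is optional; crs is None when it is absent.
    """
    number = r"(-?\d+(?:\.\d+)?)"
    pattern = r"\s*(?:<([^>]+)>\s*)?POINT\s*\(\s*" + number + r"\s+" + number + r"\s*\)\s*"
    match = re.fullmatch(pattern, text)
    if match is None:
        raise ValueError("not a simple 2D WKT point")
    return {
        "crs": match.group(1),
        "lon": float(match.group(2)),
        "lat": float(match.group(3)),
    }


site = parse_crs_wkt_point(
    "<http://www.opengis.net/def/crs/OGC/1.3/CRS84> POINT (-73.1025 40.9475)"
)
print(site["lat"], site["lon"])
```

The parsed latitude and longitude match the separate values listed above (40.9475 and -73.1025).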