and trimmed with the sfffile tool (454-Roche) according to the quality graphs. After trimming, the two 16S fragments were approximately 320 bp and 470 bp long and fully encompassed V1-V2 and V3-V4, respectively (hereafter named V12 and V34). The six libraries were then processed in the same way through QIIME v1.7.0 [46]. Sequences were filtered and demultiplexed using split_libraries.py (settings: -M 3 -b 10 --reverse_primer_mismatches 3 -w 50 -g -s 20), followed by denoising with denoise_wrapper.py. Each library was inflated and checked for chimeras using UCHIME v4.2, implemented in Usearch, against the ChimeraSlayer reference database (“gold”) in the Broad Microbiome Utilities. Chimeras were discarded.
All same-amplicon sequences were combined and assigned de novo to operational taxonomic units (OTUs) at the default 97% identity threshold using uclust [47] with the Trie prefilter. A representative sequence (the most abundant) was picked for each OTU and assigned taxonomy against the Greengenes database gg_13_5_otus using the RDP classifier retrained on this database (0.8 confidence level). We retained only OTUs shared across two or more samples, plus all singletons represented by >3 reads. All Cyanobacteria, except members of the lineage of Melainabacteria (see Results), were removed from the dataset as putatively derived from ingested material. We cannot, however, fully exclude sampling of environmental bacteria, given the inclusion of digesting matter. While this is a common issue when studying wild samples, we also note that classifying bacteria as either gut-associated or free-living remains challenging and is currently being revised (e.g. for Planctomycetes [48] and Melainabacteria [49]).
Sequences were aligned with PyNAST using default parameters [46], the reference database “core_set_aligned.fasta” from the Greengenes website, and the default lanemask. Alignments were further trimmed and a phylogenetic tree was built with FastTree [50]. The three alpha diversity indexes (Chao1, Shannon index, and Phylogenetic Diversity (PD) Whole Tree) were estimated on OTU tables rarefied from 500 reads/sample up to a subsampling depth set by the minimum number of sequences per sample, with 10 iterations. Beta diversity (UniFrac and binary_Jaccard) was measured using the script jackknifed_beta_diversity.py on an even-depth rarefaction for all samples, set by the minimum number of sequences in each library, and visualized through principal coordinates analysis (PCoA).
We used the script dissimilarity_mtx_stats.py to calculate means and standard deviations for all the rarefied unweighted UniFrac distance matrices. With the script make_distance_boxplots.py we generated distance boxplots for comparisons among all host species and performed two-sample t-tests (with Bonferroni correction for multiple comparisons) on all pairs of boxplots to determine which distributions were significantly different. Procrustes analyses were performed to compare beta diversity patterns of UniFrac distances between V12 and V34, with Monte Carlo simulations (1000 permutations).
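As an illustration of the even-depth rarefaction and of two of the non-phylogenetic measures named above, the following is a minimal Python sketch. It is not the QIIME implementation: the function names, the toy OTU counts, the log base 2 for the Shannon index, and the bias-corrected form of Chao1 are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from an OTU count dict."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    return Counter(rng.sample(pool, depth))


def shannon(counts):
    """Shannon index; log base 2 is an assumption of this sketch."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total)
                for n in counts.values() if n > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for n in counts.values() if n > 0)
    f1 = sum(1 for n in counts.values() if n == 1)  # OTUs seen once
    f2 = sum(1 for n in counts.values() if n == 2)  # OTUs seen twice
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


def binary_jaccard(a, b):
    """Presence/absence Jaccard distance between two samples."""
    set_a = {otu for otu, n in a.items() if n > 0}
    set_b = {otu for otu, n in b.items() if n > 0}
    union = set_a | set_b
    if not union:
        return 0.0
    return 1 - len(set_a & set_b) / len(union)


# Hypothetical toy samples (OTU id -> read count).
rng = random.Random(0)
sample_a = {"otu1": 600, "otu2": 300, "otu3": 100}
sample_b = {"otu1": 400, "otu4": 200}

# Even depth = smallest library size, as in the rarefaction described above.
depth = min(sum(sample_a.values()), sum(sample_b.values()))
rare_a = rarefy(sample_a, depth, rng)
rare_b = rarefy(sample_b, depth, rng)

print(depth, shannon(rare_a), chao1(rare_a), binary_jaccard(rare_a, rare_b))
```

In practice the subsampling would be repeated (10 iterations in the text) and the metric values averaged; a single draw is shown here for brevity.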
Rarefaction curves of the observed number of OTUs were calculated by subsampling from 500 reads/sample up to a depth set by the minimum number of sequences in each library, with 10 iterations. Graphs were plotted with the ggplot function in R. The core microbial taxa and OTUs were computed for each species as well as for all cichlids using an in-house R script. The core.
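The rarefaction-curve calculation can be sketched as follows. This is a hedged Python illustration rather than the QIIME or R code used in the study: the function names, the step size between depths, the random seed, and the toy counts are assumptions, while the 500-read starting depth and the 10 iterations follow the text.

```python
import random
from statistics import mean


def observed_otus(reads, depth, rng):
    """Number of distinct OTUs in a random subsample of `depth` reads."""
    return len(set(rng.sample(reads, depth)))


def rarefaction_curve(counts, min_depth, max_depth, step, iterations=10, seed=0):
    """Mean observed OTUs at each depth from min_depth to max_depth.

    `step` (the spacing between depths) is an assumption of this sketch;
    the source text does not state it.
    """
    rng = random.Random(seed)
    # Expand the count table into one entry per read.
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    if max_depth > len(reads):
        raise ValueError("max_depth exceeds the number of reads in the sample")
    curve = []
    for depth in range(min_depth, max_depth + 1, step):
        values = [observed_otus(reads, depth, rng) for _ in range(iterations)]
        curve.append((depth, mean(values)))
    return curve


# Hypothetical sample with 1000 reads across three OTUs.
toy_counts = {"otu1": 900, "otu2": 90, "otu3": 10}
curve = rarefaction_curve(toy_counts, min_depth=500, max_depth=1000, step=250)
for depth, n_otus in curve:
    print(depth, n_otus)
```

Each point is the mean over the iterations at that depth, so the curve flattens once additional reads stop revealing new OTUs.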