Filter Taxa Based on Total Data

Arguments

x

phyloseq object.

rank

Taxonomic rank to use, e.g. Phylum.

percent_thres

Percent cut-off to use. For example, if rank is Phylum and the value is 10, all phyla that make up less than 10 percent of the total counts across all data are removed.

verbose

Logical. Prints a list of removed taxa. Default is TRUE.

Value

Filtered phyloseq object.

Details

Given a phyloseq object with count data, filterTaxaTotal calculates the percentage of the total counts contributed by each taxon at the specified rank. Taxa whose percentage is less than percent_thres are removed. The function works with ASV-level data because the ASVs are merged at the specified rank for the calculation. Therefore, all ASVs that belong to a low-abundance (rare) taxon at the specified rank are removed.
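
The calculation described above can be sketched in base R on a toy count matrix. This is only an illustration under stated assumptions, not the code used by filterTaxaTotal: the matrix counts, the family vector, and all numbers are made up, ASVs are assumed to be in rows, and the percentage is assumed to be taken over the summed counts of all samples.

```r
# Toy ASV count matrix: rows are ASVs, columns are samples (made-up numbers).
counts <- matrix(
  c(50, 40,
    30, 20,
     5,  3,
     1,  1),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("ASV", 1:4), c("S1", "S2"))
)

# Hypothetical Family assignment for each ASV.
family <- c(ASV1 = "Lachnospiraceae", ASV2 = "Lachnospiraceae",
            ASV3 = "Rikenellaceae",   ASV4 = "Victivallaceae")

percent_thres <- 5

# Merge ASVs at the Family level: total counts per family over all samples.
family_totals <- tapply(rowSums(counts), family[rownames(counts)], sum)

# Percent of the total data contributed by each family.
family_percent <- 100 * family_totals / sum(family_totals)

# Families below the threshold, and the ASVs that belong to them.
removed_families <- names(family_percent)[family_percent < percent_thres]
keep_asvs <- !(family[rownames(counts)] %in% removed_families)

# Keep only ASVs whose family passes the threshold.
filtered_counts <- counts[keep_asvs, , drop = FALSE]
```

In this toy data only one family falls below the 5 percent cut-off, so the single ASV assigned to it is dropped while the remaining ASVs are kept unchanged.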

Author

Sudarshan A. Shetty

Examples

library(biomeUtils)
data('FuentesIliGutData')
# Below we remove families that make up less than 2% of the total data
ps.filt <- filterTaxaTotal(FuentesIliGutData,
                           rank = 'Family',
                           percent_thres = 2,
                           verbose = TRUE)
#> Following taxa were removed at level: Family
#> Akkermansiaceae
#> Erysipelotrichaceae
#> Streptococcaceae
#> Methanobacteriaceae
#> Bifidobacteriaceae
#> Peptostreptococcaceae
#> Christensenellaceae
#> Rikenellaceae
#> Tannerellaceae
#> Veillonellaceae
#> Clostridiaceae 1
#> Barnesiellaceae
#> Family XI_2II
#> Coriobacteriaceae
#> Lactobacillaceae
#> Burkholderiaceae
#> Desulfovibrionaceae
#> Eggerthellaceae
#> Acidaminococcaceae
#> Marinifilaceae
#> Muribaculaceae
#> Peptococcaceae
#> Pasteurellaceae
#> Halomonadaceae
#> Enterococcaceae
#> Anaeroplasmataceae
#> Clostridiales vadinBB60 group
#> Defluviitaleaceae
#> Atopobiaceae
#> Leuconostocaceae
#> Shewanellaceae
#> Victivallaceae
#> Actinomycetaceae
#> Synergistaceae
#> Eubacteriaceae
#> env.OPS 17
#> Micrococcaceae
#> Pseudomonadaceae
#> Family XI
#> Family XI_2
#> Staphylococcaceae
#> Corynebacteriaceae
#> No. of taxa removed: 217
#> No. of samples removed: 0
ps.filt
#> phyloseq-class experiment-level object
#> otu_table()   OTU Table:         [ 688 taxa and 589 samples ]
#> sample_data() Sample Data:       [ 589 samples by 61 sample variables ]
#> tax_table()   Taxonomy Table:    [ 688 taxa by 7 taxonomic ranks ]
#> phy_tree()    Phylogenetic Tree: [ 688 tips and 687 internal nodes ]