Removing unassigned OTUs
6.4 years ago

I have a number of sequences that could not be assigned down to genus level. For downstream analysis I want to remove those OTUs/sequences/ASVs from the phyloseq object (OTU table). Could somebody suggest a script to remove them? Thanks in advance.

EDIT:

The input and output are provided both as images and as text. You can see that the ASVs which are not assigned to genus level (NA), i.e. ASV6, ASV7, ASV8 etc., have been removed in the output. I want the same for the whole table. For certain reasons I don't want to use Excel.

Input

https://ibb.co/dLxOz8

#otuid  sample1 sample2 Phylum  Class   Order   Family  Genus   species 
ASV6    0   0   Cyano   [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   NA  NA  
ASV7    7549    0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   NA  NA  
ASV8    165 0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   NA  NA  
ASV11   0   0   TM7 TM7-3   I025    Rs-045  NA  NA  
ASV12   0   0   TM7 TM7-3   I025    Rs-045  NA  [Saccharibacteria]_UB2523   
ASV13   0   0   TM7 TM7-3   I025    Rs-045  [Saccharibacteria]  [Saccharibacteria]_UB2523   
ASV1    14692   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV2    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV3    7347    5107    Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV4    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV5    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV9    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] [Melainabacter_A1]  
ASV10   3685    2773    Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] [Melainabacter_A1]

Output

https://ibb.co/dQRae8

#otuid  sample1 sample2 Phylum  Class   Order   Family  Genus   species 
ASV13   0   0   TM7 TM7-3   I025    Rs-045  [Saccharibacteria]  [Saccharibacteria]_UB2523   
ASV1    14692   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV2    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV3    7347    5107    Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV4    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV5    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] NA  
ASV9    0   0   Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] [Melainabacter_A1]  
ASV10   3685    2773    Cyanobacteria[Melainabacteria]  [Melainabacteria]   [Melainabacteriales]    [Melainabacteriaceae]   [Melainabacter] [Melainabacter_A1]
R phyloseq microbiome OTU next-gen • 3.1k views
6.4 years ago
GenoMax 147k

In that case you don't need to use R. You could use awk and not print any line where the 8th field matches NA.

# print the header (line 1) plus every line whose 8th field (Genus) is not exactly NA
awk -F'\t' 'NR == 1 || $8 != "NA"' input.txt > out.txt
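If the Genus column is not always the 8th field, a variant that looks the column up by name in the header may be safer. This is only a minimal sketch: it assumes the table is tab-separated, that the header contains a column called exactly Genus, and that unassigned entries are the literal string NA. The file names example_input.txt and example_out.txt and the cut-down four-column table are made up for illustration.

```shell
# Build a tiny tab-separated example (hypothetical file, reduced columns).
printf '#otuid\tsample1\tGenus\tspecies\n' > example_input.txt
printf 'ASV6\t0\tNA\tNA\n' >> example_input.txt
printf 'ASV13\t0\t[Saccharibacteria]\t[Saccharibacteria]_UB2523\n' >> example_input.txt
printf 'ASV1\t14692\t[Melainabacter]\tNA\n' >> example_input.txt

# Find the column named "Genus" in the header, then keep the header
# and every row whose value in that column is not exactly "NA".
awk -F'\t' -v col="Genus" '
    NR == 1 {
        for (i = 1; i <= NF; i++) if ($i == col) c = i
        if (!c) { print "column " col " not found" > "/dev/stderr"; exit 1 }
        print
        next
    }
    $c != "NA"
' example_input.txt > example_out.txt
```

Rows such as ASV1, which have a genus but no species, are kept, because only the Genus column is tested. To filter on another rank, change the value passed with -v col=.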
6.4 years ago
zx8754 12k

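This assumes you have read the file into R with something like myData <- read.table(file = "myFile.txt", header = TRUE, sep = "\t", comment.char = ""), i.e. that the file is tab-separated. header = TRUE is needed so that the Genus column can be referred to by name, and comment.char = "" stops R from skipping the header line, which starts with #. You can then drop the rows where Genus is NA as below: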

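# keep only the rows where Genus was assigned (not NA)
myDataClean <- myData[!is.na(myData$Genus), ]

# then write the result to a tab-separated file, without row names or quotes
write.table(myDataClean, "myFileClean.txt", sep = "\t", quote = FALSE, row.names = FALSE)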