A basic question regarding an lncRNA identification pipeline
8.5 years ago

Hi,

I have been analyzing RNA-Seq data sets from several breast cancer cell lines to create a high-confidence list of expressed lncRNAs. However, as I am new to NGS, I cannot figure out how to filter the known expressed genes and protein-coding transcripts out of my annotation file after Cufflinks assembly. Are there any specific tools for this filtering? Any help would be much appreciated.

Thanks

R

RNA-Seq lncrna pipeline • 2.0k views
8.5 years ago

Thanks, but my question is slightly different. After TopHat alignment with Bowtie2, I used RABT assembly in Cufflinks and merged all transcripts (using a GTF file of annotated transcripts), then ran cuffmerge on the replicates. After running cuffcompare with -r set to the annotated GENCODE assembly, I got transfrags labeled with different class codes ('=', 'c', 'x', etc.). Now I want to extract all transfrags with class codes 'i', 'j', 'o', 'u', and 'x', while also making a separate file of known lncRNAs (by matching against the Body Map annotated lncRNA.gtf). I am curious whether I can do all of that on the command line in one command, something like:

awk '$22 ~ /[ijoux]/ { print }' ..
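For illustration, here is a minimal sketch of that filtering step. It assumes the cuffcompare output is a tab-separated GTF whose ninth column carries an attribute of the form class_code "j";. The function name filter_class_codes and the file names in the usage comment are hypothetical. It matches the attribute by name instead of by a fixed field such as $22, because the position of class_code can shift when records carry different sets of attributes.

```shell
# Sketch: keep only transfrags whose class_code is one of i, j, o, u, x.
# Assumes a tab-separated GTF with the attributes in column 9.
filter_class_codes() {
    # Reads GTF from stdin, or from any files passed as arguments,
    # and prints the matching records to stdout.
    awk -F'\t' '$9 ~ /class_code "[ijoux]"/' "$@"
}

# Example usage (file names are hypothetical):
# filter_class_codes cuffcmp.combined.gtf > candidate_transfrags.gtf
```

This covers only the class-code selection. Building the separate file of known lncRNAs would be a second step, for example by comparing transcript IDs against the lncRNA annotation.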