Is TCGA Breast cancer data strand specific?
6.1 years ago
Vasu ▴ 790

Hi,

I have raw sequencing data (FASTQ files) for TCGA breast cancer samples. Initially, I ran RSeQC on the BAM file of one sample to check whether it is strand-specific.

The result indicates a strand-specific RF (reverse-forward) library, so I aligned all the samples with hisat2 using the argument --rna-strandness RF.

But I have since read somewhere that all the TCGA samples are unstranded.

Can anyone please tell me whether the data were generated with a strand-specific protocol or not?
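
For readers who want to repeat the RSeQC check described above, here is a minimal sketch of how the summary printed by `infer_experiment.py` can be turned into a strandedness call. The example texts and their numbers are made up for illustration (they are not from any TCGA sample), the function names are hypothetical, and the 0.8 cutoff is an assumption, not an RSeQC default. The mapping of the two read-pair patterns to hisat2's FR and RF follows the usual convention for paired-end data.

```python
import re

# Illustrative summaries in the style of RSeQC's infer_experiment.py output.
# The fractions are invented for this example.
UNSTRANDED_EXAMPLE = """This is PairEnd Data
Fraction of reads failed to determine: 0.02
Fraction of reads explained by "1++,1--,2+-,2-+": 0.49
Fraction of reads explained by "1+-,1-+,2++,2--": 0.49
"""

RF_EXAMPLE = """This is PairEnd Data
Fraction of reads failed to determine: 0.02
Fraction of reads explained by "1++,1--,2+-,2-+": 0.03
Fraction of reads explained by "1+-,1-+,2++,2--": 0.95
"""

PATTERN = re.compile(r'Fraction of reads explained by "([^"]+)":\s*([0-9.]+)')


def parse_infer_experiment(text):
    """Return {read-pair pattern: fraction} for each 'explained by' line."""
    fractions = {}
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            fractions[match.group(1)] = float(match.group(2))
    return fractions


def classify_strandness(fractions, threshold=0.8):
    """Call FR, RF, or unstranded. The threshold is an assumed cutoff."""
    forward = fractions.get("1++,1--,2+-,2-+", 0.0)  # read 1 on the gene's strand (FR)
    reverse = fractions.get("1+-,1-+,2++,2--", 0.0)  # read 1 on the opposite strand (RF)
    if forward >= threshold:
        return "FR"
    if reverse >= threshold:
        return "RF"
    return "unstranded"


print(classify_strandness(parse_infer_experiment(UNSTRANDED_EXAMPLE)))
print(classify_strandness(parse_infer_experiment(RF_EXAMPLE)))
```

Two roughly equal fractions point to an unstranded library, while one dominant fraction points to a stranded one, so it is worth looking at the actual numbers rather than only the label.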

RNA-Seq tcga strand geneexpression hisat2 • 2.9k views

Please refer to previous questions on that matter.


That was the one I saw. The most recent comment on that post says the data are non-stranded, which is why I asked again. I have also checked the paper, and it gives no information about strand specificity.

6.1 years ago
GenoMax 147k

Original TCGA breast cancer data generated at UNC-Chapel Hill was NOT stranded.


When I checked one of the samples with RSeQC, it appeared to be strand-specific. And from this post [Strand Specificity of Arrays and RNAseq], I understand that all Illumina technologies are strand-specific.

If the samples are not stranded and we do the alignment with a strand-specific option, will there be any problem?


As I said above, if the sample was generated at UNC-Chapel Hill, then it was prepared with a non-stranded library protocol; this is based on personal communication with the people who did the work. Alignments are not done using strand-specific options; it is the read counting that takes strandedness into consideration.
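
To illustrate why the counting step is where strandedness matters, here is a toy sketch, not featureCounts itself. The `mode` values mirror the meaning of the featureCounts `-s` option (0 unstranded, 1 stranded, 2 reversely stranded), but the function, the read list, and the single-gene setup are hypothetical simplifications: each read is reduced to the strand of its fragment, and overlapping antisense genes are ignored.

```python
OPPOSITE = {"+": "-", "-": "+"}


def count_reads(read_strands, gene_strand, mode):
    """Count reads assigned to one gene under a given strandedness mode.

    mode 0: unstranded, every overlapping read counts
    mode 1: stranded, only reads on the gene's strand count
    mode 2: reversely stranded, only reads on the opposite strand count
    """
    if mode not in (0, 1, 2):
        raise ValueError("mode must be 0, 1, or 2")
    count = 0
    for strand in read_strands:
        if mode == 0:
            count += 1
        elif mode == 1 and strand == gene_strand:
            count += 1
        elif mode == 2 and strand == OPPOSITE[gene_strand]:
            count += 1
    return count


# Toy unstranded library: 100 reads overlapping a '+' strand gene,
# half derived from each strand.
reads = ["+", "-"] * 50

for mode in (0, 1, 2):
    print(mode, count_reads(reads, "+", mode))
# prints:
# 0 100
# 1 50
# 2 50
```

In this toy case, counting an unstranded library in either stranded mode discards about half of the reads, whereas counting in unstranded mode keeps all of them.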


Yes, I understood your previous comment. My question is: if I have TCGA raw sequencing data from a non-stranded library protocol and I align it with the --rna-strandness RF (reverse-forward, strand-specific) option, will there be any problem? Will it have any effect on the count data?

I'm asking because I aligned all the TCGA BRCA samples using hisat2 with the strand-specific option --rna-strandness RF, used featureCounts to extract counts, and then used those counts for the analysis.


I do not know but let me check with other mods who may.


My understanding is that --rna-strandness doesn't affect mapping; instead, it adds an XS tag, which is needed by Cufflinks and StringTie. It shouldn't affect featureCounts.
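
One way to check this on your own data is to align a sample twice, with and without the option, and compare the records while ignoring the XS tag. The sketch below shows only the comparison step. The two SAM lines are invented for illustration, and the helper names are hypothetical.

```python
def strip_xs(sam_line):
    """Split a SAM record into its 11 mandatory fields and its optional
    tags, dropping any XS tag and sorting the rest so order is ignored."""
    fields = sam_line.rstrip("\n").split("\t")
    mandatory = tuple(fields[:11])
    optional = tuple(sorted(tag for tag in fields[11:] if not tag.startswith("XS:")))
    return mandatory, optional


def same_apart_from_xs(line_a, line_b):
    """True if two SAM records differ at most in their XS tag."""
    return strip_xs(line_a) == strip_xs(line_b)


# Invented records for the same read, without and with an XS tag.
without_option = "read1\t99\tchr1\t1000\t60\t4M\t=\t1200\t204\tACGT\tIIII\tNH:i:1"
with_option = "read1\t99\tchr1\t1000\t60\t4M\t=\t1200\t204\tACGT\tIIII\tNH:i:1\tXS:A:-"

print(same_apart_from_xs(without_option, with_option))
```

If the position, CIGAR string, and mapping quality agree for every read, the option changed only the tag and not the mapping.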


The post you link is somewhat unfortunate. While it is true that Illumina platforms sequence one specific strand, whether the strand information from the mRNA is preserved depends on the library prep. If you use an unstranded kit, it is lost, no matter what platform you use. The only way to know for sure is to get your hands on the original lab protocol.
