I have 6 samples (3 conditions with 2 replicates each) and therefore 6 quant.sf files, one per sample.
I want to compare my 3 conditions pairwise: A vs B, A vs C, and B vs C.
But for that, I need to know more about splice junctions and transposable elements (TEs).
I tried to import my (pseudo)BAM files into IGV to understand how the mapped reads relate to the reference transcriptome, but I couldn't get it to work. As for TEs, I don't know how to get information about them. I have read papers, but I'm still confused.
Each isoform has some unique regions and some regions shared with other isoforms. Based on how many reads map to the unique regions of each isoform, Salmon uses an expectation-maximization algorithm to apportion the reads from shared regions between isoforms. This question has been addressed before, e.g. see Rob's answer to "Big differences between mappings computed by Salmon and quantification".
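To make the idea concrete, here is a toy sketch of that expectation-maximization step in Python. It is not Salmon's actual implementation: it ignores transcript length and the bias models Salmon also accounts for, and the function name `em_isoform_abundance` and the read counts are made up for illustration.

```python
def em_isoform_abundance(compat, n_iter=200):
    """Toy EM: split reads among the isoforms they are compatible with.

    compat is a list with one set per read, holding the names of the
    isoforms that read could have come from (hypothetical input format).
    Returns the expected number of reads assigned to each isoform.
    """
    isoforms = sorted(set().union(*compat))
    # Start from equal relative abundances.
    theta = {t: 1.0 / len(isoforms) for t in isoforms}
    counts = {t: 0.0 for t in isoforms}
    for _ in range(n_iter):
        # E-step: share each read in proportion to current abundances.
        counts = {t: 0.0 for t in isoforms}
        for c in compat:
            total = sum(theta[t] for t in c)
            for t in c:
                counts[t] += theta[t] / total
        # M-step: new abundances are the normalized expected counts.
        theta = {t: counts[t] / len(compat) for t in isoforms}
    return counts


# 30 reads unique to isoform A, 10 unique to B, 40 in a shared region.
reads = [{"A"}] * 30 + [{"B"}] * 10 + [{"A", "B"}] * 40
result = em_isoform_abundance(reads)
print({t: round(n, 2) for t, n in result.items()})
```

In this toy case the unique reads favour A three to one, so the 40 shared reads end up split in about the same ratio, giving roughly 60 reads for A and 20 for B.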
Did you position-sort and index the BAM file? Did you select one particular transcript for viewing? For BAM visualization, IGV will only show mapped reads after zooming in to small regions. Another problem may be that your transcriptome has hundreds of thousands of transcripts, and this may overload IGV. Did you check IGV memory usage after loading the BAM?
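If the number of transcripts turns out to be the problem, one thing that might be worth trying is loading a reduced reference that contains only the transcripts of the gene you want to look at. Below is an untested-on-real-data sketch in plain Python. It assumes transcript IDs of the form gene ID, dot, isoform number (e.g. AT3G54560.1) as the first word of each FASTA header; the function name `subset_fasta` and the file names are hypothetical.

```python
def subset_fasta(in_path, out_path, gene_id):
    """Copy only the FASTA records belonging to one gene.

    A record is kept when the first word of its header starts with
    gene_id + "." (assumed naming scheme, e.g. AT3G54560.1).
    Returns the list of transcript IDs that were kept.
    """
    kept = []
    keep = False
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                fields = line[1:].split()
                name = fields[0] if fields else ""
                keep = name.startswith(gene_id + ".")
                if keep:
                    kept.append(name)
            # Sequence lines follow the decision made at their header.
            if keep:
                dst.write(line)
    return kept


# Example call (hypothetical file names):
# subset_fasta("transcriptome.fa", "AT3G54560_only.fa", "AT3G54560")
```

The smaller FASTA can then be loaded in IGV as the reference in place of the full transcriptome. Whether IGV then displays the Salmon BAM without trouble is an assumption to check, since the BAM header still lists every transcript.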
I sorted and indexed my BAM files. I also zoomed in to small regions...
Actually, in the "select a chromosome to view" menu, I don't have "Chr1", "Chr2", ... but all my transcripts: "AT3G54560.1", "AT3G54570.1", etc. I guess that takes too much memory and IGV is not suitable for a transcriptome reference.
If you are mapping with Salmon you are mapping to the transcriptome, and you are already taking into account splice junctions. If you want more details about splice junctions, you have to map against the genome - two good programs for this are STAR and HISAT2.
About TEs, see this pipeline: https://github.com/hyunhwaj/SalmonTE
You have to explain in more detail what you did and report error messages, if any; otherwise it will be difficult to help you.
Thanks for your answer.
I mean, for example: one gene can have many isoform transcripts (A, B, C, ...). So how can a read be specific to isoform A and not isoform B? I can't understand how Salmon tells them apart...
Thanks for SalmonTE, I'm going to look at it.
About IGV: it requires a reference genome and BAM files from the alignment. In my case, I have a transcriptome reference and BAM files from Salmon (in reality, I read that they are not real BAM files but pseudo-BAM files).
So, I imported my transcriptome reference into IGV. It seems to recognize the reference, because there are no error messages and I can see the nucleotide sequence of the transcriptome. But when I import one of my (pseudo)BAM files from Salmon, IGV doesn't match the BAM file to the transcriptome. There's no error message, but the software struggles a lot.
I guess IGV behaves this way because it is suitable only for genomes, not for transcriptomes.