STAR cannot detect this chimeric read
8.2 years ago

Hi,

I am trying to detect chimeric fusions between a virus (integrated into the host genome) and the host genome. I aligned the reads (paired-end, 2x76, stranded) to a hybrid genome (host + virus) in which the virus genome is treated as an additional chromosome. Here is my command:

$STAR --genomeDir $stargenomeDir --outFilterScoreMinOverLread 0.3 --outFilterMatchNminOverLread 0.3 --seedSearchStartLmax 10 --outFilterMultimapNmax 10 --outFilterMismatchNmax 10 --chimSegmentMin 10 --outFilterMatchNmin 10 --chimJunctionOverhangMin 10 --readFilesIn $r1 $r2 --runThreadN $threads --outStd SAM --readFilesCommand zcat

version : STAR_2.3.1u_r375

So I expect STAR to report fusion reads with at least 10 bases aligning to either the host or the virus, and the rest aligning to the virus or the host, respectively. For example:

 # : host genome
 @ : virus genome
 = : read
 - : splicing

#######################################@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
          ==========-----------------------------------=========================
            min 10bp

But when I look in IGV at the extremity of the virus genome, I see some reads with soft clipping longer than 10 bp (in this case 13 bp). When I align these soft-clipped bases to the host genome using BLAST, I find a position on the host genome where there are indeed traces of a fusion transcript: I can clearly see reads aligned downstream of the fusion breakpoint that represent the continuation of the fusion transcript. But STAR does not report this fusion. Am I doing something wrong?

Here are the SAM lines for a pair of reads containing the soft clipping of interest:

NS500186:93:HYFJVBGXX:3:13402:19874:18099   163 chrVirus    1   255 16S59M  =   65  128 CACATGTTTAGGTTTGTGACAATGACCATGAGCCCCAAATATCCCCCGGGGGCTTAGAGCCTCCCAGTGAAAAAC AAAAAEEEEEEEEEEEEEEEEEEEAEEEEEAEEEEEEEEEEEEEEE/EEEEEEEEEEEEAEEE<EEEEEEEEEEE NH:i:1  HI:i:1  AS:i:117    nM:i:2
NS500186:93:HYFJVBGXX:3:13402:19874:18099   83  chrVirus    65  255 64M12S  =   1   -128    CGCGAAACAGAAGTCTGAAAAGGTCAGGGCCCAGACCAAGGCTCTGACGTCTCCCCCCGGAGGGACAGCTCAGCAC    AAEEEEEEEEEEEEEAEEEEEEE/EEEEEEEAEEEAEEEEE/EEEEEEEAEEEEEEEEEEEEEEEEEAEAEAAAAA    NH:i:1  HI:i:1  AS:i:117    nM:i:2
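
For anyone who wants to repeat the BLAST check, here is a minimal Python sketch that pulls the soft-clipped bases out of a SAM record given its CIGAR and SEQ fields. The function name `soft_clips` is only an illustration, not part of STAR or any library. Keep in mind that SEQ in a SAM record is stored in reference orientation, so for a reverse-strand read (flag 83 here) the clipped bases are already reverse-complemented relative to the original read.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar, seq):
    """Return (leading, trailing) soft-clipped bases of a SAM record.

    Hypothetical helper for illustration only.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard-clipped bases (H) are absent from SEQ, so ignore them at the ends.
    core = [o for o in ops if o[1] != "H"]
    lead = core[0][0] if core and core[0][1] == "S" else 0
    trail = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return seq[:lead], seq[len(seq) - trail:]


# First read of the pair above (CIGAR 16S59M).
read1 = ("CACATGTTTAGGTTTGTGACAATGACCATGAGCCCCAAATATCCCCCGGGGGC"
         "TTAGAGCCTCCCAGTGAAAAAC")
leading, trailing = soft_clips("16S59M", read1)
print(leading)  # the 16 clipped bases to use as a BLAST query
```

The same call with `64M12S` and the mate's sequence returns the 12 clipped bases at its 3' end.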

I have attached two figures illustrating my case.

Alignment on the virus :

[image]

Alignment on the host :

[image]

Thanks

RNA-Seq fusion STAR • 4.3k views

I think I found the issue. The soft-clipped sequence appears to be present at multiple positions in the host genome. I suppose STAR does not report multi-mapping fusions. I'll dig a little more to be sure.
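
A rough sanity check on why such a short clipped segment can be ambiguous: the sketch below counts exact matches of a query on both strands of a sequence, and estimates how many chance matches a k-mer would have in a random genome of a given size. The helper names and the 3 Gb genome size are assumptions for illustration (the host genome is not specified in this thread), and real genomes are not random, so treat the number as an order-of-magnitude guide only.

```python
def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def count_hits(genome, query):
    """Count exact (possibly overlapping) matches of query on both strands."""
    def count(q):
        n, i = 0, genome.find(q)
        while i != -1:
            n += 1
            i = genome.find(q, i + 1)
        return n

    rc = revcomp(query)
    # Avoid double counting when the query is its own reverse complement.
    return count(query) + (count(rc) if rc != query else 0)


def expected_random_hits(k, genome_size):
    """Expected chance matches of one k-mer on both strands of a random genome."""
    return 2 * genome_size / 4 ** k


# Assumed genome size of 3 Gb, purely for illustration.
for k in (10, 13, 16, 20):
    print(k, round(expected_random_hits(k, 3_000_000_000), 3))
```

Under these assumptions a 13 bp segment is expected to match by chance in many places, while a segment of about 20 bp would usually be unique, which is consistent with the clipped bases hitting several host positions.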

@NicoBxl: Both of the links you included for the images are not loading (I tested them). Can you post new versions and update your post?

It should work; the links are OK. I also added the SAM lines for a pair of reads.

Perhaps the links are working from your part of the world but they still don't work for me. Referring to this link for example.

I re-uploaded the pictures and updated the post. Can you see them now?

I still can't (even if I use the URL), but don't worry. It may be specific to my local firewall. Are others able to see the images?

I see images. Even when sober.

I see them even before coffee :P

@NicoBxl: You might post this to the STAR mailing list/Google group. Alex is usually pretty good about replying (given how many options STAR has, I expect he's the only one who can point to the right one to tweak).

Yes, I will do that. I will post his answer here.

If you would like to try another tool, I wrote RILseq, which was built for chimera detection in bacteria, but I don't see why it wouldn't work here. It handles multi-mapping reads and performs a statistical analysis to detect over-represented chimeras. See https://pypi.python.org/pypi/RILseq

Did you try STAR-Fusion?
