alignment containing sequences from position a to b
1
0
Entering edit mode
8.0 years ago

Hello everyone,

I'm looking for an alignment containing reads started or clipped at the first position and ended or clipped at the last position of my alignment. I don't want to have reads ended in the middle of the sequence.

For that I've done an alignment from fastq input with bwa on my entire gene, then selected a region between 2 positions (samtools view -b trialsorted.bam AA_AD:30-230 > trialsorted30-230.bam) and then realign these sequences with a reference containing only the first bases of my start position and the same for the last bases of my end position in order to keep only reads of interest. However I didn't get a selection of sequences starting from a position and ending to another position.

Does anyone have any idea how to solve my problem?

Thanks

alignment sequence sequencing • 3.4k views
ADD COMMENT
0
Entering edit mode
java -jar picard.jar FilterSamReads I=align-sort.bam O=out.bam JS=clippFilter.js FILTER=includeJavascript

with clippFilter.js containing:

function accept(r)
 {
 return r.getContig()=="AA_AD";
 }
accept(record);

Using this command, I got something, however this one

function accept(r)
 {
 return r.getContig()=="AA_AD" &&
  r.getStart()==30 &&
  r.getEnd()==130;
 }
accept(record);

does not allow me to crop sequences from 30 to 130, nothing pass through the filter. Using igv, I saw my input contain sequences that overlap this region.

ADD REPLY
0
Entering edit mode
8.0 years ago

using samjs https://github.com/lindenb/jvarkit/wiki/SamJS

samtools view -b in.bam AA_AD:30-230 | java -jar dist/samjs.jar -e 'record.getContig()=="AA_AD" && record.getStart()==30 && record.getEnd()==230'
ADD COMMENT
0
Entering edit mode

thank you for your rapid answer! I can't however run it with 'FilterSamReads'

ADD REPLY
0
Entering edit mode

I still don't get it, how to run this command with FilterSamReads? I tried without success:

samtools view -b in.bam AA_AD:30-230 | java -jar FilterSamReads.jar -e 'record.getContig()=="AA_AD" && record.getStart()==30 && record.getEnd()==230'

java -jar FilterSamReads.jar INPUT=in.bam -e 'record.getContig()=="AA_AD:30-230" && record.getStart()==30 && record.getEnd()==230' OUTPUT=out.bam
ADD REPLY
0
Entering edit mode

And what was the error displayed ? What did you see ?

ADD REPLY
0
Entering edit mode

ERROR: Invalid argument '-e'

impossible to find something equivalent in the options

ADD REPLY
0
Entering edit mode

"Invalid argument".

I will not give you the solution: the manual for this picard tool is here https://broadinstitute.github.io/picard/command-line-overview.html#FilterSamReads .

ADD REPLY
1
Entering edit mode

by the way, nowadays, there is no such file "FilterSamReads.jar ", there is only "picard.jar". Your version of picard looks very old. What is the version ? current is 2.7.1

ADD REPLY
1
Entering edit mode

Ok thank you! After having updated picard and carefully read of the manual, I ran this command: java -jar picard.jar FilterSamReads I="insorted.bam", O="outsorted.bam", READ_LIST_FILE="read_names.txt", FILTER="includeAligned"

with read_names.txt file containing: AA_AD:30-230 -e 'record.getContig()=="AA_AD:30-230" && record.getStart()==30 && record.getEnd()==230'

However, there is still something wrong, I don't understand what happen with insorted.bam (sorted and present in my folder): "ERROR 2016-11-22 08:54:11 FilterSamReads Failed to filter aa.bam, htsjdk.samtools.SAMException: Cannot read non-existent file: /home/admin/Bureau/insorted.bam, at htsjdk.samtools.util.IOUtil.assertFileIsReadable(IOUtil.java:338) at htsjdk.samtools.util.IOUtil.assertFileIsReadable(IOUtil.java:325) at picard.sam.FilterSamReads.doWork(FilterSamReads.java:210) at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:208) at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95) at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)"

ADD REPLY
2
Entering edit mode

The error is quite clear. The file doesn't exist in the path specified. There is no /home/admin/Bureau/insorted.bam. Where is insorted.bam?

And what's up with the comma's and "" in your command? I can't find those in the manual.

ADD REPLY
0
Entering edit mode

Thanks for opening my eyes!

ADD REPLY
0
Entering edit mode
java -jar picard.jar FilterSamReads I=insorted.bam O=outsorted.bam JS=clippFilter.js FILTER=includeJavascript

with -e 'record.getContig()=="AA_AD" && record.getStart()==30 && record.getEnd()==230' in clippFilter.js file. I however got a syntax error after the first " ' "

ADD REPLY
0
Entering edit mode

You're mixing everything. There is no such '-e' . This option '-e' is for another tool, not for picard. Your script in picard should be only:

record.getContig()=="AA_AD" && record.getStart()==30 && record.getEnd()==230
ADD REPLY
0
Entering edit mode

that's what I've finally done (and also with a script using 'function'), no problem with 'record.getContig' but when I add 'record.getStart' and/or 'record.getEnd' I get no record, as mentioned in my post above. I've tried with other files, other positions without success Here is the start of my sorted bam:

ZUXCF:05542:10628   0   AA_AD   1   34  33S62M1D5M  *   0   0   GCTTCGTAAGCTGTCAGCCTCTCACGGTCCGCTATGTTTTTCAACCCGTATCTGAGCGGCGGCGTGACCGGCGGTGCGGTCGCGGGTGGCCGGCGCAGCG    >8828<<<9=>>>===<<8<<=<<>>9==8<==>@=C>C=3=?9?A8>>=?<<<><<<78808:<===9=9::6:===:>>>666+68/:6<8<<<==<>    NM:i:1  MD:Z:62^T5  AS:i:62 XS:i:0
ZUXCF:03174:13233   16  AA_AD   1   60  68S98M  *   0   0   ATCTAGTGCTGCATGTGTATTTTCTTTGTGATTTTGCTTCGTAAGCTGTCAGCCTCTCACGGTCCGCTATGTTTTTCAACCCGTATCTGAGCGGCGGTGTGACCGGCGGTGCGGTCGCGGGTGGCCGGCGTCAGCGTTCGCAGCCCGGCTCCGCGCAGGGCTCGGG  0)++++..,,-..<<<:::4=<<;6;;8887)7777;7=<<<8>==;;::770788<<883=<7<;;8888)888886;3;;;@====<7777770::;;;94<7994=</0,><99;6<<<77282<<<;;;;880888;;;7==9=<=9<<<:;82::<0/.?A  NM:i:1  MD:Z:29C68  AS:i:93 XS:i:0
ADD REPLY
0
Entering edit mode

so you don't have any read starting at 30...

ADD REPLY
0
Entering edit mode

it's only the beginning of the file... My aim is not to select reads starting at a defined position and ending at a defined position but get an interval of sequences between two positions even if I have piece of reads

ADD REPLY
1
Entering edit mode

You may find a way to trim the aligned sequence at position 30, then do the same with 230 and remove all the part-aligned sequences using FilterSamReads record.getStart()==30 && record.getEnd()==230

ADD REPLY
0
Entering edit mode

how can I trim aligned sequences at position 30 and 230?

ADD REPLY

Login before adding your answer.

Traffic: 1818 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6