How to obtain anchors of specific length(eg-4 mer) from both the ends of the read.
1
0
Entering edit mode
9.3 years ago
neha 114 • 0

Hello,

I am working with RNA-paired end reads using Bowtie2 .I am beginner in this and want to obtain specific length of nucleotide anchors from both ends of each mapped read, in a file.

I will be highly thankful,if someone suggest me how to go for this step.

RNA-seq • 1.9k views
ADD COMMENT
2
Entering edit mode
9.3 years ago

Using my tool Bioalcidae: https://github.com/lindenb/jvarkit/wiki/BioAlcidae

The javascript:

var max_len =10;

while(iter.hasNext())
    {
    var rec = iter.next();
    if(rec.getReadUnmappedFlag()) continue;
    var cigar = rec.getCigar();
    if( cigar == null ) continue;
    var read = rec.getReadString();
    if( read == null ) continue;
    var seq = "";
    var readpos=0;
    for(var i=0;i< cigar.numCigarElements();++i)
        {
        var ce = cigar.getCigarElement(i);
        var op = ce.getOperator();
        if( ! op.consumesReadBases() ) continue;
        for(var x=0;x < ce.getLength();++x)
            {
            if( op.consumesReferenceBases())
                {
                seq+=read.charAt(readpos);
                }
            readpos++;
            }
        }
    if( seq.length < max_len*2) continue;//prevent overlap ?

    out.println(rec.getPairedReadName()+"\t"+ seq.substr(0,max_len)+"\t"+ seq.substr(seq.length-max_len));
    }

invoke;

java -jar dist-1.133/bioalcidae.jar -f script.js  samtools/examples/ex1.bam

output:

B7_591:4:96:693:509 1/2    CACTAGTGGC    GTTTAACTCG
EAS54_65:7:152:368:113 1/2    CTAGTGGCTC    TTTAACTCGT
EAS51_64:8:5:734:57 2/2    AGTGGCTCAT    TAACTCGTCC
B7_591:1:289:587:906 2/2    GTGGCTCATT    ACTCTTCTCT
EAS56_59:8:38:671:758 2/2    GCTCATTGTA    TCGTCCATGG
EAS56_61:6:18:467:281 1/2    ATTGTAAATG    CCCTGGCCCA
EAS114_28:5:296:340:699 2/2    ATTGTAAATG    CATGGCCCAG
B7_597:6:194:894:408 1/2    TGTAAATGTG    ATTGCCCAGC
EAS188_4:8:12:628:973 1/2    AAATGTGTGG    GCCCAGCATT
EAS51_66:7:68:402:50 2/2    GTGTGGTTTA    AGCATTTGGG
(....)
ADD COMMENT
0
Entering edit mode

Thank you very much :-)

ADD REPLY

Login before adding your answer.

Traffic: 1987 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6