Regarding Bowtie-1.0.0
11.7 years ago
Raghav ▴ 100

Hello everyone, I have 454 sequencing data that I want to map to a reference genome. I have built the index (database) in the bowtie folder to map my reads, and my data file (a .fna file) contains reads with unique IDs, like this:

>read1
ACGATCGCAGATC----hindIII site-----CGATCGCATCGCATCAGCAT.. 
>read2
ACGATCGCAGATC----AAGCTT with addition of 1 or 2 nts in between cut position-----CGATCGCATCGCATCAGCAT..

and so on. I want to map this type of sequence to my reference genome, but I don't know how to choose the parameters in bowtie. If anyone has experience with this type of mapping, please help me out. You may also suggest better mapping tools.

Waiting for your valuable comments.

bowtie 454 • 2.3k views
11.7 years ago
Ido Tamir 5.2k

It's not completely clear to me what your reads look like, but I think you have some sort of fixed 5' and 3' adaptor. Or are the ---- your insert sequences, and you have a HindIII site somewhere in read1 but not in read2?

With bowtie 1 you would have to trim the reads. If it's an exact number of nucleotides and the data is in FASTA/FASTQ format, this is easy with e.g. fastx trimmer, where you can enter the first base and the last base to keep. Or, because it's 454 data (so not a lot of data) and you seem inexperienced with bioinformatics, upload it to Galaxy and use the tools there (fastx trimmer is there, IIRC).
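If you would rather script the fixed-position trimming yourself, here is a minimal Python sketch of the same idea (keep bases `first` through `last`, 1-based and inclusive). The function names `read_fasta` and `trim_fasta` are made up for this illustration and are not part of any toolkit; the sketch assumes plain FASTA input.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA lines (multi-line records allowed)."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def trim_fasta(lines: Iterable[str], first: int, last: int) -> Iterator[Tuple[str, str]]:
    """Keep bases first..last (1-based, inclusive) of every read."""
    if first < 1 or last < first:
        raise ValueError("need 1 <= first <= last")
    for header, seq in read_fasta(lines):
        # Python slices are 0-based and end-exclusive, hence first - 1.
        yield header, seq[first - 1:last]


if __name__ == "__main__":
    demo = [">read1", "ACGTACGTAC", ">read2", "GGGG", "CCCC"]
    for name, seq in trim_fasta(demo, first=3, last=6):
        print(f">{name}\n{seq}")
```

Reads shorter than `last` are simply kept up to their end; whether that is acceptable depends on your data.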

With bowtie 2 you could align in local mode without needing to trim, but that's suboptimal.
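For reference, a local-mode bowtie 2 call on FASTA reads could be assembled roughly as below. This is only a sketch: the file names `ref_index`, `reads.fna` and `reads.sam` are placeholders, and the helper `bowtie2_local_command` is made up for illustration. The options used (`--local`, `-f` for FASTA input, `-x` for the index prefix, `-U` for unpaired reads, `-S` for SAM output) are standard bowtie 2 options.

```python
import shlex
from typing import List


def bowtie2_local_command(index_prefix: str, reads_fasta: str, out_sam: str) -> List[str]:
    """Build (but do not run) a bowtie2 local-alignment command for FASTA reads."""
    return [
        "bowtie2",
        "--local",            # soft-clip read ends instead of forcing end-to-end alignment
        "-f",                 # reads are in FASTA format
        "-x", index_prefix,   # prefix of an index built with bowtie2-build
        "-U", reads_fasta,    # unpaired reads
        "-S", out_sam,        # SAM output file
    ]


if __name__ == "__main__":
    cmd = bowtie2_local_command("ref_index", "reads.fna", "reads.sam")
    print(shlex.join(cmd))
```

To actually run it you would pass `cmd` to `subprocess.run(cmd, check=True)` with bowtie 2 installed. Note that an index built for bowtie 1 cannot be reused; bowtie 2 needs its own index from `bowtie2-build`.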

11.7 years ago
Raghav ▴ 100

Dear Tamir, thank you for your kind reply. My reads look like this:

G6H7M2K02GSJ6M rank=0000448 x=2668.5 y=2204.0 length=101
GTGTTGGGTGTGTTTGGTGTGTTGTTTTCTAACAAGGATACACTACTTAGGCTTTTAAGATCGGGTTGCGGTTTAAGTTCTTATACTCAATCATACACAAT
G6H7M2K02IOEII rank=0001256 x=3441.0 y=1256.0 length=47
GTGTTGGTGTGTTGTTGGTGTTGTATCAGTCAGCACACAGGGAGTAG
G6H7M2K02I46FY rank=0001789 x=3632.0 y=1612.0 length=61
TATAGTAGATGAGGTCTAGTCCTAAACTCGTCTCGTCTAACACCTATATAAATAGGTTTAC

In each read, a HindIII cut site is present somewhere in the middle, and theoretically each read should map to two different locations (either on the same chromosome or on different chromosomes). I am new to this field (NGS data analysis). I am now attempting the same task in bowtie2 but am unable to set the parameters. I used this command for bowtie1 and did not get the desired output:

./bowtie2 <reference database> -f <query> --all <outputfile>

I am expecting a result something like this:

G6H7M2K01C5MVF, 75..19 of 345 and Chr4, 5564653..5564709 of 18585056 (52/57 ident)

75 TTCCCCCATCAAGAAATAGAACTGACTAATCCTAAGTCAAAGGGTCGAAAAACCCAA 19

5564653 TTCTCCCATCAAGAAATAGAACTGACTAATCCTAAGTCAAAGAGTCAAGAAACTCAA 5564709

Explanation: positions 75 to 19 of read G6H7M2K01C5MVF (length 345) map to chromosome 4 at 5564653..5564709, and the rest of the read maps to the same chromosome at 5564536..5564622:

G6H7M2K01C5MVF, 212..125 of 345 and Chr4, 5564536..5564622 of 18585056 (82/89 ident)

212 TATCCATTCTTATTCGATCACAGCGAGGGAGCAAGTCAAAATAGAAAAACTCACATTCATTGGGTTTAGGGATAATCAGGCTCGA-ACT 125

5564536 TATCTATTCTTATTCGATCACAGCGAGGAAGCAAGTTAAAATAGAAAAACTCACATTTATTGGGTTTAGGGATAATCAGGC--GACACT 5564622
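For readers unfamiliar with the "(52/57 ident)" notation: it is the number of identical columns over the total number of alignment columns. A small Python sketch that counts this for the first alignment shown above (the function name `identity` is made up for illustration, and gap columns written as `-` are counted as non-identical):

```python
from typing import Tuple


def identity(aligned_read: str, aligned_ref: str) -> Tuple[int, int]:
    """Return (identical columns, total columns) for two equal-length aligned strings."""
    if len(aligned_read) != len(aligned_ref):
        raise ValueError("aligned strings must have the same length")
    matches = sum(
        1 for a, b in zip(aligned_read, aligned_ref) if a == b and a != "-"
    )
    return matches, len(aligned_read)


# First alignment from the post: read positions 75..19 against Chr4 5564653..5564709.
READ = "TTCCCCCATCAAGAAATAGAACTGACTAATCCTAAGTCAAAGGGTCGAAAAACCCAA"
REF = "TTCTCCCATCAAGAAATAGAACTGACTAATCCTAAGTCAAAGAGTCAAGAAACTCAA"

if __name__ == "__main__":
    same, total = identity(READ, REF)
    print(f"{same}/{total} ident")
```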

Waiting for your kind reply.


You should add your response to the answer as a comment, or include your clarification as a comment to your original post.


As matted suggested, it would have been better to rewrite your question with this information. Also, try the aligners that matted suggested. But before this, I think there is no way around preprocessing the reads by a) trimming them and b) splitting them at the HindIII site, then aligning each of the split reads separately to the genome (with an aligner suitable for 454 sequences), and then combining the alignments of the two split reads (or maybe paired-end mode with discordant reads works). And again: use bowtie2 in local mode.
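The splitting step (b) could be sketched in Python as follows. This is an illustration under assumptions, not a finished tool: it only finds an exact `AAGCTT` match (so reads with 1 or 2 extra nucleotides inside the site, as in the original question, would need a fuzzier search), it splits at the first occurrence only, and keeping the full site on both fragments is just one possible choice. The name `split_at_site` is made up for this example.

```python
from typing import List

HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence


def split_at_site(seq: str, site: str = HINDIII_SITE) -> List[str]:
    """Split a read at the first exact occurrence of the site.

    Both fragments keep the site (assumption: each genomic fragment
    ends in a complete site). Reads without the site are returned unsplit.
    """
    seq = seq.upper()
    idx = seq.find(site)
    if idx == -1:
        return [seq]
    left = seq[: idx + len(site)]
    right = seq[idx:]
    return [left, right]


if __name__ == "__main__":
    read = "ACGTACGT" + HINDIII_SITE + "GGGGCCCC"
    for number, fragment in enumerate(split_at_site(read), start=1):
        # Hypothetical naming scheme so the two halves can be re-paired after alignment.
        print(f">read1/{number}\n{fragment}")
```

Each fragment can then be written out as its own FASTA record and aligned separately; very short fragments will probably not align uniquely and may need to be filtered out.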

11.7 years ago
matted 7.8k

Why do you want to use bowtie 1? That's close to the worst choice you could make for 454 data.

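454 data is characterized by short indels stemming both from homopolymer repeat length errors and the general error structure of pyrosequencing. Bowtie 1 does not perform gapped alignment, so reads containing such indels will often fail to align.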

Try searching this forum (and other sites) for better tools to use with 454 data:

What's The Best Reads Aligner For 454 Data Now???

What Aligner To Use For Paired-End 454 Reads?
