Samtools: Get Alignment (No Overlap)
13.4 years ago
Zhshqzyc ▴ 520

Given a chromosome region, which command extracts the sequence alignments in that region without overlaps?

Thanks.

13.4 years ago
Zhidkov ▴ 600

Did you look at the samtools FAQ? http://sourceforge.net/apps/mediawiki/samtools/index.php?title=SAM_FAQ

I looked at it, but samtools has many command options and I still don't understand them. For example, if the region is chr21:38781803-38782933, what is the command?

13.4 years ago

From the samtools help or the samtools website:

samtools view BAMFILE chr2:1,000,000-2,000,000

If this does not do what you want, perhaps you can clarify your question a bit.

The output contains the raw data with overlap. But I don't want overlap.

Don't forget to write the output to a new BAM file: add -b (samtools view prints SAM text by default) and redirect with > new.bam

You will need to write code to get reads that do not overlap other reads. I do not know of a simple tool that pulls out only reads that do not overlap other reads.

So can we convert the .bam file to a .fasta file? Then we could simply use the Microsoft biology tools to get the result. They can open a FASTA file and get the aligned sequence.

If the bam file is sorted, this shouldn't be hard - just output the first read, and drop the reads starting before its end. But why would you want this?
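
A rough sketch of that idea in Python, for anyone who wants to try it. It works on SAM text lines that are already sorted by coordinate, keeps the first read, and drops every later read on the same reference that starts at or before the end of the last kept read. The function names and the example records are made up for illustration, and the end position is computed from the CIGAR string (the M, D, N, = and X operations consume reference bases).

```python
import re

# CIGAR operations as (length, op) pairs; only some ops consume reference bases.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")


def reference_end(pos, cigar):
    """Return the 1-based inclusive end of an alignment starting at pos."""
    span = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return pos + max(span, 1) - 1  # a missing CIGAR ("*") counts as one base


def non_overlapping(sam_lines):
    """Yield coordinate-sorted SAM records that do not overlap the last kept one."""
    last_ref, last_end = None, 0
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # header lines pass through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        flag, ref, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 0x4 or ref == "*":
            continue  # skip unmapped reads
        if ref != last_ref or pos > last_end:
            yield line
            last_ref, last_end = ref, reference_end(pos, cigar)


# Hypothetical example records: name, flag, reference, position, MAPQ, CIGAR, ...
example = [
    "r1\t0\tchr21\t100\t60\t50M\t*\t0\t0\t*\t*",
    "r2\t0\tchr21\t120\t60\t50M\t*\t0\t0\t*\t*",
    "r3\t0\tchr21\t150\t60\t50M\t*\t0\t0\t*\t*",
    "r4\t0\tchr21\t199\t60\t10M\t*\t0\t0\t*\t*",
    "r5\t0\tchr22\t10\t60\t5M\t*\t0\t0\t*\t*",
]
kept = [rec.split("\t")[0] for rec in non_overlapping(example)]
print(kept)  # ['r1', 'r3', 'r5']
```

On real data the same generator could be fed line by line from the output of samtools view -h on a sorted BAM, so the whole file never has to sit in memory. Note that this greedy pass keeps whichever read happens to come first; it does not try to pick the "best" read among overlapping ones.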

Okay. So you mean using samtools sort -o in.bam out.prefix? How can I get the first read? The BAM file is too big, 200 GB. How long will it take?

"samtools view BAMFILE | head -1" will give you the first read. I think that Ketil and I still have no idea what you want to do, though. Please edit your original question with quite a bit more detail.

Zhshqzyc: why don't you extract a smaller data set and experiment a bit with samtools? If you have Illumina reads, you could use 'head -40000 data.fq > small_data.fq' to extract the first 10K reads (each FASTQ record is four lines). That should be a bit more manageable than your 200G file and help you work out what you want to do, and how to achieve it.
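
If head is not available, the same subsetting can be sketched in Python. This assumes the common layout of exactly four lines per FASTQ record (no wrapped sequence lines); the function names and default are hypothetical.

```python
from itertools import islice


def first_reads(fastq_lines, n_reads):
    """Return an iterator over the lines of the first n_reads FASTQ records.

    Assumes four lines per record: header, sequence, '+', qualities.
    """
    return islice(fastq_lines, 4 * n_reads)


def write_subset(src_path, dst_path, n_reads=10_000):
    """Copy the first n_reads records from src_path to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(first_reads(src, n_reads))
```

Calling write_subset("data.fq", "small_data.fq") would then copy 40,000 lines, which matches the head command above.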
