7.9 years ago
Korsocius
Dear all, I would like to create a table from a BAM file that reports GC content and read count per length range.
INPUT: tsv file
CHROM Start stop length GC
chr1 56971 57065 94 0.287234
chr1 565460 565601 141 0.411348
chr1 754342 754488 146 0.520548
chr1 754392 754544 152 0.532895
chr1 754392 754544 152 0.532895
chr1 767020 767159 139 0.345324
Output:
chrom region length 0-10 length 10-20 ... length 190-200
chr1 0-60000 read count, mean GC
So the output should be a summary per 60 kb region: for each length class, the number of reads and the average GC of those reads (GC is taken from the 5th column of the input).
This is what I have so far for the binning:
for z in {1..22} X Y
do
    # walk along the chromosome in 60 kb windows
    for i in {0..249480000..60000}
    do
        u=$((i + 60000))
        # reads overlapping chr$z:$i-$u; the per-window summary step is still missing here
        samtools view *.bam "chr$z:$i-$u"
    done
done
Thank you Fedik
I've got this. But I need to separate the reads by length within each region... This is the biggest problem for me. And for you the easiest way :
It's great if your problem is solved. I'm experimenting a lot with process substitution, and this solution came out of one of those experiments. Anyway, the problem is solved :)
No, it is not :-( I need a summary of the length distribution. For example: for chr1:0-60000, the GC and read count for length 0-10, the GC and read count for length 10-20, and so on up to 190-200 bp, and then the same for the next region...
Change the regions and follow the method mentioned above. It will work.
I am really lost in space with this :-(
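The summary asked for above (read count and mean GC per 60 kb region and per 10 bp length class) could be sketched in awk roughly like this. It is only a sketch under assumptions: it works on the TSV from the question (CHROM, Start, stop, length, GC, with a header line), it assigns each read to a region by its Start column, it skips lengths of 200 or more, and it writes 0,NA for empty cells. The function name summarize_bins and the variables bin, step and maxlen are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: count reads and average GC per 60 kb region and 10 bp length class.
# Input (stdin): whitespace-separated CHROM, Start, stop, length, GC with a header line.
# summarize_bins, bin, step and maxlen are illustrative names, not from the thread.
summarize_bins() {
    awk -v bin=60000 -v step=10 -v maxlen=200 '
    NR == 1 && $2 !~ /^[0-9]+$/ { next }       # skip the header line
    $4 >= maxlen { next }                      # assumption: ignore lengths outside 0-200
    {
        region = int($2 / bin)                 # 60 kb region index, taken from Start
        cls = int($4 / step)                   # 10 bp length class index
        key = $1 SUBSEP region
        if (!(key in seen)) { seen[key] = 1; order[++n] = key }   # remember input order
        count[key, cls]++
        gcsum[key, cls] += $5
    }
    END {
        nclass = maxlen / step
        printf "chrom\tregion"
        for (c = 0; c < nclass; c++) printf "\t%d-%d", c * step, (c + 1) * step
        printf "\n"
        for (i = 1; i <= n; i++) {
            split(order[i], k, SUBSEP)
            printf "%s\t%d-%d", k[1], k[2] * bin, (k[2] + 1) * bin
            for (c = 0; c < nclass; c++) {
                if ((order[i], c) in count)
                    printf "\t%d,%.6f", count[order[i], c], gcsum[order[i], c] / count[order[i], c]
                else
                    printf "\t0,NA"            # no reads in this length class
            }
            printf "\n"
        }
    }'
}
```

Usage would be something like summarize_bins < fragments.tsv > summary.tsv, where fragments.tsv is a placeholder name for the input table. Doing the binning in one pass over the table avoids calling samtools once per window, but it does assume the length and GC columns have already been computed.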