2.6 years ago
Ishan
I have a CRAM file with paired reads which looks like this:
im13@node-13-21:~/scratch_im13_projects/im13_basespace_runs$ samtools view ./walkup_194_repeat/CRAM/A01_FR_KAPA_25x_1ug_SR_1ngx4rxns_S1.cram | head
D00586:937:HVCWGBCX3:1:1101:1485:1803 77 * 0 0 * * 0 0 NCAGAGGAAGCGGAACGCATGTTTC #<GGGIIGIGGGIIGIGIIGGG.<<
D00586:937:HVCWGBCX3:1:1101:1485:1803 141 * 0 0 * * 0 0 AGGGTGTTCGGGCCGCTGCTCTGCA GAGGGGGGGGGGGIIIIIIGIGGGG
D00586:937:HVCWGBCX3:1:1101:1440:1901 77 * 0 0 * * 0 0 NGTACCGTGCGACATCGCGAGTATC #<<<GGAGGIAAGIGGGIG<GA<<<
D00586:937:HVCWGBCX3:1:1101:1440:1901 141 * 0 0 * * 0 0 CTGTCTGTCTCAATGCCACACTGCA G<G<AGGGGGGGIGGIIIIIIGGGG
D00586:937:HVCWGBCX3:1:1101:1549:1836 77 * 0 0 * * 0 0 NTGAAGATGATCGCTTATACGTATC #<<GGGIIIGGIGGGIGIGIG<.<<
D00586:937:HVCWGBCX3:1:1101:1549:1836 141 * 0 0 * * 0 0 CTGTGTCGCCCTCGTCCCCGCTGCA AGGGGGIGGGIGIAGGGIIGGAGGG
D00586:937:HVCWGBCX3:1:1101:1705:1849 77 * 0 0 * * 0 0 NGGGAGAATGCCATGCATTGGTTTC #<<GGIGIIIIIIIIGIGGGIG<<<
D00586:937:HVCWGBCX3:1:1101:1705:1849 141 * 0 0 * * 0 0 GCCAGGAATTCCAGGCTCACCTGCA GGGGGIIIGAGIIIIIIGIIIGGGI
I would like to trim the 3' ends of these 25 bp reads so that each read is 20 bp long, e.g.
for the first read: NCAGAGGAAGCGGAACGCATGTTTC
--> NCAGAGGAAGCGGAACGCAT
for the second read: AGGGTGTTCGGGCCGCTGCTCTGCA
--> AGGGTGTTCGGGCCGCTGCT
How can I do this and save the output?
Many thanks!
These look like unaligned reads. You may be able to pipe them through samtools collate (the reads already appear to be collated, but just in case), then samtools fastq, then a program that does the trimming (such as bbduk.sh from BBTools), and finally back through samtools to write CRAM, if you want to restore the format.
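As a rough sketch of the first half of that pipeline, the function below collates the CRAM and writes paired FASTQ files. The function name and file names are placeholders, not something from this thread, and it assumes samtools is on the PATH.

```shell
# Hypothetical helper: extract paired reads from an unaligned CRAM.
# Usage: cram_to_fastq in.cram out_R1.fastq out_R2.fastq
#
# collate: group mates together (-u = uncompressed, -O = write to stdout)
# fastq:   -1/-2 receive read 1 / read 2, -0 and -s discard other reads,
#          -n keeps read names unchanged, "-" reads from stdin
cram_to_fastq() {
    samtools collate -u -O "$1" |
        samtools fastq -1 "$2" -2 "$3" -0 /dev/null -s /dev/null -n -
}
```

Writing to files rather than streaming straight into the trimmer keeps each stage easy to inspect, at the cost of some disk space.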
I also have the R1 and R2 FASTQ files that were combined to make the CRAM.
Can I run the trimming on the FASTQ files and then convert to CRAM? If so, how can I do this? Apologies, I am not familiar with bbduk.sh.
Yes, you can. Assuming you want to do this for both reads, you can trim the R1 and R2 FASTQ files together in a single bbduk.sh run.
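One possible version, as a sketch only: the file names and the function name trim_fastq are placeholders, and the bbduk.sh line is left as a comment because it assumes BBTools is installed. ftr=19 is a forced right-trim at 0-based position 19, which keeps the first 20 bases. The awk function does the same fixed-length trim with standard tools.

```shell
# With BBTools installed, both files could be trimmed in one run
# (placeholder file names; ftr is 0-based, so ftr=19 keeps 20 bases):
#   bbduk.sh in1=R1.fastq.gz in2=R2.fastq.gz out1=R1.trim.fastq.gz out2=R2.trim.fastq.gz ftr=19

# Tool-free alternative: keep the first N characters of the sequence line
# (line 2) and the quality line (line 4) of every FASTQ record.
# Assumes plain 4-line records, i.e. sequences are not wrapped.
# Usage: trim_fastq 20 < in.fastq > out.fastq
trim_fastq() {
    awk -v n="$1" 'NR % 4 == 2 || NR % 4 == 0 { print substr($0, 1, n); next }
                   { print }'
}
```

For gzipped input, something like gzip -dc R1.fastq.gz | trim_fastq 20 | gzip > R1.trim.fastq.gz should work, run once per file. Because every read is cut to the same fixed length and none are dropped, R1 and R2 stay in the same order.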
Then convert the trimmed files to CRAM.
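For the conversion step, one option is samtools import, assuming a samtools release recent enough to provide that subcommand. This is only a sketch; the function name and file names are placeholders.

```shell
# Hypothetical helper: write paired FASTQ files as an unaligned CRAM.
# Usage: fastq_to_cram R1.trim.fastq.gz R2.trim.fastq.gz out.cram
fastq_to_cram() {
    # -1/-2 = read 1 / read 2 input; -O cram = output format; -o = output file
    samtools import -1 "$1" -2 "$2" -O cram -o "$3"
}
```

The result can be checked the same way as in the question, with samtools view out.cram | head.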