Entering edit mode
2.9 years ago
Javier
•
0
I am transforming sam files to bam, to facilitate their ordering I use this command,
% cd /Volumes/GENOMA/BWA
% samtools sort -n -O V350019555_L03_B5GHUMqcnrRAABA-551.sam | samtools fixmate -m -O bam V350019555_L03_B5GHUMqcnrRAABA-551.bam
but it gives me the following error,
As elsewhere in samtools, use '-' as the filename for stdin/stdout. The input file must be grouped by read name (e.g. sorted by name). Coordinated sorted input is not accepted. (base) javier@iMac-de-JAVIER BWA %
I use the version of samtools 1.10
I have done what you have told me and it has generated 24 files in bam format between 25 and 240 MB. But what interests me is having only one. Wouldn't each file be a chromosome? What do I have to do to have a single file?
your pipeline crashed. You're looking at the temporary files.
and what do I have to do? what can it be due to?
You will need to restart the job. Delete the
.tmp
files before you restart the job. If it fails agains then you either do not have enough RAM or disk space to complete this job.I'm doing what you tell me. 32 bam files are generated, but after about three minutes, the 32 bam files are deleted, showing this message in terminal
That means the job was done. You should now have a sorted BAM file with mates fixed. What do you see with
and then for this command
If I actually do what you say, and the sequences grouped into 3 chromosomes appear. I don't know why in 3, and not in all because I thought that the readings of the entire file are of the entire genome
Now I try to remove the duplicate reads, and I get this error
That is because I asked you to provide only top 4 lines in output by using the
head -4
command.Providing a word description does not help us with debugging the problem. Please show the output of the commands I had asked for in previous comment.
This is that I watch in the terminal
What is LN?
LN
is length of that chromosome in bases. So it does look like you have a co-ordinate sorted BAM file.Do I see the same name in that command line for both input and output? You can't use the same name for input and output. Assuming you have not corrupted BAM file use a different output name for
samtools markdup
command for output. If the file has become corrupt you will need to re-do the previous step.If indeed it was what you said, and I have given it another name
I have not yet downloaded it in igv, but I have done a quality control, but no values appear
what do you think? keep in mind, that sol is a file and I have to merge it with 7 others, (by the way, I already know how to merge, it works the way you said the other day)
Unfortunately
flagstat
output is no good. It is saying you have nothing in your BAM file. Go backwards (if needed all the way to the original fastq files) and check every file for size. Make sure the file sizes are in line with amount of data. Somewhere along the way something failed and you have been dragging empty data files along.It generates some files, but they are deleted How do I get a bam file?
_L03_B5GHUMqcnrRAABA-551.sam | samtools fixmate -m -O BAM - V350019555_L03_B5GHUMqcnrRAABA-551.bam
[bam_sort_core] merging from 33 files and 1 in-memory blocks...
My advice still stands. Check every step of your analysis. Snippets of logs you are posting in your comments seem to indicate the right outcome but since we can't see your data/datafiles you are the only one who can sort this out.