Hi, all,
Recently I used samtools sort to sort a BAM file. Unfortunately, when I added -m 4G to samtools sort, thousands of tmp files were created. The program ran very slowly, so I terminated it.
But no matter how many times I tried to delete those tmp files, they were regenerated.
I thought this might be caused by the 'flush' program of the Linux system, which might keep writing buffered data to the disk, so I restarted the computer. However, these files still cannot be deleted.
I wonder whether there is a specific mechanism that maintains the existence of these files, and how I can delete them.
Thank you!
Here is the code:
----sort
samtools sort -m 4G in.bam out
#unfortunately samtools sort may not recognize 4G.
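If the installed samtools does not parse the G suffix, one possible workaround is to pass the limit as a plain byte count. A minimal sketch, assuming that this version reads -m as a number of bytes; to_bytes is a hypothetical helper, and the command is only printed here, not run:

```shell
# Hypothetical helper: convert a size such as 4G into a plain byte count.
to_bytes() {
  case "$1" in
    *[Kk]) echo $(( ${1%?} * 1024 )) ;;
    *[Mm]) echo $(( ${1%?} * 1024 * 1024 )) ;;
    *[Gg]) echo $(( ${1%?} * 1024 * 1024 * 1024 )) ;;
    *)     echo "$1" ;;   # no suffix: assume the value is already in bytes
  esac
}

mem_bytes=$(to_bytes 4G)
# Print the command instead of running it, so it can be checked first.
echo "samtools sort -m $mem_bytes in.bam out"
# prints: samtools sort -m 4294967296 in.bam out
```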
----remove files
find . -type f -print -delete
Those files look like:
> ./accepted_hits.uniq_extend.867800.bam
> ./accepted_hits.uniq_extend.3103140.bam
> ./accepted_hits.uniq_extend.4923639.bam
> ./accepted_hits.uniq_extend.2051549.bam
> ./accepted_hits.uniq_extend.2558274.bam
> ./accepted_hits.uniq_extend.4479815.bam
> ./accepted_hits.uniq_extend.4228120.bam
> ./accepted_hits.uniq_extend.1447749.bam
> ./accepted_hits.uniq_extend.3727934.bam
> ./accepted_hits.uniq_extend.1875384.bam
> ./accepted_hits.uniq_extend.1119027.bam
> ./accepted_hits.uniq_extend.1760717.bam
> ./accepted_hits.uniq_extend.2232630.bam
> ./accepted_hits.uniq_extend.3866979.bam
> ./accepted_hits.uniq_extend.3403807.bam
> ./accepted_hits.uniq_extend.3445816.bam
> ./accepted_hits.uniq_extend.3545020.bam
> ./accepted_hits.uniq_extend.4257818.bam
> ./accepted_hits.uniq_extend.494616.bam
> ./accepted_hits.uniq_extend.1170807.bam
> ./accepted_hits.uniq_extend.4204209.bam
> ./accepted_hits.uniq_extend.2165281.bam
> ./accepted_hits.uniq_extend.1431678.bam
> ./accepted_hits.uniq_extend.2309349.bam
> ./accepted_hits.uniq_extend.2084188.bam
> ./accepted_hits.uniq_extend.2189091.bam
> ./accepted_hits.uniq_extend.4600442.bam
> ./accepted_hits.uniq_extend.783378.bam
> ./accepted_hits.uniq_extend.4129533.bam
> ./accepted_hits.uniq_extend.1766329.bam
> ./accepted_hits.uniq_extend.167663.bam
> ./accepted_hits.uniq_extend.797810.bam
> ./accepted_hits.uniq_extend.4724017.bam
> ./accepted_hits.uniq_extend.289114.bam
> ./accepted_hits.uniq_extend.76259.bam
> ./accepted_hits.uniq_extend.1144843.bam
> ./accepted_hits.uniq_extend.3280847.bam
> ./accepted_hits.uniq_extend.520985.bam
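With names like these, the deletion can be restricted to the numbered temporary files instead of every file under the directory. A minimal sketch, assuming a find that supports -maxdepth and -delete (GNU and BSD find do); clean_sort_tmp is a hypothetical helper, and the name pattern is inferred from the listing above:

```shell
# Hypothetical helper: list (and optionally delete) only the numbered
# temporary files in one directory, without descending into sub-directories.
clean_sort_tmp() {
  dir=$1
  action=${2:-list}   # "list" (default, dry run) or "delete"
  pattern='accepted_hits.uniq_extend.[0-9]*.bam'
  if [ "$action" = "delete" ]; then
    find "$dir" -maxdepth 1 -type f -name "$pattern" -print -delete
  else
    find "$dir" -maxdepth 1 -type f -name "$pattern" -print
  fi
}

# Dry run first; pass "delete" as the second argument once the list looks right.
clean_sort_tmp . list
```

The pattern only requires a digit after the prefix, so it is worth reading the dry-run list before deleting anything.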
The attributes of parent_directory:
ls -l parent_directory
total 352744
drwx------. 4 ct omics 360845312 May 21 08:29 tophat
The attributes of one tmp file:
ls -l ./accepted_hits.uniq_extend.66430.bam
-rw-r--r--. 1 ct omics 1142 May 10 03:35 ./accepted_hits.uniq_extend.66430.bam
Another thing: when I run ls in the directory, I get:
ls: reading directory ./: Too many levels of symbolic links
total 352744
I do not understand why there are so many levels of symbolic links.
Have you tried using rm -f on these files? What is the exact error message find gives you?
Also, find . -delete will delete everything in the current directory and sub-directories - are you sure that's what you want to do?
I have tried /bin/rm -f. No error message from find.
What do these temporary files look like? Can you give me some example file-names?
Edit: I just tried it with samtools sort -m 4G a_file.bam on my machine, cancelled the process and deleted all bam files using rm *.bam, no problem. Could it be that you have a hardware problem?
A list of files:
accepted_hits.520859.bam accepted_hits.3280847.bam accepted_hits.1144843.bam accepted_hits.76259.bam accepted_hits.28914.bam accepted_hits.4257818.bam
I can delete or generate other files in this directory or other directories in the same disk. I do not know if it is a hardware problem.
Another idea: Try deleting these files by their inode-numbers as described here
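Deleting by inode number is usually done with find's -inum test. A minimal sketch, assuming a find that supports -maxdepth and -delete; rm_by_inode is a hypothetical helper name:

```shell
# Hypothetical helper: look up a file's inode number and delete the
# directory entry with that inode, searching only the file's own directory.
rm_by_inode() {
  target=$1
  # ls -i prints "<inode> <name>"; keep only the inode number.
  inode=$(ls -i "$target" | awk '{print $1}')
  find "$(dirname "$target")" -maxdepth 1 -inum "$inode" -print -delete
}
```

It could be called as rm_by_inode ./accepted_hits.uniq_extend.66430.bam. Note that hard links share an inode, so every name with that inode in the same directory would be removed.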
Everything seems fine except that the files cannot be deleted. Thanks, Philipp.
Does the partition in which these files are written have an acl flag set? Take a look at /etc/fstab and see if you see the acl flag. Perhaps (for whatever reason) samtools is setting an ACL flag on the temporary files that prevents their deletion? Or do you have a sticky bit set on the parent directory, and samtools was built and is owned by another account on your system? In that case, samtools may be creating files that are owned by that other account, and you would not (by default) have permissions to delete those files.
Thanks Alex. I checked /etc/fstab, and no acl flag was found. No sticky bit is set on the parent directory (drwx------.). samtools was built by myself. I have the right to delete files. I can even see the difference before and after deletion. However, no matter how many times I tried, those tmp files are still there. Some of them may have had their names changed, and no files have a new modification time.
I would ask the sysadmin to check permissions and the filesystem. The error you mention about levels of symbolic links doesn't look good and might be the cause of this. It might be that these links contain some (self) loops, so the system is simply unable to figure out what to do and refuses to proceed.
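To check whether there are any symbolic links under the directory, something like the following could be used. A minimal sketch; list_symlinks is a hypothetical helper name:

```shell
# Hypothetical helper: print every symbolic link found under a directory.
list_symlinks() {
  find "$1" -type l -print
}

# No output means no symbolic links were found below the current directory.
list_symlinks .
```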
Thanks, seninp. I searched before and found the (self) loops thing. However, I cannot find any symbolic links under this directory.
Are you working on a local filesystem or a network one, and what is its type? I assume you are the ct user of the omics group, right? What if you run "lsattr" on the folder?
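The filesystem type of a directory can be checked from the shell. A minimal sketch, assuming GNU coreutils (stat -f -c %T, with df -T as a fallback); fs_type is a hypothetical helper name:

```shell
# Hypothetical helper: print the filesystem type that holds a given path.
fs_type() {
  # GNU stat prints the type name; fall back to the second column of df -T.
  t=$(stat -f -c %T "$1" 2>/dev/null) ||
    t=$(df -T "$1" 2>/dev/null | awk 'NR==2 {print $2}')
  echo "${t:-unknown}"
}

# Prints the type for the current directory.
fs_type .
```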
I am working on a server which has disks shared over a network. The type of disk is nfs. When I run lsattr it returns "Inappropriate ioctl for device while reading flags on [various dir names]". I am the ct user of the omics group.
My bad. You would need to run "lsattr -d" in the folder and "lsattr filename" to see the extended attributes of the file you want to delete. For example, here they discuss the a attribute. Well, maybe you need to be root to manipulate these. I am not too familiar with all this overall, so I'd call for help at this point.
Still the same errors. I searched and found that one cannot check or set attributes on an NFS-mounted filesystem. I do not know if this is the reason.
It might be. I have no idea what to do further with nfs; hope you will solve the issue. Good luck!
Thank you for your help, seninp.