Filtering a 10X generated .bam file based on a list of barcodes
13 months ago
jhnicolas • 0

Hello everyone,

I have clustered and annotated the barcodes in R, and I now want to look at the reads from several particular clusters in IGV. I generated a barcode list following the 10X tutorial, as shown in the picture below: briefly, I subset the clusters in R, prefixed each barcode with CB:Z:, and saved the list as filter.txt. I can generate a 2 Kb header file, but the subsequent step always gives me a 0 Kb filtered_SAM_body file.

I tried:

(following another thread)

samtools view -D CB:FILTER.txt BAMFILE.bam -h -b -o FILTEREDBAM.bam

===

(following a third thread... [lost the page])

samtools view BAMFILE.bam | LC_ALL=C grep -F -f FILTER.txt > filtered_SAM_body

===

(modified 10X)

samtools view -H BAMFILE.bam > SAM_header

samtools view BAMFILE.bam -X INDEX.bai | LC_ALL=C grep -F -f FILTER.txt > filtered_SAM_body

===

(10X)

samtools view -H BAMFILE.bam > SAM_header

samtools view BAMFILE.bam  | LC_ALL=C grep -F -f FILTER.txt > filtered_SAM_body

===

I also tried adding CB: before the filter path and removing CB:Z from the filter.txt file. Neither worked; both gave me the 0 Kb file.

Any suggestions will be very much appreciated.

Thank you.

List of barcodes generated following 10X tutorial

snRNA-seq scRNA-seq samtools BAM RNA-seq • 1.3k views

Just double checking, but for the first approach, did you ensure your samtools is <=1.17?

Yes. It took me some time... I deleted 1.19 and, I suppose, unzipped 1.17 (in which I couldn't find sourceme.bash). Eventually I figured out, by typing just "samtools", that I had originally been using version 1.10. I then switched to the samtools built into CellRanger, which is 1.12.

The results were the same: blank filtered BAM files.
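
For the -D approach, one thing worth checking is the format of the list itself. As far as I understand the samtools documentation, -D TAG:FILE expects one bare tag value per line (so the barcode alone, without the CB:Z: prefix), and the option is only available from samtools 1.12 onwards. Below is a minimal sketch of preparing such a list. The barcodes are made up and the temporary directory stands in for the real files, so treat the names as placeholders rather than as a confirmed fix.

```shell
#!/usr/bin/env bash
set -euo pipefail

tmp=$(mktemp -d)

# Hypothetical filter file in the grep-style format (CB:Z: prefix on every line).
printf 'CB:Z:AAACCCAAGAAACACT-1\nCB:Z:AAACGGGTCCTAGGCA-1\n' > "$tmp/FILTER.txt"

# Strip the prefix and any Windows carriage returns to get bare barcode values.
sed 's/^CB:Z://' "$tmp/FILTER.txt" | tr -d '\r' > "$tmp/BARCODES.txt"

cat "$tmp/BARCODES.txt"

# Not run here because it needs a real BAM and samtools >= 1.12 (assumption):
#   samtools view -h -b -D CB:"$tmp/BARCODES.txt" -o FILTEREDBAM.bam BAMFILE.bam
```

The samtools call is left as a comment on purpose. Placing the options before the input file avoids relying on the option parser reordering arguments.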

This code is weird:

samtools view 20230514_CellRanger_Output/Backup_before_filtering/20230511_Mut_7/outs/possorted_genome_bam.bam | LC_ALL=C grep -F -f SubsetBam/Filters/mut_7_L5ET_filter.txt > filtered_SAM_body

It should just be:

samtools view INPUT.BAM | grep -F -f headers.txt > filtered.sam

Also we don't need to see the file names. You should use placeholders like I have above. It makes it easier to debug the code.
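
If the simplified grep command still yields an empty file, one cause worth ruling out is the filter file itself. This is an assumption, since nothing in this thread confirms it: a list written from R on Windows can carry CRLF line endings, and with grep -F every pattern then ends in a carriage return that never occurs in the SAM text, so nothing matches. A small self-contained sketch with a made-up SAM record and barcode shows the effect:

```shell
#!/usr/bin/env bash
set -euo pipefail

tmp=$(mktemp -d)

# Hypothetical single SAM record carrying a made-up cell barcode tag.
printf 'read1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tFFFF\tCB:Z:AAACCCAAGAAACACT-1\n' > "$tmp/body.sam"

# The same barcode saved with Windows (CRLF) line endings.
printf 'CB:Z:AAACCCAAGAAACACT-1\r\n' > "$tmp/filter_crlf.txt"

# grep -c exits non-zero when it finds nothing, hence the "|| true".
crlf_hits=$(grep -c -F -f "$tmp/filter_crlf.txt" "$tmp/body.sam" || true)

# Remove carriage returns and try again.
tr -d '\r' < "$tmp/filter_crlf.txt" > "$tmp/filter_lf.txt"
lf_hits=$(grep -c -F -f "$tmp/filter_lf.txt" "$tmp/body.sam" || true)

echo "matches with CRLF filter: $crlf_hits"
echo "matches with LF filter:   $lf_hits"
```

On a real filter file, `file FILTER.txt` or `head -n 2 FILTER.txt | od -c` should show whether carriage returns are present. It is also worth comparing one barcode from the list against the CB tag of an actual read, since R workflows sometimes add sample prefixes or change the -1 suffix.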

Are you sure the LC_ALL=C is not required, or does that change defeat the point of adding that modifier in the first place?

Unless the tools are outputting some weird character formats, the answer is no, don't set LC_ALL. See here for more details: https://unix.stackexchange.com/questions/87745/what-does-lc-all-c-do I think OP copy-pasted some code from somewhere without really understanding why.

I think that might be relevant here (users on the forum don't go out of their way to add that setting unless something weird happened without that setting). But you could be right - maybe the OP on the original thread with this setting had some special requirement. I'll see if I can find that thread.

EDIT: I could not find that thread. I'd recommend trying it both with and without the LC_ALL=C (maybe a limited grep -m100) to see if anything changes.
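
A rough sketch of that comparison follows. It runs on a small made-up SAM body so that it works without samtools; the barcodes and file names are placeholders. For plain ASCII input like this I would expect LC_ALL=C to affect speed rather than which lines match, but that is my assumption, and the check is cheap.

```shell
#!/usr/bin/env bash
set -euo pipefail

tmp=$(mktemp -d)

# Stand-in for a sample of the real data, which could be produced with:
#   samtools view BAMFILE.bam | head -n 100000 > "$tmp/body.sam"
printf 'r1\t0\tchr1\t1\t255\t4M\t*\t0\t0\tACGT\tFFFF\tCB:Z:AAACCCAAGAAACACT-1\n' > "$tmp/body.sam"
printf 'r2\t0\tchr1\t9\t255\t4M\t*\t0\t0\tACGT\tFFFF\tCB:Z:TTTGTTGGTAGCTGTT-1\n' >> "$tmp/body.sam"

# Hypothetical filter list containing only the first barcode.
printf 'CB:Z:AAACCCAAGAAACACT-1\n' > "$tmp/FILTER.txt"

# Count matches (capped at 100) with and without the C locale.
with_c=$(LC_ALL=C grep -c -m100 -F -f "$tmp/FILTER.txt" "$tmp/body.sam" || true)
without_c=$(grep -c -m100 -F -f "$tmp/FILTER.txt" "$tmp/body.sam" || true)

echo "with LC_ALL=C:    $with_c"
echo "without LC_ALL=C: $without_c"

if [ "$with_c" = "$without_c" ]; then
  echo "locale setting makes no difference on this sample"
else
  echo "locale setting changes the result; inspect the input encoding"
fi
```

If both counts are zero on the real data, the locale is unlikely to be the problem and the filter file contents are the next thing to inspect.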

Changed that, thank you very much and sorry for the inconvenience.
