Hello!
I am new to processing snRNAseq data, and am going through the 10X cellranger pipeline.
I could not find any bam files to download from the SRR numbers, so I chose to process raw fastq files.
This is how I downloaded the files, using one sample as an example:
- prefetch SRA files
prefetch SRR9141212
- Fastq-dump zipped files
fastq-dump --split-3 --skip-technical --readids --read-filter pass --gzip /project/nilslind_369/cfausto/wilson-diabetes/input/pretrim/SRR9141212.sra
- Rename the two fastq.gz files to match cellranger count nomenclature (moved into one directory)
mv SRR9141212_pass_1.fastq.gz ctrl1_S1_L001_R1_001.fastq.gz
mv SRR9141212_pass_2.fastq.gz ctrl1_S1_L001_R2_001.fastq.gz
- Script for cellranger count
cellranger count --id=ctrl1 \
  --transcriptome=/project/nilslind_369/cfausto/Ref_Genome/hg38/Homo-sapien_genome \
  --fastqs=/project/nilslind_369/cfausto/wilson-diabetes/input/ctrl1
Here is the error message when the job fails early on:
FASTQ header mismatch detected at line 4 of input files "/project/nilslind_369/cfausto/wilson-diabetes/input/ctrl1/ctrl1_S1_L001_R1_001.fastq.gz" and "/project/nilslind_369/cfausto/wilson-diabetes/input/ctrl1/ctrl1_S1_L001_R2_001.fastq.gz": file: "/project/nilslind_369/cfausto/wilson-diabetes/input/ctrl1/ctrl1_S1_L001_R1_001.fastq.gz", line: 4
When I head the fastq file, it looks like it's in the wrong format:
[several lines of unreadable binary characters omitted]
I'm not sure why it looks like this, maybe because it's a zipped file? Any different way to get the fastq files would be greatly appreciated!
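For reference, running head directly on a .gz file prints the compressed bytes, which is why the output looks scrambled; the file has to be decompressed to stdout first. A minimal sketch of that, where `peek_fastq` is a hypothetical helper name and the example path is the renamed file from the post:

```shell
#!/usr/bin/env bash
# peek_fastq FILE [N_RECORDS]
# Print the first N FASTQ records (4 lines each, default 2 records) of a
# gzip-compressed file. peek_fastq is a hypothetical helper name.
peek_fastq() {
    peek_file="$1"
    peek_records="${2:-2}"
    # gzip -cd decompresses to stdout and leaves the file on disk unchanged.
    gzip -cd -- "$peek_file" | head -n "$((peek_records * 4))"
}

# Example call (file name taken from the post above):
# peek_fastq ctrl1_S1_L001_R1_001.fastq.gz 2
```

Each record should show a header line starting with @, the sequence, a + line, and the quality string.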
If the files are out of sync, you can use repair.sh from the BBMap suite to bring them back in sync, removing any singleton reads.
What is the output of
and
The outputs look like normal fastq files:
R1
R2
Download fastq files from this link: https://trace.ncbi.nlm.nih.gov/Traces/index.html?view=run_browser&acc=SRR9141212&display=download
I'm wondering how this would be different from downloading with fastq-dump, because it's a 58GB file, which I wouldn't necessarily want on my personal computer.
Dumping 10x data via sratools has caused issues in the past. If you are sure the file you have is OK, then carry on.
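One way to check whether a dumped pair is OK before rerunning cellranger is to compare the read names in R1 and R2 record by record. The sketch below does that; `read_names` and `check_pair` are hypothetical helper names, and it compares only the first whitespace-delimited field of each header, which is an assumption about what counts as the read name. Note that fastq-dump's --readids option appends the read number to each name, so mates can end in .1 and .2; whether that is what triggers the mismatch here is a guess, not something confirmed in this thread.

```shell
#!/usr/bin/env bash
# read_names FILE: print one read name per FASTQ record, i.e. the first
# whitespace-delimited field of lines 1, 5, 9, ... without the leading "@".
read_names() {
    gzip -cd -- "$1" | awk 'NR % 4 == 1 { print substr($1, 2) }'
}

# check_pair R1 R2: return 0 and print the read count when both files list
# the same names in the same order; otherwise print the first differing
# record and return 1. A missing mate (unequal file lengths) also counts
# as a mismatch. Both names are hypothetical helpers, not part of any tool.
check_pair() {
    names1=$(mktemp) && names2=$(mktemp) || return 2
    read_names "$1" > "$names1"
    read_names "$2" > "$names2"
    paste "$names1" "$names2" | awk -F '\t' '
        $1 != $2 { printf "mismatch at record %d: %s vs %s\n", NR, $1, $2; bad = 1; exit }
        END { if (bad) exit 1; printf "in sync: %d reads\n", NR }'
    status=$?
    rm -f "$names1" "$names2"
    return "$status"
}

# Example call (file names taken from the post above):
# check_pair ctrl1_S1_L001_R1_001.fastq.gz ctrl1_S1_L001_R2_001.fastq.gz
```

On a file of this size the check reads both files in full, so it can take a while. If it reports reads that are missing or out of order, that is the situation repair.sh is meant for; if every name differs only by a trailing read number, re-dumping without --readids may be worth trying.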