Entering edit mode
2.2 years ago
Chris
▴
340
Hi all,
I run the code below to assemble gene expression from aligned reads:
samtools view MT1/Tophat_Out/accepted_hits.sorted.bam | python -m
HTSeq.scripts.count -q -s no - ~/Indexes/Mus_musculus/UCSC/mm10/Genes/genes.gtf > MT1/MT1.count.txt
Then I got this error message:
Error occurred when reading beginning of SAM/BAM file. file has no sequences defined (mode='r') - is it SAM/BAM format? Consider opening with check_sq=False [Exception type: ValueError, raised in libcalignmentfile.pyx:1000]
samtools view -h MT1/Tophat_Out/accepted_hits.sorted.bam
@HD VN:1.0 SO:coordinate
@SQ SN:1 LN:195471971
@SQ SN:10 LN:130694993
@SQ SN:11 LN:122082543
@SQ SN:12 LN:120129022
@SQ SN:13 LN:120421639
@SQ SN:14 LN:124902244
@SQ SN:15 LN:104043685
@SQ SN:16 LN:98207768
@SQ SN:17 LN:94987271
@SQ SN:18 LN:90702639
@SQ SN:19 LN:61431566
@SQ SN:2 LN:182113224
@SQ SN:3 LN:160039680
@SQ SN:4 LN:156508116
@SQ SN:5 LN:151834684
@SQ SN:6 LN:149736546
@SQ SN:7 LN:145441459
@SQ SN:8 LN:129401213
@SQ SN:9 LN:124595110
@SQ SN:MT LN:16299
@SQ SN:X LN:171031299
@SQ SN:Y LN:91744698
@PG ID:TopHat VN:2.1.1 CL:/gpfs2/global/tophat-2.1.1.Linux_x86_64/tophat -r 200 -p 8 -o /gpfs2/home/user/protocol_1/data/MT1/Tophat_Out -G /gpfs2/home/user/ncbi-genomes-2022-09-07/Mus_musculus/Ensembl/GRCm38/Annotation/Genes/genes.gtf /gpfs2/home/user/ncbi-genomes-2022-09-07/Mus_musculus/Ensembl/GRCm38/Sequence/Bowtie2Index/genome /gpfs2/home/user/protocol_1/data/Fastq/Trim_MT1_1.fastq /gpfs2/home/user/protocol_1/data/Fastq/Trim_MT1_2.fastq
@PG ID:samtools PN:samtools PP:TopHat VN:1.12 CL:samtools view -h /gpfs2/home/user/protocol_1/data/MT1/Tophat_Out/accepted_hits.sorted.bam
SRR213132.1920910.2 385 1 3155596 0 80M = 890971485941625 AATTCAACTAGAAAGTTACCTACCAAGTACTAATTAGAATTATAAAGTCAGAGTCTGCAGCTCCAGGCCTTTCAGTTAGT A>CE+<4<B<<?EHHB?:?AA9<AFGC:ED@GDF*:**:B<FCII>B??DGE@/?BFGBBDADF;@EDDACECC=A?;;7 AS:i:-2 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:37C42 YT:Z:UU NH:i:20 CC:Z:= CP:i:7949540 HI:i:0
SRR213132.34347266.1 113 1 3593111 50 80M 19 3806813GAACTCCCTAGTTAGAACCATGACTTGGGTGTGAAAGCCCTTGCCCTAGGAGTGAAAAGTCCTCACACCTGGCTTCTAAGC?5?;?>CDEFDFCCDHHHHHHIIIGHGHGFEGIIGBFIIIJIHFD<JIIHGIJJJJJGGHCHEFIHIIIJJJJJJJHHHAS:i:-5 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:60T19 YT:Z:UU XS:A:- NH:i:1
...
Would you please what I need to do to fix this? Thanks a lot!
Perfect. Thank you so much! I use the code the author used but for some reasons, we need to add -h to the samtools command to make it work.
Protocol 1, step 4.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6373869/