I want to load a BAM file and the human_refseq.idx.fa file into IGV. My questions are:
(1) Should I be loading the indexed file, testes.index?
(2) Should the human_refseq.idx.fa file be loaded as the genome?
I will detail the steps that I used.
First, I obtained the primary assembly and created the RSEM reference:
rsem-refseq-extract-primary-assembly GCF_000001405.31_GRCh38.p5_genomic.fna GCF_000001405.31_GRCh38.p5_genomic.primary_assembly.fna
rsem-prepare-reference --gff3 GCF_000001405.31_GRCh38.p5_genomic.gff --bowtie2 --bowtie2-path ./bowtie2-2.4.5-py39hd2f7db1_2 --trusted-sources BestRefSeq GCF_000001405.31_GRCh38.p5_genomic.primary_assembly.fna human_refseq
I then aligned the reads and calculated expression:
rsem-calculate-expression -p 2 --output-genome-bam --bowtie2 --bowtie2-sensitivity-level "very_fast" --append-names --paired-end s_2_1_sequence.txt s_2_2_sequence.txt human_refseq testes
Next, I sorted and indexed the BAM file, and indexed the FASTA file:
samtools sort -l 9 -o testes.sorted.bam -@ 4 testes.transcript.bam
samtools index -b -@ 4 testes.sorted.bam
samtools faidx human_refseq.idx.fa
Next, I loaded the BAM file and the human_refseq.idx.fa file into IGV:
(1) Open IGV.
(2) Click "File"..."Load from file"...and then load the sorted, indexed file testes.sorted.bam.
(3) Click "Genomes"..."Load Genome from file"...and load the "human_refseq.idx.fa" file.
Finally, I want to include the top over-expressed gene in thyroid tissue (i.e., DDX11L1), which was identified using R, in the output. I entered the gene name (DDX11L1) in the IGV search box, but it displayed a "Cannot find feature or locus: DDX11L1" error message.
If using one of the IGV-provided human reference genomes is an option for you, go ahead with hg38 or hg19; you will find the gene of interest there. Open IGV, select the genome you would like to work with from the drop-down in the top left corner, and then load the BAM files.
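Whichever genome you pick, IGV can only place reads whose reference names in the BAM also exist in the loaded genome. Note that testes.transcript.bam is in transcript coordinates, so its reference names may not match the chromosome names of hg38/hg19. The following is a minimal sketch for checking this yourself, not a definitive procedure: the helper name `missing_refs` and the file `bam_refs.txt` are hypothetical, and it assumes a .fai index for the genome FASTA is available (such as the one written by the `samtools faidx` step in the question).

```shell
# Sketch: list reference names used by a BAM that are absent from a FASTA index.
# The comparison works on plain text files and needs only cut, grep and mktemp.

# missing_refs NAMES_FILE FAI_FILE   (hypothetical helper, not part of samtools)
#   NAMES_FILE: one reference name per line
#   FAI_FILE:   a .fai index; column 1 holds the sequence names
# Prints every name in NAMES_FILE that has no exact match in FAI_FILE.
missing_refs() {
    mr_names_file="$1"
    mr_fai_file="$2"
    mr_known=$(mktemp)
    # Column 1 of a .fai file is the sequence name.
    cut -f1 "$mr_fai_file" > "$mr_known"
    # Drop the "*" line (unplaced reads in idxstats output), then keep only
    # names that match no line of the known-name list exactly.
    grep -v '^\*$' "$mr_names_file" | grep -Fxv -f "$mr_known" || true
    rm -f "$mr_known"
}

# Example usage (assumes samtools is installed and the BAM is indexed):
#   samtools idxstats testes.sorted.bam | cut -f1 > bam_refs.txt
#   missing_refs bam_refs.txt human_refseq.idx.fa.fai
```

If this prints nothing, every reference name in the BAM is present in the FASTA index. If it prints many names, the BAM and the genome were most likely built from different sequence sets, and IGV will show no reads for those references.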