Hello,
I tried to do some bisulfite alignment with sample data from the internet. I chose Bismark for alignment and methylation calling. After I ran the process on several samples, the results look strange. It seems the alignment failed, because in the text file from Bismark the values for methylated count, unmethylated count, and coverage are only ever 1 or 0. I think this may be because I used the wrong human genome reference. I downloaded the reference from this site: http://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/
I downloaded hg38.fa.gz and used it for bowtie indexing, so I suspect I may have used the wrong file. Is there any advice on which file I should use?
Additional details:
My data: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE61150
I use trim_galore to trim the data and use FastQC to check the quality before and after the trimming process.
Bismark version 0.14 (just downloaded from the website)
Aligner: bowtie (not bowtie2, because bowtie is said to be good for shorter reads, and my reads are short, 46 bp)
Commands I use:
To build the genome index:
bismark_genome_preparation <folder> #(I already set every executable in the environment variable)
To call the methylation data:
bismark -n 1 -l 50 <reference> <file fastq>
bismark_methylation_extractor -s --comprehensive <file .sam>
Thank you very much for your help.
bharata1803,
bismark (the alignment command) should report the % of reads mapped. If the percentage is high, then your genome version is good (as Devon mentioned, there shouldn't be anything strange when using hg38). However, your problem might come from bismark_methylation_extractor. I had very similar strange methylation calls, and they were due to bugs in the bismark_methylation_extractor code. Make sure you're using the newest version.
Try hg19. Not sure if hg38 has been standardized enough for successful use across tools.
Could you post the version of bismark and exact command lines/script used? It would help us to find an answer for you.
Thank you for your answer, I have edited my question. Everything follows the user guide. I'm still new and this is my first time using it, so I haven't configured much.
hg38 will work fine with bismark. At the end of an alignment run, bismark will print to screen (and to a file, if I recall correctly) various alignment metrics, such as the number of reads seen and the number aligned. If you have low-coverage WGBS data, then having low coverage results after methylation extraction is expected.
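To check the alignment rate without scrolling through the terminal output, something like the following could pull the percentage out of the report text. This is only a sketch: it assumes the report contains a line of the form "Mapping efficiency:" followed by a percentage, which may differ between Bismark versions, and the function name and example text are made up for illustration.

```python
import re

def mapping_efficiency(report_text):
    """Return the mapping efficiency in percent, or None if no such line is found."""
    # Assumed line format (not verified for every Bismark version):
    #   Mapping efficiency:<tab>64.3%
    match = re.search(r"Mapping efficiency:\s*([\d.]+)\s*%", report_text)
    return float(match.group(1)) if match else None

# Hypothetical report excerpt, for illustration only
example = "Sequences analysed in total:\t1000\nMapping efficiency:\t64.3%\n"
print(mapping_efficiency(example))  # prints 64.3
```

A high value here would suggest that the reference and index are fine and that the 0/1 counts come from low sequencing depth rather than from failed alignment.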
Also, I would recommend bowtie2 over bowtie for humans, regardless of the read length. Human data will show more than a few indels, which bowtie won't handle well.
Also, post a couple lines of the file with the methylation extractor results that you're looking at. You should be looking at the "coverage" files, which will have 5 columns.
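To see whether "only 0 or 1" really describes the whole file, a quick depth histogram over the coverage file can help. The sketch below assumes a tab-separated file whose last two columns are the methylated and unmethylated read counts; the function name and sample lines are invented for illustration and should be checked against your own file.

```python
from collections import Counter

def depth_histogram(lines):
    """Count how many cytosine positions have each read depth (methylated + unmethylated)."""
    depths = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank lines and single-field header lines
        try:
            # Assumption: the last two columns are the methylated and unmethylated counts
            meth, unmeth = int(fields[-2]), int(fields[-1])
        except ValueError:
            continue  # skip lines whose last two fields are not integers
        depths[meth + unmeth] += 1
    return depths

# Made-up sample lines, for illustration only
sample = [
    "chr1\t10469\t10469\t100\t1\t0\n",
    "chr1\t10471\t10471\t0\t0\t1\n",
    "chr1\t10484\t10484\t50\t1\t1\n",
]
hist = depth_histogram(sample)
print(sorted(hist.items()))  # prints [(1, 2), (2, 1)]
```

On a real file you would pass the open file handle instead of `sample`. If nearly every position falls in the depth-1 bin, that is consistent with low-coverage WGBS data rather than a wrong reference.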
Thank you for your advice. I will try it.