Hi everyone,
I have run SparkBWA (https://github.com/citiususc/SparkBWA) for read mapping on Microsoft Azure (an HDInsight cluster). It works fine, but every run costs money because it uses cloud services.
I need to modify the SparkBWA code and test my changes, but doing that on Azure would be very expensive. Is there a quicker, cheaper way to test, given that a run on a large dataset takes a long time? Can I use small input files (FASTQ reads and a FASTA reference), even if they are not real data? I only need a small dataset.
I tried running SparkBWA on my local machine, but it hung and the screen went black, perhaps because of the large reference and the heavy workload.
So can I replace the two read files with small ones?
If the code works, I will try it with real data later.
Thanks for your help.
I am new to bioinformatics.
Can you give me a link to download viral or bacterial data? I need paired-end reads (read 1 and read 2, as FASTQ files) and the matching reference FASTA file.
Thank you for your help.
A well-known viral genome is that of the phage Phi X 174 (about 5.4 kb), which may be too small. A small bacterial genome is, for instance, that of Mycoplasma (M. genitalium is about 0.58 Mb). E. coli has a bigger genome (about 4.6 Mb for K-12).
In general, you can search NCBI's Nucleotide database or Ensembl Bacteria.
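
Since the question also asks whether non-real data would do: for a pure smoke test of code changes, a synthetic dataset may be enough. Below is a minimal Python sketch, not part of SparkBWA, that builds a random reference and error-free paired-end reads from it. All function names, file names, and default sizes are my own illustrative choices, and the reads carry a constant dummy quality, so this only checks that the pipeline runs, not mapping accuracy.

```python
import random
from pathlib import Path

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def random_genome(length, seed=0):
    """Return a random DNA string (no biological meaning, test use only)."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def simulate_pairs(genome, n_pairs, read_len=100, insert_size=300, seed=1):
    """Sample error-free read pairs from fragments of the genome.

    Read 1 is the forward start of the fragment; read 2 is the reverse
    complement of the fragment end, mimicking forward/reverse orientation.
    """
    if not read_len <= insert_size <= len(genome):
        raise ValueError("need read_len <= insert_size <= genome length")
    rng = random.Random(seed)
    pairs = []
    for i in range(n_pairs):
        start = rng.randrange(len(genome) - insert_size + 1)
        fragment = genome[start:start + insert_size]
        read1 = fragment[:read_len]
        read2 = reverse_complement(fragment[-read_len:])
        pairs.append((f"pair{i}", read1, read2))
    return pairs

def fasta_text(name, seq, width=70):
    """Format one sequence as FASTA with fixed-width lines."""
    lines = [f">{name}"]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines) + "\n"

def fastq_text(records, mate):
    """Format (name, sequence) records as FASTQ with a dummy quality 'I'."""
    return "".join(
        f"@{name}/{mate}\n{seq}\n+\n{'I' * len(seq)}\n" for name, seq in records
    )

def write_dataset(out_dir, genome_len=50_000, n_pairs=1_000):
    """Write reference.fasta, read1.fastq and read2.fastq into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    genome = random_genome(genome_len)
    pairs = simulate_pairs(genome, n_pairs)
    files = {
        "reference.fasta": fasta_text("synthetic_ref", genome),
        "read1.fastq": fastq_text([(n, r1) for n, r1, _ in pairs], 1),
        "read2.fastq": fastq_text([(n, r2) for n, _, r2 in pairs], 2),
    }
    paths = []
    for filename, text in files.items():
        path = out / filename
        path.write_text(text)
        paths.append(path)
    return paths
```

Calling `write_dataset("tiny_dataset")` would write the three files into a `tiny_dataset` directory. The reference would still need to be indexed with `bwa index` before mapping, as with real data. Because the reads contain no sequencing errors, every pair should map back to the reference, which makes unexpected behaviour after a code change easier to spot.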