Hello everyone
I am new to bioinformatics and would like to ask for your advice, help, or comments on an RNA-seq analysis for a class. Question: I have 4 FASTA sequencing files from 4 different cell lines. They have been prefiltered to contain only the sequencing reads from human chr21. They are VCap.fasta, LnCap.fasta, EP.fasta, and PrEC.fasta. NOTHING ELSE. In addition, I have received the reference sequence, the full sequence of chr21 in FASTA format (chr21.fa). Note that all reference sequences and annotations here are from the hg19 reference genome.
Use suitable software to align all four RNA-sequenced samples.
Normally, I use hisat2 or Kallisto for RNA-seq. I usually have R1.fastq and R2.fastq for each sample, so I can run the commands to go from FASTQ to SAM to BAM, aligning the reads against the reference. However, here I have only one FASTA file per sample, and it is already prefiltered to contain only the sequencing reads from chr21. Could anyone help?
I think you can use hisat2 with FASTA files as well as FASTQ files. The first thing I would do is check whether you have paired-end or single-end reads in those FASTA files. If paired, I would separate them into 2 FASTA files, one with R1 and one with R2, and run hisat2 giving chr21.fa as the reference. If single-end, you can run hisat2 in SE mode (I don't remember the exact argument for this, but there is an option).
Thank you for your suggestion. I have checked it, and I can use it for single-end alignment with the -U option of hisat2.
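For anyone landing here later, the single-end route discussed above can be sketched roughly as follows. This is only a sketch under assumptions: hisat2, hisat2-build and samtools are installed, the index prefix chr21_index and the output file names are made up for illustration, and the DRY_RUN switch is a hypothetical convenience of this script (not a hisat2 feature) so that the commands are only printed unless you set DRY_RUN=0. The -f flag tells hisat2 that the reads are FASTA rather than FASTQ, and -U passes unpaired reads.

```shell
#!/bin/sh
# Sketch only: index prefix, output names and the dry-run switch are illustrative.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Align one sample's single-end FASTA reads against the chr21 index.
align_sample() {
    sample="$1"
    # -f: reads are FASTA; -x: index prefix; -U: unpaired reads; -S: SAM output
    run hisat2 -f -x chr21_index -U "$sample.fasta" -S "$sample.sam"
    # Sort the alignments into a BAM file, then index it.
    run samtools sort -o "$sample.bam" "$sample.sam"
    run samtools index "$sample.bam"
}

# Build the HISAT2 index from the reference once (hisat2 aligns to an index,
# not directly to chr21.fa).
run hisat2-build chr21.fa chr21_index

for s in VCap LnCap EP PrEC; do
    align_sample "$s"
done
```

If the reads turn out to be paired, the same idea applies with -1 and -2 in place of -U, one FASTA file per mate.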
You can do it the cheap way and convert them to FASTQ by giving every read a dummy quality line.
If you have
You can make it as
Either you awk something together or use https://github.com/lh3/seqtk
Thank you so much for the suggestion!
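In case it helps, the "awk something together" option for the FASTA-to-FASTQ conversion could look something like the sketch below. The function name fasta_to_fastq is made up here, and the choice of "I" (Phred 40 in the usual Phred+33 encoding) as the dummy quality character is an assumption; any valid quality character would do. Sequences wrapped over several lines are joined into one line per read.

```shell
#!/bin/sh
# Sketch: turn FASTA records into FASTQ records with a dummy quality string.
set -eu

# Reads FASTA from the given file (or stdin if no file is given), writes FASTQ to stdout.
fasta_to_fastq() {
    awk '
        # Emit the record collected so far, if there is one.
        function flush_record(    qual) {
            if (name != "") {
                qual = seq
                gsub(/./, "I", qual)   # one dummy quality character per base
                printf "@%s\n%s\n+\n%s\n", name, seq, qual
            }
        }
        /^>/ { flush_record(); name = substr($0, 2); seq = ""; next }
             { seq = seq $0 }          # join wrapped sequence lines
        END  { flush_record() }
    ' "$@"
}

# Tiny demonstration on an inline record with a wrapped sequence.
printf '>read1\nACGT\nAC\n' | fasta_to_fastq
# prints:
# @read1
# ACGTAC
# +
# IIIIII
```

On the real data this would be called as fasta_to_fastq VCap.fasta > VCap.fastq. As far as I remember, seqtk can do the same with its seq subcommand and the -F option for fake qualities, but check its help output before relying on that.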