Chromosomal arms with SNPs information using R
1
0
Entering edit mode
9.3 years ago
maruthi ▴ 10

Hi,

I have list of SNP for the respective arm of each chromosome. Now, I need to come up with a picture for each chromosome with the SNP coverage across the arms. May I know how I can do it using R ? Thank you and I will look forward for your kind knowledge share.

Thank you,

Maruthi.

R • 2.9k views
ADD COMMENT
1
Entering edit mode
9.3 years ago

This older thread has some code and ideas for plotting SNP density, all you have to do is to split up the chromosomes in the example code into chromosome arms:

Plotting Density Of Snps On Chromosomes

ADD COMMENT
0
Entering edit mode

Dear Philipp,

Thank you very much for your kind reply. I have made list of Chromosomal arms (both p & q separate files) with respective SNPs (number of SNPs for every 10 Mb) as text files. How can I use them to make the SNP density plot? Each file has something like

Chr     Start            End     Number
X    140400001 150600000 136

I have like that for every arm of a chromosome in separate text files. May I know how I can use either Philippe's or Irsan's suggestion!

Thank you

ADD REPLY
1
Entering edit mode

So you already have the binned values and can use most of Irsan's code, you could do something like

snps <- read.table("X_chrom.tab", sep="\t", header=T)
plot(snps$Start, snps$Number)

Gives you a really bare bones histogram since your bins are already present.

ADD REPLY
0
Entering edit mode

Thank you Philipp. I will give a go with your suggestion. This will be my first time using R. It feels so different as I am used to SPSS. Thank you for being helpful and for sharing your knwoledge.

ADD REPLY
0
Entering edit mode

Dear Philipp,

Do you know anyone where I can learn Programming needed for NGS stuff ! Do you provide any online classes ! I am ready to pay and the online classes if one can teach me more related with NGS than general programming. May be I am posting this at a wrong place, however I thought you might help me. If I want to send/post a message to you how can I send it !

Will look forward for your kind suggestion.

Thank you,

Maruthi.

ADD REPLY
1
Entering edit mode

Hi Maruthi,

There's always the Software Carpentry courses which teach reproducible programming, either R or Python, version control with git and sometimes automation or databases. Disclaimer: I'm an instructor for that.

You can always go over the lessons by yourself if there's no upcoming course in your area: http://software-carpentry.org/lessons.html

I can recommend two books, a "beginner's" one is "Practical Computing For Biologists" by Haddock and Dunn, one for more "advanced" programmers is Buffalo's "Bioinformatics Data Skills". A few more suggestions are here: What Is One Book Every Bioinformatician Should Read?

Then there's the Coursera online courses for bioinformatics, if you want a piece of paper at the end (the "specializations") you have to pay. The "courses" are free.

If you're someone who prefers "learning by doing" try the problem tracks at Rosalind. There are several different tracks with simple to complicated problems for you to try out using any programming language.

ADD REPLY
0
Entering edit mode

Hi Philipp,

Thank you for your suggestion. Nice info about Rosalind and Carpentry link. Thank you.

Good day,

Maruthi.

ADD REPLY
0
Entering edit mode

Dear Philipp,

By using the below command (which you helped me with)

snps <- read.table("X_chrom.tab", sep="\t", header=T)
plot(countSNP_chr1q$start, countSNP_chr1q$number,pch=19, col="blue")

I could do a SNP density plot for Chr 1q arm. However, if I want to plot all the chromosomes how can I modify the command ! I am not considering p arms for 13, 14, 15, 16, 17, 21 and 22 (due to the satellite nature of the p arms of the respective chromosome).

I will look forward for your suggestion.

Thank you,
Maruthi.

ADD REPLY
1
Entering edit mode

Hi Maruthi,

I guess the simplest case would be to iterate over your files, read the file into memory, then to plot it, to save the plot and then to continue, kinda like

for (i in list.files(".", pattern="*.tab") ) {
  tmp.tab <- read.table(i, sep="\t", header=T)
  png(gsub(".tab", ".png", i))
  plot(tmp.tab$start, tmp.tab$number, pch=19, col="blue")
  dev.off()
}

This iterates over all files in the current directory ending in tab. If you want another directory change the .. Then it reads in each table, opens a PNG image file with the same name as the tab file but ending in .png, plots the picture to that PNG, closes the file (dev.off()) and continues to the next file. I haven't tested this.

(I'm not using lapply since that's a bit confusing to beginners)

ADD REPLY
0
Entering edit mode

Dear Philipp,

The below modified command (based on yours - but included different color codes for p & q arm and X & Y axis defined)

for ( I in list.files(".", pattern="*.txt") ) {
  tmp.tab <- read.table(i, sep="\t", header=T)
  png(gsub(".txt", ".png", i))
  file_name<-basename(i)
  xlabel<-gsub(".txt","",file_name)
  xlabel<-gsub("countSNP_c", "C",xlabel)
  if (grepl("p", file_name)) {
    plot(tmp.tab$start,tmp.tab$number,pch=19,col='blue',xlab=xlabel,ylab='SNP_number')
  } else {
    plot(tmp.tab$start,tmp.tab$number,pch=19,col='red',xlab=xlabel,ylab='SNP_number')
  }
  dev.off()
}

worked well.

Thank you for your help,
Maruthi.

ADD REPLY
0
Entering edit mode

Hi Philipp,

May I know how I can remove/correct large intervals while plotting the SNPs on a chromosome ? I am using the below command.

tmp.tab <- read.table("C:\\Users\\Maruthi\\Documents\\Analysis files\\1_22X.bed\\chr1.txt", sep="\t", header=F)
pos <- tmp.tab[,3]
mode(pos) <- "numeric"
num <- length(pos)
dis <- NULL
for(i in 1:(num-1)){
  dis <- c(dis,pos[i+1]-pos[i])
}
pdf(file="C:\\Users\\Maruthi\\Documents\\Analysis files\\1_22X.bed\\chr1.pdf")
plot(dis,type="p",pch=19,xlab="SNP distribution",ylab="Chr1",main="Chromosome Report",col="red",xlim=c(0,6000),ylim=c(0,5e+6) )
dev.off()

I will look forward for your kind suggestion.

Thank you,
Maruthi.

ADD REPLY

Login before adding your answer.

Traffic: 1920 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6