Colouring a string
3
0
6.1 years ago

Imagine I have a DNA sequence (e.g., a short dummy one, ATTATGCGGGGAATTT) and I would like to colour the nucleotides with different colours, given vectors indicating the positions of the nucleotides to be coloured:

(1,3,6,7,8,10) <- to be coloured in red

(2,4,5,12) <- to be coloured in green

How would you do that without colouring each position manually?

genome sequence • 2.2k views
ADD COMMENT
0

I had this idea once for schoolboys, and the only way I found was to go through HTML code:

<head>
<style TYPE="text/css"> 
    .A {
        color: red;
        font-family: monospace;
        font-size: 88px;
    }
    .C {
        color: green;
        font-family: monospace;
        font-size: 88px;
    }
    .G {
        color: orange;
        font-family: monospace;
        font-size: 88px;
    }
    .T { 
        color: blue;
        font-family: monospace;
        font-size: 88px;
    }
</style>

</head>
<span class="G">G</span><span class="C">C</span><span class="A">A</span><span class="T">T</span><span class="G">G</span><span class="C">C</span><span class="T">T</span><span class="A">A</span><span class="G">G</span><span class="C">C</span><span class="A">A</span><span class="G">G</span><span class="C">C</span><span class="T">T</span><span class="G">G</span><span class="T">T</span><span class="C">C</span><span class="A">A</span><span class="C">C</span><span class="G">G</span>

The colour is managed by the CSS class.

This is hard-coded, but you can write a function that takes a sequence and generates the appropriate <span class="X">X</span> for each base.
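
For what it's worth, such a function could look roughly like this. It is only a sketch in POSIX shell: the name seq_to_spans is made up for illustration, and it assumes the input contains only letters (nothing that needs HTML escaping) and that the CSS classes A, C, G and T from the stylesheet are in place.

```sh
# seq_to_spans SEQ: hypothetical helper that wraps each base of SEQ in a
# <span> whose class is the (upper-cased) base letter.
seq_to_spans() {
  s=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  while [ -n "$s" ]; do
    rest=${s#?}          # sequence without its first character
    base=${s%"$rest"}    # the first character
    printf '<span class="%s">%s</span>' "$base" "$base"
    s=$rest
  done
  printf '\n'
}

seq_to_spans "GCAT"
```

The printed line goes after the </head> block. Bases other than A, C, G or T get a class with no style rule, so they appear in the default colour.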

ADD REPLY
0

Do you want to use multiple different colour arrays?

ADD REPLY
5
6.1 years ago
Russ ▴ 520

Here's a solution using R:

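library(crayon)

# Split the sequence into single nucleotides, one per row
string <- "ATTATGCGGGGAATTT"
sp <- strsplit(string, split = "")[[1]]
df <- data.frame(nucleotide = sp, stringsAsFactors = FALSE)

# 1-based positions to colour
redVector <- c(1, 3, 6, 7, 8, 10)
greenVector <- c(2, 4, 5, 12)

# Wrap the selected nucleotides in ANSI colour codes
df$ntColored <- df$nucleotide
df[redVector, "ntColored"] <- red(df[redVector, "ntColored"])
df[greenVector, "ntColored"] <- green(df[greenVector, "ntColored"])

# sep = "" prints the sequence without spaces between the nucleotides
cat(df$ntColored, "\n", sep = "")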

edit: just for fun, it's also easy to colour by letter:

# Nested ifelse: anything that is not A, C or G is treated as T
df$byLetter <- ifelse(df$nucleotide == "A", blue("A"),
               ifelse(df$nucleotide == "C", red("C"),
               ifelse(df$nucleotide == "G", green("G"),
                      yellow("T"))))

cat(df$byLetter, "\n", sep = "")
ADD COMMENT
1

Thank you so much: I am actually doing the rest of the analysis in R, so this turns out to be the best solution for me!

ADD REPLY
0

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they all work.

ADD REPLY
0

And now the next step: how would you save the output to an (image) file? With the options I have tried, I can only export a black string, not the coloured one.

ADD REPLY
1

How do you intend to use the output? The answers here rely on so-called ‘escape’ sequences: invisible characters that colour-capable terminals interpret as formatting instructions.

These characters are not supported in every possible application though. I’m not personally aware of any image editors that support them natively.

The best solution I can think of is a screenshot?

You can view the escape characters by piping the output of the tools (maybe not the R one; I’m not 100% sure how that one works) to cat -v

E.g.

$ Colorise_script -arg ATGC | cat -v
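
To illustrate what cat -v reveals, here is a small self-contained sketch. The red helper is defined just for this example (it is not one of the tools from this thread); it wraps its argument in the ANSI red (ESC[31m) and reset (ESC[0m) sequences.

```sh
# red: wrap the argument in the ANSI "red foreground" and "reset" sequences.
red() { printf '\033[31m%s\033[0m' "$1"; }

# Print a red A and a plain T, then let cat -v make the escape bytes visible.
{ red "A"; printf 'T\n'; } | cat -v
```

With GNU or BSD cat this prints ^[[31mA^[[0mT, where ^[ stands for the ESC byte. Those extra bytes are all the "colour" consists of, so a program that does not interpret them has no colour information to show.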
ADD REPLY
0

Your original question is badly formulated. First you ask for colour, then you say you want R, and now you say you want to save the image...

ADD REPLY
0

Apologies for the bad formulation. I asked for colours in general and different solutions have been proposed. Among these, I have followed what looked the most suitable for me. What I am aiming to do is:

  1. create the coloured string, as reported in the original question
  2. save the string in a file (possibly an image)

Thanks

ADD REPLY
0

Yes, but what do you actually want to do with it downstream?

Is it just to put in presentations or something?

ADD REPLY
0

Yes, in the end I would like to obtain a figure to put in a paper/presentation.

ADD REPLY
0

From what I can find, there is no better option than screenshotting the output.

It is theoretically possible to pipe STDOUT from the terminal, as this post explains. The only option that supports colours, however, is enscript, which would mean you could only generate PostScript files. enscript's colouration escape sequences are also not the same as an xterm's, so an intermediate script to transliterate everything would be needed.

In short, it's f*cking difficult.

The alternative would be to start from scratch in a language which has some support for creating images, but this then becomes less about text manipulation and more of a rendering problem, and none of the solutions here are in that vein.
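
One possible way around the rendering problem, sketched under some assumptions: SVG is itself a text format, so a script can write a coloured image directly instead of capturing terminal output. The helper names has_pos and seq_to_svg below are hypothetical, positions are 1-based and comma-separated without spaces, and the canvas width is a rough guess for a monospace font rather than a measured value.

```sh
# has_pos POS LIST: succeeds if POS occurs in the comma-separated LIST.
has_pos() {
  case ",$2," in
    *",$1,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# seq_to_svg SEQ RED_POSITIONS GREEN_POSITIONS: writes an SVG image to stdout.
seq_to_svg() {
  s=$1; reds=$2; greens=$3; i=1
  # Rough canvas width: about 15 px per character plus a margin (an assumption).
  printf '<svg xmlns="http://www.w3.org/2000/svg" width="%d" height="40">\n' \
    "$(( ${#s} * 15 + 20 ))"
  printf '<text x="10" y="28" font-family="monospace" font-size="24">'
  while [ -n "$s" ]; do
    rest=${s#?}          # sequence without its first character
    base=${s%"$rest"}    # the first character
    if has_pos "$i" "$reds"; then colour=red
    elif has_pos "$i" "$greens"; then colour=green
    else colour=black
    fi
    printf '<tspan fill="%s">%s</tspan>' "$colour" "$base"
    s=$rest; i=$((i + 1))
  done
  printf '</text>\n</svg>\n'
}

seq_to_svg "ATTATGCGGGGAATTT" "1,3,6,7,8,10" "2,4,5,12"
```

Redirecting the output to a file (for example with > seq.svg) should give an image that browsers and vector editors can open and export. Support for SVG in presentation software varies.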

ADD REPLY
0

This is convoluted, but you can use the textGrob function from the grid package in R. A textGrob is a grid graphical object (grob) that just contains text. You'll need to figure out the colouring, but once you have created the textGrob, you can pass it to ggsave (from ggplot2) to get your image.

Good luck!

ADD REPLY
0

OP, you should have specified that "save to file" part at the outset. Colors depend on the renderer, not the file itself (of course, image and pdf files are the way to achieve portability). By leaving that out, people have spent their time helping you without the actual goal available to them. This kind of formulation frustrates people and makes them less inclined to help you out subsequently/follow up on questions you might have.

ADD REPLY
0

I apologise again for the bad formulation of my question. Initially, I thought that the main problem was to generate the string and not to save the output and, thus, I preferred not to bother people about the second issue. My bad.

ADD REPLY
0

Yeah, I agree with jrj.healey, I'd probably just take a screenshot...

ADD REPLY
0

Thank you to everyone for your help: in the end, I think I'll opt for the screenshot solution. I'll also try RamRS's suggestion about textGrob and, if I obtain some interesting results, I'll let you know.

Again, thank you !!!

ADD REPLY
4
6.1 years ago

in C using ANSI escape codes

ADD COMMENT
3
6.1 years ago
Joe 21k

A pure bash option (because I apparently have nothing better to do).

Note that this script will not be particularly forgiving for different specifications on the command line...

# Usage:
#  $  bash col_seq.sh <Sequence> <red> <green> <yellow> <blue>
#
# Indexes must be provided as a comma separated quoted string, e.g:
#  $  bash col_seq.sh ATGTACGATCG "1,2" "3,4" "5,6" "7,8"
#
# You can miss a colour out, but will need to specify empty quotes: ""
#  $  bash col_seq.sh ATGTACGATCG "1,2" "3,4" "" "7,8"

in_array() {
 ARRAY=$2
 for e in ${ARRAY[*]} ; do
  if [[ "$e" == "$1" ]] ; then
   return 0
  fi
  done
 return 1
}

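# Each helper prints its argument wrapped in an ANSI colour code and a reset.
# The text is passed via %s so it is never interpreted as a printf format.
red(){
printf '\e[31m%s\e[0m' "$1"
}
green(){
printf '\e[32m%s\e[0m' "$1"
}
yellow(){
printf '\e[33m%s\e[0m' "$1"
}
blue(){
printf '\e[34m%s\e[0m' "$1"
}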

string=$(echo "$1" | tr '[:lower:]' '[:upper:]')
# Argument order matches the usage above: red, green, yellow, blue
IFS=',' read -r -a Rarray <<< "$2"
IFS=',' read -r -a Garray <<< "$3"
IFS=',' read -r -a Yarray <<< "$4"
IFS=',' read -r -a Barray <<< "$5"


for i in $(seq 1 "${#string}") ; do
  if in_array "$i" "${Rarray[*]}" ; then
   red "${string:i-1:1}"
  elif in_array "$i" "${Garray[*]}" ; then
   green "${string:i-1:1}"
  elif in_array "$i" "${Yarray[*]}" ; then
   yellow "${string:i-1:1}"
  elif in_array "$i" "${Barray[*]}" ; then
   blue "${string:i-1:1}"
  else
   printf '%s' "${string:i-1:1}"
  fi
done
printf "\n"

Now you can do assorted bash magic:

[Screenshot: Screen_Shot_2018_10_10_at_10_48_43]

Edit:

I got carried away...
