I have a data set like the following:
import numpy as np
from pandas import DataFrame
mypos = np.random.randint(10, size=(100, 2))
mydata = DataFrame(mypos, columns=['x', 'y'])
myres = np.random.rand(100, 1)
mydata['res'] = myres
The res variable is continuous; the x and y variables are integers representing positions (so their values repeat a lot), and res represents a kind of correlation between each pair of positions.
I am wondering what the best ways of visualizing this data set are. Possible approaches already considered:
1. A scatter plot of x against y that also encodes res.
2. A parallel coordinates plot.
The first approach is problematic when the number of positions gets large, because high values of the res variable (which are the values we care about) would be drowned in a sea of small dots.
The second approach could be promising, but I am having trouble producing it. I have tried the parallel_coordinates function from the pandas module, but it's not behaving as I would like it to. (see this question here: parallel coordinates plot for continous data in pandas )
I hope this helps to find a solution in R. Good luck.
# you need this package for the colour palette
library(RColorBrewer)
# create the random data
dd <- data.frame(
  x = round(runif(100, 0, 10), 0),
  y = round(runif(100, 0, 10), 0),
  res = runif(100)
)
# pick the number of colours (granularity of colour scale)
nColors <- 100
# create the colour palette
cols <- colorRampPalette(colors = c("white", "blue"))(nColors)
# get a zScale for the colours
zScale <- seq(min(dd$res), max(dd$res), length.out = nColors)
# function that returns the nearest colour given a value of res
findNearestColour <- function(x) {
  # which.min returns a single index, even when two scale values are equally close
  colorIndex <- which.min(abs(zScale - x))
  return(cols[colorIndex])
}
# the first plot is the scatterplot
### this has problems because points come out on top of each other
plot(y ~ x, dd, type = "n")
for (i in seq_len(nrow(dd))) {
  with(dd[i, ],
    points(y ~ x, col = findNearestColour(res), pch = 19)
  )
}
# this is your parallel coordinates plot (a little better)
plot(1, 1, xlim = c(0, 1), ylim = c(min(dd$x, dd$y), max(dd$x, dd$y)),
     type = "n", axes = FALSE, ylab = "", xlab = "")
for (i in seq_len(nrow(dd))) {
  with(dd[i, ],
    segments(0, x, 1, y, col = findNearestColour(res))
  )
}
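Since the question itself uses pandas, here is a rough Python translation of the same two plots using matplotlib. Treat it as a sketch rather than a definitive implementation: the helper names (sort_by_res, plot_scatter, plot_parallel) and the "Blues" colormap are my own choices for illustration, and it builds the parallel coordinates plot by hand with a LineCollection instead of using pandas' parallel_coordinates. One thing it adds to the R version is sorting the rows by res before drawing, so that high values end up on top instead of being hidden under low ones.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection


def sort_by_res(df):
    """Order rows by ascending res so high values are drawn last (on top)."""
    return df.sort_values("res", kind="stable").reset_index(drop=True)


def plot_scatter(df, ax, cmap="Blues"):
    """Scatter plot of y against x, coloured by res (hypothetical helper)."""
    ordered = sort_by_res(df)
    return ax.scatter(ordered["x"], ordered["y"], c=ordered["res"], cmap=cmap)


def plot_parallel(df, ax, cmap="Blues"):
    """Two-axis parallel coordinates plot, coloured by res (hypothetical helper)."""
    ordered = sort_by_res(df)
    n = len(ordered)
    # One segment per row: from (0, x) on the left axis to (1, y) on the right.
    left = np.column_stack([np.zeros(n), ordered["x"].to_numpy()])
    right = np.column_stack([np.ones(n), ordered["y"].to_numpy()])
    segments = np.stack([left, right], axis=1)  # shape (n, 2, 2)

    lines = LineCollection(segments, cmap=cmap)
    lines.set_array(ordered["res"].to_numpy())  # colour each segment by res
    ax.add_collection(lines)

    # add_collection does not rescale the axes, so set the limits explicitly.
    lo = min(ordered["x"].min(), ordered["y"].min())
    hi = max(ordered["x"].max(), ordered["y"].max())
    ax.set_xlim(0, 1)
    ax.set_ylim(lo - 0.5, hi + 0.5)
    ax.set_xticks([0, 1])
    ax.set_xticklabels(["x", "y"])
    return lines


if __name__ == "__main__":
    # Same shape of data as in the question, with a fixed seed.
    rng = np.random.default_rng(0)
    mydata = pd.DataFrame(rng.integers(10, size=(100, 2)), columns=["x", "y"])
    mydata["res"] = rng.random(100)

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    fig.colorbar(plot_scatter(mydata, ax1), ax=ax1, label="res")
    fig.colorbar(plot_parallel(mydata, ax2), ax=ax2, label="res")
    plt.show()
```

Sorting only controls the drawing order. With many repeated positions the points and segments still overlap, so a low res value at the same position will be covered by a higher one.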