
Plot a Region in R

Tags: plot, r, ggplot2

I generated a matrix with 100 random x-y coordinates in the square [-1,1]^2:

n <- 100
datam <- matrix(c(rep(1,n), 2*runif(n)-1, 2*runif(n)-1), n) 
# leading 1 column needed for computation
# second column has x coordinates, third column has y coordinates

and classified them into two classes, -1 and 1, by a given target function f (a vector). I computed a hypothesis function g and now want to visualize how well it matches the target function f.

f <- c(1.0, 0.5320523, 0.6918301)   # the given target function
ylist <- sign(datam %*% f)    # classify into -1 and 1
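
To make the classification rule concrete, here is a small sketch that applies the same sign(... %*% f) step to three hand-picked points. The matrix pts is made up for illustration and is not part of the data set above.

```r
f <- c(1.0, 0.5320523, 0.6918301)   # target function from above

# hypothetical points, one per row: leading 1, then x, then y
pts <- rbind(c(1,  0,  0),
             c(1,  1,  1),
             c(1, -1, -1))

scores <- as.vector(pts %*% f)      # f[1] + f[2]*x + f[3]*y for each row
labels <- sign(scores)              # +1 on one side of the line, -1 on the other

print(scores)
print(labels)
```

For this particular f, only points close to the (-1,-1) corner end up with the label -1.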

# perceptron algorithm to find g:
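perceptron <- function(datam, ylist) {
  w <- c(1,0,0)             # starting vector
  made.mistake <- TRUE
  while (made.mistake) {    # repeat until a full pass makes no mistake
    made.mistake <- FALSE
    for (i in seq_len(nrow(datam))) {   # use the rows passed in, not the global n
      if (ylist[i] != sign(t(w) %*% datam[i,])) {
        w <- w + ylist[i]*datam[i,]     # move w toward the misclassified point
        made.mistake <- TRUE
      }
    }
  }
  return(w)
}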

g <- perceptron(datam, ylist)

I now want to compare f to g in a plot.

I can do this quite easily in Mathematica. Shown here is the data set with the target function f that separates the data into the +1 and -1 parts:

https://i.sstatic.net/PMRap.png

This Mathematica plot shows both f and g in comparison (different data set and f):

https://i.sstatic.net/Qmklo.png

This is the corresponding Mathematica code:

ContourPlot[g.{1, x1, x2} == 0, {x1, -1, 1}, {x2, -1, 1}]

How can I do something similar in R (ggplot would be nice)?

asked Jan 27 '26 23:01 by spore234


1 Answer

Here is the same thing using ggplot. This example follows your code exactly, then adds at the end:

# OP's code...
# ...

glist <- sign(datam %*% g)

library(reshape2)  # for melt(...)
library(ggplot2)
df <- data.frame(datam,f=ylist,g=glist) # df has columns: X1, X2, X3, f, g
gg <- melt(df,id.vars=c("X1","X2","X3"),variable.name="model")

# one separating line per facet; passing the lines as layer data replaces the
# subset=.(...) argument, which ggplot2 >= 2.0.0 no longer supports
sep <- data.frame(model=c("f","g"),
                  intercept=c(-f[1]/f[3], -g[1]/g[3]),
                  slope=c(-f[2]/f[3], -g[2]/g[3]))

ggp <- ggplot(gg, aes(x=X2, y=X3, color=factor(value)))
ggp <- ggp + geom_point()
ggp <- ggp + geom_abline(data=sep, aes(intercept=intercept, slope=slope))
ggp <- ggp + facet_wrap(~model)
ggp <- ggp + scale_color_discrete(name="Mistake")
ggp <- ggp + labs(title=paste0("Comparison of Target (f) and Hypothesis (g) [n=",n,"]"))
ggp <- ggp + theme(plot.title=element_text(face="bold"))
ggp
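
The intercept and slope used for the lines come from solving w[1] + w[2]*x + w[3]*y = 0 for y, which gives y = -w[1]/w[3] - (w[2]/w[3])*x. A minimal sketch of that step follows; the helper name boundary_line is made up here, and it assumes w[3] is non-zero.

```r
# Hypothetical helper: turn a weight vector w = c(w1, w2, w3) into the
# intercept and slope of the line w1 + w2*x + w3*y = 0.
boundary_line <- function(w) {
  stopifnot(length(w) == 3, w[3] != 0)   # a vertical or missing line cannot be expressed this way
  c(intercept = -w[1] / w[3], slope = -w[2] / w[3])
}

f <- c(1.0, 0.5320523, 0.6918301)
ln <- boundary_line(f)

# every point on that line should give a score of (numerically) zero
x <- c(-1, 0, 1)
y <- ln[["intercept"]] + ln[["slope"]] * x
residual <- f[1] + f[2] * x + f[3] * y
print(ln)
print(max(abs(residual)))
```

Note that a weight vector such as c(1,0,0) has w[3] = 0, so it defines no finite line inside the plot and the helper stops with an error.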

Below are results for n=200, 500, and 1000. When n=100, g=c(1,0,0), which is the perceptron's starting vector, so no update was made. You can see that g converges toward f at around n=500.
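
How closely g matches f can also be estimated numerically by comparing the two classifiers on fresh random points. This is only a sketch under assumptions: the seed, the sample sizes, the helper make_points and the max_passes safeguard are all chosen here for illustration and are not part of the original code.

```r
set.seed(42)                         # arbitrary seed, only for reproducibility

# same data layout as in the question: leading 1, then x, then y
make_points <- function(n) cbind(1, 2 * runif(n) - 1, 2 * runif(n) - 1)

# perceptron as in the question, with a cap on the number of passes as a safeguard
perceptron <- function(X, y, max_passes = 10000) {
  w <- c(1, 0, 0)
  for (pass in seq_len(max_passes)) {
    mistakes <- 0
    for (i in seq_len(nrow(X))) {
      if (y[i] != sign(sum(w * X[i, ]))) {
        w <- w + y[i] * X[i, ]
        mistakes <- mistakes + 1
      }
    }
    if (mistakes == 0) break         # a full pass without mistakes: done
  }
  w
}

f <- c(1.0, 0.5320523, 0.6918301)
train <- make_points(500)
g <- perceptron(train, as.vector(sign(train %*% f)))

# fraction of fresh points on which f and g assign the same class
test <- make_points(10000)
agreement <- mean(sign(test %*% f) == sign(test %*% g))
print(agreement)
```

Because only a small corner of the square lies on the -1 side for this f, even g=c(1,0,0) agrees with f on most points, so the plots remain the better way to judge how close the two lines are.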

In case you are new to ggplot: first we create a data frame (df) which has the coordinates (X2 and X3) and two columns for the classifications based on f and g. Then we use melt(...) to convert this to a new data frame, gg, in "long" format. gg has columns X1, X2, X3, model, and value. The column gg$model identifies the model (f or g), and the corresponding classifications are in gg$value. Then the ggplot calls do the following:

  1. Establish the default dataset, gg, the x and y coords, and the coloring [ggplot(...)]
  2. Add the points layer [geom_point(...)]
  3. Add lines separating the classifications [geom_abline(...)]
  4. Tell ggplot to plot the two models in different "facets" [facet_wrap(...)]
  5. Set the legend name.
  6. Set the plot title.
  7. Make the plot title bold.
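
The reshaping into "long" format described above is easiest to see on a tiny example. The sketch below uses three made-up points; the values in the g column are illustrative only and were not produced by the perceptron.

```r
library(reshape2)

# three made-up points with their classes under f and (hypothetically) under g
df <- data.frame(X1 = 1,
                 X2 = c(-0.9, 0.2, 0.7),
                 X3 = c(-0.8, 0.1, -0.3),
                 f  = c(-1, 1, 1),
                 g  = c(1, 1, 1))

# wide -> long: the f and g columns are stacked into model/value pairs
gg <- melt(df, id.vars = c("X1", "X2", "X3"), variable.name = "model")
print(gg)   # 6 rows: the 3 points labelled by f, then the same 3 labelled by g
```

Each point now appears once per model, which is what lets facet_wrap(~model) draw the same coordinates in both panels with different colors.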

answered Jan 30 '26 17:01 by jlhoward