
Detect order of applications of transformations from ggplot object

These objects print the same but the objects themselves are different.

library(ggplot2)
p1 <- ggplot(cars, aes(speed, dist)) + xlim(1, 2) + geom_point() + geom_line()
p2 <- ggplot(cars, aes(speed, dist)) + geom_point() + xlim(1, 2) + geom_line()
p3 <- ggplot(cars, aes(speed, dist)) + geom_point() + geom_line() + xlim(1, 2)
length(waldo::compare(p1, p2))
#> [1] 229
length(waldo::compare(p1, p3))
#> [1] 190

I would like to determine, from a ggplot object itself, the order in which the transformations were applied.

We can access the layers, scales etc. using p1$layers, p1$scales etc., and within each transformation type we can find them in order of application, but I need to know the overall order.
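To illustrate why per-type storage loses the overall order, here is a toy model in base R. It is not ggplot2's actual internals: `empty`, `add_component()` and `build()` are hypothetical names made up for this sketch. Each component is appended to the vector for its own type, so two different interleavings end up as the same object.

```r
# Toy model (NOT ggplot2's real internals): a "plot" is a list holding one
# character vector per component type.
empty <- list(layers = character(), scales = character())

# Hypothetical helper: append `value` to the vector for its `type`.
add_component <- function(plot, type, value) {
  plot[[type]] <- c(plot[[type]], value)
  plot
}

# Hypothetical helper: apply (type, value) steps in the given order.
build <- function(steps) {
  Reduce(function(plot, step) add_component(plot, step[1], step[2]),
         steps, empty)
}

# Same components, different interleaving (mirrors p1 and p2 above).
toy1 <- build(list(c("scales", "xlim"), c("layers", "point"), c("layers", "line")))
toy2 <- build(list(c("layers", "point"), c("scales", "xlim"), c("layers", "line")))

# Within each type the order survives, but the overall order does not.
identical(toy1, toy2)
#> [1] TRUE
```

In this simplified picture, any information about the overall order would have to come from somewhere other than the per-type lists.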

The $layers elements are equivalent between the plots above (as can be checked with waldo::compare(p1$layers, p2$layers)). The $scales elements, however, differ because of environments found as attributes, function enclosures, or elements of other environments. This is where I got stuck.

A general answer is best, but an answer that will work "90% of the time" would be appreciated too. The general issue is not only about scales and layers but should include other transformations as well (coordinates, position, themes) as long as their position relative to objects of other types changes the output.

The output for the given examples might look like :

# 1st scale then 1st layer then 2nd layer
gg_order(p1)
#> scales layers layers 
#>      1      1      2

# 1st layer then 1st scale then 2nd layer
gg_order(p2)
#> layers scales layers 
#>      1      1      2

# 1st layer then 2nd layer then 1st scale
gg_order(p3)
#> layers layers scales 
#>      1      2      1
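The output format above could be produced from a plain vector of component types listed in overall order. The sketch below assumes that such a vector is somehow available, which is exactly the part this question asks about; `format_order()` is a hypothetical helper name, not an existing function.

```r
# Hypothetical helper: given component types in overall order of application,
# number each one within its own type and name the result by type.
format_order <- function(types) {
  within_type <- ave(seq_along(types), types, FUN = seq_along)
  setNames(within_type, types)
}

format_order(c("layers", "scales", "layers"))
#> layers scales layers 
#>      1      1      2
```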

The number of transformations doesn't always match the number of functions in the original code, since a few functions apply several transformations. We can assume a one-to-one mapping here if it helps.

EDIT:

I have designed some tools that help navigate the waldo diffs, which might help:

devtools::install_github("moodymudskipper/woof")
woof::woof_compare(p1, p2)
w <- woof::woof_compare(p1, p2)
w$scales$super$..env$env$self$super$..env
print(w$scales$super$..env$env$self$super$..env, substitute = TRUE)
asked Dec 14 '25 17:12 by Moody_Mudskipper


1 Answer

This isn't an answer in the sense that it will help you reach your goal, but it might help you scope out a different one.

The reason that waldo reports differences between these plots is that the scales are cloned upon the addition of every layer: the new scales become child environments of the old scales. The 'state' of the plot object should thus depend on how many objects were added after the xlim() call, because each such addition clones the scale that xlim() produces. The cloning happens in this line of source code:

https://github.com/tidyverse/ggplot2/blob/d7f22413efea3dd2a7c9effff05d4b2aa2c2d300/R/plot.R#L150

I believe this scale cloning is what lets waldo report differences, but I don't think the cloning is able to track any state in other parts of the plot, and therefore your goal might be unachievable.
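As a rough illustration of the mechanism described above, here is a base-R toy model. It is not ggplot2's code: `clone_env()` and `clone_depth()` are hypothetical helpers, and the model assumes only that each clone is a new environment whose parent is the object it was cloned from, and that every later addition clones the scales already present. Under those assumptions, a scale carries a trace of how many additions came after it, but nothing about what those additions were.

```r
# Toy model (NOT ggplot2's implementation): a clone is a fresh environment
# whose parent is the object it was cloned from.
clone_env <- function(obj) new.env(parent = obj)

# Hypothetical helper: count how many clone steps separate `obj` from `root`.
clone_depth <- function(obj, root) {
  depth <- 0L
  while (!identical(obj, root)) {
    if (identical(obj, emptyenv())) stop("`obj` does not descend from `root`")
    obj <- parent.env(obj)
    depth <- depth + 1L
  }
  depth
}

# A toy "scale" as first created.
scale0 <- new.env()

# Scale added first, followed by two more additions: cloned twice.
scale_early <- clone_env(clone_env(scale0))
# Scale added last: never cloned afterwards.
scale_late <- scale0

clone_depth(scale_early, scale0)
#> [1] 2
clone_depth(scale_late, scale0)
#> [1] 0
```

The depth distinguishes the two toy scales, which is the kind of difference a deep comparison would pick up, yet it says nothing about which components were added in between.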

I believe so because of the following exercise: if you fork ggplot2 and comment out that particular line, those plot objects become identical (but won't render properly):

library(ggplot2) # 3.4.2 from CRAN

p1 <- ggplot(cars, aes(speed, dist)) + xlim(1, 2) + geom_point() + geom_line()
p2 <- ggplot(cars, aes(speed, dist)) + geom_point() + xlim(1, 2) + geom_line()
p3 <- ggplot(cars, aes(speed, dist)) + geom_point() + geom_line() + xlim(1, 2)

length(waldo::compare(p1, p2))
#> [1] 224
length(waldo::compare(p1, p3))
#> [1] 190

# Now with local fork with that line commented out
# Path may differ on your machine
devtools::load_all("~/packages/ggplot2/")
#> ℹ Loading ggplot2

p1 <- ggplot(cars, aes(speed, dist)) + xlim(1, 2) + geom_point() + geom_line()
p2 <- ggplot(cars, aes(speed, dist)) + geom_point() + xlim(1, 2) + geom_line()
p3 <- ggplot(cars, aes(speed, dist)) + geom_point() + geom_line() + xlim(1, 2)

waldo::compare(p1, p2)
#> ✔ No differences
waldo::compare(p1, p3)
#> ✔ No differences

Created on 2023-04-08 with reprex v2.0.2

So unless one is really an R wizard and is able to wrangle different states out of these scale environments, I think the order of operations is irretrievable from the plot object alone.

I'd be happy to be proven wrong though!

answered Dec 17 '25 11:12 by teunbrand


