I would like to process a data frame through dplyr and ggplot using column names given as strings. Here is my code:
library(ggplot2)
library(dplyr)
my_df <- data.frame(var_1 = sample(c('a', 'b', 'c'), 1000, replace = TRUE),
                    var_2 = sample(c('d', 'e', 'f'), 1000, replace = TRUE))
name_list = c('var_1', 'var_2')
for(el in name_list){
  pdf(paste(el, '.pdf', sep =''))
  test <- my_df %>% group_by(el) %>% summarize(count = n())
  ggplot(data = test, aes(x = el, y = count)) + geom_bar(stat='identity')
  dev.off()
}
The above code obviously does not work, so I tried different things like UQ and as.name. UQ creates a column with extra quotes, and ggplot does not understand it with aes_string. Any suggestions?
I can use for (el in names(my_df)) with filtering, but would prefer to work with strings.
UPDATE: Here are the detailed messages/errors that I got:
for(el in name_list){
  pdf(paste(el, '.pdf', sep =''))
  test <- my_df %>% group_by(!!el) %>% summarize(count = n())
  ggplot(data = test, aes_string(x = el, y = 'count')) + geom_bar(stat='identity')
  dev.off()
}
The above code generates empty files.
for(el in name_list){
  pdf(paste(el, '.pdf', sep =''))
  test <- my_df %>% group_by(UQ(el)) %>% summarize(count = n())
  ggplot(data = test, aes_string(x = el, y = 'count')) + geom_bar(stat='identity')
  dev.off()
}
The above code also generates empty files.
for(el in name_list){
  pdf(paste(el, '.pdf', sep =''))
  test <- my_df %>% group_by(as.name(el)) %>% summarize(count = n())
  ggplot(data = test, aes_string(x = el, y = 'count')) + geom_bar(stat='identity')
  dev.off()
}
produces
Error in mutate_impl(.data, dots) :
Column `as.name(el)` is of unsupported type symbol
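You need to unquote the name/symbol with UQ (or !!), not the string itself. Note also that a ggplot object is not auto-printed inside a for loop, so it has to be wrapped in print() to reach the PDF device; without it the files stay empty. For example: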
for(el in name_list){
  pdf(paste(el, '.pdf', sep =''))
  test <- my_df %>% group_by(UQ(as.name(el))) %>% summarize(count = n())
  print(ggplot(data = test, aes_string(x = el, y = 'count')) + geom_bar(stat='identity'))
  dev.off()
}
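To see the difference between unquoting the string and unquoting the symbol, here is a small sketch. It reuses my_df and el from the question; the names by_string and by_symbol and the set.seed() call are only there for illustration. Unquoting the bare string makes dplyr group by a constant, which is where the column with extra quotes mentioned in the question comes from, while unquoting as.name(el) groups by the actual column.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible
my_df <- data.frame(var_1 = sample(c('a', 'b', 'c'), 1000, replace = TRUE),
                    var_2 = sample(c('d', 'e', 'f'), 1000, replace = TRUE))
el <- 'var_1'

# !!el injects the string itself, so every row falls into a single
# group keyed by the constant 'var_1'
by_string <- my_df %>% group_by(!!el) %>% summarize(count = n())

# !!as.name(el) injects a reference to the column var_1,
# so there is one row per distinct value in that column
by_symbol <- my_df %>% group_by(!!as.name(el)) %>% summarize(count = n())

print(by_string)
print(by_symbol)
```

The first result has a single row holding all 1000 observations; the second has one row per level of var_1 and a column actually named var_1, which is what aes_string(x = el) expects.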
I made two changes to your code:
dplyr: use group_by_ instead of group_by;
ggplot2: use aes_string or get(variable).
I also made some minor changes (e.g. ggsave to save plots).
library(ggplot2)
library(dplyr)
my_df <- data.frame(var_1 = sample(c('a', 'b', 'c'), 1000, replace = TRUE),
                    var_2 = sample(c('d', 'e', 'f'), 1000, replace = TRUE))
name_list = c('var_1', 'var_2')
for(el in name_list){
  p <- my_df %>%
    group_by_(el) %>%
    summarize(count = n()) %>%
    ggplot(aes(x = get(el), y = count)) +
    geom_bar(stat = "identity")
  ggsave(paste0(el, ".pdf"), p)
}
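One caveat: group_by_() and the other underscore verbs have been deprecated since dplyr 0.7. A possible alternative for newer versions, sketched below, is to select the grouping column with across(all_of()) in dplyr and to use the .data pronoun in aes(). This assumes dplyr 1.0 or later and a ggplot2 version that supports tidy evaluation in aes() (3.0 or later). The helpers count_by and plot_counts are hypothetical names introduced here, not part of either package.

```r
library(ggplot2)
library(dplyr)

my_df <- data.frame(var_1 = sample(c('a', 'b', 'c'), 1000, replace = TRUE),
                    var_2 = sample(c('d', 'e', 'f'), 1000, replace = TRUE))
name_list <- c('var_1', 'var_2')

# Hypothetical helper: count rows per value of the column named by `col`
count_by <- function(df, col) {
  df %>%
    group_by(across(all_of(col))) %>%   # select the grouping column by its string name
    summarize(count = n(), .groups = "drop")
}

# Hypothetical helper: bar chart of the counts produced by count_by()
plot_counts <- function(counts, col) {
  ggplot(counts, aes(x = .data[[col]], y = count)) +  # .data[[col]] looks the column up by name
    geom_col() +                                      # same as geom_bar(stat = "identity")
    labs(x = col)                                     # label the axis with the column name
}

for (el in name_list) {
  p <- plot_counts(count_by(my_df, el), el)
  ggsave(paste0(el, ".pdf"), p)  # ggsave() receives the plot object, so no print() is needed
}
```

Keeping the string handling inside two small functions means the loop body never has to quote or unquote anything itself.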