Suppress missing values in Tabyl xtabs in R

Tags: r, janitor

According to the tabyl documentation: [screenshot not available]

However, I can't figure out how to exclude the NA values from the denominator!

Here is the data:

library(dplyr)
library(janitor)

df <- data.frame(col1 = c(1, 1, 2, 2, 1, NA, NA),
                 col2 = c("this", NA, "is", "text", NA, NA, "yes"),
                 col3 = c("TRUE", "FALSE", NA, "TRUE", "TRUE", "TRUE", "FALSE"),
                 col4 = c(2.5, 4.2, 3.2, NA, 4.2, 3.2, 3)) %>%
  mutate_if(is.numeric, as.factor) %>%
  mutate_if(is.character, as.factor)

str(df)
df %>%
  tabyl(col1, col3, show_missing_levels = FALSE) %>%
  adorn_percentages("row") %>%
  adorn_pct_formatting(digits = 2) %>%
  adorn_ns()

Note that the NA counts are still included in the denominator of the row percentages. I don't want NA to appear in the cross tab at all:

 col1      FALSE       TRUE        NA_
    1 33.33% (1) 66.67% (2)  0.00% (0)
    2  0.00% (0) 50.00% (1) 50.00% (1)
 <NA> 50.00% (1) 50.00% (1)  0.00% (0)

What I want to see:

 col1      FALSE        TRUE
    1 33.33% (1)  66.67% (2)
    2  0.00% (0) 100.00% (1)

Any idea how I can achieve this?

NewBee asked Sep 20 '25 09:09


2 Answers

By default, tabyl() uses show_na = TRUE. Setting it to FALSE drops the NA row and column, so the OP's code works as intended:

library(dplyr)
library(janitor)
df %>%
  tabyl(col1, col3, show_missing_levels = FALSE, show_na = FALSE) %>%
  adorn_percentages("row") %>%
  adorn_pct_formatting(digits = 2) %>%
  adorn_ns()

Output:

#  col1      FALSE        TRUE
#    1 33.33% (1)  66.67% (2)
#    2  0.00% (0) 100.00% (1)

Earlier, I was thinking about inserting na.omit() to remove the NA rows and select() to keep only the columns of interest. But this also changes or removes the tabyl attributes, so adorn_ns() no longer works.
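
As a quick cross-check of the expected percentages that does not depend on janitor, here is a minimal base R sketch. It rebuilds only col1 and col3 with the same values as the question's df; the names counts and row_pct are illustrative. It relies on table() excluding NA by default (useNA = "no"), so the denominators contain only complete pairs.

```r
# Same col1/col3 values as the question's df, built with base R only
col1 <- factor(c(1, 1, 2, 2, 1, NA, NA))
col3 <- factor(c("TRUE", "FALSE", NA, "TRUE", "TRUE", "TRUE", "FALSE"))

# table() excludes NA by default (useNA = "no"),
# so pairs with a missing value never enter the counts
counts <- table(col1, col3)

# Row percentages: each row is divided by its own total
row_pct <- round(100 * prop.table(counts, margin = 1), 2)
row_pct
```

The rows of row_pct should agree with the percentages in the tabyl output above (33.33 / 66.67 for col1 = 1 and 0 / 100 for col1 = 2).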

akrun answered Sep 22 '25 05:09


This is almost the same as akrun's answer:

library(dplyr)
library(janitor)
df %>%
  tabyl(col1, col3, show_missing_levels = FALSE) %>%
  na.omit() %>%
  select(-NA_) %>%
  adorn_percentages("row") %>%
  adorn_pct_formatting(digits = 0)

Output:

 col1 FALSE TRUE
    1   33%  67%
    2    0% 100%
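
If the counts are wanted as well, one possible variant (a sketch, not part of either answer) is to drop the incomplete rows before calling tabyl(), so that the tabyl is built from complete cases only and adorn_ns() still has matching counts to work with. The data frame below repeats only col1 and col3 from the question, and res is an illustrative name.

```r
library(dplyr)
library(janitor)

# col1 and col3 from the question's df, already converted to factors
df <- data.frame(
  col1 = factor(c(1, 1, 2, 2, 1, NA, NA)),
  col3 = factor(c("TRUE", "FALSE", NA, "TRUE", "TRUE", "TRUE", "FALSE"))
)

res <- df %>%
  # keep only rows where both cross-tab variables are present
  filter(!is.na(col1), !is.na(col3)) %>%
  tabyl(col1, col3) %>%
  adorn_percentages("row") %>%
  adorn_pct_formatting(digits = 2) %>%
  adorn_ns()

res
```

Because the filtering happens before tabulation, no NA row or NA_ column is ever created, so nothing has to be removed from the tabyl afterwards.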

TarJae answered Sep 22 '25 05:09