
Scientific notation only for specific numbers of a dataset's column

Tags: r

I need to format the numeric columns of a data frame so that scientific notation is used only when a number is less than 0.0001. I have written the following code, which uses the format function. The problem is that it transforms every number, not just the small ones.

Any suggestions?

col1 <- c(0.00002, 0.0001, 0.5689785541122558)
col2 <- c(3.5, 45.6546548788, 12585.5663)
tab <- cbind(col1, col2)
tab <- as.data.frame(tab)
format(tab[1], digit = 1, nsmall = 3)
mbistato asked Sep 08 '25 13:09



2 Answers

1) dplyr. Define a vectorized version of format and use it in mutate/across:

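# Format each element separately, switching to scientific notation only
# when the absolute value is below 0.0001. Extra arguments go to format().
formatv <- function(x, ...) {
  mapply(format, x, scientific = abs(x) < 0.0001, ...)
}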

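library(dplyr)
# Since dplyr 1.1.0, across() expects .cols explicitly and extra arguments
# are passed through an anonymous function rather than through ...
tab %>% mutate(across(everything(), \(x) formatv(x, digits = 1, nsmall = 3)))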

2) Base R. Or with only base R (formatv is from above):

replace(tab, TRUE, lapply(tab, formatv, digits = 1, nsmall = 3))

or

replace(tab, TRUE, formatv(as.matrix(tab), digits = 1, nsmall = 3))

or, if you have a small number of columns, do each individually:

transform(tab,
  col1 = formatv(col1, digits = 1, nsmall = 3),
  col2 = formatv(col2, digits = 1, nsmall = 3))

3) collapse. formatv is from above.

library(collapse)
ftransformv(tab, names(tab), formatv, digits = 1, nsmall = 3)

4) purrr. map_dfc in purrr can also be used. formatv is from above.

library(purrr)
tab %>% map_dfc(formatv, digits = 1, nsmall = 3)
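
As a quick end-to-end check, here is a minimal base R sketch that rebuilds the question's data frame, repeats the formatv definition so the snippet runs on its own, and applies the lapply/replace variant. The name res is only illustrative, and the outputs in the comments assume R's default scipen setting.

```r
# Sample data from the question
tab <- data.frame(
  col1 = c(0.00002, 0.0001, 0.5689785541122558),
  col2 = c(3.5, 45.6546548788, 12585.5663)
)

# Same helper as in the answer: format each element on its own and use
# scientific notation only when the absolute value is below 0.0001
formatv <- function(x, ...) {
  mapply(format, x, scientific = abs(x) < 0.0001, ...)
}

# Replace every column by its formatted (character) version
res <- replace(tab, TRUE, lapply(tab, formatv, digits = 1, nsmall = 3))

res$col1
# "2e-05"  "0.0001" "0.569"
res$col2
# "3.500"     "45.655"    "12585.566"
```

Note that the resulting columns are character vectors, so this step is best kept for display or export rather than further computation.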
G. Grothendieck answered Sep 10 '25 07:09



You could use apply over both margins (1:2), so that format is called on each cell separately.

as.data.frame(apply(tab, 1:2, \(x) format(x, digits=1, nsmall=3)))
#    col1      col2
# 1 2e-05     3.500
# 2 1e-04    45.655
# 3 0.569 12585.566

Or if you want to format just one specific column:

transform(tab, col1=sapply(col1, format, digits=1, nsmall=3))
#    col1        col2
# 1 2e-05     3.50000
# 2 1e-04    45.65465
# 3 0.569 12585.56630

The important point is that each element is formatted individually.
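
To make that point concrete, here is a small sketch contrasting the two behaviours on the question's first column. The names x, whole and each are only illustrative, and the comments assume R's default scipen setting.

```r
x <- c(0.00002, 0.0001, 0.5689785541122558)

# Formatting the whole vector at once: format() chooses one common
# representation for all elements, so with these values every element
# ends up in scientific notation
whole <- format(x, digits = 1, nsmall = 3)

# Formatting element by element: each value gets its own representation,
# so only the small values are shown in scientific notation
each <- sapply(x, format, digits = 1, nsmall = 3)

print(whole)
print(each)
```

This is why calling format on an entire column, as in the question, changes all the numbers, while the apply/sapply versions do not.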

Here is another way, using replace.

tab |> 
  round(5) |>
  (\(.) replace(., . < 1e-4, format(.[. < 1e-4], digit=1, nsmall=3)))()
#      col1        col2
# 1   2e-05     3.50000
# 2   1e-04    45.65465
# 3 0.56898 12585.56630
jay.sf answered Sep 10 '25 07:09
