I have multiple Excel files with pictures in one of the sheets. Is there a way to extract the images (or their file paths) into R so they can be passed to tesseract OCR?
Previously I used the openxlsx package's loadWorkbook function:
wb <- openxlsx::loadWorkbook("C:/Users/.../test_file.xlsx")
When you print wb, the output is:
A Workbook object.
Worksheets:
Sheet 1: "Sheet1"
Images:
Image 1: "C:/Users/..../AppData/Local/Temp/RtmpuUQZm7//file41e..._openxlsx_loadworkbook/xl/media/image1.png"
Worksheet write order: 1
Is there any way to get this image path? The variable is a Workbook object, and typeof() returns "S4", so it appears that I can't convert it to a character and pull out the path.
You can access the image paths through the media field of your workbook object. The Workbook is a reference class object, so its fields are read with $, as in wb$media.
Here's a reprex of plotting a PNG stored within an xlsx file:
library(png)
library(openxlsx)
library(grid)
wb <- openxlsx::loadWorkbook("~/img.xlsx")
# wb$media holds the paths of the images extracted from the workbook
img <- png::readPNG(wb$media[1])
grid::grid.newpage()
grid::grid.raster(img)

Created on 2020-03-04 by the reprex package (v0.3.0)
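To connect this back to the OCR part of the question, here is a minimal sketch of passing the extracted image paths to tesseract. It assumes the openxlsx and tesseract packages are installed and that the media field holds one file path per embedded image. The helper names image_paths and ocr_workbook_images are hypothetical, introduced only for illustration.

```r
# Hypothetical helper: keep only the media paths that look like raster images.
image_paths <- function(media) {
  media <- unname(as.character(media))
  media[grepl("\\.(png|jpe?g|tiff?|bmp)$", media, ignore.case = TRUE)]
}

# Hypothetical helper: load a workbook and run OCR on each embedded image.
# Assumes openxlsx and tesseract are installed; they are only needed
# when this function is actually called.
ocr_workbook_images <- function(xlsx_file) {
  wb <- openxlsx::loadWorkbook(xlsx_file)
  paths <- image_paths(wb$media)
  # tesseract::ocr() accepts a file path and returns the recognized text
  text <- lapply(paths, tesseract::ocr)
  names(text) <- basename(paths)
  text
}

# Usage (file name is a placeholder):
# results <- ocr_workbook_images("~/img.xlsx")
# cat(results[["image1.png"]])
```

Note that loadWorkbook extracts the images into a temporary directory (as the Temp path in the question shows), so copy any file you want to keep beyond the current R session.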