I'm getting an error message that I can't make sense of. My code:
 url <- "https://en.wikipedia.org/wiki/California_State_Legislature,_2017%E2%80%9318_session"
leg <- read_html(url)
testdata <- leg %>% 
  html_nodes('table') %>% 
  .[6] %>% 
  html_table()
To which I get the response:
Error in out[j + k, ] : subscript out of bounds
When I swap out html_table with html_text I don't get the error. Any idea what I'm doing wrong?
library(htmltab)
library(dplyr)
library(tidyr)
url <- "https://en.wikipedia.org/wiki/California_State_Legislature,_2017%E2%80%9318_session"
url %>%
  htmltab(6, rm_nodata_cols = F) %>%
  .[,-1] %>%
  replace_na(list(Notes = "", "Term-limited?" = "")) %>%
  `rownames<-` (seq_len(nrow(.)))
Output is:
  District              Name      Party       Residence Term-limited? Notes
1        1        Ted Gaines Republican El Dorado Hills                    
2        2      Mike McGuire Democratic      Healdsburg                    
3        3         Bill Dodd Democratic            Napa                    
4        4       Jim Nielsen Republican          Gerber                    
5        5 Cathleen Galgiani Democratic        Stockton                    
6        6       Richard Pan Democratic      Sacramento                    
...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With