I started playing around with Rcpp and would like to use the fastLm function as an example (also because it's useful for potential later work). I know that fastLm is part of the RcppArmadillo package but I would like to compile it using sourceCpp. The code can be found here and is also below.
The first problem I encounter is that I can't simply run sourceCpp("fastLm.cpp") in R after installing and loading Rcpp and RcppArmadillo. I get this error error: RcppArmadillo.h: No such file or directory and then all kind of things, which I guess follow from that.
The second issue is that I think I need to change some stuff in the fastLm.cpp. My changes are also below but I am sure something is missing or wrong. I included #include <Rcpp.h> and using namespace Rcpp; and // [[Rcpp::export]] to export the function to R and I changed the arguments from SEXP to NumericVector and NumericMatrix. I don't see why that shouldn't work and a similar adjustment is probably possible for the return value?
fastLm.cpp
#include <RcppArmadillo.h>
extern "C" SEXP fastLm(SEXP ys, SEXP Xs) {
Rcpp::NumericVector yr(ys); // creates Rcpp vector from SEXP
Rcpp::NumericMatrix Xr(Xs); // creates Rcpp matrix from SEXP
int n = Xr.nrow(), k = Xr.ncol();
arma::mat X(Xr.begin(), n, k, false); // reuses memory and avoids extra copy
arma::colvec y(yr.begin(), yr.size(), false);
arma::colvec coef = arma::solve(X, y); // fit model y ~ X
arma::colvec resid = y - X*coef; // residuals
double sig2 = arma::as_scalar( arma::trans(resid)*resid/(n-k) );
// std.error of estimate
arma::colvec stderrest = arma::sqrt( sig2 * arma::diagvec( arma::inv(arma::trans(X)*X)) );
return Rcpp::List::create(
Rcpp::Named("coefficients") = coef,
Rcpp::Named("stderr") = stderrest
) ;
}
fastLm.cpp changed
#include <Rcpp.h>
#include <RcppArmadillo.h>
using namespace Rcpp;
// [[Rcpp::export]]
extern "C" SEXP fastLm(NumericVector yr, NumericMatrix Xr) {
int n = Xr.nrow(), k = Xr.ncol();
arma::mat X(Xr.begin(), n, k, false); // reuses memory and avoids extra copy
arma::colvec y(yr.begin(), yr.size(), false);
arma::colvec coef = arma::solve(X, y); // fit model y ~ X
arma::colvec resid = y - X*coef; // residuals
double sig2 = arma::as_scalar( arma::trans(resid)*resid/(n-k) );
// std.error of estimate
arma::colvec stderrest = arma::sqrt( sig2 * arma::diagvec( arma::inv(arma::trans(X)*X)) );
return Rcpp::List::create(
Rcpp::Named("coefficients") = coef,
Rcpp::Named("stderr") = stderrest
) ;
}
You need to indicate dependency on RcppArmadillo with the Rcpp::depends pseudo attribute. This will take care of finding RcppArmadillo headers and link against blas, lapack etc ...
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
List fastLm(NumericVector yr, NumericMatrix Xr) {
int n = Xr.nrow(), k = Xr.ncol();
arma::mat X(Xr.begin(), n, k, false); // reuses memory and avoids extra copy
arma::colvec y(yr.begin(), yr.size(), false);
arma::colvec coef = arma::solve(X, y); // fit model y ~ X
arma::colvec resid = y - X*coef; // residuals
double sig2 = arma::as_scalar( arma::trans(resid)*resid/(n-k) );
// std.error of estimate
arma::colvec stderrest = arma::sqrt( sig2 * arma::diagvec( arma::inv(arma::trans(X)*X)) );
return Rcpp::List::create(
Rcpp::Named("coefficients") = coef,
Rcpp::Named("stderr") = stderrest
) ;
}
Also, it is very important that you use #include <RcppArmadillo.h> and not #include <Rcpp.h>. RcppArmadillo.h takes care of including Rcpp.h at the right time, and order of include files is very important here.
Also, you can return a List and drop the extern "C".
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With