
Correct way to debug 3rd party R package/library locally in RStudio

After some initial analysis, I believe a 3rd-party R package I use on a daily basis has a bug, and I increasingly want to confirm and fix it.

I used to just copy the function into my own workspace and resolve the dependencies, but this only works for a reasonably isolated function, one that doesn't depend on many other 3rd-party packages and internal functions.

I've tried getting ideas from the contribution guides of some R packages on GitHub, to no avail. Most of them only cover how to write a good commit message, how to ask questions, and where to pick an issue to work on; others simply don't have a contribution guide. (I might be biased in this regard; if you know one that covers how to build the package and get up to speed with debugging, please share it.)

I've tried setting keep.source.pkgs to yes in .Renviron/.Rprofile. After a lot of bumpy trial and error, I finally got to the stage where print(targetFunction) showed me the source code with comments (meaning the source is included in the installed package itself and, in theory, should be ready for breakpoints).

However, I was still unable to set a breakpoint in the source file: RStudio complained that an updated version of the package needed to be loaded, and that could only be achieved if the original source code had a .Rproj file so that I could build it inside RStudio. I also wasn't able to use the setBreakpoint function, which reported that no source ref was found (even though the source was already available, given that keep.source.pkgs was set to yes).
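One way to check whether a function really carries source references is to look for its srcref attribute. The sketch below is only an illustration: has_srcref() is a hypothetical helper name, and the two demo functions are parsed inline so the script runs without any particular package installed. It assumes the usual behaviour that the keep.source.pkgs option (or R_KEEP_PKG_SOURCE=yes in .Renviron) only affects packages installed after it is set, so an already-installed package would need reinstalling.

```r
# Could go in ~/.Rprofile; assumed to apply only to packages installed afterwards.
options(keep.source.pkgs = TRUE)

# Hypothetical helper: TRUE when the function still carries a source reference.
has_srcref <- function(f) !is.null(attr(f, "srcref"))

# Demo functions parsed with and without source retention.
code <- "function(x) x + 1  # a comment that only survives with srcrefs"
with_src    <- eval(parse(text = code, keep.source = TRUE))
without_src <- eval(parse(text = code, keep.source = FALSE))

has_srcref(with_src)     # source retained
has_srcref(without_src)  # source dropped
```

For an installed package the same check would be has_srcref(somepkg::targetFunction), where somepkg and targetFunction are placeholders.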

Suppose I want to contribute to an R package: I clone the repo from GitHub, and I am able to build it (R CMD build) and install it (R CMD INSTALL). How do I "correctly" start debugging it?

stucash asked Jan 27 '26 09:01


1 Answer

When I find what I think is a bug in a 3rd-party package that I need to dive into, for anything but the more trivial functions, I tend to do this:

  1. git clone the repo. If it's simple enough and/or I know I will not be submitting a GitHub pull request (GitLab merge request), then I'll clone the simpler https: URL, which does not allow pushing back to a branch. If I suspect I'll be submitting a change request, though, I'll typically fork the repo into my own account (GH or GL) and then clone my fork of the repo. (And if so, I'll work in a branch for later submission.)

  2. Work heavily in the devtools realm:

    1. Increment the package version in DESCRIPTION to distinguish my local version from the "real" package;
    2. Edit the package source for the function I need and insert browser() into the function I'm investigating;
    3. call devtools::load_all() on the path to the local package;
    4. run the expression/code that triggers the function to be debugged;
    5. once in the debugger, I'm able to work with all related functions and namespaces natively and work on the bug;
    6. if I feel I've resolved the problem, I'll remove the browser() I added in step 2, load_all the package again, and run my code/expression again to see if it works as intended without unexpected warnings/errors and with the expected results;
    7. I have the option to devtools::install("path/to/local/fork"), which does a more formal installation of the package from the local fork directory, making the changes available to future local R sessions. If I made the version change minor enough, any change to the upstream package should appear as an upgrade and override my work, which (depending on the status of my PR/MR) may or may not include this fix; if it does not, then I just git rebase my changes onto the updated origin and recheck my updated package.
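The steps above can be sketched end to end in R. This is only an illustration under stated assumptions: demopkg and scale_values() are hypothetical stand-ins for the cloned repo and the function under investigation, the package is generated in a temporary directory so the script is self-contained, and the interactive() guard around browser() is an addition so the script also runs non-interactively.

```r
# Hypothetical stand-in for a cloned repo, built in a temp dir so this runs anywhere.
pkg <- file.path(tempdir(), "demopkg")
dir.create(file.path(pkg, "R"), recursive = TRUE, showWarnings = FALSE)

# Step 1: DESCRIPTION with a bumped version to mark the local copy.
writeLines(c(
  "Package: demopkg",
  "Title: Stand-In for a Cloned Package",
  "Version: 0.1.0.9000",
  "Description: Hypothetical package used only to illustrate the workflow.",
  "License: GPL-3"
), file.path(pkg, "DESCRIPTION"))
writeLines("export(scale_values)", file.path(pkg, "NAMESPACE"))

src <- file.path(pkg, "R", "scale_values.R")

# Step 2: the function under investigation, with a temporary browser() call.
writeLines(c(
  "scale_values <- function(x) {",
  "  if (interactive()) browser()",
  "  x / max(x)",
  "}"
), src)

# Steps 3-5: load the working copy and trigger the function.
devtools::load_all(pkg, quiet = TRUE)
buggy <- scale_values(c(1, 2, NA, 4))   # all NA, because max() propagates NA

# Step 6: remove browser(), apply the candidate fix, reload, and rerun.
writeLines(c(
  "scale_values <- function(x) {",
  "  x / max(x, na.rm = TRUE)",
  "}"
), src)
devtools::load_all(pkg, quiet = TRUE)
fixed <- scale_values(c(1, 2, NA, 4))

# Step 7 (optional): install the local copy for future sessions.
# devtools::install(pkg)
```

With a real clone, pkg would simply point at the repo directory and the edits would be made in an editor rather than with writeLines().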

At this point, I may put a little or a lot more effort into the cloned repo.

  • If I didn't fork the repo, I'll just git diff the relevant files and insert the diff into an issue in the origin repo. While a PR/MR would be ideal, it should still be relatively easy for the authors/maintainers to see the diff and apply/test/use or something else. Sometimes I'll do this even if I know it'll be incomplete; for instance, if I know I cannot add unit-testing or if there are other aspects that need resolution, then an interim hack at least clearly indicates what can work to a certain point, at which point the maintainers may have a better idea how to finish the work.
  • If I'm going to submit a PR/MR, many packages have additional requirements and/or requests, such as below. The unit-tests can be a little onerous, but often it's good to provide some more reproducible context to the issue.
    • simple unit-tests
    • edit to NEWS.md
    • edit to CONTRIBUTORS.md
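A "simple unit-test" accompanying a fix might look like the sketch below, assuming the package uses testthat (many do, but not all). scale_values() is a hypothetical function defined inline so the example runs on its own; in a real package the test would normally live in a file under tests/testthat/ and exercise the package's own function.

```r
library(testthat)

# Hypothetical function standing in for the fixed package code.
scale_values <- function(x) x / max(x, na.rm = TRUE)

# A test that would fail before the fix and pass after it.
test_that("scale_values() ignores NA when finding the maximum", {
  expect_equal(scale_values(c(1, 2, NA, 4)), c(0.25, 0.5, NA, 1))
})
```

A test like this doubles as the reproducible context for the issue: it records the input that triggered the bug and the result that is expected once it is fixed.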

Whether or not I actually submit a PR/MR, I submit an issue on the origin repo. This issue is usually written before

r2evans answered Jan 30 '26 01:01


