
How to return a hashref when using perl XS

Tags: go, perl, xs

I have the following challenge: bind code written in Go to Perl (5.12.2).

In the past I used CGO + XS and it worked like a charm; however, my input was a string and the output a boolean.

Now I need to return something more sophisticated. This could be represented as a hashref, something like this:

{ 
 foo => "bar",
 status => 1,
 ids => [1,2,3],
}

All the XS tutorials explain in detail how to work with primitive types (string, integer), but I can't find good material about hashrefs, and libraries that return a hashref, such as YAML::XS, seem really complex. FFI solutions also mostly handle primitive types, and I can't find a good hashref example there either.

I found one way, but it seems... unethical: I can build the structure in the Go code, serialize it as JSON, and pass it through CGO and XS to be parsed in Perl. I know that it would work, but I'd like to find a better solution.
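For reference, the Go side of the JSON route described above might look roughly like this. It is only a minimal sketch: the `payload` type and `encodePayload` function are names made up for illustration, and the CGO/XS plumbing that would carry the string across is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// payload mirrors the hashref from the question; the name is hypothetical.
type payload struct {
	Foo    string `json:"foo"`
	Status int    `json:"status"`
	IDs    []int  `json:"ids"`
}

// encodePayload serializes p to a JSON string that could then be handed
// to C (for example via C.CString) and decoded on the Perl side.
func encodePayload(p payload) (string, error) {
	b, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s, err := encodePayload(payload{Foo: "bar", Status: 1, IDs: []int{1, 2, 3}})
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // {"foo":"bar","status":1,"ids":[1,2,3]}
}
```

On the Perl side, a JSON decoder such as `decode_json` from JSON::XS would turn that string back into a hashref, at the cost of one encode and one decode per call.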

Unfortunately, I have a very strict performance requirement. Without it I would use a REST API or gRPC for this integration, but the overhead would be too severe in this case.

Tiago Peczenyj asked Sep 17 '25 22:09


1 Answer

This question seriously nerdsniped me. The central problem is that much of the core Perl API is implemented as macros rather than functions, and macros cannot be called from CGo.

I first used the following command to get the boilerplate out of the way:

module-starter --module=Example::CGo --author Botje --email [email protected] --class=Module::Starter::XSimple --mb

I replaced the .xs file with the following:

#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"

#include "ppport.h"

SV* getit(pTHX);

MODULE = Example::CGo           PACKAGE = Example::CGo          

SV*
getit()
        CODE:
                /* Return through RETVAL so xsubpp mortalizes the new reference. */
                RETVAL = getit(aTHX);
        OUTPUT:
                RETVAL

This sets up Example::CGo::getit to call the go function named getit and treat its return value as the return value to Perl. All calls to perl core functions need a context, so we pass it unmodified to getit. I'm not sure if that will work for all flavors of perl builds, but whatever :)

The main course is the Go code. The key to making it work is to define a few helper functions that wrap the Perl macros. Each function has a pTHX or pTHX_ declared so the context is available to the macros inside. That does mean we have to pass that THX variable to all C.helper_ calls.

Once that is out of the way, the actual code is simply building the wanted data structure one step at a time. Note how ugly the upcasts to SV* are :)

package main
/*
#define PERL_NO_GET_CONTEXT
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"

// static inline avoids multiple definitions (see https://go.dev/wiki/cgo)
static inline HV* helper_newHV(pTHX) { return newHV(); }
static inline AV* helper_newAV(pTHX) { return newAV(); }
static inline SV* helper_newRV_noinc(pTHX_ SV* thing) { return newRV_noinc(thing); }
static inline SV* helper_newSVpv(pTHX_ const char* str, int len) { return newSVpv(str, len); }
static inline SV* helper_newSViv(pTHX_ int i) { return newSViv(i); }

static inline SV** helper_hv_store(pTHX_ HV* hv, const char* key, U32 klen, SV* val, U32 hash) {
    return hv_store(hv, key, klen, val, hash);
}
static inline void helper_av_push(pTHX_ AV* av, SV* sv) { av_push(av, sv); }
*/
import "C"
import "unsafe"

func main() {}

type example struct {
    foo string
    status int
    ids []int
}

//export getit
func getit(THX *C.PerlInterpreter) *C.SV {
    val := example{foo: "bar", status: 1, ids: []int{1, 2, 3}}
    hv := C.helper_newHV(THX)

    // foo => "bar". The C.CString copies are freed on return; Perl keeps its own copies.
    var foo_key *C.char = C.CString("foo")
    defer C.free(unsafe.Pointer(foo_key))

    var foo *C.char = C.CString(val.foo)
    defer C.free(unsafe.Pointer(foo))
    foo_val := C.helper_newSVpv(THX, foo, C.int(len(val.foo)))
    C.helper_hv_store(THX, hv, foo_key, C.uint(len("foo")), foo_val, 0)

    // status => 1
    var status_key *C.char = C.CString("status")
    defer C.free(unsafe.Pointer(status_key))

    status_val := C.helper_newSViv(THX, C.int(val.status))
    C.helper_hv_store(THX, hv, status_key, C.uint(len("status")), status_val, 0)

    // ids => [1,2,3]: fill an AV, then store a reference to it in the hash.
    ids := C.helper_newAV(THX)
    for _, id := range val.ids {
        C.helper_av_push(THX, ids, C.helper_newSViv(THX, C.int(id)))
    }

    var ids_key *C.char = C.CString("ids")
    defer C.free(unsafe.Pointer(ids_key))

    ids_val := C.helper_newRV_noinc(THX, (*C.SV)(unsafe.Pointer(ids)))
    C.helper_hv_store(THX, hv, ids_key, C.uint(len("ids")), ids_val, 0)

    // Wrap the hash in a reference; newRV_noinc hands ownership of hv to it.
    rv := C.helper_newRV_noinc(THX, (*C.SV)(unsafe.Pointer(hv)))

    return rv
}
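The same one-step-at-a-time pattern can be generalized to arbitrary nested data with a recursive type switch. Below is a rough sketch in pure Go so that it runs without the Perl headers. The `render` function is hypothetical and only builds a Perl-like literal as text; the comments mark where a cgo version would presumably call the `C.helper_*` wrappers instead.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// render walks a nested Go value and returns a Perl-style literal.
// It is a text-only stand-in: each case notes which helper the cgo
// version would call to build the real SV/AV/HV.
func render(v any) (string, error) {
	switch x := v.(type) {
	case string:
		return strconv.Quote(x), nil // cgo: helper_newSVpv
	case int:
		return strconv.Itoa(x), nil // cgo: helper_newSViv
	case []any:
		// cgo: helper_newAV, helper_av_push per element, helper_newRV_noinc
		parts := make([]string, 0, len(x))
		for _, e := range x {
			s, err := render(e)
			if err != nil {
				return "", err
			}
			parts = append(parts, s)
		}
		return "[" + strings.Join(parts, ", ") + "]", nil
	case map[string]any:
		// cgo: helper_newHV, helper_hv_store per key, helper_newRV_noinc
		keys := make([]string, 0, len(x))
		for k := range x {
			keys = append(keys, k)
		}
		sort.Strings(keys) // only for stable output; Perl hashes are unordered
		parts := make([]string, 0, len(keys))
		for _, k := range keys {
			s, err := render(x[k])
			if err != nil {
				return "", err
			}
			parts = append(parts, k+" => "+s)
		}
		return "{" + strings.Join(parts, ", ") + "}", nil
	default:
		return "", fmt.Errorf("unsupported type %T", v)
	}
}

func main() {
	out, err := render(map[string]any{
		"foo":    "bar",
		"status": 1,
		"ids":    []any{1, 2, 3},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {foo => "bar", ids => [1, 2, 3], status => 1}
}
```

The quoting here uses Go's `strconv.Quote`, which is good enough for display but is not a faithful Perl string quoter; in the cgo version that concern disappears because the bytes go straight into an SV.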

I built this with go build -buildmode=c-archive to get a cgo.a, and added that information to Build.PL like so:

extra_linker_flags => "cgo.a",

While browsing I also saw some concerns about cgo being "slow", so I benchmarked it on my system against a simple XS version (MacBook Pro M1, perl 5.34, go1.22):

                Rate      getit getit_pure
getit       709479/s         --       -82%
getit_pure 3951059/s       457%         --

Up to you to decide if 700K invocations per second is "slow" ;)

Botje answered Sep 20 '25 16:09