Cryptid.me

C++ Embeddable Language Survey

C++ has a variety of embeddable languages available for integration. Almost any language can be hacked into a semi-embeddable context, usually by putting the runtime into its own process and adding many FFI hacks. This survey only considers languages where you can actually create multiple isolated VM contexts per process.

TinyScheme

TinyScheme is a very minimal implementation of a subset of the R5RS Scheme standard. It was originally a fork of MiniScheme, and while information is very sparse online, the license is dated to 2000.

TinyScheme is best known for its integration as a scripting language into the open-source GIMP image editor, where the system is called “Script-Fu”.

Dependencies

TinyScheme is probably not available in your repositories and is easiest to build from source. Fetch the latest release into our working directory and kick off the build.

> wget https://downloads.sourceforge.net/project/tinyscheme/tinyscheme/tinyscheme-1.41/tinyscheme-1.41.tar.gz
HTTP request sent, awaiting response..
> tar -zxvf tinyscheme-1.41.tar.gz
tinyscheme-1.41/BUILDING
tinyscheme-1.41/CHANGES ..
> cd tinyscheme-1.41 && make
gcc -fpic -pedantic -I. -c -g -Wno-char-subscripts -O -DSUN_DL=1 -DUSE_DL=1 -DUSE_MATH=1 -DUSE_ASCII_NAMES=0  scheme.c ...
> cd ..

This won’t produce a system-wide install; the library stays in the build directory. TinyScheme can be built as either a dynamically or statically linked library.

VM Setup

Add the related headers. The “USE_INTERFACE” preprocessor macro is required: it enables the table of function pointers (reached through context->vptr) that the code below calls through.

#include <iostream>
#include <stdexcept>
#include <string>

// enable the vptr function table used below
#define USE_INTERFACE 1

#include "scheme.h"
#include "scheme-private.h"

Creating a new interpreter context is simple. Because this is a C API, error checking has to be done manually. The documentation is not perfect, but you can learn most of what you need by inspecting the provided C header files.

// create interpreter context
scheme* const context = scheme_init_new();

// check for correct initialization
if(context == NULL){
  throw std::runtime_error("Failed To Initialize Scheme Context");
}

// add function to interpreter context
context->vptr->scheme_define(
  context,                                         // scheme interpreter
  context->global_env,                             // environment for symbol
  context->vptr->mk_symbol(context, "myprint"),    // symbol to bind
  context->vptr->mk_foreign_func(context, myPrint) // value to set
);

scheme_load_string evaluates our program inside the interpreter context. After execution the environment’s state is kept, so later calls can build on it.

const std::string script = "(myprint \"hello scheme!\")";

scheme_load_string(context, script.c_str());

Once you are completely finished with the interpreter context, clean it up manually. Nothing in the C API releases it automatically when it goes out of scope.

scheme_deinit(context);
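Since cleanup is manual, one option on the C++ side is to tie it to scope with std::unique_ptr and a custom deleter. The sketch below is only an illustration of that pattern: FakeContext, fake_init_new and fake_deinit are hypothetical stand-ins for a C-style create/destroy pair, not TinyScheme functions, and the counter exists purely so the teardown can be observed.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for a C interpreter handle (not TinyScheme's `scheme`).
struct FakeContext {
  bool alive;
};

// Counts teardown calls so the behaviour can be observed.
static int g_deinitCalls = 0;

// Stand-ins for a C-style create/destroy pair (assumed names, not a real API).
FakeContext* fake_init_new() { return new FakeContext{true}; }

void fake_deinit(FakeContext* context) {
  assert(context != nullptr);
  ++g_deinitCalls;
  delete context;
}

// Deleter that unique_ptr invokes when the owner goes out of scope.
struct ContextDeleter {
  void operator()(FakeContext* context) const { fake_deinit(context); }
};

using ContextPtr = std::unique_ptr<FakeContext, ContextDeleter>;

// Create a context, turning a NULL return into an exception.
ContextPtr makeContext() {
  ContextPtr context(fake_init_new());
  if (!context) {
    throw std::runtime_error("Failed To Initialize Context");
  }
  return context;
}

// Simulates a failing script: the context is still torn down during unwinding.
bool throwsButCleansUp() {
  try {
    ContextPtr context = makeContext();
    throw std::runtime_error("script failed");
  } catch (const std::runtime_error&) {
    return true;
  }
  return false;
}
```

With a real interpreter you would swap the stand-ins for its own init and deinit calls. The point of the design is that an early return or an exception can no longer skip the cleanup.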

C++ Printer

Because TinyScheme is in the Lisp family, arguments are passed in as a list. Linked lists are split into car, the head of the list, and cdr, the rest of the list. Because we only care about a single parameter, we take the head of the list, ensure it’s a string, and print it.

pointer myPrint(scheme* const context, pointer args){
  if(args == context->NIL){
    throw std::runtime_error("No Arguments");
  }

  if(!context->vptr->is_string(context->vptr->pair_car(args))){
    throw std::runtime_error("Argument Not String");
  }

  const char* const text = context->vptr->string_value(context->vptr->pair_car(args));

  std::cout << text << std::endl;

  return context->NIL;
}

TinyScheme always expects a return value, so by default we give NIL.
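To make the car / cdr split more concrete, here is a small model of a Lisp-style argument list in plain C++. It is a simplified sketch and not TinyScheme's internal cell layout: Cell, cons, firstArg and collectArgs are names made up for this illustration, and a null pointer plays the role of NIL.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Simplified model of a Lisp pair (assumed layout, for illustration only).
struct Cell {
  std::string car;            // head: the value stored in this pair
  std::shared_ptr<Cell> cdr;  // tail: rest of the list, nullptr acts as NIL
};

using List = std::shared_ptr<Cell>;

// cons: prepend a value to an existing list.
List cons(const std::string& head, const List& tail) {
  return std::make_shared<Cell>(Cell{head, tail});
}

// car of the argument list, i.e. the single parameter a printer would read.
const std::string& firstArg(const List& args) {
  assert(args != nullptr && "empty argument list");
  return args->car;
}

// Walk the list: take car, then move on to cdr, until NIL is reached.
std::vector<std::string> collectArgs(List args) {
  std::vector<std::string> out;
  while (args != nullptr) {
    out.push_back(args->car);
    args = args->cdr;
  }
  return out;
}
```

Calling firstArg on cons("hello", cons("world", nullptr)) gives back "hello", which mirrors taking pair_car of the argument list in the printer above. A function that accepts several parameters would walk the cdr chain the way collectArgs does.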

Compiling

Standard compilation: add the TinyScheme build directory to both the include and library search paths, then link against the library built above.

g++ -o main main.cpp -I tinyscheme-1.41 -L tinyscheme-1.41 -l tinyscheme -l dl

Result

> ./main
hello scheme!

V8 JavaScript

V8 is Google’s JavaScript engine. It was released in 2008, supports JIT compilation, is BSD licensed, and has full support for modern JavaScript features.

The engine is the core of the Chrome browser, and the Node.js run-time environment. Compared to alternatives, it is incredibly complex.

Dependencies

To build V8 you have to use Google’s own tool chain. Download it into its own directory, and add it to your path.

> git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
Cloning into 'depot_tools'...
> export PATH=`pwd`/depot_tools:"$PATH"

Google’s tools expect Python 2.7, but on Arch the system-wide python now points to Python 3. The easiest fix, and it is a bad hack, is to temporarily repoint the system-wide python symlink at Python 2. Remember to change it back afterwards, or the rest of the system will have problems.

> sudo ln -sf /usr/bin/python2 /usr/bin/python
> python --version
Python 2.7.15

Use Google’s tools to fetch the V8 source, and generate a build configuration.

> fetch v8 && cd v8
Running: gclient root ...
> tools/dev/v8gen.py x64.release

Edit the build configuration to produce a static library instead. This should open in Vi by default.

> gn args out.gn/x64.release

Add these lines to the end and save.

is_component_build = false
v8_static_library = true
use_custom_libcxx = false
use_custom_libcxx_for_host = false

Ninja is the actual build system, and plays the same role as Make.

> ninja -C out.gn/x64.release
ninja: Entering directory `out.gn/x64.release'
[0/1650] CXX obj/torque/scope.o ...
> cd ..

Make sure to change the Python version back now that we are done.

> sudo ln -sf /usr/bin/python3 /usr/bin/python
> python --version
Python 3.6.5

V8 expects several files to be present at run-time for our executable.

cp v8/out.gn/x64.release/*.bin .

VM Setup

Add related headers.

#include <iostream>
#include <memory>
#include <stdexcept>

#include "libplatform/libplatform.h"
#include "v8.h"

V8 itself and a couple of its dependency libraries must be initialized before use. This does not set up a VM context for execution.

v8::V8::InitializeICUDefaultLocation("");
v8::V8::InitializeExternalStartupData("");
std::unique_ptr<v8::Platform> platform = v8::platform::NewDefaultPlatform();
v8::V8::InitializePlatform(platform.get());
v8::V8::Initialize();

Create the actual VM for our contexts to exist in.

v8::Isolate::CreateParams params;
params.array_buffer_allocator = v8::ArrayBuffer::Allocator::NewDefaultAllocator();
v8::Isolate* isolate = v8::Isolate::New(params);

Set up stack-allocated helpers that bind a specific isolate for the rest of the scope.

v8::Isolate::Scope isolate_scope(isolate);
v8::HandleScope handle_scope(isolate);

We can create an environment for an isolate that we will use when creating our context. Here we register our printing function into the environment.

v8::Local<v8::ObjectTemplate> global = v8::ObjectTemplate::New(isolate);

global->Set(
  v8::String::NewFromUtf8(isolate, "myprint", v8::NewStringType::kNormal).ToLocalChecked(),
  v8::FunctionTemplate::New(isolate, myPrint)
);

A context is an isolated view of an environment inside an isolate.

v8::Local<v8::Context> context = v8::Context::New(isolate, NULL, global);

Enter our context and compile the source. Now we are ready to run the code.

v8::Context::Scope context_scope(context);
v8::Local<v8::String> source = v8::String::NewFromUtf8(isolate, "myprint('hello javascript!')", v8::NewStringType::kNormal).ToLocalChecked();
v8::Local<v8::Script> script = v8::Script::Compile(context, source).ToLocalChecked();

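Run the script in the context. Run returns a MaybeLocal, which is empty if the script threw a JavaScript exception. Errors are not raised as C++ exceptions; a v8::TryCatch in scope can be used to inspect them.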

script->Run(context);

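Clean up our VM. Note that this is the isolate, not our context. The scope objects and handles created above must already have gone out of scope before the isolate is disposed.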

isolate->Dispose();
delete params.array_buffer_allocator;

Clean up V8 itself.

v8::V8::Dispose();
v8::V8::ShutdownPlatform();

C++ Printer

V8 passes everything related to a callback in a single data structure. This includes the parameters and other information, such as the isolate.

void myPrint(v8::FunctionCallbackInfo<v8::Value> const& args){
  if(args.Length() != 1){
    throw std::runtime_error("myprint expected exactly 1 parameter");
  }

  v8::HandleScope handle_scope(args.GetIsolate());

  v8::String::Utf8Value str(args.GetIsolate(), args[0]);

  std::cout << *str << std::endl;
}

The HandleScope ensures anything that is done inside the callback is cleaned up before exiting the handler.
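The idea behind a handle scope can be sketched without V8 at all. The model below is conceptual only and is not the V8 API: ToyIsolate, ToyHandleScope and toyCallback are invented names, and a vector of strings stands in for the engine's handle storage. The scope remembers how many handles exist when it is created and drops everything added after that point when it is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Conceptual model only; the "isolate" owns one stack of live handles.
struct ToyIsolate {
  std::vector<std::string> handles;
};

// Remembers the handle count on entry and restores it on exit.
class ToyHandleScope {
 public:
  explicit ToyHandleScope(ToyIsolate& isolate)
      : isolate_(isolate), mark_(isolate.handles.size()) {}

  ~ToyHandleScope() {
    assert(isolate_.handles.size() >= mark_);
    isolate_.handles.resize(mark_);  // drop handles created inside the scope
  }

  ToyHandleScope(const ToyHandleScope&) = delete;
  ToyHandleScope& operator=(const ToyHandleScope&) = delete;

 private:
  ToyIsolate& isolate_;
  std::size_t mark_;
};

// A callback that allocates temporaries inside its own scope.
std::size_t toyCallback(ToyIsolate& isolate) {
  ToyHandleScope scope(isolate);
  isolate.handles.push_back("temp string");
  isolate.handles.push_back("temp number");
  return isolate.handles.size();  // handles alive while the callback runs
}
```

If the isolate already holds one handle, toyCallback sees three while it runs and leaves exactly one behind when it returns, which is the cleanup behaviour described above.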

Compiling

The actual compiler arguments are rather complicated for this case. We add the v8 include path, link in pthread / related dependencies, and set the C++ standard version.

g++ -o main main.cpp -I v8/include -Wl,--start-group v8/out.gn/x64.release/obj/{libv8_{base,libbase,external_snapshot,libplatform,libsampler},third_party/icu/libicu{uc,i18n},src/inspector/libinspector}.a -Wl,--end-group -lrt -ldl -pthread -std=c++0x

In the middle we get the uncommon “-Wl,--start-group … -Wl,--end-group” segment. It tells the linker that the archives inside the group have circular dependencies, so it has to rescan them until no new symbols can be resolved.

Result

> ./main
hello javascript!

Lua

Lua is by far the most common choice for an embeddable programming language. It has both interpreted and JIT-compiled variants, and is considered both fast and minimal. The primary abstraction is the table, which is used for essentially everything.

It has been used as a scripting language for many popular games such as Baldur’s Gate, World of Warcraft, and Garry’s Mod.

Dependencies

Because of how common Lua is, it’s likely already available in your distribution’s package manager.

sudo pacman -S lua

I’m on Arch so it’s a quick call to Pacman for Lua 5.3.

VM Setup

Add the required headers.

#include <iostream>
#include <stdexcept>
#include <string>

#include <lua.hpp>

Creating a new state produces an isolated context. You can have thousands of these without issue. Because this is primarily a C library, error checking has to be done manually.

lua_State* const context = luaL_newstate();

if(context == NULL){
  throw std::runtime_error("Failed To Initialize Lua Context");
}

Adding our custom function to the environment is pretty simple. Lua also has a standard library that you can load into the environment with luaL_openlibs. By default nothing is included, and here we just use the raw context.

lua_register(context, "myprint", myPrint);

We can take our hello world and load and execute it in the context. The script runs until it has nothing left to do, and the state is kept once it stops. If the script defines Lua functions, executing it imports them into the context, and you could later call them from the C++ side (not demonstrated here).

// our program
const std::string script = "myprint(\"hello lua!\")";

// execute script
const int failed = luaL_dostring(context, script.c_str());

// check for errors
if(failed){
  throw std::runtime_error(
    "Lua Error: " + std::string(lua_tostring(context, -1))
  );
}

Final cleanup once you are totally done with your context.

lua_close(context);

C++ Printer

The Lua C interface is stack based and, as in Lua itself, indexing starts at 1, not 0.

int myPrint(lua_State* const context){
  // count arguments on stack
  if(lua_gettop(context) != 1){
    return luaL_error(context, "myprint expects 1 argument");
  }

  // get first parameter on stack
  const char* const text = luaL_checkstring(context, 1);

  std::cout << text << std::endl;

  return 0; // no results pushed to stack
}

You can push multiple result values back on the stack when you are finished. You just have to tell Lua how many items you pushed via the C return. In our case we did not push anything so the return value is 0.
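The index convention and the "return how many values you pushed" rule are easier to see in a toy model. The sketch below is not the Lua API: ToyStack and pushTwoResults are made-up names that only imitate the convention, where positive indices count from the bottom starting at 1 and negative indices count from the top starting at -1.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Toy model of the index convention only (assumed names, not Lua functions).
class ToyStack {
 public:
  void push(const std::string& value) { slots_.push_back(value); }

  // Number of values on the stack, in the spirit of lua_gettop.
  int top() const { return static_cast<int>(slots_.size()); }

  // Positive indices count from the bottom starting at 1,
  // negative indices count from the top starting at -1.
  const std::string& at(int index) const {
    const int size = top();
    const int absolute = index > 0 ? index : size + index + 1;
    if (index == 0 || absolute < 1 || absolute > size) {
      throw std::out_of_range("invalid stack index");
    }
    return slots_[static_cast<std::size_t>(absolute - 1)];
  }

 private:
  std::vector<std::string> slots_;
};

// Imitates a callback that pushes two results and reports the count.
int pushTwoResults(ToyStack& stack) {
  stack.push("first result");
  stack.push("second result");
  return 2;  // number of results pushed
}
```

In this model the first argument sits at index 1, the most recent result sits at index -1, and index 0 is never valid. The returned count tells the caller how many of the topmost values are results.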

Compiling

Standard procedure plus telling the compiler to use the system Lua install.

g++ -o main main.cpp -l lua

Result

> ./main
hello lua!

ChaiScript

ChaiScript is the new kid on the block, having only been started in 2009. It is designed to integrate closely with C++, and this is clear when using it.

It is a standard imperative language fairly similar to JavaScript, but with an actual class system and other features.

Dependencies

ChaiScript is distributed by default as a header-only library. Take the latest distribution from the set of releases and extract it into the project’s working directory.

> wget https://github.com/ChaiScript/ChaiScript/archive/v6.1.0.tar.gz
HTTP request sent, awaiting response..
> tar -zxvf v6.1.0.tar.gz
ChaiScript-6.1.0/unittests/vector_push_empty.chai
ChaiScript-6.1.0/unittests/zip_with.chai ..

There is no extra build phase required. That’s it.

VM Setup

Add the required headers.

#include <iostream>
#include <string>

#include <chaiscript/chaiscript.hpp>

This is the total requirement for setting up a ChaiScript context and adding to the FFI. There is no extra error handling required as it makes use of the C++ exception system and will automatically throw if anything goes wrong.

chaiscript::ChaiScript context;

context.add(chaiscript::fun(&myPrint), "myprint");

Evaluation is just as simple and needs little explanation.

const std::string script = "myprint(\"hello chai!\");";

context.eval(script);

C++ Printer

What is most impressive about ChaiScript is this segment. Notice the type signature of our print function: it is an ordinary C++ function, with no interpreter-specific types or argument unpacking.

void myPrint(const std::string text){
  std::cout << text << std::endl;
}

ChaiScript can interface directly with the common types you are likely to use by taking advantage of advanced template features available in recent C++ standards. Because of these implementation details, the C++14 standard is required.
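To give a feel for the kind of template machinery involved, here is a small C++14 sketch that deduces a function's parameter types and wraps it behind one uniform, type-erased signature. This is not how ChaiScript is implemented and none of these names come from it: wrap, fromText, Erased, repeatPrint and g_log are assumptions made up for the illustration, and arguments arrive as text instead of boxed script values.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <sstream>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Convert one textual argument to the type the C++ function expects.
template <typename T>
T fromText(const std::string& text) {
  std::istringstream in(text);
  T value{};
  if (!(in >> value)) throw std::runtime_error("bad argument: " + text);
  return value;
}

// Strings pass through unchanged (operator>> would stop at whitespace).
template <>
inline std::string fromText<std::string>(const std::string& text) {
  return text;
}

// The uniform signature a scripting layer could store in a lookup table.
using Erased = std::function<void(const std::vector<std::string>&)>;

template <typename... Args, std::size_t... I>
Erased wrapImpl(void (*fn)(Args...), std::index_sequence<I...>) {
  return [fn](const std::vector<std::string>& args) {
    if (args.size() != sizeof...(Args)) {
      throw std::runtime_error("wrong argument count");
    }
    // Convert argument I to the I-th parameter type, then call the function.
    fn(fromText<std::decay_t<Args>>(args[I])...);
  };
}

// Deduce the parameter list from the function pointer itself.
template <typename... Args>
Erased wrap(void (*fn)(Args...)) {
  return wrapImpl(fn, std::index_sequence_for<Args...>{});
}

// Example target: an ordinary function with no scripting-specific types.
std::string g_log;
void repeatPrint(std::string text, int times) {
  for (int i = 0; i < times; ++i) g_log += text;
}
```

Calling wrap(&repeatPrint) yields a callable that accepts {"hi", "3"}, converts each entry to the matching parameter type, and invokes the original function. The caller never has to spell out the signature, which is the convenience the printer above relies on.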

Compiling

We set the language standard, add ChaiScript to our include path, and link with required runtime libraries.

g++ -o main main.cpp -std=c++14 -I ChaiScript-6.1.0/include/ -l pthread -l dl

Result

> ./main
hello chai!