Maniphest T308250

Should Wikifunctions use a WebAssembly runtime?
Closed, Resolved (Public)

Description

WebAssembly (WASM) is an open standard for an efficient and portable bytecode format for a stack-based virtual machine. One of the main goals for WASM is to provide a secure execution environment for untrusted code. Consequently, WASM code runs in a sandboxed environment that is separated from the host and is hardened against certain classes of memory safety bugs.

Programmers don't write WASM bytecode directly. WebAssembly code is generally written in a high-level, statically typed language that is then compiled into WASM bytecode. Languages that support WASM as a compilation target include C, C++, C#, Go, Forth, Lisp, and others. Dynamic, interpreted languages like Python and JavaScript are also supported, typically by compiling the interpreter to WASM.

The features that make WebAssembly attractive for Wikifunctions are:

  • WebAssembly code can execute on both client and server;
  • WebAssembly provides stronger isolation and security guarantees compared to executing native code in containers;
  • WebAssembly is generally fast;
  • WebAssembly gives us a way to support multiple languages with a single, common runtime.

The main disadvantage is that WebAssembly is a relatively new technology and its ecosystem is still maturing.

Let's investigate this!

Thanks to @Joe and @faidon for the suggestion.

Link dump:

Event Timeline

It would be very nice indeed if WebAssembly was the common target.

I will just add a few notes from my experiments here; folks, please correct me.

So I wrote a POC to see how this could work.

import os
import tempfile
import wasmtime

# Set up the engine and a linker that provides the WASI imports.
cfg = wasmtime.Config()
engine = wasmtime.Engine(cfg)
linker = wasmtime.Linker(engine)
linker.define_wasi()

# Load the Python interpreter that was compiled to WASM.
module = wasmtime.Module.from_file(engine, "wasi-python3.13.wasm")

# WASI configuration: the guest's command line and the directory it can see.
config = wasmtime.WasiConfig()
config.argv = ("python", "-c", "print(2+2)")
config.preopen_dir(".", "/")

with tempfile.TemporaryDirectory() as chroot:
    # Redirect the guest's stdout and stderr to files we can read back.
    out_log = os.path.join(chroot, "out.log")
    err_log = os.path.join(chroot, "err.log")
    config.stdout_file = out_log
    config.stderr_file = err_log
    store = wasmtime.Store(engine)
    store.set_wasi(config)
    instance = linker.instantiate(store, module)
    # _start is the WASI entry point; calling it runs the interpreter.
    start = instance.exports(store)["_start"]
    start(store)
    with open(out_log) as f:
        print(f.read())

# based on https://til.simonwillison.net/webassembly/python-in-a-wasm-sandbox

It needs wasmtime installed and Python compiled to WASM; how to do that is described at https://enarx.dev/docs/webassembly/python (that step takes a while).

It seems to work, and I would like to try replacing the current executor with this, but it needs a bit more work than I thought (in particular, it seems that the serializers / deserializers cannot be handed over as function objects, but rather need to be code in the template).

One way or the other, here are the runtime results:

Executing Python on WASM seems to be more than sufficiently fast. The only thing that takes time is loading the Python WASM module: it is about 120 MB, and loading it takes about 18 seconds on my laptop. Executing the script itself takes well under a second.

The module needs to be loaded only once, and can then be used to execute an arbitrary number of scripts. Loading the module should be treated as part of the ramp-up time of the evaluator / executor, and we should write the code accordingly.
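
A minimal sketch of that load-once pattern, assuming the expensive load can be wrapped in a zero-argument callable. The `ModuleCache` class, the `loader` parameter and the `load_count` counter are hypothetical names used for illustration; they are not part of the POC or of the current executor. A stand-in loader is used so the sketch runs without wasmtime or a .wasm file.

```python
from typing import Any, Callable


class ModuleCache:
    """Load an expensive module once and hand the same object to every caller.

    `loader` is a hypothetical zero-argument callable supplied by the caller.
    """

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader
        self._module: Any = None
        self._loaded = False
        self.load_count = 0  # bookkeeping for illustration only

    def warm_up(self) -> None:
        """Pay the load cost once, ideally during service start-up."""
        if not self._loaded:
            self._module = self._loader()
            self._loaded = True
            self.load_count += 1

    def get(self) -> Any:
        """Return the cached module, loading it on first use if needed."""
        self.warm_up()
        return self._module


# Stand-in loader so the sketch runs without wasmtime installed.
cache = ModuleCache(lambda: object())
cache.warm_up()        # ramp-up: the slow step happens here
first = cache.get()    # per-script calls reuse the same module
second = cache.get()
```

With the POC above, the loader would presumably be something like `lambda: wasmtime.Module.from_file(engine, "wasi-python3.13.wasm")`, with a fresh `Store` and `WasiConfig` created for each script. That wiring is an assumption and is untested here.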

For JavaScript, javy https://github.com/bytecodealliance/javy seems to provide the toolchain. I haven't written a POC for that yet.

@DVrandecic AIUI right now we shell out to the interpreter from the function-evaluator service to execute the code; is my understanding correct?

If that's the case, we would just need to execute $wasm_runtime python3.wasm <args> instead of python3 <args>; we don't really need to do the execution from within Python.
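
As a rough sketch of that swap, assuming the evaluator builds an argument list and shells out with `subprocess`: the function names below are hypothetical, and the real function-evaluator code may look quite different. A real runtime may also need extra flags (for example to grant directory access), which are deliberately left out here.

```python
import subprocess
from typing import List, Sequence


def native_command(args: Sequence[str]) -> List[str]:
    """Command line for the current approach: python3 <args>."""
    return ["python3", *args]


def wasm_command(wasm_runtime: str, wasm_module: str,
                 args: Sequence[str]) -> List[str]:
    """Command line for the proposed approach: $wasm_runtime python3.wasm <args>."""
    return [wasm_runtime, wasm_module, *args]


def run(command: Sequence[str]) -> str:
    """Run the command and return its stdout; raises if it exits non-zero."""
    completed = subprocess.run(
        list(command), capture_output=True, text=True, check=True
    )
    return completed.stdout


script_args = ["-c", "print(2+2)"]
old_cmd = native_command(script_args)
# "wasmtime" and "python3.wasm" are placeholders for the runtime and module.
new_cmd = wasm_command("wasmtime", "python3.wasm", script_args)
```

Only the front of the command line changes; the arguments passed to the interpreter stay the same, so the calling code would not need to know which variant is in use.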