September 13, 2020

Emscripten Notes - C from Desktop to Web

Over the past while I've been using Emscripten to make web builds of my games. This page is a collection of notes about using it on Linux, and some of the changes needed to adapt code to run in the browser. All the lessons were learned by working on Faur, my personal C framework.

Setup

Install from Git Repo

Throughout this page, $EMSCRIPTEN refers to the Emscripten SDK's absolute installation path. I like to put all 3rd party SDKs in /opt:

#
# Create installation path
#
EMSCRIPTEN=/opt/emsdk/
mkdir -p $EMSCRIPTEN
cd $EMSCRIPTEN

#
# Clone the repo
#
git clone https://github.com/emscripten-core/emsdk .

#
# Download and set up tools
#
./emsdk update-tags
./emsdk install latest
./emsdk activate latest

Update

git pull
./emsdk update-tags
./emsdk install latest
./emsdk activate latest

When I was first checking out Emscripten, it targeted asm.js and implemented a partial SDL 1.2 API. Today it builds WebAssembly binaries and the full SDL 2 from source!

You can check the available SDK versions with ./emsdk list.

Reset & Start Over

Sometimes odd things break and you want to reset and re-install everything:

git clean -dfx
git pull
./emsdk update-tags
./emsdk install latest
./emsdk activate latest

The git clean command removes all files that are not part of the emsdk repo:

-d removes untracked directories, not just files
-f forces the removal, which Git refuses by default unless clean.requireForce is set to false in the config
-x also removes files that match .gitignore rules, like build output might.

Building and Running

Building with Regular GNU Make

Emscripten comes with its own build tools like emcc, emmake, and more that correspond to their standard equivalents (see them in $EMSCRIPTEN/upstream/emscripten). To build your program with a makefile called Makefile.emscripten, you would open Bash and run:

#
# Must first set emsdk environment variables
#
source $EMSCRIPTEN/emsdk_env.sh
emmake make -f Makefile.emscripten

However, I want to build with just make -f Makefile.emscripten like for all other platforms, without any other extra stuff. I solved this with a recursive makefile that I run with normal GNU Make, and which then calls emmake on itself:

#
# MY_SECOND_CALL is undefined on the first run
#
ifndef MY_SECOND_CALL
THIS_MAKEFILE := $(firstword $(MAKEFILE_LIST))

BUILD_COMMAND := \
    source $(EMSCRIPTEN)/emsdk_env.sh \
    && emmake $(MAKE) -f $(THIS_MAKEFILE) MY_SECOND_CALL=1

all :
    bash -c "$(BUILD_COMMAND)"

% :
    bash -c "$(BUILD_COMMAND) $@"
else
#
# The actual makefile goes here
# Use aliases like $(CC) and $(CXX) for tools
#
%.c.o : %.c
    $(CC) -c -o $@ $< $(CFLAGS)

...
endif

This conveniently sets the emsdk environment variables as part of the same call, too. Note that Bash is one of the shells that is known to be friendly with emsdk_env.sh.

See my Gamebuino Makefile notes for another example of this pattern in action.

Emscripten Settings

Check out $EMSCRIPTEN/upstream/emscripten/src/settings.js for all the available flags. I build my programs with these, which I pass to the compilers and linker:

MY_EMSCRIPTEN_OPTIONS := \
    -s USE_SDL=2 \
    -s USE_SDL_MIXER=2 \
    -s USE_ZLIB=1 \
    -s USE_LIBPNG=1 \
    -s WASM=1 \
    -s ALLOW_MEMORY_GROWTH=1 \

CFLAGS += $(MY_EMSCRIPTEN_OPTIONS)
CXXFLAGS += $(MY_EMSCRIPTEN_OPTIONS)
LDFLAGS += $(MY_EMSCRIPTEN_OPTIONS)

These flags tell Emscripten what libraries to build with, to target WebAssembly, and to allow the available run-time memory to grow as needed.

Apparently, ALLOW_MEMORY_GROWTH has low performance with asm.js, but is not an issue with WASM. You can also specify the available RAM at program start with INITIAL_MEMORY.

Show Application on an HTML Page

If your build target ends with .html, Emscripten will automatically generate an HTML shell for your application. When you build without optimizations, with -O0, you can see all the un-minified HTML and JavaScript that loads your program, and learn how to customize it.

You can link with --shell-file path/to/shell.html to specify your own template. Here is my own default shell.html, based on the one generated by Emscripten. Note the special {{{ SCRIPT }}} tag near the end, which is replaced at build-time with JavaScript code that loads the WASM program.

Faur's default HTML shell

Run Application on Local Computer

Browsers impose some security-related restrictions on pages loaded from the local file system instead of from a server. This means that you can't just open the generated HTML file to run the program, because the code will not load.

I work around this with a Python 3 script that spins up a server from the target directory and points Firefox to the appropriate address. The script runs as part of the Emscripten makefile's make run target, so I get the same dev flow as if I were working on a desktop app. C is great, but Python is my build-time secret weapon:

import http.server
import socketserver
import subprocess
import threading

host = 'localhost'
port = 0
server = socketserver.TCPServer(
            (host, port), http.server.SimpleHTTPRequestHandler)

with server:
    host, port = server.server_address
    file = 'MY_TARGET.html'

    server_thread = threading.Thread(target = server.serve_forever)
    server_thread.start()

    status, output = subprocess.getstatusoutput(
        f'firefox -new-window http://{host}:{port}/{file}')

    input('\nPress ENTER to exit web server\n\n')

    server.shutdown()
    server_thread.join()

C Language Details

Compiler Flags

I build C and C++ with -pedantic -pedantic-errors -Werror across as many platforms as possible, with some exceptions for old toolchains with very outdated compilers. For Emscripten, I build clean with those settings and with -std=gnu11 for C and -std=gnu++11 for C++. I usually stick to ISO standards, but Emscripten requires GNU for EM_ASM blocks.

If you build with -pedantic, then you may also need -Wno-dollar-in-identifier-extension, because EM_ASM block arguments are assigned automatic names like $0, $1, etc. which are not allowed by the language standard. Similarly, -Wno-gnu-zero-variadic-macro-arguments allows EM_ASM blocks without any arguments.

EM_ASM blocks are for writing inline JavaScript in C code, like you sometimes see assembly language being used on other platforms:

int year = EM_ASM_INT({
    // This is JavaScript!
    return (new Date()).getFullYear();
});

printf("Hello Year %d\n", year);

Casting Function Pointers

Whether you target asm.js or WASM, function pointers are small integers like 1, 2, or 0x38 instead of larger values like 0x56242647907e. These numbers are indexes in function tables instead of addresses to function code, and the pointer type associated with the value identifies the table. When you cast a function pointer to a different type and call it, you end up reading an entry from the wrong table.

Thankfully, void and typed data pointers resolve to the same signature, which allows convenient patterns like using functions with typed pointer arguments in places that take callbacks with generic void* parameters. Being able to write functions like void my_cleanup(mytype* object) instead of void my_cleanup(void* object) is a small win for succinct and readable C.

Note that while convenient, this pattern is undefined behavior according to the C standard. The function types are incompatible because mytype* and void* are incompatible types. Technically, you're not supposed to cast and call one as if it was the other, although doing so is alright as long as your compiler implements the expected behavior.

While functions with matching void* and mytype* arguments might be interchangeable in practice, functions with and functions without return values are definitely not. I once changed a function that used to not return anything to return a number, but I overlooked a place where it was still being used as a callback without a return value... The program kept running fine on amd64 and ARM, but it threw an exception in the browser because it read at an invalid index from the wrong function table.

Order of Evaluating Function Arguments

When you make a call like f(g(x), h(y)), does it matter which one of g(x) and h(y) runs first? If it does, or if you want to guarantee the same order on every platform, then you need to evaluate them separately beforehand:

int g_result = g(x);
int h_result = h(y);

f(g_result, h_result);

I hit this problem while working on procedurally-generated game levels. Both g and h used the same random number generator, and they each received different values from it depending on the order they ran. The generated levels looked different between the desktop and web game builds, despite starting from the same PRNG seed.

Program Flow

The Infinite Game Loop

The old-school game loop might look something like this:

while(game_is_running()) {
    game_run_frame();
}

Unfortunately this simple pattern does not work here. An infinite game loop would never give the browser a chance to handle user input or render any updates. Instead, the pattern is to run a single game frame and hand control back to the browser. You do so by registering a frame callback and giving up control of execution:

void game_loop_callback(void)
{
    if(game_is_running()) {
        game_run_frame();
    } else {
        emscripten_cancel_main_loop();
    }
}

emscripten_set_main_loop(game_loop_callback, 0, true);

In the call to emscripten_set_main_loop,

0 lets the browser decide how often to call your callback, like v-sync. You can pass a positive integer to set a specific frame rate.
true stops execution at this point, so the next code of yours that runs will be the loop callback.

Exiting the Program

Emscripten programs call emscripten_force_exit(int code) where they would normally call exit(int code). I use this function at the end of a fatal error handler routine, otherwise the application naturally exits when the browser is closed.

Using Files

Embedded Files

Emscripten makes it easy to bundle up and embed files and directories for the application to use, like music and images. You can add them with --preload-file linker flags that identify local files and set their run-time paths.

Here is an example project with separate trees for assets, source code, and build files:

Project/
├── assets/
│   ├── image.png
│   └── sound.wav
├── build/
│   ├── Makefile
│   ├── Project.data
│   ├── Project.html
│   ├── Project.js
│   └── Project.wasm
└── source/
    └── main.c

You would build Project.html like this:

cd Project/build/
source $EMSCRIPTEN/emsdk_env.sh
emmake make

If the application wants to access Project/assets/image.png as assets/image.png at run-time, and given that we are building from Project/build/, the preload flag should look like so:

# --preload-file <build-time path>@<run-time path>
--preload-file ../assets/image.png@assets/image.png

Finally, you can declare all the files and automate the flags:

MY_FILES := assets/image.png assets/sound.wav
LDFLAGS += $(foreach f, $(MY_FILES), --preload-file ../$(f)@$(f))

Writing Files to Persistent Storage

Emscripten uses the ephemeral MEMFS as the default backing file system, so everything your program writes is lost when the browser is closed. To persist files between sessions, you need to use IDBFS instead.

IDBFS has to be initialized before you can use it, and you have to link with -lidbfs.js. The init code should be placed somewhere early in main, before any other code that uses files. In this example, /my-idbfs is the starting point to all R/W paths, so the files you might work with would look like /my-idbfs/hiscore.sav, not ./hiscore.sav.

EM_ASM({
    FS.mkdir("/my-idbfs");
    FS.mount(IDBFS, {}, "/my-idbfs");

    Module.fs_is_ready = 0;

    FS.syncfs(
        true,
        function(Error)
        {
            // TODO: check Error
            Module.fs_is_ready = 1;
        }
    );
});

The Module.fs_is_ready flag is set in the post-init callback, which runs asynchronously. Module is a global Emscripten object that represents the running application. It is created in the HTML shell, and is a convenient place to attach our own JavaScript context data, like this flag.

IDBFS starts up very quickly, so to keep things simple the loop callback could just drop every frame until the post-init callback sets fs_is_ready. Going back to the loop example from before:

void game_loop_callback(void)
{
    if(!EM_ASM_INT({ return Module.fs_is_ready; })) {
        return;
    }

    if(game_is_running()) {
        game_run_frame();
    } else {
        emscripten_cancel_main_loop();
    }
}

Finally, I like to queue a file system storage sync after every write. Note FS.syncfs's boolean direction parameter: we want to flush file system memory to storage after a write, while previously we wanted to initialize the memory with data from storage.

fwrite(...);
EM_ASM({ FS.syncfs(false, function(Error) {}); });

Using the SDL 2 Library

Getting the Browser Window Size

This is useful for making a full-size screen that fills the entire window:

int width = EM_ASM_INT({ return window.innerWidth; });
int height = EM_ASM_INT({ return window.innerHeight; });

You can then pass these values to SDL_CreateWindow, SDL_RenderSetLogicalSize, etc. as appropriate. I use this together with a special HTML shell that shows a full-sized canvas screen and nothing else.

Enable Sound in Chrome

Annoying autoplay videos are why we can't have nice things. Chrome, and maybe other browsers too, automatically block audio on a page so your new application will be muted.

There are ways to enable audio programmatically, but they have to happen in the call stack of a user-initiated interaction, like a key press or a mouse click. This JavaScript code in the HTML shell fixes SDL 2 sound:

function audio_fix()
{
    try {
        if(!Module.SDL2
            || !Module.SDL2.audioContext
            || !Module.SDL2.audioContext.resume) {

            return;
        }

        if(Module.SDL2.audioContext.state == 'suspended') {
            Module.SDL2.audioContext.resume();
        }

        if(Module.SDL2.audioContext.state == 'running') {
            document.removeEventListener('keydown', audio_fix);
            document.removeEventListener('click', audio_fix);
        }
    } catch(e) {
        Module.printErr(e.toString());
    }
}

document.addEventListener('keydown', audio_fix);
document.addEventListener('click', audio_fix);

audio_fix is a handler for key and click events, so the fix will run in the context of a user interaction as required. You can find more extensive fixes that go beyond just SDL 2 in the GitHub issue linked above, but this was all I needed myself.

Using Game Controllers in Web Browsers

In the interest of user privacy, browsers do not automatically allow web pages to use game controllers. First the page has to load, then the user needs to press an input on the controller, and finally your program must handle an add-controller event prompted by the user's press.

So, the common pattern of opening all the attached controllers in a loop at start-up does not work with Emscripten. SDL_NumJoysticks always returns 0 at that point:

for(int j = 0; j < SDL_NumJoysticks(); j++) {
    SDL_Joystick* joystick = SDL_JoystickOpen(j);
}

Typical SDL applications have an event loop that runs every frame, where you can now also handle the Added and Removed controller events:

for(SDL_Event event; SDL_PollEvent(&event); ) {
    switch(event.type) {
        case SDL_JOYDEVICEADDED: {
            int index = event.jdevice.which;
            SDL_Joystick* joystick = SDL_JoystickOpen(index);
            SDL_JoystickID id = SDL_JoystickInstanceID(joystick);

            // Add new controller and note its id
        } break;

        case SDL_JOYDEVICEREMOVED: {
            // Remove any existing controller whose
            // SDL_JoystickID matches event.jdevice.which
        } break;
    }
}

Epilogue

Emscripten and Arduino are some of my framework's most unusual build targets. I think cross-platform software benefits most from these sorts of systems, because they have interesting platform-specific needs that you have to abstract away, in turn making the entire project more robust and more portable.

A cross-platform game I'm working on