
Over the clouds: CPython, Pyodide and SPy

The Python community is awesome.

It is full of great people and minds, and interacting with people at conferences is always nice and stimulating. But one of my favorite things is that over time, after many conferences, talks, pull requests and beers, the personal relationships with some of them strengthen and they become friends.

View from a ski slope in Valtournenche, Italy

I am fortunate enough that two of them, Łukasz Langa and Hood Chatham, accepted my invitation to join me in Cervinia, on the border between the Italian and Swiss Alps, for a week of hacking, winter sports and going literally over the clouds. This is a brief summary of what we did during our time together.

About us

The three of us from the panoramic terrace on top of Klein Matterhorn, Switzerland

Łukasz doesn't need much introduction, as he's one of the most visible personalities of the Python world: among other things, he was the release manager of CPython 3.8 and 3.9, he is the creator of Black, and these days he is the CPython Developer in Residence.

Hood is mainly known for his work on Pyodide, which in my opinion is one of the most underrated projects in the Python world: it allows us to run CPython in the browser by compiling CPython and a huge set of extension modules to WebAssembly, using Emscripten. This may sound easier than it is, because WebAssembly is a very weird and young platform: over the years the Pyodide maintainers have had to develop a considerable number of patches to CPython itself, Emscripten, LLVM, etc. If you use any website or project which allows you to run Python in the browser, such as PyScript, there is a good chance it's actually Pyodide under the hood (pun intended 😅).

As for me, like many other Pythonistas, my name is associated with many projects starting or ending (or both!) with "Py", like PyPy, HPy and PyScript. A couple of years ago I decided that all these Pys weren't enough, so I started SPy, which is an experiment to see whether we can come up with a Python variant which can be easily interpreted (for good development experience) and compiled (for performance). This is the appropriate place to give a big thanks to Anaconda, which currently allows me to work on it full time.

Hacking

There is one thing which face-to-face pairing does and which is impossible to achieve with async and remote collaboration: you can see all the little tricks and tools which other people use in their daily hacking. On the first day, Łukasz showed me the wonder of Xonsh, a cross-platform shell written and scriptable in Python.

Colored TAB completions with fancycompleter

Likewise, he very soon noticed that whenever I pressed TAB at my Python REPL, I got colored completions, thanks to fancycompleter. This is a project which I started ~15 years ago and have used ever since. At that time it couldn't work out of the box on CPython, because it required a patched version of libreadline. Nowadays, however, CPython ships with pyrepl, which does support colored completions out of the box, so we thought it would be a good idea to integrate fancycompleter into CPython. The result of this work is PR #130473: it is still very much WIP, but hopefully I'll be able to continue working on it in the coming days or weeks.

Meanwhile, Hood discovered that the latest version of Pyodide didn't work on iOS. In keeping with the spirit of the week, Łukasz promptly paired with him to fix the issue.

So, with this, we have PRs for two of our three projects. SPy is next in line, but it deserves its own section.

The first SPy program ever 🥸

SPy occupied a significant portion of our week. At the beginning of the week, we dedicated time to explaining the fundamental concepts and design decisions to Łukasz and Hood.

Quoting what I wrote above:

SPy is an experiment to see whether we can come up with a Python variant which can be easily interpreted (for good development experience) and compiled (for performance).

Currently, the documentation is very scarce. The best sources to understand the ideas behind it are probably the slides and recording of the talk which I gave at PyCon and EuroPython.

Now, this is a perfect time for a big disclaimer:

Warning

SPy is at a super early stage, not even alpha quality. It probably contains lots of bugs, the language design is not fully stabilized, and many basic features are probably missing.

That said, SPy can already do interesting things. In particular, after I showed the array example, Łukasz realized that despite the immaturity, SPy is already good enough to speed up one of his generative art projects.

He has already written an extensive post about it, so I'm not going to repeat the full story here. Let me just quote a few intriguing excerpts to pique your interest in reading more:

Let’s get one thing out of the way. SPy is a research project in its early stages at the moment. You should not attempt to use it yet, unless you plan to contribute to it, and even then you have to come with the right mindset. [...]

With all this in mind, SPy looks very attractive to me already. It’s a language designed to be friendly to Python users, but is not attempting to be Python-compatible. It can’t be, because with SPy, user code can be fully compiled to native binaries or WebAssembly.

For the first end-user project in SPy, I decided to convert an existing Genuary entry I made with PyScript that draws an endless abstract topographic map. [...]

This computation was too much for pure Python in either Pyodide or MicroPython to happen inside the animation loop, so in the original project I pre-computed the map area in a Web worker. [...]

The SPy version of the project ditches the Web worker as the computation is over 100X faster. You’re literally waiting longer for the background audio file to load. The result looks exactly the same, which was an important metric for us.

Remember when we said that SPy still misses many basic features? Here is a list of PRs that Łukasz had to make in order to achieve his goals:

Thanks to this, we also got Łukasz as a contributor to SPy. Only Hood was left...

SPy playground

The SPy interpreter is written in Python (for now... eventually, it will be written in SPy itself), and Pyodide/PyScript makes it very easy to run Python in the browser. The goal for Hood and me was to be able to run the SPy interpreter on top of Pyodide, to make it easier for people to try it out.

This proved to be challenging because of the very peculiar way in which SPy relies on WASM. WebAssembly plays a central role in SPy, for two reasons:

  • compilation to .wasm is a first class feature (by converting .spy to .c and then invoking clang)

  • the interpreter uses wasmtime as a sandbox for SPy's application-level memory allocation

The latter point requires some extra explanation: SPy includes a special "unsafe" mode that allows the use of low-level constructs like pointers and structs. These constructs, while powerful, pose a risk of crashing the interpreter due to their unsafe nature. To mitigate this risk, SPy executes these unsafe portions within a WASM sandbox using wasmtime. This approach ensures that any potential crashes are contained within the sandbox, preserving the stability of the interpreter. Additionally, this system is employed to call the runtime library, which is partially written in C. The C code is compiled to WASM and subsequently loaded by wasmtime, providing a secure and efficient execution environment.

Another nice aspect of this architecture is that you can instantiate multiple SPy VMs in the same process, since all the state is stored in the sandboxed WASM memory.
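To make the isolation idea concrete, here is a toy Python model. It is not SPy's actual code, which relies on wasmtime: each hypothetical `ToyVM` owns a `bytearray` standing in for a sandboxed WASM linear memory, so two VMs in the same process cannot see each other's state, and a bad "pointer" raises a Python exception instead of crashing the process. The names `ToyVM`, `store_i32` and `load_i32` are made up for illustration.

```python
import struct


class ToyVM:
    """Toy stand-in for a VM whose state lives in its own linear memory."""

    def __init__(self, size: int = 64) -> None:
        # Each instance gets a private buffer, like a separate WASM memory.
        self.memory = bytearray(size)

    def _check(self, addr: int) -> None:
        # Bounds check: a bad pointer becomes an exception, not a crash.
        if addr < 0 or addr + 4 > len(self.memory):
            raise MemoryError(f"out-of-bounds access at address {addr}")

    def store_i32(self, addr: int, value: int) -> None:
        self._check(addr)
        struct.pack_into("<i", self.memory, addr, value)

    def load_i32(self, addr: int) -> int:
        self._check(addr)
        return struct.unpack_from("<i", self.memory, addr)[0]


vm1 = ToyVM()
vm2 = ToyVM()
vm1.store_i32(0, 42)
vm2.store_i32(0, -7)
print(vm1.load_i32(0), vm2.load_i32(0))  # 42 -7
```

Both VMs write to address 0, yet each reads back its own value, because nothing is shared between the two buffers.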

The challenging part is that wasmtime doesn't run on top of Pyodide. On the other hand, Pyodide literally sits on top of another WASM engine, provided by the browser and exposed by Emscripten.

With this in mind, Hood and I started a SPy branch called pyodide: the plan was to create a layer called llwasm to abstract away the differences between wasmtime and Emscripten, so that we could transparently use one or the other from the SPyVM.

This proved to be more challenging than expected, in part because the WASM API exposed by JavaScript/Emscripten is async, while the one exposed by wasmtime is sync. Anyway, after a few days of intense work we managed to get it working 🎉, although the PR is not merged yet because it requires some polishing.
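The shape of such an abstraction layer can be sketched as follows. This is a hypothetical illustration, not the actual llwasm code: the `Backend` protocol, both backend classes and the `add` export are invented, and plain Python callables stand in for WASM exports. It only shows why the sync/async mismatch hurts: a sync caller can use the first backend directly, while the second one has to be driven by an event loop before a result is available.

```python
import asyncio
from typing import Any, Callable, Coroutine, Dict, Protocol


class Backend(Protocol):
    """What the VM would need from any WASM engine (hypothetical)."""

    def call(self, name: str, *args: int) -> int: ...


class SyncBackend:
    """Stands in for an engine with a blocking API."""

    def __init__(self, exports: Dict[str, Callable[..., int]]) -> None:
        self.exports = exports

    def call(self, name: str, *args: int) -> int:
        return self.exports[name](*args)


class AsyncBackend:
    """Stands in for an engine whose calls return coroutines."""

    def __init__(
        self, exports: Dict[str, Callable[..., Coroutine[Any, Any, int]]]
    ) -> None:
        self.exports = exports

    def call(self, name: str, *args: int) -> int:
        # Block until the coroutine finishes. This only works when no
        # event loop is already running in this thread, which is the
        # kind of constraint that makes the real problem hard.
        return asyncio.run(self.exports[name](*args))


async def async_add(a: int, b: int) -> int:
    await asyncio.sleep(0)  # pretend the engine yields to the event loop
    return a + b


def run(backend: Backend) -> int:
    # VM-side code is written once, against the common interface.
    return backend.call("add", 2, 3)


print(run(SyncBackend({"add": lambda a, b: a + b})))  # 5
print(run(AsyncBackend({"add": async_add})))  # 5
```

Note that `asyncio.run` is just the simplest way to bridge the gap in a standalone script; inside a browser event loop blocking like this is not an option, so a real solution needs a different strategy.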

With that, I could hack together a quick & dirty SPy playground: it is written with PyScript + LTK. My web design skills leave a bit to be desired, so improvements and PRs are totally welcome. You can open it in a separate page to have it full screen, or you can load the inline version below:

The playground lets you see the effect of all the various passes of the SPy pipeline:

  1. --execute: run the code in the SPy interpreter
  2. --parse: parse the code and dump the AST
  3. --redshift: perform redshifting
  4. --cwrite: convert redshifted code into C

One last step is still missing: compiling the generated C code to WASM, which is currently not possible because it would require running clang in the browser. Again, if anybody has any idea on how to make it happen, PRs are welcome.

Conclusion

Łukasz, Antonio and Hood with snowshoes

It has been a fun and productive week! While this post is rich in technical details, I think it's important to highlight the value of personal relationships and the joy of spending time together. A big thank you to Łukasz and Hood for visiting!
