Description
What is LLVM-IR?
The LLVM Compiler Infrastructure Project provides a portable intermediate representation (LLVM-IR) that can be compiled to machine code for many different architectures. LLVM-IR is great because you can take any language and distill it into a form that can run on many different machines. Once the code is in IR, it doesn’t matter what platform it was originally written on, and it doesn’t matter that Python can be slow. It doesn’t matter if you have unusual CPUs either: if LLVM supports them, the code will run.

What is Tupleware?
Tupleware is an analytical framework built at Brown University that lets users compile functions into distributed programs that are deployed automatically. Tupleware is unique because it uses LLVM-IR to be language and platform independent.

What is PyLLVM?
This is the heart of the talk. PyLLVM is a simple, easy-to-extend, one-pass static compiler that accepts the subset of Python most likely to be used with Tupleware. PyLLVM is based on an existing project called py2llvm (https://code.google.com/archive/p/py2llvm) that was abandoned around 2011. This talk will go through some basic compiler design and discuss how some LLVM-IR features make our lives easier and others make them much harder. It will cover types, scoping, memory management, and other implementation details. To conclude, it will compare PyLLVM to Numba, a Python-to-LLVM compiler from Continuum Analytics, and touch on what the future has in store for PyLLVM.

Talk Objective
Attendees will learn what LLVM-IR is and how it can be leveraged to let data scientists write their algorithms in Python. They will leave with a high-level understanding of the design process and considerations involved in writing a simple compiler. Finally, they will learn how PyLLVM coaxes Python code into LLVM-IR. It will become evident how cool LLVM and Python can be when they work together!
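To give a feel for what a one-pass Python-to-LLVM-IR compiler does, here is a minimal sketch. It is not PyLLVM's code and does not use PyLLVM's API: the names `TinyCompiler` and `compile_function` are hypothetical, the sketch assumes every value is a 64-bit integer, and it handles only a single function whose body returns an expression built from `+`, `-`, and `*`. It walks the Python AST once with the standard-library `ast` module and emits textual LLVM-IR as it goes.

```python
import ast

# Map Python AST operator nodes to LLVM integer instructions.
_BINOPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}


class TinyCompiler(ast.NodeVisitor):
    """Hypothetical one-pass emitter: visits each AST node once, appending IR lines."""

    def __init__(self):
        self.lines = []
        self.counter = 0

    def fresh(self):
        # Named temporaries such as %t1; a real compiler would also guard
        # against collisions with user-chosen parameter names.
        self.counter += 1
        return f"%t{self.counter}"

    def visit_FunctionDef(self, node):
        # Assumption: every parameter and the return value are i64.
        params = ", ".join(f"i64 %{a.arg}" for a in node.args.args)
        self.lines.append(f"define i64 @{node.name}({params}) {{")
        self.lines.append("entry:")
        for stmt in node.body:
            self.visit(stmt)
        self.lines.append("}")

    def visit_Return(self, node):
        value = self.visit(node.value)
        self.lines.append(f"  ret i64 {value}")

    def visit_BinOp(self, node):
        op = _BINOPS.get(type(node.op))
        if op is None:
            raise NotImplementedError(type(node.op).__name__)
        left = self.visit(node.left)
        right = self.visit(node.right)
        result = self.fresh()
        self.lines.append(f"  {result} = {op} i64 {left}, {right}")
        return result

    def visit_Name(self, node):
        # Only function parameters are supported, so a name maps directly
        # to the LLVM value of the same name.
        return f"%{node.id}"

    def visit_Constant(self, node):
        if isinstance(node.value, bool) or not isinstance(node.value, int):
            raise NotImplementedError("only integer constants are supported")
        return str(node.value)

    def generic_visit(self, node):
        # Anything outside the supported subset is rejected explicitly.
        raise NotImplementedError(type(node).__name__)


def compile_function(source):
    """Compile the first top-level function in `source` to LLVM-IR text."""
    func = ast.parse(source).body[0]
    if not isinstance(func, ast.FunctionDef):
        raise NotImplementedError("expected a function definition")
    compiler = TinyCompiler()
    compiler.visit(func)
    return "\n".join(compiler.lines)


if __name__ == "__main__":
    print(compile_function("def f(a, b):\n    return a * b + 2\n"))
```

Running the script prints a `define i64 @f(i64 %a, i64 %b)` function whose body multiplies the two parameters, adds 2, and returns the result. A real compiler for a Python subset also has to deal with the topics the talk lists (types, scoping, memory management), none of which this sketch attempts.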
Slides available here: http://il.pycon.org/2016/static/sessions/anna-herlihy.pdf