"Creating a language on the JVM" a talk by Ola Bini

I was lucky enough to be able to attend a talk organised by London Geek Nights
at the Thoughtworks London office entitled “Creating a language on the JVM” by
Ola Bini. Ola is a committer for JRuby and is also working on his own language,
Ioke.

The following are rough notes that I made during the talk. They’ve been tidied
up a little and I’ve added links where appropriate but I have not put any
effort into making a coherent narrative. Phrases in “double quotes” are
quotations from Ola himself. Ola’s slides are now published here on his
blog.

The first issue Ola addressed was why you would target the JVM when producing
a new language:

  • memory management is robust and efficient
  • HotSpot compiler gives high performance
  • support for concurrency
  • open source, many implementations
  • mature code loading
  • high-level abstraction
  • platform independent
  • reflective access to classes and objects
  • good debugging tools
  • many mature libraries

Ola was asked to comment on the JVM as a platform compared to the CLR. He said
that the JVM is five years older, currently has better optimisation and so
tends to be faster. The CLR has some bytecodes that the JVM lacks but Ola
didn’t seem to think that they were particularly important.

Most of the talk covered general language design with some mention of how some
of these issues related to the JVM with concrete examples from JRuby and Ioke.

JRuby is now almost at version 1.2: “best implementation of ruby for many
purposes.” Ioke is an experiment to see how expressive a language can be. It
has a prototype-based object system.

Decisions when designing a language

  • General purpose or special purpose
  • Paradigm: object-oriented (what kind, eg prototype vs classes)
  • syntax
  • type system: static vs dynamic, weak vs strong
  • other features

One advantage of the JVM is the ability to use Java libraries - however you
might find that the Java libraries are not a good match for your new language.
For example Java does not support closures and so the Java libraries are not
designed with closures in mind; they would therefore not feel natural for a
language such as Ruby. In contrast the Ruby libraries are designed to take
advantage of closures.

Syntax - often separated into lexing and parsing, with both the lexer and
parser being auto-generated from some form of grammar. JRuby uses a hand-coded
lexer and a parser generated by jacc. Ioke uses a lexer/parser generated by
ANTLR.
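
To give a feel for what a hand-coded lexer does, here is a minimal Java sketch.
It is a toy and not JRuby's lexer: the class name ToyLexer, the token spellings
and the tiny operator set are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy lexer for integers, identifiers and one-character operators.
final class ToyLexer {
    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace only separates tokens
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < source.length() && Character.isDigit(source.charAt(i))) {
                    i++;
                }
                tokens.add("INT(" + source.substring(start, i) + ")");
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < source.length() && Character.isLetterOrDigit(source.charAt(i))) {
                    i++;
                }
                tokens.add("IDENT(" + source.substring(start, i) + ")");
            } else if ("+-*/()=".indexOf(c) >= 0) {
                tokens.add("OP(" + c + ")");
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character '" + c + "' at " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [IDENT(x), OP(=), INT(1), OP(+), INT(23)]
        System.out.println(tokenize("x = 1 + 23"));
    }
}
```

A parser would then consume this token list rather than raw characters.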

The JRuby lexer is complicated (approx. 2400 LOC) both because of the nature of
Ruby itself and also because in JRuby it’s possible to turn individual Ruby
1.9 features on and off.

Type checking - what kind: nominal or structural?

  • System F
  • typed lambda calculus
  • parametric polymorphism
  • variance

Scala & Haskell are currently the state of the art in type checking.

An issue to consider is the notion of bottom types - types that are a subtype
of every type, for example null in Java and C#. Note that nil in dynamic
languages is not the same thing. It’s been proven (apparently) that if you
have bottom types then you can’t avoid runtime errors. Several languages go
for option or maybe types instead.
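
A minimal Java sketch of the contrast between null and an option type. Note
that java.util.Optional only arrived in Java 8, well after this talk; the
Registry class and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup table used only to contrast null with an option type.
final class Registry {
    private final Map<String, String> entries = new HashMap<>();

    Registry() {
        entries.put("jruby", "Ruby on the JVM");
    }

    // Returns null when the key is missing; nothing forces callers to check.
    String describeOrNull(String name) {
        return entries.get(name);
    }

    // Returns an Optional, so absence is visible in the declared return type.
    Optional<String> describe(String name) {
        return Optional.ofNullable(entries.get(name));
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        // registry.describeOrNull("ioke").length() would fail at run time
        // with a NullPointerException.
        int length = registry.describe("ioke").map(String::length).orElse(0);
        System.out.println(length); // prints 0
    }
}
```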

Scoping - dynamic, object or lexical. Emacs Lisp is the only currently used
dynamically scoped language.

Control structures - expressions vs statements

  • precedence, associativity
  • initialisation
  • assignment - not all languages have it; for example in Erlang what looks
    like assignment is actually unification
  • ordering within expressions
  • short circuiting

Control flow

  • goto style vs continuations
  • iteration
  • recursion

What about operations such as Ruby’s conditional assignment operator ||= (only
assign a value to a variable if it is not already bound, or if it is currently
nil or false)?

How about the ternary if - useful if not everything is an expression (but Ruby
has a ternary if anyway since it’s more compact than the standard if form).

Notions of truth - what counts as true and false?

Data types - do you have primitive versions of types for efficiency? What
constitutes equality?

Parameter passing - call by value, by reference, by name?

Can you return multiple values from functions? This is difficult to do on the
JVM and not very efficient since all return values have to be wrapped. Optional
arguments? Default values?
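
The wrapping cost can be sketched in Java as follows. A JVM method returns a
single value, so a language with multiple return values needs some holder
object; the DivMod class here is a hypothetical example of one, not what JRuby
or Ioke actually use.

```java
// Hypothetical holder type: two results travel inside one heap-allocated object.
final class DivMod {
    final int quotient;
    final int remainder;

    DivMod(int quotient, int remainder) {
        this.quotient = quotient;
        this.remainder = remainder;
    }

    // Each call allocates a wrapper just to hand back two ints.
    static DivMod divMod(int a, int b) {
        return new DivMod(a / b, a % b);
    }

    public static void main(String[] args) {
        DivMod result = divMod(17, 5);
        System.out.println(result.quotient + " " + result.remainder); // prints 3 2
    }
}
```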

The closer the language is to Java the easier it is to make the language
efficient at run-time.

Evaluation - how to evaluate your language? Interpreter, internal bytecode,
Java bytecode, continuation passing.

Java memory leak issue - you need to create a new classloader in order to
reload the code for a class, and this does not get garbage collected (anonymous
classes don’t have this problem?). The CLR is better than Java in this respect.

Internal representation - JRuby wraps all objects so that all JRuby objects
inherit from a common type. However this means that Java objects cannot
inherit from JRuby objects. They are in the process of removing this wrapping.

The visitor pattern is the obvious way to implement an AST interpreter; however
this prevents HotSpot from optimising, so they use a large switch statement.
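
A minimal Java sketch of the switch-based style, assuming a toy three-node
arithmetic AST. The Node class and its type tags are hypothetical and are not
JRuby's actual node classes.

```java
// Hypothetical AST node carrying an integer type tag instead of accepting a visitor.
final class Node {
    static final int LITERAL = 0;
    static final int ADD = 1;
    static final int MUL = 2;

    final int type;
    final int value;  // used by LITERAL only
    final Node left;  // used by ADD and MUL only
    final Node right; // used by ADD and MUL only

    private Node(int type, int value, Node left, Node right) {
        this.type = type;
        this.value = value;
        this.left = left;
        this.right = right;
    }

    static Node lit(int value) { return new Node(LITERAL, value, null, null); }
    static Node add(Node l, Node r) { return new Node(ADD, 0, l, r); }
    static Node mul(Node l, Node r) { return new Node(MUL, 0, l, r); }

    // One switch over the tag replaces the double dispatch of a visitor.
    static int eval(Node n) {
        switch (n.type) {
            case LITERAL: return n.value;
            case ADD:     return eval(n.left) + eval(n.right);
            case MUL:     return eval(n.left) * eval(n.right);
            default:      throw new IllegalStateException("unknown node type " + n.type);
        }
    }

    public static void main(String[] args) {
        // 2 + (3 * 4)
        System.out.println(eval(add(lit(2), mul(lit(3), lit(4))))); // prints 14
    }
}
```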

Final methods can be much faster in Java, so it is worth seeing if you can make
use of them.

“write bad code to get good performance” but “write good code first” (use
profiling).

Error handling

  • exceptions
  • conditions (“a formalised agreement on how to handle
    errors”), can be built using exceptions.

Ola briefly talked about error handling using conditions and how they allow
more flexibility than exceptions. A condition consists of code for signalling
the exception, code for handling it and code for restarting the normal program
flow. Ola showed us how conditions work in Ioke by intentionally introducing
errors in some code and showing how they could be recovered from interactively
using conditions. Whereas exceptions are only for errors, conditions can also
be used for warnings.
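
The signal/handle/restart split can be sketched in Java as below. This is a
hypothetical miniature and not Ioke's implementation: the Conditions class, the
restart names and the use of plain strings for conditions are all assumptions.
The point it tries to show is that the handler runs before the stack is
unwound, so the low-level code can carry on from where the problem occurred.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical miniature condition system built on top of ordinary exceptions.
final class Conditions {
    // Installed handlers, innermost first. A handler maps a condition
    // message to the name of the restart it wants to invoke.
    private static final Deque<Function<String, String>> HANDLERS = new ArrayDeque<>();

    // Runs body with the handler installed, removing it afterwards.
    static <T> T withHandler(Function<String, String> handler, Supplier<T> body) {
        HANDLERS.push(handler);
        try {
            return body.get();
        } finally {
            HANDLERS.pop();
        }
    }

    // Signals a condition: asks the innermost handler to choose a restart,
    // then runs that restart at the point of the signal.
    static <T> T signal(String condition, Map<String, Supplier<T>> restarts) {
        Function<String, String> handler = HANDLERS.peek();
        if (handler == null) {
            throw new IllegalStateException("unhandled condition: " + condition);
        }
        Supplier<T> restart = restarts.get(handler.apply(condition));
        if (restart == null) {
            throw new IllegalStateException("no such restart for: " + condition);
        }
        return restart.get();
    }

    // Low-level code: knows how it could recover, but not which recovery to use.
    static int parseEntry(String text) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            Map<String, Supplier<Integer>> restarts = new HashMap<>();
            restarts.put("use-zero", () -> 0);
            restarts.put("use-minus-one", () -> -1);
            return signal("bad entry: " + text, restarts);
        }
    }

    public static void main(String[] args) {
        // High-level code: chooses the recovery policy.
        int total = withHandler(condition -> "use-zero",
                () -> parseEntry("4") + parseEntry("oops") + parseEntry("6"));
        System.out.println(total); // prints 10
    }
}
```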

Java integration - can be hard to get right. Calling Java methods, including
static ones. Choosing between overloaded implementations. Need to implement
Java types in the host language. How to implement interfaces? Can use a
dynamic proxy but this approach does not work for extending classes. What
about calling methods using super?
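
A minimal sketch of the dynamic proxy approach using java.lang.reflect.Proxy.
The ProxyBridge class is hypothetical, and an object of the hosted language is
modelled here simply as a map from method name to closure, which is an
assumption for illustration rather than how JRuby or Ioke represent objects.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical bridge from "guest" closures to a Java interface.
final class ProxyBridge {
    static <T> T implement(Class<T> iface, Map<String, Function<Object[], Object>> guestMethods) {
        InvocationHandler handler = (proxy, method, args) -> {
            Function<Object[], Object> body = guestMethods.get(method.getName());
            if (body == null) {
                throw new UnsupportedOperationException(method.getName());
            }
            return body.apply(args == null ? new Object[0] : args);
        };
        // Proxy accepts interfaces only; passing a class throws
        // IllegalArgumentException, which is why this cannot extend classes.
        Object instance = Proxy.newProxyInstance(
                ProxyBridge.class.getClassLoader(), new Class<?>[] { iface }, handler);
        return iface.cast(instance);
    }

    @SuppressWarnings("unchecked")
    static List<String> sortByLength(List<String> words) {
        Map<String, Function<Object[], Object>> guest = new HashMap<>();
        guest.put("compare", a -> ((String) a[0]).length() - ((String) a[1]).length());
        Comparator<String> byLength = implement(Comparator.class, guest);
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(byLength); // a Java library calling back into guest code
        return sorted;
    }

    public static void main(String[] args) {
        // prints [Ioke, Java, JRuby, ANTLR]
        System.out.println(sortByLength(Arrays.asList("JRuby", "Ioke", "Java", "ANTLR")));
    }
}
```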

Need to really understand bytecode. Ola uses ASM.

Can use code annotations to get sensible output from the Java debugger.

Limitations of java bytecode

  • statically typed bytecode
  • primitive types but not tagged values
  • no continuations
  • no tail call optimisation
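
One common workaround for the missing tail call optimisation is a trampoline,
sketched below in Java. This was not mentioned in the talk and is not
necessarily what JRuby or Ioke do; the Trampoline and Step names are
hypothetical.

```java
import java.util.function.Supplier;

// Hypothetical trampoline: tail calls are returned as data and run in a loop.
final class Trampoline {
    // A step is either a finished result (next == null) or a pending call.
    static final class Step {
        final long result;
        final Supplier<Step> next;

        private Step(long result, Supplier<Step> next) {
            this.result = result;
            this.next = next;
        }

        static Step done(long result) { return new Step(result, null); }
        static Step more(Supplier<Step> next) { return new Step(0, next); }
    }

    // Tail-recursive in shape, but the recursive call is deferred, not made.
    static Step sumTo(long n, long acc) {
        if (n == 0) {
            return Step.done(acc);
        }
        return Step.more(() -> sumTo(n - 1, acc + n));
    }

    // Runs the steps one at a time, so the Java stack does not grow.
    static long run(Step step) {
        while (step.next != null) {
            step = step.next.get();
        }
        return step.result;
    }

    public static void main(String[] args) {
        // A directly recursive version this deep would risk StackOverflowError.
        System.out.println(run(sumTo(1_000_000, 0))); // prints 500000500000
    }
}
```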

Ioke

Ioke sounds like it has inherited some ideas from Self.

Entire language written test first - rspec cases for entire language, rspec
tests then converted to ispec tests.

Ruby concurrency support is not too good - Ola plans to implement transparent
futures in Ioke.
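
For readers unfamiliar with the term, a transparent future is a value that is
computed in the background but can be used as if it were already there. The
Java sketch below shows the general idea only; the TransparentFuture class is
hypothetical, and whatever Ioke ends up with may look quite different.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical sketch of a transparent future using a dynamic proxy.
final class TransparentFuture {
    // Starts the computation in the background and returns a stand-in that
    // callers use like the finished value.
    static <T> T of(Class<T> iface, Supplier<T> computation) {
        CompletableFuture<T> pending = CompletableFuture.supplyAsync(computation);
        InvocationHandler handler = (proxy, method, args) ->
                // join() blocks only if the value is not ready when it is needed.
                method.invoke(pending.join(), args);
        Object standIn = Proxy.newProxyInstance(
                TransparentFuture.class.getClassLoader(), new Class<?>[] { iface }, handler);
        return iface.cast(standIn);
    }

    public static void main(String[] args) {
        CharSequence text = of(CharSequence.class, () -> "Ioke");
        System.out.println(text.length()); // prints 4
    }
}
```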