
Jython 3 Roadmap

This discussion document attempts to outline the steps to Jython 3, defined by the MVP Features. There are probably glaring omissions. It is deliberately without dates.

Apart from delivering the features, it aims to satisfy certain voluntary constraints, perceived as healthy in the long-term:

In the interests of the second of these objectives (history), wherever the following implies that we implement some class or package, the default approach will be to build on and credit prior art. To achieve this for each source file, we git-move the closest corresponding Jython 2 file, commit, and only afterwards modify it.

The middle commit will often produce a version that does not build, and should not be pushed to the project repository as the tip. Subsequent editing and another commit will correct that. Dead code, normally a Bad Thing, should remain until we know we won’t resurrect it to supply a later feature. It may be necessary to create stubs to satisfy references in un-resurrected code.

A Sketchy Plan

Scorched Earth

  1. Restructure the code base so it builds with Gradle. Legacy code stays where it is, waiting to be moved and modified.

  2. The first build target is a library jython-3.8a1-DEV, but initially it will be empty.

  3. Provide some basic landmarks (modular project structure) and a convention for controlling log messages by sub-system. (It’s a debugging aid for us and evolves to information for production use.)
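
As an analogy for the per-sub-system logging convention in step 3, here is a minimal sketch using hierarchical logger names. It is written with Python's `logging` module purely for illustration; the roadmap does not name a Java logging framework, and the logger names `jython3`, `jython3.importer` and `jython3.runtime` are hypothetical.

```python
import logging

# Hypothetical sub-system names; the roadmap does not define them.
ROOT = "jython3"
IMPORTER = ROOT + ".importer"
RUNTIME = ROOT + ".runtime"

# Quiet by default across the whole library ...
logging.getLogger(ROOT).setLevel(logging.WARNING)
# ... but verbose for the one sub-system being debugged.
logging.getLogger(IMPORTER).setLevel(logging.DEBUG)

importer_log = logging.getLogger(IMPORTER)
runtime_log = logging.getLogger(RUNTIME)  # no level set: inherits WARNING from ROOT

importer_debug = importer_log.isEnabledFor(logging.DEBUG)
runtime_debug = runtime_log.isEnabledFor(logging.DEBUG)
```

The point of the convention is that one setting on a parent name controls a whole sub-tree, while a single sub-system can be turned up without touching the others.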

Type and Arithmetic

  1. Outline architecture for objects, types, operations, slots. Specify the abstract object API (analogous to CPython’s).

  2. Implement PyBaseObject, PyType, and some simple types (mostly without operations). Implement only the Java API and write JUnit tests for type construction and inheritance.

  3. Implement type slots, add slot functions to simple types, and implement an abstract object API. Test the whole via JUnit, calling the abstract object API. From here on, new operations added imply new abstract object API and JUnit tests to match.

  4. Validation that acceptable performance is achieved, invoking arithmetic operations through the API. Likely to be measured using micro-benchmarks, built as an application over the library. Parity in the performance of add(a, b) with CPython a+b is acceptable performance. (A range of operations is intended, not just +.) The micro-benchmark suite should grow as features are added.
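
To make the relationship between type slots and an abstract API `add(a, b)` concrete, the following is a simplified model, written in Python, of the dispatch such a function has to perform. It is a sketch of the language semantics only: CPython consults C-level slots rather than attributes, the Java design is not specified here, and the names `binary_add` and `Tagged` are invented for illustration.

```python
def binary_add(v, w):
    """Simplified model of an abstract-API add(v, w) consulting type slots."""
    tv, tw = type(v), type(w)
    slot_v = getattr(tv, "__add__", None)
    # The reflected slot is only relevant when the operand types differ.
    slot_w = getattr(tw, "__radd__", None) if tw is not tv else None

    # A right-hand subclass that overrides the reflected slot is tried first.
    if (slot_w is not None and issubclass(tw, tv)
            and slot_w is not getattr(tv, "__radd__", None)):
        result = slot_w(w, v)
        if result is not NotImplemented:
            return result
        slot_w = None  # do not try it a second time

    if slot_v is not None:
        result = slot_v(v, w)
        if result is not NotImplemented:
            return result

    if slot_w is not None:
        result = slot_w(w, v)
        if result is not NotImplemented:
            return result

    raise TypeError("unsupported operand type(s) for +: %r and %r"
                    % (tv.__name__, tw.__name__))


class Tagged(float):
    """Hypothetical subclass that overrides the reflected operation."""
    def __radd__(self, other):
        return "Tagged.__radd__"
```

A JUnit test of the Java API would check the same cases: same-type operands, mixed operands that fall back to the reflected slot, the subclass-first rule, and the `TypeError` when neither slot accepts.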

Interpreters and Threads

  1. Outline architecture for interpreters, frames and the thread model.

  2. Interpreter, PyFrame and PyCode supporting execution of initial subset of CPython byte code. From here on, addition of a new feature includes corresponding additions to the repertoire of the byte code interpreter, in order to accept byte code that depends on that feature.

  3. The means to read a code object output by CPython. It may be just a provisional mechanism, or a partial implementation of pickle.

  4. PyJavaFunction and PyJavaModule (but not import yet).

  5. Rudimentary form of builtins module. Subsequently, objects will be added here as needed.

  6. Micro-benchmarks that execute the compiled form of Python fragments in the compatible Jython PyFrame. Target is parity with CPython timeit results on the same fragment. Code and reference generated from a string. (Is a framework possible that makes maintaining this ever-expanding suite as little work as possible?)

  7. Micro-benchmarks validating parity with CPython f(args, kwargs), over a variety of argument patterns f(), f(x), f(x, k=1), etc.

  8. Validation of correct operation with concurrent threads, especially that types do not escape incomplete from construction. This suite should grow as features are added that carry a risk of incorrect concurrency.
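
Steps 3, 6 and 7 suggest a CPython-side helper that turns a source string into both a serialised code object and a reference timing. The sketch below assumes `marshal` as the serialisation (the roadmap leaves the mechanism open, and the marshal format is specific to the CPython version that wrote it); the table `FRAGMENTS` and the functions `build_case` and `run_blob` are hypothetical names.

```python
import marshal
import timeit

# Hypothetical fragment table: name -> (setup source, expression source).
CALL_SETUP = "def f(*args, **kwargs): pass\nx = 1"
FRAGMENTS = {
    "add": ("a, b = 6, 7", "a + b"),
    "call0": (CALL_SETUP, "f()"),
    "call1": (CALL_SETUP, "f(x)"),
    "call_kw": (CALL_SETUP, "f(x, k=1)"),
}


def build_case(setup, stmt, number=10_000):
    """Return (marshalled code bytes, CPython seconds per execution)."""
    code = compile(stmt, "<fragment>", "eval")
    blob = marshal.dumps(code)  # bytes a Jython-side reader could consume
    seconds = timeit.timeit(stmt, setup=setup, number=number) / number
    return blob, seconds


def run_blob(blob, setup):
    """Reload a marshalled code object and evaluate it in a fresh namespace."""
    namespace = {}
    exec(setup, namespace)
    code = marshal.loads(blob)
    return eval(code, namespace)


cases = {name: build_case(setup, stmt)
         for name, (setup, stmt) in FRAGMENTS.items()}
```

Here `run_blob` stands in for what the Jython interpreter would do with the bytes. Adding a benchmark then means adding one row to the table, which is one possible answer to the question of keeping the suite cheap to extend.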

Descriptor Protocol

  1. Further architecture of the object model, aiming for a complete description of types defined in Python or Java and of multiple inheritance.

  2. Implementation of classes defined in Python (but still compiled by CPython).

  3. Descriptor protocol and mechanisms to populate the PyType dictionary and slots from classes. Test via JUnit (directly or via the abstract object API).

  4. Definition of classes, members and methods using annotations in Java. (Something like the Jython 2 exposer but less opaque, documented, and simplified using MethodHandle.)
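
As a reminder of the behaviour the descriptor protocol in step 3 has to reproduce, here is a small Python sketch of attribute lookup precedence: a data descriptor on the type wins over the instance dictionary, which in turn wins over a non-data descriptor. The classes `Typed`, `NonData` and `Point` are invented for illustration.

```python
class Typed:
    """Data descriptor: defines __get__ and __set__."""

    def __init__(self, kind):
        self.kind = kind

    def __set_name__(self, owner, name):
        self.slot = "_" + name  # private storage key in the instance dict

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class itself
        return obj.__dict__[self.slot]

    def __set__(self, obj, value):
        if not isinstance(value, self.kind):
            raise TypeError("expected " + self.kind.__name__)
        obj.__dict__[self.slot] = value


class NonData:
    """Non-data descriptor: defines only __get__."""

    def __get__(self, obj, objtype=None):
        return "from descriptor"


class Point:
    x = Typed(int)
    label = NonData()

    def __init__(self, x):
        self.x = x  # routed through Typed.__set__


p = Point(3)
p.__dict__["x"] = "shadow"             # ignored: data descriptor takes priority
p.__dict__["label"] = "from instance"  # honoured: beats the non-data descriptor
```

These are the kinds of cases a JUnit test of the `PyType` dictionary and slots could mirror.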

Experiment with Object

Consider the advantages to performance, and to the transparency of Java integration, of making every Object a Python object. Explore the idea of “acceptable implementations” of common built-in types to allow e.g. String to be a str. Experiment with CallSite as a consumer of the MethodHandle already in slots.

Resolve the PyObject vs Object dilemma.

Java Integration

Approach to and basic implementation of treating Java objects as Python objects, having a Python type related to their Java class, when they have not been specifically identified (built-ins). Performance micro-benchmarks modelling code compiled to Java.

Module Import

  1. Outline architecture for modules and importers, giving special attention to the semantics of Java packages and modules. Advances in the module concept in Python should allow us to avoid some of the special cases and thrashing around we find in Jython 2.

  2. Rudimentary forms of sys, io. Subsequently, objects to be added as needed.

  3. Implement import mechanism closely following CPython.

  4. Use custom finders (probably) to import objects from Java.
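
The custom-finder idea in step 4 can be sketched with the standard `importlib` hooks. This is only an illustration of the mechanism: the dictionary `FAKE_JAVA` stands in for whatever the JVM would supply, and the module name `demo_java_util` and the classes `DictLoader` and `DictFinder` are hypothetical.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical stand-in for Java packages; values are placeholders, not classes.
FAKE_JAVA = {
    "demo_java_util": {
        "ArrayList": "<placeholder for java.util.ArrayList>",
        "HashMap": "<placeholder for java.util.HashMap>",
    },
}


class DictLoader(importlib.abc.Loader):
    """Populate a module from a prepared mapping of names to objects."""

    def __init__(self, members):
        self.members = members

    def create_module(self, spec):
        return None  # let the import system create an ordinary module

    def exec_module(self, module):
        module.__dict__.update(self.members)


class DictFinder(importlib.abc.MetaPathFinder):
    """Claim only the names present in FAKE_JAVA; decline everything else."""

    def find_spec(self, fullname, path, target=None):
        members = FAKE_JAVA.get(fullname)
        if members is None:
            return None
        return importlib.util.spec_from_loader(fullname, DictLoader(members))


finder = DictFinder()
sys.meta_path.append(finder)

import demo_java_util  # resolved by DictFinder
```

Because the finder returns `None` for names it does not recognise, ordinary Python imports continue through the remaining entries on `sys.meta_path` unchanged.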

Compiler

  1. Further selected stdlib modules as necessary in the compiler.

  2. AST classes generated from Python ASDL. Generated classes are Python objects in an ast module. (Question: should they be generated in Java? With ANTLR?)

  3. Compiler from Python source to AST, probably using the PEG parser. (If adopting PEG, compile it with CPython and run it with Jython.)

  4. Compiler from AST to CPython byte code: using the version in Python if possible (compiled with CPython). Otherwise, follow CPython implementation in Java. (There is no CPython byte code compiler in Jython 2 legacy.)
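
The pipeline in steps 2 to 4 (source to AST, AST to byte code) can be observed in CPython itself, which is useful as a reference when testing each stage. The following sketch uses only the standard `ast` and `dis` modules; the sample source and the variable names are illustrative.

```python
import ast
import dis

source = "def double(n):\n    return n * 2\nresult = double(21)\n"

# Stage 1: source text to AST (the parser's job).
tree = ast.parse(source, filename="<sketch>", mode="exec")

# Stage 2: AST to a code object holding CPython byte code.
code = compile(tree, filename="<sketch>", mode="exec")

# The opcode names differ between CPython versions, so a Jython byte code
# interpreter must target the repertoire of one specific version.
opnames = [ins.opname for ins in dis.get_instructions(code)]

# Stage 3: execution of the code object (the interpreter's job).
namespace = {}
exec(code, namespace)
```

Comparing `ast.dump(tree)` and the instruction list against the output of a Jython-side implementation would give a stage-by-stage conformance check.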

Jython Command

  1. Jython 3 console command as a Java application built over the library.

  2. Means to invoke Jython on all major OSes.

Python stdlib

Progressively introduce the stdlib and its regression test suite.

The CPython regression test suite is hugely useful for driving our conformance and completeness, but the test process itself relies on a large proportion of the language and stdlib working already.

Implementation Notes

A few principles, some drawn from discussions on jython-dev or off-list.

Build environment

All this ought to be followed from the start in order to maintain acceptable quality (which is MVP).

Some roads not to be taken