Roadmap

From SlateWiki

This is an outline of the Slate creators' plans for the future.

Bootstrapping

The end product of all of the work on Mobius is a Slate implementation framework in which the system consists of a group of modules that are linked together quickly on startup. Together they accomplish everything that an OS does at run-time, but differently. First, a dynamically available online binary compiler will reside in the system and work together with system-observation mechanisms such as type-inferencers, type-collectors, and profiling and instrumentation tools. There will still be a "virtual machine", but it will be available not as the implementation's core but as another online tool for creating quick-but-compact code for low-intensity uses (e.g. scripting and interactive use).

General Tasks

  • Complete and test the binary compiler framework end-to-end.
  • Implement a source database that works with the module and linking framework and has some basic but extensible versioning support.
  • Set up a binary compiler and image, built entirely from linking previously-mentioned modules.

Pidgin Migration Path

The eventual goal of the optimizing compiler is to provide a "primitive mode" for the full Slate language. See doc/mode_notes.txt (http://slate.tunes.org/repos/main/doc/mode_notes.txt) for Lee's thoughts in the area; doc/ (http://slate.tunes.org/repos/main/doc/) also contains the relevant documents ir_notes.txt, mr_notes.txt, and ir2mr.txt. To summarize, Slate could be recursively embedded with primitive modes using a simple macro that annotates source trees. The primitive-mode sections would be evaluated through a translation system akin to Pidgin, but targeting the IR tree form (and some memory model, perhaps starting with the optimizer types plus the MR) instead of C syntax and C types/memory semantics. This would be qualitatively different from the compiler front-end, which does exist and mostly works.

I think we can finish this by looking at Pidgin and improving it to the point that it can be abstracted from specific C issues, and then implement a similar translator for (a sub/super-set of) the same resulting dialect to the IR. There are a number of things necessary to accomplish this; I'll outline what I've thought of already below.

Type System Mapping

We'd need to raise the level of abstraction of the type system so that the IR types are used and then translated to C types for the C translator. This mostly consists of abstracting over the sizes of the various types and mapping specific sizes to short/long/etc. If you read src/mobius/optimizer/types.slate (http://slate.tunes.org/repos/main/src/mobius/optimizer/types.slate), you will see that there are still a lot of hardcoded 32/64-bit assumptions that make things a bit messy. I think parameterized/dependent types are the direction it should move in, mostly so that most of this code can be generated.
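As a rough illustration of the mapping described above, here is a minimal sketch in Python rather than Slate. Every name in it (IntType, DATA_MODELS, c_type_name) is hypothetical and not taken from types.slate; the only facts assumed are the usual integer widths of the ILP32 and LP64 C data models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntType:
    """Hypothetical abstract integer type: only a width and a signedness."""
    bits: int
    signed: bool = True


# Widths of the C integer types under two common data models.
# ILP32: int and long are 32 bits. LP64: long is 64 bits.
DATA_MODELS = {
    "ILP32": {"char": 8, "short": 16, "int": 32, "long": 32, "long long": 64},
    "LP64": {"char": 8, "short": 16, "int": 32, "long": 64, "long long": 64},
}


def c_type_name(ir_type, model):
    """Pick the first C type whose width matches the abstract type."""
    for name, bits in DATA_MODELS[model].items():
        if bits == ir_type.bits:
            if ir_type.signed:
                # Plain char has implementation-defined signedness in C.
                return "signed char" if name == "char" else name
            return "unsigned " + name
    raise ValueError(f"no {ir_type.bits}-bit integer type in {model}")
```

The point of the sketch is that the size assumptions live in one table per target instead of being scattered through the type definitions, which is the kind of parameterization the paragraph above asks for.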

Pidgin Cleanups

Adjust Pidgin so that it is not so quirky. For example, code in method definition bodies is written in "actual" Pidgin, whereas code outside of them is processed as a mix of Slate-in-a-funny-namespace, minus some things you'd normally get in Slate. In particular, -> and True and False don't mean the same thing there. This needs to be sorted out into something more comprehensible and composable, even if it makes the code more verbose. Ultimately, the idea is that we can compartmentalize code, semantics, and descriptions into places where we just wrap them in a macro. We already do this with `pidginPrimitive in the libraries in src/mobius/vm/ext/, which define Slate methods that can be called from the image side but are defined in the VM as wrappers around C routines or other VM-internal routines.

I've also thought about the merits of embedding more C code into the Pidgin sources, but this may make re-compilation heavier and is probably best not delved into until we are free of source files per se.

Higher-level Libraries

I'd like there to be (but there doesn't need to be) use of better control-flow idioms at the low level for code that can handle it. For example, in Pidgin, we can translate the following into C syntax:

ifTrue: ifFalse: ifTrue:ifFalse:
whileTrue: whileTrue whileFalse: whileFalse loop
upTo:do: downTo:do: below:by:do: below:do: above:do: to:by:do:

This is because code has been explicitly written for the C SimpleGenerator to translate them. However, some of these methods are defined in src/lib/method.slate (http://slate.tunes.org/repos/main/src/lib/method.slate) as merely calling the more "basic" methods with default arguments. So it occurs to me that there are a lot of opportunities to use higher-level methods without extra coding, by treating specific method definitions as translations. I say "specific" because I don't want to have to use an inference engine for this at first; I'll probably just gather up idioms I'd like to use and run one analysis on them to get the translation out of them. Eventually a more general approach might be better, and I have ideas about that, but they're not worth detailing until the basics work. Suffice it to say that it'd be nice to have Array's #do: and relevant kin available: anything that translates well enough to low-level code with trivial re-interpretation.

As a note, src/mobius/optimizer/ir/generator.slate (http://slate.tunes.org/repos/main/src/mobius/optimizer/ir/generator.slate) has similar but slightly outdated translations.
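The idea of treating method definitions as translations can be sketched as follows, in Python rather than Slate. The Send record, the DERIVED table, and the particular expansions are hypothetical: the sketch assumes, purely for illustration, that upTo:do: and downTo:do: are defined as to:by:do: with a default step, and it does not reproduce the actual definitions in method.slate or the SimpleGenerator's output.

```python
from typing import NamedTuple


class Send(NamedTuple):
    """A message send, with receiver and arguments kept as source text."""
    selector: str
    receiver: str
    args: tuple


# Hypothetical "definitions as translations": each derived selector is
# rewritten into a more basic one by supplying a default argument.
DERIVED = {
    "upTo:do:": lambda s: Send("to:by:do:", s.receiver, (s.args[0], "1", s.args[1])),
    "downTo:do:": lambda s: Send("to:by:do:", s.receiver, (s.args[0], "-1", s.args[1])),
}


def expand(send):
    """Rewrite a send until only a basic selector remains."""
    while send.selector in DERIVED:
        send = DERIVED[send.selector](send)
    return send


def generate_c(send):
    """Emit C for the one selector the generator knows natively.

    The loop variable name i is hard-coded to keep the sketch short.
    """
    send = expand(send)
    if send.selector != "to:by:do:":
        raise NotImplementedError(send.selector)
    end, step, body = send.args
    compare = ">=" if step.startswith("-") else "<="
    return f"for (i = {send.receiver}; i {compare} {end}; i += {step}) {{ {body} }}"
```

Only to:by:do: needs hand-written generator code here; the derived selectors come for free from their definitions, which is the saving described above.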

FFI Integration

The FFI needs to share more code with this combined C-and-IR system than it does now, which is just src/mobius/c/types.slate (http://slate.tunes.org/repos/main/src/mobius/c/types.slate). Being able to generate plugins and FFI binding specs from the same source Pidgin code seems like it would be more usable and powerful than what we have now. The bootstrapper and so forth are pretty monolithic right now and only do one job, so it'll take time to figure out a more generic approach or at least how to re-use the Pidgin generator for a different task.

As a side effect, it occurs to me that an FFI for the VM itself, as well as some extension for bytecode-level stepping of some sort, would be a productivity boost for debugging and diagnostics. Basically, we wouldn't need GDB for a lot of purposes in testing new changes and so forth. At first, I'd probably be the only user, but it'd be a lot more productive and might be useful to everybody once it had a decent setup.

A more important change might be that I'd translate the relatively simpler FFI plugins like the Socket and Posix and Time stuff into Pidgin and try those out for basic tests. It's certainly important for them to work well (and have good performance), and I think we can do this without compromising our goals if the above plans work.

A very simple FFI change would also be to adapt the use of CObject, which takes C structure/type definitions and makes faux Slate objects that get/set binary representations of them in Slate ByteArrays, to the new malloc()-memory VM module. This might have implications that I don't really foresee too well, but obviously we'd have to consider security even while the benefits of easily seeing at a Slate level what's going on "below" could be profound.
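To make the CObject idea concrete for readers who have not used it, here is a minimal sketch in Python using the standard struct module. StructView and its field list are hypothetical stand-ins, not the CObject API; the layout is packed little-endian with no padding, whereas a real C compiler would normally insert alignment padding.

```python
import struct


class StructView:
    """Hypothetical stand-in for a CObject: named fields over a byte buffer."""

    def __init__(self, fields, buffer=None, byte_order="<"):
        # fields is a list of (name, struct format code) pairs,
        # e.g. ("sec", "q") for a signed 64-bit integer.
        self._layout = {}
        offset = 0
        for name, code in fields:
            fmt = byte_order + code
            self._layout[name] = (offset, fmt)
            offset += struct.calcsize(fmt)
        self.size = offset
        self.buffer = buffer if buffer is not None else bytearray(offset)

    def get(self, name):
        """Decode one field from the underlying bytes."""
        offset, fmt = self._layout[name]
        return struct.unpack_from(fmt, self.buffer, offset)[0]

    def set(self, name, value):
        """Encode one field into the underlying bytes in place."""
        offset, fmt = self._layout[name]
        struct.pack_into(fmt, self.buffer, offset, value)
```

Because the view only reads and writes a buffer, the same accessors would work whether the bytes live in an image-side byte array or in memory owned by the VM, which is where the security question raised above comes in.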

VM splitting

As part of the longer-term goal of moving to an optimizer-based platform, I will try to move parts of the VM into malloc()-managed memory via our new VM module or something like that. This will remove compile-time-specified limits on resource usage and might make the migration less onerous on the VM design itself, via the CObject extension I mention above.

Development Goals

Slate should support the full development life-cycle from within itself. This is not an ideological goal, nor does it preclude working with other tools: the intent is to develop a very generic and pluggable tool-chain for everyone's benefit.

Browsing / Exploring

There are several items to address:

  • Sources need to be available from the image, starting with the CompiledMethod sourceTree annotation slots (see slurpSourceIn).
  • The OmniBrowser engine needs to be ported to Slate, with Slate-specific adaptations for roles/role positions.
  • A text editor needs to be completed, using the UI or ncurses or portable over both.
  • The browser and the editor need to be able to call into one another, with standard hooks for the usual code-navigation.
  • A lint equivalent needs to be available as an extensible framework, and available from the editor.
  • The editor needs to be extended with an incremental parser framework and a Slate incremental parser for interactive feedback on the code.

Remote Development

  • Develop a means of remotely calling introspective APIs from other Slate systems (like the Swank protocol for SLIME for Common Lisp).
  • Refactor the tools to use this API and then treat local and remote systems polymorphically.
  • Build easy configurations of master/slave development cycles for embedded (in software and on remote hardware) development. See also Squeak's Spoon.

Code / Version Management

  • Build a change-expression and change-migration framework. Initial influences: DARCS (http://www.darcs.net) and Monticello (http://www.wiresong.ca/Monticello/).
  • What we care about: diffs, patches, merges, publication, and subscription.
  • Extend it for version-management of object-graphs so we can version the Slate environment itself.

Migration System

Overview

Complete the implementation and use of the module system to trace out sections of the image for persistence and migration. Basically, we need to be able to "slice" parts out of the system in order to distinguish portable from non-portable parts.

Prior Art:

  • Squeak's Image Segments (http://minnow.cc.gatech.edu/squeak/2316) when used for export.

Open issues:

  • What binary format is needed? Do we need to layer it over a facility we can re-use?
  • What kind of slice operations will be needed? Will the graph always be connected?
  • How do we version the format itself?
  • Do we need a startup hook mechanism for the loader?

Down the road:

  • We need to publish and refer to standard modules for easier linking resolution.

Tracing

A migration is started by tracing a sub-set of the Slate heap, from some particular namespace, into a graph representation. The graph should contain high-level records for objects in the heap such as methods, maps, arrays, numbers, etc. Records need to represent objects as abstractly as necessary to allow for portability. External links must be represented using special records which are recognized by the loader. These records describe how to locate the external object (i.e. by a path from the lobby); when the graph is loaded, the description is evaluated and the located object is used as the value of the record, as opposed to creating an entirely new object. The graph is then serialized for storage / transport (using streams, of course).
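The tracing step can be sketched as follows, in Python rather than Slate. The record layout (kind, elements, path), the trace function, and the use of JSON for serialization are all assumptions made for illustration; no binary format has been chosen, as the open issues above note.

```python
import json


def trace(root, externals):
    """Trace everything reachable from root into a list of records.

    externals maps id(obj) to a path of names from the lobby for objects
    that must be located at load time rather than copied.
    Record 0 always describes root; references are record indices.
    """
    records = []
    index = {}  # id(obj) -> position of its record

    def visit(obj):
        key = id(obj)
        if key in index:
            return index[key]
        slot = len(records)
        index[key] = slot
        records.append(None)  # reserve the slot first so cycles terminate
        if key in externals:
            records[slot] = {"kind": "external", "path": externals[key]}
        elif isinstance(obj, (int, str)):
            records[slot] = {"kind": "literal", "value": obj}
        elif isinstance(obj, list):
            records[slot] = {"kind": "array", "elements": [visit(e) for e in obj]}
        else:
            raise TypeError(f"no record kind for {type(obj).__name__}")
        return slot

    visit(root)
    return records


def serialize(records):
    """Turn the record list into text suitable for writing to a stream."""
    return json.dumps(records)
```

Shared and cyclic references survive because each object gets exactly one record and every other mention of it is just that record's index.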

Loading

Objects are reconstructed and linked according to the records in the serialized representation, either by creating new objects or by locating existing ones. Eventually, we need to make the loader's relinking phase robust and extensible enough to handle various types of failures and choices due to incompatibilities of varying degrees (CPU type, API expectation mismatch, basic linking error, Set/Dictionary hashing issues).
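A minimal two-pass loader might look like the sketch below, written in Python rather than Slate. The record format (a list of dictionaries with kind, value, elements, and path keys, where references are list indices and record 0 is the root), the LinkError class, and the lobby modeled as nested dictionaries are all hypothetical choices for illustration.

```python
class LinkError(Exception):
    """Raised when an external record cannot be resolved from the lobby."""


def load(records, lobby):
    """Rebuild an object graph from records and return the root object."""
    objects = []

    # Pass 1: create an empty shell for each new object, or locate the
    # existing object that an external record points to.
    for record in records:
        kind = record["kind"]
        if kind == "external":
            target = lobby
            for name in record["path"]:
                if not isinstance(target, dict) or name not in target:
                    raise LinkError("cannot resolve " + "/".join(record["path"]))
                target = target[name]
            objects.append(target)
        elif kind == "literal":
            objects.append(record["value"])
        elif kind == "array":
            objects.append([])
        else:
            raise ValueError(f"unknown record kind {kind!r}")

    # Pass 2: link. Every referenced object already exists, so shared and
    # cyclic references can be filled in directly.
    for record, obj in zip(records, objects):
        if record["kind"] == "array":
            obj.extend(objects[i] for i in record["elements"])

    return objects[0]
```

Separating creation from linking keeps the failure cases in one place: a missing external surfaces as a LinkError before any partially linked object escapes, which is one possible starting point for the more robust relinking phase described above.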

Language

  • Develop the meta-object protocols to support generative-style programming and better code-factoring.

Syntax

Abstraction

Use the results of the User Interface project to abstract over the concrete syntax:

  • to allow and present Unicode characters for appropriate operators.
  • to allow "styled views" of code and direct manipulation of expressions at a high level instead of a low (character-strings) level.

Idioms

See this thread (http://lists.tunes.org/archives/slate/2005-September/001605.html) about making a new idiom for expressing assignment vs. verb usage. Reasons:

  • For attribute foo, foo: looks like the syntax for using a verb of the same name.
  • This idiom should work for both literal, physical slot updates as well as virtual ones where the assignment is delegated to some component object, or where the idea of assignment is just used for intuition (or compatibility with an interface previously using a physical slot).
  • Meta-data or annotations are no substitute for a separate idiom, because the person or program reading Slate code will not be able to find the method with that annotation unless it has perfect type inference. This is not an acceptable compromise.

Summary of conclusions:

  • Binary selectors would be appropriate, since we can use alphanumerics in those selectors. However, selectors starting with letters still count as unary, so this requires a lexer change. Is the current behavior a bug to fix?
  • This is quite a bit of effort while we still don't have automatic code-manipulation tools or an in-image IDE, so the action on this will wait.

Possible syntaxes, for attribute foo:

  • foo= - easy to type, looks right
  • =foo - so it doesn't resemble foo =.
  • foo:= - a little longer to type, looks like Smalltalk
  • =foo= - we'd like to reserve this instead for an idiom of foo-style equality comparison, or comparison of an attribute foo

Lexer / Grammar extensions

  • SEXP-style statement separation - the parser can do this easily with a quick change (and did in fact do this until it was discovered), where separation of expressions with a space was the same as period-separator usage when it was not ambiguous. Parentheses surrounding statements can be used where it would be ambiguous (e.g. (foo) (bar) for two unary sends in a row).
  • As mentioned above, allowing selectors to be binary that start with letters but end in the usual binary punctuation operator symbols.
  • Make foo (bar) legal - right now it causes hideous errors. One choice is to interpret it as foo: (bar), which would save a bit of typing but might be harder on the human reader of code. The syntactic abstraction project would probably also reduce the importance of concrete colon punctuation, using visual globbing instead. Of course, we could just have it throw a proper error.