Roadmap
The roadmap document defines the direction that the project is taking.
The initial and decisive part of the project is the implementation of native tensor abstractions backed by Apache Arrow. To get to that point, we first need to implement several smaller pieces across the Arx + IRx stack. Arx owns the surface front end (lexer, parser, docs, examples), while IRx owns AST definitions, semantic analysis, lowering, and code generation.
Improve the language structure
Data type support
ArxLang is based on the Kaleidoscope compiler, so it implements only the float data type for now.
To accept more data types, the language needs a way to specify the type of each variable and of each function's return value.
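As a rough illustration of what carrying type information through the AST could look like, here is a minimal Python sketch. The names `DataType`, `Variable`, and `FunctionPrototype` are hypothetical and are not taken from the IRx code base; the set of types beyond float is also an assumption. The only point is that every variable and every function return carries an explicit type, with float as the default to mirror the current behavior.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class DataType(Enum):
    # Hypothetical type set; only float is confirmed by the roadmap today.
    FLOAT = "float"
    INT32 = "int32"
    BOOL = "bool"


@dataclass
class Variable:
    name: str
    # Defaulting to float mirrors the Kaleidoscope-style starting point.
    type_: DataType = DataType.FLOAT


@dataclass
class FunctionPrototype:
    name: str
    args: List[Variable] = field(default_factory=list)
    return_type: DataType = DataType.FLOAT

    def signature(self) -> str:
        """Render the prototype with explicit argument and return types."""
        params = ", ".join(f"{a.name}: {a.type_.value}" for a in self.args)
        return f"{self.name}({params}) -> {self.return_type.value}"


proto = FunctionPrototype(
    "add",
    [Variable("x", DataType.INT32), Variable("y", DataType.INT32)],
    DataType.INT32,
)
print(proto.signature())
```

With annotations stored on the nodes themselves, semantic analysis and lowering can read the types directly instead of assuming float everywhere.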
Implement native tensors
Native tensors now have an initial Arrow C++ backed implementation. Remaining work should continue to make runtime-shaped tensor values usable in more contexts, while preserving the same runtime-layout rules for every collection type that uses that approach.
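To make the idea of a runtime-shaped tensor concrete, the sketch below models a tensor as a flat buffer plus a shape that is only known at run time, with row-major strides derived from that shape. This is an illustration in Python, not the Arx or IRx implementation: the helper names are hypothetical, strides are counted in elements rather than bytes, and row-major order is an assumption rather than a documented layout rule.

```python
from typing import List, Sequence


def row_major_strides(shape: Sequence[int]) -> List[int]:
    """Compute element strides for a row-major (C-order) layout."""
    strides = [1] * len(shape)
    # Walk from the second-to-last dimension back to the first.
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def flat_index(index: Sequence[int], strides: Sequence[int]) -> int:
    """Map a multi-dimensional index to an offset in the flat buffer."""
    return sum(i * s for i, s in zip(index, strides))


# A 2x3 tensor stored as one flat buffer; the shape is a runtime value.
shape = [2, 3]
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
strides = row_major_strides(shape)

# Element at row 1, column 2.
value = data[flat_index((1, 2), strides)]
print(strides, value)
```

Keeping shape and strides as data next to the buffer is what lets the same layout rules apply to any collection type that adopts this approach.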
DataFrames and Series
DataFrames are a distinct public collection abstraction for heterogeneous named columns. Static-schema values use dataframe[name: T, ...], column views use series[T], and literals are constructed with dataframe({...}).
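The following Python sketch shows, under stated assumptions, what checking a literal against a static schema such as `dataframe[name: T, ...]` could involve. The function `check_dataframe` is hypothetical and is not part of Arx or IRx; Python types stand in for Arx column types, and the rules it enforces (matching column names, equal column lengths, one element type per column) are assumptions about what a static schema implies.

```python
from typing import Dict, List


def check_dataframe(
    schema: Dict[str, type], columns: Dict[str, List[object]]
) -> None:
    """Raise if the columns do not match the declared schema."""
    if set(schema) != set(columns):
        raise TypeError("column names do not match the schema")
    if len({len(values) for values in columns.values()}) > 1:
        raise ValueError("all columns must have the same length")
    for name, expected in schema.items():
        for value in columns[name]:
            # Exact type match, so a bool is not accepted as an int.
            if type(value) is not expected:
                raise TypeError(f"column {name!r} expects {expected.__name__}")


# Stand-in for a schema like dataframe[id: int, score: float].
schema = {"id": int, "score": float}
literal = {"id": [1, 2, 3], "score": [0.5, 0.75, 1.0]}

check_dataframe(schema, literal)

# A column view, analogous to series[float], is one homogeneous column.
score_series = literal["score"]
print(score_series)
```

Because each column is homogeneous, a column view can be typed by its element type alone, while the dataframe as a whole stays heterogeneous across columns.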