Self-describing format
What it is
It is a format that describes, inside the file itself, how that file should be tokenized.
It needs a file header that starts with the element separator.
The part separator, comment opener and closer, and context opener and closer are then defined in that order, each one terminated by the element separator.
The header ends with the trimmable characters, followed by one last element separator.
As an example:
ABACADAEAFAGGGA
DATA HERE
Here:
- A is the element separator
- B is the part separator
- C is the comment opener
- D is the comment closer
- E is the context opener
- F is the context closer
- G (repeated here as GGG) stands for the trimmable characters
See more examples below
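To make the header layout concrete, here is a minimal Python sketch of how a reader could pull the seven delimiters out of a file. The names `Header` and `parse_header` are made up for this illustration and are not part of the format, and the sketch assumes that no field contains the element separator itself.

```python
from typing import NamedTuple


class Header(NamedTuple):
    """Delimiters declared by the file header (hypothetical field names)."""
    element_sep: str
    part_sep: str
    comment_open: str
    comment_close: str
    context_open: str
    context_close: str
    trimmable: str


def parse_header(text):
    """Return (Header, body), where body is everything after the header."""
    if not text:
        raise ValueError("empty input: no header found")
    # The very first character is the element separator.
    element_sep = text[0]
    # Six fields follow, each terminated by the element separator.
    # maxsplit=6 keeps the rest of the file in one piece.
    fields = text[1:].split(element_sep, 6)
    if len(fields) < 7:
        raise ValueError("header is incomplete")
    *values, body = fields
    return Header(element_sep, *values), body


header, body = parse_header("ABACADAEAFAGGGA\nDATA HERE")
print(header.part_sep)   # B
print(header.trimmable)  # GGG
```

The body is returned untouched, including the line break right after the header, so it is up to the tokenizer to trim it.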
What's weird about it
- The element separator must be a single character
- Using a space as the element separator makes the header look weird
- The header kind of looks weird on its own :/
Why
The goal of this format is to not restrict how data is presented to the user.
It also strives to be easily made compatible with existing formats.
The semantics of the data would still be up to the program to decide.
Why not only one abstraction depth
Why have parts and elements inside a context? Why not only elements and contexts?
The idea behind it is that most formats out there already have two levels of nesting.
The two examples below kind of show this. CSV has "," for parts and ";" or "\n" for elements. Smalltalk has spaces as delimiters between methods and objects, and "." as the separator between statements.
To access deeper levels, I propose using context openers and closers, much like Smalltalk has [...] blocks or C-like languages have {...} blocks.
If I only defined element separators and contexts, we would basically have s-expressions, which force a lot of context nesting, also known as parens hell.
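As a rough illustration of how contexts could give access to deeper levels, the Python sketch below splits text on the element separator only when it is outside every context. The function name `split_top_level` is hypothetical, and the rules it uses (contexts may nest, and the opener differs from the closer) are assumptions rather than something the format spells out.

```python
def split_top_level(text, sep, opener, closer):
    """Split text on sep, skipping separators inside opener...closer."""
    if not sep or not opener or not closer:
        raise ValueError("delimiters must be non-empty")
    pieces, current, depth, i = [], [], 0, 0
    while i < len(text):
        if text.startswith(opener, i):
            # Entering a context: keep the opener and go one level deeper.
            depth += 1
            current.append(opener)
            i += len(opener)
        elif depth > 0 and text.startswith(closer, i):
            # Leaving a context: keep the closer and go one level up.
            depth -= 1
            current.append(closer)
            i += len(closer)
        elif depth == 0 and text.startswith(sep, i):
            # A separator outside every context ends the current element.
            pieces.append("".join(current))
            current = []
            i += len(sep)
        else:
            current.append(text[i])
            i += 1
    pieces.append("".join(current))
    return pieces


elements = split_top_level("a b. [c. d] e. f", ".", "[", "]")
print(elements)  # ['a b', ' [c. d] e', ' f']
```

The inside of a context can then be tokenized again with the same separators, which gives the deeper levels without wrapping every element in its own pair of brackets.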
Examples
CSV
As an example, CSV-like data could be parsed with this format.
;,;{{{;}}};{{{{;}}}}; \n\t;
header 1, header 2, header 3;
value 1, value 2 , value 3 ;
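Here we want neither comments nor contexts, so their openers and closers are set to sequences ({{{, }}}, {{{{ and }}}}) that should not appear in the data.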
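A possible way to tokenize the CSV body above is sketched below in Python. The function name `tokenize` is hypothetical. The sketch takes the delimiters as plain arguments instead of reading them from the header, ignores comments and contexts since this example does not use them, assumes that \n and \t in the header stand for the newline and tab characters, and assumes that elements left empty after trimming are dropped.

```python
def tokenize(body, element_sep, part_sep, trimmable):
    """Split body into elements, and each element into trimmed parts."""
    rows = []
    for element in body.split(element_sep):
        element = element.strip(trimmable)
        if not element:
            # Assumption: an element that is empty after trimming is skipped.
            continue
        rows.append([part.strip(trimmable) for part in element.split(part_sep)])
    return rows


body = "\nheader 1, header 2, header 3;\nvalue 1, value 2 , value 3 ;\n"
rows = tokenize(body, element_sep=";", part_sep=",", trimmable=" \n\t")
for row in rows:
    print(row)
# ['header 1', 'header 2', 'header 3']
# ['value 1', 'value 2', 'value 3']
```

Trimming is what lets the line breaks and the stray spaces around "value 2" disappear without making them part of any separator.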
A Smalltalk like language
. ./*.*/.[.]. \n\t.
SMALLTALK HERE
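With the header above, anything between /* and */ is a comment. Here is a small Python sketch of removing comments before tokenizing. The function name `strip_comments` is hypothetical, and the sketch assumes that comments do not nest and that an unterminated comment is left in place.

```python
import re


def strip_comments(text, opener, closer):
    """Remove every opener...closer span from text (non-nested comments)."""
    if not opener or not closer:
        raise ValueError("comment delimiters must be non-empty")
    # re.escape keeps delimiters such as /* from being read as regex syntax.
    # The lazy .*? stops at the first closer, and DOTALL lets a comment
    # run across line breaks.
    pattern = re.escape(opener) + r".*?" + re.escape(closer)
    return re.sub(pattern, "", text, flags=re.DOTALL)


source = "x := 3 /* the answer. maybe */. y := x + 1."
cleaned = strip_comments(source, "/*", "*/")
print(cleaned)  # x := 3 . y := x + 1.
```

Removing comments first means an element separator inside a comment, like the "." after "answer" here, never splits a statement.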
Lisp
___ (* *) ( ) \n\t
LISP HERE