This document benefited from discussions with Vasil Diadov, Kian Wilcox, Carl Shapiro, Willi Riha and Brian Spilsbury. Shen was directly derived from Qi. Qi is a hypermodern functional language that offers many features not currently available under other functional platforms. Qi was
written solely to run under Common Lisp. The motive for
Shen came from the observation that the implementation of
Qi used only 15% of the system functions described in The Shen mission was to develop an ultra-portable version of Qi that can run under a wide variety of platforms and which incorporates missing features in Qi such as streams. This approach involved rewriting and redesigning Qi to fit within the smallest feasible instruction set. The means
of achieving this involves developing a small fast Lisp
called In terms of design, Kl reflects the following design criteria. 1. It
should be powerful enough to express Qi. Some comparable small Lisps to Kl are PicoLisp, FemtoLisp, TinyScheme and SmallLisp. Unlike Common Lisp, Kl is very portable over different platforms. In place of Common Lisp's 1150 pages of specification, Kl requires only 46 system functions. It is much easier to develop a stand-alone implementation of Kl than to build a complete version of CL. Even more significantly, it is possible to map Kl into popular existing platforms such as Clojure, Python as well as Common Lisp. The
implementation, Shen, attached to this document runs Qi
on this reduced instruction set and, in terms of
primitives required, has nearly 1/3 of the footprint of
Qi II. Kl misses many of the features of bigger Lisps (like macros) because Shen effectively incorporates many of these features. Even LIST and QUOTE are missing from Kl because they are not needed for Shen and Qi has never needed or missed them. Similarly we do not impose a comment convention on Kl because we do not consider that people need to or should write in Kl. The emphasis is on devising an absolutely efficient minimal set needed to port Qi which is easy for people to code in other languages. However
anybody who implements Kl, and
wants to write directly in this notation, can add extra
features to their implementation to Kl. This does not affect the standard; just
as a CL which includes (e.g.) threads does not cease to
be a CL even though it incorporates a non-standard
feature. The development of Shen was motivated by the desire to use Qi outside the ambit of Common Lisp. Though Common Lisp is a very powerful language, it's usage and libraries do not compare well with other languages such as Python. It is, for example, difficult to find providers who support Common Lisp, though many providers will offer Python as part of their services. Hence Shen was devised as the solution. Shen is Qi packaged to run under many platforms. People have asked why Shen is called 'Shen' . There is a deep reason. The words 'qi' and
'shen' are part of the Taoist vocabulary. The concept of In terms of this
process, Shen is that of a RISC
version of Qi II with extensions to incorporate streams;
it is Qi running on a reduced instruction set and that
reduced instruction set defines (much of) Kl. The release of Shen follows this convention: all spec changes increment the leading digit; all patches increment the number following the point. The Shen license is a
freeware license that enables you to freely use Shen for
commercial purposes. You are free to write application
programs for Shen and do what you want with them. This standard is fixed
by Dr Mark Tarver and is described in the latest The motivation for these restrictions is simple. We want to build a community around a reliable standard platform that people can depend on. Unlicensed, non-conformant and buggy hacks do not help us in this mission. The detailed license and it's explanation can be read here.
There are 12 basic types in Shen. 1. symbols
.... abc hi-there, The_browncow_jumped_over_the_moon Note that the last two categories can merge depending on the platform. Any symbol,
string, number or boolean is an The type
system of Shen differs from Qi II in not placing
variables and symbols into different types. This arises
mainly from dropping rule closures that appeared in Qi
II. The type The
following set represents the set of 46 primitive
functions which Shen requires and which are used in Kl. All other functions are included in the
Shen sources. The CL definitions are given to explain the
semantics of these functions. Note in some cases these
primitives can be reduced still further (e.g Kl uses strict applicative order evaluation except for the boolean operations etc. which follow the usual pattern. NIL is just a symbol with no special status. () denotes the empty list.
is very simple and conforms to a Lisp. Well-formed sentences of Kl are symbolic expressions (s-exprs). Any symbol, boolean, string or number is an atom. Any atom is an s-expr. Any abstraction (lambda X Y) is an s-expr if X is a symbol and Y is an s-expr. Any local assignment (let X Y Z) is an s-expr if X is a symbol and Y and Z are s-exprs. Any definition (defun F X Y) is an s-expr if F is a symbol and X is a (possibly empty) list of symbols (formal parameters) and Y is an s-expr. Any application (X _{1}... X_{n}) is an s-expr if X_{1}, ... X_{n}are s-exprs.
These notes are for programmers wishing to implement Kl.
(defun
f (x y) y) is sugar for (set 'f (lambda
x (lambda y y))). This model supposes a
single namespace for functions and variables. You
have this in Python.So in the classic model 'defun' and 'set' together are logically superfluos. It is more reasonable to regard 'set' as logically prior since incrementing a global variable is much more easily carried out through 'set' than 'defun'. Incrementing should be very fast - function compilation is generally very slow. In a dual namespace model, the 'defun' and 'set' are regarded as creating an association between the symbol and something else; it is a mapping not an assertion
of identity. Generally Qi and Shen
require a dual namespace for symbols. The reason
for this is to do with the Qi evaluation strategy for
symbols which is that symbols are self-evaluating.
In Qi the symbol 'f' evaluates to itself. If
we want to get at the value associated with 'f', we type
'(value f)'.
But Qi does not work that way. Allowing symbols as self-evaluating really points away from this model. Hence Kl and Shen are committed to the dual namespace approach.
Note in
Shen 8.0, typed zero place functions were introduced.
Here is
In Kl
The Shen evaluator
compiles Shen into Kl and Kl into
native code. The Shen evaluator accepts S-exprs as legal
input and since Kl expressions are such, A. (defun list-all (x y z) [x y z]) is not legal Kl and would have to be written as follows to be legal Kl B. (defun list-all (x y z) (cons x (cons y (cons z ())))) However expression A.
will be accepted and compiled by Shen into expression B. The list of
boolean operators contains some logical redundancy.
Logically only Shen
includes a
xxxxxx<test b> <result
b>...............) and which
is equivalent to In Shen an
atom is either a symbol, boolean, string or number. All
atoms, thus including symbols, are self-evaluating. The
empty list is not counted as an atom but is also
self-evaluating. Hence there is no quote in Kl or Shen. Mapping into CL which does use
quote is trivial. The function <symbol>
:= <alpha> <symbolchars> | <alpha> <alpha>
:= a | b | c | d | e | f | g | h | i | j | k | l | m | n
| o | p | q | r | s | t | u | v | w | x | y | z <digit> := 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 A symbol is a variable if it begins in an uppercase letter.
Qi II and Common Lisp are all members of the Lisp family of languages. In Lisp, symbols are often semantically context sensitive; that is, the interpretation of a symbol may depend on the context in which it occurs. For example in this Common Lisp expression; (list 'John 'put 'the 'car 'into 'reverse), or in Qi II [John put the car into reverse], the symbol 'reverse' denotes a symbol. But in Common Lisp the expressions (reverse '(1 2 3)) (in Qi II (reverse [1 2 3])) and (mapcar 'reverse '((1 2) (2 3) (3 4))), (in Qi II (map reverse [[1 2] [2 3] [3 4]])), the symbol 'reverse' actually denotes a function. That means
in the Lisp family, symbols are semantically context
sensitive. At the head of an application they can only
denote a function and there is no ambiguity; but within
the body of the application they can denote either a
function or the symbol quoted. Lets us say a symbol is To help reading, Common Lisp includes the expression 'function' which disambiguates innocent symbols from the other kind; the expression (mapcar (function reverse) '((1 2) (2 3) (3 4))) does what (mapcar 'reverse '((1 2) (2 3) (3 4))) does but the semantics of 'reverse' is clarified to show that the programmer is referring to the function not the symbol itself. Qi I and Qi
II followed Common Lisp in making symbols semantically
context sensitive, relying on the Common Lisp compiler to
decipher the ambiguity and make the correct choice. Being
specifically tailored for multiplatform development, Shen
can make no such assumption. It is quite possible that
the native platform does not support symbols as data
objects in the manner of Lisp; in which case it is
possible to construct a simulation of Lisp symbols by
using tagged 2 element vectors (one element a tag to show
it is impersonating a symbol, the other to hold a string
representation of the symbol; see However the
problem arises of the context sensitivity of symbols;
when should a symbol should be 'vectorised' in a non-Lisp
language and when not? For instance, if the symbol
'reverse' is vectorised in the context (map reverse [[1
2] [2 3] [3 4]]), then the There are several solutions to this problem. One is to simply avoid innocent symbols and insist that strings do all the work of innocent symbols. This follows the path of languages like ML where innocent symbols do not exist. It would mean that in Shen, [p & q] could not be written; only ["p" "&" "q"]. However this would probably not appeal to those Lispers, like myself, who enjoy the convenience of working with innocent symbols. The second is to treat all non-variable symbols in the body of an application (apart from the leading symbol which must denote a function or procedure of some form) as innocent, thus vectorising them. In this case, to prevent any higher-order function H from failing, H must, when applying any vectorised symbol S, look for the function F associated with S and use F instead. This introduces a delay into the execution of H and moreover means that native higher-order functions in the platform, which are not set up to interface to vectorised symbols, will still fail in the manner described above. The third
solution is to provide ' Packages
exist to avoid the usual danger of overwriting when two
programmers accidently choose the same symbols in their
programs to identify different values. The polyadic
function 1. The Shen
reader prepends the package symbol followed by dot ( (a) symbols
listed as belonging to the system (such as Symbols
which are prepended we say are Hence The
philosophy of Shen is that once a programmer has decided
a symbol is internal to a package and is hidden from view
that decision cannot be overridden except by changing the
definition of the package. Hence the complexities of The The
function Note that
Shen will allow you to reference symbols that are
internal to a package by citing the package name (e.g. Symbols
which are external to the Shen package may not be
redefined; the user cannot redefine Shen
contains a Prolog, just as Qi II, but the In place of
the awkward dual convention, Shen has one Prolog notation
consistent with the rest of Shen which uses
xxX [X | _] <--;xxX [_ | Y] <-- (member X Y);)
xx[] [] <--;xx[X | Y] Z <-- (rev Y W) (conc W
[X] Z);)
xx[] X X <--;xx[X | Y] Z [X | W] <-- (conc Y Z
W);)The following functions are found in Shen Prolog
retrieve
was introduced in version 14 to allow Prolog to be called
with a variable whose binding is made outside Prolog.
Thus
will call Shen contains a YACC, just as Qi II, but the syntax was modified in 9.0 to bring it closer to Shen. In addition Shen-YACC II was provided a type theory which meant that Shen-YACC programs could themselves be typechecked, a feature that was lacking in Qi-YACC and Shen-YACC I. This section briefly describes the differences. The
returns the tail of a list. Shen-YACC II recognises lists in the input.
recognises
lists of the form A string in
Shen begins with " and ends with ". The basic
functions for strings are The action
of For a symbol S, **str**returns the string "^S^" that results from enclosing it in quotes.For a number N, **str**returns a string "^M^" whose contents M are such that (= M N) is true in Shen.**str**applied to a large floating point number may return a string whose contents are in e number notation.For a string S, **str**returns a string that when printed off reads as "^S^". However the internal representation of the string may and probably will differ from its print representation.For the failure object, **str**returns**"..."**
In earlier
versions of Shendoc (< 6.1) the question of the print
representation of streams and closures was not
determined. In fact these objects are sent to A more
general function The Shen
reader reads unsigned 8 bit bytes into unit strings and
parses these strings into other tokens as required. By
default the list of acceptable unsigned bytes is a subset
of the ASCII code points between 0 and 127, including all
the points from 32 to 126 and the points 9,10 and 13. The
function The Shen uses
decimal notation for reading bytes and for character
codes. The character string The
polyadic function The Shen reader parses (@s "123" "456") in a special way; as (@s "1" (@s "2" (@s "3" "456"))). The leading argument, if a string, is decomposed into a series of concatenations of unit strings. The significance of this is realised in the use of @s for pattern-matching over strings. @s is not a fast operation because many platforms represent strings as vectors and in these cases @s runs in linear time in respect of the size of the arguments. Within a function @s may be used for pattern-matching. For example; the following removes all occurences of my Christian name from a string.
xx{string
--> string}xx ""
-> ""xx(@s "Mark" Str) ->
(remove-my-name Str)xx(@s S Str) -> (@s S
(remove-my-name Str)))which is parsed into the following.
xx{string
--> string}xx"" -> ""xx(@s "M" (@s
"a" (@s "r" (@s "k" Str))))
-> (remove-my-name Str)xx(@s S Str) -> (@s S
(remove-my-name Str)))In Shen as
in Qi, a list consists of a series of items, seperated by
whitespace, and flanked by [ and ] to the left and right.
[] is the empty list as is (). Note that Kl does not understand [...] and that the
Shen reader translates this idiom into Kl. The basic constructors are
There is
the question of how to treat the application of For that
reason in Shen, Similar
observations apply to Note that cons does
have a type because the type checker is capable of
failing dotted pair applications as type insecure. Hence
the type of cons is A -->
(list A) --> (list A). In Shen, the dotted
pair (cons a b) is printed off as [a
| b].Shen does
not incorporate characters as a data type. The failure
object is no longer Streams in
Shen were introduced to encode some of the functions that
were hard-wired into Qi II like The basic functions for streams are
The
function The Every Shen
stream is read in or written to as a series of
The type
(open File Direction) : (stream
Direction);where the
values of In Shen 16
the zero-place function The Shen
reader sits on top of the primitive bytes stream reader
and reads from either a file (using
xx[exec Expr] -> [trap-error
[time Expr] [/. E failed]])The action
of this macro is to macroexpand every instance of The mode of
operation of Within the reader: the list of macros is composed to a fixpoint on every read token *t*(i.e. atom or list of atoms). If the token*t*is changed to*t*' where*t*<>*t*', the process is repeated on every subterm of*t*'.Within **eval**; the list of macros is composed on every subterm (i.e. atom or list of atoms) to a fixpoint.
In Kl and Shen, (1 dimensional) vectors fulfil all the tasks of arrays. In Kl there are only 4 primitive functions concerned with vectors. 1. All of these functions are accessible from Shen but only the last has a type, since Kl vectors may have elements of any type. It is an error to try to access an address beyond the limit of the vector or to supply any number which is not a whole number between 0 and the limit of the vector. Note that Vectors are
numbered from 0; so If a vector
V is created and nothing has been stored address V(N)
then the result returned by In Shen
absolute vectors are partioned into standard and
non-standard vectors. In Shen, by convention, when a 1. The 0th
element of V is allocated a positive integer which
indicates the size (limit) of the vector. The
function The
shortest standard vector is created by expression In Kl the primitive function The type
secure version of A
2-dimensional array is simply a vector of vectors and
therefore has a type which is an instance of (vector
(vector A)). Note that a vector of vectors may
incorporate vectors of different sizes (the result is
called a For
changing the contents of a vector, the function The
function
The
polyadic function The
semantics of Shen
accepts pattern-matching using
xx{(vector number) --> (vector
number)}xx<> -> <>xx(@v X Y) -> (@v (+ X 1) (add1
Y)))
A
non-standard vector is a vector where the 0th element is
not a non-negative integer. The utility of non-standard
vectors is that they can be used to construct other data
types like Tuples in
Shen are not primitive to Kl but
are represented in the Shen sysfile code as non-standard
three element vectors where the 0th element is a tag Because
tuples are defined internally using Kl primitives as non-standard vectors, the
type system of Shen does not recognise them as standard
vectors and hence, though they can be manipulated using
vector operations
The recognisor **tuple?**returns 'true' to tuples, but not to any other type checked Shen datatype.The tuple (@p 1 2) is printed off as such. **@p**is a two place function which associates to the right; (@p a b c) is just (@p a (@p b c)).(fst (@p a b)) = a and (snd (@p a b)) = b for all a and b.
Used to
define all printing in Shen, is As with Qi,
Shen includes the Note that Shen provides an extra formatting command ~R, which prints the argument using ()s rather than []s, which is useful for printing logical and mathematical formulae. All three
functions depend on There is
also a function Note that Vectors and
lists are printed off using <...>. The vector whose
elements from address 1 to the end are
The global Non-standard vectors are printed off in a special manner. For example, in porting Shen to a platform which lacked symbols as a basic datatype, the programmer can define symbol as a new immutable datatype comprised of a non-standard vector whose zeroth address holds a tag indicating that the vector represents a symbol and whose first address holds a string representing the symbol. In this
case the programmer can indicate in the non-standard
vector itself how the object is supposed to be printed
off by making the tag into a print function. This is
called a
The Shen printer, when confronted with a
non-standard vector V whose 0th address contains a non
positive integer F, uses F as as a
If the non-standard vector has no associated print function then it is printed off as a normal vector but with the 0th element included. The print
representation of the failure object is The global
variable The
functions
All these
functions return an error This
section deals with the generic functions;
(let X Y Z) = ((lambda X Z) Y) However this form is less natural and less familiar than the traditional local assignment and is not definable except by a macro. Note that in Shen (lambda X X) is legal.
The
function
The
primitive
as in The degree
to which any implementation of Kl uses
the information provided by this annotation will depend
on the platform and the implementation. Shen 11 and later
will, if required, automatically generate copious Optimisation of the type declarations in a type checked Shen function F is expected to preserve the I/O features of F only with respect to type secure input. Thus
may return an error on non-numeric inputs when optimised. The
following variables are external to the Shen package. The
variable Like Qi,
Shen includes property lists. However they are not
implemented using CL property lists, but instead rely on
a hashing function into a standard vector The
expression The
functions In Qi, (get-prop X P Y) = (trap-error (get X P) (/. E Y)) Unline CL,
arguments to
The hash
function The basic
error function is Note that Some examples
The numbers in Shen (and Kl) are as follows. 1. Numbers
may be positive or negative or zero. The BNF is <number>
:= <integer> | <float> | <e-number> |
<sign> <number> This is
deliberately kept simple. There is no distinction between
bignums, fixnums etc. There is already a Shen uses simple cancellation when reading signs; thus --3 is read as 3 and ---3 as -3. +3 is just 3. Note (- 3) returns a closure that subtracts its argument from 3. Any fronting number is treated as a token; so for instance [f 5a] will be parsed as [f 5 a] (since 5a cannot be a symbol). The maximum
size of any number and the precision of the arithmetic
are platform dependent. However the minimum of double
precision is For users running Shen on other platforms, it is highly likely that the platform has already defined >, +, -, *, /, <, >=, =, etc. in a way that is inconsistent with the semantics allotted to these symbols in Kl. The .kl sources use these symbols in their native dress because implementors wishing to implement Kl will want to use '*' for multiply etc. But platform providers compiling .kl sources to another language and who experience a name clash with a native function, should read carefully the notes on porting in the porting document. Division between integers in Shen will yield a float if the divisor is not a whole divisor of the numerator. In some languages, e.g. Python, (/ 3 2) gives 1 (integer division) and not 1.5. It was argued as to whether Shen should follow suit. It was felt that in such cases it was more intuitive to return an answer with a fractional component - most people would consider 3/2 = 1 as false and for people using Shen to do wages calculations etc. 'street' division is more appealing. Integer division has been placed in the standard maths library. An interesting question concerns the comparison of floats and integers is (= 1 1.0) true or not? Mathematically the decimal notation is simply a shorthand for a sum. i.e. 1.0 =
(1 x 10 Therefore
if = represents identity then (= 1 1.0) is true. In Shen
(= 1 1.0) ^{0}) + (2 x 10^{-1})
+ (3 x 10^{-2}). E numbers
are parsed similarly i.e. 1.23e2 = ((1 x 10 A contrary
approach is taken in Prolog where '1 = 1.0' is false. In
ML the comparison is meaningless (returns an error)
because 1.0 and 1 belong to different types - real and
int. This is wrong. Computing has fogged the issues here
and committed the traditional error of confusing The integer test in Shen runs in log time and is predicated on the following 'equations' integer -
integer = integer These 'equations', though mathematically true, can fail outside a certain range which depends on the precision of the platform and therefore the accuracy of this test depends on the precision of the arithmetic. Thus we recommend that Shen be installed with at least double precision which gives accuracy up to 16 digits. The For the
argument It is
optional to set this function to record not real time,
but run time; that is the actual CPU time. The argument The
argument Note that
not all platforms may support both One flaw in
Qi II and Qi I was that comments were made using \ .....
\ which meant that comments could not themselves be
commented out and the syntax clashed with that for
characters which also used \. Shen follows the convention
of starting comments with Version 10 allows for single line comments -\\ will blank out all the input before the next RETURN or newline. The following are all special forms in Shen and have their own hard-coded type rules; they are not type checked in curried form and do not sustain type secure currying. @p @s @v cons lambda let type where input+ define defmacro datatype /. synonyms open |

Mark

copyright (c) 2013, Dr Mark
Tarver

dr.mtarver@gmail.com