Syntax & Semantics
General Structure
A Shrimp program—just called program in the following—consists of sections that must appear in a specified order. Every section has a type and a section of every type may occur at most once. A section starts with a line containing only the name of the section type and possibly trailing white spaces (no leading white spaces allowed). The content of a section is defined as all the lines from the start of the section until the end of the section minus the first line, i.e. the one containting the section type name.
Every line starting with a #
indicates the start of a new section and consequently must specify an existing section type. There are the following section types and the order in which they are listed reflects the order in which they must appear in a program:
#NAME
#OPTIONS
(optional)#FUNCTIONS
(optional)#VARIABLES
#CONSTRAINTS
(optional)#OPERATIONS
#PREDICATES
(optional)#FLOW
#VERBATIM
(optional)
Section types not marked as optional have to appear in a program.
The lines before the #NAME
-section in a program can contain anything (except a line indicating the start of a section) and will be treated as a comment. For example, it could be used to give a description of the program's underlying algorithm.
A program should not contain tabulator characters (replace with white spaces).
Comments
Comments are only allowed within a section. This means the line which indicates the start of a section (#...
) cannot contain a comment.
The syntax for comments is the same as in Haskell, i.e. --
for single line comments and {- -}
for multi-line comments.
For a multi-line comment it should hold that:
- there are only white spaces (or nothing) before
{-
in the line containing{-
- there are only white spaces (or nothing) after
-}
in the line containing-}
Legal identifiers
A legal identifier is a varid
(see section 2.4 Identifiers and Operators in the Haskell Language Report 2010) with the following additional restrictions:
- it may not begin with
sp__
- it may not begin with a small letter followed by
__
The program name and the name of every variable, function, operation and predicate must be a legal identifier.
EBNF Notation
The extended Backus-Naur form is used to specify the syntax of certain sections. Non-terminals are written as nonTerminal
and terminals as "str"
. Non-terminals used:
ws0
= {" "
} (0 or more white spaces)ws1
=" "
ws0
(1 or more white spaces)legalId
= set of legal identifiers (see above)legalId2
=ws0
legalId
ws0
Indentation rule
The right-hand side (everything right of =
) of
- an assignment in an operation
- a predicate
- a binary decision tree (BDT)
can span multiple lines, provided every line after the one where it starts is more indented than the first.
Empty lines
For the specification of the content of the section types (except for the #VERBATIM
-section) we assume that the program contains no comments and empty lines (lines only consisting of white spaces or nothing) without loss of generality.
#NAME
-section
The content of this section is a single line containing the program name, which must be of the form:
legalId2
The name is used to give the program a name in the host Haskell source code file. For example, if the program name is exampleAlg
then the function sp__exampleAlg
in the host Haskell source code file contains the description of this program.
#OPTIONS
-section
Each line in the content of this section is treated as an option.
An option may affect how a program is translated or parsed. An option is a line of the form OPTIONNAME
or OPTIONNAME $ARG1 $ARG2 ..
. Trailing white spaces are ignored. Unknown options or options with invalid arguments may be ignored by the compiler.
The following options exist:
TRANSLATION $MODE
where$MODE
is one of the following:Empty
,Interpreted
,Production
ALLOW_LOCAL_VARIABLES_WITH_TRAILING_APOSTROPHE
MERGE_PHANTOM_OPERATIONS
whenever a phantom operation is referenced in a BDT it is replaced with its BDTEXCLUDE_FUNCTION_SIGNATURE $FUNC
omits signature of$FUNC
in the Haskell code
The following options only apply to the the Interpreted
translation mode:
UNTRACED
no trace files are written when the program is executedUNTRACED_FUNCTION $FUNC
same as above but only when the function$FUNC
is calledMAX_STEPS $STEPS
program is executed for at most$STEPS
steps (e.g.MAX_STEPS 20
)MAX_STEPS_FUNCTION $FUNC $STEPS
same as above but only when the function$FUNC
is calledTRACE_DIRECTORY "$DIR"
writes trace files to$DIR
(must be escaped like a string in Haskell). The directory is relative to the one containing the host Haskell source code fileTRACE_DIRECTORY_FUNCTION $FUNC "$DIR"
same as above but only when$FUNC
is called
#FUNCTIONS
-section
Each line in the content of this section must be of the following form:
functionName
arguments
"="
output
where
functionName
=legalId2
arguments
= {ws1
,inputVariableName
}inputVariableName
=legalId2
output
=singleOutput
|tupleOutput
singleOutput
=legalId2
tupleOutput
=ws0
"("
legalId2
","
legalId2
{","
legalId2
}")"
ws0
Each line in this section describes an interface that allows one to execute the program. The function names are used in the host Haskell source code file. This interface specifies which variables are the input to the program and which variable(s) are returned as output at the end of the execution. All variables which are not part of the arguments of a function are set to undefined
at the beginning of the program execution.
#VARIABLES
-section
Each line in the content of this section must be of the following form:
ws0
["?"
] ["!"
]legalId
ws0
"::"
ws0
type
where
type
= any syntactically valid type expression in Haskell that may include type variables
Each line in this section describes a variable (name and type) used by the program.
A ?
-modifier indicates that the variable should not be included in the trace output.
A !
-modifier indicates that the variable should be evaluated strictly (not implemented in spc
yet).
Note: it is only sensible that variables which are input variables, i.e. occur as input parameter of a function, to contain type variables.
#CONSTRAINTS
-section
Each line in the content of this section is a type constraint on the type variables that occur in the #VARIABLES
-section and is written as in Haskell, i.e. Show a
.
#OPERATIONS
-section
The content of this section consists of a sequence of operations and each operation may consist of a set of assignments. A line which indicates the start of an operation is of the form:
operationName
":"
ws0
|operationName
ws1
"SAMEAS"
ws1
operationName
where
operationName
=legalId2
All subsequent lines after a line starting a new operation are interpreted as assignment belonging to that operation if they are more indented than that starting line, except if the operation is of them form x SAMEAS y
. Otherwise it is assumed that a line must be the start of a new operation. An assignment is of the form:
pattern
"="
expression
where
pattern
= any syntactically valid pattern in Haskell
An operation describes how the variables of the program can change in a single execution step.
The first operation in the section is the start operation of the program.
For an operation we distinguish between global variables (or program variables) and local variables (or helper variables).
A global variable is one that is defined in the #VARIABLES
-section. A local variable only occurs in the pattern of some assignment in the operation and is only visible within that operation.
Let varName
be the name of a global variable. Within the assignments of an operation varName
refers to the value of this variable before the operation was executed (old value) and varName'
to the value after the operation was executed (new value). Consequently, the patterns on the left-hand side should only use the global variables names with a trailing apostrophe. It is recommended that local variables do not use a trailing apostrophe in order to avoid confusion with global variables. However, this can be disabled with the option ALLOW_LOCAL_VARIABLES_WITH_TRAILING_APOSTROPHE
.
Note: due to the indentation rule it is possible that an assignment spans over multiple lines
#PREDICATES
-section
Each line in the content of this section must be of the form:
predicateName
"="
expression
where
predicateName
=legalId2
expression
= any syntactically valid expression in Haskell that may reference global variables
A predicate describes a boolean property in terms of the values of the variables. Predicates are only used in the #FLOW
-section to describe the binary decision trees.
Note: due to the indentation rule it is possible that a predicate spans over multiple lines
#FLOW
-section
Each line in the content of this section must be of the form:
ws0
["."
]operationName
ws0
"="
bdt
where
operationName
=legalId
predicateName
=legalId2
bdt
=ws0
(innerNode
|leaveNode
)ws0
innerNode
="("
predicateName
ws1
bdt
ws1
bdt
")"
leaveNode
="HALT"
|operationName
A line in this section specifies a binary decision tree (BDT) that is associated with an operation.
For every operation there must be an associated BDT. This BDT describes with what operation to continue or whether to halt the execution.
The first child in an innerNode
describes the branch that is taken when the predicate predicateName
is true. Otherwise the second child is taken.
The modifier .
indicates a phantom operation. This is an operation which does not change the contents of the global variables and thus does not have to (and should not) appear in the #OPERATIONS
-section. It can be used to compress common sub-BDTs.
Note: due to the indentation rule it is possible that a BDT spans over multiple lines
#VERBATIM
-section
The content of this section may contain anything but a line indicating the start of a new section. The content of this section is pasted as is in the Haskell source code file.
Program semantics
When a function from the #FUNCTIONS
-section is called, the program variables which are arguments of that function are initialized with the given values. All other variables are initialized with undefined
. Then the program begins execution with the start operation (the first operation that occurs in the #OPERATIONS
-section).
When an operation is visited then the variables change their values according to the specified assignments of that operation. Then the associated binary decision tree (BDT) of that operation is considered to determine what operation to visit next or to halt the execution. The predicates are evaluted w.r.t. the new values of the variables.