# Introduction

This is the documentation for the embedded domain-specific language Shrimp which is based on Haskell. Visit the project website to download the newest version and for additional background info. If you have not installed it yet, please follow the instructions on the Install page.

It is assumed that you are familiar with the basics of Haskell. If this is not the case, check out the free online tutorial Learn You a Haskell, which is an excellent introduction for beginners. The first four chapters already suffice to get started with Shrimp.

## First steps

In Haskell, functions are defined using expressions. For example, the line `f x y = y + sum [1 .. x+y]` computes the pairing function. Sometimes, in order to define more complex functions the use of recursion is required and multiple functions that call each other. Shrimp provides a simple way to define such functions in terms of flow charts.

Consider the following problem. Given a string such as `'abbcccaa'` return `[('a',1),('b',2),('c',3),('a',2)]`, i.e. how often the same character appears consecutively. The following Shrimp program (also called sp-program) describes an algorithm for this:

``````#NAME
countAlg

#FUNCTIONS
count input = result

#VARIABLES
input  :: String --must be non-empty
result :: [(Char,Int)]
cur    :: Char
n      :: Int

#OPERATIONS
init:
cur'    = first input
n'      = 1
input'  = removeFirst input
result' = []

same: --same character as before
n'     = n + 1
input' = removeFirst input

new: --different character than before
cur'    = first input
n'      = 1
input'  = removeFirst input
result' = append (cur,n) result

final:
result' = append (cur,n) result

#PREDICATES
hasNext = not (isEmpty input)
newChar = cur /= (first input)

#FLOW
init  = (hasNext (newChar new same) final)
same  = (hasNext (newChar new same) final)
new   = (hasNext (newChar new same) final)
final = HALT

#VERBATIM
first = head --returns first element of a list
isEmpty = null --true iff list is empty
removeFirst = tail --returns list without first element
append x y = y++[x] --appends element x to end of list y
``````

We will discuss this program in more detail later on. First, let us see how this sp-program can be executed. Every sp-program must be included in a host Haskell source code file in order to be executed. Let us call this file `Counter.hs` and let it have the following content:

``````import qualified Shrimp.Interpreter

sp__thisFileName = "Counter.hs"

--START OF PROGRAM countAlg
--END OF PROGRAM countAlg
``````

The two comment lines state where `spc` should insert the sp-program. Additionally, sp-programs referenced from a Haskell file need to be in the correct place in order for `spc` to find them. In this case the correct directory structure is:

• `./Counter.hs`
• `./Counter/countAlg.sp`

The sp-program should have the same file name as what is written in the `#NAME`-section plus the suffix `.sp` and it should be in the directory that has the same basename (filename without extension part) as the Haskell source code file that references it.

Now, we can compile the file `Counter.hs`. If you don't want to create these two files yourself, they are also included in `Shrimp/Tests`. Open the terminal and navigate to the directory containing `Counter.hs` and execute the command `spc Counter.hs`. There should be no output on the terminal which means everything worked as expected. If you open `Counter.hs` again, you see that a translated version of the sp-program was inserted between the two comment lines.

Open `Counter.hs` with `ghci` and type `count "abbcccaa"`. It should return the correct result. Additionally, the trace produced by executing the sp-program is written in `./Counter/countAlg/1.csv`. If you open this in a spreadsheet program such as LibreOffice Calc then you should see a table like this: The values in trace row number 0 describe the input values after calling the function count. The bottom symbol ⊥ indicates that a value has not been initialized (undefined). If a cell in a variable column is blank this means the value has not changed.

Note: If you have problems seeing the trace correctly, import it as csv-file with `;` as only delimiter and `"` as string delimiter using `UTF-8` encoding.

Some things you can try out:

• Figure out the meaning of each section of `countAlg` using the trace as help
• What happens when `count ""` is called?
• How could `countAlg` be modified so that it returns `[]` for the input `""`? If you change `countAlg.sp` simply execute `spc Counter.hs` again to recompile

## Semantics

Let us go through each section of the sp-program `countAlg` and see what it means. A section begins with a line that starts with `#` and ends when the next section begins or the file ends.

``````#NAME
countAlg
``````

In the `#NAME`-section the name of the sp-program is defined. The name is used to reference the sp-program in the Haskell file (look for `sp__countAlg` in `Counter.hs`).

``````#VARIABLES
input  :: String --must be non-empty
result :: [(Char,Int)]
cur    :: Char
n      :: Int
``````

Program variables must be declared upfront in Shrimp in the `#VARIABLES`-section. It is also possible to use type variables in the type expressions.

``````#FUNCTIONS
count input = result
``````

A function in an sp-program specifies which program variables should be given as input and the value of which variable should be returned as result at the end of the execution. When an sp-program is called via a function, all variables which are not part of the input are set to `undefined` and using them before initializing them will cause an exception.

An sp-program can compute more than one function. Try adding the line `count2 n input = cur` to the section and see how it behaves.

``````#OPERATIONS
init:
cur'    = first input
n'      = 1
input'  = removeFirst input
result' = []

same: --same character as before
n'     = n + 1
input' = removeFirst input

new: --different character than before
cur'    = first input
n'      = 1
input'  = removeFirst input
result' = append (cur,n) result

final:
result' = append (cur,n) result
``````

An operation defines how the values of the program variables can change in one execution step. To distinguish the new values from the old ones, an `'` is appended to the variable name. For example, `cur' = first input` means that the new value of `cur` after executing the operation should be the first element of the old value of `input`.

Program variables on the left-hand side must always have a `'` at the end (because the old value can't be changed anymore). On the right-hand side both is possible. If the first assignment in `init` is replaced with `cur' = first input'` then the new value of `cur` would be the second element of the old value of `input` because that is the first element of `input'` (assuming `input` has enough elements).

All assignments of an operation must be more indented than the line where the operation starts. It is also possible that an operation contains no assignments. Multi-line assignments are allowed, provided the additional lines are more indented than the line where the assignment starts.

``````#PREDICATES
hasNext = not (isEmpty input)
newChar = cur /= (first input)
``````

In order for the sp-program to know with what operation to continue or whether the execution should halt, it needs to consider if certain properties hold with respect to the values of the program variables. A predicate describes such a property. The predicate name is on the left-hand side. The variables in the expression on the right-hand refer to the values after the operation has been executed. For example, the predicate `hasNext` is true afer the operation `init` iff the input string has at least two characters.

Multi-line expressions can be used to define predicates, provided each additional line is more indented than the line where the predicate begins.

``````#FLOW
init  = (hasNext (newChar new same) final)
same  = (hasNext (newChar new same) final)
new   = (hasNext (newChar new same) final)
final = HALT
``````

After executing an operation the sp-program needs to know with what operation to continue. For each operation this information can be specified using a binary decision tree (BDT). The inner nodes of the tree are labeled with a predicate and the leaves with operations or `HALT`. If a predicate is true then the first child is taken, otherwise the second. If the BDT of an operation consists of only one operation (or `HALT`) then there should be no parentheses.

The BDT `(hasNext (newChar new same) final)` is read as follows. If the predicate `hasNext` is true then continue with checking the predicate `newChar`. If `newChar` is true then continue with the operation `new`, otherwise with `same`. If `hasNext` is false then continue with the operation `final`.

``````#VERBATIM
first = head --returns first element of a list
isEmpty = null --true iff list is empty
removeFirst = tail --returns list without first element
append x y = y++[x] --appends element x to end of list y
``````

The `#VERBATIM`-section can be used to include functions and constants which are added on the global level to the Haskell source code file.

## Good to know

Additional functions: `spc` has some additional functions such as computing the code coverage given an sp-program and a directory of traces or comparing traces in various ways. See spc Usage for more information.

Tracing self-defined data types: In order for Shrimp to be able to generate traces for self-defined data types they must be a member of the type class `Show`. Alternatively, you can also hide variables from traces by prefixing them with `?`, e.g. `?result :: [(Char,Int)]`.

Local variables: It is possible to use variables which have not been declared in the `#VARIABLES`-section; they are called local variables or helper variables. Simply choose a variable name not used by a program variable in an assignment on the left-hand side. Such variable is only visible within the operation where it is defined.

``````#OPERATIONS
opExample:
a' = c * c
b' = c + c
c = 123 --local variable
``````

Phantom operations: Reconsider the `#FLOW`-section from the previous example.

``````#FLOW
init  = (hasNext (newChar new same) final)
same  = (hasNext (newChar new same) final)
new   = (hasNext (newChar new same) final)
final = HALT
``````

Notice how all operations but `final` have the same BDT. You can use a phantom operation to refactor this:

``````#FLOW
init  = nop
same  = nop
new   = nop
.nop  = (hasNext (newChar new same) final)
final = HALT
``````

The `.` in front of `nop` means that it does not modify the contents of the variables and thus does not need to occur in the `#OPERATIONS`-section. This can also be used if there is a sub-BDT that occurs repeatedly. The only difference in the traces are the additional rows with `nop` as operation. If you don't want to see phantom operations in the traces, use the option `MERGE_PHANTOM_OPERATIONS` (see below).

Options: There exists an `#OPTIONS`-section which can be used to modify certain aspects of how an sp-program is translated or executed. For example, an sp-program can be translated in different modes (`Production`, `Interpreted` and `Empty`). The default mode is `Interpreted` which was used in the previous example. It causes traces to be written as files during execution. If you are certain your program works as intended and you want maximum speed use the `Production` mode instead. You can do so by adding the following section right after the `#NAME`-section.

``````#OPTIONS
TRANSLATION Production
``````

Another useful option is `MAX_STEPS 42` which limits the number of execution steps to 42 (or any other number you like). This only works when using the `Interpreted` translation mode. A typical use-case for this option would be when a program's execution does not seem to terminate and you want a trace to see what is going on. For all possible options see the `#OPTIONS`-section in Syntax & Semantics.