			       CHANGES TO MALAGA
		      Björn Beutel, September 17th, 1999


= Changes in Malaga 4.3 =======================================================

In a rule set, there may be more than one rule after an "else" keyword, like
"rules (A1, A2, A3 else B1, B2, B3 else C1, C2, C3)".

The "matches" condition has been reworked. Now it looks like:
     MatchCond ::= Expr "matches" "(" Segment "," ... "," Segment ")" ";" .
     Segment ::= [Variable ":" ] Pattern-String .
A Pattern string may be any constant String (consisting of literals, constant
values and the operator "+"). The value of the String must be a pattern,
which may now contain parentheses for grouping.

The "input_rule" and "filter_rule" have been renamed to "input_filter" and
"output_filter", respectively.

The pruning rule now works differently: it has only one parameter, namely a
list of categories, and must execute a "return" statement with a list of
"yes"/"no"-symbols, one for each category in the parameter.

The value heap now grows automatically if needed; it can grow indefinitely, so
the option "heap-size" is no longer needed and has been abolished.

In the emacs Malaga support file, Malaga mode will also be invoked for files
with suffix ".mal", but no longer for files with suffix ".nav" or ".sub".

In the emacs Malaga support file, the commands "C-c m", "C-c r", "C-c l" and
"C-c d" have changed to "C-c C-p", "C-c C-r", "C-c C-l" and "C-c C-d",
respectively.

In an assert statement, you can now use "!" as a shorthand for "assert".

The function "floor(<n>)", which returns the greatest integer number
not greater than <n>, has been introduced.
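
The definition matters mostly for negative arguments, where rounding down
differs from truncation. A small Python sketch of the same definition (the
name "malaga_floor" is hypothetical and only models the behaviour described
above):

```python
import math


def malaga_floor(n):
    """Return the greatest integer that is not greater than n."""
    return math.floor(n)


print(malaga_floor(2.7))   # 2
print(malaga_floor(-2.5))  # -3, not -2: the result must not exceed n
print(malaga_floor(4))     # 4
```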

The repeat statement has been introduced.

The expression "<list> / <number>" yields <list> without its leftmost <number>
elements, if <number> > 0, or <list> without its rightmost abs(<number>)
elements, if <number> < 0.
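
The behaviour can be modelled in Python as follows. This is only a sketch:
the name "drop_elements" is hypothetical, and the handling of <number> = 0
and of a <number> larger than the list is an assumption, since neither case
is specified above.

```python
def drop_elements(items, number):
    """Model of "<list> / <number>" on a Python list."""
    if number > 0:
        # Drop the leftmost `number` elements.
        return items[number:]
    if number < 0:
        # Drop the rightmost abs(number) elements.
        return items[:number]
    # ASSUMPTION: a count of 0 leaves the list unchanged.
    return list(items)


print(drop_elements(["A", "B", "C", "D"], 1))   # ['B', 'C', 'D']
print(drop_elements(["A", "B", "C", "D"], -2))  # ['A', 'B']
```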

The function "symbol_name" has been changed to the function "value_string"
which can convert every value to a string.

The rule set in the initial state or in a result statement may be enclosed in
parentheses. 

Subrules may now have zero parameters.

The environment variables "MALAGA_TRANSMIT" and "MALAGA_DISPLAY" have been
abandoned. The command lines are now defined by setting the options "transmit"
and "display", respectively.

The user can now set preferred options for malaga and mallex in the startup
file "~/.malagarc".

The operator "<record1> * <record2>" works like "<record2> + <record1>".
So ":=*" is useful to add a default attribute to a record if that attribute
doesn't exist in the original record.
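
A Python sketch of the default-attribute idiom, with records modelled as
dictionaries. The function names are hypothetical. The sketch ASSUMES that
for "+" the value of the right operand wins when both records define the
same attribute; this is what the remark about defaults implies, but it is
not spelled out here.

```python
def record_add(left, right):
    """Model of "<left> + <right>" for records.

    ASSUMPTION: on a name clash the value from `right` is taken.
    """
    return {**left, **right}


def record_star(record1, record2):
    """Model of "<record1> * <record2>", defined as "<record2> + <record1>"."""
    return record_add(record2, record1)


entry = {"Case": "dat"}
defaults = {"Case": "nom", "Number": "sg"}

# Corresponds to "$entry :=* $defaults;": existing attributes are kept,
# missing ones are filled in from the defaults.
print(record_star(entry, defaults))  # {'Case': 'dat', 'Number': 'sg'}
```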

Malaga switches can now have any values, even records and lists.


= Changes in Malaga 4.2.5 =====================================================

The function "transmit" has been introduced, which allows communication with an
external process via pipe.

Indexes and floats are merged into the Malaga value type "number", which has
the same properties as "float".  If you want to access the sixth element of
$list, write "$list.6" instead of "$list.6L". Negative numbers count from the
right end of a list, e.g. "$list.-2". Attention: a dot that is immediately
following a number is part of the number, so "$list.2.4" is different from
"$list.(2).(4)".

The "match" condition, now called "matches" condition, is now written as
"<value> matches <pattern>" instead of "match <value> = <pattern>".

In mallex, the command "debug" has been renamed to "debug-entry". The command
"debug-file" has been introduced which generates allomorphs for a file in debug
mode. 

In mallex, the commands "ga-file" and "debug-file" leave their results
permanent, so the results can be displayed with "output" or "result".

The commands "ma-file" and "sa-file" don't interrupt if an error occurs during
file analysis. Instead, the error message is written into the output file and
the next item is analysed.

The new function "symbol_name" returns the name of a symbol as a string.

A condition can be used everywhere a (non-constant) value can be used. The
value of a condition used in such a place is "yes" or "no".

Conditions are now grouped by ordinary parentheses "()" instead of "{}".

A match condition can now be used in every place where an ordinary condition
can be used. Exception: If a match condition defines variables, it may not be
part of a disjunction or a negation. 

The pattern in a match condition may contain constants and literal strings;
it may contain parentheses and the operators & (for concatenation of
subpatterns) and | (for alternatives). They may only be mixed if precedence is
indicated by parentheses.
A variable definition which subsumes only part of a pattern must
be in parentheses:  '$x: "A" & "B"'  will assign "AB" to $x if it does match,
whereas '($x: "A") & "B"' will assign "A" to $x. A variable definition may not
be part of an alternative.
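
The scope of a variable definition can be compared to named groups in
Python's "re" module. This is only an analogy, not Malaga syntax, and it
assumes that the pattern has to match the whole string:

```python
import re

# Like '$x: "A" & "B"': the variable covers the whole concatenation.
whole = re.fullmatch(r"(?P<x>AB)", "AB")

# Like '($x: "A") & "B"': the variable covers only the first subpattern.
part = re.fullmatch(r"(?P<x>A)B", "AB")

print(whole.group("x"))  # AB
print(part.group("x"))   # A
```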

A lexicon file may now contain constant definitions "define @Name :=
Value;". The lexicon entries in a lexicon file may now be arbitrary Malaga
expressions, i.e. they may contain constants and the operators ".", "+", "-", 
"*" and "/".

The command "ga-line" has been introduced in mallex, which generates allomorphs
for a single entry in a lexicon file.

Allomorphs can be displayed graphically in mallex using "result". The commands
"output" and "result" have been incorporated into mallex, as well as the
options of the same names.

There may now be only one allo_rule in the allo rule file, which may only
call subrules and create allomorphs. The allomorphs are now created by the
command "result", not "allo". An allo rule file may also contain a
filter_rule which is called once for each set of generated allomorph lexicon
entries that share the same surface. This rule can be used to join entries with
a common surface.

In a morphology file, besides the combination rules, there may be a
filter_rule (formerly located in the mfil-file) and a robust_rule (formerly
called "unknown_rule").

In a syntax file, besides the combination rules, there may be an input_rule
(formerly known as "filter_rule" in the ifil-file), a pruning_rule, and a
filter_rule (formerly located in the sfil-file).

In Emacs Malaga mode, comments that start at the first column will not be
indented. 

The command line options of malaga, mallex etc. now have
one-letter-abbreviations, for example "-v" for "-version".

The option "alias" has been introduced. It is used to define command line
abbreviations.

The "paradigm" command has been deleted, so the "generate" statement has been
deleted, too.

Identifiers may also include the character "|".

"include" is now forbidden within rules.

The operators "+=" and "-=" have changed to ":=+" and ":=-", resp. They are 
complemented by the new operators ":=*" and ":=/".

The unary prefix operator "-" has been introduced which inverts floats and
indexes, i.e. it converts an index <n>L to <n>R and vice versa.

Constant values can now also contain parentheses "()", and the operators "+",
"-", "*", "/" and ".".

The end of a rule may now include the rule name: "end <rule_name>;".

The command "value" has been renamed to "print". It now also accepts indexes in
a variable path.

The option "sort-records" now has three possible settings: "internal", 
"alphabetic", and "definition" (as in the symbol-table).

Float values may now be preceded by a "-" sign.

The operator "value_type()" returns the type of a Malaga value coded as one of
the symbols "symbol", "string", "float", "index", "list", and "record". 

Indexes like 1L or 4R may now be part of Malaga values. They can also be part
of a path in an assignment.

The "." operator may now be followed by a list <e1, e2, e3> of symbols and/or
indexes. This will be interpreted as ".e1.e2.e3".

The operator "length()" returns the number of elements in a list as an index, 
e.g. length(<A, B, C>) = 3L.

The statement "choose" may now choose indexes:
"choose $Index in 6L;" generates paths where $Index has values 1L, 2L, ..., 6L.
"choose $Index in 6R;" generates paths where $Index has values 1R, 2R, ..., 6R.

The statement "foreach" may now iterate over indexes:
"foreach $Index in 6L: <Statements> end;" executes <Statements> where $Index
is assigned the values 1L, 2L, ..., 6L sequentially.

The operator "<list> - <index>" now removes ONLY the element at position
<index> in <list>.

The "remain" part of a "choose" statement has been removed. Where it has been
needed, it can be replaced by index iterating and removing elements by
position:
"choose $Element in $List remain $List" would be replaced by
"choose $Index in length($List); 
 define $Element := $List.$Index;
 $List :=- $Index;"
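
The replacement can be traced with a Python sketch in which each yielded
pair stands for one path generated by "choose". The function name is
hypothetical; positions are counted from 1, as in Malaga.

```python
def choose_with_remainder(items):
    """Yield (element, remainder) once for every position in the list."""
    for index in range(1, len(items) + 1):        # choose $Index in length($List);
        element = items[index - 1]                 # define $Element := $List.$Index;
        remainder = items[:index - 1] + items[index:]  # $List :=- $Index;
        yield element, remainder


for element, remainder in choose_with_remainder(["A", "B", "A"]):
    print(element, remainder)
# A ['B', 'A']
# B ['A', 'A']
# A ['A', 'B']
```

Because only the element at the chosen position is removed, duplicate
elements such as the two "A" entries are handled correctly.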

The argument to the function "switch" must now be a symbol instead of a string.

The commands "ma" and "sa" without arguments don't enter ma-mode or sa-mode
any longer; they re-analyse the last input. Use "ma-mode" and "sa-mode" to
enter ma-mode or sa-mode, respectively.


= Changes in Malaga 4.1 =======================================================

The functions in libmalaga now also return when an error has occurred. In this
case, the error message is in "malaga_error". Otherwise, "malaga_error" is
NULL.

The command "clear-cache", which deletes all wordforms in the cache, has been
introduced.

You can set switches in malaga and mallex with the option "set switch", and
you can query them in rules using the operator "switch".

The option "variables" has been introduced, to show variables automatically in
debug mode.

Output is now sent to a single graphical display program via pipes.
The program command line must be in the environment variable
MALAGA_DISPLAY.

A subrule can now be called before it is defined.

A command "trace" has been added, to show the current call stack.

The option "graphics" has been deleted. For textual result output, use the
command "output". For graphical results, use the command "result".
The commands can be automatically executed after "sa" or "ma".
Use the options "result" and "output" for this purpose.

The option "cache" has been deleted. Use "set cache-size 0" to deactivate the
cache.

"result-format" and "unknown-format" are also used for textual output with
"ma", "sa" and "result".

In "result-format" and "unknown-format", "%n" means the number of states for
this analysis. In "result-format", "%r" is the ambiguity-index.
 
"get cache-size" now also shows how many cache entries are used.

sa-file, ma-file and ga-file take an additional optional parameter, namely the
output file name.

malaga and mallex print statistical information when they work in batch mode.

analyse_item () in libmalaga now takes an additional argument, which says
whether malaga should create an analysis tree. 

mallex now also reads the project file if it is called via malmake.

Renamed option "heapsize" to "heap-size".

Implemented a word form cache and option "cache" to switch it on or off. The
cache size can be set using the option "cache-size".

Replaced option "format" by "allo-format", "result-format" and
"unknown-format".

libmalaga now reads the "malaga:" option lines from the project file, not the
"libmalaga:" lines. It ignores the options that only make sense for malaga.

The option "heapsize" has been introduced to set the heapsize to a new value.

Lines in the project file that start with "morinfo:" or "syninfo:" will be
stored as mor-info or syn-info. In malaga, use the command "info mor" or 
"info syn" to get this information. In libmalaga, use the function
"get_info(grammar_t grammar)".

Option lines in included project files ("include:" lines) are now also
executed.

In the left hand of an assignment, paths can now also include any expressions,
like "$var.$var1.($var2.attr) := value;"

"sa" now supports sa-mode. "ma" now supports ma-mode.

The "output" option has been replaced by the "graphics" option and the "tree"
option.

The "set" keyword must now be used when setting options that appear in the 
project file.

The "define" keyword must now also be used for constant definitions.

The "hidden" option syntax now needs a "+" in front of each symbol to hide, a
"-" in front of each symbol to hide no more and a "none" to hide no symbols.

TAB in Malaga mode only jumps to first non-blank if the cursor previously was
in front of the first non-blank.


= Changes in Malaga 4.0 =======================================================

The symbols "yes" and "no" are now defined by the system. A condition that
consists only of a value (without a condition operator) is tested for whether
that value is "yes" or "no". The former condition "capital" is now a
standard function that returns a "yes" or "no" value.

A definition of a new variable (formerly an assignment) now needs the keyword
"define" in front. It is now called the define-statement. An assignment 
(formerly a "set"-statement) doesn't need a "set" in front any longer.

A test-statement may be introduced by the "?" as well as the new keyword 
"require".

The "next" command has been introduced. It works like "step", but it executes
subrules without interruption.

The "set" command is now introduced to set options; there are no individual
commands for the individual options left. The "get" command is used to get the
current settings.

The initial state is now described in the format 
"initial <cat>, rules <rules>;" (for combi rules), 
or "initial rules <rules>;" (for other rules).

The result statement now displays the result in a TCL/Tk window. This can be
changed by the "output" option.

The "set()" function has been implemented. It takes one parameter and it
converts a list (multi-set) to a set where every element is contained at most
once.
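
A Python sketch of this conversion. The name "to_set" is hypothetical, and
the sketch ASSUMES that the first occurrence of each element is kept and
that the original order is preserved; the order of the result is not
specified above.

```python
def to_set(items):
    """Return `items` with duplicates removed.

    ASSUMPTION: the first occurrence is kept and the order is preserved.
    A list is used for `seen` so that nested lists and records
    (which are unhashable in Python) can be handled as well.
    """
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen


print(to_set(["A", "B", "A", "C", "B"]))  # ['A', 'B', 'C']
```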

The debug commands have been renamed to form a more regular pattern:
"debug" (for allomorph rules), 
"debug-line" (for lexicon lines),
"debug-mor" (for morphology combination rules), 
"debug-mfil" (for morphology output filter rules),
"debug-ifil" (for syntax input filter rules),
"debug-syn" (for syntax combination rules),
"debug-sfil" (for syntax output filter rules),
"debug-node" (for analysis states).

The keywords "base" and "cat" have been deleted. The "generate" statement now
takes an "allo" keyword instead of "base" if the rule can generate a base
allomorph. The "allo" statement now looks like: "allo <allo>, <cat>, <base>;".

Syntax input filter rules have been introduced. A syntax input filter rule file
has the ending ".ifil" and is executed after morphology output filter rules
have been executed and before syntax combination rules will do their work. 
As a consequence, the morphology output filter rules may only use symbols of
the symbol file, not of the extended symbol file (since the morphology output
filter rules now belong totally to the morphology system).

The "filter" command now takes the keywords "mfil", "sfil" and "ifil" instead
of "mor" and "syn".

The "error" statement now needs a string, namely the error message that it
should print.

The keyword "final_state_check" has been changed to "end_rule" again, since it
IS a rule (although not a combi-rule).

The "foreach" statement can now only include one list over which to
iterate. This reduces complexity of Malaga statements.

The "choose" statement can now assign the remainder to an existing variable:
use the form "choose $var1 in $list remain set $rem_var", which will assign the
remainder to "$rem_var".

The "start" statement has been deleted. Instead, the rule parameters have to be
specified behind the rule name, in parentheses, like:
"rule ABC ($start, $next, $surf):".
Combi rules have 2-4 parameters, namely start, next, next-surface (optional) 
and index (optional).
Pruning rules have 3 parameters: the list of state-cats already tested,
the category of the state currently tested, and the list of state-cats to be
tested later on.
Filter rules, allo rules and end rules have one parameter.

There is no difference between test statements and result statements any
longer. Therefore, the "case" statement has become superfluous and has been
removed. The "parallel" statement has been changed in syntax: instead of the
"subrule" keyword IN FRONT of each parallel part, an "and" keyword BETWEEN
two parallel parts is now used.

Subrules (i.e. Malaga functions) have been introduced. They start with the
keyword "subrule" and their parameter list can have any number of parameters. 
A subrule must return a value via the "result" statement: "result $xyz;". 
It is called in an expression like "$new := subrule_name ($Param1, $Param2);".
Subrules may nest, but they must not be called recursively. Every subrule must
be defined before it is called (no forward declarations are possible).

Values can be much bigger now (up to 1 Gigabyte, which is perhaps academic).


= Changes in Malaga 3.0 =======================================================

The command "hide" has been renamed to "hidden"; it takes an additional first
argument: "add", "delete" or "clear". For "add", all subsequent arguments are
added to the list of hidden arguments. For "delete", all subsequent arguments
are removed from the list of hidden arguments. For "clear", all symbols are
removed from the list of hidden arguments.

The command "attributes" has been renamed to "sort_records".

The command "hangul" now takes a parameter "on" or "off", so the command
"roman" has been deleted.

The command "show" has been renamed to "output".

The command "unknown" has been removed; its functionality has been included
in the command "format".

The command "debug-node" takes a state number which you get from the title of
a TCL/Tk state window, and executes all rules in debug mode that have this
state as Start-state.

The "Disam" package has been removed: the commands "disam" and "prune" are
no longer available.

The command "value" now supports paths: You can write a series of attributes
behind a variable name, e.g. "malaga> value $start.Form.Syn"

Filter rules have been introduced. You can have a filter rule system for your
morphology and one for your syntax. The morphology filter rule file has to end
in ".mfil", the syntax filter rule file has to end in ".sfil". The filter 
rules are called after the combination rules have been executed. They are
similar to the allomorph rules, only that they begin with "filter_rule"
instead of "allo_rule", and they get the list of results of the
combination rules as their start parameter. In the filter rules, you can
compare the results, change them and create the new actual analysis
results by using the "result ... accept;" statement. Filter rules can use the
symbols in the symbol file and the symbols in the extended symbol file.

If you include filter rules in your project file or in your command line
arguments when calling malaga, the execution of filter rules is switched on by
default. You can switch it on or off using the "filter" command. The filter
command needs two arguments: the filter rule system type ("mor" or "syn") and
one of "on" or "off".

The file ending ".sys" (for syntax symbol file) has been changed to ".esym"
(for extended symbol file) because the symbols in this file can now also be
used by the filter rules.

Pruning rules have been introduced. You can have a single pruning rule in your
syntax rule file. Before a set of states (which have consumed the same amount
of analysis input) is combined with a new next-input, the pruning rule is
called for each state of this set. The rule decides whether the state should be
deleted or not. The pruning rule starts with "pruning_rule", and in the start
statement, it gets a list of three elements as a parameter: the first element
is the list of state categories of the states that have already been examined
by the pruning rule, the second argument is the category of the state that is
to be examined currently, and the third argument is the list of state
categories of the states that haven't been examined yet. When the pruning rule
executes an "accept" statement, the state will be preserved; else it will be
killed.

In morphology rule files, after the initial state, you can now include an
"unknown_cat <cat>;" statement. When robust analysis is activated, an unknown
wordform is assigned the category <cat>. Robust analysis is switched on and off
with the command "robust".

Commands that are included in the project file after "malaga:" or "mallex:" 
are now also executed in batch mode, so only commands that change settings are
allowed here. Command line options for malaga and mallex have been reduced to
"-version", "-readable" and "-interactive" for mallex and "-version", 
"-syntax", "-morphology" for malaga.

The "-" operator for lists is the MULTI-SET difference now, whilst the "/"
operator for lists is the SET difference.
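
The difference between the two operators shows up as soon as a list contains
duplicates. A Python sketch (function names are hypothetical; which of
several equal occurrences the multi-set difference removes is an ASSUMPTION,
here the first one):

```python
def multiset_difference(left, right):
    """Model of "<left> - <right>": each element of `right` cancels
    at most one occurrence in `left`."""
    result = list(left)
    for item in right:
        if item in result:
            result.remove(item)  # removes the first occurrence only
    return result


def set_difference(left, right):
    """Model of "<left> / <right>": every occurrence of an element
    of `right` is removed from `left`."""
    return [item for item in left if item not in right]


print(multiset_difference(["A", "A", "B"], ["A"]))  # ['A', 'B']
print(set_difference(["A", "A", "B"], ["A"]))       # ['B']
```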


= Changes in Malaga 2.1 =======================================================

The keyword "end-rule" has been renamed to "final_state_check".

The character "-" may not be part of symbol names, keywords, variable
names... any longer. 

The new operator "-" can subtract floats and create the difference of
multisets. 

The symbol "nil" can be compared to any value (even records, lists and
strings). 

A rule set may contain multiple default rules 
(rules aa, bb, cc else dd else ee else...)

The "set" statement can now set the value of a specified attribute, like
"set $start.Form.Mor := $New_Value;". 

The new assignment operators "+=" and "-=" have been introduced (in set
statements only). The statement "set a += b;" is an abbreviation for
"set a := a + b;"; the same holds analogously for "-=".

The atomizing operator "* a" now has to be written as "atoms(a)". The inverse
operator has been introduced: "multi(a)" returns the multi-symbol whose atomic
symbols are equal to a, which must be a list of atomic symbols. An error is
reported if that multi symbol doesn't exist.
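
The relation between the two operators can be sketched in Python. The symbol
table below is purely hypothetical, as are the symbol names. The sketch
ASSUMES that "atoms" applied to an atomic symbol yields a one-element list
and that "multi" ignores the order of the atomic symbols.

```python
# Hypothetical symbol table: multi-symbol -> its atomic symbols.
MULTI_SYMBOLS = {
    "nom_acc": ["nom", "acc"],
    "sg_pl": ["sg", "pl"],
}


def atoms(symbol):
    """Return the list of atomic symbols of `symbol`.

    ASSUMPTION: an atomic symbol yields a list containing only itself.
    """
    return list(MULTI_SYMBOLS.get(symbol, [symbol]))


def multi(atom_list):
    """Return the multi-symbol whose atomic symbols equal `atom_list`."""
    for name, members in MULTI_SYMBOLS.items():
        if sorted(members) == sorted(atom_list):
            return name
    raise LookupError("no multi-symbol with atoms %r" % (atom_list,))


print(atoms("nom_acc"))         # ['nom', 'acc']
print(multi(["acc", "nom"]))    # nom_acc
print(multi(atoms("sg_pl")))    # sg_pl
```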

In the symbol table, the "*" is not needed to mark multi-symbols; it is now
forbidden. Furthermore, every multi-symbol's symbol list must contain two
symbols at least and all multi-symbols need to have different definitions.

The output format of the commands "ma-file", "sa-file" and "ga..." can be
configured. The commands "format" and "unknown" have been introduced for this
purpose. 

The condition "capital (string)" now tests whether "string" starts with a
capital letter.

In rule files, global constants can be defined by definitions of the form
"@Const := <constant>". Constants can only be defined outside of rules; they
are valid throughout the rule file.

The comparison operators "greater", "less_equal" and "greater_equal" compare
floating point numbers (like "less").

In Malaga Emacs mode, the mode-line now also includes the name of the project
file that is being used.

mallex can now also generate readable lexicon files in batch mode; use the
command line option "-readable". The lexicon file will then be printed on the
standard output stream.

Analysis statistics are now printed on the standard output stream.

= end of file =================================================================
