Differences

This shows you the differences between two versions of the page.

start [2018/07/10 10:55]
134.99.122.132 [Type errors]
start [2018/09/27 16:43] (current)
admin
Line 1: Line 1:
-====== Introduction ====== +The documentation ​moved to GitHub: [[https://​github.com/​spetitjean/​XMG-2/​wiki]]
- +
-The XMG system corresponds to what is usually called a "metagrammar compiler" (see below). More precisely, it is a tool for designing large-scale grammars for natural language. Given a compact representation of grammatical information, XMG combines elementary fragments of information to produce a fully redundant, strongly lexicalised grammar. Note that the name XMG refers to both: +
-  * a formalism allowing one to describe the linguistic information contained in a grammar, +
-  * a device computing grammar rules from a description based on this formalism.  +
- +
-===== What is a metagrammar ? ===== +
- +
-This term was introduced at the end of the 1990s by MH Candito. During her PhD, she proposed a new process for semi-automatically generating a Tree Adjoining Grammar (TAG) from a reduced description that captures the linguistic generalizations appearing among the trees of the grammar. This reduced description is the metagrammar. +
- +
-===== What is a metagrammar compiler ? ===== +
- +
-Once we have described the grammar rules by specifying the way structure is shared, i.e. by defining reusable fragments, we use a specific tool to combine these. Such a tool is called a metagrammar compiler.  +
- +
-===== What is XMG-2 ? ===== +
-A distinction has to be made between XMG and XMG-2 (sometimes called XMG-NG). +
-[[https://sourcesup.cru.fr/xmg/|XMG]] is a metagrammar compiler dedicated to the generation of Tree Adjoining Grammars and Interaction Grammars. XMG-2 is a whole new project which has been developed at the [[http://www.univ-orleans.fr/lifo/?lang=en|LIFO]] (University of Orléans) and the [[http://www.sfb991.uni-duesseldorf.de/|SFB 991]] (University of Düsseldorf). XMG-2 makes it possible to create new compilers, adapted to other generation tasks. Its modularity makes it easy to assemble Domain Specific Languages and to generate the processing chain for these languages automatically. +
- +
-In other words, XMG-2 is a tool that generates compilers such as XMG: a metacompiler (or compiler compiler). +
- +
-This user documentation ​of XMG-2 is based on the documentation for XMG, and includes the new features provided by the recent extensions. +
- +
-====== First steps ====== +
-This section presents the different ways XMG can be used, and how to use it to generate a first resource from a toy example. +
-===== Installation ===== +
-There are several ways to use XMG: it can be installed on the system (Linux only), run in a virtual machine, or used through a web page. The first two options are recommended for developing large-scale resources. +
-==== Option 1: standard installation ==== +
- +
-If you are using a Debian-based distribution (such as Ubuntu), open a terminal and follow these steps: +
- +
-=== Git: ===   +
- +
-{{ :​git.png?​nolink |}} +
- +
-    sudo apt-get install git +
- +
-=== Download and install Gecode (4.0 not supported yet): === +
-{{ :​gecode.png?​nolink |}} +
- +
-From here: http://​www.gecode.org/​download.html (recommended:​ http://​www.gecode.org/​download/​gecode-3.7.3.tar.gz,​ also available here: [[https://​drive.google.com/​uc?​export=download&​id=0B2gwCa-ajJXmOUd0VHBZYnNTZ2c|Gecode 3.7.3]]). +
- +
-    ./configure --disable-qt --disable-gist +
-    make  +
-If the build succeeds, you should see the following message: +
-  Compilation of Gecode finished successfully. To use Gecode, either add  +
-    /​.../​gecode-3.7.3 to your search path for libraries, or install Gecode using +
-   make install ​       +
-   +
-Then, as suggested, you can type:  +
-    make install ​    +
-Note that you might need to run this command as superuser (''sudo make install''). +
-If the installation succeeds, you should be able to run Gecode. You can try it by typing: +
-  ./​examples/​queens +
-   +
-If the installation fails, a dependency is probably missing. To install the required tools: +
-    sudo apt-get install g++     +
-    sudo apt-get install make +
-=== Download and install YAP (Yet Another Prolog): ===  +
- +
-    git clone https://​github.com/​spetitjean/​yap-6.3.git +
- +
-Then: +
-    ./configure --without-readline +
-    make  +
-    make install ​   +
-For dependencies:​  +
-    apt-get install libgmp3-dev +
-=== Install Python3 (>3.2): === +
-{{ :​python.png?​nolink |}} +
- +
-    sudo apt-get install python3 python3-yaml python3-pyqt4 +
- +
-=== Download XMG: === +
- +
-    git clone https://​github.com/​spetitjean/​XMG-2.git ​    +
-You can also get it as an archive here: [[https://​drive.google.com/​uc?​export=download&​id=0B2gwCa-ajJXmRmNfU1FvRFpwOGM|XMG-NG]] (this solution will not allow you to update XMG-2). +
- +
-=== Add XMG-2 to your PATH === +
- +
-Edit your ''~/.bashrc'' file and add this line, replacing ''path_to_xmg'' with the path to your XMG-2 directory (for example ''~/xmg-ng''): +
-    export PATH=path_to_xmg:​$PATH ​    +
-To edit the ''​bashrc''​ file, you can type: +
-  emacs ~/.bashrc +
- +
-==== Option 2: using VirtualBox ==== +
- +
-A VirtualBox image of XMG is available for an easier installation. +
-Use [[https://​www.virtualbox.org/​|VirtualBox]] and download one of the XMG virtual images: +
- +
-  *  [[https://​www.dropbox.com/​s/​nltmtbxram2yd73/​XMG-Ubuntu-18.04.ova?​dl=1|Ubuntu 18.04 virtual image]]: includes the parser TuLiPA-frames (default password is xmg) - last updated 06/​06/​2018. +
-  * [[https://​www.dropbox.com/​s/​knkg4qtfld3ir4g/​XMG-Xubuntu-18.04.ova?​dl=1|Xubuntu 18.04 virtual image]]: lighter version, includes the parser TuLiPA-frames (default password is xmg) - last updated 30/​05/​2018. +
- +
-====  Using XMG without installing anything ==== +
- +
-An online compiler is available at this address: [[http://​xmg.phil.hhu.de/​index.php/​upload/​workbench]]. +
- +
- +
-===== Updating XMG-2 ===== +
- +
-To get the latest version of XMG-2, regardless of the installation option you chose, you can type this command (in the xmg-ng directory):​ +
-   +
-  git pull +
-=====  Creating a first compiler ===== +
- +
-The instructions detailed here are equivalent to running the script **reinstall.sh** (see section [[http://dokufarm.phil.hhu.de/xmg/doku.php?id=start#scripts|Scripts]]). You can therefore skip this section by simply typing: +
- +
-  ./​reinstall.sh +
-  (at the root of the XMG-2 installation directory) +
- +
-Before compiling a metagrammar,​ a compiler needs to be created. XMG-2 assembles compilers by combining compiler fragments called bricks. These bricks are distributed into packages called contributions. For example:  +
-  * the contribution ''​core''​ provides bricks offering support for the basic features of a compiler +
-  * the contribution ''​treemg''​ makes it possible to process tree descriptions +
-  * the contribution ''​synsemCompiler''​ makes the synsem compiler (equivalent to XMG-1) available +
- +
-Installing a contribution with the command ''install'' makes all the bricks of this contribution available for assembly. +
- +
-First, install the needed contributions for the synsem compiler: +
- +
-  xmg bootstrap ​               +
-  cd contributions ​            +
-  xmg install core           +
-  xmg install treemg ​         +
-  xmg install compat ​         +
-  xmg install synsemCompiler ​  +
- +
-Then, build the compiler: +
- +
-  cd synsemCompiler/​compilers/​synsem +
-  xmg build +
- +
- +
-After these operations, the compiler synsem (Tree Adjoining Grammar with semantics based on predicate logic) is available. +
- +
-=====  Compiling a toy-metagrammar ===== +
- +
-The XMG system includes toy metagrammars that we highly recommend experimenting with. The files containing these metagrammars should be in the ''MetaGrammars'' directory of the XMG installation. To compile one of the synsem examples (adapted to the compiler we just built), type: +
- +
-    xmg compile synsem MetaGrammars/​synsem/​TagExample.mg +
-(see also List of XMG's options below) +
-The result of this compilation will be a file named TagExample.xml. +
- +
-To launch the GUI, type: +
- +
-    xmg gui tag +
- +
-You can then open the grammar file (.xml) generated by the compiler through the menu Fichier -> Ouvrir un XML (File -> Open an XML). +
- +
-=====  Compiling an existing metagrammar ===== +
- +
-To compile metagrammars which were created using XMG1, it is usually necessary to use the ''​--notype''​ option to cancel the type checking steps which did not exist in XMG1.     +
- +
-====== Writing a Metagrammar ====== +
-This section gives the general shape of a Metagrammar. The resource itself is described with domain specific languages (depending on the type of resource) which are provided by XMG dimensions. The different description languages available will be presented in the section [[http://​dokufarm.phil.hhu.de/​xmg/​doku.php?​id=start#​dimensions|Dimensions]]. +
- +
-===== Choosing a compiler ===== +
- +
-The first decision which needs to be made is the choice of the compiler. This decision depends on the type of linguistic resource to describe. Each compiler was created for a specific grammar engineering task, and features a set of dimensions. This means that each compiler comes with its own language. The list of available dimensions is given in the next section ([[http://​dokufarm.phil.hhu.de/​xmg/​doku.php?​id=start#​dimensions|Dimensions]]), ​ and a list of existing compilers using these dimensions is given in the section [[http://​dokufarm.phil.hhu.de/​xmg/​doku.php?​id=start#​bricks_contributions_and_commands|Bricks,​ contributions and commands]].  +
-===== Getting started ===== +
- +
-A metagrammar is composed of one or several text files, which usually have the extension ''.mg'' or ''.xmg''. Any text editor can be used to write XMG code, although [[https://www.gnu.org/software/emacs/|Emacs]] is recommended because of the different XMG modes created for it: +
- +
-  * the [[https://sourcesup.cru.fr/xmg/xmg.el|emacs]] and [[https://sourcesup.cru.fr/xmg/xmg.vim|vim]] modes for XMG-1 (only tree descriptions and predicate semantics), +
-  * new emacs modes inspired by the XMG-1 mode, which are generated automatically when a compiler is built (in the file ''.install/yap/xmg/compiler/X/generated/emacs_mode'', where X is the name of the compiler), +
-  * a more advanced emacs mode for tree descriptions and frame semantics: [[https://github.com/xmg-hhu/xmg-mode]]. +
- +
-The XMG online compiler also provides an online interactive editor: [[http://​xmg.phil.hhu.de/​index.php/​upload/​workbench]].  +
- +
- +
-===== Including data from other files ===== +
- +
-To ease development and reuse, metagrammars can be split across separate files. For example, all type declarations can be isolated in one file. To include the contents of a file in another: +
- +
-  include file_to_include.mg +
- +
-===== Principles and Constants ===== +
-==== Principles ==== +
- +
-The first piece of information to give in a metagrammar is the set of principles needed to compute the grammar structures. This is done with the ''use principle with (constraints) dims (dimensions)'' statement. For instance, one may decide to force the syntactic structures of the output grammar to contain the grammatical function gf with the value subj only once. This is expressed by: +
- +
-    use unicity with (gf = subj) dims (syn) +
-In the syn dimension, we use the unicity principle on the attribute-value ''​gf = subj''​. The description of the unicity principle, together with all information about principles and how to use/create them, can be found in the section [[http://​dokufarm.phil.hhu.de/​xmg/#​principles_and_plugins|Principles and plugins]]. +
- +
-Note that principles use as parameters pieces of information that are associated to nodes with the status property (see below).  +
- +
-==== Types and Constants ==== +
-Every piece of information in an XMG metagrammar is typed. This is of course the case for values in feature structures, but also for syntactic nodes, dimensions, classes, etc. There are four ways of defining a type: +
- +
-    *as an enumerated type, using the syntax ''​type Id = {Val1,​...,​ValN}''​ such as in:  +
- +
-    type CAT={n,​v,​p} ​    +
-(note that the values associated to a type are constants) +
- +
-    * as an integer interval, using the syntax ''​type Id = [I1 .. I2]''​ such as in:  +
- +
-        type PERS=[1 .. 3] +
- +
-    * as a structured definition (T1 ... Tn represent types) ''​type Id = [ id1 : T1 , id2 : T2 , ..., idn : Tn ]'',​ such as in:  +
- +
-    type ATOMIC=[ +
-         mode : MODE, +
-         num : NUMBER, +
-         gen : GENDER, +
-         pers : PERS] +
- +
-    * as an unspecified type, using the syntax ''type Id !'', such as in: +
- +
-    type LABEL ! +
-(this is useful when one wants to avoid having to define acceptable values for every single piece of information). Note that XMG provides three predefined types: int, bool (whose associated values are + and -) and string. +
- +
-==== Properties ==== +
- +
-Once types have been defined, we can define typed properties that will be associated with the nodes used in the tree descriptions. The role of these properties is either +
-  -  to provide specific information to the compiler so that additional treatments can be applied to the output structures to ensure their well-formedness, or +
-  -  to decorate nodes with a label that is linked to the target formalism and that will appear in the output (see XMG's graphical output). +
- +
-The syntax used to define properties is ''property Id : Type'', such as in: +
- +
-    property extraction : bool     +
-Some properties are specific to principles: this is the case for the properties **color** and **rank**. When the corresponding principles are used, these properties must be declared. See the section [[http://dokufarm.phil.hhu.de/xmg/#principles_and_plugins|Principles and plugins]] for more information about how to use these properties. +
- +
-Properties can also be used to give a "global" name to a node, thanks to the **name** property. To interface with the lexicon, one may want to give global names to some specific nodes, in order to be able to refer to these nodes in the lexicon. Such interfacing can be used, for instance, to manage semantic information. To associate global names that will appear in the semi-automatically produced grammar, you have to: +
-  *  declare an enumerated type containing all the names you will use: +
- +
-  type NAME = {subjNode, objNode, anchor} +
- +
-  * declare a property ''​name''​ of this type: +
- +
-  property name  : NAME +
- +
-  * associate to the specific nodes the predefined names: +
- +
-  node (mark=subst,​name=objNode)[cat=n] +
- +
-N.B.: make sure these name properties will not cause node unification failures, i.e. do not give different names to nodes that will be merged. In the end, the node names are visible in the output file (as an attribute of the node element): +
- +
-  <node type="​subst"​ name="​objNode">​ +
-==== Features ====    +
-  +
-Finally, we have to define the typed features that are associated with nodes in several syntactic formalisms such as Feature-Based Tree Adjoining Grammars (FBTAG) or Interaction Grammars (IG). A feature is defined by writing ''feature Id : Type'', such as in: +
-    feature num : NUMBER +
-So far, we have seen the declarations that the compiler needs to perform different tasks (syntax checking, output processing, etc.). Next we will see the heart of the metagrammar: the definition of the clauses, i.e. the classes. +
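- +
-As a recap of the declarations above, the following sketch shows what the header of a small metagrammar could look like. The types ''GF'' and ''NUMBER'' and their values are assumptions chosen for illustration; only the statement forms are taken from this section. +
- +
-    use unicity with (gf = subj) dims (syn) +
- +
-    type CAT = {n,v,p} +
-    type PERS = [1 .. 3] +
-    type GF = {subj,obj} +
-    type NUMBER = {sg,pl} +
-    type LABEL ! +
- +
-    property gf : GF +
- +
-    feature cat : CAT +
-    feature pers : PERS +
-    feature num : NUMBER +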
- +
-===== Classes ===== +
- +
-Here we will see how to define classes (i.e. the abstractions in the XMG formalism). Note that in TAG these classes refer to tree fragments. A class always begins with ''​class Id'',​ such as in: +
- +
-    class CanonicalSubj +
- +
-N.B. A class may be parametrized,​ in that case the parameters are between square brackets and separated by a colon. Parameters should be identifiers which do not appear in the namespace of the class. The values for the parameters are given when a class is instantiated. Values can be constants, variables, or other class instances. +
- +
-==== Import ==== +
- +
-To reach a better factorization,​ a class can inherit from another one. This is done by invoking ''​import Id''​ (where Id is a class name), such as in: +
- +
-    import TopClass[] +
-That is to say, the metagrammar corresponds to an inheritance hierarchy. But what does inherit mean here ? In fact, the content of the imported class is made available to the daughter class. More precisely, a class uses identifiers to refer to specific pieces of information. When a class inherits from another one, it can reuse the identifiers of its mother class (provided they have been exported, see below). Thus, some node can be specialized by adding new features and so on. +
- +
-Note that XMG allows multiple inheritance,​ and besides it offers an extended control of the scope of the inherited identifiers,​ since one can restrict the import to specific identifiers,​ and also rename imported identifiers. Restriction is done by using the keyword ''​as'':​ +
- +
-  import Class[] as [?V1,..., ?Vn] +
-   +
-will only import the variables listed (''?​V1,​...,?​Vn''​) to the scope of the current class. Renaming is also made possible by the keyword ''​as'',​ by using the ''​=''​ sign: +
- +
-  import Class[] as [?​V1,​...,?​Vi=?​X,​...,?​Vn]  +
- +
-will do the same as the previous example, except that the variable initially named ''?​Vi''​ will be known in this namespace as ''?​X''​. This is especially useful to avoid name conflicts. +
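- +
-A minimal sketch of restriction and renaming, assuming a hypothetical class ''VerbalArg'' that exports the nodes ''?S'' and ''?V''. Only these two identifiers are imported, and ''?V'' is known locally as ''?Anchor''. The body uses the <syn> tree description language presented in the Dimensions section: +
- +
-    class InvertedSubj +
-    import VerbalArg[] as [?S, ?V=?Anchor] +
-    declare ?N +
-    {<syn>{ +
-        node ?N (gf=subj)[cat=N] ; +
-        ?S -> ?N ; ?Anchor >> ?N +
-      } +
-    } +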
- +
- +
-==== Export ==== +
- +
-As we just saw, each class uses identifiers. One important point when defining a class is the scope we want these identifiers to have. More precisely, we can give (or not) external visibility to each identifier by using the export declaration. Only exported identifiers are available when inheriting from or calling (i.e. instantiating) a class. Identifiers are exported using ''export id1 id2 ... idn'', such as in: +
- +
-    export X Y +
- +
-==== Identifiers ==== +
- +
-In XMG, identifiers can refer either to a node, the value of a node property, or the value of a node feature. But whatever an identifier refers to, it must have been declared before by typing ''​declare id1 id2 ... idn'',​ such as in: +
- +
-    declare ?X ?Y ?Z +
-Note that in the declare section the prefixes ? (for variables) and ! (for skolem constants) are mandatory. +
- +
- +
- +
-==== Content ==== +
- +
-Once the identifiers have been declared and their scope defined, we can start describing the content of the class. Basically this content is given between curly-brackets. This content can either be: +
- +
-    * a statement +
-    * a conjunction of statements represented by ''​S1 ; S2''​ in the XMG formalism +
-    * a disjunction of statements represented by ''​S1 | S2''​ +
-    * a statement associated to an interface (see [[http://​dokufarm.phil.hhu.de/​xmg/​doku.php?​id=start#​ifaceconnecting_dimensions|Interface]])  +
-By statement we mean: +
- +
-    * an expression: E (that is a variable, a constant, an attribute-value matrix, a reference (by using a dot operator, see the example below), a disjunction of expressions,​ or an atomic disjunction of constant values such as ''​@{n,​v,​s}''​),​ +
-    * a unification equation: ''​E1=E2'',​ +
-    * a class instantiation: ''ClassId[]'' (note that the square brackets after the class id are mandatory even if the instantiated class has no parameter), +
-    * a description belonging to a dimension: this is where the main description task takes place (see section [[http://​dokufarm.phil.hhu.de/​xmg/​doku.php?​id=start#​dimensions|Dimensions]]) +
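- +
-Putting the pieces of this section together, a pair of classes could be sketched as follows. The class names, the node variables and the ''?'' prefix in the export line are assumptions for illustration; the second class inherits the exported nodes of the first and adds a subject node to them, using the <syn> tree description language presented in the Dimensions section. +
- +
-    class VerbalArg +
-    export ?S ?V +
-    declare ?S ?V +
-    {<syn>{ +
-        node ?S [cat=S] ; node ?V [cat=V] ; +
-        ?S -> ?V +
-      } +
-    } +
- +
-    class CanonicalSubj +
-    import VerbalArg[] +
-    declare ?N +
-    {<syn>{ +
-        node ?N (gf=subj)[cat=N] ; +
-        ?S -> ?N ; ?N >> ?V +
-      } +
-    } +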
- +
-===== Mutexes ===== +
- +
-Mutexes are the way provided by XMG to specify which classes are incompatible. To specify which classes are mutually exclusive, you first have to define a mutual exclusion set by typing ''​mutex Id''​ such as in: +
-  mutex SUBJ-INV +
- +
-Then classes need to be added to this set by invoking ''​mutex Id += ClassId''​ such as in: +
-  mutex SUBJ-INV += CanonicalObject +
-  mutex SUBJ-INV += InvertedNominalSubject +
-Here we specify that we cannot use in the same description both the ''​CanonicalObject''​ and the ''​InvertedNominalSubject''​ classes. +
- +
-Note that in the metagrammar file, the mutex definitions have to be placed after the type, property and feature declarations and before the valuations. This means that they can appear just before class definitions,​ between them, or right after. +
- +
-===== Valuations ===== +
-Once all the classes have been defined, we can ask for the evaluation of the classes that will trigger the combination of the fragments (i.e. classes calling classes that contain disjunctions and/or conjunctions of fragments). For each of these specific classes, we obtain an accumulated tree description that may lead to the building of 0, 1 or more TAG trees. The syntax of the evaluation instruction in XMG is **value Id**, such as in: +
- +
-    value n0Vn1  +
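- +
-As a sketch of how evaluation unfolds, suppose the classes ''CanonicalSubj'', ''InvertedNominalSubject'' and ''ActiveVerb'' are defined elsewhere (''ActiveVerb'' and ''Subject'' are hypothetical names, and their content is not shown here). The disjunction in ''Subject'' gives ''n0Vn1'' two alternative accumulated descriptions, one per subject realization, and each alternative may in turn lead to 0, 1 or more trees: +
- +
-    class Subject +
-    { CanonicalSubj[] | InvertedNominalSubject[] } +
- +
-    class n0Vn1 +
-    { Subject[] ; ActiveVerb[] } +
- +
-    value n0Vn1 +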
-====== Dimensions ====== +
-Dimensions contain the linguistic descriptions, which are composed of constraints. Each XMG dimension comes with a specific set of constraints, which make it possible to describe different structures (trees, feature structures, etc.). This section presents the different dimensions supported by XMG, and their description languages. +
-===== SYN: a tree description language ===== +
- +
-The <syn> dimension makes it possible to describe trees, initially to create Tree Adjoining Grammars or Interaction Grammars. To use this language, you can either build a new compiler using the brick **syn** (contribution **treemg**) or use one of the existing compilers including the dimension: **synsem** (contribution **synsemCompiler**, with predicate based semantics) or **synframe** (contribution **synframeCompiler**, with frame based semantics). +
- +
-A syntactic description is given following the pattern ''<​syn>​{ formulas }''​. Now what kind of formulas does a syntactic description contain ? The answer is nodes. These nodes are in relation with each other. In XMG, you may give a name to a node by using a variable, and also associate properties and/or features with it. The classic node definition is ''​node ?id ( prop1=val1 , ... , propN=valN ) [ feat1=val1 , ... , featN=valN ]''​ such as in: +
- +
-    node ?Y (gf=subj)[cat=n] +
-Here we have a node that we refer to by using the ?Y variable. This node has the property gf (grammatical function) associated with the value subj, and the feature structure [cat=n] (note that associating a variable to a node is optional). +
- +
-Once you have defined the nodes of the tree fragment, you can describe how they are related to each other. To do this, the following operators are available: +
- +
-    * -> strict dominance +
-    * ->+ strict large dominance (transitive non-reflexive closure) +
-    * ->* large dominance (transitive reflexive closure) +
-    * >> strict precedence +
-    * >>+ strict large precedence (transitive non-reflexive closure) +
-    * >>* large precedence (transitive reflexive closure) +
-    * = node equation +
- +
-Each subformula you define can be added conjunctively (using ";"​) or disjunctively (using "​|"​) to the description. For instance, the fragment: +
- +
-{{ :​tree.png?​nolink |}} +
- +
- +
-can be represented by the following code in XMG: +
- +
-    class Example +
-      declare ?X ?Y ?Z +
-      {<​syn>​{ +
-        node ?X [cat=S] ; node ?Y [cat=N] ; node ?Z [cat=V] ; +
-        ?X -> ?Y ; ?X -> ?Z ; ?Y >> ?Z +
-      } +
-    } +
-XMG also supports an alternative way of specifying how the nodes are related to each other. This alternative syntax lets the user define the nodes and give their relations at the same time: +
- +
-    * node { node } strict dominance +
-    * node { ...+node } strict large dominance (transitive non-reflexive closure) +
-    * node { ...node } large dominance (transitive reflexive closure) +
-    * node node strict precedence +
-    * node ,,,+node strict large precedence (transitive non-reflexive closure) +
-    * node ,,,node large precedence (transitive reflexive closure) +
-    * = node equation +
- +
-Thus the tree fragment above could be defined in the XMG syntax the following way: +
- +
-    class Example +
-    {<​syn>​{ +
-       node [cat=S] { +
-               node [cat=N] +
-               node [cat=V] +
-               } +
-       } +
-    } +
- +
-Note that variables are no longer needed to refer to the nodes inside the fragment. Nonetheless, we may want to assign variables to nodes in order to reuse them later through inheritance. +
- +
- +
-===== IFACE: connecting dimensions ===== +
- +
-Interfaces correspond to attribute-value matrices, allowing one to associate a global name to an identifier. +
-The syntax of the interface is the following (the interface is between square-brackets):​  +
- +
-  class Id +
-  { ... }*= [Name1=Id1, ... , NameN=IdN] +
- +
- +
-The *= operator represents unifying extension. When a class is evaluated, the descriptions (contained in the classes) it refers to are accumulated. At the same time, the interfaces associated with these descriptions are accumulated. The semantics of their accumulation may correspond to unification. +
- +
-Let us see the use of an interface in an example. Consider the tree fragment used so far, and imagine we want to refer to the N node outside of the class. To do so, we give this node a global name by using the following interface: +
- +
-  class Example +
-  declare ?X ?Y ?Z +
-  {<​syn>​{ +
-       node ?X [cat=S] { +
-                        node ?Y [cat=N] +
-                        node ?Z [cat=V] +
-                       } +
-       ​}*=[subj = ?Y] +
-  } +
- +
-In a class A which is combined with Example, you can constrain a local node ?X to be identified with the subj node of Example by reusing the feature subj in the interface of A: ''*=[subj=?X]''. +
-Note that the interface may also be used to give names to properties or feature values. +
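- +
-The following sketch illustrates this, assuming a hypothetical class ''A'' that describes an N node and a hypothetical class ''Combined'' that uses both classes. Because both interfaces use the name ''subj'', the node ''?X'' of ''A'' and the node ''?Y'' of ''Example'' are constrained to be identified: +
- +
-    class A +
-    declare ?X +
-    {<syn>{ +
-         node ?X [cat=N] +
-         }*=[subj = ?X] +
-    } +
- +
-    class Combined +
-    { Example[] ; A[] } +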
- +
-The interface can also be accessed as a regular dimension, meaning that the ''​*=''​ operator can be replaced as follows: +
- +
-  class Example +
-  declare ?X ?Y ?Z +
-  {<​syn>​{ +
-       node ?X [cat=S] { +
-                        node ?Y [cat=N] +
-                        node ?Z [cat=V] +
-                       } +
-       }; +
-   <​iface>​{[subj = ?Y]} +
-  }  +
-     +
- +
- +
-   +
-===== FRAME: describing semantics using typed feature structures ===== +
- +
-The <frame> dimension can be used in a compiler by using the **frame** brick (contribution **framemg**). A set of pre-assembled compilers uses this brick: **synframeCompiler** (with Tree Adjoining Grammars) and **framelpcompiler** (with morphological descriptions). +
- +
-This dimension makes it possible to describe typed feature structures. These structures use conjunctive types, which means that types are not atomic, but rather sets of elementary types. When two typed feature structures are unified, the type of the resulting structure is determined by a type hierarchy. In the simple case, and if the types are compatible, the resulting type is the union of both types. +
- +
- {{ :​hierarchy.png?​400 |}} +
- +
-Type hierarchies are defined in two steps. First, the declaration of the atomic types: +
- +
-  frame-types = {t1,​t2,​...,​tn} +
-   +
-where t1, t2, ..., tn are constants. +
- +
-Second, the atomic types are organized into a hierarchy by specifying constraints: +
- +
-  frame-constraints = {c1, c2,..., cn } +
-   +
-where c1, c2, ..., cn are type constraints. Several sorts of them are available:  +
- +
-  * constraints concerning subtyping: ''​ t1 t2 ... tn -> tt1 tt2 ... ttn ''​ +
-  * incompatibility constraints:​ ''​ t1 t2 -> - ''​ +
-  * constraints concerning attributes ''​ t1 t2 ... tn -> c1 ... cn '',​ with ''​c1 ... cn''​ constraints on attributes +
- +
-Constraints on attributes can be of the following types: +
- +
-  * existence constraint: ''​att : +''​ +
-  * value constraint: ''​att : val''​ +
-  * path equality ''​att1 = att2''​ +
- +
-Note that all attributes in these constraints can be paths, using dots. For example, ''​actor.name : +''​ means that there is an attribute ''​actor'',​ and that the value of this attribute has an attribute ''​name''​.  +
- +
-The following example makes use of all the types of constraints:​ +
- +
-  frame-types = {event, motion, activity, causation, locomotion} +
-  frame-constraints = {  +
-          causation -> event, +
-          motion -> event, +
-          activity -> event, +
-          motion causation -> -, +
-          activity causation -> -, +
-          activity motion -> locomotion,​ +
-          activity -> actor:+, +
-          motion -> mover:+, +
-          causation -> cause:+ effect:+ +
-  } +
-   +
-The first three constraints are subsumption constraints. ''causation -> event'' means for example that all frames of type ''causation'' also have type ''event''. The next two constraints express incompatibilities between types, meaning for instance that a frame cannot have both types ''motion'' and ''causation''. +
-''activity motion -> locomotion'' means that all frames having both types ''activity'' and ''motion'' will also have type ''locomotion''. +
-The last three constraints concern attributes. For instance, ''causation -> cause:+ effect:+'' makes sure that all frames of type causation have the attributes ''cause'' and ''effect'' (''+'' being the existence constraint listed above). +
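- +
-The example above does not show a path equality or a dotted attribute path. A hedged sketch of how they could be written, following the constraint syntax listed earlier (the two constraints themselves are assumptions for illustration, not part of the original example): +
- +
-    frame-types = {event, motion, activity, causation, locomotion} +
-    frame-constraints = { +
-            activity motion -> locomotion, +
-            locomotion -> actor = mover, +
-            causation -> effect.actor : + +
-    } +
- +
-Read this way, the second constraint would require the ''actor'' and ''mover'' attributes of a ''locomotion'' frame to share their value, and the third would require the value of ''effect'' in a ''causation'' frame to have an attribute ''actor''. +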
- +
- +
-{{ :​frames.png?​500 |}} +
- +
-   <​frame>​{ +
-    [causation,​ +
-      actor:?​X1,​ +
-      theme:?​X2,​ +
-      cause: ​   [activity,​ +
-                   ​actor:​ ?X1, +
-                   ​theme:​ ?X2], +
-      effect:?​IN[activity,​ +
-                   ​actor:​ ?X2] +
-    ] +
-   } +
-    +
-==== Exporting the type hierarchy ====  +
- +
-XMG computes the type hierarchy to handle the unification of typed feature structures during the compilation of the metagrammar. However, to be able to reuse this type hierarchy with the generated resource (with a parser for example), the hierarchy needs to be exported. When compiling the metagrammar, the option ''--more'' activates the export of additional useful resources, in this case the hierarchy. For a file called ''example.mg'', the complete command is the following: +
- +
-  xmg compile synframe example.mg --force --more +
-   +
-The compiled grammar can then be found in the file ''​example.xml''​ and the type hierarchy in the file ''​more.mac''​. +
- +
-===== SEM: describing semantics using predicates ===== +
- +
-Here we will see how to describe semantic information with predicates. Basically, this dimension allows one to describe: +
-  * predicates with 0, 1 or more arguments and a label,  +
-  * negation,  +
-  * a specific relation called "​scope_over"​ for dealing with quantifiers,​ +
-  * and semantic identifiers. +
- +
-So the language of the semantic dimension is: +
-  Description ::= l:​p(E_1,​...,​E_n) | ~ l:​p(E_1,​...,​E_n) | E_i << E_j | E  +
-In XMG concrete syntax, one may define a class with a semantic content by:  +
- +
-  class BinaryRel +
-  declare !L ?X ?Y ?P +
-  {  +
-    <​sem>​{ !L:?​P(?​X,?​Y) }*=[pred=?​P] +
-  } +
- +
-That is to say, we define the class ''BinaryRel'', in which three variables and a skolem constant (prefixed by "!") are declared. This class only contains semantic information (dimension <sem>): more precisely, it contains a predicate of arity 2 (whose value is the variable ?P) whose arguments are the variables ?X and ?Y. !L represents the label associated with this predicate. Note that we use the interface dimension to give the name **pred** to ?P. This variable may later be unified with a constant, which gives the predicate its value. +
- +
-Finally, it is possible to define a class containing both a semantic and a syntactic dimension, and these dimensions may share identifiers. Identifiers may also be shared by using the interface dimension. XMG thus provides efficient devices to define a syntax / semantics interface within the metagrammar. +
- +
-===== MORPH_LP: describing morphology with ordered fields ===== +
- +
-This dimension makes it possible to form words by assembling morphemes. First, fields need to be defined and ordered (using constraints), then information can be added to the fields. The description language offered by this dimension consists of only one keyword and two operators. +
- +
-    * ''​field''​ definition of a field +
-    * ''>>'' ​  ​linear precedence between fields  +
-    * '':'' assignment of a value to an attribute +
- +
-The following class shows concretely how the dimension can be used: +
- +
-  class plural_suffix +
-  { +
-    <morph>{ +
-      field suffix; +
-      root >> suffix; +
-      suffix <- "s" +
-    } +
-  } +
-   +
-A field ''suffix'' is created, and placed to the right of another field ''root'' (defined in another class). The string "s" is added to the new field. This complete example can be found in the ''MetaGrammars/lp_morph/example.mg'' file of XMG-2 (or on [[https://github.com/spetitjean/XMG-2/blob/master/MetaGrammars/lp_morph/example.mg|GitHub]]). +
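The way ordered fields yield a word form can be sketched in Python. This is only an illustrative model, not the solver XMG uses: the function name ''linearize'' and the root content "cat" are assumptions made for the example (the source only says that ''root'' is defined in another class).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def linearize(fields, precedences):
    """Concatenate field contents in an order respecting every `a >> b` constraint.

    fields: dict mapping a field name to its string content
    precedences: iterable of (before, after) pairs, one per `before >> after`
    """
    sorter = TopologicalSorter({name: set() for name in fields})
    for before, after in precedences:
        # `after` depends on `before`, so `before` is emitted first
        sorter.add(after, before)
    return "".join(fields[name] for name in sorter.static_order())

# Hypothetical content: "cat" stands for a root contributed by another class
word = linearize({"root": "cat", "suffix": "s"}, [("root", "suffix")])
print(word)
```

The sketch only covers the case where the constraints determine the order; when several orders are compatible with the constraints, a real solver would have to enumerate them.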
-   +
-Metagrammars containing contributions to the **morph_lp** dimension can be compiled with the compilers named lp (only morphology) and framelp (with semantic frames): +
- +
-  xmg compile lp file.mg +
- +
-===== LEMMA: describing lexicons of lemmas ===== +
- +
-This dimension is used when parsing (with a TAG for example). The typical use of these lexicons is to list which TAG families are compatible with the lemmas of the language. The description language basically associates values with different attributes. An example of a class using the lemma dimension is as follows: +
- +
-  class LemmeAller +
-  { +
-    <​lemma>​ { +
-      entry <- "​aller";​ +
-      cat   <- v; +
-      fam   <- n0Vloc1 +
-     } +
-  } +
-   +
-where ''​entry''​ is the lemma, ''​fam''​ a TAG family which can use this lemma as anchor, and ''​cat''​ the syntactic category of the anchor. +
-   +
-Metagrammars containing contributions to the <​lemma>​ dimension (only) must be compiled with the compiler named lex: +
- +
-  xmg compile lex lemma.mg +
-   +
-===== MORPHO: describing lexicons of inflected forms ===== +
- +
-This dimension is used when parsing (with a TAG for example). The typical use of these lexicons is to list which lemmas are compatible with the inflected forms of the language. The description language basically associates values with different attributes. An example of a class using the morpho dimension is as follows: +
- +
-  class a +
-  { +
-    <​morpho>​ { +
-      morph <- "​a";​ +
-      lemma <- "​avoir";​ +
-      cat   <- v +
-     } +
-  } +
-   +
-where ''morph'' is the inflected form, ''lemma'' is the lemma associated with this inflected form, and ''cat'' the syntactic category of the inflected form. +
-   +
-Metagrammars containing contributions to the <​morpho>​ dimension (only) must be compiled with the compiler named mph: +
- +
-  xmg compile mph morph.mg +
-   +
-====== Principles and plugins ====== +
-This section contains descriptions of existing XMG principles, and a method for users to create their own principles. +
-===== Solvers and principles ===== +
-XMG's most complex task is to compute all possible models for the descriptions of the metagrammar. These descriptions are sets of constraints, therefore extracting the models is a constraint satisfaction problem. Every dimension comes with its own solver (sometimes the identity), which builds only structures with the right properties. For example, the **syn** dimension (which describes trees) accumulates tree descriptions: nodes, dominance and precedence constraints. The **syn** dimension comes with a solver called **tree**, which makes sure that every solution of the constraint satisfaction problem is a well-formed tree (one root, etc.) that also satisfies the constraints given in the metagrammar. +
- +
-Principles are sets of constraints which come in addition to a solver. They are useful to describe constraints which cannot be described in classes. The metagrammar makes it possible to express constraints between two objects (two nodes for instance) when they can be referred to with variables, but it is sometimes necessary to express constraints between one structure from a class and a structure that might appear in another class, or several classes, or not appear at all. In other words, the XMG constraints can be considered local (to the class), whereas principles express global constraints. +
- +
-Three "​historical"​ principles are provided by XMG, for the dimension **syn**, namely: +
-    * **unicity**:​ uniqueness on a specific attribute-value +
-    * **rank**: ordering of clitics by means of associating the rank property to nodes +
-    * **color**: automatization of the node merging by assigning color to nodes +
-but more principles can also be added as **plugins**. This is what led to the creation of the new principles:​ +
-    * **precedes**:​ two properties are given as parameters. A node with the first property must precede one with the second property +
-    * **requires**:​ two properties are given as parameters. If a node with the first property exists, then a node with the second property must also exist. +
-    * **excludes**:​ two properties are given as parameters. If a node with the first property exists, then no node with the second property can exist. +
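The three plugin principles can be read as simple checks over the nodes of a candidate model. The Python sketch below is an illustration under assumptions, not the plugin code: nodes are modelled as dictionaries of properties listed from left to right, the function names are made up for the example, and ''precedes'' is read as "every node with the first property comes before every node with the second one", which is only one possible reading of the description above.

```python
def has(node, attribute, value):
    """True if the node carries attribute=value."""
    return node.get(attribute) == value

def requires(nodes, first, second):
    """If some node carries `first`, some node must carry `second`."""
    if any(has(n, *first) for n in nodes):
        return any(has(n, *second) for n in nodes)
    return True

def excludes(nodes, first, second):
    """If some node carries `first`, no node may carry `second`."""
    if any(has(n, *first) for n in nodes):
        return not any(has(n, *second) for n in nodes)
    return True

def precedes(nodes, first, second):
    """Assumed reading: every `first` node is left of every `second` node."""
    firsts = [i for i, n in enumerate(nodes) if has(n, *first)]
    seconds = [i for i, n in enumerate(nodes) if has(n, *second)]
    return all(i < j for i in firsts for j in seconds)

# Hypothetical model with a subject node followed by an object node
model = [{"gf": "subj"}, {"gf": "obj"}]
print(requires(model, ("gf", "subj"), ("gf", "obj")))
print(excludes(model, ("gf", "subj"), ("gf", "obj")))
print(precedes(model, ("gf", "subj"), ("gf", "obj")))
```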
-==== Colors ==== +
- +
-The **colors** principle uses a color language to semi-automate node unification during tree description solving. This idea was proposed by B. Crabbé (see [[https://link.springer.com/chapter/10.1007/11424574_3|[Crabbé and Duchier, 04]]]). The process is the following: +
-  - we decorate nodes with colors (red, black or white), +
-  - the description solving is extended so that the nodes are unified according to specific color combination rules: +
-{{ :​colors.png?​200 |}} +
- +
-That is to say: +
-  * a black node may be unified with 0, 1 or more white nodes, which produces a black node, +
-  * a white node has to be unified with a black one, which produces a black node, +
-  * a red node cannot be merged with any other node. +
- +
-As a result, a satisfying model is a model where all the nodes are either black or red. +
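The combination rules can be written down as a small merge function. The following Python sketch is illustrative only: the names ''merge'' and ''is_saturated'' are invented here, and two cases are assumptions because the text does not spell them out (black with black is rejected, white with white stays white until a black node is found).

```python
BLACK, WHITE, RED = "black", "white", "red"

def merge(c1, c2):
    """Return the color of the merged node, or None if the merge is forbidden."""
    pair = {c1, c2}
    if RED in pair:
        return None      # a red node never merges with anything
    if pair == {BLACK}:
        return None      # assumption: two black nodes cannot be merged
    if pair == {WHITE}:
        return WHITE     # assumption: still waiting for a black node
    return BLACK         # black + white produces a black node

def is_saturated(colors):
    """A model is acceptable when every node ends up black or red."""
    return all(c in (BLACK, RED) for c in colors)

print(merge(BLACK, WHITE))
print(is_saturated([BLACK, RED, BLACK]))
```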
- +
-The main advantage of this color labelling process is that we do not need to explicitly specify all the node unifications that have to be performed: the saturation of colors triggers these unifications. In other words, we can think of nodes in terms of "relative addresses". This means that we do not have to manage node variables (which correspond to "absolute addresses"), as the colors give a way to refer to "mergeable" nodes, i.e. black nodes that can be unified (and can thus receive a fragment). By reducing the use of variables, we prevent name conflicts, so we can for instance easily reuse the same tree fragment within the same tree description (this happens in TAG for trees with a double prepositional phrase). +
- +
-To use the **colors** principle, the metagrammar must include the following declarations:​ +
- +
-  use color with () dims (syn) +
-  type COLOR ={red,​black,​white} +
-  property color : COLOR +
- +
-When the principle is used, every node needs to be assigned a color (or to be unified with a node having one). As a reminder, giving such a property to a node is done as follows: +
- +
-  node ?X (color=red) +
-==== Rank ==== +
- +
-The **rank** principle is used to express linear orders between nodes which do not appear in the same classes. For example, in languages where there are strict constraints on the order of clitic pronouns, this principle makes the description task easier. The idea is to give a **rank** to every node representing a clitic, and this rank will make sure that if other clitics are added to the description (with their own ranks), they will be placed on the right side of this clitic. In a description where two nodes have respectively ranks 3 and 4, the only valid solutions will be the ones where the node with rank 3 precedes (not necessarily immediately) the one with rank 4. +
- +
-Warning: the **rank** principle only applies to sister nodes. If two ranked nodes have different mother nodes, no linear precedence constraint will apply on them. +
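The effect of the principle, including its restriction to sister nodes, can be sketched as a check applied separately to the children of each mother node. This Python fragment is a hedged illustration, not the implementation: the data layout and function names are hypothetical, and treating two equal ranks as compatible is an assumption the text does not settle.

```python
def rank_ordered(sister_ranks):
    """sister_ranks: ranks of sister nodes from left to right, None if unranked."""
    ranks = [r for r in sister_ranks if r is not None]
    # assumption: equal ranks are tolerated; a lower rank must never follow a higher one
    return all(a <= b for a, b in zip(ranks, ranks[1:]))

def tree_respects_ranks(children_by_mother):
    """Ranks are only compared among the children of the same mother node."""
    return all(rank_ordered(ranks) for ranks in children_by_mother.values())

# Hypothetical tree: ranks 3 and 4 are sisters under S, another rank 4 sits under VP
tree = {"S": [None, 3, 4], "VP": [4, None]}
print(tree_respects_ranks(tree))
```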
- +
-To use the **rank** principle, the metagrammar must include the following declarations:​ +
- +
-  use rank with () dims (syn) +
-  type RANK = [X..Y] +
-  property rank : RANK +
- +
-where ''X'' and ''Y'' are integers indicating the lowest and highest values for a rank. When the principle is used, nodes can be assigned a rank. As a reminder, giving such a property to a node is done as follows: +
- +
-  node ?X (rank=3) +
- +
- +
-==== Unicity ==== +
-To use the **unicity** principle, the metagrammar must include the following declaration:​ +
- +
-  use unicity with (attribute=value) dims (syn) +
- +
-where the pair of parameters (''​attribute''​ and ''​value''​) should only be seen in one node (at most) in every valid model. The principle ''​unicity''​ can be used either for features or properties. For example, with the following instance of the principle:​ +
- +
-  use unicity with (rank=1) dims (syn) +
- +
-models contain at most one node of rank 1. +
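Seen as a filter on candidate models, the principle amounts to counting the nodes that carry the given pair. A minimal Python sketch, assuming nodes are represented as dictionaries (a representation chosen for the example, not taken from XMG):

```python
def satisfies_unicity(nodes, attribute, value):
    """True if at most one node carries attribute=value."""
    return sum(1 for node in nodes if node.get(attribute) == value) <= 1

# Hypothetical models for `use unicity with (rank=1) dims (syn)`
print(satisfies_unicity([{"rank": 1}, {"rank": 2}], "rank", 1))
print(satisfies_unicity([{"rank": 1}, {"rank": 1}], "rank", 1))
```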
- +
-==== Requires ==== +
-To use the **requires** principle, the metagrammar must include the following declaration:​ +
- +
-  use requires with ( attribute1= value1, attribute2=value2 ) dims (syn) +
- +
- +
-==== Excludes ==== +
- +
-To use the **excludes** principle, the metagrammar must include the following declaration:​ +
- +
-  use excludes with ( attribute1= value1, attribute2=value2 ) dims (syn) +
- +
- +
-==== Precedes ==== +
- +
-To use the **precedes** principle, the metagrammar must include the following declaration:​ +
- +
-  use precedes with ( attribute1= value1, attribute2=value2 ) dims (syn) +
- +
- +
-===== Plugins ===== +
- +
-XMG-2 makes it possible for users to create their own principles without much programming effort. The solution provided to create these new principles is to use **plugins** for solvers. More documentation coming soon. +
- +
-====== Examples ====== +
- +
-===== Simple TAG example ===== +
- +
-Now, we will see in detail how to write a metagrammar. We will define a metagrammar generating a small TAG for French. This small TAG will contain 2 trees, namely the ones representing a transitive verb either with a canonical subject or with a subject in relative position. +
- +
- +
- +
-==== Specifying data ==== +
-  +
-The first thing to do is to define the principles, types, properties and features we will use. For the sake of clarity, we will only constrain the produced trees to have no duplicate grammatical function. That is to say, we will only activate the unicity principle with the gf property as parameter: +
- +
-  use unicity with (gf = subj) dims (syn) +
-  use unicity with (gf = obj) dims (syn) +
- +
-We will deal with few types in this example. We only pay attention to grammatical functions and syntactic categories. The first one is a node property and the second one a node feature (i.e. part of the TAG formalism): +
- +
-  type CAT = {n,v,s} +
-  type GF = {subj, obj} +
-  property gf : GF +
-  feature cat : CAT +
- +
-==== Defining blocks (tree fragments) ==== +
- +
-The metagrammatical rule we will use is the following:​ +
- +
-  transitive = (CanSubject | RelSubject) ; Active ; Object  +
- +
-So we will handle 4 tree fragments: Active, CanSubject, Object, and RelSubject. The class transitive will consist of an abstraction on a conjunctive combination including a disjunction on the subject that is used. +
-The Active class corresponds to the verbal spine:  +
- +
-{{ :​active.png?​100 |}} +
- +
-  class Active +
-  export ?X ?Y +
-  declare ?X ?Y +
-  {<​syn>​{ +
-        ?X -> ?Y +
-        } +
-  } +
- +
-The CanSubject class corresponds to the Example class introduced previously:​ +
- +
-{{ :​cansubj.png?​100 |}} +
- +
-  class CanSubject +
-  export ?X ?Y ?Z +
-  declare ?X ?Y ?Z +
-  { <​syn>​{ +
-        node ?X [cat = s]{ +
-                node ?Y (gf=subj)[cat=n] +
-                node ?Z [cat = v] +
-                } +
-        } +
-  } +
- +
-The Object class is the symmetric counterpart of CanSubject: +
- +
-{{ :​obj.png?​100 |}} +
- +
- +
-  class Object +
-  export ?X ?Y ?Z +
-  declare ?X ?Y ?Z +
-  { <​syn>​{ +
-        node ?X [cat = s]{ +
-                node ?Y [cat = v] +
-                node ?Z (gf=obj)[cat=n] +
-                } +
-        } +
-  } +
- +
-The RelSubject class and its concrete syntax are given below: +
- +
-{{ :​relsubj.png?​200 |}} +
- +
-  class RelSubject +
-  export ?X ?Y ?Z +
-  declare ?X ?Y ?Z ?U ?V +
-  { <​syn>​{ +
-        node ?U [cat = n]{ +
-                node ?V [cat = n] +
-                node ?X [cat = s]{ +
-                        node ?Y (gf=subj)[cat=n] +
-                        node ?Z [cat = v] +
-                        } +
-                } +
-        } +
-  } +
- +
-At this point, we may wonder why we associate variables with nodes. The answer is that we still have to merge these fragments, and we will use the exported variables to unify specific nodes. +
- +
-==== From tree fragments to trees ==== +
-  +
-Once the basic blocks have been defined, we can combine them to produce the expected trees. We define the transitive class: +
- +
-  class transitive +
-  declare ?SU ?OB ?AC +
-  { +
-        { ?SU = CanSubject[] | ?SU = RelSubject[] } ; ?OB = Object[] ; ?AC = Active[] ; +
-        ?SU.?X = ?OB.?X ; ?SU.?Z = ?OB.?Y ; ?SU.?X = ?AC.?X ; +
-        ?SU.?Z = ?AC.?Y  +
-  } +
- +
-In this class, we use the dot operator to access a variable in the record of identifiers exported by a class. For instance, ?OB being the variable representing the Object class, ?OB.?X refers to the ?X variable of this class, provided it has been exported. In the transitive class we conjunctively combine 3 classes (one being either CanSubject or RelSubject, plus Object and Active). We also unify their s and v nodes so that the tree fragments get merged. Note that we may prefer using a color system to semi-automate this node unification (see the **colors** principle). +
-Finally, the transitive class contains all the information needed to build 2 TAG trees. So we ask for its evaluation by invoking: +
- +
- +
-  value transitive +
- +
-As a result we obtain the 2 following trees (the first one represents the relative subject, and the second +
-one the canonical subject) : +
- +
-{{ :​solution_1.png?​300 |}} +
-   +
-{{ :​solution_2.png?​200 |}} +
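Why exactly two descriptions are solved can be illustrated by expanding the disjunction of the transitive class. The Python sketch below only enumerates the class combinations; it does not build or solve trees, and the variable names are made up for the example.

```python
from itertools import product

# The disjunction contributes two alternatives, the other conjuncts one each
subject_alternatives = ["CanSubject", "RelSubject"]
conjuncts = [subject_alternatives, ["Object"], ["Active"]]

# Each combination is one description handed to the tree solver
combinations = list(product(*conjuncts))
for combination in combinations:
    print(" ; ".join(combination))
```

In this metagrammar each of the two descriptions has a single tree model, which is why the compilation returns 2 trees.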
-   +
-==== The whole metagrammar ==== +
- +
- +
-  use unicity with (gf = subj) dims (syn) +
-  use unicity with (gf = obj) dims (syn) +
-  type CAT = {n,v,s} +
-  type GF = {subj, obj} +
-  property gf : GF +
-  feature cat : CAT +
-   +
-  class Active +
-  export ?X ?Y +
-  declare ?X ?Y +
-  {<​syn>​{ +
-        ?X -> ?Y +
-        } +
-  } +
-   +
-  class CanSubject +
-  export ?X ?Y ?Z +
-  declare ?X ?Y ?Z +
-  { <​syn>​{ +
-        node ?X [cat = s]{ +
-                node ?Y (gf=subj)[cat=n] +
-                node ?Z [cat = v] +
-                } +
-        } +
-  } +
-   +
-  class Object +
-  export ?X ?Y ?Z +
-  declare ?X ?Y ?Z +
-  { <​syn>​{ +
-        node ?X [cat = s]{ +
-                node ?Y [cat = v] +
-                node ?Z (gf=obj)[cat=n] +
-                } +
-        } +
-  } +
-   +
-  class RelSubject +
-  export ?X ?Y ?Z +
-  declare ?X ?Y ?Z ?U ?V +
-  { <​syn>​{ +
-        node ?U [cat = n]{ +
-                node ?V [cat = n] +
-                node ?X [cat = s]{ +
-                        node ?Y (gf=subj)[cat=n] +
-                        node ?Z [cat = v] +
-                        } +
-                } +
-        } +
-  } +
-   +
-  class transitive +
-  declare ?SU ?OB ?AC +
-  { +
-        { ?​SU=CanSubject[] | ?​SU=RelSubject[] } ; ?OB = Object[] ​ ; ?AC = Active[] ; +
-        ?SU.?X = ?OB.?X ; ?SU.?Z = ?OB.?Y ; ?SU.?X = ?AC.?X ; +
-        ?SU.?Z = ?AC.?Y  +
-  } +
-   +
-  value transitive +
-   +
-===== More examples =====  +
- +
-More examples can be found in the ''MetaGrammars'' folder of the XMG installation directory. Some grammars are also available on the [[http://xmg.phil.hhu.de/index.php/upload/resources|resources]] page of the XMG website. +
- +
-====== Bricks, contributions and commands ====== +
- +
-As stated previously, XMG-2 compilers are built using bricks, which implement the compiling steps for parts of metagrammatical languages. For example, the **avm** brick contains all the support for feature structures, and the **syn** brick supports the language of the <syn> dimension. +
-Bricks are distributed in contributions, for instance the **core** contribution, which contains all the basic features of XMG-2. The **treemg** contribution contains all the bricks which, in addition to the ones of the **core** contribution, make it possible to build the **synsem** compiler (equivalent to XMG-1). +
- +
-Making bricks available is done by installing them. The **install** command takes as parameter a contribution and installs all the bricks provided by it. +
- +
-All contributions with names ending with "compiler" are special, as they contain a different type of brick. These bricks contain descriptions of compilers, which need to be created before one can use them. Creating a compiler is done with the command **build**. +
- +
-Compilers are usually named after the dimensions they feature (**synframe** provides both the <syn> and the <​frame>​ dimensions). The following compilers can be installed:​ +
- +
-**framelpcompiler**:​ for morphological description with frame semantics. +
- +
-**lexCompiler**:​ to create lemma files for a TAG parser (as LexConverter). +
- +
-**lpcompiler**:​ for morphological descriptions. +
- +
-**mphcompiler**:​ to create files of inflected forms for a TAG parser (as LexConverter). +
- +
-**synframeCompiler**:​ for TAG descriptions with frame semantics. +
- +
-**syn2frameCompiler**:​ the same with a slightly different tree description language. +
- +
-**synsemCompiler**:​ for TAG descriptions with predicate semantics (XMG-1). +
- +
-**tfcompiler**:​ for morphological descriptions specified using topological fields. +
- +
-To learn more about how these compilers were assembled, and how to assemble customized compilers for specific description tasks, see [[https://​link.springer.com/​chapter/​10.1007/​978-3-662-53826-5_16|[Petitjean et al., 2016]]]. +
- +
-The commands provided by XMG fall into two categories: some are used by any user writing a linguistic resource, the others are reserved for developers of XMG extensions. +
- +
-User commands: +
-  * ''​xmg bootstrap'':​ installs the basic features of XMG. +
-  * ''​xmg install path_to_contribution'':​ makes a contribution available. +
-  * ''​xmg build'':​ assembles a compiler according to a yaml description. +
-  * ''​xmg gui gui_name'':​ starts a GUI. The only GUI provided up to now is ''​tag''​. +
-  * ''​xmg compile compiler_name path_to_metagrammar'':​ compiles a metagrammar with a given compiler. The options for this command are:   +
-    * ''​--force''​ to generate the grammar even if an XML file already exists +
-    * ''​--latin''​ to manipulate metagrammars written in latin encoding +
-    * ''​--debug''​ to print some useful information about compilation +
-    * ''​--notype''​ to disable the strong type checking (equivalent to XMG1) +
-    * ''​--more''​ to generate additional files (type hierarchy, etc) +
-    * ''--output'' or ''-o'' to specify an output file (the default is the name of the metagrammar file with the xml or json extension) +
- +
-Developer commands: +
-  * ''​xmg startcommand''​ +
-  * ''​xmg startyaplib''​ +
-  * ''​xmg startbrick''​ +
-  * ''​xmg startcompiler''​ +
-  * ''​xmg startpylib''​ +
-  * ''​xmg startcontrib''​ +
- +
-====== Scripts ====== +
- +
-To ease the installation of some compilers, scripts are available at the root of the XMG-2 installation directory.  +
- +
-By typing: +
- +
-  ./​reinstall.sh +
-   +
-the synsem compiler will be built and installed (all the other contributions will be uninstalled). To add the lex and the mph compilers, one can use the script +
- +
-  ./​install_lex_mph.sh +
-   +
-Other scripts are by convention named after the compiler(s) they install. All scripts starting with ''​reinstall''​ will first uninstall all existing compilers, all scripts starting with ''​install''​ will add compilers to the already installed set of compilers. +
- +
-  ./​reinstall_all.sh +
-   +
-will make all other bricks available (compilers still need to be built). +
- +
-====== Tools ====== +
- +
-===== XMGTOOL ===== +
- +
-XMGTOOL, packaged with XMG-2, is a utility to compare outputs generated by different metagrammars. It is typically used for debugging, while extending the grammar, or to compare the outputs of XMG-1 and XMG-2. XMGTOOL helps to track which classes generate more or fewer models, or where the entries of the grammar differ. +
- +
-First, the command ''pickle'' transforms the grammar into a format XMGTOOL can handle: ''xmgtool pickle grammar_file output_file'' will produce the file ''output_file'', which makes it possible to analyse the grammar contained in ''grammar_file'' (produced by XMG). +
- +
-The command ''fstat'' compares the numbers of models for each class contained in two grammars. ''xmgtool fstat file1 file2'' will print all classes for which the number of entries differs between the two grammars contained in ''file1'' and ''file2'' (these files must have been produced by the command ''pickle''). +
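The kind of comparison ''fstat'' reports can be pictured with the Python sketch below. It is not XMGTOOL's code and does not read its pickled files: each grammar is assumed to be given as a plain list with one class name per generated entry, and the function name is hypothetical.

```python
from collections import Counter

def count_differences(classes_a, classes_b):
    """Map each class whose number of entries differs to its (count_a, count_b)."""
    counts_a, counts_b = Counter(classes_a), Counter(classes_b)
    return {
        name: (counts_a[name], counts_b[name])
        for name in sorted(set(counts_a) | set(counts_b))
        if counts_a[name] != counts_b[name]
    }

# Hypothetical entry lists for two versions of a grammar
old = ["transitive", "transitive", "intransitive"]
new = ["transitive", "intransitive", "intransitive", "ditransitive"]
print(count_differences(old, new))
```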
-   +
-===== Viewers ===== +
- +
-Grammars generated with XMG can be viewed using the default GUI packaged with XMG, as shown in the introduction. Other options are available to visualize the generated resource, each of them offering support for different types of grammars: +
-  * Pytreeview, a tree viewer written in Python, for LTAG grammars: [[https://​gitlab.com/​parmenti/​pytreeview]] +
-  * the XMG webGUI, for LTAG grammars and semantic frames: [[http://​xmg.phil.hhu.de/​index.php/​upload/​upload_viewer]] ​  +
-   +
-===== Parsers ===== +
- +
-The resources created with XMG can of course be used for parsing. [[https://sourcesup.cru.fr/tulipa/biblio.html|TuLiPA]] [[https://www.aclweb.org/anthology/W/W08/W08-2316.pdf|[Parmentier et al., 2008]]] parses LTAG grammars with predicate-based semantics. Its new version, [[https://github.com/spetitjean/TuLiPA-frames/|TuLiPA-frames]] [[http://www.lrec-conf.org/proceedings/lrec2018/pdf/567.pdf|[Arps and Petitjean, 2018]]], provides a parser for LTAG with frame semantics. +
- +
-====== Errors and support ====== +
- +
-===== Common errors ===== +
-This section lists some errors that can be encountered while developing a resource with XMG. +
- +
-==== Tokenizer errors ==== +
-  * ''​unrecognized'':​ the given symbol is not supported by the tokenizer. You may check the encoding of the file or try to use the ''​--latin''​ option. +
-==== Syntax errors ==== +
-  * ''​expected'':​ syntax error. Check the different languages of the section [[http://​dokufarm.phil.hhu.de/​xmg/​doku.php?​id=start#​dimensions|Dimensions]]. Maybe the used compiler is not the right one? +
-==== Type errors ==== +
-  * ''​incompatible types'':​ the value of an attribute does not have the expected type. Check the type declarations. +
-  * ''​unknown constant'':​ all constants, except for the boolean values ''​+''​ and ''​-''​ (type **bool**), must be declared (a value for an enumerated type for example).  +
-  * ''​multiple definitions of constant (in type definitions)'':​ the same identifier is used to refer to two constants (a value can only have one type). +
-  * ''​variable not declared'':​ a variable appears in the class, but is not imported, nor declared, nor a class parameter. +
-  * ''​property_not_declared'':​ a node is given a property which was not defined in the headers. +
-  * ''​feature not declared'':​ a node is given a feature which was not defined in the headers. +
-  * ''​multiple definitions of feature'':​ the same identifier is used to refer to two different features. +
-  * ''​type not defined'':​ a structure is given (in the headers) a type which is not defined.  +
-  * ''​multiple definitions of type'':​ the same identifier is used to refer to two types. +
-  * ''​incompatible expressions'':​ the types of two expressions are not compatible. +
-  * ''​value not in range'':​ an integer variable has a value incompatible with its definition (out of the bounds).  +
- +
- +
- +
- +
-==== Unfolder errors ==== +
-  * ''​cycle detected with class'':​ the given class creates a cycle in the class hierarchies (it calls a class which is already one of its ancestors in the hierarchy). ​ This is forbidden as the resource generated would be infinite. +
-  * ''​no class set to be valued'':​ there should be at least one axiom in a metagrammar (see [[http://​dokufarm.phil.hhu.de/​xmg/​doku.php?​id=start#​valuations|Valuations]]). +
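The condition behind the ''cycle detected with class'' error can be sketched as a depth-first search over the calls between classes. This Python fragment is an illustration under assumptions, not the unfolder's implementation: the class hierarchy is given as a hypothetical dictionary from a class name to the classes it calls or imports.

```python
def find_cycle(calls, start):
    """Return the class names forming a cycle reachable from `start`, or None."""
    path = []      # classes on the current chain of calls
    done = set()   # classes already fully explored without finding a cycle

    def visit(name):
        if name in done:
            return None
        if name in path:
            # `name` is one of its own ancestors: report the loop
            return path[path.index(name):] + [name]
        path.append(name)
        for callee in calls.get(name, ()):
            cycle = visit(callee)
            if cycle:
                return cycle
        path.pop()
        done.add(name)
        return None

    return visit(start)

# Hypothetical hierarchy in which C calls back its ancestor A
print(find_cycle({"A": ["B"], "B": ["C"], "C": ["A"]}, "A"))
```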
- +
- +
-===== Other common problems ===== +
- +
-  * Uncolored nodes: in an XMG metagrammar using the **syn** dimension and the **colors** principle, a color needs to be given to all nodes appearing in the accumulation. Warnings will be displayed if some nodes do not have colors. Grammars developed with XMG-1 can contain uncolored nodes, but they are ignored by the compiler. With XMG-2, you can simply remove these nodes from the metagrammar to obtain the same result. +
- +
-===== Support ===== +
-To report any bug concerning XMG, please use the [[https://​github.com/​spetitjean/​XMG-2/​issues|issue tracker]]. You can also use the tracker for any question or request for assistance (installing,​ developing with XMG). +
- +
-Please also use the [[https://​github.com/​spetitjean/​XMG-2|GitHub page]] if you would like to request new extensions for XMG, or to share your own extensions or resources. +
- +
- +
-====== Bibliography ====== +
- +
-Related papers: +
- +
-  * [[https://​www.mitpressjournals.org/​doi/​abs/​10.1162/​COLI_a_00144|[Crabbé et al., 2013]]] Crabbé, B., Duchier, D., Gardent, C., Le Roux, J., and Parmentier, Y. (2013). XMG : eXtensible MetaGrammar. Computational Linguistics,​ 39(3):​1–66. +
- +
-  * [[https://​link.springer.com/​chapter/​10.1007/​978-3-662-53826-5_16|[Petitjean et al., 2016]]] Petitjean, S., Duchier, D., and Parmentier, Y. (2016). XMG 2: Describing Description Languages. In Logical Aspects of Computational Linguistics. Celebrating 20 Years of LACL (1996–2016) 9th International Conference, LACL 2016, Nancy, France, December 5-7, 2016, Proceedings 9, pages 255–272. Springer Berlin Heidelberg. +
- +
-  * [[https://link.springer.com/chapter/10.1007/11424574_3|[Crabbé and Duchier, 04]]] Crabbé, B. and Duchier, D. (2004). Metagrammar Redux. In Proceedings of CSLP’04, Roskilde, Denmark. +
- +
-  * [[https://www.aclweb.org/anthology/W/W08/W08-2316.pdf|[Parmentier et al., 2008]]] Parmentier, Y., Kallmeyer, L., Maier, W., Lichte, T., and Dellert, J. (2008). TuLiPA: A Syntax-Semantics Parsing Environment for Mildly Context-Sensitive Formalisms. In Proceedings of the 9th International Workshop on Tree-Adjoining Grammar and related Formalisms, June 2008, Tübingen, Germany, pages 121–128. +
- +
-  * [[http://www.lrec-conf.org/proceedings/lrec2018/pdf/567.pdf|[Arps and Petitjean, 2018]]] Arps, D. and Petitjean, S. (2018). A Parser for LTAG and Frame Semantics. In N. Calzolari, K. Choukri, C. Cieri, T. Declerck, S. Goggi, K. Hasida, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, S. Piperidis, and T. Tokunaga, editors, Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). European Language Resources Association (ELRA), Paris, France. +
- +
-====== Privacy ====== +
- +
-[[https://​www.uni-duesseldorf.de/​home/​footer/​datenschutz.html|Privacy declaration]]. +
- +
- +