A small but powerful Java program for generating complex linguistic syntax trees

Nallantli/JSyntaxTree

JSyntaxTree

Download Here

A small Java program to build syntax trees and morphology trees, following the conventions my current morphology class prescribes.

Inspired by yohasebe's RSyntaxTree, which I came to use frequently. Unfortunately, I was pressed to make something new because the features required for my class surpassed the capabilities of his program.

Read here for information in Italian

CLI Options:

Syntax         Default       Description
-i <STRING>    -             input file path (required)
-o <STRING>    OUTPUT.png    output file path (include the .png, .jpg, etc.)
-f <STRING>    Doulos SIL    font name
-fs <INT>      48            font size
-l <FLOAT>     3.0           stroke size (for the lines)
-sx <INT>      50            spacing between adjacent nodes horizontally
-sy <INT>      150           spacing between adjacent nodes vertically
-c             -             color the tree
-q             -             quits right after generation - no preview window
-b <INT>       50            adds padding around the tree
-a             -             auto-subscript - will add a numerical subscript to node types
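
As a rough illustration of how the flags above combine, here is a minimal Java sketch that assembles a command line from them. It assumes the download is a runnable jar; the file name JSyntaxTree.jar, the class name CliArgsSketch, and the sample file names are all hypothetical and not taken from this README. Only flags listed in the table are used.

```java
import java.util.ArrayList;
import java.util.List;

final class CliArgsSketch {
    // Hypothetical jar name: the README does not state what the download is called.
    static final String JAR = "JSyntaxTree.jar";

    /** Builds a command line using only flags from the options table. */
    static List<String> command(String input, String output, int fontSize,
                                boolean color, boolean quit) {
        List<String> cmd = new ArrayList<>(List.of("java", "-jar", JAR));
        cmd.add("-i");  cmd.add(input);                      // required input file
        cmd.add("-o");  cmd.add(output);                     // include the extension
        cmd.add("-fs"); cmd.add(Integer.toString(fontSize)); // font size
        if (color) cmd.add("-c");                            // color the tree
        if (quit)  cmd.add("-q");                            // skip the preview window
        return cmd;
    }

    public static void main(String[] args) {
        // Prints the assembled command; it does not launch anything.
        System.out.println(String.join(" ", command("tree.txt", "tree.png", 36, true, true)));
    }
}
```

Any flag left out of the command falls back to the default shown in the table.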

Movement Syntax

Movement is fairly simple: by default it always "moves" to the left, but a negative number reverses the arrow.

The main syntax follows:

...
[TYPE^1 value]
...

The ^ operator (no spaces) indicates that movement stems from that node, and moves over 1 element to the left. The number can be changed to any number as long as there exists an element for it to move up to. The operator may also be duplicated:

...
[TYPE^1^2^3 value]
...

This syntax makes three individual movement arrows going 1 element over, 2 elements over, and 3 elements over.

...
[TYPE^1,2 value]
...

Two numbers separated by a comma indicate that the movement occurs at one of the parent nodes. Here, ^1,2 indicates that the movement goes 1 element to the left and then rises to the node 2 parents above. In this way you may have end nodes with children (see Examples/chomsky.txt for a working example).

As stated previously, negating the movement value reverses the arrow. So, to indicate rightward movement, place the ^ operator on the 'end' node and use -x steps:

...
[TYPE^-1,2 value]
...
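
To make the movement notation concrete, the following Java sketch splits a node head such as TYPE^1^2^3 or TYPE^-1,2 into its individual arrows. This is not the program's actual parser; the class name MovementSketch and the choice to record a missing second number as a rise of 0 are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class MovementSketch {
    /**
     * Parses the head of a node (the part before the value), e.g. "TYPE^1,2".
     * Each result is {steps, rise}: steps is the number of elements to move over
     * (negative reverses the arrow), rise is how many parents to climb.
     * Assumption: a missing rise is recorded as 0.
     */
    static List<int[]> parse(String head) {
        String[] parts = head.split("\\^");
        List<int[]> moves = new ArrayList<>();
        for (int i = 1; i < parts.length; i++) { // parts[0] is the node type
            String[] nums = parts[i].split(",");
            int steps = Integer.parseInt(nums[0].trim());
            int rise = nums.length > 1 ? Integer.parseInt(nums[1].trim()) : 0;
            moves.add(new int[] { steps, rise });
        }
        return moves;
    }

    public static void main(String[] args) {
        for (int[] m : parse("TYPE^1^2^3")) {
            System.out.println("steps=" + m[0] + " rise=" + m[1]);
        }
    }
}
```

A head with no ^ operator yields an empty list, which matches a node that has no movement arrows.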

Options

Nodes can be given special parameters using {} brackets after the node name:

[TYPE {
    content:"value";
    color:"255,255,0";
    line-color:"255,0,255";
    content-color:"0,255,255";
    move-color:"0,255,0";
}]

Options can likewise be applied to non-end nodes (without the content parameter, of course):

[TYPE
    {
        ...
    }
    [
        ...
    ]
]

Options must come before sub-nodes.
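
The key:"value"; pairs inside the {} block can be read with a small regular expression. The Java sketch below is only an illustration of the format shown above, not the program's own code; the class name NodeOptionsSketch is hypothetical, and it assumes that color values are three comma-separated integers (red, green, blue), as the sample values suggest.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NodeOptionsSketch {
    // Matches one pair of the form  key:"value";  (keys may contain hyphens).
    private static final Pattern PAIR =
            Pattern.compile("([\\w-]+)\\s*:\\s*\"([^\"]*)\"\\s*;");

    /** Collects every key:"value"; pair found in an options block, in order. */
    static Map<String, String> parse(String block) {
        Map<String, String> options = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(block);
        while (m.find()) {
            options.put(m.group(1), m.group(2));
        }
        return options;
    }

    /** Splits "255,255,0" into {255, 255, 0}. Assumes an R,G,B triple. */
    static int[] rgb(String value) {
        String[] c = value.split(",");
        return new int[] {
            Integer.parseInt(c[0].trim()),
            Integer.parseInt(c[1].trim()),
            Integer.parseInt(c[2].trim())
        };
    }

    public static void main(String[] args) {
        String block = "{ content:\"value\"; color:\"255,255,0\"; line-color:\"255,0,255\"; }";
        System.out.println(parse(block));
    }
}
```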

Text Syntax

Generally just bracket notation, with some text effects:

\n newline

[TYPE value 1\nvalue 2]

_WORDS_ subscript

[TYPE_SUBTYPE_]

*WORDS* bold

[TYPE *value*]

%WORDS% italic

[TYPE %value%]

$WORDS$ smaller font

[TYPE BIG$SMALL$]

-WORDS- underline

[TYPE -value-]

#WORDS# highlight

[TYPE #value#]

=WORDS= strikethrough

[TYPE =value=]

At the end of a node (i.e. at the position of the @ in [N value value@]), you may place one of these additions to determine what the connecting bar should be:

^ triangle

[TYPE value^]

| full bar

[TYPE value|]

If you need multiple words in a single token, wrap the words in backticks (`).

Of course, all combinations are acceptable.

(e.g. $*%_W_O_R_D_S_%*$)
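
One way to think about combined text effects is as toggles: each marker character switches its style on at the first occurrence and off at the next. The Java sketch below scans a string that way and reports which styles are active for each visible character. The toggle model is an assumption about how the markers behave, not a description of the program's internals, and the class name TextMarkupSketch is hypothetical. It deliberately ignores \n, backtick tokens, and the trailing ^ and | bar markers.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

final class TextMarkupSketch {
    enum Style { SUBSCRIPT, BOLD, ITALIC, SMALL, UNDERLINE, HIGHLIGHT, STRIKETHROUGH }

    /** Maps a marker character from the list above to its style, or null for plain text. */
    static Style styleFor(char marker) {
        switch (marker) {
            case '_': return Style.SUBSCRIPT;
            case '*': return Style.BOLD;
            case '%': return Style.ITALIC;
            case '$': return Style.SMALL;
            case '-': return Style.UNDERLINE;
            case '#': return Style.HIGHLIGHT;
            case '=': return Style.STRIKETHROUGH;
            default:  return null;
        }
    }

    /** Returns one "char=[styles]" entry per visible character. */
    static List<String> scan(String text) {
        EnumSet<Style> active = EnumSet.noneOf(Style.class);
        List<String> out = new ArrayList<>();
        for (char c : text.toCharArray()) {
            Style s = styleFor(c);
            if (s == null) {
                out.add(c + "=" + active);   // visible character with current styles
            } else if (!active.remove(s)) {  // marker: switch off if on...
                active.add(s);               // ...otherwise switch on
            }
        }
        return out;
    }

    public static void main(String[] args) {
        scan("$*%_W_O_R_D_S_%*$").forEach(System.out::println);
    }
}
```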

The program defaults to the Doulos SIL font (https://software.sil.org/doulos/), so you should probably have that installed. I am not entirely sure what Java falls back to when it is missing.

I may build a web interface at some point.

Currently there's a shoddy GUI I built for those who don't want to use the command line.

Examples

  1. Danube river steam ship driving company captain cabin door

  2. Troiae ab Oris

  3. Colorless green ideas sleep furiously

  4. あなたはユニコードの書物を使える
     If your font has the characters, you can use any Unicode characters in data files. (This example uses Unifont.)