lt-trim(1)
NAME
lt-trim - This application is part of the lexical processing modules
and tools (lttoolbox).
This tool is part of the apertium machine translation architecture:
http://www.apertium.org.
SYNOPSIS
lt-trim analyser_binary bidix_binary trimmed_analyser_binary
DESCRIPTION
lt-trim is the application responsible for trimming compiled dictionaries.
The analyses (right-side when compiling lr) of analyser_binary are
trimmed to the input side of bidix_binary (left-side when compiling lr,
right-side when compiling rl), such that only analyses which would pass
through `lt-proc -b bidix_binary' are kept.
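As a rough sketch of how the three arguments fit together, the shell
function below compiles a monodix and a bidix with lt-comp and then trims
the analyser. The file names, the trim_analyser helper and the RUN
variable are assumptions made for illustration; only the argument order
of lt-trim is taken from the synopsis.

```shell
#!/bin/sh
# Sketch only: all file names are hypothetical placeholders.
# Set RUN=echo to print the commands instead of executing them.
RUN=${RUN:-}

trim_analyser() {
    monodix=$1   # monolingual dictionary (XML source)
    bidix=$2     # bilingual dictionary (XML source)
    outdir=$3    # directory that receives the compiled binaries

    # Analyser compiled lr, so the analyses are on the right side.
    $RUN lt-comp lr "$monodix" "$outdir/analyser.bin" &&
    # Bidix compiled lr, so its left side is the input side.
    $RUN lt-comp lr "$bidix" "$outdir/bidix.bin" &&
    # Keep only the analyses that the bidix input side accepts.
    $RUN lt-trim "$outdir/analyser.bin" "$outdir/bidix.bin" "$outdir/trimmed.bin"
}
```

Calling `RUN=echo; trim_analyser mono.dix bi.dix out' prints the three
commands without running them, which may be useful for checking the
argument order before touching real dictionaries.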
Warning: this program is experimental! It has been tested, but not
deployed extensively yet.
Compound tags (`<compound-only-L>', `<compound-R>'), join elements
(`<j/>' in XML, `+' in the stream) and the group element (`<g/>' in XML,
`#' in the stream) should all be handled correctly; even combinations
of + followed by # in the monodix are handled.
Some minor caveats: If you have the capitalised lemma "Foo" in the
monodix, but "foo" in the bidix, an analysis "^Foo<tag>$" would pass
through the bidix when doing lt-proc -b, but will not make it through
trimming. Make sure your lemmas have the same capitalisation in the
different dictionaries. Also, you should not have literal `+' or `#' in
your lemmas. Since lt-comp doesn't escape these, lt-trim cannot know
that they are different from `<j/>' or `<g/>', and you may get @-marked
output this way. You can analyse `+' or `#' by having the literal symbol
in the `<l>' part and some other string (e.g. "plus") in the `<r>'.
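The two lemma caveats can be checked mechanically before trimming. The
helpers below are a minimal sketch in plain POSIX shell;
has_reserved_symbol and differs_only_in_case are hypothetical names, and
extracting the lemmas from your dictionaries is deliberately left out.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical.

# Succeeds if the lemma contains a literal `+' or `#', which lt-trim
# cannot tell apart from <j/> or <g/>.
has_reserved_symbol() {
    case $1 in
        *+*|*'#'*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Succeeds if two lemmas are different strings that become equal once
# lower-cased, e.g. "Foo" in the monodix and "foo" in the bidix.
# Note: tr folds case according to the current locale, so non-ASCII
# letters may not be folded everywhere.
differs_only_in_case() {
    [ "$1" != "$2" ] &&
    [ "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" = \
      "$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')" ]
}
```

For instance, `differs_only_in_case Foo foo' succeeds and so flags the
pair from the capitalisation caveat, while `has_reserved_symbol a+b'
flags a lemma that is likely to produce @-marked output.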
You should not trim a generator unless you have a very simple translator
pipeline, since the output of bidix seldom goes unchanged through
transfer.
FILES
analyser_binary           The untrimmed analyser dictionary (a finite state transducer).
bidix_binary              The dictionary to use as trimmer (a finite state transducer).
trimmed_analyser_binary   The trimmed analyser dictionary (a finite state transducer).
SEE ALSO
lt-comp(1), lt-proc(1), lt-print(1), lt-expand(1), apertium-tagger(1),
apertium(1).
BUGS
Lots of...lurking in the dark and waiting for you!
AUTHOR
(c) 2013-2014 Universitat d'Alacant / Universidad de Alicante.
2014-02-07 lt-trim(1)