Skip to content

Core tools (driver script, transfer, tagger, formatters) for the FOSS RBMT system Apertium

License

GPL-2.0, LGPL-2.1 licenses found

Licenses found

GPL-2.0
COPYING
LGPL-2.1
COPYING.hunalign
Notifications You must be signed in to change notification settings

apertium/apertium

Repository files navigation

Apertium

Requirements

  • This package needs the package lttoolbox-3.5 installed in the system, as well as libxml and libpcre.

See https://apertium.org and https://wiki.apertium.org for more information on installing.

Description

When building, this package generates, among others, the following modules:

  • apertium-deshtml, apertium-desrtf, apertium-destxt Deformatters for html, rtf and txt document formats.
  • apertium-rehtml, apertium-rertf, apertium-retxt Reformatters for html, rtf and txt document formats.
  • apertium Translator program. Execute without parameters to see the usage.

Quick Start

There are binaries available for Debian, Ubuntu, Fedora, CentOS, OpenSUSE, Windows, and macOS. We package both nightly builds and releases. See https://wiki.apertium.org/wiki/Installation for more information. Only build from source if you either want to change this tool's behavior, or are on a platform we don't yet package for.

  1. Download the packages for lttoolbox-VERSION.tar.gz and apertium-VERSION.tar.gz and linguistic data

    Note: If you are using the translator from GitHub, run ./autogen.sh before running ./configure in all cases.

  2. Unpack lttoolbox and do ('#' means 'do that with root privileges'):

   $ cd lttoolbox-VERSION
   $ ./configure
   $ make
   # make install
  1. Unpack apertium and do:
   $ cd apertium-VERSION
   $ ./configure
   $ make
   # make install
  1. Unpack linguistic data (LING_DATA_DIR) and do:
   $ cd LING_DATA_DIR
   $ ./configure
   $ make
   and wait for a while (minutes).
  1. Use the translator
   USAGE: apertium [-d datadir] [-f format] [-u] <direction> [in [out]]
    -d datadir       directory of linguistic data
    -f format        one of: txt (default), html, rtf, odt, docx, wxml, xlsx, pptx,
                     xpresstag, html-noent, latex, latex-raw
    -a               display ambiguity
    -u               don't display marks '*' for unknown words
    -n               don't insert period before possible sentence-ends
    -m memory.tmx    use a translation memory to recycle translations
    -o direction     translation direction using the translation memory,
                     by default 'direction' is used instead
    -l               lists the available translation directions and exits
    direction        typically, LANG1-LANG2, but see modes.xml in language data
    in               input file (stdin by default)
    out              output file (stdout by default)


   Sample:

   $ apertium -f txt es-ca <input >output