PHP package for translation, spelling correction and text-to-speech (TTS) synthesis using external APIs

GinoPane/PHPolyglot

PHPolyglot

Combining and featuring different APIs for language translation, dictionary lookup, spelling correction and speech synthesis (TTS) in an easy to use and extend way.

Features

  • provides an easy-to-use way to utilise different language-related APIs for translation, grammar correction, TTS, etc.;
  • custom APIs are easy to add, because the package relies heavily on interfaces that new implementations can plug into (pull requests are appreciated);
  • open or free (possibly with limitations) APIs are preferred;
  • language codes must be ISO-639 compatible (alpha-2 or alpha-3 if there's no alpha-2);
  • third-party APIs may contain their own limitations or licensing requirements (see License)
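
The ISO-639 requirement above can be checked before a request is sent. The following is a minimal sketch, not part of PHPolyglot: looksLikeIso639Code is a hypothetical helper, and it only verifies the shape of a code (two or three letters), not that the code is actually assigned to a language.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (an assumption, not a PHPolyglot API): checks that a
 * value has the shape of an ISO 639 alpha-2 or alpha-3 code.
 * It does not verify that the code is assigned to a real language.
 */
function looksLikeIso639Code(string $code): bool
{
    // \A and \z anchor the whole string, so a trailing newline is rejected too.
    return preg_match('/\A[a-z]{2,3}\z/i', $code) === 1;
}

var_dump(looksLikeIso639Code('it'));    // two letters: accepted
var_dump(looksLikeIso639Code('fil'));   // three letters: accepted
var_dump(looksLikeIso639Code('en-US')); // regional tag: rejected
```

A check like this only catches malformed input early; the third-party API still decides which languages it supports.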

Requirements

  • PHP >= 7.1;
  • credentials for Yandex Translate API, Yandex Dictionary API and IBM Watson API (depending on what you are going to use).

Installation

composer require gino-pane/phpolyglot

Create a copy of the .env.example file, name it .env, and put your own API credentials in it. The file contains links to pages related to the required credentials.

To run the examples from the examples directory, you have to specify your own valid API credentials.

Basic Usage

The package contains plenty of ready-to-use examples in the examples directory. Every endpoint either returns a valid response or throws a relevant exception. All APIs are configured through the config.php file, which contains the default mapping of API classes. Support for dynamic configs was added in the 1.1.0 update:

$phpolyglot = new PHPolyglot($config, $env);

This allows you to pass your own configuration values if you don't want to rely on those that are stored in configuration files.

Translation

There are two endpoints. For a single string:

function translate(string $text, string $languageTo, string $languageFrom = ''): TranslateResponse

and for multiple strings:

function translateBulk(array $text, string $languageTo, string $languageFrom = ''): TranslateResponse

As a minimal example, you can pass the text and the language to translate into (the source language will be detected by the API):

$response = (new PHPolyglot())->translate('Hello world', 'it')->getTranslations(); // [ 0 => Ciao mondo ]

TranslateResponse has a getTranslations method, which returns an array of translations.

Supported languages may vary depending on the third-party API.

Yandex Translate API

Please check the list of supported languages. Yandex Translate API is free to use with limitations (1 000 000 characters per day, up to 10 000 000 per month). A paid plan is available if you need more. The API will not switch you to a paid plan automatically; it simply returns an error when the limit is reached. To use the API you need a valid API key.
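
Because the API returns an error once the free limit is reached, it can help to estimate the size of a batch before sending it. The sketch below is an illustration only: countCharacters and fitsDailyLimit are hypothetical helpers, the limit is the figure quoted above, and counting UTF-8 characters is an assumption about how usage is measured.

```php
<?php
declare(strict_types=1);

// Daily limit quoted in the text above; verify it against the current API terms.
const DAILY_CHARACTER_LIMIT = 1000000;

/**
 * Hypothetical helper: counts UTF-8 characters across a batch of strings.
 * Strings that are not valid UTF-8 contribute 0, because the match fails.
 *
 * @param string[] $texts
 */
function countCharacters(array $texts): int
{
    $total = 0;
    foreach ($texts as $text) {
        // "u" makes "." match whole UTF-8 characters, "s" lets it match newlines.
        $total += (int) preg_match_all('/./su', $text);
    }
    return $total;
}

/**
 * Hypothetical helper: true if the batch still fits into today's allowance.
 *
 * @param string[] $texts
 */
function fitsDailyLimit(array $texts, int $alreadyUsedToday): bool
{
    return $alreadyUsedToday + countCharacters($texts) <= DAILY_CHARACTER_LIMIT;
}

var_dump(fitsDailyLimit(['Hello world', 'Ciao mondo'], 0));
```

The caller is expected to track $alreadyUsedToday itself; nothing in this sketch queries the API for remaining quota.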

Dictionary Lookup

There is a single endpoint, which can be used in two different forms.

For a lookup within the same language (get word forms):

function lookup(string $text, string $languageFrom): DictionaryResponse

and for translation-with-lookup (get multiple translations and additional information including word forms, examples, meanings, synonyms, transcription, etc.):

function lookup(string $text, string $languageFrom, string $languageTo): DictionaryResponse

As a minimal example, you can pass the text and its source language:

$response = (new PHPolyglot())->lookup('Hello', 'en')->getEntries();

$synonyms = implode(", ", $response[0]->getSynonyms());

$output = <<<TEXT
Initial word: {$response[0]->getTextFrom()}

Part of speech: {$response[0]->getPosFrom()}
Transcription: {$response[0]->getTranscription()}

Main alternative: {$response[0]->getTextTo()}
Synonyms: {$synonyms}
TEXT;

echo $output;

/**
Initial word: hello
  
Part of speech: noun
Transcription: ˈheˈləʊ

Main alternative: hi
Synonyms: hallo, salut
*/

Supported languages may vary depending on the third-party API.

Yandex Dictionary API

Please check the list of supported languages. Yandex Dictionary API is free to use with limitations (up to 10 000 references per day). To use the API you need a valid API key.

Spelling Check

There are two endpoints. For a single string:

function spellCheck(string $text, string $languageFrom = ''): SpellCheckResponse

and for multiple strings:

function spellCheckBulk(array $texts, string $languageFrom = ''): SpellCheckResponse

As a minimal example, you can pass a text to check (the language argument is optional):

$corrections = $phpolyglot->spellCheck('Helo werld', $languageFrom)->getCorrections();

/**
array(1) {
  [0] =>
  array(2) {
    'Helo' =>
    array(1) {
      [0] =>
      string(5) "Hello"
    }
    'werld' =>
    array(1) {
      [0] =>
      string(5) "world"
    }
  }
}
*/
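
The corrections array shown above can be applied back to the original text. The following is a minimal sketch under stated assumptions: applyFirstSuggestions is a hypothetical helper (not part of PHPolyglot), it assumes the array shape from the dump above (one entry per checked text, mapping each misspelled word to a list of suggestions), and it always takes the first suggestion.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (an assumption, not a PHPolyglot API): replaces each
 * misspelled word with its first suggestion.
 *
 * @param string[] $texts       original texts, in the order they were checked
 * @param array[]  $corrections same shape as the dump above
 * @return string[]
 */
function applyFirstSuggestions(array $texts, array $corrections): array
{
    $fixed = [];
    foreach ($texts as $index => $text) {
        $replacements = [];
        foreach ($corrections[$index] ?? [] as $misspelled => $suggestions) {
            // Skip empty keys and words without suggestions.
            if ((string) $misspelled !== '' && !empty($suggestions)) {
                $replacements[(string) $misspelled] = (string) reset($suggestions);
            }
        }
        // strtr() applies all replacements in one pass over the original text.
        $fixed[$index] = $replacements === [] ? $text : strtr($text, $replacements);
    }
    return $fixed;
}

$corrections = [
    0 => ['Helo' => ['Hello'], 'werld' => ['world']],
];
print_r(applyFirstSuggestions(['Helo werld'], $corrections));
```

Note that strtr() replaces substrings rather than whole words, so a real implementation may want word-boundary matching and a way to let the user choose between suggestions.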

Supported languages may vary depending on the third-party API.

Yandex Speller API

Please check the list of supported languages (basically, only English, Russian and Ukrainian are supported at the moment). Yandex Speller API is free to use with limitations (up to 10 000 calls/10 000 000 characters per day). No keys are required.

Speech Synthesis

The main endpoint is PHPolyglot's speak method:

public function speak(
    string $text,
    string $languageFrom,
    string $audioFormat = TtsAudioFormat::AUDIO_MP3,
    array $additionalData = []
): TtsResponse

Only two parameters are required: the text for synthesis, $text, and its source language, $languageFrom.

The optional parameters $audioFormat and $additionalData may be omitted. The audio format lets you explicitly specify the format of the returned audio. Additional data lets you set API-specific parameters for more precise results (voice, pitch, speed, etc.).

The list of audio formats which are currently recognized:

  • TtsAudioFormat::AUDIO_BASIC
  • TtsAudioFormat::AUDIO_FLAC
  • TtsAudioFormat::AUDIO_L16
  • TtsAudioFormat::AUDIO_MP3
  • TtsAudioFormat::AUDIO_MPEG
  • TtsAudioFormat::AUDIO_MULAW
  • TtsAudioFormat::AUDIO_OGG
  • TtsAudioFormat::AUDIO_WAV
  • TtsAudioFormat::AUDIO_WEBM

Please note that not all of them may be supported by your API of choice.

The TTS method returns a TtsResponse, which has a storeFile method that stores the generated file with the required name and extension in the specified directory (or uses default values):

function storeFile(string $fileName = '', string $extension = '', string $directory = ''): string

By default, the file name is a plain md5 hash of the $text that was used for TTS, $extension is populated from the content-type header (at least for the IBM Watson API), and $directory is taken from the config setting.

(new PHPolyglot())->speak('Hello world', 'en')->storeFile(); // stores 3e25960a79dbc69b674cd4ec67a72c62.mp3
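
The default file name can be reproduced without calling the API, which is handy for checking whether a phrase has already been synthesised. This is a minimal sketch of the naming rule described above, not the package's own code: defaultTtsFileName is a hypothetical helper, and the extension is passed in by hand because here there is no content-type header to derive it from.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (an assumption, not a PHPolyglot API): builds the
 * default file name as the md5 hash of the text plus an extension.
 */
function defaultTtsFileName(string $text, string $extension = 'mp3'): string
{
    // ltrim() tolerates extensions passed with a leading dot, e.g. ".mp3".
    return md5($text) . '.' . ltrim($extension, '.');
}

echo defaultTtsFileName('Hello world'), PHP_EOL; // 3e25960a79dbc69b674cd4ec67a72c62.mp3
```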

IBM Watson Text-to-Speech

Please check the list of supported languages and voices. IBM Watson TTS requires API credentials for authorization. Create your TTS project there and get your API-specific credentials. The API is free to use with limitations (up to 10 000 characters per month).

Possible ToDos

  • transcribe words;
  • get synonyms, antonyms, derivatives;
  • detect text language;
  • add more configuration flexibility (choose API based on config constraints, like different APIs for different languages).

Useful Tools

Running Tests:

php vendor/bin/phpunit

or

composer test

Code Sniffer Tool:

php vendor/bin/phpcs --standard=PSR2 src/

or

composer psr2check

Code Auto-fixer:

php vendor/bin/phpcbf --standard=PSR2 src/ 

or

composer psr2autofix

Building Docs:

php vendor/bin/phpdoc -d "src" -t "docs"

or

composer docs

Changelog

To keep track, please refer to CHANGELOG.md.

Contributing

  1. Fork it;
  2. Create your feature branch (git checkout -b my-new-feature);
  3. Make your changes;
  4. Run the tests, adding new ones for your own code if necessary (phpunit);
  5. Commit your changes (git commit -am 'Added some feature');
  6. Push to the branch (git push origin my-new-feature);
  7. Create new pull request.

Also please refer to CONTRIBUTING.md.

License

Please refer to LICENSE.

PHPolyglot does not own any of the results that the APIs may return. The APIs may also have their own rules about data usage, so be aware of them when you use these services.

Notes

Powered by composer-package-template and PHP Nano Rest.
