Automated translations #8

leeper · 2014-08-16T15:23:34Z

Add automated translations using translate or translateR

richierocks · 2017-01-04T01:21:40Z

I've been thinking about the workflow for this. It goes something along these lines:

1. Update the PO files to ensure that all the latest messages are included. I think this is just a call to make_translation(), but I may have misunderstood.
2. If the PO file for the target language does not exist, or the user decides to translate all messages, retrieve them using get_messages().
3. If the PO file already exists, the user should have the option to translate only the messages that haven't been translated already. In a PO object, that's all the direct messages where msgstr == "". For countable messages, it's where at least one message is blank, i.e., vapply(msgstr, function(x) !all(nzchar(x)), logical(1)). We might also need to include messages with fuzzy in their flags_comments field.
4. The countable messages need dummy number values substituted into them. For example, in most languages, you'd have to substitute 1 and 2 (or any number more than 1).
5. Send them off to the translation engine.
6. Match the returned translations back up to their IDs.
7. Write the new translations into the PO file along with updated metadata.
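
Steps 3 and 4 of the workflow above might look roughly like this in base R. This is only a sketch: the two data frames are hypothetical stand-ins for the direct and countable parts of a PO object (the column names are assumptions), and %d is assumed to be the count placeholder.

```r
# Hypothetical stand-ins for the direct and countable parts of a PO object.
direct <- data.frame(
  msgid  = c("Hello", "Goodbye", "Thanks"),
  msgstr = c("Bonjour", "", ""),
  stringsAsFactors = FALSE
)
countable <- data.frame(
  msgid        = c("%d file", "%d error"),
  msgid_plural = c("%d files", "%d errors"),
  stringsAsFactors = FALSE
)
# One character vector of plural forms per countable message.
countable$msgstr <- list(c("%d fichier", "%d fichiers"), c("%d erreur", ""))

# Step 3: direct messages with an empty msgstr still need translating.
direct_todo <- direct[!nzchar(direct$msgstr), ]

# Step 3: a countable message needs work if any of its plural forms is blank.
needs_work <- vapply(countable$msgstr, function(x) !all(nzchar(x)), logical(1))
countable_todo <- countable[needs_work, ]

# Step 4: substitute dummy counts (1 for the singular, 2 for the plural)
# so that the translation engine sees a natural-looking sentence.
to_send <- c(
  sub("%d", "1", countable_todo$msgid, fixed = TRUE),
  sub("%d", "2", countable_todo$msgid_plural, fixed = TRUE)
)
```

Here direct_todo holds the "Goodbye" and "Thanks" rows, and to_send holds the two forms of the "%d error" message with the counts filled in. After translation, the dummy numbers would have to be swapped back for the placeholder before writing the PO file.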

leeper · 2017-01-04T19:07:15Z

Steps 1 and 2 can be achieved by just doing make_translation(). That will return a "po" object containing the untranslated messages (or any translations that already exist in the .po file).

I think that means we basically need two new functions:

An extraction function to extract untranslated strings from the "po" object (to achieve 3-4)
An assignment function to set the translation somehow based upon the original string (to achieve 6)

Then write_translation() handles 7.
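
A minimal sketch of what that pair of functions could look like for direct messages, using base R only. The names get_untranslated() and set_translations() are hypothetical, as is the assumption that the direct messages live in a data frame with msgid and msgstr columns.

```r
# Hypothetical extraction helper: msgids whose msgstr is still empty.
get_untranslated <- function(direct) {
  direct$msgid[!nzchar(direct$msgstr)]
}

# Hypothetical assignment helper: set translations by matching on msgid.
set_translations <- function(direct, msgid, translation) {
  stopifnot(length(msgid) == length(translation))
  idx <- match(msgid, direct$msgid)
  if (anyNA(idx)) {
    stop("Unknown msgid: ", paste(msgid[is.na(idx)], collapse = ", "))
  }
  direct$msgstr[idx] <- translation
  direct
}

# Usage with a made-up set of direct messages.
direct <- data.frame(
  msgid  = c("Hello", "Goodbye"),
  msgstr = c("Bonjour", ""),
  stringsAsFactors = FALSE
)
todo <- get_untranslated(direct)
direct <- set_translations(direct, todo, "Au revoir")
```

Matching on the raw msgid works only while msgids are unique; messages that differ only in their context would need a richer key.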

It occurs to me that we may want either of the following:

Give the msgids some kind of identifier so that they are easier to refer to programmatically, or
Make the "po" objects something like S4 or R6 objects, so that we can embed assignment functions within the object itself, to do something like:

po$translate(msg1, "translated string")

What do you think?

richierocks · 2017-01-04T20:13:44Z

An identifier should be reasonably straightforward to add.
You paste the msgid and the msgctxt, then call digest::digest() on each row to create a hash.
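
As a rough sketch of that idea, assuming the digest package is installed: the data frame and the msgkey column name are made up for illustration, and treating a missing context as an empty string and joining with a newline are arbitrary choices.

```r
# Hypothetical direct messages; two share a msgid but differ in context.
direct <- data.frame(
  msgid   = c("Open", "Open", "Close"),
  msgctxt = c("verb", "adjective", NA),
  stringsAsFactors = FALSE
)

# Treat a missing context as empty so it does not get pasted in as "NA".
ctxt <- ifelse(is.na(direct$msgctxt), "", direct$msgctxt)
key  <- paste(ctxt, direct$msgid, sep = "\n")

# digest() is not vectorised, so hash each row's key separately.
# serialize = FALSE hashes the string itself, not its R serialization.
direct$msgkey <- vapply(
  key, digest::digest, character(1),
  algo = "md5", serialize = FALSE, USE.NAMES = FALSE
)
```

Including the context in the key means the two "Open" messages get different identifiers.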

Making the po objects R6 objects is also possible, but it will likely take me a couple of days, so I won't be able to push to CRAN until this weekend.

leeper · 2017-01-04T20:46:49Z

Hashing is a good idea - much easier. Let's do that.

richierocks · 2017-01-05T04:19:54Z

I've just done a rewrite with R6 and hash values being auto-generated when you read the direct or countable elements. The tests are broken but it should be a quick fix tomorrow. The API is the same as before, so it shouldn't break msgtools (though let me know if you have any problems).


leeper added the enhancement label Aug 16, 2014

leeper self-assigned this Aug 16, 2014

leeper modified the milestone: Second CRAN Release Jan 3, 2017
