Automated translations #8

Open
leeper opened this issue Aug 16, 2014 · 5 comments

@leeper
Member

leeper commented Aug 16, 2014

Add automated translations using translate or translateR

@leeper leeper self-assigned this Aug 16, 2014
@leeper leeper modified the milestone: Second CRAN Release Jan 3, 2017
@richierocks
Collaborator

I've been thinking about the workflow for this. It goes something along these lines:

  1. Update the PO files so that they include all the latest messages. I think this is just a call to make_translation(), but I may have misunderstood.
  2. If the PO file for the target language does not exist, or the user decides to translate all messages, retrieve them using get_messages().
  3. If the PO file already exists, the user should have the option to only translate messages that haven't been translated already. In a PO object, those are the direct messages where msgstr == "". For countable messages, they are the ones with at least one blank translation, i.e., vapply(msgstr, function(x) !all(nzchar(x)), logical(1)). We might also need to include messages with fuzzy in their flags_comments field.
  4. Countable messages need dummy number values substituted into them. For example, in most languages, you'd have to substitute 1 and 2 (or any number greater than 1).
  5. Send them off to the translation engine.
  6. Match the returned translations back up to their IDs.
  7. Write the new translations into the PO file along with updated metadata.
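
Steps 3 and 4 could look roughly like the base R sketch below. The data frames are hypothetical stand-ins for the direct and countable parts of a PO object. The exact layout (list columns for msgstr and flags_comments, a msgid_plural column) and the idea that a fuzzy message carries the string "fuzzy" in flags_comments are assumptions for illustration, not the package's confirmed structure.

```r
# Hypothetical stand-in for the direct messages of a PO object.
direct <- data.frame(
  msgid = c("File not found.", "Done.", "Try again."),
  msgstr = c("", "Fertig.", "Nochmal."),
  stringsAsFactors = FALSE
)
# Assumption: flags_comments is a list column of character vectors.
direct$flags_comments <- list(character(0), character(0), "fuzzy")

# Hypothetical stand-in for the countable messages; msgstr is assumed to be
# a list column holding one translation per plural form.
countable <- data.frame(
  msgid = c("%d file", "%d row"),
  msgid_plural = c("%d files", "%d rows"),
  stringsAsFactors = FALSE
)
countable$msgstr <- list(c("%d Datei", ""), c("%d Zeile", "%d Zeilen"))

# Step 3: a direct message needs translating if msgstr is empty or it is fuzzy.
is_fuzzy <- vapply(direct$flags_comments, function(x) "fuzzy" %in% x, logical(1))
direct_todo <- direct$msgstr == "" | is_fuzzy

# A countable message needs translating if at least one plural form is blank.
countable_todo <- vapply(countable$msgstr, function(x) !all(nzchar(x)), logical(1))

# Step 4: substitute dummy counts (1 for singular, 2 for plural) before
# sending the text to the translation engine.
singular_text <- sprintf(countable$msgid[countable_todo], 1L)
plural_text <- sprintf(countable$msgid_plural[countable_todo], 2L)
```

Here only the first and third direct messages and the first countable message would be sent for translation.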

@leeper
Member Author

leeper commented Jan 4, 2017

Steps 1 and 2 can be achieved by just doing make_translation(). That will return a "po" object containing the untranslated messages (or any translations that already exist in the .po file).

I think that means we basically need two new functions:

  1. Extraction function to extract untranslated strings from the "po" object (to achieve 3-4)
  2. Assignment function to set the translation somehow based upon the original string (to achieve 6)

Then write_translation() handles 7.
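
As a rough sketch of what those two functions might look like for direct messages: get_untranslated() and set_translations() are hypothetical names, and the msgid/msgstr data frame is an assumed stand-in for the relevant part of the "po" object.

```r
# Hypothetical extraction helper: return the msgids that still lack a translation.
get_untranslated <- function(direct) {
  direct$msgid[direct$msgstr == ""]
}

# Hypothetical assignment helper: set translations by matching on the
# original string (msgid).
set_translations <- function(direct, msgid, translation) {
  stopifnot(length(msgid) == length(translation))
  idx <- match(msgid, direct$msgid)
  if (anyNA(idx)) {
    stop("Unknown msgid: ", paste(msgid[is.na(idx)], collapse = ", "))
  }
  direct$msgstr[idx] <- translation
  direct
}

# Assumed stand-in for the direct messages of a "po" object.
direct <- data.frame(
  msgid = c("File not found.", "Done."),
  msgstr = c("", "Fertig."),
  stringsAsFactors = FALSE
)

todo <- get_untranslated(direct)
direct <- set_translations(direct, todo, "Datei nicht gefunden.")
```

Matching on the raw msgid works only while msgids are unique, which is one reason an explicit identifier may be easier to work with.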

It occurs to me that we may want either of the following:

  1. Give the msgid's some kind of identifier so that they are easier to refer to programmatically, or
  2. Make the "po" objects something like S4 or R6 objects, so that we can embed assignment functions within the object itself to do something like:
po$translate(msg1, "translated string")

What do you think?

@richierocks
Collaborator

An identifier should be reasonably straightforward to add.
You paste the msgid and the msgctxt, then call digest::digest() on each row to create a hash.
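
A minimal sketch of that idea, assuming the digest package is installed and the messages sit in a data frame with msgid and msgctxt columns. The column layout, the newline separator, and the choice of MD5 are illustrative assumptions.

```r
# Hypothetical stand-in for the direct messages of a PO object.
direct <- data.frame(
  msgid = c("Open", "Open", "Close"),
  msgctxt = c("verb", "adjective", NA),
  stringsAsFactors = FALSE
)

# Paste msgid and msgctxt together; the separator keeps the two fields distinct.
key <- paste(direct$msgid, direct$msgctxt, sep = "\n")

# digest() is not vectorised, so hash each key separately.
# serialize = FALSE hashes the string itself rather than its R serialization.
direct$id <- vapply(key, digest::digest, character(1),
                    algo = "md5", serialize = FALSE, USE.NAMES = FALSE)
```

Including msgctxt in the key means the two "Open" messages get different identifiers.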

Making the po objects R6 object is also possible, but it will likely take me a couple of days, so I won't be able to push to CRAN until this weekend.

@leeper
Member Author

leeper commented Jan 4, 2017

Hashing is a good idea - much easier. Let's do that.

@richierocks
Collaborator

richierocks commented Jan 5, 2017 via email
