Skip to content

Commit

Permalink
i18n: handle string placeholder collisions (#14432)
Browse files Browse the repository at this point in the history
  • Loading branch information
brendankenny committed Oct 6, 2022
1 parent d7d359e commit ba425cd
Show file tree
Hide file tree
Showing 10 changed files with 271 additions and 44 deletions.
4 changes: 2 additions & 2 deletions core/audits/accessibility/aria-meter-name.js
Original file line number Diff line number Diff line change
Expand Up @@ -17,8 +17,8 @@ const UIStrings = {
title: 'ARIA `meter` elements have accessible names',
/** Title of an accessibility audit that evaluates if meter HTML elements do not have accessible names. This title is descriptive of the failing state and is shown to users when there is a failure that needs to be addressed. */
failureTitle: 'ARIA `meter` elements do not have accessible names.',
/** Description of a Lighthouse audit that tells the user *why* they should have accessible names for HTML elements. This is displayed after a user expands the section to see more. No character length limits. 'Learn More' becomes link text to additional documentation. */
description: 'When an element doesn\'t have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `meter` elements](https://dequeuniversity.com/rules/axe/4.4/aria-meter-name).',
/** Description of a Lighthouse audit that tells the user *why* they should have accessible names for HTML 'meter' elements. This is displayed after a user expands the section to see more. No character length limits. 'Learn how...' becomes link text to additional documentation. */
description: 'When a meter element doesn\'t have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `meter` elements](https://dequeuniversity.com/rules/axe/4.4/aria-meter-name).',
};

const str_ = i18n.createIcuMessageFn(import.meta.url, UIStrings);
Expand Down
4 changes: 2 additions & 2 deletions core/audits/accessibility/aria-tooltip-name.js
Original file line number Diff line number Diff line change
Expand Up @@ -17,8 +17,8 @@ const UIStrings = {
title: 'ARIA `tooltip` elements have accessible names',
/** Title of an accessibility audit that evaluates if tooltip HTML elements do not have accessible names. This title is descriptive of the failing state and is shown to users when there is a failure that needs to be addressed. */
failureTitle: 'ARIA `tooltip` elements do not have accessible names.',
/** Description of a Lighthouse audit that tells the user *why* they should have accessible names for HTML elements. This is displayed after a user expands the section to see more. No character length limits. 'Learn More' becomes link text to additional documentation. */
description: 'When an element doesn\'t have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `tooltip` elements](https://dequeuniversity.com/rules/axe/4.4/aria-tooltip-name).',
/** Description of a Lighthouse audit that tells the user *why* they should have accessible names for HTML 'tooltip' elements. This is displayed after a user expands the section to see more. No character length limits. 'Learn how...' becomes link text to additional documentation. */
description: 'When a tooltip element doesn\'t have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `tooltip` elements](https://dequeuniversity.com/rules/axe/4.4/aria-tooltip-name).',
};

const str_ = i18n.createIcuMessageFn(import.meta.url, UIStrings);
Expand Down
27 changes: 22 additions & 5 deletions core/lib/i18n/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,11 +12,11 @@ The collection and translation pipeline:
```
Source files: Locale files:
+---------------------------+ +----------------------------------------------
| ++ | shared/localization/locales/en-US.json |
| const UIStrings = { ... };|-+ +---> | shared/localization/locales/en-XL.json |
| ++ | shared/localization/locales/en-US.json |
| const UIStrings = { ... };|-+ +---> | shared/localization/locales/en-XL.json |
| |-| | +----------------------------------------------+
+-----------------------------| | | ||
+----------------------------| | | shared/localization/locales/*.json |-<+
+----------------------------| | | shared/localization/locales/*.json |-<+
+---------------------------+ | | || |
| | +----------------------------------------------| |
$ yarn | | +---------------------------------------------+ |
Expand Down Expand Up @@ -198,11 +198,11 @@ CTC is a name that is distinct and identifies this as the Chrome translation for
```json
{
"name": {
"message": "Message text, with optional placeholders.",
"message": "Message text, with optional placeholders, which can be $PLACEHOLDER_TEXT$",
"description": "Translator-aimed description of the message.",
"meaning": "Description given when a message is duplicated, in order to give context to the message. Lighthouse uses a copy of the description for this.",
"placeholders": {
"placeholder_name": {
"PLACEHOLDER_TEXT": {
"content": "A string to be placed within the message.",
"example": "Translator-aimed example of the placeholder string."
},
Expand All @@ -211,6 +211,23 @@ CTC is a name that is distinct and identifies this as the Chrome translation for
}
```

### Collisions
Collisions happen when two CTC messages have the same `message`. For Lighthouse, there are two relevant collision types in TC:
- Allowed: the CTC `message`, `description`, and `placeholders` are exactly the same. These collisions are deduped on the TC side and the translation cost is the same as for a single string.
- Disallowed: `message` is the same but one or more of the other properties differ.

When the `message` needs to be the same as another string but another property must differ, that disallowed collision can be fixed by adding a unique `meaning` property to each colliding CTC message. TC will then consider those as separate strings and not a collision.

In Lighthouse, this is done by having a different `description` for the strings, which is then copied to `meaning` in `resolveMessageCollisions()`. `meaning` cannot be manually set.

For instance, the string "Potential Savings" currently refers to both saved KiB and saved milliseconds in different audits. The string is defined twice, each with a different `description` describing the units being saved, in case some locales' translations will use a different word choice depending on the unit.

Internally, TC uses a message ID that's a hash of `message` and `meaning` to check for collisions. Somewhat confusingly, if two messages do have colliding IDs, then `message`, `meaning`, `description`, and `placeholders` are all required to match or an error is thrown. This is why all message properties could cause a collision but `meaning` is the only way to dedupe them.

We treat it as an error if `placeholders` differ between messages in a collision: if there is a need for placeholders to differ, then the strings aren't really the same, and at least the `description` should be changed to explain that context. Placeholders must match in user-controlled data (e.g. if a placeholder has an `@example`, it must be the same example in all instances) and in Lighthouse-controlled data (e.g. the token used to replace it in the CTC `message`, like `$PLACEHOLDER_TEXT$` in the example above).

Finally, identical messages made to not collide by Lighthouse with a `meaning` cost real money and shouldn't be confused with allowed collisions which cost nothing for each additional collision. Fixed collisions are checked against a known list to add a little friction and motivate keeping them few in number. An error is thrown if a collision is fixed that hasn't yet been added to that list.

# Appendix

## Appendix A: How runtime string replacement works
Expand Down
94 changes: 72 additions & 22 deletions core/scripts/i18n/collect-strings.js
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ import {expect} from 'expect';
import tsc from 'typescript';
import MessageParser from 'intl-messageformat-parser';
import esMain from 'es-main';
import isDeepEqual from 'lodash/isEqual.js';

import {Util} from '../../util.cjs';
import {collectAndBakeCtcStrings} from './bake-ctc-to-lhl.js';
Expand Down Expand Up @@ -613,43 +614,90 @@ function writeStringsToCtcFiles(locale, strings) {
}

/**
* This function does two things:
*
* - Add `meaning` property to ctc messages that have the same message but different descriptions so TC can disambiguate.
* - Throw if the known collisions has changed at all.
* TODO: replace by Array.prototype.groupToMap when available.
* @template {unknown} T
* @template {string} U
* @param {Array<T>} elements
* @param {(element: T) => U} callback
* @return {Map<U, Array<T>>}
*/
function arrayGroupToMap(elements, callback) {
/** @type {Map<U, Array<T>>} */
const elementsByValue = new Map();

for (const element of elements) {
const key = callback(element);
const group = elementsByValue.get(key) || [];
group.push(element);
elementsByValue.set(key, group);
}

return elementsByValue;
}

/**
* Returns whether all the placeholders in the given CtcMessages match.
* @param {Array<{ctc: CtcMessage}>} strings
* @return {boolean}
*/
function doPlaceholdersMatch(strings) {
// Technically placeholder `content` is not required to match by TC, but since
// `example` must match and any auto-generated `example` is copied from `content`,
// it would be confusing to let it differ when `example` is explicit.
return strings.every(val => isDeepEqual(val.ctc.placeholders, strings[0].ctc.placeholders));
}

/**
* If two or more messages have the same `message`:
* - if `description`, and `placeholders` are also the same, TC treats them as a
* single message, the collision is allowed, and no change is made.
* - if `description` is unique for the collision members, the `description` is
* copied to `meaning` to differentiate them for TC. They must be added to the
* `collidingMessages` list for this case as a nudge to not be wasteful.
* - if `description` is the same but `placeholders` differ, an error is thrown.
* Either `placeholders` should be the same, or `message` or `description`
* should be changed to explain the need for different `placeholders`.
*
* Modifies `strings` in place to fix collisions.
* @see https://github.com/GoogleChrome/lighthouse/blob/main/core/lib/i18n/README.md#collisions
* @param {Record<string, CtcMessage>} strings
*/
function resolveMessageCollisions(strings) {
/** @type {Map<string, Array<CtcMessage>>} */
const stringsByMessage = new Map();

// Group all the strings by their message.
for (const ctc of Object.values(strings)) {
const collisions = stringsByMessage.get(ctc.message) || [];
collisions.push(ctc);
stringsByMessage.set(ctc.message, collisions);
}
const entries = Object.entries(strings).map(([key, ctc]) => ({key, ctc}));

/** @type {Array<CtcMessage>} */
const allCollisions = [];
const stringsByMessage = arrayGroupToMap(entries, (entry) => entry.ctc.message);
for (const messageGroup of stringsByMessage.values()) {
// If this message didn't collide with anything else, skip it.
if (messageGroup.length <= 1) continue;

// If group shares both message and description, they can be translated as if a single string.
const descriptions = new Set(messageGroup.map(ctc => ctc.description));
if (descriptions.size <= 1) continue;
// For each group of matching message+description, placeholders must also match.
const stringsByDescription = arrayGroupToMap(messageGroup, (entry) => entry.ctc.description);
for (const descriptionGroup of stringsByDescription.values()) {
if (!doPlaceholdersMatch(descriptionGroup)) {
throw new Error('string collision: `message` and `description` are the same but differ in `placeholders`:\n' +
descriptionGroup.map(entry => `key: ${entry.key}\n`).join('') +
'make `placeholders` match or update `message` or `description`s to explain why they don\'t match');
}
}

// If the entire message group has the same description and placeholders,
// they can be translated as if a single message, no meaning needed.
if (stringsByDescription.size <= 1) continue;

// We have duplicate messages with different descriptions. Disambiguate using `meaning` for TC.
for (const ctc of messageGroup) {
for (const {ctc} of messageGroup) {
ctc.meaning = ctc.description;
}
allCollisions.push(...messageGroup);
}
}

// Check that the known collisions match our known list.
const collidingMessages = allCollisions.map(collision => collision.message).sort();
/**
* Check that fixed collisions match our known list.
* @param {Record<string, CtcMessage>} strings
*/
function checkKnownFixedCollisions(strings) {
const fixedCollisions = Object.values(strings).filter(string => string.meaning);
const collidingMessages = fixedCollisions.map(collision => collision.message).sort();

try {
expect(collidingMessages).toEqual([
Expand Down Expand Up @@ -696,6 +744,7 @@ async function main() {
}

resolveMessageCollisions(strings);
checkKnownFixedCollisions(strings);

writeStringsToCtcFiles('en-US', strings);
console.log('Written to disk!', 'en-US.ctc.json');
Expand Down Expand Up @@ -734,4 +783,5 @@ export {
parseUIStrings,
createPsuedoLocaleStrings,
convertMessageToCtc,
resolveMessageCollisions,
};
12 changes: 6 additions & 6 deletions core/test/fixtures/fraggle-rock/reports/sample-flow-result.json
Original file line number Diff line number Diff line change
Expand Up @@ -2532,7 +2532,7 @@
"aria-meter-name": {
"id": "aria-meter-name",
"title": "ARIA `meter` elements have accessible names",
"description": "When an element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `meter` elements](https://dequeuniversity.com/rules/axe/4.4/aria-meter-name).",
"description": "When a meter element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `meter` elements](https://dequeuniversity.com/rules/axe/4.4/aria-meter-name).",
"score": null,
"scoreDisplayMode": "notApplicable"
},
Expand Down Expand Up @@ -2591,7 +2591,7 @@
"aria-tooltip-name": {
"id": "aria-tooltip-name",
"title": "ARIA `tooltip` elements have accessible names",
"description": "When an element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `tooltip` elements](https://dequeuniversity.com/rules/axe/4.4/aria-tooltip-name).",
"description": "When a tooltip element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `tooltip` elements](https://dequeuniversity.com/rules/axe/4.4/aria-tooltip-name).",
"score": null,
"scoreDisplayMode": "notApplicable"
},
Expand Down Expand Up @@ -11172,7 +11172,7 @@
"aria-meter-name": {
"id": "aria-meter-name",
"title": "ARIA `meter` elements have accessible names",
"description": "When an element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `meter` elements](https://dequeuniversity.com/rules/axe/4.4/aria-meter-name).",
"description": "When a meter element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `meter` elements](https://dequeuniversity.com/rules/axe/4.4/aria-meter-name).",
"score": null,
"scoreDisplayMode": "notApplicable"
},
Expand Down Expand Up @@ -11231,7 +11231,7 @@
"aria-tooltip-name": {
"id": "aria-tooltip-name",
"title": "ARIA `tooltip` elements have accessible names",
"description": "When an element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `tooltip` elements](https://dequeuniversity.com/rules/axe/4.4/aria-tooltip-name).",
"description": "When a tooltip element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `tooltip` elements](https://dequeuniversity.com/rules/axe/4.4/aria-tooltip-name).",
"score": null,
"scoreDisplayMode": "notApplicable"
},
Expand Down Expand Up @@ -16505,7 +16505,7 @@
"aria-meter-name": {
"id": "aria-meter-name",
"title": "ARIA `meter` elements have accessible names",
"description": "When an element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `meter` elements](https://dequeuniversity.com/rules/axe/4.4/aria-meter-name).",
"description": "When a meter element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `meter` elements](https://dequeuniversity.com/rules/axe/4.4/aria-meter-name).",
"score": null,
"scoreDisplayMode": "notApplicable"
},
Expand Down Expand Up @@ -16564,7 +16564,7 @@
"aria-tooltip-name": {
"id": "aria-tooltip-name",
"title": "ARIA `tooltip` elements have accessible names",
"description": "When an element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `tooltip` elements](https://dequeuniversity.com/rules/axe/4.4/aria-tooltip-name).",
"description": "When a tooltip element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `tooltip` elements](https://dequeuniversity.com/rules/axe/4.4/aria-tooltip-name).",
"score": null,
"scoreDisplayMode": "notApplicable"
},
Expand Down
4 changes: 2 additions & 2 deletions core/test/results/sample_v2.json
Original file line number Diff line number Diff line change
Expand Up @@ -3442,7 +3442,7 @@
"aria-meter-name": {
"id": "aria-meter-name",
"title": "ARIA `meter` elements have accessible names",
"description": "When an element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `meter` elements](https://dequeuniversity.com/rules/axe/4.4/aria-meter-name).",
"description": "When a meter element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `meter` elements](https://dequeuniversity.com/rules/axe/4.4/aria-meter-name).",
"score": null,
"scoreDisplayMode": "notApplicable"
},
Expand Down Expand Up @@ -3501,7 +3501,7 @@
"aria-tooltip-name": {
"id": "aria-tooltip-name",
"title": "ARIA `tooltip` elements have accessible names",
"description": "When an element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `tooltip` elements](https://dequeuniversity.com/rules/axe/4.4/aria-tooltip-name).",
"description": "When a tooltip element doesn't have an accessible name, screen readers announce it with a generic name, making it unusable for users who rely on screen readers. [Learn how to name `tooltip` elements](https://dequeuniversity.com/rules/axe/4.4/aria-tooltip-name).",
"score": null,
"scoreDisplayMode": "notApplicable"
},
Expand Down

0 comments on commit ba425cd

Please sign in to comment.