futekov/measure-element

Let’s take some <m>easures

The proposed inline <m> element represents a measurement of a physical attribute such as length, area, volume, mass, speed, temperature, etc.

<m unit="mi" value="20">twenty miles</m>

The purpose of the Web is to render information for us in a human-readable format. Making that information machine-readable as well makes it easy to convert and thus even more accessible to everyone.

"I hope we will use the Net to cross barriers and connect cultures." Sir Tim Berners-Lee

Usefulness

Making measurements easier to convert on-page (automatically or on demand), through an extension or a built-in browser feature, would spare us from interrupting our browsing to convert units:

  • for objects with physical dimensions like consumer products, tech hardware, furniture, etc.
  • for recipes that contain cooking measurements such as cups/ounces/liters/grams
  • on weather forecasts that display temperatures in °F or °C
  • in vehicle specifications such as dimensions, torque, acceleration, weight, fuel efficiency
  • in drawing, CAD, and 3D modelling web applications that require physical dimensions when starting or exporting a project

Browsers aren’t the only consumers of content out there. Screen readers, search bots (such as Bing and Google), computational engines (like WolframAlpha), and personal assistants (Alexa, Cortana, Google Assistant, Siri) can also extract information and digest it for their users. Exposing more information to these bots and engines can only broaden their datasets of real-world physical objects and thus make them more useful to their users. This is the idea behind the Semantic Web.

What about Microformats?

A measure microformat has been discussed in the past, but the proposals either include a multitude of elements and attributes or severely limit the text content inside each element. A dedicated HTML element is likely to be more successful, since microformats are heavier to write, need more configuration (classes and other attributes), and use a longer tag name than the proposed <m>. The drawbacks of the microformat proposal are:

  • it includes prices and currencies inside the specification, which are volatile units that change their relation to one another hundreds of times per day
  • it requires non-semantic inline elements with longer names, which are less readable than a dedicated <m> tag
  • its data lives in classes instead of dedicated attributes, so targeting the unit or value as a hook is more difficult and more error-prone
  • it requires more than one element and more than two attributes to mark up the data
  • it strongly encourages the web author to use SI (metric) units instead of units of their choice

Why have value & unit instead of parsing text?

Currently no simple program can parse and understand physical measurements inside a casually phrased text because not all measurements consist of a simple number and a unit next to it.

The spirit of the Web isn't about forcing web authors to write in a specific machine-readable format with certain units (e.g. "the car's power is 84 kilowatts"). Rather, it encourages authors to phrase their text content as they wish, with the option to mark it up for hooks, semantics, or machine readability.

Hard-to-parse measurements can be marked up and become machine-readable without constraints to the text that web authors use:

  • <m unit="m" value="10">10 m</m> - ambiguous abbreviations
  • <m unit="oz" value="2">two ounces</m> - number words are sometimes used instead of digits
  • <m unit="kg" value="4">4 kilos</m> - shortened or slang words for units
  • <m unit="gal" value="0.5">half a gallon</m> - multiple words
  • <m unit="lb" value="0.95">weighs just under a pound</m> - being imprecise in the text
  • <m unit="l" value="96">quatre-vingt seize litres</m> - units and values in a different language
  • <m unit="in" value="71">5′ 11″</m> - feet & inches often use quotes, primes, apostrophes, carets or similar symbols - ′'‘’ & ″"“”

No simple regular expression or program can reliably translate and understand all of the above measurements from the text alone, but once marked up they can be understood unambiguously by both machines and people.

A healthy reminder: the existing <a>, <abbr>, and <time> elements also expect one piece of information provided in two formats, namely user-facing text content inside the tag plus an attribute value.

The value attribute accepts any positive or negative number, using a dot as the decimal separator.

The unit attribute expects a case-sensitive string that unambiguously identifies a unit across the metric, imperial, and US customary unit systems.
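The two attribute rules above could be checked with something like the following sketch. It is only an illustration: `parseMeasure`, `VALUE_PATTERN`, and `KNOWN_UNITS` are hypothetical names, the exact number syntax (no exponents, no thousands separators) is an assumption, and the unit list is a small subset of the codes used in this document.

```typescript
// Hypothetical validator for the proposed <m> attributes; not part of any spec.
type ParsedMeasure = { value: number; unit: string };

// Assumption: a plain decimal number with an optional leading minus sign
// and a dot as the decimal separator (no exponents, no thousands separators).
const VALUE_PATTERN = /^-?\d+(\.\d+)?$/;

// Assumption: a small subset of the unit codes used in this document.
const KNOWN_UNITS: string[] = ["cm", "m", "km", "in", "ft", "yd", "mi", "g", "kg", "oz", "lb"];

function parseMeasure(valueAttr: string, unitAttr: string): ParsedMeasure | null {
  // Reject values such as "0,5" or "half".
  if (!VALUE_PATTERN.test(valueAttr)) return null;
  // The comparison is case-sensitive, so "KM" is rejected.
  if (KNOWN_UNITS.indexOf(unitAttr) === -1) return null;
  return { value: Number(valueAttr), unit: unitAttr };
}
```

Returning `null` instead of throwing lets a consumer simply ignore malformed markup and fall back to the visible text.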

Which standard to use?

Here is an example list of accepted unit abbreviations that can be easily extended in the future:

Type of measurement   Metric              Imperial & US Customary
Length                cm, m, km           in, ft, yd, mi
Area                  cm2, m2, km2, ha    sqin, sqft, sqmi, acre
Volume                cm3, m3, l          cuin, cuft, UKgal, USgal
Mass                  mg, g, kg, t        gr, oz, lb, st
Density               kg/m3               lb/ft3
Torque                Nm                  lb.ft
Speed                 km/h, m/s           mph, ft/s
Fuel efficiency       L/100km             mpg
Temperature           C                   F
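A consumer of these unit codes would need a lookup table behind them. The sketch below shows one possible shape, assuming each unit maps to a base unit (metre, gram, or degree Celsius) through a factor and an offset. `UNITS`, `lookup`, and `convert` are hypothetical names, and only a few of the listed codes are included.

```typescript
// Hypothetical conversion table; the unit codes come from the example list above.
// base = value * factor + offset, with metre, gram and degree Celsius as base units.
interface UnitDef { dimension: string; factor: number; offset: number }

const UNITS: { [code: string]: UnitDef } = {
  cm: { dimension: "length", factor: 0.01, offset: 0 },
  m: { dimension: "length", factor: 1, offset: 0 },
  km: { dimension: "length", factor: 1000, offset: 0 },
  in: { dimension: "length", factor: 0.0254, offset: 0 },
  ft: { dimension: "length", factor: 0.3048, offset: 0 },
  mi: { dimension: "length", factor: 1609.344, offset: 0 },
  g: { dimension: "mass", factor: 1, offset: 0 },
  kg: { dimension: "mass", factor: 1000, offset: 0 },
  oz: { dimension: "mass", factor: 28.349523125, offset: 0 },
  lb: { dimension: "mass", factor: 453.59237, offset: 0 },
  C: { dimension: "temperature", factor: 1, offset: 0 },
  F: { dimension: "temperature", factor: 5 / 9, offset: -160 / 9 },
};

function lookup(code: string): UnitDef {
  // Only accept codes defined directly on the table (case-sensitive).
  const known = Object.prototype.hasOwnProperty.call(UNITS, code);
  const def = UNITS[code] as UnitDef | undefined;
  if (!known || def === undefined) throw new Error("Unknown unit: " + code);
  return def;
}

function convert(value: number, from: string, to: string): number {
  const source = lookup(from);
  const target = lookup(to);
  if (source.dimension !== target.dimension) {
    throw new Error("Cannot convert " + from + " to " + to);
  }
  const base = value * source.factor + source.offset; // into the base unit
  return (base - target.offset) / target.factor;      // out of the base unit
}
```

With this table, `convert(61, "mi", "km")` comes out at roughly 98.17, and mixing dimensions (for example `kg` and `km`) raises an error rather than returning a meaningless number.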

The options I found for a standard reference for the unit attribute are:

  • ISO/IEC 80000, the major international standard that lists measurement units and proposes abbreviations for them. It is behind a paywall and focused on metric units.
  • IEEE Std 260.1, which defines "Standard Letter Symbols for Units of Measurement (SI Units, Customary Inch-Pound Units, and Certain Other Units)" but is also behind a paywall.
  • The UCUM's case-sensitive units, which include all units listed in ISO 1000, ISO 2955-1983, ANSI X3.50-1986, HL7 and ENV 12435. UCUM is designed primarily for machine-to-machine communication, so many unit codes are not very user-friendly.
  • Wikipedia's own "unit-codes", which are designed to be unambiguous and easy to use for conversion purposes. They cover the most popular units and are made to be easy for humans to use. They are licensed under Creative Commons Attribution-ShareAlike 3.0, which is compatible with the "Creative Commons Attribution 4.0 International" license of WHATWG's HTML repository.

User Perspective

"The essential property of the World Wide Web is its universality." Sir Tim Berners-Lee

If implemented, the <m> tag will enable the creation of scripts and browser features that can convert measurements between metric, imperial, and US customary units without the users resorting to unit converters.

Current browser extensions that convert between units are typically either converter popups or selection/whole-page parsers, neither of which is both accurate and fast to use.

Browsers already offer us translations of a whole web page (Chrome and Safari) - a localized personalization of the content based on the user's declared language preferences. We don't need a great stretch of the imagination to envision a similarly-working "translator" that localizes units for us.

Quickly converting measurements without interrupting the user flow will save both time and bandwidth for all who use this feature.

The demand for unit conversion is apparently quite strong - search queries for unit conversion are so popular that Google added a unit converter UI calculator to their search results more than 9 years ago, but in fact the search engine has been answering conversion queries for at least 17 years - around the time Google became a public company (and before Gmail and Google Translate even existed).

Conversion

Converting units is a task typically delegated to users, who usually interrupt their reading journey by opening another tool, popup, or search engine to convert units. This is clearly inefficient; however, no better alternatives exist currently.

With <m>, web authors will have one more measurement declaration method to pick from:

HTML code                             Conversion effort by   Converted unit availability
61 miles                              web users only         manually from 3rd party tool
61 miles (98 km)                      author                 always visible
{{convert|61|mi|km|abbr=on}}          MediaWiki back end     always visible
  (resulting in 61 mi (98 km))
<m unit="mi" value="61">61 miles</m>  JS on the front end    inline on-demand or automatically

As demonstrated by this table, to avoid the human effort either the 3rd or 4th method should be used for conversion. A new standardized HTML element would be a solution that is always available and back-end agnostic.
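The fourth method might look roughly like the sketch below. To stay self-contained it works on an HTML string with a regular expression; a real front-end script would query the DOM instead. `annotateMiles`, `M_ELEMENT`, and `KM_PER_MILE` are hypothetical names, the attribute order and quoting are assumptions, and only miles are handled.

```typescript
// Assumption: attributes appear in the order unit, value and use double quotes.
// A real implementation would query the DOM instead of matching strings.
const M_ELEMENT = /<m unit="([^"]+)" value="([^"]+)">([^<]*)<\/m>/g;

// Hypothetical user preference: show miles in kilometres as well.
const KM_PER_MILE = 1.609344;

function annotateMiles(html: string): string {
  return html.replace(M_ELEMENT, (match: string, unit: string, value: string): string => {
    // Leave every other unit untouched in this sketch.
    if (unit !== "mi") return match;
    const km = Math.round(Number(value) * KM_PER_MILE);
    // Keep the author's original markup and append the converted value.
    return match + " (" + km + " km)";
  });
}
```

The author's text is never rewritten; the converted value is appended after the element, so the original phrasing stays visible.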

Additionally, when using the <m> element the developer can announce whether the measurement is suitable for conversion with the optional attribute convert:

convert="never|auto|eager"

A never option would be suitable for measurements with de-facto standard units such as nanometers for CPU designs and inches for screen diagonals or car tires. Probably no person will ever need to know that a 5nm CPU is 0.00000019685 inches.

The auto option is the default value and is the same as omitting the attribute. It will be up to the script/browser/user to decide whether to convert these automatically or on-demand.

The eager option is for units that will be more understandable if converted to the users' preferred ones. It signals that the value is appropriate for automatic conversion if a user preference/consent has been provided.
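One way a script could interpret the three options is sketched below. `planConversion` and the `ConversionPlan` labels are hypothetical, and two choices are assumptions rather than part of the proposal: `auto` is resolved to on-demand conversion, and unrecognized values are treated like `auto`.

```typescript
// Hypothetical outcome labels for a converter script.
type ConversionPlan = "skip" | "convert-now" | "offer-on-demand";

// convertAttr is the raw attribute value, or null when the attribute is omitted.
// userOptedIn stands for the user preference/consent mentioned for "eager".
function planConversion(convertAttr: string | null, userOptedIn: boolean): ConversionPlan {
  switch (convertAttr) {
    case "never":
      return "skip";
    case "eager":
      // Convert automatically only when the user has opted in.
      return userOptedIn ? "convert-now" : "offer-on-demand";
    default:
      // Assumption: "auto", a missing attribute and unknown values
      // all resolve to on-demand conversion in this sketch.
      return "offer-on-demand";
  }
}
```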

For this conversion system to work in a user-friendly way, it also needs a browser/OS unit option (metric, UK imperial, US customary, etc.) that users can set for their preferred units. This preference could be exposed in one of several places.

Just a few years ago, the W3C's IndieUI: User Context document proposed a "set of preferences that users can choose to expose to web applications". The listed examples include the assistive technology in use, preferred typography style, and even a desired subtitle background color, but there is no mention of a preferred "unit language".

A richer, more useful HTML

The more recent structuring block-level HTML elements such as <main>, <article>, <aside>, <header>, <footer>, <nav>, and <hgroup> have brought significant benefits for developers, but a huge part of the web is still a collection of blocks of plain text.

The web also needs more inline-level semantics. Currently only a limited number of such elements are available to authors to mark up their content, and the most recent HTML usage data I could find shows a depressingly poor and generic usage of inline elements.

The proposed <m> element's attributes will have an expected and clear structure, allowing them to become a set of hooks for CSS and JS that can be used by developers in creative new ways in the future.

A <dfn> element might serve to identify the physical entity measured if placed inside the same <p> or <section> as the <m> element. Alternatively, an existing ARIA attribute such as aria-labelledby or aria-describedby might be used.

Can <m> become widespread?

Widely used inline marked data might seem hard to achieve in our vast web but is actually possible as demonstrated by the success of Schema.org.

The unit problem has been "solved" by the Wikimedia Foundation by listing units in multiple formats: the length of the Great Wall of China is listed as "21,196 km (13,171 mi)", which is written in their back end as {{convert|21196|km|mi|abbr=on}}. This convert template is reportedly used on over a million pages on Wikipedia. The Wikimedia Foundation could theoretically update their code to output the same content but wrap each measurement in an <m> tag.

Any store, wiki, CMS or other system that uses templates/views might also be able to retroactively update lots of its content.
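A template helper for such a system could be as small as the sketch below. `renderMeasure` and `escapeHtml` are hypothetical names, not part of MediaWiki or any other CMS, and the helper assumes the numeric value is already known to the template.

```typescript
// Escape characters that are unsafe in HTML text and attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical template helper: wraps the author's visible text in an <m> element.
// Note: String(value) switches to exponent notation for very small or very large numbers.
function renderMeasure(value: number, unit: string, text: string): string {
  return '<m unit="' + escapeHtml(unit) + '" value="' + String(value) + '">' + escapeHtml(text) + "</m>";
}
```

Called as `renderMeasure(21196, "km", "21,196 km")`, it produces `<m unit="km" value="21196">21,196 km</m>`, leaving the visible text exactly as the template already prints it.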

Conclusion

Localizing data on web pages is not only a step toward smoothing the user journey; it also saves time, makes content accessible to more people, and ultimately improves the user experience.

Let's move the web forward. Let's take some measures!
