Background
During a page view, Article.php is responsible for rendering the parser output.
- The ParserOptions for a page view come from Article::getParserOptions (via WikiPage::makeParserOptions and ParserOptions::newFromContext), where ParserOptions::newFromContext takes care of user preferences such as language variant. Then, before invoking the ParserCache and Parser, Article.php runs the the onArticleParserOptions hook, which may call setTargetLanguage to render the article in a different language code (or variant).
- Today, I believe, when ParserOutput formats its HTML in getText, it adds the mw-parser-output class, but it does not set any lang or dir attributes. It is currently the responsibility of the skin to do this.
At the Skin layer, and in the OutputPage class, there are a handful of calls to Title::getPageViewLanguage() that each independently try to determine what language the parser output is rendered in by taking the page language (from the page source/database) and applying a similar "current User" transformation to it for language variants as ParserOptions::newFromContext.
The "page view language" is then used to set the mw-content-ltr (or rtl) class name, and lang/dir attributes on the <div> that wraps the parser output (in Skin::wrapHTML, called from SkinTemplate). It is also used by OutputPage for mw.config wgPageContentLanguage and wgPageViewLanguage (ref T303375).
Problem
The "page view language" (from Title::getPageViewLanguage) has no influence on the actual rendering of the parser output. It does not factor into ParserOptions, or ParserCache key, or the ParserOutput. Instead, this method merely attempts to reverse engineering it. This is duplicative, risky, and we now know of at least one case where this is wrong (https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaIncubator/+/935091, ref T299369, uses the onArticleParserOptions hook and ParserOptions::setTargetLanguage method).
Ideally, I think, the Skin layer should be informed by OutputPage/ParserOutput/Article.php what the language of the rendered content is. It shouldn't have to reverse engineer or guess anything. Injecting from there into the Skin layer would also reduce computation overhead from re-computing it.
Once the ParserOutput object is available, there is not an obvious way to obtain the target language.
Strawman proposal
- As far as I can tell, the result of Title::getPageViewLanguage is never used to "decide" anything. I suggest we instead let ParserOutput dictate the user/lang attributes the Skin should use when wrapping the parser output in <div class="mw-content-ltr">. The element can remain owned by the Skin so that it can customise this, but the values would be informed by ParserOutput metadata instead of from the Title object.
- The recently introduced ParserOutput::getLanguage might be suitable for this purpose (cc @cscott, change 836297. Otherwise, we may need to introduce a new method to expose this information.
- If the above are both feasible and considered improvements, then we should deprecate and remove Title::getPageViewLanguage. While this is still a relatively new method, it appears by design any caller will be misinformed. It is also a violation of current design principles, because it involves implied global state (it doesn't even take a User or RequestContext parameter currently). The replacement would be to defer to the Skin, which in turn defers to the combined ParserOutput object that Article.php injects into OutputPage.