Maniphest T59302

Suggest case insensitive results when searching for categories to add
Open, Lowest | Public | Feature

Description

lowercase afc

  • log in to http://en.wikipedia.beta.wmflabs.org
  • enable visual editor, if not already enabled
  • go to any page
  • edit it using visual editor
  • click the hamburger icon
  • click categories
  • categories screen opens
  • type this into text box: AFC
  • two matching categories are found (so far so good)
  • type this into text box: afc
  • I was expecting the same result as I got for AFC
  • but no matching categories were found
  • see attached screen shots

Version: unspecified
Severity: enhancement

Attached:

lowercase.png (821×1 px, 183 KB)

Details

Reference
bz57302

Event Timeline

bzimport raised the priority of this task from to Low. Nov 22 2014, 2:27 AM
bzimport set Reference to bz57302.

Created attachment 13851
uppercase AFC

Attached:

uppercase.png (821×1 px, 185 KB)

This would require a change to the MW core API module...

(In reply to Alex Monk from comment #2)

This would require a change to the MW core API module...

Hence "low". :-)

Jdforrester-WMF renamed this task from "VisualEditor: Suggest case insensitive results when searching for categories to add" to "Suggest case insensitive results when searching for categories to add". Jan 23 2015, 10:19 PM
Jdforrester-WMF removed Krenair as the assignee of this task.
Jdforrester-WMF lowered the priority of this task from Low to Lowest.
Jdforrester-WMF moved this task from TR6: Visual diffs to Freezer on the VisualEditor board.
Jdforrester-WMF set Security to None.

This behaviour is strange, because we expect it to work like article search.
The article search engine has recently been improved to become more and more flexible, whereas the category search engine is very strict.

Is it not possible to use the article search engine with (invisible) category: prefix instead?

AlexMonk-WMF subscribed.

Is it not possible to use the article search engine with (invisible) category: prefix instead?

Wouldn't that search for pages in the category namespace, rather than actual categories? Some categories don't have associated pages, and you can create pages in the category namespace for non-existent categories.

Peternybor raised the priority of this task from Lowest to Medium. Nov 3 2017, 1:56 PM

Can this issue be fixed?

Deskana lowered the priority of this task from Medium to Lowest. Nov 3 2017, 2:16 PM
Deskana subscribed.

Please only change task priority if you are responsible for the project in question.

Can this issue be fixed?

This will not be worked on by the Editing team any time soon. As mentioned above, it is a technically challenging project which involves changing MediaWiki core, and the task is not important enough to warrant that amount of effort.

If someone else wishes to work on it, they are welcome to.

To clarify:

  • Our regular search feature (aka "prefix index"), used for the main search field and for the input field when creating an article link, is case-insensitive in most cases. On Wikimedia wikis this comes from CirrusSearch. On other wikis (and on WMF until recently) this was provided by the TitleKey extension. The search feature has a namespace filter as well, which would allow us to do a case-insensitive search of page titles in the Category namespace.
  • The API for category names is not based on wiki pages existing in the Category namespace, for the same reason that our API for user names is not based on wiki pages in the User namespace. Categories, just like users, can exist without there being a description page for them.

In order to be able to have case-insensitive search for category names, as well as other normalisation features, we'd need to build a separate search index based on the category table. Or we could maybe hack something up in the current API that combines results from the wikipage search index for the case where a description page does happen to exist - however, it'll be quite difficult to combine those results in a meaningful way. Not to mention paging (page 2, page 3), and query performance (double the number of backend queries).
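A minimal Python sketch of the "hack something up" option described above, to show where the difficulty comes from. The helper names and the sample titles are hypothetical; only the query parameters (list=allcategories with acprefix, list=prefixsearch with pssearch and psnamespace, and namespace 14 for Category) are taken from the MediaWiki action API. This is an illustration, not how core works or would necessarily do it.

```python
def allcategories_params(prefix, limit=10):
    """Parameters for the category-name lookup (prefix match is case-sensitive)."""
    return {"action": "query", "list": "allcategories",
            "acprefix": prefix, "aclimit": limit, "format": "json"}


def prefixsearch_params(prefix, limit=10):
    """Parameters for the page-title search, restricted to the Category namespace."""
    return {"action": "query", "list": "prefixsearch",
            "pssearch": prefix, "psnamespace": 14, "pslimit": limit, "format": "json"}


def strip_namespace(title):
    """Drop a leading 'Category:' so both result lists use bare category names."""
    prefix = "Category:"
    return title[len(prefix):] if title.startswith(prefix) else title


def merge_results(category_names, page_titles):
    """Union of both result lists, first-seen order, duplicates removed."""
    merged, seen = [], set()
    for name in list(category_names) + [strip_namespace(t) for t in page_titles]:
        if name not in seen:
            seen.add(name)
            merged.append(name)
    return merged


# Hypothetical responses for the query "AFC" / "afc".
from_category_api = ["AFC submissions"]
from_page_search = ["Category:AFC submissions", "Category:Afc helpers"]
print(merge_results(from_category_api, from_page_search))
```

Even in this toy form, the merged list needs two backend queries per keystroke, and each query has its own continuation point, so "page 2" of the merged list is not well defined without tracking both.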

Is it not possible to use the article search engine with (invisible) category: prefix instead?

Wouldn't that search for pages in the category namespace, rather than actual categories? Some categories don't have associated pages, and you can create pages in the category namespace for non-existent categories.

Do we have stats? How many categories don’t have a description page? How many pages in Category: namespace are not expected to be categories?
Plus, are these normal cases, or just rare issues that are expected to be fixed on wikis?

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:14 AM

Current engine:

  • finds “wanted categories” (listed by Special:WantedCategories).
  • does not find “unused categories” (listed by Special:UnusedCategories).
  • is wrongly case sensitive.

Namespace-prefix page engine:

  • would find unused categories.
  • would be case insensitive.
  • would not find wanted categories.

Technically a category “exists” if any page links it.
However, for end users, a category “exists” if a category page has been created.
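The comparison above can be sketched as a small Python model. Everything here is hypothetical (the category names, the two sets, and both function names); it only mirrors the two lists in this comment and is not a description of the real implementation.

```python
# Hypothetical wiki state, for illustration only.
categories_in_use = {"AFC submissions", "Wanted topic"}  # at least one page is tagged with these
category_pages = {"AFC submissions", "Unused topic"}     # a page exists in the Category namespace

wanted = categories_in_use - category_pages   # in use, but no category page yet
unused = category_pages - categories_in_use   # category page exists, but nothing is tagged


def current_engine(query):
    """Case-sensitive prefix match over categories that are in use."""
    return sorted(c for c in categories_in_use if c.startswith(query))


def namespace_prefix_engine(query):
    """Case-insensitive prefix match over existing category pages."""
    q = query.casefold()
    return sorted(c for c in category_pages if c.casefold().startswith(q))


for q in ("AFC", "afc", "wanted", "unused"):
    print(q, current_engine(q), namespace_prefix_engine(q))
```

In this model, "afc" returns nothing from the current engine but finds "AFC submissions" in the namespace-prefix engine, while "Wanted topic" can only ever be found by the current engine.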

@Editing-team, do we really need to open an RFC to demonstrate that a namespace-prefixed page engine would better satisfy editor needs?

I think this (or something more advanced) is really really needed.

commons.wikimedia.org depends heavily on categories for its usefulness, and for an image to have categories, the user taking the picture needs to add them.

However, whichever API is used to help users find categories for their upload, each has severe deficiencies. Even when one tries to make a union of results via multiple API calls, it still doesn't work (as tons of hidden categories are then displayed to the users, i.e. exclusions obviously don't work with unions).

From https://github.com/commons-app/apps-android-commons/issues/3179#issuecomment-2145228188 (also see the table of examples in the post above that one for details):

API | case-insensitive | categories without Category:* page | ignores missing non-alphanumerics | matches partial words | matches in the middle of category name / missing words | skips hidden | skips renamed / moved
Special:UploadWizard action=opensearch ✔️✔️
generator=search ✔️✔️✔️
generator=allcategories ✔️✔️✔️

Ideally, there would be a search which has a row full of green checkboxes in that table. Alas, there is nothing even close to it.

Thus, it is simply impossible to let users reasonably select categories. I'm puzzled how that issue can have such a low priority.

If generator=allcategories at least had the capability to ignore case, that would make it mostly usable for searching categories. Requiring users to know the exact case (or to "brute-force" all combinations of uppercase and lowercase for every word in their search term), as is currently the situation, is not going to work. Users often won't find categories for their media, so Commons images get uploaded without categories or with inferior (too imprecise) categories, and are much less usable as a result.
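To put a number on the "brute-force" point, here is a rough Python sketch that enumerates only the simplest variants: each word with its first letter either capitalised or lower-case. The function name is hypothetical and this is not something any client is known to do; it just shows how quickly the number of queries grows.

```python
from itertools import product


def case_variants(term):
    """Every combination of capitalised / lower-case first letters, one choice per word."""
    words = term.split()
    # For each word, the distinct forms with an upper-case or lower-case first letter.
    options = [sorted({w[:1].upper() + w[1:], w[:1].lower() + w[1:]}) for w in words]
    return [" ".join(combo) for combo in product(*options)]


variants = case_variants("afc submissions")
print(len(variants), variants)
```

Two words already mean four queries, three words mean eight, and the list still misses an all-caps spelling such as "AFC submissions", so covering acronyms would need even more variants.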

Technically a category “exists” if any page links it.
However, for end users, a category “exists” if a category page has been created.

Worth considering that this is a bit wiki-dependent, and is more true on the big wikis. Outside of the wikiprojects, it'd also affect people who're running their own small MediaWiki instance, who might not have actually created category pages. I'd prefer not to make this trade-off, since there's a decent set of category results that'd effectively be hidden by it.

I'd prefer not to make this trade-off, since there's a decent set of category results that'd effectively be hidden by it.

Could you clarify which trade-off you mean exactly, @DLynch? If I understood correctly, "for end users, a category “exists” if a category page has been created" was just an observation, not an improvement suggestion (if you were replying to that).

The improvement suggestion was to make generator=allcategories case-insensitive, and that could only increase the number of search results, not hide some of them, right?

@Mnalis I was quoting @Pols12's suggestion that they'd like to run an RfC to switch to a prefix search in the Category namespace, i.e. having the category-editor only allow search for categories that have a page created. In retrospect this was replying to something from over a year ago, rather than your recent quite-appreciated summary of options, so it was perhaps a bit unnecessary.

I surely appreciate your answer, thank you, even if it is late and opposed to my opinion. 🙂