[A/B Test] Run an A/B test to evaluate impact of mobile DiscussionTools
Closed, Resolved · Public

Assigned To

Authored By

	ppelberg
	Dec 21 2021, 12:12 AM

Description

This task represents the work involved in running an A/B test to evaluate the impact of
disabling the MobileFrontend talk page overlay and introducing the suite of mobile DiscussionTools:

Reply Tool
New Topic Tool
Topic Subscriptions
Usability Improvements

Research Questions

In running this A/B test, we are seeking to learn whether introducing the set of DiscussionTools listed above causes the following to happen:

Junior Contributors are more successful publishing new talk page comments and discussion topics
Junior Contributors intuitively recognize talk pages as spaces to communicate with other volunteers
Senior Contributors can assess the level of activity on a talk page with less effort

Decision to be made

This A/B test will help us make the following decision: Is the set of mobile Talk Pages Project features fit to be made available to everyone, at all wikis, by default?

Decision Matrix

We do not think a single metric / KPI will be sufficient for evaluating the cumulative impact of the set of DiscussionTools we are introducing in this test.

Reason being: we do not think there is a single metric that is likely to A) move in response to these changes *and* B) move in a direction that would indicate a clear improvement or degradation in people's user experience.

In line with the above, we will take a "guardrail" approach to this analysis. That is, we will base the Decision to be made on the presence or absence of two unambiguously negative outcomes.

ID	Scenario	Indicator/Metric	Plan of action
1.	People are more likely to make destructive edits	Proportion of published edits that are reverted within 48 hours of being made increases by >10% over a sustained period of time	1) Pause plans for wider deployment, 2) To contextualize change in revert rate, investigate changes in the number of published edits (maybe higher revert rate is a "price" we're willing to "pay" for the increase in good edits), 3) Investigate the type of edits being reverted to understand how the new tools – namely the Reply and New Topic Tools – could be contributing to the uptick
2.	People are less likely to publish the edits they start	Edit completion rate decreases by >10% over a sustained period of time	1) Pause plans for wider deployment and 2) Investigate what patterns exist among the people whose edit completion rate has gone down
3.	People do NOT encounter more difficulty publishing edits and there are no regressions in edit revert and edit completion rates	A) Edit completion rate increases by any percentage or it decreases by <10% over a sustained period of time and B) Edit revert rate decreases by any percentage or it increases by <10% over a sustained period of time	Move forward with opt-out deployment at all Wikimedia wikis
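The three scenarios above can be read as a small decision rule. The following is a minimal sketch, not the team's actual analysis code. It assumes the ">10%" thresholds are relative changes between the control and test groups, that "sustained" is judged separately and passed in as a flag, and that scenario 1 takes precedence when both guardrails trip. The names `Observation` and `scenario` are hypothetical, and a change of exactly 10% (which the table does not cover) falls into scenario 3 here.

```python
from dataclasses import dataclass

# Assumption: ">10%" is interpreted as a relative change versus the control group.
THRESHOLD = 0.10


@dataclass
class Observation:
    revert_rate_change: float      # relative change, e.g. 0.12 means +12%
    completion_rate_change: float  # relative change, e.g. -0.05 means -5%
    sustained: bool                # hypothetical flag: did the change persist over time?


def scenario(obs: Observation) -> int:
    """Return the Decision Matrix scenario ID (1, 2, or 3) for an observation."""
    if obs.sustained and obs.revert_rate_change > THRESHOLD:
        return 1  # guardrail 1: more destructive edits, pause wider deployment
    if obs.sustained and obs.completion_rate_change < -THRESHOLD:
        return 2  # guardrail 2: fewer started edits get published, pause wider deployment
    return 3      # neither guardrail tripped: move forward with opt-out deployment
```

For example, `scenario(Observation(0.02, 0.30, True))` returns `3`, while a short-lived spike (`sustained=False`) never trips a guardrail under this reading.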

Curiosities

While the scenarios listed in the Decision Matrix section above will guide

Priority	Impact/Outcome	Metric
1.	Junior Contributors intuitively recognize talk pages as places to communicate	Percentage of unique Junior Contributors who visit a talk page and engage with it in some way. //Read: expanding a discussion section, initiating the workflow for starting a new discussion, initiating the workflow for replying to a comment someone else has made, etc.
2.	Senior Contributors can assess the level of activity on a talk page with less effort	Average time between when a contributor views a talk page and when they first engage with the page in some way
3.	People across experience levels are more successful publishing new talk page comments and discussion topics	A) Average number of talk page new topics or comments people publish during the course of the test and B) Percentage of people that edit a talk page, grouped by number of new topics or comments (e.g. 1-5, 6-10, 11-15, etc) they publish during the course of the test
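Metric 3B groups people by how many new topics or comments they publish (1-5, 6-10, 11-15, etc.). A minimal sketch of that grouping follows. The function names and the input shape (one count per person) are assumptions made for illustration, and people with zero published comments are left out because the metric only covers people who edited a talk page.

```python
from collections import Counter
from typing import Dict, Iterable


def bucket_label(count: int, width: int = 5) -> str:
    """Map a per-person count to a label such as '1-5', '6-10' or '11-15'."""
    if count < 1:
        raise ValueError("count must be at least 1")
    low = ((count - 1) // width) * width + 1
    return f"{low}-{low + width - 1}"


def share_by_bucket(counts_per_person: Iterable[int]) -> Dict[str, float]:
    """Return the percentage of people that fall into each bucket."""
    labels = Counter(bucket_label(c) for c in counts_per_person if c >= 1)
    total = sum(labels.values())
    return {label: 100 * n / total for label, n in labels.items()}
```

With made-up counts, `share_by_bucket([1, 3, 7, 12])` yields 50% of people in "1-5" and 25% each in "6-10" and "11-15".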

Wikis

This section will contain the list of wikis participating in the A/B test. See T314950.

NOTE: In aggregate, there should be at least 2,000 people using mobile web talk pages at the wikis included in the A/B test and a minimum of 15 distinct wikis included in the test. Also, to draw conclusions about any individual wiki in the test, there will need to be ~200 unique people using talk pages while the test is running. More in T298271#7641238.
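The size requirements in the note above can be checked mechanically once per-wiki usage counts are known. This is a sketch under stated assumptions: the input is a hypothetical mapping from wiki name to the number of unique people using mobile web talk pages, and the "~200" figure is treated as a hard cutoff purely for illustration.

```python
from __future__ import annotations

MIN_TOTAL_PEOPLE = 2000     # aggregate minimum across all participating wikis
MIN_WIKIS = 15              # minimum number of distinct wikis
MIN_PEOPLE_PER_WIKI = 200   # "~200" in the note; treated as exact here (assumption)


def check_test_size(people_per_wiki: dict[str, int]) -> dict:
    """Summarize whether a candidate set of wikis meets the stated size requirements."""
    total = sum(people_per_wiki.values())
    return {
        "enough_people": total >= MIN_TOTAL_PEOPLE,
        "enough_wikis": len(people_per_wiki) >= MIN_WIKIS,
        # Wikis large enough to support conclusions about that wiki on its own.
        "wikis_for_individual_conclusions": sorted(
            wiki for wiki, n in people_per_wiki.items() if n >= MIN_PEOPLE_PER_WIKI
        ),
    }
```

A set of wikis can satisfy the aggregate requirements while no single wiki is large enough for wiki-level conclusions, which is why the third entry is reported separately.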

Open Questions

1. Per the question @dchan raised in Editing Scratch, how long do we anticipate the A/B test needing to run, given the number of people using mobile talk pages and the frequency with which they are using them? See T295180 for more details on mobile talk page usage.
2. Should the A/B test be limited to wikis that have NOT had access to any mobile talk page improvements via T298221 or T298222 to-date? See T297448#7575858 for more context.
- Yes. The wikis involved in this A/B test will be limited to those that have NOT had access to any mobile DT features prior to the test beginning. See Selection Criteria within T314950's description for more details.

Done

A report is published that evaluates the Hypotheses above

Related Objects

Status	Subtype	Assigned	Task
Resolved		matmarex	T319145 Remove MobileFrontend/Minerva code for talk page tools
Resolved		Jdlrobson	T312312 Deprecate MobileFrontend's "About this page" functionality on talk pages
Resolved		None	T278588 Mobile talk page improvements
Resolved	BUG REPORT	matmarex	T326080 User talk page seems to be empty in mobile
Resolved		matmarex	T327047 English Wikipedia WikiProject banners no longer visible on mobile
Resolved		None	T298060 [RELEASE TICKET] Offer Mobile DiscussionTools at All Wikis
Resolved		ppelberg	T298062 [A/B Test] Run an A/B test to evaluate impact of mobile DiscussionTools
Declined		None	T298065 [Impact Analysis] Evaluate Impact of mobile Topic Subscriptions + Usability Improvements
Resolved		• Whatamidoing-WMF	T314950 Identify wikis for mobile DiscussionTools A/B test
Resolved		ppelberg	T302108 Ensure logging is in place to compare MobileFrontend and DiscussionTools new topic and new comment completion rates
Resolved		MNeisler	T301026 Create instrumentation spec for mobile DiscussionTools
Duplicate		MNeisler	T303653 [SPIKE] Investigate how sessionID is used in various mobile talk page schemas
Resolved		DLynch	T303654 [SPIKE] Investigate practicality of making sampling rate consistent across mobile talk page schemas
Resolved		ppelberg	T307640 Add page view token to UIAction schemas
Resolved		ppelberg	T311612 Reduce weight and/or size of headings on talk pages
Resolved		matmarex	T318110 Overflow menu does not appear within sections that do not contain signed comments
Resolved		DLynch	T318302 [Config Change] Enable all DiscussionTools by default at partner wikis (mobile)
Resolved		DLynch	T318871 Implement config that enables us to exclude ja.wiki from receiving mobile visual enhancements/usability improvemets
Resolved		matmarex	T318870 [Config Change] Enable all DiscussionTools by default at ja.wiki (mobile)
Resolved		matmarex	T319148 Reply Tool (mobile) remains visible when section is collapsed
Resolved		matmarex	T320755 "Start a discussion" and "Read as wiki page" buttons have incorrect margins
Resolved		DLynch	T320993 Implement mobile DiscussionTools A/B test bucketing
Resolved		ppelberg	T321961 [Config Change] Start mobile DiscussionTools A/B test
Resolved		Ryasmeen	T323400 The sticky "Add topic" button behaves inconsistently when scrolled to the top of a talk page on iOS
Resolved		Ryasmeen	T320997 Verify suite of mobile DiscussionTools are working as expected prior to A/B test start
Resolved		matmarex	T316175 Make the mobile Add Topic button easier for people to access
Resolved		ppelberg	T321734 Extend the MobileWebUIActions sampling rate to A/B test wiki
Resolved		matmarex	T323171 "Learn more about this page" button doesn't appear as expected in mobile DiscussionTools
Resolved		ppelberg	T312309 Unify experience for making lead section content available on mobile talk pages
Resolved		matmarex	T324686 [Regression ?] The section edit icon on MobileFrontend stops working after posting a reply on production
Resolved		matmarex	T324702 Deemphasize treatment of "Learn more about this page" Link

Event Timeline


ppelberg updated the task description.Dec 21 2021, 12:17 AM

ppelberg added a subscriber: dchan.

ppelberg mentioned this in T298065: [Impact Analysis] Evaluate Impact of mobile Topic Subscriptions + Usability Improvements.Dec 21 2021, 12:37 AM

ppelberg moved this task from Backlog to Triaged on the DiscussionTools board.Dec 22 2021, 4:07 PM

ppelberg updated the task description.Dec 23 2021, 1:17 AM

ppelberg mentioned this in T297448: [SPIKE] Review approach to improving mobile talk pages.Dec 23 2021, 1:20 AM

ppelberg mentioned this in T298271: [SPIKE] Determine mobile talk page usage volume needs.Dec 23 2021, 9:17 PM

ppelberg updated the task description.Feb 5 2022, 12:08 AM

ppelberg updated the task description.Feb 5 2022, 12:11 AM

ppelberg mentioned this in T301026: Create instrumentation spec for mobile DiscussionTools.Feb 5 2022, 2:09 AM

ppelberg mentioned this in T295490: Make click tracking work on reply tool talk page.Feb 5 2022, 2:14 AM

MNeisler mentioned this in T302999: Create instrumentation spec for Usability Improvements.Mar 10 2022, 8:36 PM

ppelberg mentioned this in T304037: Ensure Editing Team's use of DesktopWebUIActions schema will not interfere with Web Team's plans.Mar 17 2022, 12:43 AM

MNeisler mentioned this in T303654: [SPIKE] Investigate practicality of making sampling rate consistent across mobile talk page schemas.May 18 2022, 6:34 PM

ppelberg mentioned this in T312685: [SPIKE] Decide whether additional config is necessary to isolate mobile and desktop Usability Improvement deployments.Jul 20 2022, 11:05 PM

ppelberg closed subtask T298065: [Impact Analysis] Evaluate Impact of mobile Topic Subscriptions + Usability Improvements as Declined.Jul 27 2022, 9:37 PM

ppelberg mentioned this in T298058: [Impact Analysis] Evaluate Impact of Mobile Reply and New Discussion Tools.Jul 27 2022, 9:46 PM

ppelberg updated the task description.Aug 5 2022, 6:23 PM

ppelberg updated the task description.Aug 17 2022, 7:29 PM

ppelberg updated the task description.

ppelberg updated the task description.Aug 17 2022, 7:43 PM

ppelberg added a subtask: T302108: Ensure logging is in place to compare MobileFrontend and DiscussionTools new topic and new comment completion rates.Aug 17 2022, 8:13 PM

ppelberg added a subtask: T315710: Revisit mobile Topic Container spacing.Aug 19 2022, 11:47 PM

ppelberg mentioned this in T302108: Ensure logging is in place to compare MobileFrontend and DiscussionTools new topic and new comment completion rates.Sep 3 2022, 12:50 AM

ppelberg added a subtask: T311612: Reduce weight and/or size of headings on talk pages.Sep 10 2022, 12:21 AM

ppelberg added a subtask: T316175: Make the mobile Add Topic button easier for people to access .Sep 13 2022, 4:44 PM

ppelberg added a subtask: T318110: Overflow menu does not appear within sections that do not contain signed comments.Sep 19 2022, 6:33 PM

ppelberg mentioned this in T318302: [Config Change] Enable all DiscussionTools by default at partner wikis (mobile).Sep 22 2022, 1:15 AM

ppelberg closed subtask T311612: Reduce weight and/or size of headings on talk pages as Resolved.Sep 22 2022, 4:44 PM

ppelberg mentioned this in T318870: [Config Change] Enable all DiscussionTools by default at ja.wiki (mobile).Sep 28 2022, 9:05 PM

ppelberg removed a subtask: T318868: Remove config that insulated ja.wiki from other mobile DiscussionTools deployments .

ppelberg added a subtask: T319148: Reply Tool (mobile) remains visible when section is collapsed.Oct 2 2022, 3:17 PM

ppelberg mentioned this in T298287: Bucketing does not occur until after page reload.Oct 12 2022, 3:55 PM

ppelberg removed a subtask: T298287: Bucketing does not occur until after page reload.

MNeisler claimed this task.Oct 12 2022, 6:34 PM

MNeisler triaged this task as Medium priority.

MNeisler added a project: Product-Analytics (Kanban).

MNeisler edited projects, added Product-Analytics; removed Product-Analytics (Kanban).

MNeisler moved this task from Triage to Current Quarter on the Product-Analytics board.

ppelberg added a subtask: T320755: "Start a discussion" and "Read as wiki page" buttons have incorrect margins.Oct 14 2022, 6:57 PM

ppelberg mentioned this in T320993: Implement mobile DiscussionTools A/B test bucketing.Oct 17 2022, 6:14 PM

ppelberg mentioned this in T320997: Verify suite of mobile DiscussionTools are working as expected prior to A/B test start.Oct 17 2022, 6:27 PM

ppelberg closed subtask T318302: [Config Change] Enable all DiscussionTools by default at partner wikis (mobile) as Resolved.Oct 18 2022, 12:37 AM

ppelberg closed subtask T318110: Overflow menu does not appear within sections that do not contain signed comments as Resolved.Oct 20 2022, 5:09 PM

VPuffetMichel added a project: TPP-Scaling.Oct 26 2022, 1:29 PM

VPuffetMichel moved this task from Scaling Epics to Deployment Phases on the TPP-Scaling board.Oct 26 2022, 1:33 PM

VPuffetMichel moved this task from Deployment Phases to Mobile DiscussionTools Deployment Phases on the TPP-Scaling board.Oct 26 2022, 2:09 PM

ppelberg moved this task from Mobile DiscussionTools Deployment Phases to Deployment Phases on the TPP-Scaling board.Oct 28 2022, 10:21 PM

ppelberg removed a project: TPP-Scaling.Oct 28 2022, 10:26 PM

ppelberg mentioned this in T321734: Extend the MobileWebUIActions sampling rate to A/B test wiki.Oct 28 2022, 11:03 PM

ppelberg closed subtask T314950: Identify wikis for mobile DiscussionTools A/B test as Resolved.Oct 28 2022, 11:11 PM

ppelberg closed subtask T302108: Ensure logging is in place to compare MobileFrontend and DiscussionTools new topic and new comment completion rates as Resolved.Nov 2 2022, 5:48 PM

ppelberg closed subtask T320755: "Start a discussion" and "Read as wiki page" buttons have incorrect margins as Resolved.Nov 4 2022, 5:05 PM

ppelberg closed subtask T319148: Reply Tool (mobile) remains visible when section is collapsed as Resolved.Nov 4 2022, 11:33 PM

matmarex mentioned this in T322492: Remove wgDiscussionToolsABTest config setting.Nov 6 2022, 9:02 PM

ppelberg closed subtask T320993: Implement mobile DiscussionTools A/B test bucketing as Resolved.Nov 8 2022, 11:05 PM

ppelberg updated the task description.Nov 9 2022, 8:24 PM

ppelberg added a subtask: T312309: Unify experience for making lead section content available on mobile talk pages.Nov 9 2022, 9:02 PM

ppelberg closed subtask T318870: [Config Change] Enable all DiscussionTools by default at ja.wiki (mobile) as Resolved.Nov 16 2022, 4:59 PM

ppelberg added a subtask: T323171: "Learn more about this page" button doesn't appear as expected in mobile DiscussionTools.Nov 16 2022, 6:11 PM

Ryasmeen changed the status of subtask T316175: Make the mobile Add Topic button easier for people to access from Open to In Progress.Nov 18 2022, 2:34 AM

ppelberg closed subtask T312309: Unify experience for making lead section content available on mobile talk pages as Resolved.Nov 19 2022, 1:16 AM

ppelberg removed a subtask: T315710: Revisit mobile Topic Container spacing.Nov 23 2022, 2:08 AM

ppelberg removed a subtask: T320997: Verify suite of mobile DiscussionTools are working as expected prior to A/B test start.Nov 23 2022, 2:15 AM

ppelberg removed a subtask: T316175: Make the mobile Add Topic button easier for people to access .Nov 23 2022, 2:20 AM

ppelberg removed a subtask: T321734: Extend the MobileWebUIActions sampling rate to A/B test wiki.

ppelberg removed a subtask: T323171: "Learn more about this page" button doesn't appear as expected in mobile DiscussionTools.

ppelberg removed a subtask: T320753: Make it easier to relocate a Reply Tool you've opened on mobile.

ppelberg removed a subtask: T312309: Unify experience for making lead section content available on mobile talk pages.

MNeisler updated the task description.Nov 30 2022, 7:56 PM

@ppelberg The metrics identified in the Decision Matrix and Curiosities sections in the Task Description look good to me. I made some small text changes to clarify what we would be measuring and to more closely match with the text outlined in the measurement plan.

Note: We do not currently list a metric to measure the following research question outlined in the measurement plan "People, across experience levels, to receive more timely responses to the talk page comments they post and discussion topics they start."

This metric would look specifically at the impact of the Topic Subscriptions feature being introduced but I think the metrics already identified would provide a better overall assessment of the impact of the suite of features being introduced. I'm fine leaving this metric off unless we identify a specific reason to look at it.

Documenting the outcomes of the conversation @MNeisler and I had offline today, in-line below...

In T298062#8433066, @MNeisler wrote:

@ppelberg The metrics identified in the Decision Matrix and Curiosities sections in the Task Description look good to me. I made some small text changes to clarify what we would be measuring and to more closely match with the text outlined in the measurement plan.

Sounds great.

Note: We do not currently list a metric to measure the following research question outlined in the measurement plan "People, across experience levels, to receive more timely responses to the talk page comments they post and discussion topics they start."

This metric would look specifically at the impact of the Topic Subscriptions feature being introduced but I think the metrics already identified would provide a better overall assessment of the impact of the suite of features being introduced. I'm fine leaving this metric off unless we identify a specific reason to look at it.

Per what Megan and I talked about offline, we will not be adding response time as a key metric that we will base deployment decisions on. However, we will consider response rate as a potential "Curiosity" to review.

I've updated the task description to reflect the above.

ppelberg updated the task description.Nov 30 2022, 11:57 PM

MNeisler moved this task from Current Quarter to Upcoming Quarter on the Product-Analytics board.Dec 9 2022, 7:45 PM

MNeisler mentioned this in T321961: [Config Change] Start mobile DiscussionTools A/B test.Dec 16 2022, 4:59 PM

MNeisler moved this task from Upcoming Quarter to Current Quarter on the Product-Analytics board.Jan 3 2023, 3:00 PM

MNeisler edited projects, added Product-Analytics (Kanban); removed Product-Analytics.Jan 20 2023, 3:55 PM

MNeisler moved this task from Next 2 weeks to Doing on the Product-Analytics (Kanban) board.

ANALYSIS UPDATE

Per what @MNeisler and I talked about offline last week (25 Jan 2023), we'll need to exclude the behavior of people who are logged out from this analysis because of the issue @Ryasmeen detected and @DLynch documented in T321961#8521374.

The "issue" here being that people who are logged out would be bucketed; however, the version of talk pages they would see (read: DiscussionTools enabled or not) would be reset each time they visited a talk page.

I've completed an initial analysis of the DiscussionTools on Mobile AB Test. Please see the report here.

I plan to update this ticket and report with a high-level summary of the results as well but wanted to provide the initial results for review.

cc @ppelberg

A quick high-level summary of some key results:

Edit Completion Rate (percent of edits started that are successfully published)

We observed a higher edit completion rate for users shown the suite of mobile DiscussionTools (test group) than for users shown either the existing MobileFrontend overlay or the Read as Wiki view (control group), both overall and on each participating Wikipedia.
- There was a 56% increase (22 percentage points) in edit completion rate for users shown the DT-enhanced view compared to the MobileFrontend overlay and a 61% increase (15.7 percentage points) compared to the existing Read as Wiki view.
- All three editing workflows on the DT-enhanced view had a higher edit completion rate than edits made by users shown the Read as Wiki view. There was a 2x increase in the edit completion rate with the Reply Tool compared to edits on the Read as Wiki view.
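The results above quote both a relative increase and a percentage-point increase for the same comparison. The sketch below shows how the two relate, using the definition of edit completion rate given above (percent of edits started that are successfully published). The function names are hypothetical and the counts in the usage note are made up for illustration, not taken from the report.

```python
def completion_rate(started: int, published: int) -> float:
    """Percent of started edits that were successfully published."""
    if started <= 0:
        raise ValueError("started must be positive")
    return 100 * published / started


def percentage_point_diff(control_rate: float, test_rate: float) -> float:
    """Absolute difference between two rates, in percentage points."""
    return test_rate - control_rate


def relative_change(control_rate: float, test_rate: float) -> float:
    """Change in the test rate expressed as a percent of the control rate."""
    return 100 * (test_rate - control_rate) / control_rate
```

With illustrative counts of 50 published out of 200 started (control) and 80 out of 200 (test), the rates are 25% and 40%: a 15 percentage point difference, which is a 60% relative increase.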

Screen Shot 2023-02-02 at 12.40.52 PM.png (497×1 px, 54 KB)

We also observed an increase in completion rates when comparing usage of the Reply Tool and New Topic Tool to the existing replying and new topic workflows on MobileFrontend.

Screen Shot 2023-02-02 at 1.18.08 PM.png (497×1 px, 50 KB)

There was a significantly higher proportion of distinct editors that successfully saved at least 1 mobile talk page edit when shown the mobile suite of DiscussionTools.

experiment_group	Proportion of users that published at least 1 mobile talk page edit
control	15.42%
test	48.3%

The revert rate for mobile talk page edits completed by users in the test group was 4.4 percentage points higher than the revert rate for users in the control group. The initial spike in the revert rate observed on the first day of the test was not sustained. (Note: I checked these numbers against the data in the recently available mediawiki_history snapshot and the results were the same.)

experiment_group	Number of saved edits	Number of reverts	Percent edits reverted within 48 hours
control	2301	87	3.8%
test	3701	302	8.2%
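The revert rates and the 4.4 percentage point gap can be reproduced from the counts in the table above. This is a small worked check, not the analysis code behind the report. The dictionary layout and the function name are illustrative.

```python
# Counts copied from the table above (saved edits and reverts within 48 hours).
groups = {
    "control": {"saved": 2301, "reverted": 87},
    "test": {"saved": 3701, "reverted": 302},
}


def revert_rate(saved: int, reverted: int) -> float:
    """Percent of saved edits that were reverted."""
    return 100 * reverted / saved


rates = {name: revert_rate(**counts) for name, counts in groups.items()}
diff_pp = rates["test"] - rates["control"]  # difference in percentage points

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}%")
print(f"difference: {diff_pp:.1f} percentage points")
```

Rounded to one decimal place, the rates come out to 3.8% and 8.2%, matching the table.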

Screen Shot 2023-02-02 at 12.33.29 PM.png (607×1 px, 78 KB)

@ppelberg - Reassigning to you for review. Please let me know if you have any questions.

Next steps

1. @ppelberg to draft summary for publishing on-wiki: https://www.mediawiki.org/wiki/Talk_pages_project/Mobile#Impact
2. @MNeisler to review "1."
3. @ppelberg to publish "2."

ppelberg moved this task from Incoming to Blocked / Needs More Work on the Editing-team (Kanban Board) board.Feb 4 2023, 2:20 AM

In T298062#8586886, @ppelberg wrote:

Next steps

1. @ppelberg to draft summary for publishing on-wiki: https://www.mediawiki.org/wiki/Talk_pages_project/Mobile#Impact

2. @MNeisler to review "1."

3. @ppelberg to publish "2."

https://www.mediawiki.org/wiki/Talk_pages_project/Mobile#9_February_2023

Based on the results @MNeisler shared in T298062#8579981 and T298062#8583003, the Editing Team considers us to be in scenario 3. in the task description's Decision Matrix: "People do NOT encounter more difficulty publishing edits and there are no regressions in edit revert and edit completion rates."

As such, we are proceeding with plans to offer the suite of mobile DiscussionTools as a default-on feature to everyone (logged in and out) at all Wikimedia wikis in the coming weeks.

We will be tracking these deployments in T298060.

Now, as it relates to revert rate: the revert rate for edits published by people who were shown the DiscussionTools version of talk pages was 4.4 percentage points higher (8.16% vs. 3.78%) than the revert rate for edits published by people who were shown the existing MobileFrontend experience.

We are comfortable with this increase for the following reasons:

With people publishing more edits, as a consequence of it being easier to do so, we anticipated an increase in revert rate. See Scenario 1.
The absolute number of edits that people are publishing using mobile talk pages, combined with the absence of feedback we've heard from experienced volunteers about mobile talk page disruption, leads us to think the increase in revert rate we see as part of this test is not [yet] negatively impacting wikis and the people who moderate them.
We think the revert rate of the pre-DiscussionTools state of mobile talk pages might've been suppressed by how much difficulty people have had publishing edits on talk pages, as evidenced by the 56% increase in the rate at which people who were shown the DiscussionTools version of talk pages by default published edits.

ppelberg closed this task as Resolved.Feb 10 2023, 10:31 PM

Restricted Application added a project: User-Ryasmeen.Feb 10 2023, 10:31 PM

ppelberg mentioned this in T328940: [Config Change] Enable all DiscussionTools as default-on features at Phase 1 wikis (mobile).Feb 13 2023, 7:49 PM

ppelberg closed subtask T321961: [Config Change] Start mobile DiscussionTools A/B test as Resolved.Feb 16 2023, 4:53 PM

