Maniphest T210022

Allow access to Data Lake/Hive for Niharika
Closed, Resolved | Public

Description

I'm responsible for analyzing Eventlogging data for Community Tech projects (TemplateWizard at the moment) and would like to have access to Hive so I can run the queries I need to.

I do already have production access and stat1006 access, if that helps.

SRE Clinic Duty Checklist for Access Requests

Most requirements are outlined on https://wikitech.wikimedia.org/wiki/Requesting_shell_access

This checklist should be used on all access requests to ensure that all steps are covered. This includes expansions of existing access. Please do not check off items on the list below unless you are in Ops and have confirmed the step.

  • - User has signed the L3 Acknowledgement of Wikimedia Server Access Responsibilities Document. - DONE
  • - User has a valid NDA on file with WMF legal. (This can be checked by Operations via the NDA tracking sheet & is included in all WMF Staff/Contractor hiring.) - existing staff and existing production shell user, done
  • - User has provided the following: wikitech username, preferred shell username, email address, and full reasoning for access (including what commands and/or tasks they expect to perform).
  • - User has provided a public SSH key. This SSH key pair should only be used for WMF cluster access and not be shared with any other service (this includes not sharing with WMCS access; no shared keys). - existing user, done
  • - Access request (or expansion) has sign-off of WMF sponsor/manager (sponsor for volunteers, manager for WMF staff)
  • - non-sudo requests: 3 business day wait must pass with no objections being noted on the task
  • - Patchset for access request

Event Timeline

RobH triaged this task as Medium priority. Nov 21 2018, 6:06 PM
RobH updated the task description.
RobH moved this task from Untriaged to Awaiting User Input on the SRE-Access-Requests board.
RobH subscribed.

@Niharika: Please review https://wikitech.wikimedia.org/wiki/Analytics/Data_access#Access_Groups

It seems that Hive has both public and private data options. I would assume you need the private data for more info, but assuming on access requests is typically non-ideal, so I'd rather have you confirm.

What data specifically will you need to access, so we can determine whether it's in the private data or not? (If you know the answer, please share!) Both of these are non-sudo, so once we have an answer it's simply a 3 business day wait for objections to possibly be noted.

Please provide feedback and unassign yourself when done (clinic duty will pick it back up). Also, please have your manager approve the expansion on this task.

Thanks!

@RobH Currently I only need access to Eventlogging data for TemplateWizard (As mentioned in task). I don't know if it's public or private - @Milimetric can perhaps confirm.

@DannyH Could you sign off on this?

@Milimetric: I'm assigning this to you for feedback on whether @Niharika needs the private-data version or not. Please advise and unassign yourself from the task, and it'll be picked back up by clinic duty.

Private. Niharika would benefit from being a part of analytics-privatedata-users, including access to data before it's sanitized.

Thanks!

Ok, so this is now a request to expand @Niharika's access to include the analytics-privatedata-users group.

If no objections are noted, this can merge live after 3 business days (Thursday is a holiday in the US), so next Monday, 2018-11-26.

Change 475736 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Add Niharika to analytics-privatedata-users group

https://gerrit.wikimedia.org/r/475736

Change 475736 merged by Ottomata:
[operations/puppet@production] admin: Add Niharika to analytics-privatedata-users group

https://gerrit.wikimedia.org/r/475736