Job openings/Systems Engineer - Data Analytics: Difference between revisions

From Wikimedia Foundation Governance Wiki
m Title change and minor tweaks to JD
Line 2: Line 2:
'''YOU ARE ...'''
'''YOU ARE ...'''


... a person who loves crunching numbers and writing code to help understand and improve complex social systems. Wikipedia and its sister projects seem like a very exciting potential playground to you, and you enjoy the reward of discovery as a result of hard work.
... a person who loves building scalable systems for handling a large amount of data. You enjoy working with a team using data to understand and improve complex social systems. Wikipedia and its sister projects seem like a very exciting potential playground to you, and you enjoy the reward of discovery as a result of hard work.


You are resourceful, inquisitive, and collaborative. You understand how to synthesize meaning from terabytes of unprocessed log files, and know the most effective techniques for giving the Wikimedia Foundation a means to measure the success (or failure) of key initiatives. You are self-sufficient, self-driven, focused, and organized. You are a capable coder and understand how to write robust, maintainable open source statistics and analytics software. You are also able to requirements to developers and helping them to understand the importance of your work, but you are also handy enough with Linux shell usage and scripting to mine your own data.
You are resourceful, inquisitive, and collaborative. You understand large scale systems and how to turn terabytes of unprocessed log files into a sensible data architecture. You are on top of the latest trends in database technology, love to debate the pros and cons of running Hadoop vs MongoDB vs MySQL. You know the most effective techniques for giving the Wikimedia Foundation the tools to measure the success (or failure) of key initiatives. You are self-sufficient, self-driven, focused, and organized. You are a brilliant coder and dream of writing robust, maintainable open source statistics and analytics software. You can lead a new team of data analysts in building a practice of safe, sane and useful model-development.


Above all, you are excited by the opportunity to support Wikimedia's free knowledge projects with data.
Above all, you are excited by the opportunity to support Wikimedia's free knowledge projects with data.
Line 11: Line 11:
'''JOB TITLE'''
'''JOB TITLE'''


Data Analytics Engineer
Systems Engineer - Data Analytics


'''REPORTS TO'''
'''REPORTS TO'''
Line 19: Line 19:
'''JOB PURPOSE'''
'''JOB PURPOSE'''


The Data Analytics Engineer will be responsible for developing systems for gathering key metrics critical for the operation of the Wikimedia Foundation. You will work with the development staff to ensure proper (but unobtrusive) instrumentation of critical infrastructure.
The Systems Engineer - Data Analytics will be responsible for developing systems for gathering key metrics critical for the operation of the Wikimedia Foundation. You will work with the development staff to ensure proper (but unobtrusive) instrumentation of critical infrastructure.


'''JOB SUMMARY'''
'''JOB SUMMARY'''
Line 26: Line 26:
* Develop, test, and deploy new features, improvements and upgrades to Wikimedia's statistics and analytics infrastructure (e.g. [http://stats.wikimedia.org/ Wikistats]) in cooperation with other Wikimedia Foundation engineering staff
* Develop, test, and deploy new features, improvements and upgrades to Wikimedia's statistics and analytics infrastructure (e.g. [http://stats.wikimedia.org/ Wikistats]) in cooperation with other Wikimedia Foundation engineering staff
* Work with Wikimedia volunteers, Wikimedia Foundation research staff, and researcher community in articulating and finding answers to key strategic research questions
* Work with Wikimedia volunteers, Wikimedia Foundation research staff, and researcher community in articulating and finding answers to key strategic research questions
* Define code and architecture standards for data analytics tools, reviewing and approving code from data analysts
* Work with volunteers to augment our analytics capabilities
* Work with volunteers to augment our analytics capabilities
* Recommend new methods for collection and documentation of data, and establish procedures for procurement of data
* Recommend new methods for collection and documentation of data, and establish procedures for procurement of data
Line 35: Line 36:


'''REQUIRED QUALIFICATIONS'''
'''REQUIRED QUALIFICATIONS'''
* 3+ years experience as an analyst or researcher, preferably in a company that analyzes media/Internet usage data or in an academic setting.
* 3+ years experience as an analyst or researcher, preferably in a company that analyzes media/Internet usage data or in an academic setting
* 2+ years experience working in a Linux/Unix server environment
* 2+ years experience working in a Linux/Unix server environment
* 2+ years experience with scripting languages, such as PHP, Perl, Ruby, Python or shell scripting. Experience with low-level programming languages is a plus
* 2+ years experience with scripting languages, such as PHP, Perl, Ruby, Python or shell scripting. Experience with low-level programming languages is a plus.
* 2+ years experience with large database storage systems, with previous experience working with MySQL or similar database in a production environment
* Strong ability to analyze and synthesize quantitative and qualitative data from primary and secondary sources, and proven ability to create simple, meaningful reports
* Must be willing to work collaboratively and discuss methodology and conclusions with the technology team, project team, and the Wikimedia community
* Must be willing to work collaboratively and discuss methodology and conclusions with the technology team, project team, and the Wikimedia community.
* Must be willing to solicit ideas and discuss methodology and conclusions beyond direct contacts, notably the Wikimedia community
* Must be willing to solicit ideas and discuss methodology and conclusions beyond direct contacts, notably the Wikimedia community.
* General experience with database storage systems, with previous experience working with MySQL or similar database in a production environment
* You are able to learn quickly. Relevant hands-on experience and eagerness to learn and try new concepts is more important than having certificates
* You are able to learn quickly. Relevant hands-on experience and eagerness to learn and try new concepts is more important than having certificates
* The ideal candidate will be creative, highly motivated, and able to operate effectively in multiple cultural and technical contexts
* The ideal candidate will be creative, highly motivated, and able to operate effectively in multiple cultural and technical contexts.
* You are able to work independently where needed, and can work remotely as part of a globally distributed team
* You are able to work independently where needed, and can work remotely as part of a globally distributed team
* Must be comfortable using a wide variety of communications/collaboration tools including wikis, mailing lists and IRC
* Must be comfortable using a wide variety of communications/collaboration tools including wikis, mailing lists and IRC.
* You must be comfortable in a highly collaborative, consensus-oriented environment
* You must be comfortable in a highly collaborative, consensus-oriented environment
* You are a proficient English speaker
* You are a proficient English speaker
Line 52: Line 51:
'''ADDITIONAL QUALIFICATIONS'''
'''ADDITIONAL QUALIFICATIONS'''
* Previous experience with the PHP scripting language a plus
* Previous experience with the PHP scripting language a plus
* Experience with Hadoop, Cassandra, or other NoSQL system is a major plus
* Experience with Open Web Analytics or other analytics software is a major plus
* Experience with Open Web Analytics or other analytics software is a major plus
* Any other free/open software development experience highly welcome
* Any other free/open software development experience highly welcome
* Experience with high traffic web site architectures and operations is a plus
* Experience with high traffic web site architectures and operations is a plus
* Experience with wikis and participatory production environments is a plus
* Experience with wikis and participatory production environments is a plus.
* Understanding of the free culture movement is a plus
* Understanding of the free culture movement is a plus
* Active participation as a Wikimedia volunteer would be an asset, though not a prerequisite
* Active participation as a Wikimedia volunteer would be an asset, though not a prerequisite.


{{Job openings footer
{{Job openings footer
Line 67: Line 67:
}}
}}


[[Category:Job Descriptions|Data Analytics Engineer]]
[[Category:Job Descriptions|Systems Engineer - Data Analytics]]

Revision as of 17:00, 28 March 2011

YOU ARE ...

... a person who loves building scalable systems for handling large amounts of data. You enjoy working with a team that uses data to understand and improve complex social systems. Wikipedia and its sister projects seem like a very exciting potential playground to you, and you enjoy the reward of discovery as a result of hard work.

You are resourceful, inquisitive, and collaborative. You understand large-scale systems and how to turn terabytes of unprocessed log files into a sensible data architecture. You are on top of the latest trends in database technology and love to debate the pros and cons of running Hadoop vs. MongoDB vs. MySQL. You know the most effective techniques for giving the Wikimedia Foundation the tools to measure the success (or failure) of key initiatives. You are self-sufficient, self-driven, focused, and organized. You are a brilliant coder and dream of writing robust, maintainable open source statistics and analytics software. You can lead a new team of data analysts in building a practice of safe, sane, and useful model development.

Above all, you are excited by the opportunity to support Wikimedia's free knowledge projects with data.

JOB TITLE

Systems Engineer - Data Analytics

REPORTS TO

Chief Technical Officer

JOB PURPOSE

The Systems Engineer - Data Analytics will be responsible for developing systems for gathering key metrics critical for the operation of the Wikimedia Foundation. You will work with the development staff to ensure proper (but unobtrusive) instrumentation of critical infrastructure.

JOB SUMMARY

Duties include, but are not limited to, the following:

  • Develop, test, and deploy new features, improvements and upgrades to Wikimedia's statistics and analytics infrastructure (e.g. Wikistats) in cooperation with other Wikimedia Foundation engineering staff
  • Work with Wikimedia volunteers, Wikimedia Foundation research staff, and the researcher community to articulate and answer key strategic research questions
  • Define code and architecture standards for data analytics tools, reviewing and approving code from data analysts
  • Work with volunteers to augment our analytics capabilities
  • Recommend new methods for collection and documentation of data, and establish procedures for procurement of data
  • Coordinate and participate in the preparation and presentation of reports and analyses that capture progress, trends, and appropriate recommendations or conclusions
  • Assist end users and other developers in identifying and resolving issues with the software and configuration of Wikimedia's analytics infrastructure
  • Provide answers to ad hoc queries and one-time statistics generation requests
  • Configure, customize and develop other web-based and server-side software used to support analytics operations
  • Interface with the Open Web Analytics development community and other outside developers

REQUIRED QUALIFICATIONS

  • 3+ years of experience as an analyst or researcher, preferably in a company that analyzes media/Internet usage data or in an academic setting
  • 2+ years of experience working in a Linux/Unix server environment
  • 2+ years of experience with scripting languages, such as PHP, Perl, Ruby, Python or shell scripting. Experience with low-level programming languages is a plus.
  • 2+ years of experience with large database storage systems, with previous experience working with MySQL or a similar database in a production environment
  • Must be willing to work collaboratively and discuss methodology and conclusions with the technology team, project team, and the Wikimedia community
  • Must be willing to solicit ideas and discuss methodology and conclusions beyond direct contacts, notably the Wikimedia community
  • You are able to learn quickly. Relevant hands-on experience and eagerness to learn and try new concepts are more important than having certificates.
  • The ideal candidate will be creative, highly motivated, and able to operate effectively in multiple cultural and technical contexts
  • You are able to work independently where needed, and can work remotely as part of a globally distributed team
  • Must be comfortable using a wide variety of communications/collaboration tools including wikis, mailing lists and IRC
  • You must be comfortable in a highly collaborative, consensus-oriented environment
  • You are a proficient English speaker

ADDITIONAL QUALIFICATIONS

  • Previous experience with the PHP scripting language is a plus
  • Experience with Hadoop, Cassandra, or another NoSQL system is a major plus
  • Experience with Open Web Analytics or other analytics software is a major plus
  • Any other free/open software development experience is highly welcome
  • Experience with high-traffic web site architectures and operations is a plus
  • Experience with wikis and participatory production environments is a plus
  • Understanding of the free culture movement is a plus
  • Active participation as a Wikimedia volunteer would be an asset, though not a prerequisite
