Senior Site Reliability Engineer

2K · Austin, Texas, United States

Onsite · Full-time · Senior level

About this role

Who We Are

Founded in 2005, 2K Games is a global video game company, publishing titles developed by some of the most influential game development studios in the world. The studios responsible for developing 2K’s portfolio of world-class games across multiple platforms include Visual Concepts, Firaxis, Hangar 13, CatDaddy, Cloud Chamber, and HB Studios. Our portfolio of titles is expanding as part of our global strategic plan of building and acquiring exciting studios whose content continues to inspire all of us! 2K publishes titles in today’s most popular gaming genres, including sports, shooters, action, role-playing, strategy, casual, and family entertainment.

Our team of engineers, marketers, artists, writers, data scientists, producers, thinkers, and doers are the professional publishing stewards of our growing library of critically acclaimed franchises such as NBA 2K, Battleborn, BioShock, Borderlands, The Darkness, Mafia, Sid Meier’s Civilization, WWE 2K, and XCOM.

At 2K, we pride ourselves on creating an inclusive work environment, encouraging our teams to Come as You Are and do their best work! We are dedicated to diversity and inclusion, and want our community of candidates to reflect this commitment. We encourage all qualified applicants to explore our global positions.

2K is headquartered in Novato, California, and is a wholly owned label of Take-Two Interactive Software, Inc. (NASDAQ: TTWO).

What We Need

We are looking for a Senior Site Reliability Engineer for the Platform Insights team, reporting to the Senior Manager. This role will lead the Observability team that supports our game platform and studios across production and development environments. 

What You Will Do

  • Architect, develop, and evolve our enterprise-wide observability platform to provide deep visibility into infrastructure and application performance
  • Design and implement monitoring solutions leveraging modern metrics and visualization technologies, with support for additional platform integrations
  • Collaborate with application and infrastructure teams to define and implement observability standards and best practices across the software development lifecycle (SDLC)
  • Implement and maintain automation for monitoring configurations using infrastructure as code (IaC) tools
  • Integrate observability standards to unify metrics, logs, and traces across collection pipelines
  • Drive cost optimization initiatives around monitoring and logging, balancing data retention, performance, and value
  • Partner with developers and operations teams to enable self-service observability capabilities
  • Create automation and alerting processes to proactively identify and resolve performance issues before they impact business operations
  • Participate in architectural reviews to ensure observability is embedded into new services and platforms from the outset
  • Contribute to documentation and knowledge sharing around observability tools, processes, and patterns
  • Deliver reports and visualizations tailored for both technical and business stakeholders
  • Evaluate emerging technologies to evolve 2K’s observability strategy
  • Drive automation and process changes that improve system performance, resiliency, and insight quality
  • Select, configure, and integrate industry-leading monitoring and telemetry tools (e.g., Prometheus, Grafana, ELK, Dynatrace, Datadog)

Who We Think Will Be a Great Fit

  • 5+ years of professional experience in information technology, including 3+ years specializing in observability, monitoring, or site reliability engineering.
  • Deep knowledge of monitoring toolsets such as Prometheus, Grafana, ELK, Splunk, Dynatrace, Datadog, or equivalent.
  • Proficiency in Python for automation and tool development.
  • Hands-on experience with Kubernetes, Docker, and cloud platforms (AWS, GCP, or Azure).
  • Strong understanding of networking, infrastructure, and performance optimization.
  • Experience with IaC tools such as Terraform.
  • Familiarity with configuration management tools (Ansible, Chef, Puppet) and CI/CD integration.
  • Proven track record designing and delivering dashboards, alerts, and performance reports for multiple audiences.
  • Excellent communication skills, with the ability to translate technical insights into actionable recommendations.

Bonus Points:

  • Experience building an Observability practice from the ground up
  • Experience with developing software for highly scalable/distributed systems
  • Experience using IaC for highly elastic workloads
  • Experience in gaming or similar industries, combining large-scale internet-facing systems with software development and entertainment services culture
  • Familiarity with common source code repositories and infrastructure as code methodologies

As an equal opportunity employer, we are committed to ensuring that qualified individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform their essential job functions, and to receive other benefits and privileges of employment. Please contact us if you need a reasonable accommodation.

Please note that 2K Games and its studios never use instant messaging apps or personal email accounts to contact prospective employees or conduct interviews; when emailing, we only use 2K.com accounts.
