Senior Splunk Engineer (GIC) for Automation and Reliability Engineering Project
Project Summary
- Support the Automation and Reliability Engineering project and its operations.
- Responsibilities:
  - Observability Engineering and Governance
    - Architect and maintain enterprise SIEM solutions aligned with operational resilience mandates (e.g., MAS TRM, DORA, APRA CPS 230).
    - Lead deployment, configuration, and optimization of Splunk for full-stack visibility across infrastructure, applications, networks, and user experience.
    - Define and enforce telemetry data governance standards for metrics, logs, and traces, ensuring consistency, retention compliance, and security.
    - Integrate Splunk with incident management, ITSM, and AIOps systems to enable predictive alerting and anomaly detection.
    - Act as the SIEM/Splunk subject matter expert (SME) for architecture reviews, platform upgrades, and performance tuning.
  - Reliability Engineering and Automation
    - Implement and champion SRE frameworks and reliability practices for mission-critical systems.
    - Design and automate runbooks, alerts, and self-healing workflows using Python, Ansible, and Terraform.
    - Collaborate with Application, Infrastructure, and Cyber teams to embed reliability principles into the delivery lifecycle.
    - Conduct resilience, chaos, and capacity testing aligned with business continuity and disaster recovery standards.
    - Define and track error budgets, reliability scorecards, and service health indicators for production workloads.
  - Cloud and Platform Integration
    - Engineer SIEM coverage for cloud-native workloads in AWS and Azure, ensuring visibility across compute, storage, and network layers.
    - Integrate Splunk and cloud observability tools into CI/CD pipelines and landing zones to ensure continuous compliance.
    - Implement infrastructure-as-code (IaC) models using Terraform and Ansible for consistent, auditable provisioning.
    - Collaborate with Cloud, DevOps, and Security teams to ensure telemetry aligns with audit, compliance, and operational risk requirements.
  - Operational Excellence and Collaboration
    - Reduce incident recurrence, MTTR, and manual intervention through observability-led automation.
    - Partner with Service Delivery, Cyber, and Application teams to enable predictive incident prevention and root cause transparency.
    - Develop and maintain executive dashboards and reports showcasing availability, reliability KPIs, and operational risk indicators.
    - Provide technical leadership during major incidents, post-incident reviews, and audits, ensuring lessons learned are codified into automation and process improvements.
Skillset (Must have)
- Possess a degree in Computer Science, Engineering, or related discipline.
- Minimum 8 years of experience in Infrastructure, Cloud, or Site Reliability Engineering roles, including at least 5 years specializing in SIEM/Splunk engineering or observability in financial or other regulated environments.
- Proven hands-on expertise in the following technical areas:
  - SIEM platforms: Splunk (must), ELK/Elastic
  - Automation/IaC: Terraform, Ansible, Python, CI/CD tools
  - Cloud platforms and integrations: AWS (CloudWatch, X-Ray, CloudTrail), Azure (Monitor, Log Analytics, App Insights), Datadog, ServiceNow
- Deep understanding of SRE principles, service health modelling, error budgets, and auto-remediation design.
- Strong analytical and troubleshooting skills, with the ability to perform deep-dive investigations and develop long-term preventive solutions.
- Familiarity with financial sector operational resilience frameworks, regulatory compliance, and incident governance.
- Excellent written and verbal communication skills.
- Strong interpersonal skills for working with diverse stakeholders.
- Agile, fast learner who adapts readily to change.
Skillset (Good to have)
Preferred Certifications:
- Splunk Certified Power User / Splunk Certified Admin / Splunk Certified Architect
- Terraform / Ansible / Python Certified Expert
- AWS Certified DevOps Engineer / Azure DevOps Expert
- SRE Foundation / Practitioner (DevOps Institute)
- ITIL v4 Managing Professional
Any personal data you share with us during the application process will be processed strictly in compliance with applicable data protection laws and our .