Site Reliability Engineer

Scapia.com

Office

Bengaluru, India

Full Time

About The Company

Scapia is a co-branded credit card built to make travel happen for people by converting their everyday expenses into travel experiences. We’re a bunch of passionate people who work together, brainstorm, debate with each other, and don’t stop until we’re proud of our work. Customer delight tops everything else! We’ve worked hard to create an environment of honesty and passion that sets everyone up for success.

Role Overview

We’re looking for a Site Reliability Engineer to join our engineering team and help build and maintain our infrastructure platform. You’ll design and operate scalable platforms, establish best practices for infrastructure management, and drive reliability and scalability initiatives that shape our platform and engineering practices. You’ll work closely with product and development teams to ensure our systems are reliable, performant, and scalable as we grow and ship features.

Key Responsibilities

Build and maintain platforms and tooling to enable development teams to deploy and operate services efficiently
Design and manage cloud infrastructure using Infrastructure as Code for scalability, reliability, and security
Establish best practices for infrastructure management, reliability, and scalability across the organization
Build and maintain CI/CD pipelines and developer tooling to enhance productivity and reduce deployment friction
Build and maintain observability solutions (monitoring, logging, tracing) to ensure system health and performance
Collaborate with development teams to embed reliability, performance, and scalability into services from the start

Required Qualifications

1-3 years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering
Strong knowledge of Linux system administration, networking, and OS fundamentals
Deep understanding of AWS services
Programming experience in Golang, Bash, or Python
Hands-on experience with Infrastructure as Code (Terraform or similar)
Experience with monitoring tools (Prometheus, Grafana) and observability platforms
Production experience with Docker and Kubernetes (EKS, GKE, or self-managed)
Strong troubleshooting and problem-solving skills for complex distributed systems
Experience with CI/CD tools (GitHub Actions, Jenkins, or similar)
Understanding of security best practices in cloud environments

Nice To Have

Experience with GCP or Azure
Preferred Certifications:
AWS Certified Solutions Architect
Certified Kubernetes Administrator (CKA)
Google Cloud Professional Cloud Architect