Site Reliability Engineer
Scapia.com
Office
Bengaluru, India
Full Time
About The Company
Scapia is a co-branded credit card built to make travel happen for people by converting their everyday expenses into travel experiences. We’re a bunch of passionate people who work together, brainstorm, and debate with each other, and we don’t stop until we’re proud of our work. Customer delight tops everything else! We’ve worked hard to create an environment of honesty and passion that sets everyone up for success.
Role Overview
We’re looking for a Site Reliability Engineer to join our engineering team and help build and maintain our infrastructure platform. You’ll design and operate scalable platforms, establish best practices for infrastructure management, and drive the reliability initiatives that shape our engineering practices. You’ll work closely with product and development teams to keep our systems reliable, performant, and scalable as we grow and ship features.
Key Responsibilities
- Build and maintain platforms and tooling to enable development teams to deploy and operate services efficiently
- Design and manage cloud infrastructure using Infrastructure as Code for scalability, reliability, and security
- Establish best practices for infrastructure management, reliability, and scalability across the organization
- Build and maintain CI/CD pipelines and developer tooling to enhance productivity and reduce deployment friction
- Build and maintain observability solutions (monitoring, logging, tracing) to ensure system health and performance
- Collaborate with development teams to embed reliability, performance, and scalability into services from the start
Required Qualifications
- 1-3 years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering
- Strong knowledge of Linux system administration, networking, and OS fundamentals
- Deep understanding of AWS services
- Programming experience in Golang, Bash, or Python
- Hands-on experience with Infrastructure as Code (Terraform or similar)
- Experience with monitoring tools (Prometheus, Grafana) and observability platforms
- Production experience with Docker and Kubernetes (EKS, GKE, or self-managed)
- Strong troubleshooting and problem-solving skills for complex distributed systems
- Experience with CI/CD tools (GitHub Actions, Jenkins, or similar)
- Understanding of security best practices in cloud environments
Nice To Have
- Experience with GCP or Azure
Preferred Certifications:
- AWS Certified Solutions Architect
- Certified Kubernetes Administrator (CKA)
- Google Cloud Professional Cloud Architect
