As a Cloud Platform Engineer at CSIT, you will research, explore, and adopt modern cloud technologies to continuously evolve and modernize our on-premises platform services.
You will join the Runtime Platform team, which is responsible for delivering secure, reliable, and scalable platform services that abstract away infrastructure complexity. These services enable software engineers to deploy and operate cloud-native applications efficiently while maintaining strong security and compliance standards.
Our team operates multi-tenanted platform services with many enhancements customized for our developers' unique use cases to improve productivity. These include third-party technologies as well as bespoke workflow enhancements.
Responsibilities:
- Design and build self-service platform services, such as a Message Queue service and an Observability service, that enable developers to focus on coding, testing, and managing their cloud-native applications.
- Automate and simplify lifecycle operations, such as provisioning and scaling, using Infrastructure-as-Code and GitOps workflows.
- Implement observability and alerting systems for the platform services, using tools such as Prometheus, Grafana, or Elastic Observability, to meet service-level objectives (SLOs).
- Collaborate with security teams to integrate and enforce security controls and compliance requirements for the platform services.
- Work with application teams to improve platform usability, streamline onboarding, and reduce operational toil.
- Respond to incidents and perform post-incident reviews, driving continuous improvement and operational excellence.
- Contribute to the reliability engineering culture, fostering shared responsibility for system availability and performance.
- Collaborate with cross-functional teams to define platform service requirements.
Requirements (Minimum Qualifications):
- Background in Computer Engineering, Computer Science, or a related field.
- Strong programming or scripting experience (e.g., HCL, YAML, JavaScript, Python, or Bash).
- Good understanding of Linux systems, containers, networking fundamentals, and distributed system operations.
- At least 1-3 years of hands-on experience operating cloud services in production.
- Familiarity with CI/CD pipelines, infrastructure-as-code, and configuration management (e.g., Terraform, Ansible, Helm).
- Experience implementing observability and monitoring for large-scale platforms.
Good to Have:
- Knowledge of Kubernetes security concepts such as RBAC, admission controllers, and policy enforcement.
- Experience with GitOps workflows and deployment tools (e.g., ArgoCD, GitLab Runner).
- Exposure to reliability engineering practices, including SLOs, error budgets, and capacity planning.
- Experience with observability stacks and tools such as Elastic Observability, Kibana, Prometheus, Grafana, and OpenTelemetry.
- Knowledge of networking protocols (HTTP, TCP, DNS) and troubleshooting tools.
- Passion for open-source technologies.
Why join us?
At CSIT, you will:
- Build and operate a cloud platform that supports Singapore's national security missions.
- Work with talented engineers who take pride in operational excellence, collaboration, and innovation.
- Be empowered to experiment with, improve, and scale modern technologies securely.
- Have opportunities to deepen your expertise in SRE practices and secure platform engineering at scale.