Reliability and consistency have been core propositions of the UrbanPiper platform. We are looking for individuals skilled in DevOps-related tasks who can ensure that these propositions hold firm as we scale.
Work closely with the engineering team to devise new workflows, or enhance existing ones, so that new code deployments don’t cause issues.
Regularly interact with the operations team to grasp the expectations related to stability and incident management.
Set up sophisticated monitoring and alerting mechanisms to notify the relevant engineering or operations teams.
Take care of incident management: provide the requisite status updates on any ongoing or past incident, and engage the engineering team to determine and explain its root cause.
Institute security best practices across both engineering and operations. This includes enforcing adherence to workflows that protect our infrastructure, the data collected by the platform, and the data accessed by team members.
Suggest and handle cost optimisation activities for the infrastructure, along with any required capacity planning, keeping in mind the growth in data volume.
Document best practices and standard operating procedures for the team.
We are looking for someone who has:
2+ years of experience managing a non-trivial infrastructure landscape built on top of one or more cloud architecture platforms.
Ability to use a wide variety of open source technologies and cloud services (experience with AWS is required).
Considerable experience maintaining and enhancing CI/CD workflows (experience with Jenkins is a plus).
A good understanding of managing a decentralised platform and queues.
Strong experience with SQL and MySQL (NoSQL experience is a plus).
Experience with automation/configuration management using Ansible, Puppet, Chef, or an equivalent.
Familiarity with automated infrastructure provisioning systems like Terraform.
Knowledge of best practices and IT operations in an always-up, always-available service.
Demonstrable experience of learning and deploying infrastructure applications for monitoring and alerting.
The ability to establish security-focused practices for the team as well as the application infrastructure.
Good communication skills.
Nice to haves:
Experience in a high growth technology company.
A good understanding of networking principles.