SRE Engineer
We're seeking a Site Reliability Engineer (SRE) to manage and oversee our cloud services hosted on AWS. This new colleague will play a critical role in ensuring system reliability, adhering to SRE principles, and responding to emergencies as part of a 24/7 on-call rotation.

Responsibilities:
Monitor and manage the SAFEQ Cloud print service, ensuring high availability and reliability within the AWS environment.
Develop and implement tools and practices for automating routine tasks to improve system scalability and resilience (Terraform, Ansible, Bamboo, Git, CloudWatch, Kubernetes).
Set up alerts and monitoring metrics for proactive identification and mitigation of system issues (CloudWatch, Prometheus, Alertmanager).
Participate in capacity planning and performance tuning to enhance system performance.
Collaborate with software engineering teams to ensure seamless deployment, efficient troubleshooting, and effective crisis management (on-call shifts, “war room” sessions, consultation on L4-level bugs).
Conduct root cause analysis (post-mortems) following system incidents; define corrective actions and preventative measures, and implement them (Terraform and Ansible improvements).
In the backlog for the coming quarters: deployment pipelines, disaster recovery improvements, and deployment automation improvements.
The team is remote (CZE, UK, ARG), with European standards of work and quality ensured.
The SRE team provides the environment for a product developed by 15 R&D teams; SRE is the 16th team.
Possibility to work with other teams for a few sprints to pick up their know-how and spirit.

Requirements:
Senior engineer. You know AWS, you know SRE, you can do it.
Fluent English, good communication skills.
Experience in an SRE role.
Proficiency with AWS and its various services and resources.
Solid understanding of the software development life cycle and CI/CD pipelines.
Problem-solving skills, with the ability to think systematically.
Knowledge of networking, security, and database systems.
Availability for on-call duties in a 24/7 setup (duty rotates among team members).
Bonus points for: Python/Go/Bash (deployment scripts)