Remote Site Reliability Engineer – Canada & Americas | Engineering Opportunities
Tyk Technologies
London, United Kingdom | Full-time | Posted 2 days ago in I.T. & Communications
Job ID 2766684
Job Description
Join Tyk, a pioneering force in API management that is transforming the way organizations connect their systems and services. From retail to finance, telecoms to healthcare, Tyk is the backbone of innovative digital solutions that power our connected world. If you’ve ever banked online, used a mobile app, or driven a smart car, you’ve experienced the power of APIs, thanks to Tyk. Since our inception in 2015, we’ve expanded globally, with offices in London (UK), London (Ontario), Atlanta, and Singapore, and we proudly serve a diverse clientele that includes renowned brands such as Lotte, Bell, T-Mobile, RBS, Capital One, and Vinci. Our mission is straightforward yet ambitious: to connect every system in the world through our advanced API Management platform.
As a site reliability engineer, you will play a pivotal role in managing and enhancing our platform’s reliability and performance. We are in search of a naturally curious individual who constantly seeks to innovate and improve our systems. You will be the first line of incident management, guiding our clients through challenges and helping shape our incident response strategy. This position offers you a unique chance to become an essential contributor to Tyk as we steer towards new horizons.
At Tyk, we believe in total flexibility and radical responsibility, embracing a remote-first work culture that supports work-life balance and personal productivity. Join a distributed team of industry experts from around the globe and help us elevate not just Tyk’s Cloud platform, but the entire company as we continue to evolve.
Key Responsibilities:
- Manage and maintain the global Tyk Cloud, ensuring we meet our service level agreements and objectives (SLAs and SLOs), as measured by service level indicators (SLIs).
- Proactively identify reliability issues and collaborate with your squad to devise solutions.
- Introduce innovative metrics and develop insightful dashboards for platform analytics.
- Participate in on-call rotations and incident management to address client needs.
- Collaborate with your squad to enhance the platform’s reach across multi-region and multi-cloud environments.
- Document operational processes and knowledge for team reference.
- Conduct detailed post-incident analyses to foster continuous improvement.
- Automate repetitive tasks to streamline operations and support.
- Champion our ongoing improvement agenda by refining user stories and enhancing communication with teams and customers.
- Ensure the reliability of our new global Tyk Cloud platform and drive operational efficiency without compromising service quality.
- Support penetration testing initiatives by collaborating with vendors, supplying technical details, and setting up test environments.
Desired Experience:
- Strong teamwork and collaboration skills.
- Proven track record in launching and maintaining production-scale Kubernetes clusters.
- Proficient in designing and managing infrastructure on AWS and other cloud providers.
- Experience operating MongoDB or similar document databases and Redis or similar key-value stores.
- Skilled in administering Linux servers and maintaining distributed software systems.
- Familiar with monitoring solutions like Prometheus and Grafana, along with log collection and analysis systems.
Technical Skills:
- Advanced expertise in Kubernetes and containerization.
- Proficiency in AWS/EKS and Linux administration.
- Proficiency in Infrastructure as Code (IaC) using Terraform and Helm.
- Familiarity with programming in Go and/or Python.
- Knowledge of networking concepts and various protocols.
Nice to Have:
- Experience with GCP or Azure.
- Familiarity with bare metal infrastructure engineering and API management.
- Background in large-scale distributed storage management and familiarity with Rancher.
- Relevant certifications (CKA/CKAD/CKS).
Why Choose Tyk?
- Enjoy unlimited paid holidays that promote work-life balance.
- Benefit from flexible working hours tailored to your productivity peaks.
- Participate in our employee share scheme and benefit from generous maternity and paternity leave policies.
- Join exciting company retreats that foster team bonding and creativity.
Our values embody our culture at Tyk:
- Embrace the possibility of failure; often, the most unexpected ideas lead to success.
- Trust is fundamental: extend it from day one and assume good intentions in all your interactions.
- Strive to improve the things around you and welcome change as a constant force for growth.
Tyk is committed to equality and diversity within the workplace. We welcome applications from all candidates regardless of gender, age, disability, religion, belief, sexual orientation, marital status, or race, ensuring that everyone has a fair chance to contribute to our vision.