Site Reliability Engineer: Sponsorship Available

ZNOX TECHNOLOGIES LTD

Job Overview

We are seeking a skilled Site Reliability Engineer to join our dynamic team. The ideal candidate will be responsible for ensuring the reliability, availability, and performance of our systems and services. This role requires a proactive approach to system monitoring, incident response, and performance optimisation. You will work closely with development teams to implement best practices in system reliability and contribute to the overall improvement of our infrastructure.

Responsibilities

– Monitor system performance and reliability, identifying areas for improvement.

– Collaborate with development teams to gather requirements and ensure systems are designed for reliability.

– Implement and maintain automated testing frameworks to improve system testing.

– Manage NGINX configurations, and version them with tools such as SVN, to ensure optimal system performance.

– Participate in Scrum ceremonies, providing insights on system reliability and performance metrics.

– Develop Python scripts to automate routine tasks and system monitoring.

– Conduct regular reviews of system architecture and propose enhancements based on best practices.

– Assist in troubleshooting incidents, performing root cause analysis, and implementing corrective actions.

– Maintain documentation related to system configurations, processes, and procedures.

Qualifications

– Proven experience as a Site Reliability Engineer, or in a similar role, within a technology-driven environment.

– Strong understanding of system testing methodologies and practices.

– Familiarity with Active Directory management and configuration is advantageous.

– Experience with monitoring and alerting tools such as crimson is beneficial.

– Knowledge of requirements gathering techniques to align technical solutions with business needs.

– Proficiency in Python programming for automation purposes is essential.

– Ability to work collaboratively within a team environment while also being self-motivated.

– Excellent problem-solving skills and keen attention to detail, particularly in high-pressure situations.

If you are passionate about improving system reliability and enjoy working in a collaborative environment, we encourage you to apply for this exciting opportunity as a Site Reliability Engineer!

What will you be doing?

  • Enhance infrastructure resilience, security, and cost efficiency through tooling and guidance.
  • Assist the team in resolving availability, performance, and scalability issues.
  • Influence Service Level Objectives (SLOs), Non-Functional Requirements (NFRs), and infrastructure needs.
  • Ensure adherence to technology standards and highlight deviations to the TDA.
  • Ensure that SLOs within your area are met.
  • Contribute to the creation and promotion of the SRE service catalogue.
  • Uphold and enforce robust security measures.
  • Mentor and support junior team members.
  • Define and connect SLIs to their corresponding SLOs.
  • Develop monitoring tools and dashboards to track SLIs.

What skills do you need to succeed?

  • Cloud Computing: Strong proficiency with Amazon Web Services (AWS)
  • Containerisation: Experience with Docker and Amazon Elastic Container Registry (ECR)
  • Scripting: Expertise in at least one scripting or programming language (Bash, Python, Go, or Rust)
  • Infrastructure as Code: Familiarity with Ansible and CloudFormation
  • CI/CD: Practical experience with Jenkins or Concourse CI

Desired Skills:

  • Operating Systems: Knowledge of Windows and/or Linux administration
  • Programming: Background in software development (Java is a plus)

Benefits:

  • Additional leave
  • Bereavement leave
  • Canteen
  • Casual dress
  • Childcare
  • Company car
  • Company events
  • Company pension
  • Cycle to work scheme
  • Discounted or free food
  • Employee discount
  • Employee mentoring programme
  • Employee stock ownership plan
  • Employee stock purchase plan
  • Enhanced maternity leave
  • Enhanced paternity leave
  • Financial planning services
  • Flexitime
  • Free fitness classes
  • Free flu jabs
  • Free or subsidised travel
  • Free parking
  • Gym membership
  • Health & wellbeing programme
  • Housing allowance
  • Language training provided
  • Life insurance
  • Matching gift scheme
  • On-site gym
  • On-site parking
  • Paid volunteer time
  • Private dental insurance
  • Private medical insurance
  • Profit sharing
  • Referral programme
  • Relocation assistance
  • Sabbatical
  • Shuttle service provided
  • Sick pay
  • Store discount
  • Transport links
  • UK visa sponsorship
  • Unlimited paid holidays
  • Work from home

Experience:

  • Jenkins: 1 year (preferred)
  • System testing: 3 years (required)
  • Python: 1 year (required)
  • AWS: 3 years (required)
  • Continuous integration: 3 years (required)

Reference ID: SRE-ZN24J001

Closing Date: 15 October 2024

To apply for this job, please visit www.glassdoor.com.