Safaricom PLC

Service Availability DevOps Engineer at Safaricom Kenya

00100, Nairobi Kenya
April 23, 2024
Application deadline closed.

Job Description

Reporting to the Engineering Lead – Service Availability, the position holder will be responsible for monitoring and observability, and for improving the operational aspects of all systems in scope within DIT. The role drives automation and DevOps across the different domains, and fosters service monitoring through proactive initiatives such as AIOps and machine learning, among other available channels.

RESPONSIBILITIES

  • Proactively building and implementing monitoring services, including end-to-end monitoring, scripting and automation, modern tooling and maintenance software.
  • Using AI and machine learning to perform log analysis and create predictive models that help identify potential failures.
  • Developing and executing automation scripts and maintenance jobs. 
  • Developing automation around monitoring.
  • Onboarding DIT systems to the service monitoring tools (APMs such as ELK).
  • Clearly documenting any monitoring gaps noted and collaborating with the relevant teams to ensure timely closure.
  • Performing application error analysis and follow-up to ensure an optimal customer experience.
  • Deploying planned and operational changes on systems in scope.
  • Supporting all Digital squads to ensure new products are monitored.
  • Supporting Zero Touch Operations initiatives.
  • Supporting the development of collectors and agents.

QUALIFICATIONS

  • Bachelor’s degree in Computer Science, Information Technology, Electrical and Communication Engineering, Business Information Systems, or a relevant field in telecommunications.
  • Domain knowledge in at least two of the following areas: system administration (especially Linux), orchestration (Kubernetes), the Linux kernel, OpenTelemetry.
  • Good understanding of back-end programming languages such as Python and Rust.
  • Technical understanding of SRE concepts and DevOps practices with respect to providing stable services to customers, adhering to availability KPIs, Service Level Objectives and Service Level Indicators, and conforming to the target monthly error budget.
  • Well versed in one or more modern monitoring tools such as ELK, Prometheus, Dynatrace, AppDynamics, New Relic or Splunk.
  • Good understanding of microservice architecture and an appreciation of traditional/classic SOA.
  • Ability to manage a team, with leadership skills, ownership of issues, an analytical mindset and problem-solving ability.
  • Ability to implement a strict change management policy.
  • Conversant with agile ways of working.