Senior Systems Engineer - Log Management and Monitoring Platform

Job description

Number 1 in Europe!

zooplus AG is Europe’s leading online retailer of pet supplies. With over 6.8 million active customers in more than 30 European countries and revenue of 1.524 billion euros in 2019, zooplus is comfortably the market leader in the online segment. At zooplus, we believe e-commerce is the sales model of today and of the future, and it has become one of the fastest-developing areas of business. Modern logistics centers and the use of personalization mean that online sales can be handled with increasing efficiency.


As a Senior Systems Engineer in the Log Management and Monitoring Platform area, you will continuously improve the services that allow our 30 DevOps teams across 5 locations to analyze the status and performance of their applications and to be alerted about production-critical issues. Your responsibilities will be:


• Work closely with our Container Orchestration Platform to collect logs and metrics from all production systems.

• Improve our central Logging Platform based on the Elastic Stack (formerly ELK).

• Design and improve additional services for Monitoring & Alerting, APM and Tracing.

• Drive automation to help our teams improve their time-to-market.

• Provide solutions by making use of open-source tools or implementing your own. 

• Support our DevOps teams in using our products and services in the most efficient way. 

• Ensure 24/7 availability of our platform services and take part in incident response as part of our platform on-call duty.

Requirements

• At least 3 years of systems engineering experience.

• University degree in Computer Science or equivalent. 

• Systematic problem-solving abilities coupled with distinct communication and collaboration skills. 

• Extensive knowledge in several of the following areas:
  o Log processing and analysis using the Elastic Stack; configuring and optimizing Elasticsearch clusters in production.
  o Time series data collection, processing, storage and visualization (e.g. Prometheus, Telegraf, InfluxDB, ...).
  o Application Performance Monitoring and Tracing (e.g. Jaeger, Elastic APM, ...).
  o Automating infrastructure and software provisioning with tools such as Terraform and Ansible.

• Familiarity with base technologies such as Git and Docker.

• Coding skills, preferably in Python and Bash.

• Good understanding of network infrastructure and services such as DNS, DHCP, firewalls and load balancing.

• Ability to debug and optimize code and to automate routine tasks.

• Willingness to engage in and improve the whole lifecycle of services, from inception and design to deployment, operation and refinement.

• Open-minded, solution-oriented, aligned with business needs and keen on working in cross-functional teams.

• Excellent English skills.


What we are offering you:

Become a part of our success story and seize the opportunity to take on a real challenge in a dynamically growing company with huge scope for development and short decision-making processes. We offer you a versatile, international-facing role in our motivated team with colleagues from all over Europe. Our Spanish office is in a great location in Madrid, with great infrastructure links. Additionally, we offer many competitive benefits such as:

  • Competitive salary
  • 28 days’ vacation (plus Dec 24 and 31 when they fall on a working day)
  • Medical insurance
  • Flexible working hours
  • Free drinks and fresh fruit
  • Discount in the zooplus shop

Did we make you curious?

Then send us your application in English via our online application form.