Site Reliability Engineering Expert

Robert Bosch Sp. z o.o.

  • offer expired over a month ago
  • contract of employment
  • full-time
  • senior specialist (Senior), expert
  • hybrid work
  • remote recruitment
  • we welcome workers from Ukraine

The employer is open to employing citizens of Ukraine

Robert Bosch Sp. z o.o.

Składowa 35

Śródmieście

Łódź

Technologies we use

Expected

  • Jenkins

  • GitLab

  • Azure DevOps

  • Puppet

  • Bash

  • PowerShell

  • Grafana

  • SRE

  • Java

  • Python

  • Node.js

Optional

  • ServiceNow

About the project

Bosch is seeking a highly motivated individual to step into the senior expert role of Site Reliability Engineer (SRE) in Łódź or Warsaw.

This role will influence IT teams globally to apply a software engineering mindset to the technical operations of IT products and solutions.

This role consists of:

• You will support teams in achieving higher levels of service reliability, scalability, and performance for Bosch IT products, services, and solutions.

• You will also design and help codify high levels of technical automation that span CI/CD pipelines, provisioning, configuration, monitoring/alerting, and system ops (extending to emerging AIOps).

• You will design and help to realize monitoring concepts based on Service Level Objectives (SLOs) attached to customer-oriented SLAs (e.g., availability SLA).

The SRE Expert will work in the context of IT projects and activities, complete tasks with a practical SRE mindset, and help influence teams to learn and adopt such practices.

Your responsibilities

  • Embed SRE practices in the operational teams that will run the next-generation Bosch IT service automation platform

  • Design and realize an integrated ops model for real-time service monitoring, alerting (on service levels), SRE-oriented incident handling, and on-call procedures

  • Help to establish runbooks and playbooks, ideally enhanced with technical automation

  • Coach team members, work hands-on, and lead by example to embed SRE in the team's daily routine

  • Contribute to the design and realization of automated integrations linked to the new Bosch IT service automation platform (based on ServiceNow)

  • Influence ops models for IT service automation according to SRE principles

  • Engage supporting Bosch IT teams to participate in an SRE-oriented integrated ops model for the global Bosch IT service automation portfolio. This engagement spans backend services involving teams who operate Linux, Docker, Kubernetes (or other orchestration technologies), networking services such as load balancers, database backend systems, a variety of hybrid cloud services, etc.

  • Design and help establish release strategies across the service automation portfolio that roll out new service features via progressive deployment techniques (feature flags, canary deployments, automated rollbacks on errors, etc.)

  • Split your time comfortably between coaching teams on SRE practices and staying proficient in hands-on technical tasks

  • Act as a thought leader in all things DevOps and SRE

  • Hold a positive attitude and a solid commitment to delivering quality output while independently leading yourself and others

  • Take ownership of projects and gain professional satisfaction in achieving stable operational environments with high degrees of technical automation and remaining within the SRE-calculated error budget

Our requirements

  • 7+ years of experience in Site Reliability Engineering, DevOps, or Software Engineering roles

  • Experience in working with SRE practices

  • Advanced knowledge of concepts such as Critical User Journeys, error budgets, and blameless postmortems

  • Hands-on technical experience with many open-source and cloud technologies in the DevOps space

  • Coding skills in one of the following: Python, Node.js, Java (as well as skills with Bash, Ansible coding, and/or PowerShell scripting)

  • Solid understanding of enterprise IT integration architectures (e.g., ETLs, real-time interface endpoints, messaging architectures, technical approaches to scalable message streaming)

  • Experience with various microservice implementation styles that can realize a robust, scalable production configuration

  • Experience with backend integration involving relational databases, NoSQL databases, cache subsystems (e.g., Redis), cloud databases (e.g., Azure CosmosDB), and cloud storage offerings (e.g., Azure storage accounts, S3)

  • Experience designing end-to-end monitoring solutions with log aggregation, metric collection (and visualization), and best-practice know-how for alerting and automated incident generation and handling

  • Experience with time-series databases for production metrics data collection and management (e.g., InfluxDB, Prometheus)

  • Ability to review IT solution designs for scale and performance concerns and, as needed, to apply this when troubleshooting incidents under pressure

  • Good understanding of what Service Management means in a large enterprise

  • Experience working with large, complex, and diverse IT systems under constant change and with a global ops practice

  • Technical experience with at least one leading cloud platform: Microsoft Azure, AWS, or GCP (cloud certifications are significant pluses)

  • Technical exposure to the ServiceNow platform is a significant plus

  • The personal attributes of a technical leader, coach, and professional consultant (independent, convincing, committed)

  • Ability to build close relationships with local/regional DevOps teams and global IT teams

  • Open to business trips (e.g., Germany), including international travel (China, India, Brazil, United States)

  • Proficiency in English

Benefits

  • sharing the costs of sports activities

  • private medical care

  • sharing the costs of foreign language classes

  • sharing the costs of professional training & courses

  • life insurance

  • remote work opportunities

  • flexible working time

  • fruits

  • corporate products and services at discounted prices

  • integration events

  • preferential loans

  • no dress code

  • coffee / tea

  • leisure zone

  • sharing the costs of tickets to the movies, theater

  • Christmas gifts

  • employee referral program

  • charity initiatives

  • family picnics

  • massage services at the office

  • lawyer consultation

  • summer and winter activities for children

  • offices in Warsaw, Łódź and Gdańsk

Recruitment stages

1. Phone interview with a recruiter

2. Meeting with a direct manager

3. Meeting with a higher-level manager

Technology stack

  • Experience with Continuous Integration and Continuous Delivery toolchains: e.g., Jenkins, ArgoCD, GitLab CI, GitHub Actions, Azure DevOps, AWS CodePipeline, or similar

  • Experience with infrastructure-as-code (e.g., Terraform, Ansible) and configuration management (e.g., GitOps, Puppet)

  • Solid experience introducing automation to reduce manual ops tasks (or toil) using shell scripting techniques (e.g., Bash, PowerShell), Ansible playbooks and beyond

  • Monitoring design/realization experience with an APM (e.g., AppDynamics) and/or open-source monitoring offerings such as Prometheus, Grafana, and Kibana

Robert Bosch Sp. z o.o.

At Bosch, we shape the future by inventing high-quality technologies and services that spark enthusiasm and enrich people’s lives. Our promise to our associates is rock-solid: we grow together, we enjoy our work, and we inspire each other.
