Software-Development Operations

Site Reliability Engineering Technical Lead for SAP Build Process Automation

We help the world run better

Our company culture is focused on helping our employees enable innovation by building breakthroughs together. How? We focus every day on building the foundation for tomorrow and creating a workplace that embraces differences, values flexibility, and is aligned to our purpose-driven and future-focused work. We offer a highly collaborative, caring team environment with a strong focus on learning and development, recognition for your individual contributions, and a variety of benefit options for you to choose from. Apply now!


What you’ll do

We are looking for a seasoned, motivated Site Reliability Engineering Technical Lead to join and lead our expanding team. This key role is integral in optimizing and fine-tuning our operational efficiency, guiding our SRE team to manage, plan and coordinate the software development process. The ideal candidate will work collaboratively to identify and prioritize improvements for our services, develop automated deployment tools, monitor system health, maintain secure and efficient IT architecture, and ensure swift troubleshooting of issues when they arise.

As a Site Reliability Engineering Lead, you will play a critical role in systems operations for our company. As lead of our SRE team, you will be responsible for reviewing service architecture and operations with service owners, identifying opportunities for improvement and automation, driving service deployment, developing and implementing continuous integration and delivery pipelines, automating operations and development processes, troubleshooting issues, and streamlining application deployment. In addition, you will maintain configuration management solutions and work with various teams to improve DevOps practices throughout the organization. You will also play a vital role in designing and prioritizing infrastructure strategies to ensure reliable, efficient, and secure IT systems. This includes developing comprehensive monitoring solutions that provide full visibility into the different platform components, using tools and services that integrate with the chosen cloud provider.

Following SRE principles, you will also take an active part in assigning the identified improvement and automation tasks within the SRE team, collaborate with experienced software engineers from around the globe, jointly investigate problems, and help improve our products. You will also be expected to share your broad knowledge and experience to educate junior colleagues.


We will count on your strong support with our UPCOMING PROJECTS:

  • Deploy and support the Digital Workspace product on K8S runtime
  • Extend and automate security requirements for key services
  • Adopt Multi-AZ requirements: organize, continuously execute, and automate chaos testing to ensure service resilience and availability
  • Develop, maintain, and extend monitoring, alerting, and remediation tooling for SAP Build Process Automation (SBPA) services
  • Deploy and support key services on GCP landscapes

Part of this role includes being available for service support outside office hours.


What you bring:

With at least 5 years’ relevant work experience, you should have a good knowledge of modern cloud architectures and of debugging and profiling tools. You will also have a passion for automation and experience with different tools. We are looking for a team player who can work efficiently in emergency situations and quickly analyse and solve problems in a worldwide team setup. Excellent communication skills are required in these scenarios to ensure that the information shared is precise and factual.

You should also have practical experience in at least one of the following areas and good knowledge of the rest:

  • Bachelor’s degree in Computer Science, Information Technology, or a related field
  • Significant previous experience in a DevOps, SRE, System Admin, or similar role
  • Strong experience with cloud services (AWS, GCP, Azure) and architecture
  • Solid experience with DevOps toolchain (Jenkins, Travis CI, Puppet, Chef, GitHub)
  • Solid experience with coding and scripting languages (Python, Go, Bash, Selenium, Groovy)
  • Solid experience with monitoring tools such as Grafana and Dynatrace
    • Experience with performance tuning, monitoring, and system-level debugging
    • Knowledge and experience in automating response to system-related incidents, warnings, and key performance indicators (KPIs) would be a plus
  • Knowledge of containerization (Cloud Foundry, Docker, Kubernetes, ECS, or OpenShift)
  • Understanding of Infrastructure as Code (IaC) using tools like Terraform, CloudFormation, or Ansible
  • Excellent problem-solving skills and attention to detail
  • Ability to support a 24/7 on-call rota within the team
  • Knowledge of Agile methodology and processes

We would also look for some key soft skills such as:

  • Communication
  • Positive “can do” attitude
  • Teamwork
  • Excellent work ethic – stay focused and complete tasks in a timely manner especially when under pressure
  • Willingness to learn
  • Leadership skills – ability to mentor and share knowledge with junior team members

Meet your team:

The ADAI (Application Development, Automation & Integration) Site Reliability Engineering team plays a fundamental role in managing and maintaining the organization's cloud computing strategy. The team is responsible for overseeing the daily operations and maintenance of cloud applications, ensuring their availability, performance, reliability, and security. This involves monitoring system health, handling software upgrades and deployments, identifying and troubleshooting issues, and ensuring optimal resource allocation. The team also works closely with other teams to troubleshoot complex system issues, implement necessary updates, and ensure compliance with industry best practices and regulations. Furthermore, the SRE team is crucial in disaster recovery planning and execution, as well as in creating guidelines and procedures for cloud operations.

The Cloud Ops team makes the SAP ADAI Services run better by providing 24x7 deep technical coverage for Incident Management, applying SRE principles. We share a Live Site First culture and care for the business continuity of our customers running mission-critical applications on top of the Cloud Platform.


We build breakthroughs together

SAP innovations help more than 400,000 customers worldwide work together more efficiently and use business insight more effectively. Originally known for leadership in enterprise resource planning (ERP) software, SAP has evolved to become a market leader in end-to-end business application software and related services for database, analytics, intelligent technologies, and experience management. As a cloud company with 200 million users and more than 100,000 employees worldwide, we are purpose-driven and future-focused, with a highly collaborative team ethic and commitment to personal development. Whether connecting global industries, people, or platforms, we help ensure every challenge gets the solution it deserves. At SAP, we build breakthroughs, together.

We win with inclusion

SAP’s culture of inclusion, focus on health and well-being, and flexible working models help ensure that everyone – regardless of background – feels included and can run at their best. At SAP, we believe we are made stronger by the unique capabilities and qualities that each person brings to our company, and we invest in our employees to inspire confidence and help everyone realize their full potential. We ultimately believe in unleashing all talent and creating a better and more equitable world.
SAP is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to the values of Equal Employment Opportunity and provide accessibility accommodations to applicants with physical and/or mental disabilities. If you are interested in applying for employment with SAP and are in need of accommodation or special assistance to navigate our website or to complete your application, please send an e-mail with your request to Recruiting Operations Team:
For SAP employees: Only permanent roles are eligible for the SAP Employee Referral Program, according to the eligibility rules set in the SAP Referral Policy. Specific conditions may apply for roles in Vocational Training.

EOE AA M/F/Vet/Disability:

Qualified applicants will receive consideration for employment without regard to their age, race, religion, national origin, ethnicity, gender (including pregnancy, childbirth, et al.), sexual orientation, gender identity or expression, protected veteran status, or disability.
Successful candidates might be required to undergo a background verification with an external vendor.

Requisition ID: 390032  | Work Area: Software-Development Operations  | Expected Travel: 0 - 10%  | Career Status: Professional  | Employment Type: Regular Full Time   | Additional Locations: #LI-Hybrid.

Posted Date:  May 15, 2024

Sofia, BG, 1407

Job Segment: Cloud, Process Engineer, ERP, Testing, SAP, Technology, Engineering