OVERVIEW OF THE DEPARTMENT/SECTION

  • Site Reliability Engineering is responsible for delivering continuous improvement, automation, and self-service offerings to operational teams across Bank International

MAIN PURPOSE OF THE ROLE

  • Responsible for the reliability and efficiency of infrastructure through the delivery of common, repeatable tools and processes that greatly reduce the amount of toil operations must perform
  • Member of the L3 Engineering team, providing subject matter expertise and acting as the ultimate escalation point

Primary:

  • Develop software that makes infrastructure services self-managing, and build self-service dashboards
  • Deliver continuous service improvement by developing Infrastructure as Code
  • Eliminate manual, repetitive, automatable, tactical tasks that are devoid of value
  • Improve system performance, make effective use of resources, distribute load and reduce latency
  • Identify SLOs (Service Level Objectives) to meet availability and latency objectives
  • Develop proactive monitoring solutions that alert on symptoms and not just on outages
  • Perform detailed root cause analysis (RCA) on incidents and outages to prevent recurrence
  • Partner with development teams to improve services via rigorous testing and release procedures
  • Identify technical debt and partner with application teams to build remediation plans
  • Develop standard operational procedures and produce effective documentation
  • Analyse workloads and devise suitable cloud migration strategies where appropriate
  • Ensure all project / investment workloads are delivered according to the defined plans and budget
  • Liaise with Infrastructure Control and IT Risk teams to satisfy internal and external audit requests
  • Deputise for the team lead when required and act up accordingly
  • Identify cost saving and optimisation opportunities across the group
  • Build strong working relationships across the organisation
  • Adhere to the core values of the bank

Secondary:

  • Perform daily health and compliance checks for all systems as required
  • Ensure all systems are backed up successfully and any issues are promptly resolved
  • Validate that monitoring alerts and batch job failures are detected promptly and resolved satisfactorily
  • Ensure sufficient capacity is available to accommodate growth
  • Respond to emails sent to the team distribution list / mailboxes in a timely manner
  • Handle incidents and requests with efficiency and a "customer first" mindset
  • Maintain infrastructure in a highly available, reliable, secure and performant manner
  • General Server / Database / Virtualisation Administration maintenance activities
  • Provide technical support to application support and development teams
  • Provide consultancy to application support and development teams
  • Take part in the on-call and weekend work rotation, triaging and addressing production issues as they arise

MUST

Essential:

We are looking for an SRE subject matter expert (SME). Strong Kubernetes (k8s) skills are a strict requirement.

  • Exceptional skills in Docker/Kubernetes deployment and configuration, and in scaling and managing containerized applications
  • Excellent skills in managing and optimising the performance of a complex Prometheus, InfluxDB and Grafana monitoring stack
  • Excellent skills in writing and maintaining Grafana dashboards using PromQL and InfluxQL/Flux
  • Experience in distributed technologies like Rook, Ceph, NooBaa, Trino, MariaDB Xpand, Dremio, Kibana and the KX platform
  • Experience in CI/CD/CT platforms and tools like Git, Ansible, Terraform, TeamCity, Serena Deployment Automation (SDA) and Jenkins
  • "Infrastructure as Code" principles and practices
  • "Continuous Integration (CI) and Continuous Delivery (CD)" principles and practices
  • Agile, Site Reliability Engineering (SRE) and DevOps Principles and practices
  • Scripting and programming languages such as PowerShell, Python, Bash and C#
  • Fluent in Backup and Recovery processes and procedures
  • Advanced knowledge of Clustering, High-Availability, Replication and Disaster Recovery techniques
  • Ability to tune Network, Storage, Server and Virtualisation layers for optimal performance and reliability
  • Excellent Performance Tuning skills, in-depth knowledge of system internals
  • Ability to interpret and implement CIS security hardening recommendations in a controlled manner
  • Acute awareness of Security and Auditing requirements in a regulated environment

NICE TO HAVE

Highly Desirable:

  • RHEL, Oracle Linux, Oracle Solaris and related technologies
  • Microsoft Windows Server and related technologies
  • Microsoft SQL Server, Oracle, Sybase ASE, MongoDB and Snowflake
  • Active Directory, LDAP and Kerberos
  • IBM Tivoli / Netcool
  • Nutanix HCI and VMware ESX
  • Networking Protocols (TCP/IP, DNS, DHCP, VLANs)
  • Cloud computing - IaaS, PaaS and SaaS offerings across Azure, AWS, GCP and Oracle
  • Knowledge of data security governance and regulations such as GDPR and SOX

Desirable:

  • Dell EMC PowerStore (SAN) and Isilon (NAS)
  • Rubrik, EMC Networker, Data Domain and IBM Tivoli Storage Manager
  • CyberArk
  • Splunk
  • Qualys
  • Cisco Tetration
  • ServiceNow
  • JIRA and Confluence

Luxoft, a DXC Technology Company, is a global digital strategy and software engineering firm with over 17,000 international employees across its 58 offices in 29 countries. It is headquartered in Zug, Switzerland.

In January 2019, Luxoft was acquired by U.S. company DXC Technology. Luxoft partnered with LG Electronics to create a next-generation Autonomous Mobility concept vehicle that integrates consumers' personalized digital lifestyles into the driving experience. Luxoft also enabled Switzerland's first blockchain-based e-vote platform with the City of Zug and Hochschule Luzern's Blockchain Lab.

Luxoft, a DXC Technology Company, is a world-renowned company. It has been present on the Polish market for over 13 years. We have offices in Krakow, Warsaw, Wroclaw, and Gdansk. We employ over 2,000 professional experts carrying out projects for over 100 clients in the financial, automotive, medical, tourism, and other industries. We work for many international clients, including clients from the USA, Great Britain, and Switzerland.

So far, Luxoft Poland has made a name for itself as a company that offers work on innovative projects. We offer varied experience in the field of IT, opportunities for rapid development, an extensive training program, and attractive benefits for employees.

At present, 62% of Luxoft Poland employees come from Poland, and 38% come from around 50 other countries, including Ukraine, Brazil, India, Turkey, Spain, Portugal, Italy, Romania, and the USA.

At Luxoft, a DXC Technology Company, almost 80 percent of employees are experts at the "Senior" level, with at least five years of experience. We care about our employees, so every day we try to provide them with the best possible conditions for work and development.

Technology is our passion! Our focus on top engineering talent means that you will be working with the best industry professionals from around the world. Because of that, Luxoft is a global family with an epic atmosphere – we love what we do!