Duties and responsibilities:
- Harden platforms before they go live by reviewing their design and implementation, tuning configuration, developing auxiliary tools, and setting up the necessary monitoring of critical health indicators
- Maintain platforms after go-live by measuring and monitoring their availability, performance and overall system health
- Recover platforms during production incidents to meet the targeted SLOs; perform detailed root cause analysis to prevent regressions. Work model: 8/5 plus on-call duty
- Proactively seek improvements to non-functional requirements; cooperate with development teams to improve operational aspects of the platforms under your responsibility
- Validate readiness and maturity of new rollouts through development, execution and verification of automated smoke test suites
- Provide technical expertise on IDEMIA products and support processes to internal and external customers
The Site Reliability Engineer, as part of IDEMIA’s SRE team, is responsible for providing automated operations and preventive monitoring of SLA-critical production platforms.
SRE teams apply their technical background and engineering skill set to improve the reliability, availability and efficiency of the services they operate. Effectively, it’s "what happens when a software engineer is tasked with what used to be called operations", as Ben Treynor stated when setting up SRE teams for Google’s search engine.
Who we're looking for?
- Healthcare package
- Leisure package
- Hot beverages
- Cold beverages