SRE (M/F)

Licorne Society

Location: Paris

Published: 2 days ago

Experience level: N/A

Contract type: Full-time

Job category: Other

Licorne Society has been engaged by a fast-growing startup to help it find its SRE.

About the Role

We are excited to open a new position for a Site Reliability Engineer to join and strengthen our Engineering team. You will work closely with our current SRE to ensure the reliability, performance, and scalability of our infrastructure, which supports critical financial services for our clients.

Our platform runs entirely on AWS and Kubernetes, managed with Infrastructure as Code using Terraform. Datadog is at the core of our observability stack, enabling us to monitor, detect, and respond to issues quickly to maintain high levels of reliability and performance.

You will help drive operational excellence, optimize infrastructure costs, and enhance the developer experience through improved CI/CD practices, automation, and observability. While infrastructure is the core focus of this role, you will also contribute to our security and compliance efforts (SOC 2, ISO 27001), helping ensure our platform remains trustworthy and secure.

This position is open to on-site or fully remote work for candidates based in France.

What You'll Do

Infrastructure Reliability & Scalability

Manage and evolve AWS infrastructure and Kubernetes clusters to ensure high availability, robust performance, and cost efficiency.
Support the deployment and operation of AI workloads and models, adapting infrastructure and automation to meet their requirements.
Leverage Terraform and DevOps best practices to automate and streamline infrastructure deployment and configuration.
Continuously improve infrastructure testing methods and proactively resolve performance bottlenecks or scalability issues.

Observability and Incident Management

Enhance Datadog-based monitoring to proactively detect and alert on issues, focusing on symptom-based alerting to avoid service disruptions.
Lead incident response efforts, reducing Mean Time To Detection (MTTD) and Mean Time To Resolution (MTTR).
Implement robust logging, tracing, and metrics to enable quick issue diagnosis and resolution.

Security and Compliance

Support ongoing compliance efforts with SOC 2 and ISO 27001, integrating security best practices into operations.
Manage and use tools such as AWS Security Hub, GuardDuty, and Datadog SIEM to identify risks, respond to incidents, and strengthen overall security.
Participate in security assessments and audits, recommending and implementing improvements.

Developer Experience & Empowerment

Refine CI/CD pipelines to enable safe, fast, and secure deployments.
Provide tooling, automation, and clear documentation to support developer productivity and satisfaction.
Maintain and optimize development, staging, and sandbox environments for smooth workflows.

What's in It for You

A collaborative, flat-structured environment where all voices are valued
Opportunities for career growth in a scaling company
Flexible remote work policy
A team of experienced engineers to learn and grow with
A culturally diverse and inclusive workplace

Must-Haves

4+ years of experience in SRE, DevOps, or Systems Engineering
Proven expertise with AWS, Kubernetes, and Terraform
Experience deploying and operating SaaS solutions
Strong knowledge of high-scalability architectures
Comfortable working in a Linux/Unix shell
Practical experience with containerized architecture
Familiarity with monitoring tools (e.g., New Relic, Grafana, or similar)
Fluent in English
Strong problem-solving and analytical mindset
