Machine Learning Engineer II, Text-To-Speech

SoundHound AI Inc

Location: Paris

Posted: 22 days ago

Experience level: N/A

Contract type: Full-time

Job category: Data / Big data

Job Overview

The Machine Learning Engineer on our Text-To-Speech team plays a crucial role in building and refining the models that define our unique voice experiences. The position is actively involved in the entire development lifecycle, from processing data to training and deploying our core TTS systems. Working alongside senior researchers and engineers, the role helps create high-quality, natural-sounding voices to be integrated into a wide range of products. The position's contributions directly impact our ability to deliver an engaging and seamless conversational AI experience to users worldwide.

What You'll Do

Implement, train, and evaluate state-of-the-art TTS models to generate high-quality, expressive speech targeted for our key products.
Collaborate with language specialists and data labelers to organize the collection and maintenance of essential speech data.
Contribute to the development of our core speech synthesis inference engine.
Optimize models for production runtime.
Work with the systems and infrastructure teams to assist in the integration and deployment of TTS models into our production environment.
Analyze model performance and work with product stakeholders to identify areas of improvement. Contribute to the iterative enhancement of our TTS technology.
Stay current with the latest research and advancements in the TTS field and apply new techniques to our systems.

What You'll Bring

3+ years of professional experience in machine learning, with a strong focus on or interest in speech-related topics like TTS or ASR.
Excellent programming skills in Python and strong experience with PyTorch. Proficiency in C++ is a big plus.
Strong knowledge of and experience implementing key machine learning concepts such as transformers, speech tokenizers, diffusion, flow matching, LoRA, and GANs.
Familiarity with cloud technologies such as Docker and Kubernetes.
Experience with TorchScript or ONNX is a plus.
A track record of working with an entire machine learning pipeline, from data preprocessing to model training and evaluation, in particular for TTS and ASR models.
A collaborative spirit and the ability to work effectively with cross-functional teams.
A drive to tackle complex technical challenges and an eagerness to learn and grow in the field of speech synthesis.

[Please note that if your application is advanced, the initial step will be an invitation to partake in a pre-assessment.]

This position is available for remote work throughout France and Germany. Employees within a 100-kilometer radius of the Paris office are expected to work from the office on three pre-scheduled, company-wide "core days" per month to encourage in-person cross-team collaboration.

Compensation includes salary, equity, comprehensive healthcare, paid time off, and other benefits. Our recruiting team will provide a specific salary range based on location and years of experience.
