Senior Machine Learning Engineer, Text-To-Speech
SoundHound AI Inc
Posted: 20 hours ago
Experience level: N/A
Contract type: Full-time
Job category: Data / Big data
The Senior Machine Learning Engineer is at the heart of our mission to build a state-of-the-art, in-house Text-To-Speech (TTS) service and deploy it across all of our products. The position leads the charge in conducting cutting-edge research and developing the core models that power our next generation of expressive, multilingual, and conversationally aware voices, while remaining strongly product-driven. The work involves owning the entire technical stack, from conceptualizing novel architectures to deploying robust, scalable systems that serve millions of users. Working with a cross-functional team of linguists, software engineers, and SREs, the position offers many opportunities to grow and make a mark in the speech synthesis landscape.
What You'll Do
- Conduct pioneering research on TTS, voice conversion, style transfer, and related speech technologies to drive innovation.
- Lead the design, implementation, and training of state-of-the-art TTS models, with a focus on creating natural, emotionally rich, and engaging voices.
- Play a key role in developing a highly efficient production runtime and infrastructure to serve TTS models, and optimizing models for it.
- Partner with systems and infrastructure teams to engineer production-grade deployments of models at scale.
- Work closely with language specialists and data engineers to architect and oversee the collection, processing, and maintenance of large-scale, high-quality speech datasets.
- Mentor junior engineers, foster a culture of technical excellence, and clearly communicate complex concepts to both technical and non-technical stakeholders.
- Maintain a holistic view of the product, driving the technical vision for both research components and production systems.

Requirements
- A proven track record of conducting cutting-edge research in TTS, demonstrated through publications in top-tier conferences, patents, or significant contributions to the field.
- 5+ years of experience in academia or industry focusing on TTS or related topics such as voice conversion, ASR, vocoders, source separation, or speaker diarization.
- Excellent proficiency in Python and C++, with a deep understanding of the latest tools, standards, and best practices in software and model development.
- Strong knowledge of and experience implementing key machine learning concepts such as transformers, speech tokenizers, diffusion, flow matching, LoRA, and GANs.
- Experience deploying machine learning models at scale, with a strong focus on performance, cost efficiency, and reliability.
- Experience with TorchScript or ONNX.
- Familiarity with cloud technologies and MLOps principles, including Kubernetes and Docker.
- A holistic mindset, an eagerness to learn, and a passion for owning the entire technical stack and working collaboratively.
 
[Please note that if your application is advanced, the initial step will be an invitation to partake in a pre-assessment.]
We recognize that not every candidate will meet every listed requirement. If you believe your skills and experiences position you to contribute meaningfully in this role, we encourage you to apply. You may offer strengths and perspectives we have not yet considered.
This position is available for remote work throughout France and Germany. Employees within a 100-kilometer radius of the Paris office are expected to work from the office on three pre-scheduled, company-wide "core days" per month to encourage in-person cross-team collaboration.
Compensation includes salary, equity, comprehensive healthcare, paid time off, and other benefits. Our recruiting team will provide a specific salary range based on location and years of experience.
#LI-PP1 #LI-REMOTE
Job summary
Senior Machine Learning Engineer, Text-To-Speech
SoundHound AI Inc
Paris
Posted 20 hours ago
Experience level: N/A
Full-time