Neural Network Specification Learning
Inria
Published: 2 days ago
Experience level: N/A
Contract type: Full-time
Job category: Information Systems / Networks
Context and assets of the position
The research engineer will work in the context of an international collaboration with the University of Toronto, Canada. This collaboration aims to introduce novel ways of specifying, verifying, and explaining machine learning models. The overarching research hypothesis behind the project is that reasoning about large, complex, state-of-the-art models can be achieved through proper abstractions built on top of their internal representations.
Assignment
The engineer will work on designing and developing a new methodology for learning mathematically rigorous, logic-based, and expressive specifications over the latent features of a given machine learning model. Building on preliminary results [Gopinath et al. 2019, Geng et al. 2023], they will explore a broader class of neural specifications.
[Gopinath et al. 2019] Divya Gopinath et al. - "Property Inference for Deep Neural Networks" (ASE 2019)
[Geng et al. 2023] Chuqin Geng et al. - "Towards Reliable Neural Specifications" (ICML 2023)
Main activities
This investigation will focus on innovation and generalization across three key aspects:
1) Neural Predicate Design: Previous studies primarily considered neuron-level abstractions using basic binary predicates (active or inactive), essentially unary relations capturing only single-neuron states. Such unary predicates have limited expressiveness, as they cannot adequately represent dependencies or hierarchical structures involving multiple neurons. To overcome this limitation, we will first investigate predicates with higher arities (binary or ternary relations) that allow explicit representation of interactions and hierarchical relationships among neurons. For transformer-based architectures, neuron-level predicates are likely insufficient due to scalability challenges; to address this, we will investigate higher-level abstractions tailored to transformers by introducing neural predicates defined over attention heads.
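As a minimal, hypothetical illustration (not part of the project's codebase), the Python sketch below shows how unary and binary neuron-level predicates might be evaluated over a single hidden-layer activation vector; the function names (active, co_active, dominates) and the activation threshold tau are assumptions made purely for the example.

    import numpy as np

    def active(h, i, tau=0.0):
        """Unary predicate: neuron i is active (post-activation value above tau)."""
        return h[i] > tau

    def co_active(h, i, j, tau=0.0):
        """Binary predicate: neurons i and j are active together."""
        return active(h, i, tau) and active(h, j, tau)

    def dominates(h, i, j):
        """Binary predicate: neuron i's activation exceeds neuron j's."""
        return h[i] > h[j]

    # Evaluate the predicates on one toy activation vector.
    h = np.array([0.7, 0.0, 1.3, 0.2])
    print(active(h, 0))        # True
    print(co_active(h, 0, 1))  # False
    print(dominates(h, 2, 3))  # True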
2) Specification Structure: The structure of neural network specifications significantly influences their generalization capability, effectiveness, and scalability. Current specification structures, such as those employed in VNN-COMP (https://sites.google.com/view/vnn2025), consist of a reference input paired with a small bound quantifying allowable perturbations. This design is inherently limited, as it specifies only a narrow region around a single input and fails to generalize effectively even to closely related training inputs. In [Geng et al. 2023], a novel specification structure was introduced, comprising pairs of neural predicate sets: one set indicating neurons that must remain active and another indicating neurons that must remain inactive. This approach enabled, for the first time, verification of neural networks across the entire input space of the MNIST handwritten digits. Building on this result, we will generalize the specification structure further by redesigning it as a set of Horn clauses formulated over neural predicates. Combining Horn clauses with richer neural predicates will yield an expressive and formally grounded language capable of representing a significantly broader and more flexible range of neural network specifications than existing methods.
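A minimal sketch of such a Horn-clause specification language follows, assuming that predicates are encoded as string atoms such as "active(3)" or "inactive(5)" and that the predicted class is itself an atom; the HornClause and satisfies names are hypothetical, and the pair-of-sets structure of [Geng et al. 2023] appears only as one special case of the encoding.

    from dataclasses import dataclass
    from typing import FrozenSet, Optional

    @dataclass(frozen=True)
    class HornClause:
        body: FrozenSet[str]     # conjunction of predicate atoms (antecedent)
        head: Optional[str]      # single consequent atom, or None for a constraint

    def satisfies(atoms, clauses):
        """Check whether the atoms true on one inference trace satisfy every clause."""
        for c in clauses:
            if c.body <= atoms and (c.head is None or c.head not in atoms):
                return False
        return True

    # Special case: the [Geng et al. 2023] structure, i.e. for a given predicted
    # class some neurons must stay active and others inactive, written as Horn
    # clauses whose body is the class-prediction atom.
    spec = [HornClause(frozenset({"pred_class(7)"}), head)
            for head in ("active(3)", "active(17)", "inactive(5)")]

    trace = {"pred_class(7)", "active(3)", "active(17)", "inactive(5)"}
    print(satisfies(trace, spec))                           # True
    print(satisfies({"pred_class(7)", "active(3)"}, spec))  # False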
3) Specification Learning Algorithms: The design of learning algorithms is strongly influenced by both the neural predicates and the specification structure. We will initially explore a general statistical approach that remains largely agnostic to the specific choice of neural predicates or specification structure. The core idea is to collect inference traces over a dataset (either the original training set or an independent validation set), compute statistical distributions for each neural predicate, and then compose candidate specifications from these statistics. The size of the space of potential neural predicates depends on factors including neuron count, abstract domain size, predicate relation types, and their arities. To keep the neural predicate space practically manageable (approximately linear in the neuron count), we will take the network architecture into account and introduce pragmatic assumptions, such as locality constraints. Specifically, higher-arity predicates such as binary or ternary relations will be restricted to neurons within the same convolution window, the same layer, or neighboring layers. The subsequent step, composition of specifications, is inherently challenging, as it fundamentally involves rule synthesis. Prior work [Geng et al. 2023] partially addressed this challenge with a simple yet effective approach: selecting an appropriate threshold and conjunctively combining the neural predicates that exceed it. We will initially adopt this strategy to develop a functional prototype; later, we will systematically refine the learned specifications and reduce their redundancy, incorporating insights from formal methods, programming languages, and machine learning.
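The sketch below is a simplified, hypothetical rendering of the thresholding strategy described above (attributed to [Geng et al. 2023]), restricted to unary active/inactive predicates. The function name mine_unary_spec and the use of pre-collected activation matrices in place of a real model and dataset are assumptions made for illustration only.

    import numpy as np

    def mine_unary_spec(hidden_acts, pred_classes, num_classes, threshold=0.95):
        """hidden_acts: (N, D) post-activation values collected from inference traces.
        pred_classes: (N,) predicted class per trace.
        Returns, per class, the neurons active (resp. inactive) in at least
        a `threshold` fraction of that class's traces."""
        spec = {}
        for c in range(num_classes):
            acts_c = hidden_acts[pred_classes == c]
            if len(acts_c) == 0:
                continue
            freq_active = (acts_c > 0).mean(axis=0)   # per-neuron activation frequency
            spec[c] = {
                "must_active":   np.where(freq_active >= threshold)[0].tolist(),
                "must_inactive": np.where(freq_active <= 1 - threshold)[0].tolist(),
            }
        return spec

    # Toy usage with random data standing in for real inference traces.
    rng = np.random.default_rng(0)
    acts = rng.normal(size=(1000, 32))
    labels = rng.integers(0, 10, size=1000)
    print(mine_unary_spec(acts, labels, num_classes=10)[0])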
Benefits
- Subsidized meals
- Partial reimbursement of public transport costs
- Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
- Possibility of teleworking (after 6 months of employment) and flexible organization of working hours
- Professional equipment available (videoconferencing, loan of computer equipment, etc.)
- Social, cultural and sports events and activities
- Access to vocational training
- Social security coverage
JOB SUMMARY
Neural Network Specification Learning
Inria, Paris
Published: 2 days ago
Experience level: N/A
Full-time