NOSE: Enabling AI to Smell with a Unified Molecule‑Receptor‑Semantic Tri‑modal Representation

NOSE introduces a neural olfactory‑semantic embedding that unifies molecular structure, receptor sequences, and natural‑language odor descriptions into a continuous space, achieving state‑of‑the‑art results on eleven tasks and strong zero‑shot generalization for odor and receptor retrieval.

Data Party THU

Problem

Olfactory perception involves three distinct information streams: the 3‑D chemical structure of a molecule, the amino‑acid sequence of an olfactory‑receptor protein, and natural‑language descriptions of the perceived odor (e.g., “floral”, “minty”, “creamy”). Existing approaches model only one or two of these streams and treat odor prediction as a classification task, which discretizes a fundamentally continuous odor space and discards structural nuances that are important for representation learning, limiting generalization.

Method: NOSE framework

NOSE (Neural Olfactory‑Semantic Embedding) unifies molecular structure, receptor sequence, and odor descriptions into a single continuous embedding space. It employs three pretrained encoders: Uni‑Mol for 3‑D molecular conformations, ESM‑2 for receptor‑sequence features, and a LoRA‑fine‑tuned Qwen‑3 embedding model for odor‑description text. Semantic relations among 1,086 odor‑description words are mined with the DeepSeek large language model; pairs such as “lemon” ↔ “citrus” or “sweet” ↔ “honey” are treated as weak positive samples with intermediate weights, converting the discrete label set into a continuous semantic manifold.
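The weak-positive weighting can be sketched as a weighted InfoNCE-style contrastive objective. This is a minimal illustration, not the paper's implementation: the function name, the 0.5 intermediate weight, and the synthetic batch are all assumptions.

```python
import numpy as np

def weighted_contrastive_loss(mol_emb, txt_emb, pos_weight, temperature=0.07):
    """InfoNCE-style loss in which each (molecule, description) pair carries a
    weight: 1.0 for an exact label match, an intermediate value (0.5 here, an
    assumed number) for LLM-mined weak positives like "lemon" <-> "citrus"."""
    mol = mol_emb / np.linalg.norm(mol_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = mol @ txt.T / temperature              # (B, B) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    per_pair = -np.diag(log_probs)                  # cross-entropy of each matched pair
    return float((pos_weight * per_pair).mean())

# Illustrative batch of 4 pairs; pairs 1 and 3 are weak positives with weight 0.5.
rng = np.random.default_rng(0)
mol_emb = rng.normal(size=(4, 128))
txt_emb = rng.normal(size=(4, 128))
weights = np.array([1.0, 0.5, 1.0, 0.5])
print(weighted_contrastive_loss(mol_emb, txt_emb, weights))
```

Down-weighting a weak positive shrinks its pull in the loss rather than treating “citrus” as either identical to or unrelated to “lemon”, which is the mechanism that turns the discrete label set into a continuum.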

Orthogonal injection

Because triplet (molecule‑receptor‑description) data are scarce, NOSE uses molecule‑receptor and molecule‑description pairs separately and bridges them through the molecule as a hub. To prevent interference when injecting receptor and semantic features into the molecular embedding, a dual orthogonal‑injection strategy is applied:

Hard orthogonal step: Gram‑Schmidt projects the adapter outputs of the receptor and description branches onto the orthogonal complement of the molecular representation, guaranteeing linear independence.

Soft orthogonal loss: a regularizer drives the two feature subspaces to remain mutually uncorrelated during gradient updates.

This yields independent additive increments on the molecular embedding that preserve structural priors while achieving implicit tri‑modal alignment.
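The two steps can be sketched on plain vectors. This is a hedged illustration with hypothetical function names; in the actual framework the projection is applied to adapter outputs inside the network, not to standalone arrays.

```python
import numpy as np

def project_out(increment, mol):
    """Hard step: one Gram-Schmidt step that removes from `increment` its
    component along the molecular embedding `mol`, so the injected feature
    lies in the orthogonal complement of the structural representation."""
    mol_unit = mol / np.linalg.norm(mol)
    return increment - (increment @ mol_unit) * mol_unit

def soft_orthogonal_loss(recep_inc, sem_inc):
    """Soft step: squared cosine similarity between the receptor and semantic
    increments; driving it toward zero keeps the two subspaces decorrelated."""
    cos = (recep_inc @ sem_inc) / (np.linalg.norm(recep_inc) * np.linalg.norm(sem_inc))
    return float(cos ** 2)

rng = np.random.default_rng(0)
mol = rng.normal(size=64)
recep_inc = project_out(rng.normal(size=64), mol)   # receptor-branch increment
sem_inc = project_out(rng.normal(size=64), mol)     # semantic-branch increment
fused = mol + recep_inc + sem_inc                   # independent additive increments
print(abs(recep_inc @ mol))                         # close to zero after projection
```

Because each increment is orthogonal to the molecular embedding, adding it cannot overwrite the structural prior, only extend it; the soft loss then discourages the two increments from collapsing onto each other.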

Benchmark and results

Six public datasets were aggregated to construct a benchmark covering three perception levels: basic perception (detection threshold, intensity, pleasantness), semantic description (138‑class multi‑label classification and multidimensional regression), and mixture perception (binary‑mixture intensity and pleasantness). Across eleven downstream tasks, NOSE outperforms all baselines on every key metric, establishing state‑of‑the‑art performance.

Zero‑shot retrieval

A strict zero‑shot test set built from PubChem contains molecules that never appear in training. Retrieval is evaluated by percentile rank (lower is better). For an odorless molecule, the term “odorless” ranks first (0.092 %), followed by “slight”, “weak”, and “neutral”, indicating genuine understanding of perceptual attributes rather than reliance on frequency bias.
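The percentile-rank metric itself is simple to compute. The sketch below uses synthetic embeddings, and the rank convention (strict comparison, 0-based) is an assumption, since the paper's exact tie-breaking is not described here.

```python
import numpy as np

def percentile_rank(query_emb, cand_embs, target_idx):
    """Percentile rank (lower is better) of the target candidate among all
    candidates, ordered by cosine similarity to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    c = cand_embs / np.linalg.norm(cand_embs, axis=1, keepdims=True)
    sims = c @ q
    n_above = int((sims > sims[target_idx]).sum())   # candidates ranked above target
    return 100.0 * n_above / len(sims)

# Toy vocabulary of 1,086 descriptor embeddings; index 0 plays "odorless".
rng = np.random.default_rng(0)
vocab = rng.normal(size=(1086, 32))
molecule = vocab[0] + 0.01 * rng.normal(size=32)    # nearly aligned with "odorless"
print(percentile_rank(molecule, vocab, 0))
```

A molecule whose embedding sits closest to the correct descriptor gets a rank near 0 %, matching the reported behavior where “odorless” retrieves at 0.092 % for an odorless query.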

Receptor activation retrieval

A literature‑derived test set of molecule‑receptor pairs with known activation or non‑activation was used. Activation pairs are ranked within the top 2 % across diverse chemical families; non‑activation pairs fall between 30 % and 80 %, demonstrating clear separation and potential utility for bio‑screening.

Significance

NOSE is the first framework to embed the three olfactory modalities—molecular structure, receptor protein, and human perception—in a unified, searchable, and computable space. The contrastive‑learning‑based alignment can be extended to other chemical domains such as electrolyte solvents or plating additives, offering a new representation paradigm for AI‑driven molecular design.

Paper: https://arxiv.org/abs/2604.10452v1

Code: https://github.com/Xianyusyy/NOSE


Source: ScienceAI
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: deep learning, contrastive learning, multimodal representation, zero-shot learning, molecular design, olfaction AI
Written by

Data Party THU

Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.
