Using Field-aware FM (FFM) Models for Unified Recall in Recommendation Systems
This article explores how Field-aware Factorization Machines (FFM) can be employed to replace multi‑path recall strategies in industrial recommendation systems, detailing model principles, embedding construction, integration of user, item and context features, performance considerations, and potential for unifying recall and ranking stages.
Industrial Recommendation System Architecture
A typical industrial recommendation system consists of online, near‑line, and offline components. The online part performs recall (reducing candidate items to a few thousand), optional coarse ranking, and fine ranking, followed by business rules such as diversification and advertising. Near‑line continuously collects user feedback and updates models, while offline processes massive logs to retrain models using distributed machine‑learning platforms.
Multi‑path Recall
In practice, recall is implemented as a multi‑path strategy where dozens of independent recall channels (e.g., interest tags, topics, entities, collaborative filtering, hot items, geographic relevance) each retrieve a fixed number of candidates. The number of candidates per channel is a hyper‑parameter that is usually tuned via A/B testing, but it is not personalized per user.
What Is Field‑aware FM (FFM)?
FFM extends the classic Factorization Machine by learning a separate embedding vector for each feature‑field pair. For a feature belonging to field A, the model uses different embeddings when it interacts with features from field B, field C, etc. This increases expressive power at the cost of a larger parameter count.
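To make the field-aware pairing concrete, here is a minimal NumPy sketch of the FFM second-order score. This is illustrative only; the function and container names (`ffm_interaction`, `V`, `field_of`) are hypothetical, not from the talk.

```python
import numpy as np

def ffm_interaction(features, V, field_of):
    """FFM second-order score for a list of active feature ids.
    V[i][f] is the k-dim embedding feature i uses when it interacts
    with a feature from field f; field_of[i] is feature i's own field."""
    score = 0.0
    for a in range(len(features)):
        for b in range(a + 1, len(features)):
            i, j = features[a], features[b]
            # each feature uses its embedding for the *other* feature's field
            score += float(np.dot(V[i][field_of[j]], V[j][field_of[i]]))
    return score
```

Note that, unlike plain FM, feature `i` contributes a different vector depending on which field it is paired with; this is exactly where the extra parameter count comes from.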
Using FFM for Recall
The article investigates whether an FFM model can serve as a unified recall model, answering two key questions: (1) can a single model replace the many recall paths, and (2) can the same model also replace the ranking stage? The discussion covers offline training, embedding generation, and online retrieval using Faiss.
Simplified FFM Recall Model
In the simplest version, user and item features are split into separate sets. After offline training, each user obtains a set of field‑aware embeddings, and each item obtains its own. User embeddings are stored in an in‑memory store (e.g., Redis) and item embeddings in a Faiss index. At request time, the user's embedding is fetched and an inner‑product search against the item index returns the top‑K candidates.
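The serving path can be sketched as follows. To keep the example self-contained, plain NumPy stands in for the production components (`faiss.IndexFlatIP` for the item index, Redis for the user store); `recall_topk` is a hypothetical helper name.

```python
import numpy as np

def recall_topk(user_vec, item_vecs, k):
    """Score every item by inner product against the user vector and
    return the ids and scores of the top-k items (what an inner-product
    Faiss index computes for us in production)."""
    scores = item_vecs @ user_vec
    topk = np.argsort(-scores)[:k]
    return topk, scores[topk]

# three items with 2-dim embeddings; the user vector would be fetched
# from the key-value store by user id
item_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
user_vec = np.array([2.0, 1.0])
ids, scores = recall_topk(user_vec, item_vecs, k=2)
```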
Embedding Construction Details
Because FFM requires a distinct embedding for every feature‑field pair, the user embedding size becomes M × N × k (M user fields, N item fields, k latent dimension). To keep retrieval fast, the article proposes concatenating the aligned embeddings into a single vector and using Faiss for efficient similarity search.
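The key to making a single inner product work is the alignment of the concatenation order on both sides. A small sketch under assumed names (`build_vectors`, `U`, `I` are hypothetical): the user side is ordered by (user field m, item field n), and the item side by the matching (n, m), so one dot product reproduces the sum of all user–item cross terms.

```python
import numpy as np

def build_vectors(U, I):
    """Concatenate field-aware embeddings so one inner product reproduces
    all user-item cross interactions. U[m][n]: k-dim embedding user field m
    uses toward item field n; I[n][m]: embedding item field n uses toward
    user field m. Both vectors have length M * N * k."""
    M, N = len(U), len(I)
    # user side ordered by (m, n); item side by the matching (n, m)
    user_vec = np.concatenate([U[m][n] for m in range(M) for n in range(N)])
    item_vec = np.concatenate([I[n][m] for m in range(M) for n in range(N)])
    return user_vec, item_vec
```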
Incorporating User/Item Internal Interactions
Beyond cross‑field interactions, the model can also embed user‑user and item‑item second‑order terms by adding extra dimensions to the user/item vectors before indexing, allowing these internal interactions to influence the inner‑product score.
Adding First‑order Terms
First‑order (linear) terms are incorporated by appending the sum of a field’s linear weights to the corresponding side of the embedding (or by concatenating the raw linear weight and fixing the counterpart to 1), ensuring they contribute to the Faiss inner‑product.
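The appended-scalar trick above can be sketched in a few lines (hypothetical helper name `append_linear`): each side's summed linear weight is paired with a constant 1 on the opposite side, so the Faiss inner product picks both up unchanged.

```python
import numpy as np

def append_linear(user_vec, item_vec, w_user_sum, w_item_sum):
    """Fold first-order terms into the inner product: the user-side linear
    sum meets a constant 1.0 on the item side, and vice versa, so
    dot(u, v) = dot(user_vec, item_vec) + w_user_sum + w_item_sum."""
    u = np.concatenate([user_vec, [w_user_sum, 1.0]])
    v = np.concatenate([item_vec, [1.0, w_item_sum]])
    return u, v
```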
Context Features
Real‑time context features (e.g., timestamp, device, location) are handled by fetching their field‑aware embeddings on‑the‑fly, splitting them into user‑side and item‑side parts, and adding the resulting inner‑product scores to the recall score. This allows dynamic context to participate without offline pre‑computation.
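The item-facing half of this split can be sketched as extending the query vector at request time (a sketch under the assumption that items are indexed with a context-facing block concatenated at the end; `query_with_context` is a hypothetical name):

```python
import numpy as np

def query_with_context(user_vec, ctx_item_vec):
    """Extend the query with the context's item-facing embedding so the
    single inner-product search also covers context-item interactions.
    The user-context term is identical for every item in the request, so
    it only shifts all scores and does not change the ranking."""
    return np.concatenate([user_vec, ctx_item_vec])
```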
Speed Optimizations
Because the full FFM embedding can be extremely long (e.g., 25,000 dimensions for 50 × 50 fields with k = 10), two acceleration strategies are proposed: (1) segmenting the embedding into multiple shorter slices and querying separate Faiss shards in parallel, and (2) hybridizing with a plain FM where item‑side embeddings are summed per field, reducing size to M × k while retaining most of FFM’s expressive power.
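Strategy (1) works because an inner product decomposes over segments of the vector. A minimal sketch (hypothetical helper name `sharded_scores`; in production each shard would be its own Faiss index queried in parallel):

```python
import numpy as np

def sharded_scores(query, item_shards):
    """Score a long query against column-wise shards of the item matrix and
    sum the partial inner products. The result is identical to one full
    inner product, but each shard can be searched independently."""
    parts = np.array_split(query, len(item_shards))
    return sum(shard @ part for shard, part in zip(item_shards, parts))
```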
Unified Recall vs. Multi‑path
Unified recall using a single model yields comparable scores across all candidates, eliminating the need for per‑channel hyper‑parameters and enabling per‑user personalization of the recall distribution. However, adding a new recall channel now requires retraining the model with new features, which is less flexible than simply plugging in a new path.
Integrating Multi‑path into FFM
Most existing recall channels can be expressed as additional features (e.g., geographic tags, hot‑item flags) within the FFM model. Collaborative‑filtering can also be represented by user‑ID and item‑ID fields, though the parameter explosion may require pre‑computed ID embeddings rather than raw IDs.
Can FFM Merge Recall and Ranking?
If the FFM model includes all features used in the ranking stage, it can theoretically produce final recommendation scores directly. The feasibility depends on embedding size: with many fields the resulting vectors become too large for real‑time Faiss lookup, unless dimensionality is reduced or hybrid FM/FFM techniques are used.
Conclusion
The article concludes that FFM offers a powerful, unified framework for recall and, potentially, ranking, but its practical adoption hinges on careful engineering to manage embedding dimensionality and retrieval speed.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.