AI FAAS Solution for Advertising Tools: Architecture, Platform, and Real‑time Feature SQL Production
The team built a lightweight AI FAAS platform that migrates C++ ad‑algorithm services to a Java‑centric, cloud‑native environment, encapsulates core operators as DolphinSQL plugins, and defines real‑time feature pipelines in standardized SQL. As a result, service creation drops from weeks to days, core operators are reused across scenarios, and real‑time features go from definition to deployment in hours.
Background
As internet user growth slows and competition intensifies, advertisers face a complex product matrix from Alimama. To grow ad revenue and build a customer ecosystem, B‑side algorithm teams need to provide fine‑grained algorithmic capabilities. In practice they are held back by high development and operations costs, slow iteration, poor operator reuse, and stale features.
Key challenges
High development cost: more than 50 scenarios, many services, and many C++ components lead to long setup times.
Insufficient monitoring: the operations tooling is weak, so failures are hard to detect.
High O&M cost: the large number of services makes troubleshooting difficult and upgrades expensive.
Low operator reuse: core operators are tightly coupled with business logic.
Poor feature timeliness: most features are computed offline with T+1 latency.
To address these issues, the team designed a lightweight AI FAAS solution that lets algorithm engineers focus on business logic while platform engineers maintain the framework.
AI FAAS Solution
The solution tackles three aspects:
Iteration efficiency: migrate C++ services to a Java‑centric cloud‑native FAAS platform, providing an end‑to‑end AI FAAS development platform (coding, gray‑release, monitoring, O&M, experiments, stress testing, logging, debugging).
Operator reuse: encapsulate core operators into Dolphin plugins and expose them via DolphinSQL services, enabling interactive analysis on the Aurora platform.
Feature timeliness: define a standardized real‑time feature pipeline, using SQL to produce features, thus simplifying development and inspection.
FAAS Development Platform
Built on Function‑as‑a‑Service (FAAS) and an MPP unified SQL engine (Dolphin), the platform includes:
R&D control console for one‑click service launch, debugging, logging, and experiments.
FAAS deployment on Alibaba Cloud Function Compute, supporting multi‑region rollout, gray release, and auto‑scaling, with no manual resource requests.
Engineering framework based on Spring Cloud containers, pre‑packaged with common middleware (Diamond, HSF, IGraph, Redis, SLS, Dolphin, Sunfire) and utility libraries (LogUtil, Concurrent, etc.), offering a Spring Boot‑like development experience.
Operations tools for accuracy verification, stress testing, and monitoring, built on logs that Flink writes back to ODPS/Hive, plus one‑click Sunfire monitoring integration.
DolphinSQL Service
Traditional C++‑centric algorithm services require integrating many client libraries, raising learning and integration costs. DolphinSQL provides a unified SQL interface, hiding underlying components. After migration, an algorithm service only needs the Dolphin client, and the entire computation can be expressed in SQL.
CREATE MODEL rtp_dolphin.dolphin_alime_ctr_v1_model WITH (
    cm2_cluster='rtp_ads_internal',
    zk_host='test',
    zk_root='test',
    biz='test',
    out_fmt='xml',
    debug='false',
    attribute='["test"]'
);

SELECT *
FROM rtp_dolphin.dolphin_alime_ctr_v1_model
WHERE item_list IN (1000128836)
  AND qinfo='{}'
  AND context='{"field_names":[],"docs":[]}';

SELECT id,
       pm_squared_euclidean_distance(feature, '{0.1,0.1,0.1,0.1}') AS distance
FROM feature_tb
WHERE cate_id IN (1,3,12)
ORDER BY distance ASC
LIMIT 10;
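
The vector query above shows the idea of exposing an operator as a SQL function. As a rough, self-contained illustration of that pattern (not Dolphin's plugin API, which the article does not describe), the sketch below registers a Python function under the same name in an in-memory SQLite database. The demo rows, the helper names `parse_vector`, `build_demo_db`, and `nearest`, and the assumption that vectors are stored as `{a,b,c}` text literals are all hypothetical.

```python
import sqlite3


def parse_vector(text):
    """Parse a '{0.1,0.2}' style literal into a list of floats."""
    return [float(x) for x in text.strip("{}").split(",") if x.strip()]


def squared_euclidean_distance(a, b):
    """SQL-callable operator: sum of squared component differences."""
    va, vb = parse_vector(a), parse_vector(b)
    if len(va) != len(vb):
        raise ValueError("vectors must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(va, vb))


def build_demo_db():
    """Create an in-memory table with made-up rows and register the operator."""
    conn = sqlite3.connect(":memory:")
    # Expose the Python function to SQL under the name used in the article.
    conn.create_function("pm_squared_euclidean_distance", 2, squared_euclidean_distance)
    conn.execute("CREATE TABLE feature_tb (id INTEGER, cate_id INTEGER, feature TEXT)")
    conn.executemany(
        "INSERT INTO feature_tb VALUES (?, ?, ?)",
        [
            (1, 1, "{0.1,0.1,0.1,0.1}"),
            (2, 3, "{0.5,0.5,0.5,0.5}"),
            (3, 12, "{0.2,0.1,0.1,0.1}"),
            (4, 7, "{0.1,0.1,0.1,0.1}"),  # filtered out by the category predicate
        ],
    )
    return conn


def nearest(conn, query_vec, limit=10):
    """Return (id, distance) pairs for the closest rows in the allowed categories."""
    sql = """
        SELECT id, pm_squared_euclidean_distance(feature, ?) AS distance
        FROM feature_tb
        WHERE cate_id IN (1, 3, 12)
        ORDER BY distance ASC
        LIMIT ?
    """
    return conn.execute(sql, (query_vec, limit)).fetchall()
```

Calling `nearest(build_demo_db(), "{0.1,0.1,0.1,0.1}", 2)` returns the two closest rows among categories 1, 3, and 12. The caller only writes SQL, while the distance logic lives in one reusable function, which is the reuse property the article attributes to DolphinSQL plugins.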

SELECT *
FROM (
    SOLVESELECT quality IN (
        SELECT * FROM solve_db_test WHERE adgroup_id = 461628001 LIMIT 200
    ) AS u
    MAXIMIZE (SELECT SUM(quality * trade) FROM u)
    SUBJECTTO
        (SELECT SUM(ctr_threshold * quality * impression - quality * click) <= 0 FROM u),
        (SELECT 0 <= quality <= 1 FROM u),
        (SELECT SUM(quality) <= 20 FROM u)
    USING solverlp
) AS s
WHERE quality = 1;
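
Read as a mathematical program, the SOLVESELECT statement above appears to be a linear program over the `quality` column, assuming `solverlp` is a linear-programming solver as its name suggests. The symbols below are shorthand introduced here and do not come from the article: for each candidate row i, q_i is `quality`, t_i is `trade`, tau_i is `ctr_threshold`, m_i is `impression`, and c_i is `click`.

```latex
\begin{aligned}
\max_{q}\quad & \sum_{i} q_i \, t_i \\
\text{s.t.}\quad & \sum_{i} q_i \,(\tau_i m_i - c_i) \le 0, \\
& \sum_{i} q_i \le 20, \\
& 0 \le q_i \le 1 \quad \text{for all } i .
\end{aligned}
```

If the threshold is the same value tau for every row and the selected impressions are positive, the first constraint is equivalent to requiring that the weighted click-through rate of the selection, (sum of q_i c_i) / (sum of q_i m_i), is at least tau. The outer `WHERE quality = 1` then keeps only the rows the solver selected in full.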
Real‑time Feature SQL Production
Developing real‑time features traditionally requires expertise in streaming, Flink, storage, etc. By leveraging the SQL engine, feature pipelines become SQL‑centric, reducing development time from weeks to hours.
CREATE TABLE test_input (
    user_id STRING,
    tool_id STRING,
    label STRING,
    behavior_time STRING
) WITH (bizType='tt', topic='test_input', pk='user_id', timeColumn='behavior_time');

CREATE TABLE test_output (
    user_id STRING,
    tool_id STRING,
    label STRING,
    behavior_time STRING
) WITH (bizType='feature', pk='user_id');

INSERT INTO test_output
SELECT user_id,
       concat_id(tool_id, behavior_time, 50) AS tool_id,
       concat_id(label, behavior_time, 50) AS label,
       concat_id(behavior_time, behavior_time, 50) AS behavior_time
FROM test_input
GROUP BY user_id;

SELECT user_id, tool_id
FROM test_output
WHERE user_id IN (1234);
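
The article does not define `concat_id`. One plausible reading of `concat_id(value, behavior_time, 50)` under `GROUP BY user_id` is "keep each user's 50 most recent values, ordered by event time, and join them into one string". The sketch below implements that assumed behavior in plain Python to make the shape of the resulting feature concrete. The class name `LatestNSequence`, the comma separator, and the sample timestamps are hypothetical, and the real operator may differ.

```python
from collections import defaultdict


class LatestNSequence:
    """Per-key buffer keeping the N most recent values, ordered by event time."""

    def __init__(self, max_len=50, sep=","):
        if max_len < 1:
            raise ValueError("max_len must be at least 1")
        self.max_len = max_len
        self.sep = sep
        self._events = defaultdict(list)  # key -> [(event_time, value), ...]

    def add(self, key, value, event_time):
        """Record one behavior event; events may arrive out of order."""
        events = self._events[key]
        events.append((event_time, value))
        events.sort(key=lambda e: e[0])  # order by event time, oldest first
        if len(events) > self.max_len:
            # Drop the oldest entries so only the latest max_len remain.
            del events[: len(events) - self.max_len]

    def feature(self, key):
        """Return the concatenated sequence for a key, or '' if unseen."""
        return self.sep.join(value for _, value in self._events.get(key, []))


# Usage with made-up events and a window of 2 instead of 50.
seq = LatestNSequence(max_len=2)
seq.add("u1", "toolA", "2024-01-01 10:00:00")
seq.add("u1", "toolC", "2024-01-01 12:00:00")
seq.add("u1", "toolB", "2024-01-01 11:00:00")  # late arrival
print(seq.feature("u1"))  # toolB,toolC
```

Timestamps are compared as strings here because `behavior_time` is declared as STRING in the table definition. That only orders correctly for fixed-width, zero-padded formats.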
Typical Case: Keyword Recommendation
The solution powers the “Direct‑Train” keyword recommendation scenario, supporting dozens of algorithms (text recall, vector recall, model prediction, relevance, segmentation, normalization). Business teams only need to connect data tables via the Aurora platform, define models/operators, and develop real‑time features. The service can then be launched through DAG configuration.
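
The article does not show what the DAG configuration looks like. As a minimal sketch of the general idea, the code below declares operators as nodes with dependencies and runs them in topological order using Python's standard `graphlib` module. The node names, the toy operator bodies, and the `run_dag` helper are all hypothetical stand-ins for the real recall and ranking operators.

```python
from graphlib import TopologicalSorter


def run_dag(dag, operators, request):
    """Run operators in dependency order and return every node's output.

    dag: maps each node name to the names of the nodes it depends on.
    operators: maps each node name to a callable(request, upstream_outputs).
    """
    outputs = {}
    # static_order() yields each node only after all of its dependencies.
    for node in TopologicalSorter(dag).static_order():
        upstream = {dep: outputs[dep] for dep in dag.get(node, ())}
        outputs[node] = operators[node](request, upstream)
    return outputs


# Hypothetical keyword-recommendation flow: segment the query, recall
# candidates along two branches, then merge the results.
DEMO_DAG = {
    "segment": [],
    "text_recall": ["segment"],
    "vector_recall": ["segment"],
    "merge": ["text_recall", "vector_recall"],
}

DEMO_OPERATORS = {
    "segment": lambda req, up: req["query"].split(),
    "text_recall": lambda req, up: [t + " sale" for t in up["segment"]],
    "vector_recall": lambda req, up: [t + " deals" for t in up["segment"]],
    "merge": lambda req, up: sorted(up["text_recall"] + up["vector_recall"]),
}

result = run_dag(DEMO_DAG, DEMO_OPERATORS, {"query": "red shoes"})
print(result["merge"])
```

Keeping the graph as data means a new scenario can reuse existing operators by changing only the configuration. `TopologicalSorter` raises `CycleError` if the configuration contains a cycle.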
Effects
Iteration efficiency: service creation reduced from 1‑2 weeks to 1‑2 days; new services can be deployed within an hour.
Operation efficiency: real‑time monitoring of core KPIs, instant alerts, and log analysis via SLS and back‑filled logs.
Operator reuse: core operators unified in DolphinSQL and reused across multiple business scenarios.
Real‑time feature development: reduced from weeks to 1‑2 hours, covering definition, development, and deployment.
Outlook
Future work includes further optimization of the FAAS kernel, richer asynchronous execution frameworks, enhanced unified SQL capabilities (materialized views, vector execution, more feature operators), and broader support for AI algorithm services.
Alimama Tech
Official Alimama tech channel, showcasing all of Alimama's technical innovations.