
Online Learning for Real‑Time Ranking in Alibaba's Home‑Decor Channel

The article details Alibaba’s end‑to‑end online‑learning pipeline for real‑time ranking in the Taobao home‑decor channel, covering UT log parsing, full‑feature extraction, ODL sample creation, xDeepCTR model training, and deployment. In online A/B tests the pipeline yielded a CTR improvement of up to 7.8%, which demonstrates the value of rapid model adaptation.

DaTaobao Tech

This article is the third part of a series that shares practical experiences on recall, ranking, and cold‑start modules in the "Every Home Every Room" channel of Taobao.

Background: The channel delivers scene‑based content with embedded product anchors. Ranking quality determines how traffic is distributed across that content, and online learning lets the model adapt faster to changing user behavior.

Streaming Sample Generation involves three steps: UT behavior log parsing, full‑feature extraction, and ODL training sample creation.

UT Log Parsing extracts exposure and click events from UT logs and writes them to a TT stream. Example SQL definition:

create table r_ihome_lapp_content_expo (
    pvid                VARCHAR,
    user_id             VARCHAR,
    item_id             VARCHAR,
    server_timestamp    VARCHAR
) with (
    type='tt',
    topic='dwd_ihome_lapp_content_expo_sample',
    accessKey=''
);
INSERT INTO r_ihome_lapp_content_expo
select * FROM XXX WHERE YYY;

Full‑Feature Extraction uses the AMC feature center to capture user, item, and context features for each exposure/click, storing them in TT streams.

ODL Training Sample Generation builds a backbone table by joining the main exposure event with the click, detail‑click, and full‑tracking events using keyed_join. The backbone is then transformed into training samples.

pv_event = session.get_table('event.' + EVENT_NAME_EXPO)  # main event
full_tracking_event = session.get_table('event.' + EVENT_NAME_FULL_TRACK)
click_event = session.get_table('event.' + EVENT_NAME_CLICK)
detail_click_event = session.get_table('event.' + EVENT_NAME_DETAIL_CLICK)

wide_table = pv_event.keyed_join(click_event, condition='pv_id=pv_id,item_id=item_id',
                                 join_type='ONE_TO_MANY', left_wait_second=630, tps=100)
wide_table = wide_table.keyed_join(detail_click_event, condition='pv_id=pv_id,item_id=item_id',
                                   join_type='ONE_TO_MANY', left_wait_second=1, tps=100)
output_table = wide_table.keyed_join(full_tracking_event, 'pv_id=pv_id,item_id=item_id',
                                     join_type='ONE_TO_ONE', left_wait_second=-630, tps=3000)
output_table.insert(session.register_table(BackboneSink(Backbone(BACKBONE_NAME))))
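
The join logic can be illustrated with a plain‑Python sketch. It is not the PyPorsche implementation: it assumes that left_wait_second is the number of seconds the main event waits for matching events, and the ts field and the helper keyed_join_sketch are hypothetical names introduced only for this example.

```python
from collections import defaultdict


def keyed_join_sketch(left_rows, right_rows, keys, wait_seconds):
    """Attach to each left row the right rows that share the same key
    and arrive within wait_seconds after the left row (assumed semantics)."""
    # Index the right-hand events by the join key.
    index = defaultdict(list)
    for row in right_rows:
        index[tuple(row[k] for k in keys)].append(row)

    joined = []
    for left in left_rows:
        key = tuple(left[k] for k in keys)
        matches = [r for r in index.get(key, [])
                   if 0 <= r["ts"] - left["ts"] <= wait_seconds]
        # Every exposure is kept, with or without matches (ONE_TO_MANY style).
        joined.append({**left, "matches": matches})
    return joined


# Hypothetical events; ts is seconds since an arbitrary origin.
exposures = [
    {"pv_id": "p1", "item_id": "i1", "ts": 0},
    {"pv_id": "p1", "item_id": "i2", "ts": 0},
]
clicks = [
    {"pv_id": "p1", "item_id": "i1", "ts": 30},   # inside the 630 s window
    {"pv_id": "p1", "item_id": "i2", "ts": 700},  # arrives too late
]
wide = keyed_join_sketch(exposures, clicks, ("pv_id", "item_id"), wait_seconds=630)
```

Under this reading, a longer wait captures more delayed clicks as positives but also delays the moment a sample becomes available for training.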

After backbone construction, the pipeline extracts required fields, adds label information, and writes samples to both TT (for offline analysis) and Swift (for online training).

backbone = session.get_table(BACKBONE_NAME)
# lg
wide_table = backbone.select('*, event_'+EVENT_NAME_FULL_TRACK+'_features as features')
wide_table = wide_table.join_lateral(Lg('*', lgClass='com.alibaba.pyporsche.ihome.IhomeLappClickLG'))
wide_table = wide_table.filter("features IS NOT NULL")
wide_table = wide_table.with_column(AddLabelToFeatures('features,label', label_name="click_label").as_('features'))
# fg
wide_table = wide_table.select('event_id as uniqueId, features, label, type')
# sink tt
tt_sink = TTSink(TT(topic='ihome_lapp_rank_sample_tt', access_key=''),
                 line_separator='\n', field_separator='\t')
tt_sink = session.register_table(tt_sink)
wide_table.filter('rand() < 0.1').insert(tt_sink)
# swift sink
swift_sink = session.register_table(SwiftSink(Swift('ihome_lapp_rank_sample_event')))
wide_table.insert(swift_sink)
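
The labeling and dual‑sink steps above can be sketched in plain Python. The article does not show what IhomeLappClickLG or AddLabelToFeatures do internally, so this sketch assumes the label is 1 when at least one click was joined to the exposure and that features behave like a dictionary; build_sample and route are hypothetical helper names.

```python
import random


def build_sample(row):
    """Turn one backbone row into a training sample, or None if features are missing."""
    if row.get("features") is None:
        return None  # mirrors the "features IS NOT NULL" filter
    label = 1 if row.get("clicks") else 0  # assumed labeling rule
    features = dict(row["features"], click_label=label)
    return {"uniqueId": row["event_id"], "features": features, "label": label}


def route(samples, tt_rate=0.1, rng=random.random):
    """Send every sample to the online sink and roughly tt_rate of them offline."""
    online, offline = [], []
    for sample in samples:
        online.append(sample)
        if rng() < tt_rate:  # same idea as filter('rand() < 0.1')
            offline.append(sample)
    return online, offline


# Hypothetical backbone rows.
rows = [
    {"event_id": "e1", "features": {"age": 3}, "clicks": [{"ts": 30}]},
    {"event_id": "e2", "features": {"age": 5}, "clicks": []},
    {"event_id": "e3", "features": None, "clicks": []},
]
samples = [s for s in map(build_sample, rows) if s is not None]
online, offline = route(samples)
```

Sampling only the offline copy keeps the analysis stream small while the online trainer still sees every sample.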

Model Real‑Time Training uses the xDeepCTR framework on the AOP platform. The main entry script passes an empty source string, so that samples are read from Swift, and initializes parameters from the newest version of the offline model (ihome_rank_demo_mmoe).

from aop import odps_table, tf_train, AOPClient
if __name__ == '__main__':
    fg_path = './ihome_rank_model_fg.json'
    user_params_path = "./user_params.json"
    algo_conf_path = './algo_conf.json'
    running_config_path = "./running_config.json"

    repo_name = 'xDeepCTR'
    zip_name = './' + repo_name + '.zip'
    model_path = repo_name + '/xdeepctr/models/multitask/mmoe.py'
    model_name = "ihome_rank_demo_mmoe_odl"

    ACCESS_ID = 'XXXX'
    ACCESS_KEY = 'YYYY'

    source = ""  # source empty for Swift
    train = tf_train(source,
                     fg_config=fg_path,
                     model_path=model_path,
                     model_name=model_name,
                     user_params=user_params_path,
                     train_from_model='ihome_rank_demo_mmoe',
                     train_from_version='NEWEST',
                     ps_num=2,
                     worker_num=3)
    with AOPClient(model_name) as client:
        client.add_code(zip_name)
        client.add_resource(algo_conf_path)
        client.add_debug_version("aop_version_tf112")
        client.add_runconf(running_config_path)
        client.run(train)

The ODL model‑update hook is configured in JSON, which specifies the RTP table and topic, the send intervals, and the AUC thresholds.

{
  "customized_functions": {
    "odl_model_update": {
      "open": true,
      "is_sync": true,
      "rtp_table_name": "ihome_rank_demo_mmoe_odl",
      "rtp_table_topic": "ihome_rank_demo_mmoe_odl_swift_${today}",
      "swift_partition_count": 32,
      "swift_partition_max_buffer_size": 5120,
      "reuse_topic": false,
      "interval_time": 300,
      "dense_send_interval_time": 300,
      "sparse_send_interval_time": 900,
      "first_trigger_time": 600,
      "global_auc_threshold": "0.68",
      "current_auc_threshold": "0.68",
      "part_strategy": "div",
      "check_numerics": false
    }
  }
}
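
One way to read this configuration is as a gate on when updated parameters are pushed. The sketch below is an interpretation inferred from the field names, not the documented behavior of the AOP hook: it assumes that the time fields are in seconds, that the first push waits for first_trigger_time, that later pushes are spaced by interval_time, and that both AUC values must reach their thresholds. The function should_push is hypothetical.

```python
import json

# A subset of the fields from the configuration above.
CONFIG = json.loads("""
{
  "open": true,
  "interval_time": 300,
  "first_trigger_time": 600,
  "global_auc_threshold": "0.68",
  "current_auc_threshold": "0.68"
}
""")


def should_push(cfg, since_start, since_last_push, global_auc, current_auc):
    """Decide whether to push a model update (assumed semantics, see lead-in)."""
    if not cfg["open"]:
        return False
    if since_start < cfg["first_trigger_time"]:
        return False  # still warming up
    if since_last_push < cfg["interval_time"]:
        return False  # pushed too recently
    # The thresholds are stored as strings in the JSON, so convert before comparing.
    return (global_auc >= float(cfg["global_auc_threshold"])
            and current_auc >= float(cfg["current_auc_threshold"]))


ok = should_push(CONFIG, since_start=700, since_last_push=300,
                 global_auc=0.70, current_auc=0.69)
```

Gating on AUC in this way would keep a model that has degraded on recent traffic from replacing the one currently serving.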

The end‑to‑end pipeline is scheduled in cycles (e.g., weekly) to reload batch model parameters, initialize ODL models, push updated topics to RTP, and start online training.

Business Impact: Online A/B tests show that the ODL model improves CTR by up to 7.83% (daily) and 7.04% (Double‑11), with notable gains in per‑user exposures, clicks, and detail‑page clicks. Full‑feature extraction contributes additional uplift.

Conclusion: Real‑time learning is critical for capturing shifts in user interest. The ODL pipeline described here builds on internal platforms (AMC, PyPorsche, AOP, xDeepCTR) to shorten development time and deliver measurable business benefits.

Team Introduction: The Alibaba Taobao Intelligent Team, a data‑and‑algorithm group serving multiple e‑commerce scenarios, invites interested talent to apply.
