Practical Approaches to Deploying Machine Learning Models: PMML, Rserve, and Spark in Production
This article shares practical engineering experiences for deploying machine learning models in production, covering three typical scenarios—real‑time small data, real‑time large data, and offline predictions—and detailing how to use PMML, Rserve, Spark, shell scripts, and related tools to meet performance and operational requirements.
Author bio: Pan Pengju, BI manager at Ctrip Hotel R&D, focuses on using machine learning to automate business processes, make systems more intelligent, and improve efficiency.
When a complex model (e.g., GBDT or XGBoost) improves accuracy but is deemed too heavy for deployment, engineers often request a simpler linear model for speed, leading to a trade‑off between performance and engineering constraints.
Three Deployment Scenarios
Real‑time, small‑volume predictions: expose the model through an SOA service that calls Rserve or a Python HTTP service. For batches of up to ~1,000 requests, 95% of responses return within 100 ms; larger inputs may need to be split into sub‑batches or handled with multithreading.
Real‑time, large‑volume predictions: convert the trained model to PMML, wrap it in a Java class, and call it directly from Java. This removes external service dependencies and gives fast, stable inference.
Offline (D+1) predictions: run a simple Rscript or Python script from a scheduled job (e.g., cron), with no additional services.
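The small‑volume path above caps each call at roughly 1,000 requests, so larger inputs must be split before hitting the scoring service. A minimal sketch of that batching step, where `score_batch` stands in for whatever client actually calls Rserve or the Python HTTP service (both names are hypothetical):

```python
from typing import Callable, Iterable, List

def chunk(records: List[dict], batch_size: int = 1000) -> Iterable[List[dict]]:
    """Split a large request into sub-batches the scoring service can handle."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]

def score_all(records: List[dict],
              score_batch: Callable[[List[dict]], List[float]]) -> List[float]:
    """Call the scoring service once per sub-batch and concatenate the results."""
    results: List[float] = []
    for batch in chunk(records):
        results.extend(score_batch(batch))
    return results
```

The same loop can be parallelized with a thread pool when latency matters more than simplicity.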
All three approaches share common data preprocessing steps performed within the SOA layer.
Converting and Using PMML
Most models can be exported to PMML. Useful resources include the jpmml‑evaluator library and its example Java code (see GitHub). PMML supports models trained in R, Python, Spark, XGBoost, and similar tools, but not deep‑learning frameworks.
Python Model Deployment
scikit‑learn and XGBoost models trained in Python can be exported to PMML.
Missing‑value handling is critical for prediction correctness.
PMML inference typically takes ~1 ms per record.
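Because missing‑value handling is critical, the serving side must fill absent features exactly as the training pipeline did, or scores silently drift. A stdlib sketch of normalizing inputs before they reach the PMML evaluator; the field names and default values here are hypothetical and would need to match the model's actual training‑time imputation:

```python
# Defaults must match what the model saw at training time,
# e.g. the medians used by the imputer before PMML export.
DEFAULTS = {"price": 0.0, "star": 3.0, "ctr": 0.01}  # hypothetical fields

def fill_missing(record: dict) -> dict:
    """Replace None/absent features with training-time defaults."""
    cleaned = {}
    for field, default in DEFAULTS.items():
        value = record.get(field)
        cleaned[field] = default if value is None else value
    return cleaned
```

Logging both the raw and the cleaned record makes it easy to verify the imputation against the offline pipeline.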
R Model Deployment
Two main methods are used:
Export the R model to PMML and call it from Java.
Deploy Rserve on a server, expose it via a Java‑written SOA interface, and invoke predictions through Rserve.
Example Rserve usage:
Pred.R <- function(x1, x2, x3) {
  # assemble the input features into a data frame
  data <- cbind(x1, x2, x3)
  # feature engineering goes here
  # modelname is the model object loaded from XX.Rdata
  score <- predict(modelname, data, type = 'prob')
  return(list(score))
}

Rserve requires two files: the model object (e.g., XX.Rdata) and a prediction script (e.g., Pred.R). Java sends input features to Rserve, receives the score list, and logs inputs and outputs for verification.
Spark Model Deployment
Train models (often XGBoost) in Scala, export to PMML if needed, and call the model class from Java.
Alternatively, keep the model in a Spark cluster, package it as a JAR, and invoke it directly.
Simple Shell‑Based Offline Deployment
For quick iteration, wrap an R prediction script in a shell script and schedule it with cron:
# predict.sh example
# Data export
data_filename=xxx
file_date=xxx
result=xxx
updatedt=xxx
cd path
hive -e "USE tmp_xxxdb;SELECT * FROM db.table1;" > ${data_filename}
# Run R script
Rscript path/predict.R $file_date
if [ $? -ne 0 ]; then
    echo "Running Rscript failed"
    exit 1
fi
# Load result back to Hive
list1="use tmp_htlbidb; load data local inpath 'path/$result' overwrite into table table2 partition(dt='${updatedt}');"
hive -e "$list1"

Schedule with cron:
crontab -e
# Run daily at 5 AM
0 5 * * * sh predict.sh

Data Flow and Operational Tips
Separate offline and real‑time data; store both in Redis with appropriate TTLs.
Keep two batches of Redis data to fallback if the latest batch is unavailable.
Use scheduling tools to generate signals for offline data ingestion.
For real‑time, maintain historical (A‑table) and current (B‑table) datasets in Redis and combine them in the SOA layer.
Instrument model inputs/outputs for logging and monitoring (e.g., using Elasticsearch).
Implement disaster‑recovery: on timeout, front‑end should return a safe default value.
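The two‑batch Redis tip above can be sketched as versioned keys with a fallback lookup. In this sketch a plain dict stands in for the Redis client, and the `key:batch` naming scheme is a hypothetical convention, not the article's actual key layout:

```python
def get_features(store: dict, key: str, batches: list):
    """Return feature data from the newest available batch.

    `store` stands in for Redis; `batches` lists batch ids newest-first,
    e.g. ["20240102", "20240101"]. Falls back to the older batch when the
    newest one failed to load or has already expired.
    """
    for batch in batches:
        value = store.get(f"{key}:{batch}")
        if value is not None:
            return value
    return None  # caller applies a safe default
```

Writing each day's batch under its own key (with a TTL covering two batch periods) is what makes this fallback possible without overwriting the previous day's data.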
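The disaster‑recovery tip, returning a safe default on timeout, can be sketched with a bounded wait around the model call. The timeout and default value below are illustrative placeholders, not values from the article:

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

_executor = ThreadPoolExecutor(max_workers=4)

def predict_with_fallback(predict_fn, features, timeout_s=0.1, default=0.5):
    """Run the model call with a deadline; return a safe default on timeout."""
    future = _executor.submit(predict_fn, features)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        future.cancel()  # best effort; a running call cannot be interrupted
        return default
```

Logging every fallback occurrence alongside the regular input/output logs makes timeout spikes visible in monitoring.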
The author invites readers to share and discuss their own model‑deployment practices.
Ctrip Technology
Official Ctrip Technology account, sharing and discussing growth.