Apache Kylin Single‑Node Installation Guide and Troubleshooting
This article provides a comprehensive step‑by‑step guide for installing Apache Kylin on a single machine, covering required software versions, environment variable configuration, Spark dependency handling, main Kylin properties, verification steps, and detailed solutions to common errors such as Zookeeper host issues, HTTP 404, Jackson conflicts, MapReduce jobhistory problems, missing Spark classes, HiveConf errors, and YARN shuffle service configuration.
Apache Kylin™ is an open-source distributed analytical engine, originally developed by eBay, that provides a SQL query interface and OLAP capabilities on top of Hadoop/Spark.
This article is a step-by-step "pitfall record" of a single-node installation. It covers the required software versions, environment variable settings, Spark download and dependency packaging, the main configuration parameters, and how to verify the runtime environment.
The software versions used (Java, Hadoop, HBase, Hive, Kylin, Spark) are:
java-1.8.0-openjdk-1.8.0.191.b12.x86_64
hadoop-2.8.5
hbase-1.4.10
hive-2.3.5
apache-kylin-2.6.3-bin-hbase1x
spark-2.3.2
spark-2.3.2-yarn-shuffle.jar
Environment variables (HADOOP_HOME, HBASE_HOME, HIVE_HOME, KYLIN_HOME, etc.) are exported as shown:
export HADOOP_HOME=/home/admin/hadoop-2.8.5
export PATH="$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH"
export HBASE_HOME=/home/admin/hbase-1.4.10
export HIVE_HOME=/home/admin/hive-2.3.5
export KYLIN_HOME=/home/admin/kylin-2.6.3
From Kylin 2.6.1 onward, Spark binaries are no longer bundled. Run $KYLIN_HOME/bin/download-spark.sh to fetch them, then package the resulting jars into spark-libs.jar and upload that archive to HDFS.
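The download, packaging, and upload steps can be sketched as the following script. It is a minimal sketch, not the author's exact procedure: it assumes download-spark.sh unpacks Spark under $KYLIN_HOME/spark, and the HDFS directory /kylin/spark is a hypothetical target chosen for illustration. The `run` helper and the `DRY_RUN` switch are also illustrative; by default the script only prints the commands it would execute.

```shell
#!/usr/bin/env bash
# Sketch: package the Spark jars into spark-libs.jar and upload it to HDFS.
KYLIN_HOME="${KYLIN_HOME:-/home/admin/kylin-2.6.3}"
HDFS_SPARK_DIR="${HDFS_SPARK_DIR:-/kylin/spark}"   # assumption: any HDFS path works
DRY_RUN="${DRY_RUN:-1}"                            # 1 = print commands only

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

package_spark_libs() {
  # Fetch the Spark binaries that Kylin no longer bundles.
  run "$KYLIN_HOME/bin/download-spark.sh"
  # Build an uncompressed archive of all Spark jars (assumed location).
  run jar cv0f spark-libs.jar -C "$KYLIN_HOME/spark/jars/" .
  # Upload the archive to HDFS.
  run hadoop fs -mkdir -p "$HDFS_SPARK_DIR"
  run hadoop fs -put -f spark-libs.jar "$HDFS_SPARK_DIR/"
}

package_spark_libs
```

Running it with `DRY_RUN=0` executes the commands for real; the uploaded path is then the value to reference from the Spark engine settings in kylin.properties.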
The main Kylin configuration resides in $KYLIN_HOME/conf/kylin.properties. The important sections are the metadata store, server mode, Spark engine configs, storage settings, job limits, and security profiles.
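To make those sections concrete, the sketch below writes a small sample properties file and reads values back from it. The property keys are standard Kylin 2.x keys, but every value here is an assumption for a local single-node setup (in particular the HDFS address hdfs://localhost:9000 and the archive path), and the `get_prop` helper is a hypothetical convenience, not part of Kylin.

```shell
#!/usr/bin/env bash
# Sketch: a sample kylin.properties fragment plus a helper to read one key.
props_file="$(mktemp)"
cat > "$props_file" <<'EOF'
# Metadata store
kylin.metadata.url=kylin_metadata@hbase
# Server mode: all, job, or query
kylin.server.mode=all
# Spark engine (archive path is an assumed upload location)
kylin.engine.spark-conf.spark.master=yarn
kylin.engine.spark-conf.spark.yarn.archive=hdfs://localhost:9000/kylin/spark/spark-libs.jar
# Storage
kylin.storage.url=hbase
# Job limits
kylin.job.max-concurrent-jobs=10
# Security profile
kylin.security.profile=testing
EOF

# Print the last value assigned to key $1 in properties file $2.
get_prop() {
  awk -v k="$1" '
    index($0, k "=") == 1 { v = substr($0, length(k) + 2); found = 1 }
    END { if (found) print v }
  ' "$2"
}

get_prop kylin.server.mode "$props_file"
get_prop kylin.engine.spark-conf.spark.yarn.archive "$props_file"
```

Pointing `get_prop` at the real $KYLIN_HOME/conf/kylin.properties is a quick way to confirm which value is actually in effect before starting the server.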
After configuration, run $KYLIN_HOME/bin/check-env.sh to verify the environment, then load sample data with $KYLIN_HOME/bin/sample.sh and start the server with $KYLIN_HOME/bin/kylin.sh start.
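The verification and startup sequence can be wrapped in a small script that stops at the first failing step. This is a sketch under assumptions: the `run` helper and `DRY_RUN` switch are illustrative, and by default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: check the environment, load sample data, then start Kylin.
KYLIN_HOME="${KYLIN_HOME:-/home/admin/kylin-2.6.3}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

start_kylin() {
  # Each step must succeed before the next one runs.
  run "$KYLIN_HOME/bin/check-env.sh" || return 1
  run "$KYLIN_HOME/bin/sample.sh"    || return 1
  run "$KYLIN_HOME/bin/kylin.sh" start || return 1
}

start_kylin
```

Once the server is up, the web UI is normally reachable at http://localhost:7070/kylin, assuming the default port has not been changed.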
Common troubleshooting items are listed:
ZooKeeper UnknownHostException: adjust hbase.zookeeper.quorum and hbase.zookeeper.property.clientPort in hbase-site.xml.
HTTP 404 on Kylin UI: comment out the HTTPS connector in tomcat/conf/server.xml and restart.
Jackson jar conflict: rename or remove jackson-datatype-joda-2.4.6.jar from the Hive lib directory.
MapReduce jobhistory 10020 connection refused: start the JobHistory server and set mapreduce.jobhistory.address and mapreduce.jobhistory.webapp.address in mapred-site.xml.
Spark class not found errors: copy the missing Spark and Scala jars (e.g., spark-core_2.11-2.1.2.jar, scala-library-2.11.8.jar; use the versions that match the installed Spark) to $KYLIN_HOME/tomcat/lib and restart.
Missing HiveConf: add Hive libraries to HBASE_CLASSPATH_PREFIX in kylin.sh.
YARN aux-service spark_shuffle does not exist: add spark_shuffle to yarn.nodemanager.aux-services and set its class to org.apache.spark.network.yarn.YarnShuffleService in yarn-site.xml, then distribute the Spark YARN shuffle jar (spark-2.3.2-yarn-shuffle.jar in this setup) to the NodeManager lib directories.
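Three of the fixes above are property changes in Hadoop-style XML files. The sketch below prints the corresponding `<property>` fragments so they can be pasted into the right file. The property names come from the list above; the values are assumptions for a single-node setup (localhost as the host, 2181 and 19888 as the usual default ports), and `xml_property` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: print the <property> fragments for the XML-based fixes.

# Emit one Hadoop-style property element for name $1 and value $2.
xml_property() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

print_site_fixes() {
  echo '<!-- hbase-site.xml (host and port are assumed values) -->'
  xml_property hbase.zookeeper.quorum localhost
  xml_property hbase.zookeeper.property.clientPort 2181

  echo '<!-- mapred-site.xml (host is an assumed value) -->'
  xml_property mapreduce.jobhistory.address localhost:10020
  xml_property mapreduce.jobhistory.webapp.address localhost:19888

  echo '<!-- yarn-site.xml -->'
  xml_property yarn.nodemanager.aux-services mapreduce_shuffle,spark_shuffle
  xml_property yarn.nodemanager.aux-services.spark_shuffle.class \
    org.apache.spark.network.yarn.YarnShuffleService
}

print_site_fixes
```

Each fragment belongs inside the `<configuration>` element of the file named in the comment above it; the affected daemons need a restart to pick up the change.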
After applying the fixes, the Kylin server starts successfully and the sample cube can be built.
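The file-level fixes from the troubleshooting list (the Jackson jar, the JobHistory server, and the shuffle jar) can be sketched as follows. Several paths are assumptions rather than details from the original: the shuffle jar is assumed to sit under $KYLIN_HOME/spark/yarn, and $HADOOP_HOME/share/hadoop/yarn/lib is assumed to be the NodeManager lib directory. The `run` helper and `DRY_RUN` switch are illustrative; by default nothing is modified.

```shell
#!/usr/bin/env bash
# Sketch: apply the jar-level and daemon-level fixes (dry-run by default).
HADOOP_HOME="${HADOOP_HOME:-/home/admin/hadoop-2.8.5}"
HIVE_HOME="${HIVE_HOME:-/home/admin/hive-2.3.5}"
KYLIN_HOME="${KYLIN_HOME:-/home/admin/kylin-2.6.3}"
# Assumption: location of the shuffle jar inside the downloaded Spark.
SHUFFLE_JAR="${SHUFFLE_JAR:-$KYLIN_HOME/spark/yarn/spark-2.3.2-yarn-shuffle.jar}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

apply_file_fixes() {
  # Jackson conflict: rename the jar instead of deleting it.
  run mv "$HIVE_HOME/lib/jackson-datatype-joda-2.4.6.jar" \
         "$HIVE_HOME/lib/jackson-datatype-joda-2.4.6.jar.bak"
  # Port 10020 refused: start the MapReduce JobHistory server.
  run "$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh" start historyserver
  # spark_shuffle missing: copy the jar (assumed lib dir), restart NodeManager.
  run cp "$SHUFFLE_JAR" "$HADOOP_HOME/share/hadoop/yarn/lib/"
  run "$HADOOP_HOME/sbin/yarn-daemon.sh" stop nodemanager
  run "$HADOOP_HOME/sbin/yarn-daemon.sh" start nodemanager
}

apply_file_fixes
```

Renaming rather than deleting the Jackson jar keeps the change easy to revert if Hive later needs the file.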
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
