How SPL Boosts iLogtail 2.0: Combining Performance and Flexibility in Log Processing
This article traces the evolution of stream processing languages, compares iLogtail's native and extended pipeline modes, and shows how the new SPL syntax in iLogtail 2.0 delivers fast, flexible processing of log and time-series data through unified, SQL-like commands and interactive debugging tools.
Evolution of Stream Processing Languages
Early concepts of stream processing appeared in the 1970s with array‑oriented languages such as APL and the introduction of UNIX pipes, allowing command‑line chaining of output to input. Java added a Stream API in 2014, offering chainable, lazy, internal iteration for collections. Distributed frameworks like Apache Storm, Samza, Flink and Beam later provided sophisticated stream processing features such as event‑time handling and windowed computation, while Beam introduced a unified model for batch and stream processing. SQL‑style streaming query languages (Flink SQL, KSQL) enable complex logic using familiar syntax.
Log and time-series data are typical semi-structured sources. They inspired languages such as KQL (Kusto Query Language) and SPL (Splunk's Search Processing Language), which emphasize intuitive search, powerful data manipulation, flexible analysis, real-time and historical processing, scalability, and ease of use.
iLogtail Pipeline Modes
Native plugin mode parses logs using C++ splitter and parser components, supporting fixed formats (regex, JSON, delimiter) with high performance but limited flexibility.
Extended plugin mode forwards split logs to Golang plugins, allowing arbitrary plugin composition for complex scenarios at the cost of additional serialization overhead and reduced performance.
These modes force a trade‑off between flexibility and performance, and configuring multiple pipelines becomes cumbersome for diverse log formats.
Introducing SPL in iLogtail 2.0
iLogtail 2.0 adds SPL as a further processing mode that sits alongside the native and extended plugin modes and is built on the SLS SPL library. SPL provides a unified operator set implemented in C++, so it approaches native-plugin performance while offering the expressive power of a SQL-like streaming language.
SPL Syntax Overview
Command‑style statements with a pipe (|) for pipeline composition.
Structured data commands: extend to create new fields, where to filter rows.
Field operations: project, project-away, project-rename.
Unstructured extraction: parse-regexp, parse-json, parse-csv.
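The structured-data and field commands listed above can be pictured as small transformations chained over a stream of records. The following Python sketch only imitates their semantics on dictionaries; the function names, the sample records, and the `slow` field are hypothetical illustrations, not part of iLogtail or the SLS SPL library.

```python
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, str]


def extend(rows: Iterable[Record], name: str, fn: Callable[[Record], str]) -> Iterator[Record]:
    """Imitates `extend`: add a computed field to every record."""
    for row in rows:
        yield {**row, name: fn(row)}


def where(rows: Iterable[Record], pred: Callable[[Record], bool]) -> Iterator[Record]:
    """Imitates `where`: keep only records that satisfy the predicate."""
    return (row for row in rows if pred(row))


def project_away(rows: Iterable[Record], *fields: str) -> Iterator[Record]:
    """Imitates `project-away`: drop the named fields."""
    for row in rows:
        yield {k: v for k, v in row.items() if k not in fields}


def project_rename(rows: Iterable[Record], renames: Dict[str, str]) -> Iterator[Record]:
    """Imitates `project-rename`: `renames` maps old field name -> new field name."""
    for row in rows:
        yield {renames.get(k, k): v for k, v in row.items()}


# Hypothetical input records; real log fields will differ.
logs = [
    {"level": "INFO", "latency_ms": "12", "debug": "x"},
    {"level": "ERROR", "latency_ms": "480", "debug": "y"},
]

# Each stage consumes the previous one lazily, like commands joined by `|`.
stage = extend(logs, "slow", lambda r: str(int(r["latency_ms"]) > 100))
stage = where(stage, lambda r: r["level"] == "ERROR")
stage = project_away(stage, "debug")
stage = project_rename(stage, {"latency_ms": "latency"})

result = list(stage)
print(result)
```

Because each stage is a generator, no record is processed until `list(stage)` pulls it through the whole chain, which mirrors the pipe-style composition of SPL commands.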
* | <data-source> | <spl-cmd> -option=<option> ... as <output>, ... | <spl-cmd> ...
Example of field extraction:
* | parse-regexp content, '\[([^]]+)]\s+([^}]+})\s+(.*)' as time,json,stack | parse-json json | project-away garbage,json,content
Advantages of iLogtail 2.0 + SPL
Unified syntax across iLogtail and real‑time consumption reduces configuration duplication.
C++ native operators deliver performance close to native plugins.
Full alignment with SLS SQL functions provides a rich function set.
Interactive SPL preview and intelligent suggestions simplify debugging.
Practical Example: Parsing Mixed JSON and Java Stack Logs
Original log line:
[2024-01-05T12:07:00.123456] {"message": "this is a msg", "level": "INFO", "garbage": "xxx"} java.lang.Exception: exception发生
at com.aliyun.sls.devops.logGenerator.type.RegexMultiLog.f3(RegexMultiLog.java:130)
...
Pipeline configuration (multiline, regex, JSON, discard plugins) requires several steps and UI interactions.
SPL configuration achieves the same result with a concise statement:
* | parse-regexp content, '\[([^]]+)]\s+([^}]+})\s+(.*)' as time,json,stack | parse-json json | project-away garbage,json,content
The SPL preview in the console shows the fields time, message, level while removing unwanted data, demonstrating a simpler and faster workflow.
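To make the three steps of the SPL statement above concrete, here is a minimal Python sketch that imitates them on the sample log: regex extraction into time, json and stack, expansion of the JSON object into fields, and removal of garbage, json and content. It is an illustration only, not how iLogtail executes SPL. The `process` function is hypothetical, and the `re.DOTALL` flag is an assumption made here so that the last group captures the multi-line stack trace.

```python
import json
import re

# Same pattern as in the SPL statement. re.DOTALL is an assumption of this
# sketch so that `(.*)` also spans the newline inside the stack trace.
PATTERN = re.compile(r"\[([^]]+)]\s+([^}]+})\s+(.*)", re.DOTALL)


def process(record: dict) -> dict:
    """Hypothetical helper imitating parse-regexp | parse-json | project-away."""
    out = dict(record)

    # parse-regexp content, '...' as time,json,stack
    match = PATTERN.search(out["content"])
    if match:
        out["time"], out["json"], out["stack"] = match.groups()

        # parse-json json: expand top-level keys into fields
        out.update({k: str(v) for k, v in json.loads(out["json"]).items()})

    # project-away garbage,json,content
    for name in ("garbage", "json", "content"):
        out.pop(name, None)
    return out


sample = {
    "content": (
        '[2024-01-05T12:07:00.123456] '
        '{"message": "this is a msg", "level": "INFO", "garbage": "xxx"} '
        'java.lang.Exception: exception发生\n'
        '    at com.aliyun.sls.devops.logGenerator.type.RegexMultiLog.f3(RegexMultiLog.java:130)'
    )
}

result = process(sample)
for key, value in result.items():
    print(f"{key}: {value!r}")
```

The remaining fields are time, stack, message and level, which matches the fields the article reports in the SPL preview.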
Open‑Source iLogtail SPL Configuration
enable: true
inputs:
  - Type: input_file
    FilePaths:
      - /home/test-log/test.log
    Multiline:
      StartPattern: \[\d+.*
processors:
  - Type: processor_spl
    Script: '* | parse-regexp content, ''\[([^]]+)]\s+([^}]+})\s+(.*)'' as time,json,stack | parse-json json | project-away garbage,json,content'
flushers:
  - Type: flusher_stdout
    OnlyStdout: true
Running this configuration parses the sample logs correctly, as shown by the console output.
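The Multiline StartPattern in the configuration above is what keeps a stack trace attached to the log line that precedes it. The Python sketch below shows the general idea of start-pattern grouping. It assumes the pattern must match a whole line to open a new event, which is an assumption of this sketch rather than a statement about iLogtail's implementation. The `group_multiline` helper and the second log event are hypothetical.

```python
import re
from typing import Iterable, Iterator, List

# Same expression as StartPattern in the configuration.
START = re.compile(r"\[\d+.*")


def group_multiline(lines: Iterable[str]) -> Iterator[str]:
    """Hypothetical helper: merge continuation lines into the preceding event."""
    buffer: List[str] = []
    for line in lines:
        # A line matching the start pattern closes the event collected so far.
        if START.fullmatch(line) and buffer:
            yield "\n".join(buffer)
            buffer = []
        buffer.append(line)
    if buffer:
        yield "\n".join(buffer)


raw_lines = [
    '[2024-01-05T12:07:00.123456] {"message": "this is a msg", "level": "INFO", "garbage": "xxx"} java.lang.Exception: exception发生',
    "    at com.aliyun.sls.devops.logGenerator.type.RegexMultiLog.f3(RegexMultiLog.java:130)",
    # Hypothetical second event, added only to show where the split happens.
    '[2024-01-05T12:07:01.000000] {"message": "next", "level": "INFO", "garbage": "yyy"} ok',
]

events = list(group_multiline(raw_lines))
for event in events:
    print(repr(event))
```

Each grouped event then arrives in the content field as one record, which is why the regular expression in the SPL script can capture the stack trace in its last group.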
Conclusion
SPL brings together high performance and flexible data manipulation for log processing in iLogtail 2.0, offering a unified, SQL‑like syntax, rich function support, and interactive debugging that significantly improve both configuration simplicity and runtime efficiency.
Alibaba Cloud Observability
Driving continuous progress in observability technology!
