
How LoongCollector’s OneTime File Collection Transforms Static Log Migration

LoongCollector’s OneTime file collection feature enables fast, reliable migration of historical logs, data back‑filling, and batch processing by scanning files once, using checkpoints for fault tolerance, configurable execution windows, and rate‑limiting to avoid impacting live data streams.

Alibaba Cloud Native

OneTime file collection mode

LoongCollector (Alibaba Cloud Log Service’s next‑generation data collector) provides a OneTime mode for scenarios such as historical log migration, data back‑filling, or temporary batch processing where a continuously running collector is unsuitable.

Pipeline types

Continuous: The pipeline stays resident and continuously discovers new data (e.g., input_file).

OneTime: The pipeline runs once, processes all files that match the configured paths at start‑up, then exits (e.g., input_static_file_onetime).
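
The difference between the two pipeline types comes down to when the file list is fixed. The sketch below is a minimal illustration of the OneTime behaviour in Python, not LoongCollector's implementation: the function name run_onetime and the handle callback are hypothetical, and glob matching stands in for whatever path matching the collector actually performs.

```python
import glob
from typing import Callable, Iterable, List


def run_onetime(patterns: Iterable[str], handle: Callable[[str], None]) -> List[str]:
    """Snapshot the files matching `patterns` once, process each, then return."""
    # The file list is fixed here; files created later are never picked up.
    snapshot = sorted({path for pattern in patterns for path in glob.glob(pattern)})
    for path in snapshot:
        handle(path)
    return snapshot
```

A continuous pipeline would instead repeat the discovery step in a loop, which is exactly what a one-off migration does not need.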

OneTime configuration basics

The essential fields are:

enable: true
global:
  ExcutionTimeout: 3600   # seconds, default 10 min, range 10 min–1 week
inputs:
  - Type: input_static_file_onetime
    FilePaths:
      - /var/log/history/*.log
flushers:
  - Type: flusher_stdout
    OnlyStdout: true
    Tags: true

When global.ExcutionTimeout is present, LoongCollector treats the pipeline as OneTime and calculates an expiration time (start + timeout).
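
The expiration rule can be written down in a few lines. This is a sketch under stated assumptions: the constant and function names are illustrative, and rejecting out-of-range values is a guess, since the article gives the allowed range but not what the collector does with a value outside it.

```python
MIN_TIMEOUT_S = 10 * 60            # 10 minutes (lower bound stated in the article)
MAX_TIMEOUT_S = 7 * 24 * 60 * 60   # 1 week (upper bound stated in the article)
DEFAULT_TIMEOUT_S = MIN_TIMEOUT_S  # the default is also 10 minutes


def expire_time(start_time: int, timeout_s: int = DEFAULT_TIMEOUT_S) -> int:
    """Return the Unix timestamp at which a OneTime pipeline expires."""
    # Assumption: out-of-range values are rejected; the collector may clamp instead.
    if not MIN_TIMEOUT_S <= timeout_s <= MAX_TIMEOUT_S:
        raise ValueError(
            f"timeout must be within [{MIN_TIMEOUT_S}, {MAX_TIMEOUT_S}] seconds"
        )
    return start_time + timeout_s
```

With the default timeout, a pipeline that starts at 1768550344 expires at 1768550944, which is 600 seconds later.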

Execution and expiration windows

Config delivery window: Only agents that have reported a heartbeat within a short period (default 5 minutes) after the configuration is created receive it.

Execution window: The pipeline runs for at most global.ExcutionTimeout (default 10 minutes, configurable up to 1 week).

Retention period: The server keeps the configuration for 7 days for troubleshooting or reuse.
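
The three windows are easier to compare once they are turned into concrete deadlines. The following sketch does only that arithmetic. The names OneTimeWindows and windows are hypothetical, and counting the 7-day retention from the configuration's creation time is an assumption, because the article does not say when the retention clock starts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OneTimeWindows:
    delivery_deadline: int   # last moment an agent heartbeat still qualifies
    execution_deadline: int  # moment the running pipeline expires
    retention_deadline: int  # moment the server may drop the configuration


def windows(
    created_at: int,
    started_at: int,
    timeout_s: int = 10 * 60,            # default ExcutionTimeout
    delivery_window_s: int = 5 * 60,     # default delivery window
    retention_s: int = 7 * 24 * 60 * 60  # assumption: counted from creation
) -> OneTimeWindows:
    """Convert the three windows into absolute Unix timestamps."""
    return OneTimeWindows(
        delivery_deadline=created_at + delivery_window_s,
        execution_deadline=started_at + timeout_s,
        retention_deadline=created_at + retention_s,
    )
```

Note that the execution deadline is relative to when the agent starts the pipeline, while the other two are relative to when the configuration was created on the server.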

Checkpoint mechanism

Two checkpoint files guarantee reliability across restarts:

Config‑level checkpoint (/etc/ilogtail/checkpoint/onetime_config_info.json) stores config_hash, expire_time, inputs_hash and excution_timeout. It is used to restore the expiration time and decide whether a configuration needs to be re‑run after an update.

File‑level checkpoint (/etc/ilogtail/checkpoint/input_static_file/{config_name}@0.json) records per‑file progress (device, inode, signature hash, size, status, timestamps). Example content:

{
  "config_name": "example",
  "expire_time": 1768550944,
  "file_count": 1,
  "files": [
    {
      "dev": 2051,
      "filepath": "/var/log/tmpfs.log",
      "finish_time": 1768550345,
      "inode": 2888304,
      "size": 1282,
      "start_time": 1768550345,
      "status": "finished"
    }
  ],
  "finish_time": 1768550345,
  "input_index": 0,
  "start_time": 1768550344,
  "status": "finished"
}
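
A checkpoint like this is easy to inspect when troubleshooting a stalled run. The sketch below reads the fields shown in the example and reports which files are still outstanding and how much time remains before expiry. The function name is hypothetical, and treating every status other than "finished" as unfinished is an assumption, since the article shows only that one status value.

```python
import json
from typing import List, Tuple

# Trimmed copy of the example checkpoint; only the fields read below are kept.
EXAMPLE = """
{
  "config_name": "example",
  "expire_time": 1768550944,
  "files": [
    {"filepath": "/var/log/tmpfs.log", "size": 1282, "status": "finished"}
  ],
  "status": "finished"
}
"""


def summarize_checkpoint(raw: str, now: int) -> Tuple[List[str], int]:
    """Return (paths not yet finished, seconds left before expiry)."""
    cp = json.loads(raw)
    # Assumption: any status other than "finished" means work remains.
    pending = [f["filepath"] for f in cp.get("files", []) if f.get("status") != "finished"]
    remaining = max(0, cp["expire_time"] - now)
    return pending, remaining


pending, remaining = summarize_checkpoint(EXAMPLE, now=1768550400)
# pending is [] and remaining is 544 for the example above
```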

Resource usage and throughput control

The OneTime input plugin (input_static_file_onetime) runs in a single thread inside LoongCollector’s StaticFileServer, avoiding uncontrolled concurrency.

Because the plugin is implemented in native C++, a single thread can ingest text logs at up to 300 MB/s.

Sending rate can be limited with flusher_sls.MaxSendRate (bytes per second) to protect network bandwidth and SLS write quotas.
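
For readers unfamiliar with byte-rate limiting, a token bucket is one common way to enforce a bytes-per-second cap. The sketch below illustrates that general technique only. It is not how flusher_sls implements MaxSendRate, which the article does not describe, and the class name, the one-second burst capacity, and the injectable clock are all choices made for the example.

```python
import time
from typing import Callable


class ByteRateLimiter:
    """Token bucket that allows at most `rate_bps` bytes per second on average."""

    def __init__(self, rate_bps: float, clock: Callable[[], float] = time.monotonic):
        if rate_bps <= 0:
            raise ValueError("rate_bps must be positive")
        self.rate_bps = rate_bps
        self.capacity = rate_bps  # allow bursts of up to one second of traffic
        self.tokens = rate_bps    # start with a full bucket
        self.clock = clock
        self.last = clock()

    def delay_for(self, nbytes: int) -> float:
        """Reserve `nbytes` and return how long the caller should wait before sending."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate_bps)
        self.last = now
        self.tokens -= nbytes  # may go negative: the debt is the required wait
        return max(0.0, -self.tokens / self.rate_bps)
```

A sender would call time.sleep(limiter.delay_for(len(batch))) before each write, so a burst is absorbed up to the bucket size and sustained traffic settles at the configured rate.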

Best‑practice scenarios

Large‑scale backfill: For 1 000 machines each needing to backfill ~10 GB, set MaxSendRate to ≈290 000 B/s (≈0.28 MB/s per machine) and increase ExcutionTimeout to 86 400 s (1 day) to avoid quota exhaustion and ensure completion.

Partial time‑range backfill: Combine the native timestamp filter processor (processor_timestamp_filter_native) with JSON parsing processors to keep only events within the desired window, preventing duplicate ingestion of already collected data.

Correcting a faulty configuration: If the initial OneTime config produces unexpected data, set ForceRerunWhenUpdate: true to force a re‑run after updating the configuration, then verify the new output. Erroneous data can be removed with Log Service’s soft‑delete feature.
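
The numbers in the large-scale backfill scenario can be checked with a short calculation. The sketch below assumes that sending is the only bottleneck and that 10 GB means 10 × 1024³ bytes. The function name and the returned keys are illustrative.

```python
def backfill_plan(machines: int, bytes_per_machine: int,
                  max_send_rate_bps: int, timeout_s: int) -> dict:
    """Estimate per-machine duration and fleet-wide throughput for a backfill."""
    per_machine_s = bytes_per_machine / max_send_rate_bps
    return {
        "per_machine_hours": per_machine_s / 3600,
        "aggregate_mb_per_s": machines * max_send_rate_bps / (1024 * 1024),
        "fits_in_timeout": per_machine_s <= timeout_s,
    }


plan = backfill_plan(
    machines=1000,
    bytes_per_machine=10 * 1024**3,  # ~10 GB per machine
    max_send_rate_bps=290_000,       # MaxSendRate from the scenario
    timeout_s=86_400,                # ExcutionTimeout of one day
)
```

Under these assumptions each machine needs a little over 10 hours, the fleet as a whole writes roughly 277 MB/s, and the one-day timeout leaves more than half the window as headroom for retries or slow hosts.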
