
Doris 2.0.2 vs 1.2.3: Real‑World Query Performance Comparison

After upgrading a Doris cluster from version 1.2.3 to 2.0.2, the author reruns seven SQL benchmarks (a PK test, a top‑client query, distinct counts on low‑ and high‑cardinality columns, minute‑level session analysis, a location listing, and full‑table deduplication) and compares execution times. The results are mixed: some queries get faster, others regress.

ITPUB

0. Preparation

The Doris cluster was upgraded from 1.2.3 to 2.0.2. The goal is to compare query efficiency between the two versions using the same data sets and SQL statements.
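The article does not describe how the timings were collected. As a rough sketch of one way to gather "average" and "worst of several runs" figures like those quoted below, the following Python snippet times a query over repeated runs. The helper names `time_query` and `summarize` are hypothetical, and an in-memory SQLite table stands in for the cluster so the sketch runs anywhere. Since Doris speaks the MySQL protocol, a connection from a MySQL DB-API driver could presumably be passed in instead.

```python
import sqlite3
import statistics
import time


def time_query(conn, sql, runs=5):
    """Run `sql` `runs` times; return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        cur = conn.cursor()
        start = time.perf_counter()
        cur.execute(sql)
        cur.fetchall()  # include fetching the result set in the measurement
        durations.append(time.perf_counter() - start)
        cur.close()
    return durations


def summarize(durations):
    """Return the average, worst, and best run for a list of durations."""
    return {
        "avg": statistics.mean(durations),
        "worst": max(durations),
        "best": min(durations),
    }


# Stand-in data (an assumption): a tiny SQLite table instead of the real Doris table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs_from_spark01 (client_ip TEXT, domain TEXT)")
conn.executemany(
    "INSERT INTO logs_from_spark01 VALUES (?, ?)",
    [("10.0.0.%d" % (i % 7), "example%d.com" % i) for i in range(1000)],
)

stats = summarize(
    time_query(conn, "SELECT count(DISTINCT client_ip) FROM logs_from_spark01")
)
print(stats)
```

Reporting the best, average, and worst run together makes it easier to see whether a difference of a few hundredths of a second is larger than the run-to-run noise.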

1. PK Test – Count rows per lower‑cased domain where target_ip is empty

select domain, count(domain) as count
from (
  select lower(domain) as domain
  from logs_from_spark01
  where target_ip='""'
) t
group by t.domain;

Result on 1.2.3: 0.21 s. 2.0.2 (average of multiple runs): 0.35 s. The upgraded version is slower by about 0.14 s (roughly 67%), although the absolute difference is small.

2. Top 100 client_ip by access count (with location)

select t1.client_ip, t2.nature, t2.province, t2.city, t1.count
from (
  select client_ip, count(client_ip) as count
  from logs_from_spark01
  group by client_ip
  order by count desc
  limit 100
) t1
inner join logs_from_spark01 t2 on t1.client_ip = t2.client_ip
group by client_ip, nature, province, city, count;

Result on 1.2.3: 2.67 s. 2.0.2 (worst of several runs): 2.12 s. The newer version is about 21% faster.

3. Count distinct low‑cardinality column (client_ip)

select count(distinct client_ip) from logs_from_spark01;

Result on 1.2.3: 0.25 s. 2.0.2: 0.24 s. No noticeable difference.

4. Count distinct high‑cardinality column (domain, case‑sensitive)

select count(distinct domain) from logs_from_spark01;

Result on 1.2.3: 3.40 s. 2.0.2 (worst of several runs): 3.26 s. The upgraded version is marginally faster (about 4%).

5. Top 100 client_ip by continuous minute‑level sessions

select client_ip, max(row_num2) as max
from (
  select client_ip, row_number() over (partition by client_ip, sub_date) as row_num2, date_min
  from (
    select client_ip, sub_date,
           row_number() over (partition by client_ip, sub_date order by date_min) as row_num,
           minutes_sub(to_date(date_min), row_num) as sub_date
    from (
      select client_ip, minute_floor(time) as date_min
      from logs_from_spark01
      where length(time)=14 and time like '20220730%'
    ) t
  ) A
  where A.row_num = 1
) B
group by client_ip
order by max desc
limit 100;

Result on 1.2.3: 17.57 s. 2.0.2 (worst of several runs): 13.26 s. The newer version is roughly 25% faster.
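The query text does not spell out its logic, but its shape (a row number subtracted from a minute‑floored timestamp via minutes_sub) resembles the common "gaps and islands" pattern: within one client, consecutive minutes minus their row number all land on the same anchor value, so counting rows per anchor gives the length of each streak. A minimal Python sketch under that assumption follows. The function name `longest_minute_streaks` and the sample rows are made up for illustration, and deduplicating minutes per client is also an assumption.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def longest_minute_streaks(rows):
    """rows: iterable of (client_ip, datetime).

    Returns {client_ip: length of the longest run of consecutive minutes}.
    """
    minutes = defaultdict(set)
    for client_ip, ts in rows:
        # Truncate to the minute (the role minute_floor plays in the SQL)
        # and keep each minute once per client (an assumption).
        minutes[client_ip].add(ts.replace(second=0, microsecond=0))

    result = {}
    for client_ip, mins in minutes.items():
        streak_sizes = defaultdict(int)
        for row_num, minute in enumerate(sorted(mins), start=1):
            # Consecutive minutes share the same (minute - row_num) anchor.
            anchor = minute - timedelta(minutes=row_num)
            streak_sizes[anchor] += 1
        result[client_ip] = max(streak_sizes.values())
    return result


base = datetime(2022, 7, 30, 10, 0)
sample = [
    ("1.1.1.1", base),
    ("1.1.1.1", base + timedelta(minutes=1, seconds=20)),
    ("1.1.1.1", base + timedelta(minutes=2)),
    ("1.1.1.1", base + timedelta(minutes=5)),
    ("2.2.2.2", base),
    ("2.2.2.2", base + timedelta(minutes=3)),
]
print(longest_minute_streaks(sample))
```

On this sample, client 1.1.1.1 is active at 10:00, 10:01, and 10:02 (a streak of 3) and again at 10:05, while 2.2.2.2 never has two adjacent minutes, so the sketch prints `{'1.1.1.1': 3, '2.2.2.2': 1}`.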

6. List all distinct combinations of country, province, city, and operator

select nature, province, city, operator
from logs_from_spark01
group by nature, province, city, operator;

Result on 1.2.3: 0.68 s. 2.0.2 (average of runs): 0.81 s. About 19% slower after the upgrade, a difference of 0.13 s.

7. Count total rows without duplicates (full‑row deduplication)

SELECT count(*)
FROM (
  SELECT *, row_number() over (partition by client_ip, nature, province, city, operator, domain, time, target_ip, rcode, query_type, authority_record, add_msg, dns_ip) as row_num
  FROM logs_from_spark01
) t
WHERE t.row_num = 1;

Result on 1.2.3: 33 s. 2.0.2 (average of runs): more than 40 s. The upgraded version is at least 21% slower.

Overall Conclusion

For 2.0.2, the author reports the worst‑case execution time for queries that improved and the average time for queries that regressed, arguing that this fairly reflects the real impact of the upgrade. Across the seven scenarios, Doris 2.0.2 delivers mixed results: tests 2, 4, and 5 run faster (by roughly 4% to 25%), test 3 is unchanged, tests 1 and 6 are slower by a small absolute margin, and test 7 (full‑row deduplication) is clearly slower, rising from 33 s to more than 40 s.
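To put the seven results side by side, the reported timings can be turned into relative changes. The short Python sketch below does only that arithmetic. The numbers are copied from the results above, the labels are shorthand, and test 7 uses 40 s as a lower bound because the article only states "more than 40 s". Keep in mind that the 2.0.2 figures mix worst‑case and average runs, so the percentages are indicative rather than strictly comparable.

```python
# (seconds on 1.2.3, seconds on 2.0.2) as reported in the article.
# Test 7 uses 40.0 as a lower bound: the article only says "more than 40 s".
results = {
    "1. domains with empty target_ip": (0.21, 0.35),
    "2. top 100 client_ip with location": (2.67, 2.12),
    "3. count distinct client_ip": (0.25, 0.24),
    "4. count distinct domain": (3.40, 3.26),
    "5. minute-level sessions": (17.57, 13.26),
    "6. location/operator listing": (0.68, 0.81),
    "7. full-row deduplication": (33.0, 40.0),
}


def relative_change(old, new):
    """Percent change in execution time; negative means 2.0.2 is faster."""
    return (new - old) / old * 100.0


for name, (old, new) in results.items():
    change = relative_change(old, new)
    print(f"{name:<36} {old:>6.2f} s -> {new:>6.2f} s  {change:+6.1f}%")
```

Relative change exaggerates sub‑second queries: test 1 regresses by about two thirds in percentage terms but by only 0.14 s in absolute terms, whereas test 7 loses at least 7 s.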
