Mastering Entity Queries in UModel: Fast, Cross‑Domain Retrieval and Analysis
This article explains how UModel's Entity query, built on the USearch engine, enables fast, precise, cross-domain retrieval of runtime entity data. It outlines the storage architecture, query syntax, and scoring mechanisms, then covers performance tips and real-world use cases for observability operations.
Background
In observability systems, UModel defines a unified data model (Schema). While UModel queries focus on metadata, Entity queries target concrete runtime entity instances such as services, Pods, or hosts. Entity queries leverage the USearch engine to provide full‑text search, exact lookup, conditional filtering, and cross‑entity‑type joins.
Problems Addressed
Typical observability scenarios require:
Quick entity location: Find entities by keyword or attribute.
Cross-domain retrieval: Search across APM, K8s, cloud resources, etc.
Precise ID lookup: Retrieve specific entities by known IDs.
Conditional filtering: Apply complex attribute-based filters.
Statistical analysis: Aggregate and compute over entity data.
Entity queries solve these pain points with a unified interface, eliminating the need for multiple systems.
Three Types of Queries in EntityStore
UModel query: Operates on the unified schema.
Entity query: Retrieves runtime entity data.
USearch query: Provides the underlying search capabilities.
Storage Architecture
USearch uses a three‑layer storage structure to ensure logical isolation and efficient queries:
Workspace layer: Top-level isolation; workspaces are completely independent.
Domain layer: Business-logic grouping (e.g., apm, k8s, acs).
EntityType layer: Concrete entity types (e.g., apm.service, k8s.pod).
Workspace: my-observability
├── Domain: apm
│   ├── EntityType: apm.service
│   ├── EntityType: apm.host
│   └── EntityType: apm.instance
├── Domain: k8s
│   ├── EntityType: k8s.pod
│   ├── EntityType: k8s.node
│   └── EntityType: k8s.service
└── Domain: acs
    ├── EntityType: acs.ecs.instance
    └── EntityType: acs.rds.instance
Data Storage Features
Uniqueness guarantee: __entity_id__ uniquely identifies an entity within its EntityType.
Columnar storage: Data is stored as multi-row, multi-column tables, so SPL can run statistical analysis over it.
Index optimization: The full-text index is tuned for search scenarios and supports multi-keyword ranking.
Time-series support: Query by time range and retrieve historical entity states.
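The three-layer layout and the uniqueness guarantee can be pictured with a small in-memory model. This is only a sketch: the class name `EntityStoreSketch`, its methods, and the rule that the domain is the prefix of the entity type name are assumptions made for illustration, not the actual USearch storage implementation.

```python
from collections import defaultdict


class EntityStoreSketch:
    """Toy in-memory model of the Workspace -> Domain -> EntityType layout."""

    def __init__(self):
        # workspace -> domain -> entity type -> {__entity_id__: entity record}
        self._data = defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))

    @staticmethod
    def _domain_of(entity_type):
        # Assumption: the domain is the part before the first dot ("apm" in "apm.service").
        return entity_type.split(".", 1)[0]

    def upsert(self, workspace, entity_type, entity):
        bucket = self._data[workspace][self._domain_of(entity_type)][entity_type]
        # Keying by __entity_id__ means a repeated ID overwrites instead of duplicating.
        bucket[entity["__entity_id__"]] = entity

    def count(self, workspace, entity_type):
        return len(self._data[workspace][self._domain_of(entity_type)][entity_type])


store = EntityStoreSketch()
store.upsert("my-observability", "apm.service", {"__entity_id__": "a1", "language": "java"})
store.upsert("my-observability", "apm.service", {"__entity_id__": "a1", "language": "go"})
store.upsert("other-workspace", "apm.service", {"__entity_id__": "a1", "language": "java"})
```

Writing the same `__entity_id__` twice into `apm.service` leaves a single record in `my-observability`, while the record in `other-workspace` is unaffected because workspaces are isolated from each other.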
Core USearch Capabilities
Multi-type joint search: Query across domains and entity types with unified scoring.
Multi-keyword scoring: Relevance based on term frequency, field weight, document length, and inverse document frequency.
Smart tokenization: Automatic segmentation and relevance scoring.
Query Syntax
Basic structure:
.entity with(
    domain='domain_pattern', -- domain filter
    name='type_pattern', -- entity type filter
    query='search_query', -- full-text condition
    topk=10, -- result limit
    ids=['id1','id2'] -- exact ID list (optional)
)
Key Parameters
domain: Supports fnmatch wildcards (*, ?). Example: domain='ac*' matches domains starting with "ac".
name: Entity type pattern, e.g., name='*instance'.
query: Full-text expression; supports field-limited terms, phrases, logical operators, and quoted special characters.
Field-limited search: query='description:"error handling service"'
Phrase search: query='"opentelemetry.io/name-fraud-detection"'
Logical combination: query='service_name:web AND status:running'
Special characters must be quoted, e.g., query='description:"ratio is 1:2"'
topk: Controls the number of returned rows; set it according to actual needs.
ids: Exact ID lookup for batch retrieval.
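Because the domain and name parameters follow fnmatch rules, Python's standard `fnmatch` module can be used to preview which entity types a pattern pair would select. The sketch below is illustrative only: the `select_entity_types` helper is hypothetical, the type list is copied from the storage tree earlier in the article, and case-sensitive matching against the full type name is an assumption.

```python
from fnmatch import fnmatchcase

# Entity types taken from the example workspace shown in the storage tree.
ENTITY_TYPES = [
    "apm.service", "apm.host", "apm.instance",
    "k8s.pod", "k8s.node", "k8s.service",
    "acs.ecs.instance", "acs.rds.instance",
]


def select_entity_types(domain_pattern, name_pattern, entity_types=ENTITY_TYPES):
    """Return the entity types whose domain and full name match both patterns."""
    selected = []
    for entity_type in entity_types:
        # Assumption: the domain is the part before the first dot.
        domain = entity_type.split(".", 1)[0]
        if fnmatchcase(domain, domain_pattern) and fnmatchcase(entity_type, name_pattern):
            selected.append(entity_type)
    return selected


print(select_entity_types("ac*", "*instance"))
print(select_entity_types("*", "*instance"))
```

With this list, `domain='ac*'` combined with `name='*instance'` selects only the two `acs` instance types, while `domain='*'` also picks up `apm.instance`.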
Scanning Mode
Beyond search, USearch supports a scanning mode that reads raw data and applies SPL for complex filtering and computation.
# Example: count APM services in the Hong Kong region, grouped by language
.entity with(domain='apm', name='apm.service')
| where region_id='cn-hongkong'
| stats count=count() by language
| project language, count
| sort count desc
Scoring and Sorting
USearch combines multiple factors for relevance scoring:
Term frequency: Frequency of the keyword in a document.
Field weight: Importance of fields (e.g., name > description).
Document length: Shorter documents receive higher scores.
Inverse document frequency: Rare terms are weighted higher.
Default sorting is by descending relevance; ties are broken by timestamp. Custom sorting can be added via SPL, for example:
.entity with(query='kubernetes pod') | sort __last_observed_time__ desc | limit 50
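To make the interaction of the four scoring factors concrete, here is a minimal BM25-style sketch. The exact USearch formula is not given in this article, so the formula, the field weights, the `k1` and `b` constants, and the newest-first tie-break are all assumptions chosen for illustration.

```python
import math

# Assumed weights; the article only states that name outranks description.
FIELD_WEIGHTS = {"name": 2.0, "description": 1.0}

CORPUS = [
    {"__entity_id__": "e1", "name": "payment service",
     "description": "handles checkout", "__last_observed_time__": 100},
    {"__entity_id__": "e2", "name": "billing",
     "description": "payment reconciliation batch job for the finance team",
     "__last_observed_time__": 200},
    {"__entity_id__": "e3", "name": "payment service",
     "description": "handles checkout", "__last_observed_time__": 300},
    {"__entity_id__": "e4", "name": "inventory",
     "description": "stock levels", "__last_observed_time__": 400},
]


def tokens(doc, field):
    # Naive whitespace tokenizer; real tokenization is certainly richer.
    return doc.get(field, "").lower().split()


def doc_length(doc):
    return sum(len(tokens(doc, f)) for f in FIELD_WEIGHTS)


def score(terms, doc, corpus, k1=1.2, b=0.75):
    avg_len = sum(doc_length(d) for d in corpus) / len(corpus)
    # Documents longer than average are penalized, shorter ones are boosted.
    length_norm = 1 - b + b * doc_length(doc) / avg_len
    total = 0.0
    for term in terms:
        # Field-weighted term frequency.
        tf = sum(w * tokens(doc, f).count(term) for f, w in FIELD_WEIGHTS.items())
        if tf == 0:
            continue
        # Document frequency: how many documents contain the term in any field.
        df = sum(1 for d in corpus if any(term in tokens(d, f) for f in FIELD_WEIGHTS))
        idf = math.log(1 + (len(corpus) - df + 0.5) / (df + 0.5))
        total += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
    return total


def rank(query, corpus):
    terms = query.lower().split()
    hits = [(score(terms, d, corpus), d) for d in corpus]
    hits = [(s, d) for s, d in hits if s > 0]
    # Descending relevance; ties broken by timestamp (newest first is an assumption).
    hits.sort(key=lambda pair: (-pair[0], -pair[1]["__last_observed_time__"]))
    return [d["__entity_id__"] for _, d in hits]


print(rank("payment", CORPUS))
```

In this toy corpus, `e1` and `e3` match in the higher-weighted name field and tie on score, so the timestamp decides their order; `e2` matches only in a long description and ranks last.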
Practical Use Cases
Scenario 1 – Fast Entity Location
Problem: An alert references an entity ID; operators need the full entity details quickly.
# Precise ID query
.entity with(domain='apm', name='apm.service', ids=['4567bd905a719d197df','973ad511dad2a3f70a'])
Alternative approaches include keyword search or field-limited queries for flexible troubleshooting.
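When the IDs come from an alert payload, the lookup query is usually assembled programmatically. The helper below is a hypothetical sketch that only builds the query string in the form shown in Scenario 1; it does not call any real client library, and rejecting values that contain a single quote is an assumption, since the escaping rules are not described here.

```python
def build_id_lookup(domain, name, ids):
    """Build an exact-ID .entity query string from a list of entity IDs."""
    if not ids:
        raise ValueError("at least one entity ID is required")

    def quote(value):
        # Escaping rules are not documented in the article, so refuse rather than guess.
        if "'" in value:
            raise ValueError(f"unsupported character in value: {value!r}")
        return f"'{value}'"

    # Remove duplicate IDs while keeping the order in which they appeared.
    unique_ids = list(dict.fromkeys(ids))
    id_list = ",".join(quote(i) for i in unique_ids)
    return f".entity with(domain={quote(domain)}, name={quote(name)}, ids=[{id_list}])"


alert_ids = ["4567bd905a719d197df", "973ad511dad2a3f70a", "4567bd905a719d197df"]
print(build_id_lookup("apm", "apm.service", alert_ids))
```

For the two IDs used in Scenario 1, the helper produces the same query text as the hand-written example, with the duplicate ID dropped.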
Scenario 2 – Cross‑Domain Joint Retrieval
Problem: Find all entities containing "error" across APM, K8s, and cloud resources without switching systems.
.entity with(domain='*', name='*', query='error', topk=50)
Scenario 3 – Conditional Filtering & Data Analysis
Problem: Identify Java‑based APM services and aggregate counts per cluster.
.entity with(domain='apm', name='apm.service')
| where language='java'
| stats count=count() by cluster
Other examples include environment-based filters and multi-field sorting.
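For readers more familiar with general-purpose languages, the filter-then-aggregate pipeline in Scenario 3 corresponds roughly to the following plain Python. The sample records and the `count_by` helper are made up for illustration, and the final ordering (largest group first) is added only to make the output deterministic; the Scenario 3 query itself does not sort.

```python
from collections import Counter

# Hypothetical sample of apm.service entities.
SERVICES = [
    {"__entity_id__": "s1", "language": "java", "cluster": "prod-a"},
    {"__entity_id__": "s2", "language": "java", "cluster": "prod-a"},
    {"__entity_id__": "s3", "language": "go", "cluster": "prod-a"},
    {"__entity_id__": "s4", "language": "java", "cluster": "prod-b"},
]


def count_by(entities, where, group_key):
    """Keep entities whose fields equal the `where` values, then count per group."""
    counts = Counter(
        entity[group_key]
        for entity in entities
        if all(entity.get(field) == value for field, value in where.items())
    )
    # Largest groups first; names break ties so the result is stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Rough equivalent of: | where language='java' | stats count=count() by cluster
print(count_by(SERVICES, {"language": "java"}, "cluster"))
```

The `where` stage narrows the rows before the `stats` stage groups them, which is why the Go service in `prod-a` does not contribute to that cluster's count.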
Performance Optimization Tips
Prefer field‑limited queries over full‑text search for speed.
Avoid leading wildcards; trailing wildcards (service*) are faster than leading ones (*service).
Use simple AND conditions rather than complex OR chains.
Set topk to the minimal needed result set.
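These tips can be turned into a simple pre-flight check that runs before a query is submitted. The function below is a hypothetical sketch: the finding names, the regular expressions, the `max_topk` threshold of 100, and the assumption that wildcards may appear inside the query expression are all illustrative choices rather than rules enforced by USearch.

```python
import re


def lint_entity_query(query, topk=None, max_topk=100):
    """Return a list of finding codes for patterns the performance tips advise against."""
    findings = []
    # Blank out quoted phrases so text such as "ratio is 1:2" is not read as syntax.
    bare = re.sub(r'"[^"]*"', '""', query)
    # A term that starts with * or ?, optionally right after "field:" or "(".
    if re.search(r"(?:^|[\s:(])[*?]\w", bare):
        findings.append("leading-wildcard")
    # Two or more OR operators are treated as a chain (threshold is an assumption).
    if len(re.findall(r"\bOR\b", bare)) >= 2:
        findings.append("or-chain")
    # No "field:" prefix anywhere means the query is pure full-text search.
    if not re.search(r"\w+:", bare):
        findings.append("no-field-restriction")
    if topk is not None and topk > max_topk:
        findings.append("topk-large")
    return findings


print(lint_entity_query("service_name:web AND status:running", topk=10))
print(lint_entity_query("*service", topk=10))
```

A field-limited AND query passes cleanly, while `*service` is flagged both for the leading wildcard and for lacking a field restriction.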
Conclusion
Entity query, as the core interface of EntityStore, provides powerful retrieval and analysis capabilities for observability. It enables rapid entity location, cross‑domain searches, precise lookups, and deep data analytics via SPL, making it indispensable for daily operations, incident investigation, and data‑driven insights.
Alibaba Cloud Native
