Imaging Methods and Image Analysis Overview
The talk by Tencent Cloud senior researcher Ji Yongnan reviews imaging modalities, low‑ to high‑level processing, the rise of deep‑learning CNNs over traditional handcrafted methods, and their deployment in classification, detection, segmentation, moderation, medical analysis, OCR, and other real‑world applications.
The "AI is coming" lecture series invited Tencent Cloud senior researcher Ji Yongnan to discuss various aspects of image analysis.
Since 2012, deep learning has revolutionized image recognition, classification, object detection, and semantic segmentation, surpassing traditional methods. Recognition and detection remain the most challenging of these tasks, and real-world image problems typically combine several sub-problems rather than reducing to a single one.
Ji Yongnan, a Ph.D. from the University of Nottingham and a Marie Curie Fellow, now serves as a senior researcher at Tencent Cloud's Big Data AI Product Center.
Imaging Methods
Understanding how an image is formed is the first step in solving image-related problems. The most common imaging method is the optical camera (DSLRs, phone cameras). Other modalities include X-ray, infrared, microscopy, remote sensing, and structured light. For example, in medical CT images, each pixel value represents the attenuation of X-rays by the material at that location; bone and metal yield high values, air yields low values, and the dynamic range can span roughly −1024 to 1024.
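To make the wide dynamic range concrete, here is a minimal pure-Python sketch of windowing: clamping raw CT attenuation values to a display window and rescaling them to an 8-bit range. The window bounds and the sample values are illustrative assumptions, not part of the talk.

```python
def window_ct(values, lo=-1024, hi=1024):
    """Clip raw CT attenuation values to [lo, hi] and rescale to 0-255."""
    out = []
    for v in values:
        v = max(lo, min(hi, v))                        # clamp to the chosen window
        out.append(round((v - lo) * 255 / (hi - lo)))  # map to the 8-bit display range
    return out

# Air (about -1024) maps toward 0; dense bone or metal (about 1024) maps toward 255.
print(window_ct([-1024, 0, 1024]))  # -> [0, 128, 255]
```

In practice the window is narrowed around the tissue of interest (e.g., a lung or bone window) so that small attenuation differences become visible on screen.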
Image Processing
Image processing can be divided into three levels:
Low‑level: enhancement, denoising, edge extraction, basic compression.
Mid‑level: classification, object detection, localization, segmentation, semantic segmentation.
High‑level: tasks such as automatic image captioning, face pose recognition, autonomous driving.
Complex high‑level tasks are usually decomposed into a series of mid‑ and low‑level sub‑tasks. For instance, face‑based identity verification involves face detection, landmark localization, feature extraction, and matching. Traditional image analysis relies heavily on handcrafted filters (e.g., edge detectors, SIFT, HOG) and requires expert knowledge to select appropriate ones.
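As an illustration of a handcrafted filter of the kind mentioned above, the sketch below applies the classic Sobel horizontal-gradient kernel to a tiny grayscale image given as nested lists. Everything here (the helper name, the toy image) is invented for illustration; it is not code from the talk.

```python
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]  # handcrafted kernel that responds to vertical edges

def convolve3x3(img, kernel):
    """'Valid' 3x3 sliding-window filtering (cross-correlation, as in most CV libraries)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h - 2):
        row = []
        for x in range(w - 2):
            s = sum(kernel[i][j] * img[y + i][x + j]
                    for i in range(3) for j in range(3))
            row.append(s)
        out.append(row)
    return out

# A vertical step edge: dark left half, bright right half.
img = [[0, 0, 10, 10]] * 4
print(convolve3x3(img, SOBEL_X))  # -> [[40, 40], [40, 40]]
```

The kernel weights were designed by hand; choosing the right filter (Sobel vs. SIFT vs. HOG) for a given problem is exactly the expert-knowledge burden the text describes.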
Deep Learning
Convolutional Neural Networks (CNNs) have largely outperformed traditional methods on mid‑level tasks such as classification, detection, and segmentation. CNNs can be viewed as learnable filters that require large amounts of data but little manual design. Different network architectures are used for different tasks: Fully Convolutional Networks (FCN) for segmentation, Faster R‑CNN for detection, etc.
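To make the "learnable filter" view concrete, here is a toy pure-Python sketch (all names and data invented for illustration): instead of hand-designing a filter, we fit a 1D 3-tap filter to input/output examples by gradient descent. A CNN applies this same idea at scale, in 2D, across many layers.

```python
def apply_filter(w, x):
    """'Valid' 1D correlation of a 3-tap filter w over signal x."""
    return [sum(w[k] * x[i + k] for k in range(3)) for i in range(len(x) - 2)]

def learn_filter(x, target, lr=0.01, steps=2000):
    """Fit filter weights to map x -> target by mean-squared-error gradient descent."""
    w = [0.0, 0.0, 0.0]
    for _ in range(steps):
        y = apply_filter(w, x)
        grad = [0.0, 0.0, 0.0]
        for i, (yi, ti) in enumerate(zip(y, target)):
            err = yi - ti
            for k in range(3):
                grad[k] += 2 * err * x[i + k]   # d(err^2)/dw[k]
        for k in range(3):
            w[k] -= lr * grad[k] / len(y)
    return w

# Training targets produced by a hidden difference filter [-1, 0, 1];
# gradient descent recovers it from data alone, with no manual design.
x = [0.0, 1.0, 2.0, 5.0, 3.0, 1.0, 4.0]
target = apply_filter([-1.0, 0.0, 1.0], x)
w = learn_filter(x, target)  # converges toward [-1, 0, 1]
```

The trade-off stated above shows up directly: the handcrafted route needs expertise to pick the kernel, while the learned route needs enough (input, target) data to recover it.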
As deep learning models become more accurate and widely applicable, benchmark tasks like ImageNet classification have become routine, and face recognition has scaled from small datasets to hundreds of millions of identities.
Applications
Tencent Cloud’s image recognition capabilities are deployed in multi‑label classification, sensitive image moderation (pornography, violence, political figures), medical imaging analysis, structured recognition of people/vehicles/objects, and OCR. Sensitive image moderation achieves up to 99% accuracy. In healthcare, collaborations with over 100 hospitals support early screening for lung and esophageal cancers. OCR is widely used in finance, hospitality, logistics, and ID verification.
Further Reading
1. Rafael C. Gonzalez and Richard E. Woods. 2007. Digital Image Processing (3rd Edition). Prentice‑Hall.
This book is a classic for understanding the fundamentals of imaging and image analysis.
2. CS231n: Convolutional Neural Networks for Visual Recognition (by Fei‑Fei Li et al.).
The course provides comprehensive materials on CNNs and their evolution, with valuable assignments.
3. Various open‑source projects (search “image analysis” on GitHub).
Tencent Cloud Developer