Tag: Clip

Tencent Cloud Developer
Oct 30, 2024 · Artificial Intelligence

Comprehensive Survey of AIGC Research: Papers, Resources, and Technical Overview

This survey is a portal that organizes AIGC research into seven areas: text generation, image generation, audio generation, cross‑modal association, text‑guided image synthesis, text‑guided audio synthesis, and supporting resources. It covers seminal models such as GPT, diffusion models, CLIP, DALL·E, Stable Diffusion, and MusicLM, along with the key papers that shaped each field.

AIGC · Clip · Computer Vision
0 likes · 19 min read
Bilibili Tech
Aug 27, 2024 · Artificial Intelligence

Multimodal Video Scene Classification for Adaptive Video Processing

The paper presents a multimodal video scene classification system that leverages CLIP‑generated pseudo‑labels and a fine‑tuned image encoder to automatically identify nature, animation/game, and document scenes, enabling more effective adaptive transcoding, intelligent restoration, and quality assessment for user‑generated content on platforms such as Bilibili.

Bilibili multimedia · Clip · Computer Vision
0 likes · 17 min read
Sohu Tech Products
May 21, 2024 · Artificial Intelligence

OPPO Multimodal Pretrained Model Deployment in Cloud-Edge Scenarios: Practices and Optimizations

OPPO details how it deploys multimodal pretrained models on resource‑constrained edge devices by compressing CLIP‑based image‑text retrieval, adapting Chinese text‑to‑image generation with LoRA and adapters, and lightweighting diffusion models through layer pruning and progressive distillation, achieving sub‑3‑second generation while preserving cloud‑level quality.

Clip · LoRA · OPPO
0 likes · 18 min read
Ximalaya Technology Team
Feb 1, 2024 · Artificial Intelligence

Understanding AI Image Generation: Diffusion Models, CLIP, and Control Techniques

This guide explains how AI image generators such as Stable Diffusion and DALL·E 3 turn text prompts into pictures by using diffusion models, CLIP‑aligned embeddings, and optional controls like negative prompts, fine‑tuned LoRA checkpoints and ControlNet conditioning, highlighting their differences, workflow, and practical customization.

AI image generation · Clip · ControlNet
0 likes · 18 min read
Zhuanzhuan Tech
Nov 29, 2023 · Artificial Intelligence

Applying CLIP and Milvus for Image Similarity Search in E‑commerce Risk Control

The article explains how an e‑commerce risk‑control team leverages OpenAI's CLIP model to generate image and text embeddings and stores them in the Milvus cloud‑native vector database to enable fast, scalable similarity searches for compliance verification and risk detection.

AI · Clip · Milvus
0 likes · 11 min read
DataFunTalk
Nov 24, 2023 · Artificial Intelligence

Open Vocabulary Detection Contest 2023: Summary of Winning Teams' Technical Solutions

The article reviews the Open Vocabulary Detection Contest organized by the Chinese Society of Image and Graphics and 360 AI Institute, describing the competition setup, dataset characteristics, and detailed winning approaches that combine Detic, CLIP, prompt learning, and multi‑stage pipelines to achieve strong few‑shot and zero‑shot object detection performance.

Clip · Computer Vision · Open Vocabulary Detection
0 likes · 17 min read
360 Tech Engineering
May 6, 2023 · Artificial Intelligence

Open‑Vocabulary Object Detection: Overview of OVR‑CNN, RegionCLIP, and CORA

This article reviews the evolution of open‑vocabulary object detection, describing the OVR‑CNN paradigm, the RegionCLIP enhancements, and the CORA model with region prompting and anchor pre‑matching, and discusses their impact on future multimodal AI systems.

CORA · Clip · Multimodal Models
0 likes · 14 min read
IT Services Circle
Jun 6, 2022 · Artificial Intelligence

AI Image Generation Showdown: Google Imagen vs OpenAI DALL·E on the "Tiger Wearing VR" Prompt

The article compares Google’s Imagen and OpenAI’s DALL·E by feeding them the whimsical "Tiger Wearing VR" prompt, showcasing each model’s visual style, underlying architecture—including CLIP, diffusion, and T5‑XXL—and community reactions to the resulting AI‑generated artwork.

AI · Clip · Google Imagen
0 likes · 5 min read
DaTaobao Tech
May 24, 2022 · Artificial Intelligence

GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection

GEN‑VLKT introduces a Guided‑Embedding Network with position‑ and instance‑guided embeddings to remove costly post‑processing and leverages CLIP‑based visual‑linguistic knowledge transfer for interaction understanding, achieving state‑of‑the‑art HOI detection performance and zero‑shot capability, now deployed in Alibaba’s Taobao services.

Clip · Computer Vision · HOI detection
0 likes · 7 min read
Efficient Ops
Oct 19, 2015 · Operations

Step-by-Step Guide to Installing and Using Clip Server and SDK on Linux

This article provides a comprehensive tutorial on installing the Clip Server (Apache, PHP, MySQL), configuring its virtual host, setting up the Clip SDK with Python, and using various Clip commands to manage IP relationships, all illustrated with command examples and screenshots.

Clip · Installation · Linux
0 likes · 12 min read