
Building a Juejin Author Profile Bot with Coze: Data Collection, Processing, and AI Summarization

This tutorial walks through creating a Coze bot that fetches a Juejin author's information and articles via web scraping, processes the data to generate a concise author profile, identifies expertise domains, ranks hot articles, and finally publishes the bot for interactive use.

Rare Earth Juejin Tech Community

1. Introduction

This article introduces a Coze bot project that automatically generates a Juejin author profile: it collects the author's basic information and the metadata of all their articles, then summarizes the data with AI.

2. Data Acquisition

2.1 User Information

The API that returns user details is located with the browser's developer tools. A Python call through requests_async then fetches fields such as username, description, registration time, follower count, article count, digg (like) count, and view count.
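As a rough illustration of the field extraction described above, the sketch below maps an already decoded JSON response onto a small record. The network call itself is left out. Every key name (`data`, `user_name`, `register_time`, `post_article_count`, `got_digg_count`, `got_view_count`) and the `UserInfo` class are assumptions made for this example, not confirmed names from the Juejin API, and the registration time is assumed to be a Unix timestamp in seconds.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Mapping


@dataclass
class UserInfo:
    username: str
    description: str
    registered_at: str  # formatted as YYYY-MM-DD (UTC)
    follower_count: int
    article_count: int
    digg_count: int
    view_count: int


def parse_user_info(payload: Mapping[str, Any]) -> UserInfo:
    """Map a decoded JSON response onto the fields listed in the text.

    All key names used here are hypothetical placeholders.
    """
    data = payload.get("data") or {}
    # Assumption: registration time is a Unix timestamp in seconds.
    ts = int(data.get("register_time", 0))
    registered = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
    return UserInfo(
        username=str(data.get("user_name", "")),
        description=str(data.get("description", "")),
        registered_at=registered,
        follower_count=int(data.get("follower_count", 0)),
        article_count=int(data.get("post_article_count", 0)),
        digg_count=int(data.get("got_digg_count", 0)),
        view_count=int(data.get("got_view_count", 0)),
    )
```

Missing keys fall back to empty strings or zero, so a partial response still yields a usable record instead of raising.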

2.2 Article Information

The article list API is identified, and a POST request with JSON payload is used to retrieve paginated article data. A code node iterates through pages, extracts title, brief content, view/collect/digg/comment counts, tags, and constructs a list of ArticleInfo objects.
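The page-by-page loop could look roughly like the sketch below. To keep it runnable without network access, the POST request is abstracted behind a `fetch_page` callable that takes a cursor and returns the decoded JSON. The response keys (`data`, `article_info`, `brief_content`, `tag_name`, `has_more`, `cursor`), the starting cursor `"0"`, and the `max_pages` safety cap are assumptions for illustration; only the name `ArticleInfo` comes from the text.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Mapping


@dataclass
class ArticleInfo:
    title: str
    brief: str
    view_count: int
    collect_count: int
    digg_count: int
    comment_count: int
    tags: List[str] = field(default_factory=list)


def collect_articles(
    fetch_page: Callable[[str], Mapping[str, Any]],
    max_pages: int = 50,
) -> List[ArticleInfo]:
    """Walk the paginated list until the API reports no more pages."""
    articles: List[ArticleInfo] = []
    cursor = "0"  # assumed starting cursor
    for _ in range(max_pages):  # hard cap guards against endless loops
        page = fetch_page(cursor)
        for item in page.get("data") or []:
            info = item.get("article_info") or {}
            articles.append(
                ArticleInfo(
                    title=str(info.get("title", "")),
                    brief=str(info.get("brief_content", "")),
                    view_count=int(info.get("view_count", 0)),
                    collect_count=int(info.get("collect_count", 0)),
                    digg_count=int(info.get("digg_count", 0)),
                    comment_count=int(info.get("comment_count", 0)),
                    tags=[str(t.get("tag_name", "")) for t in item.get("tags") or []],
                )
            )
        if not page.get("has_more"):
            break
        cursor = str(page.get("cursor", ""))
    return articles
```

Passing the fetcher in as a parameter means the same loop can be exercised with canned pages before it is wired to the real request.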

2.3 Parallel Crawling

To avoid timeouts when an author has many articles, the crawling task is split across multiple parallel code nodes, each handling a subset of the page cursors; a final node merges their results.
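One way to picture the split and the merge is sketched below. It assumes that cursors are numeric offsets in multiples of the page size, which the source does not state, and the helper names `split_cursors` and `merge_results` are made up for this example. In Coze the parallelism comes from the workflow branches themselves; this code only computes which cursors each branch would take and how the final node could combine the partial lists.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_cursors(total_articles: int, page_size: int, workers: int) -> List[List[int]]:
    """Assign contiguous runs of page cursors to each parallel node."""
    if page_size < 1 or workers < 1:
        raise ValueError("page_size and workers must be positive")
    # Assumption: cursor n means "start at article offset n".
    cursors = list(range(0, total_articles, page_size))
    chunk = -(-len(cursors) // workers)  # ceiling division
    return [cursors[i * chunk:(i + 1) * chunk] for i in range(workers)]


def merge_results(parts: Sequence[Sequence[T]]) -> List[T]:
    """Concatenate per-node results, keeping node order."""
    merged: List[T] = []
    for part in parts:
        merged.extend(part)
    return merged


# 45 articles, 10 per page, 3 nodes
print(split_cursors(45, 10, 3))  # [[0, 10], [20, 30], [40]]
```

Contiguous chunks keep the merged list in the original article order, so no re-sorting is needed after the merge.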

3. Data Processing

3.1 Author Profile Generation

The collected article abstracts are sent to a large language model (e.g., Zhipu AI) via an HTTP request to generate a concise author portrait (max 200 characters).
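The prompt side of this step might be prepared as in the sketch below. It deliberately stops short of the HTTP call, since the model endpoint and request format are not given in the source. The function names, the prompt wording, and the `input_budget` cap on total abstract length are all assumptions; only the 200-character limit comes from the text.

```python
from typing import Iterable, List


def build_portrait_prompt(
    abstracts: Iterable[str],
    max_chars: int = 200,
    input_budget: int = 6000,
) -> str:
    """Join abstracts into one prompt, stopping at a rough size budget."""
    lines: List[str] = []
    used = 0
    for text in abstracts:
        text = " ".join(text.split())  # collapse whitespace and newlines
        if not text:
            continue
        if used + len(text) > input_budget:
            break  # hypothetical budget to keep the request small
        lines.append(f"- {text}")
        used += len(text)
    header = (
        "Based on the article abstracts below, write a concise portrait "
        f"of the author in at most {max_chars} characters."
    )
    return header + "\n" + "\n".join(lines)


def clamp_portrait(text: str, max_chars: int = 200) -> str:
    """Enforce the length cap in case the model overshoots."""
    text = text.strip()
    if len(text) <= max_chars:
        return text
    return text[: max_chars - 1].rstrip() + "…"
```

Clamping the reply afterwards is a defensive choice: a length instruction in a prompt is a request, not a guarantee.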

3.2 Expertise Domain Analysis

A script counts tag occurrences across all articles, calculates percentages, merges low‑frequency tags into an "Other" category, and formats the top four domains with their share and article count.
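A minimal sketch of this tag analysis follows. The 5% default threshold for "low-frequency", the output format, and the choice to count each tag once per article are assumptions; the source only specifies counting, percentages, an "Other" bucket, and a top four.

```python
from collections import Counter
from typing import Iterable, List


def summarize_domains(
    article_tags: Iterable[Iterable[str]],
    top_n: int = 4,
    min_share: float = 0.05,
) -> List[str]:
    """Return formatted lines for the most common tags."""
    counts: Counter = Counter()
    for tags in article_tags:
        counts.update(set(tags))  # count each tag once per article
    total = sum(counts.values())
    if total == 0:
        return []
    merged: Counter = Counter()
    for tag, n in counts.items():
        # Tags below the (assumed) threshold are folded into "Other".
        key = tag if n / total >= min_share else "Other"
        merged[key] += n
    # Sort by count, then name, so ties are ordered deterministically.
    ranked = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
    return [f"{tag}: {n / total * 100:.1f}% ({n} articles)" for tag, n in ranked]
```

Note that the share is relative to all tag occurrences rather than to the number of articles, and the "Other" figure sums tag occurrences, so an article carrying two rare tags is counted twice there.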

3.3 Hot Article Ranking

Each article receives a custom heat score H = 0.15*R/100 + 0.25*L + 0.35*C + 0.25*F (R: views, L: likes, C: comments, F: collections). Articles are sorted by H in descending order, and the top ten are formatted as markdown-style links.
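The scoring and ranking can be sketched as follows. The formula is the one given above; the dictionary keys (`title`, `url`, `views`, `likes`, `comments`, `collections`) and the exact link format are assumptions for the example.

```python
from typing import Any, List, Mapping


def heat_score(views: float, likes: float, comments: float, collections: float) -> float:
    """H = 0.15*R/100 + 0.25*L + 0.35*C + 0.25*F."""
    return 0.15 * views / 100 + 0.25 * likes + 0.35 * comments + 0.25 * collections


def top_hot_articles(articles: List[Mapping[str, Any]], limit: int = 10) -> List[str]:
    """Rank articles by heat and format the leaders as markdown links."""
    scored = [
        (heat_score(a["views"], a["likes"], a["comments"], a["collections"]), a)
        for a in articles
    ]
    # Sort on the score only; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        f"{rank}. [{a['title']}]({a['url']}) (heat {score:.1f})"
        for rank, (score, a) in enumerate(scored[:limit], start=1)
    ]
```

As a worked case, an article with 1000 views, 10 likes, 4 comments, and 6 collections scores 1.5 + 2.5 + 1.4 + 1.5 = 6.9. Dividing views by 100 keeps raw view counts, which are typically much larger than the other metrics, from dominating the score.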

4. Final Output Assembly

The three result strings (author portrait, expertise domains, hot articles) are concatenated with the basic user info to produce a comprehensive response, which is then returned by the bot.
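The concatenation step is simple enough to show in a few lines. The section titles and the function name `assemble_reply` are hypothetical; the source only says that the three result strings are joined with the basic user info.

```python
def assemble_reply(basic_info: str, portrait: str, domains: str, hot_articles: str) -> str:
    """Join the non-empty sections into one response, separated by blank lines."""
    sections = [
        ("Basic information", basic_info),
        ("Author portrait", portrait),
        ("Expertise domains", domains),
        ("Hot articles", hot_articles),
    ]
    parts = [f"{title}\n{body.strip()}" for title, body in sections if body.strip()]
    return "\n\n".join(parts)
```

Skipping empty sections avoids dangling headings when, for example, an author has no tagged articles.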

5. Bot Publication

The completed workflow is saved as a Coze bot, configured to trigger when a user sends a Juejin author URL, and published on the Juejin platform for interactive queries.
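Before the workflow can run, the author's ID has to be pulled out of the URL the user sends. A minimal sketch, assuming author pages follow the pattern `juejin.cn/user/<numeric id>` (an assumption about the URL shape, with a hypothetical helper name):

```python
import re
from typing import Optional

# Assumed URL shape: https://juejin.cn/user/<digits>[/...]
_USER_URL = re.compile(r"juejin\.cn/user/(\d+)")


def extract_user_id(message: str) -> Optional[str]:
    """Return the numeric user ID found in a chat message, if any."""
    match = _USER_URL.search(message)
    return match.group(1) if match else None
```

Returning `None` for messages without a matching URL lets the bot reply with a usage hint instead of starting the crawl.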

Tags: Python, AI, Data Processing, Coze, Web Scraping, Bot, Juejin