
How to Log Baidu Spider Visits in ThinkPHP6

This article explains how to add a base controller method in ThinkPHP6 that detects search engine spider user‑agents, builds the request URL, obtains the real client IP, and stores the spider name, URL, and IP into a BaiduLog model for logging purposes.

php中文网 Courses

In ThinkPHP6 you can record visits from search engine spiders such as Baidu, Google, Bing, and others by adding a method to a base controller that checks the user‑agent string and stores the data.

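The initialize method calls parent::initialize(), stops with a message if the site is closed, and then invokes baiduLog(). The baiduLog method reads the user‑agent, rebuilds the current URL from the controller, action, and request parameters, obtains the real IP, and matches the user‑agent against known spider signatures. When a spider is recognized, it inserts a record into the BaiduLog model with the spider name as title, the URL as href, and the client address as ip.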

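<code>public function initialize()
{
    parent::initialize();
    // Stop early if the site has been switched off in the configuration
    if ($this->Config['web_status'] == 0) {
        die('网站已经关闭'); // message: "The website is closed"
    }
    $this->baiduLog();
}

/**
 * Record a visit when the User-Agent matches a known search engine spider.
 */
protected function baiduLog()
{
    // Not every client sends a User-Agent header, so fall back to an empty string
    $useragent = strtolower($_SERVER['HTTP_USER_AGENT'] ?? '');
    // Rebuild the current URL from controller/action plus the request parameters
    $url = $this->request->controller() . "/" . $this->request->action();
    $param = input("param.", "", "htmlspecialchars");
    $url = (string) url($url, $param);
    $ip = get_real_ip();
    $title = "";
    if (strpos($useragent, 'googlebot') !== false) {
        $title = 'Google';
    } elseif (strpos($useragent, 'baiduspider') !== false) {
        $title = 'Baidu';
    } elseif (strpos($useragent, 'bingbot') !== false || strpos($useragent, 'msnbot') !== false) {
        // Bing's current crawler identifies as bingbot; msnbot is the legacy name
        $title = 'Bing';
    } elseif (strpos($useragent, 'slurp') !== false) {
        $title = 'Yahoo';
    } elseif (strpos($useragent, 'sosospider') !== false) {
        $title = 'Soso';
    } elseif (strpos($useragent, 'sogou spider') !== false) {
        $title = 'Sogou';
    } elseif (strpos($useragent, 'yodaobot') !== false) {
        $title = 'Yodao';
    }
    // To log every visitor (if table growth is not a concern), add a final branch:
    // else { $title = $useragent; }
    if (!empty($title)) {
        // The BaiduLog model class must be imported at the top of the controller file
        BaiduLog::create(["title" => $title, "href" => $url, "ip" => $ip]);
    }
}
</code>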
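The spider check can also be written as a lookup table, which makes it easier to add or remove signatures later. The following is only a sketch: detect_spider() is a hypothetical helper name that does not appear in the original code, and the 'bingbot' entry is an addition alongside the article's 'msnbot' signature.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: map a User-Agent string to a spider name.
 * Returns an empty string when no known signature matches.
 */
function detect_spider(string $userAgent): string
{
    // Lowercase signature => display name; the first match wins.
    $signatures = [
        'googlebot'    => 'Google',
        'baiduspider'  => 'Baidu',
        'bingbot'      => 'Bing',   // assumption: added for Bing's current crawler
        'msnbot'       => 'Bing',
        'slurp'        => 'Yahoo',
        'sosospider'   => 'Soso',
        'sogou spider' => 'Sogou',
        'yodaobot'     => 'Yodao',
    ];

    $userAgent = strtolower($userAgent);
    foreach ($signatures as $needle => $name) {
        if (strpos($userAgent, $needle) !== false) {
            return $name;
        }
    }

    return '';
}
```

Inside baiduLog() the whole if/elseif chain could then shrink to `$title = detect_spider($_SERVER['HTTP_USER_AGENT'] ?? '');`. Keep in mind that a user‑agent string is supplied by the client and can be forged, so this identifies self-declared spiders only.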

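The helper function get_real_ip() returns the client’s real IP address. It is not one of ThinkPHP6's built-in helpers and its body is not shown in this article, so it has to be defined in the application itself, for example in app/common.php.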
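Since the body of get_real_ip() is not given, here is one possible shape for it, offered as a minimal sketch and not as the author's implementation. The optional $server parameter is an assumption added so the function can be exercised without a live request, and the choice and order of headers is also an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative stand-in for get_real_ip(); the article's own version is not shown.
 * $server is a hypothetical parameter that defaults to $_SERVER.
 */
function get_real_ip(?array $server = null): string
{
    $server = $server ?? $_SERVER;

    // Headers a reverse proxy may set, checked before the socket address.
    // Only trust them when the app really sits behind a proxy you control,
    // because clients can send these headers themselves.
    foreach (['HTTP_X_FORWARDED_FOR', 'HTTP_X_REAL_IP', 'REMOTE_ADDR'] as $key) {
        if (empty($server[$key])) {
            continue;
        }
        // X-Forwarded-For may hold "client, proxy1, proxy2"; take the first entry.
        $candidate = trim(explode(',', (string) $server[$key])[0]);
        if (filter_var($candidate, FILTER_VALIDATE_IP) !== false) {
            return $candidate;
        }
    }

    // Nothing usable was found.
    return '0.0.0.0';
}
```

If the application is not behind a proxy, ThinkPHP6's own `$this->request->ip()` may be a simpler choice than a custom helper.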

Written by

php中文网 Courses
