How to Build a Python Scraper for Mikan Anime Site and Download Torrents

This tutorial walks through building a Python web scraper that fetches anime torrent links from the Mikan Project site, handles pagination, bypasses anti‑scraping measures, parses pages with lxml, and saves the .torrent files locally, complete with code snippets and screenshots.

Python Crawling & Data Mining

Project Background

The Mikan Project is a next‑generation anime download site that provides the latest anime resources and daily curated recommendations.

Project Goal

Automatically obtain anime torrent links from the site and save them to local files.

Libraries and Target Site

Target URL pattern: https://mikanani.me/Home/Classic/{}

Key Python libraries used:

requests

lxml

fake_useragent

IDE: PyCharm

Project Analysis

To crawl multiple pages, the URL for the next page increments the number after Classic/. By replacing the variable part with {} and iterating with a for loop, we can request each page sequentially.
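The pagination idea above can be sketched on its own, without any network access. The helper name `build_page_urls` is hypothetical and not part of the original project; it simply fills the `{}` placeholder for each page number in an inclusive range.

```python
BASE_URL = "https://mikanani.me/Home/Classic/{}"  # URL pattern from the article


def build_page_urls(start, end):
    """Return the listing-page URLs for pages start..end (inclusive)."""
    if end < start:
        # Assumption: an empty or reversed range is treated as a caller error.
        raise ValueError("end must be greater than or equal to start")
    return [BASE_URL.format(page) for page in range(start, end + 1)]
```

Calling `build_page_urls(1, 3)` yields the URLs for pages 1, 2 and 3, which can then be requested one after another.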

Anti‑Scraping Measures

Use realistic HTTP request headers.

Generate random User‑Agent strings with fake_useragent.
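As a rough illustration of the two measures above, the sketch below picks a random User‑Agent from a small hard-coded pool using only the standard library. This is a stand-in for fake_useragent, not the article's method; the pool contents, the extra `Accept-Language` header, and the name `build_headers` are all assumptions made for the example.

```python
import random

# Hypothetical pool of User-Agent strings; swap in current browser values as needed.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


def build_headers():
    """Return request headers with a randomly chosen User-Agent."""
    return {
        "User-Agent": random.choice(USER_AGENTS),
        # Assumption: one extra browser-like header to make the request look realistic.
        "Accept-Language": "en-US,en;q=0.9",
    }
```

Calling `build_headers()` before each request gives every request its own randomly selected User‑Agent.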

Implementation Steps

1. Class Definition and Imports

import requests
from lxml import etree
from fake_useragent import UserAgent

class Mikan(object):
    def __init__(self):
        self.url = "https://mikanani.me/Home/Classic/{}"

    def main(self):
        pass

if __name__ == '__main__':
    scraper = Mikan()
    scraper.main()

2. Main Method – Loop Through Pages

def main(self):
    # Ask for the page range, then build the URL of each listing page
    start = int(input("start: "))
    end = int(input("end: "))
    for page in range(start, end + 1):
        url = self.url.format(page)
        print(url)

3. Random User‑Agent Generation

# Inside __init__: create the generator once, then read a random User-Agent from it
ua = UserAgent()
self.headers = {
    'User-Agent': ua.random,
}

4. Send Request and Get Response

def get_page(self, url):
    # A timeout keeps the scraper from hanging on an unresponsive server
    res = requests.get(url=url, headers=self.headers, timeout=10)
    html = res.content.decode("utf-8")
    return html

5. Parse First‑Level Page with XPath

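def parse_page(self, html):
    parse_html = etree.HTML(html)
    # Relative links to the detail pages (third column of the listing table)
    one = parse_html.xpath('//tbody//tr//td[3]/a/@href')
    for li in one:
        detail_url = "https://mikanani.me" + li
        # Fetch and parse the second-level (detail) page
        parse_html2 = etree.HTML(self.get_page(detail_url))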
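To see what this XPath selects without hitting the site, here is a minimal sketch run against inline HTML. The sample markup is hypothetical and only mimics the table shape the expression expects; the real page may differ. The helper `extract_detail_urls` is also an assumption, and it uses `urllib.parse.urljoin` instead of plain string concatenation so that already-absolute links are left intact.

```python
from urllib.parse import urljoin

from lxml import etree

# Hypothetical markup: each row has the detail link in its third cell.
SAMPLE_HTML = """
<table><tbody>
  <tr><td>2024/01/01</td><td>1.0GB</td><td><a href="/Home/Episode/aaa">Show 01</a></td></tr>
  <tr><td>2024/01/02</td><td>1.1GB</td><td><a href="/Home/Episode/bbb">Show 02</a></td></tr>
</tbody></table>
"""


def extract_detail_urls(html, base="https://mikanani.me"):
    """Return absolute detail-page URLs found in the third table column."""
    tree = etree.HTML(html)
    hrefs = tree.xpath('//tbody//tr//td[3]/a/@href')
    # urljoin resolves relative paths against the site root
    return [urljoin(base, href) for href in hrefs]
```

Running `extract_detail_urls(SAMPLE_HTML)` returns one absolute URL per table row, in document order.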

6. Parse Second‑Level Page for Torrent Links

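# Continues inside the loop from step 5, using the parsed detail page
two = parse_html2.xpath('//body')
for i in two:
    # Episode title shown on the detail page
    title = i.xpath('.//p[@class="episode-title"]//text()')[0].strip()
    # Relative path of the .torrent file (first link in the left sidebar)
    torrent_path = i.xpath('.//div[@class="leftbar-nav"]/a[1]/@href')[0].strip()
    torrent_url = "https://mikanani.me" + torrent_path
    print(torrent_url)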

7. Save Torrent File

# The "./种子/" ("torrents") directory must already exist
filename = "./种子/" + title[:15] + title[-20:] + '.torrent'
content = requests.get(url=torrent_url, headers=self.headers).content
with open(filename, 'wb') as f:
    f.write(content)
    print("\n%s downloaded successfully" % title)
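One practical concern with the save step is that episode titles can contain characters that are not valid in file names, and the target folder may not exist yet. The sketch below is one possible way to handle both; the helper name `safe_torrent_path`, the default `./torrents` folder, and the set of replaced characters are assumptions, not part of the original code. It keeps the article's truncation (first 15 plus last 20 characters) but applies it only to titles longer than 35 characters, so short titles are not duplicated.

```python
import os
import re


def safe_torrent_path(title, directory="./torrents"):
    """Build a filesystem-safe .torrent path and make sure the folder exists."""
    # Replace characters that are invalid in file names on common systems.
    cleaned = re.sub(r'[\\/:*?"<>|\r\n]+', "_", title).strip()
    # Mirror the article's truncation, but only when the title is long enough.
    if len(cleaned) > 35:
        cleaned = cleaned[:15] + cleaned[-20:]
    # Create the output folder if it is missing.
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, cleaned + ".torrent")
```

The returned path can be passed straight to `open(path, 'wb')` in the save step.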

8. Execute Workflow

html = self.get_page(url)
self.parse_page(html)
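The two calls above run once per page inside the loop from step 2. A hedged sketch of that loop, with a pause between requests to limit load on the server, might look like the following. The function `crawl_pages` and its `fetch`, `handle`, and `delay` parameters are hypothetical; in the article's class, `fetch` would correspond to `self.get_page` and `handle` to `self.parse_page`.

```python
import time


def crawl_pages(urls, fetch, handle, delay=1.0):
    """Fetch each URL in turn, pausing between requests to limit server load."""
    processed = 0
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay)  # pause only between requests, not before the first
        try:
            html = fetch(url)
        except Exception as exc:  # keep going if a single page fails
            print(f"skipped {url}: {exc}")
            continue
        handle(html)
        processed += 1
    return processed
```

Passing the fetch and parse steps in as callables keeps the loop easy to try out with stub functions before pointing it at the real site.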

Effect Demonstration

Running the program prompts for start and end pages, then prints each generated URL.

Successful download messages appear in the console.

The .torrent files are saved locally.

To open a torrent file, upload it to a cloud storage service (e.g., Baidu Cloud) and then double‑click to start the download.

Conclusion

Avoid excessive crawling to prevent server overload.

This guide demonstrates Python techniques for scraping the Mikan Project, handling anti‑scraping measures, and downloading torrent files.

It also covers string concatenation, list type conversion, and practical debugging tips.

Hands‑on practice is essential for deeper understanding.

The Mikan Project also offers daily anime recommendations.

Written by

Python Crawling & Data Mining