Can AI Learn to Write Like a Chinese Novelist? Exploring Deep Learning in Literature
This article examines how a deep‑learning text generator, Andrej Karpathy's recurrent network, progressively learns to imitate Chinese wuxia novels, Tang poetry, and web fiction. It starts from the symbolic and statistical approaches to NLP and ends with the model's surprising advances and inherent limitations.
Robot Writing Approach
Language distinguishes humans from machines, and the Turing test uses language to assess intelligence.
NLP
Natural Language Processing (NLP) studies how machines can understand and produce human text. Two main approaches exist: the symbolic method, which builds explicit grammatical and lexical rules, and the statistical method, which lets machines discover patterns by ingesting large text corpora.
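To make the statistical approach concrete, here is a minimal sketch of about the simplest possible version: a character bigram model that counts which character follows which and then samples from those counts. The function names (`train_bigram`, `generate`) and the toy corpus are illustrative assumptions, not anything taken from the original article.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every character, which characters follow it in the corpus."""
    counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, start, length, seed=0):
    """Sample text by repeatedly drawing a follower in proportion to its count."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:  # dead end: nothing ever followed this character
            break
        chars = list(followers)
        weights = [followers[c] for c in chars]
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)


corpus = "the cat sat on the mat. the cat ate."  # toy corpus (assumption)
model = train_bigram(corpus)
sample = generate(model, "t", 20)
print(sample)
```

No grammar rule is written down anywhere; whatever structure the output has comes entirely from the counts. A symbolic system would instead encode rules such as "an article precedes a noun" by hand.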
Historically, limited computing power favored symbolic methods, but the rise of deep learning in the past decade has enabled statistical methods to dominate.
Karpathy Model
Andrej Karpathy’s 2015 recurrent neural‑network model, released on GitHub, is a compact yet powerful example. With only a few thousand lines of code and no predefined grammar or vocabulary, it learns from raw text alone: it reads the corpus one character at a time, learns which characters tend to follow which, and uses those learned probabilities to generate new sequences.
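The core idea of a character‑level recurrent network can be sketched in a few dozen lines. The code below is not Karpathy's code; it is a simplified, untrained vanilla RNN step written with the Python standard library only, with biases and the training loop (backpropagation through time) omitted. All names (`build_vocab`, `init_params`, `step`, `sample`) and the hidden size are assumptions chosen for illustration.

```python
import math
import random


def build_vocab(text):
    """The vocabulary is simply every distinct character seen in the corpus."""
    chars = sorted(set(text))
    return chars, {c: i for i, c in enumerate(chars)}


def init_params(vocab_size, hidden_size, seed=0):
    """Small random weight matrices stored as nested lists (biases omitted)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0, 0.01) for _ in range(cols)] for _ in range(rows)]

    return {
        "Wxh": mat(hidden_size, vocab_size),   # input -> hidden
        "Whh": mat(hidden_size, hidden_size),  # hidden -> hidden (the recurrence)
        "Why": mat(vocab_size, hidden_size),   # hidden -> next-character scores
    }


def step(params, h, char_index):
    """One recurrent step: update the hidden state, return next-char probabilities."""
    # With a one-hot input, Wxh @ x reduces to picking column `char_index`.
    new_h = [
        math.tanh(
            params["Wxh"][j][char_index]
            + sum(w * hv for w, hv in zip(params["Whh"][j], h))
        )
        for j in range(len(h))
    ]
    logits = [sum(w * hv for w, hv in zip(row, new_h)) for row in params["Why"]]
    top = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return new_h, [e / total for e in exps]


def sample(params, chars, index_of, seed_char, length, hidden_size, seed=0):
    """Generate text by feeding each sampled character back in as the next input."""
    rng = random.Random(seed)
    h = [0.0] * hidden_size
    idx = index_of[seed_char]
    out = [seed_char]
    for _ in range(length - 1):
        h, probs = step(params, h, idx)
        idx = rng.choices(range(len(chars)), weights=probs, k=1)[0]
        out.append(chars[idx])
    return "".join(out)


chars, index_of = build_vocab("hello world")
params = init_params(len(chars), hidden_size=8)
print(sample(params, chars, index_of, "h", 20, hidden_size=8))
```

Because the weights here are random, the output is gibberish. Training adjusts the three matrices so that the predicted probabilities match the characters that actually come next in the corpus.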
Robot’s Path to Becoming a Novelist
"All‑Mediocre" Learns Jin Yong
The robot first studies Jin Yong’s The Legend of the Condor Heroes (≈800 k characters). After a few seconds of training it produces gibberish, then gradually improves: recognizing frequent characters, adding punctuation, and eventually forming simple sentences with correct word order and basic subject‑verb‑object structure.
Even after millions of repetitions, the output remains limited, resembling the speech of a one‑year‑old child.
"Ji‑Long" Learns Gu Long
Training on Gu Long’s complete works (≈17 M characters) yields more fluent text. The robot captures stylistic differences:
Gu Long’s paragraphs are shorter, often a single sentence.
His style features more psychological description and modern language.
He uses concise, “cool” dialogue.
Although the robot lacks true understanding, it reproduces these subtle stylistic cues.
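Surface cues such as paragraph length are easy to measure, which helps explain why a purely statistical model can pick them up. The sketch below computes a few such statistics; the function name `paragraph_stats`, the sentence‑ending punctuation set, and the two toy texts are assumptions for illustration, not excerpts from either author.

```python
import re
from statistics import mean

# Sentence-ending punctuation, Chinese and Western (an assumed, incomplete set).
SENTENCE_END = re.compile(r"[。！？!?.]+")


def paragraph_stats(text):
    """Summarize paragraph length for a text with one paragraph per line."""
    paragraphs = [p.strip() for p in text.split("\n") if p.strip()]
    if not paragraphs:
        raise ValueError("text contains no paragraphs")
    sentence_counts = []
    for p in paragraphs:
        sentences = [s for s in SENTENCE_END.split(p) if s.strip()]
        sentence_counts.append(max(1, len(sentences)))
    return {
        "paragraphs": len(paragraphs),
        "mean_chars": mean(len(p) for p in paragraphs),
        "mean_sentences": mean(sentence_counts),
        "single_sentence_share": sum(1 for n in sentence_counts if n == 1)
        / len(paragraphs),
    }


# Toy inputs written for this example only.
terse = "He came.\nHe left.\nNobody spoke."
dense = "He came at dawn. The inn was empty. He waited.\nThen he left."

print(paragraph_stats(terse))
print(paragraph_stats(dense))
```

Run on two real corpora, a higher `single_sentence_share` for one of them would be the kind of regularity the model can reproduce without understanding what the sentences mean.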
Robot Poet
When trained on the Complete Tang Poems, the robot generates verses that follow tonal patterns and exhibit reasonable imagery, though rhyme is often missing.
Robot Writes Web Novels
Training on the popular web novel Dou Po Cang Qiong (≈6 M characters) produces relatively coherent prose, because the language is simple and repetitive, which suits statistical learning.
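One crude way to put a number on "repetitive" is the share of character pairs that repeat an earlier pair in the same text. This is only a rough proxy chosen for illustration, not a measure used in the original article, and `bigram_repetition` is a hypothetical name.

```python
def bigram_repetition(text):
    """Fraction of character bigrams that duplicate an earlier bigram (0.0 to 1.0)."""
    bigrams = [text[i:i + 2] for i in range(len(text) - 1)]
    if not bigrams:
        return 0.0
    return 1 - len(set(bigrams)) / len(bigrams)


print(bigram_repetition("abababab"))  # highly repetitive toy string
print(bigram_repetition("abcdefgh"))  # no repeated pairs
```

The ratio grows with text length, so it is only meaningful when comparing samples of equal size, for example the first 100,000 characters of each corpus.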
Robot Limitations and Human Trainers
The Karpathy model demonstrates the ceiling of purely statistical learning: it can mimic surface style but cannot convey deep meaning or emotion. Real‑world commercial language models are far more complex.
Effective language acquisition requires interactive feedback. Microsoft's chatbot Tay (2016), which learned from whatever users sent it with no corrective signal in return, quickly devolved into profanity, illustrating the need for balanced, bidirectional training.
Large tech companies leverage massive user bases to provide continuous feedback, turning everyday users into inadvertent trainers for ever‑improving AI.
Author: _dailu_ – Original title: When AI Talks About Writing, What Is It Actually Saying?
