How AI Can Act as a Judge: Deep Learning Methods for Predicting Fines and Legal Articles
This article presents a comprehensive solution for the CCF BDCI 2017 "AI Judge" competition, detailing data analysis, preprocessing, multiple deep‑learning models (TextCNN, RNN, Structured), model fusion techniques, and experimental results for automatically predicting fine ranges and relevant legal statutes from case facts.
Background and Task Definition
The rapid growth of artificial intelligence and judicial big data has created a demand for automated tools that can read case facts and suggest appropriate fine ranges and legal articles, thereby improving case handling efficiency and supporting fair adjudication. The CCF BDCI 2017 competition posed this challenge as the "AI Judge" task, requiring participants to predict both the fine amount category and the corresponding legal provisions from anonymized case descriptions.
Data and Preprocessing
The dataset consists of thousands of de‑identified case fact statements, each labeled with a fine‑category (LV1–LV8) and the relevant legal articles. Preprocessing steps include:
Tokenization : Segment all texts, keep the top 100,000 most frequent tokens, replace the rest with <UNK>, and add <PAD> for padding.
Amount Marking : Replace monetary amounts with their fine‑category label (LV1‑LV8) to avoid sparse numeric tokens.
Cleaning : Remove corrupted or meaningless documents and filter out overly short samples.
Validation Split : Use a 10:1 train‑to‑validation ratio.
Length Truncation : Pad texts shorter than 2000 tokens up to 2000; truncate longer texts to their first 500 and last 1500 tokens, as key legal terms often appear early.
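The vocabulary and length steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the function names, the choice of ids 0 and 1 for <PAD> and <UNK>, and right-side padding are all assumptions, and the input is assumed to be already segmented into tokens.

```python
from collections import Counter

PAD, UNK = "<PAD>", "<UNK>"

def build_vocab(tokenized_docs, max_size=100_000):
    """Map the most frequent tokens to ids (assumed: 0 = <PAD>, 1 = <UNK>)."""
    counts = Counter(tok for doc in tokenized_docs for tok in doc)
    vocab = {PAD: 0, UNK: 1}
    for tok, _ in counts.most_common(max_size):
        vocab[tok] = len(vocab)
    return vocab

def encode(doc, vocab):
    """Replace each token by its id; out-of-vocabulary tokens become <UNK>."""
    return [vocab.get(tok, vocab[UNK]) for tok in doc]

def fit_length(ids, max_len=2000, head=500, pad_id=0):
    """Pad short sequences on the right; keep head + tail of long ones."""
    if len(ids) <= max_len:
        return ids + [pad_id] * (max_len - len(ids))
    tail = max_len - head
    return ids[:head] + ids[len(ids) - tail:]
```

With the defaults, a 3000-token document is reduced to tokens 0..499 followed by tokens 1500..2999, so the middle 1000 tokens are dropped.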
Model Construction
Three primary model families were built for fine‑category prediction, while two families addressed legal‑article prediction.
Fine‑Category Models
TextCNN + RNN + Structured : Combines convolutional features, recurrent context, and hierarchical attention.
Legal‑Article Models
TextCNN + RNN : Uses multiple embedding‑kernel configurations followed by a sigmoid output layer.
Key architecture details for TextCNN:
Embedding dimension: 200
Kernel sizes: 3, 4, 5 (each with 500 filters)
Max‑pooling concatenation → two fully‑connected layers [750, 8] → softmax
Dropout: 0.1 applied at several layers
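To make the convolution and max-pooling step concrete, here is a dependency-free sketch of how one pooled feature per filter is produced and concatenated across kernel sizes. It is a toy illustration under stated assumptions: bias terms, activation functions, dropout, and the fully-connected layers are omitted, the helper names are hypothetical, and a real implementation would use a deep-learning framework rather than Python lists.

```python
def conv_max_over_time(embedded, filters, k):
    """Slide each width-k filter over the sequence and keep its max response.

    embedded: list of token vectors (seq_len x dim), with seq_len >= k
    filters:  list of filters, each a flat list of k * dim weights
    """
    windows = [
        [x for vec in embedded[i:i + k] for x in vec]  # flatten k token vectors
        for i in range(len(embedded) - k + 1)
    ]
    return [
        max(sum(w * x for w, x in zip(f, win)) for win in windows)
        for f in filters
    ]

def textcnn_features(embedded, filter_banks):
    """Concatenate the pooled features of every kernel size.

    filter_banks: dict mapping kernel size k to its list of filters.
    """
    feats = []
    for k, filters in filter_banks.items():
        feats.extend(conv_max_over_time(embedded, filters, k))
    return feats
```

With the settings listed above (kernel sizes 3, 4, 5 and 500 filters each), the concatenated vector has 3 × 500 = 1500 entries, which the two fully-connected layers then reduce to 750 and finally to the 8 fine categories.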
For the legal‑article task, three different embedding‑kernel setups are concatenated and fed into a single fully‑connected layer (321 units) with sigmoid activation.
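Because a case can cite several statutes, the sigmoid layer scores each article independently rather than forcing a single choice. The sketch below shows one way to turn such scores into a predicted set of articles. The 0.5 threshold, the fallback to the single best article, and the function names are assumptions made for illustration; the source does not state how the outputs were decoded.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_articles(logits, article_ids, threshold=0.5):
    """Return every article whose probability reaches the threshold.

    threshold is an assumed value, not taken from the source.
    """
    probs = [sigmoid(z) for z in logits]
    chosen = [a for a, p in zip(article_ids, probs) if p >= threshold]
    if not chosen:
        # Assumed fallback: always predict at least the most probable article.
        best = max(range(len(probs)), key=probs.__getitem__)
        chosen = [article_ids[best]]
    return chosen
```

In the setup described above, `logits` would hold 321 values, one per output unit.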
Training Strategies
Early stopping based on validation score, checking four times per epoch.
Learning‑rate decay: halve the LR after two consecutive non‑improving checks; stop after two more.
Best‑scoring model on validation is used for final test predictions.
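The schedule above can be expressed as a small state machine that is fed one validation score per check. This sketch follows one reading of the description, namely that the counter of non-improving checks resets whenever a new best score is reached; the class name and that reset behavior are assumptions.

```python
class ValidationMonitor:
    """Track validation checks, halve the LR, and signal early stopping."""

    def __init__(self, lr, patience=2):
        self.lr = lr
        self.patience = patience          # non-improving checks before each action
        self.best_score = float("-inf")
        self.bad_checks = 0               # consecutive checks without a new best
        self.should_stop = False

    def update(self, score):
        """Record one check; return True if the model should be saved as best."""
        if score > self.best_score:
            self.best_score = score
            self.bad_checks = 0
            return True
        self.bad_checks += 1
        if self.bad_checks == self.patience:
            self.lr *= 0.5                # two checks without improvement: halve LR
        elif self.bad_checks >= 2 * self.patience:
            self.should_stop = True       # two more without improvement: stop
        return False
```

A training loop would call `update` four times per epoch, save a checkpoint whenever it returns True, and break out once `should_stop` is set.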
Additional Experiments
Attentive Convolution (light version) and various attention‑augmented RNN variants were explored but did not surpass the baseline TextCNN performance on long legal documents.
Model Fusion
Predictions from individual models were combined by simple addition or averaging, yielding modest gains over single models.
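As a minimal sketch of this kind of fusion, the function below averages the per-class probability vectors of several models and picks the highest-scoring class. The function names are hypothetical, and equal weighting is assumed. Summing instead of averaging yields the same predicted class, since dividing by the number of models does not change the ranking.

```python
def fuse_by_average(model_probs):
    """Average class probabilities across models.

    model_probs: one probability vector per model, all of equal length.
    """
    n_models = len(model_probs)
    return [sum(col) / n_models for col in zip(*model_probs)]

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)
```

For example, fusing `[0.5, 0.5]` and `[0.25, 0.75]` gives `[0.375, 0.625]`, so the second class is predicted.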
Results and Evaluation
The fused system achieved a micro‑averaged F1 score of 0.54 for fine‑category prediction in the competition’s final round, a competitive result. Legal‑article prediction showed high accuracy for the few most frequently cited statutes, which suggests that the keyword‑based and hierarchical attention approaches are effective at least for common statutes.
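For reference, micro-averaged F1 pools true positives, false positives, and false negatives over all classes before computing precision and recall. The sketch below is a generic implementation over label sets, so it covers both the single-label fine category and the multi-label legal articles. It does not reproduce the competition's official scoring script, and the function name is an assumption.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 over parallel lists of label sets, one per case."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        g, p = set(g), set(p)
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but wrong
        fn += len(g - p)   # labels missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

When every case has exactly one gold label and one predicted label, each error counts as one false positive and one false negative, so micro F1 equals plain accuracy.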
Potential and Application Value
End‑to‑End Deep Learning : The pipeline can be fully automated, extracting multi‑level semantic features without manual rule engineering.
Interpretability : Although current models are black‑box, attention weights and keyword heatmaps can be visualized to provide insights for legal professionals.
Deployment : Trained models can be exported and served via cloud APIs for web or mobile applications, offering rapid inference for case‑level assistance.
Social Impact : Automated predictions can reduce workload for judges, highlight inconsistent statutory citations, and promote more uniform sentencing.
Conclusion and Future Work
The solution demonstrates that a combination of robust preprocessing, diverse deep‑learning architectures, and straightforward model fusion can effectively predict fines and legal articles from raw case texts. Future directions include expanding the legal‑article taxonomy, integrating rule‑based post‑processing for rare statutes, and exploring more advanced architectures such as HAN or RCNN to further boost performance.
Baobao Algorithm Notes
Author of the BaiMian large model, offering technology and industry insights.
