Hidden Prompt Scandal: How AI Was Coerced to Give Positive Paper Reviews
A recent controversy reveals that a research team embedded a hidden prompt in a paper to force AI reviewers to give only positive feedback, sparking intense debate about academic integrity, AI ethics, and the need for stricter peer‑review policies.
Recent allegations show that a paper from Xie Saining's team secretly included a white‑on‑white prompt instructing AI reviewers to ignore previous instructions and give only positive evaluations.
This is unethical. All co‑authors share responsibility for any problematic submission; there are no excuses.
The hidden instruction is invisible to human readers but can be detected by AI, which then generates a favorable review.
After the leak, the academic community reacted strongly, and Xie publicly apologized, acknowledging that such behavior by students is unacceptable.
Honestly, I only discovered this after public outcry. I would never encourage my students to do this—if I were a field chair, any paper containing such prompts would be rejected immediately.
The incident is more complex than a simple student mistake; it required AI‑assisted review to be effective.
1. Background
In November 2024, researcher @jonLorraine9 tweeted about using prompt injection to manipulate AI reviewers, highlighting that large language models (LLMs) can be tricked when PDFs are fed directly to them.
Many agreed that using LLMs in peer review threatens the integrity of the scholarly process, prompting conferences like CVPR and NeurIPS to ban AI‑assisted reviews.
2. Our Situation
A short‑term visiting student from Japan took the tweet seriously and embedded the hidden prompt in an EMNLP submission, unaware of the ethical implications.
The same prompt also appeared in the arXiv version, and the oversight was missed because it fell outside routine ethical checks.
3. Next Steps
The student has revised the paper and contacted the ARR for formal guidance; we will follow their recommendations.
4. Greater Significance
This episode highlights the need to rethink academic norms in the AI era, as prompt injection represents a new form of misconduct distinct from data fabrication.
It underscores the importance of educating researchers about AI ethics and responsible research practices.
The author of the original “poison‑pill” strategy agrees it is unethical, but argues that blaming the student alone is exaggerated.
Given the growing power of large models, integrating them into the review process seems inevitable, yet human review remains the safest approach for now.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
