How Should We Evaluate OpenAI's Conversational Model ChatGPT?
This article compiles three highly upvoted Zhihu answers that examine OpenAI's ChatGPT, discussing its breakthrough impact on NLP, visual in‑context learning, reinforcement learning from human feedback (RLHF), and the broader implications for AI research and development.
👨💻 Cao Yue notes that many longtime NLP researchers are still stuck in the BERT era, while newer entrants who arrived after GPT‑3 have a broader view of large language models. He highlights how difficult it is for Chinese researchers to access the GPT‑3 API, describing a "bottleneck" that limits domestic progress relative to OpenAI.
Cao also reflects on his own misunderstanding of in‑context learning, recognizing it as an emergent property of large autoregressive models, and discusses challenges of applying it to vision tasks, mentioning works like pix2seq, unified‑io, and UVIM that attempt to unify task representations.
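The in‑context learning Cao describes requires no gradient updates: the model infers the task purely from examples placed in its prompt. A minimal sketch of how such a few‑shot prompt is assembled (the translation pairs here are illustrative, not from the article):

```python
# Few-shot prompt construction: the model is never fine-tuned on this task;
# it picks up the input -> output pattern from the examples in context.
examples = [
    ("cheese", "fromage"),
    ("house", "maison"),
]
query = "book"

prompt = "Translate English to French.\n"
for en, fr in examples:
    prompt += f"{en} -> {fr}\n"
prompt += f"{query} ->"  # the model is expected to continue with "livre"

print(prompt)
```

This pattern is what made unifying vision tasks hard: works like pix2seq and UVIM first have to invent a comparable sequence representation before any "prompting" is possible.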
He further describes the evolution after GPT‑3, including WebGPT, InstructGPT, and alignment research, emphasizing the shift toward new loss signals and the use of reinforcement learning from human feedback (RLHF) to align model outputs with human expectations.
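The "new loss signal" at the heart of RLHF is a reward model trained on human preference comparisons. A minimal sketch of the pairwise ranking loss used in the InstructGPT recipe (the reward values below are toy numbers, not real model outputs):

```python
import math

def reward_ranking_loss(pairs):
    """InstructGPT-style pairwise ranking loss: for each (chosen, rejected)
    pair of scalar rewards, minimize -log(sigmoid(r_chosen - r_rejected)),
    pushing the human-preferred response's reward above the rejected one's."""
    total = 0.0
    for r_chosen, r_rejected in pairs:
        margin = r_chosen - r_rejected
        total += -math.log(1.0 / (1.0 + math.exp(-margin)))
    return total / len(pairs)

# Toy batch of three comparison pairs (hypothetical reward-model scores).
loss = reward_ranking_loss([(1.2, 0.4), (0.3, 0.5), (2.0, 1.1)])
```

Once trained, the reward model scores candidate responses, and those scores drive the reinforcement‑learning stage that aligns the policy with human expectations.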
👨💻 Trinkle shares personal experience participating in ChatGPT training, suggesting promising directions such as re‑applying RL to language models, improving reward‑model and policy training efficiency, and building a highly optimized RLHF library to replace existing tools.
He also lists practical observations: the importance of dataset quality and diversity, the completeness of dialog as a carrier for any kind of content, and speculative ideas about AGI‑era productivity gains in which a single model could replace an entire development team.
👨💻 Gh0u1L5 points out that ChatGPT is wrapped in a sophisticated lock, with engineers deliberately restricting certain capabilities. He demonstrates how the model can be coaxed around political, religious, or dangerous‑behavior restrictions, and notes that early coding bans were quickly lifted after positive public reaction.
The article includes several illustrative images showing ChatGPT interactions, a virtual machine example inside ChatGPT, and screenshots of the model’s restriction bypass attempts.
Finally, a concise list of current ChatGPT limitations is provided, covering sensitive political and religious topics, role‑playing specific personalities, instructions for dangerous actions, moral dilemmas, and queries requiring internet access.
The author concludes that while many restrictions aim to prevent misuse, they also reflect concerns about public perception and potential panic, underscoring the delicate balance between AI capability and societal impact.
Sensitive political topics and figures
Sensitive religious topics and figures
Role‑playing specific personalities
Instructions for dangerous behavior
Moral or subjective dilemma questions
Queries requiring real‑time internet access
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.