How Fake GitHub Stars Are Bought, Detected, and Why Investors Care
GitHub star-buying has grown into a black-market industry, with prices ranging from under one yuan to about six yuan per star. Because investors treat star counts as a signal of traction, projects have an incentive to inflate them, and researchers have built both simple rule-based filters and clustering algorithms to detect fake stars and expose the practice.
Fake Star Market
GitHub star-buying has surfaced repeatedly and has even touched well-known open-source projects from large companies. Prices vary: cheap services charge as little as 0.4 to 0.88 CNY per star, while stars from premium accounts with a year-old history can cost up to 6 CNY each. One entrepreneur spent 20 EUR (≈156 CNY, or roughly 6.2 CNY per star) on 25 high-quality stars, only to see them disappear a month later when the accounts were banned. Low-cost providers often offer to replace removed stars for free.
Fake Star Detector
Fraser Marlow, a growth lead at Dagster, discovered the black market and collaborated with spam-filter experts to build a fake-star detector. The system uses two algorithms. The first is a simple rule-based filter that catches obvious patterns, for example many accounts that starred the same two projects and have no contribution history. The second is a more sophisticated unsupervised clustering algorithm that groups accounts with similar activity signatures. Normal users exhibit diverse, sparse activity, whereas fake accounts cluster tightly. The team validated the approach by creating a target repository and purchasing stars for it: the detector matched nearly 100% of the purchased stars in tests and reached 98% precision and 85% recall on real-world data.
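The two stages can be sketched roughly as follows. This is a minimal illustration, not Dagster's actual implementation: the `Account` fields, the thresholds, and the sample data are all hypothetical, and grouping accounts by an identical set of starred repositories is only a crude stand-in for a real clustering algorithm. The `precision_recall` helper shows how the two quoted quality figures are defined.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    login: str
    starred: frozenset   # repositories the account has starred
    contributions: int   # hypothetical count of commits, issues, and PRs


def simple_rule_flags(accounts, target, max_starred=2):
    """Flag stargazers of `target` that starred at most `max_starred`
    repositories overall and have no contribution history."""
    return {
        a.login
        for a in accounts
        if target in a.starred
        and len(a.starred) <= max_starred
        and a.contributions == 0
    }


def signature_clusters(accounts, target, min_cluster_size=3):
    """Group stargazers of `target` by identical activity signature
    (here: the exact set of starred repositories) and keep large groups."""
    groups = defaultdict(set)
    for a in accounts:
        if target in a.starred:
            groups[a.starred].add(a.login)
    return [g for g in groups.values() if len(g) >= min_cluster_size]


def precision_recall(flagged, truly_fake):
    """Precision: share of flagged accounts that are fake.
    Recall: share of fake accounts that were flagged."""
    true_positives = len(flagged & truly_fake)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(truly_fake) if truly_fake else 0.0
    return precision, recall


# Hypothetical sample data: two kinds of fake accounts and two real users.
TARGET = "example/target"
accounts = (
    [Account(f"bot{i}", frozenset({TARGET, "example/decoy"}), 0) for i in range(3)]
    + [
        Account(f"farm{i}", frozenset({TARGET, "a/one", "b/two", "c/three"}), 5)
        for i in range(4)
    ]
    + [
        Account("alice", frozenset({TARGET, "numpy/numpy", "pallets/flask"}), 40),
        Account("bob", frozenset({TARGET, "rust-lang/rust"}), 12),
    ]
)
truly_fake = {a.login for a in accounts if a.login.startswith(("bot", "farm"))}

simple = simple_rule_flags(accounts, TARGET)
clusters = signature_clusters(accounts, TARGET)
clustered = set().union(*clusters)

print("simple rule:", sorted(simple), precision_recall(simple, truly_fake))
print("clustering: ", sorted(clustered), precision_recall(clustered, truly_fake))
```

In this toy data the simple rule misses the `farm` accounts because they have some contribution history, while the signature grouping catches them because their activity is identical. That mirrors the article's point that fake accounts cluster tightly even when each one looks plausible on its own.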
Applying the detector to the GitHub Archive dataset revealed extreme cases. The project okcash had 759 stars: the simple algorithm flagged only one of them as suspicious, while the clustering method identified 97% as fake. The analysis covered only stars added after 2022-01-01, so fraudulent stars added before that date go uncounted. Dagster's own product and several peers showed low fake-star rates, suggesting the data-pipeline sector is relatively healthy.
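A per-repository fake-star share with a date cutoff could be computed along these lines. This is a sketch under assumptions: the `(login, starred_on)` tuple layout and the sample data are hypothetical, and the cutoff is treated as inclusive, which the article does not specify.

```python
from datetime import date

CUTOFF = date(2022, 1, 1)


def fake_star_share(stars, flagged_logins, cutoff=CUTOFF):
    """Return (analyzed, flagged, share) for one repository.

    `stars` is an iterable of (login, starred_on) tuples. Only stars added
    on or after `cutoff` are analyzed, so older stars are ignored even if
    the account behind them is flagged.
    """
    recent = [login for login, starred_on in stars if starred_on >= cutoff]
    if not recent:
        return 0, 0, 0.0
    flagged = sum(1 for login in recent if login in flagged_logins)
    return len(recent), flagged, flagged / len(recent)


# Hypothetical data: one old fake star falls outside the analysis window.
stars = [
    ("old_fake", date(2021, 6, 1)),
    ("bot0", date(2022, 3, 1)),
    ("bot1", date(2022, 3, 1)),
    ("bot2", date(2022, 3, 2)),
    ("alice", date(2023, 1, 15)),
]
flagged_logins = {"old_fake", "bot0", "bot1", "bot2"}

result = fake_star_share(stars, flagged_logins)
print(result)
```

The `old_fake` entry is flagged but predates the cutoff, so it does not appear in the result, which is the blind spot the date restriction creates.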
Investors: We Love Stars
Some open-source teams buy stars to attract venture capital. Investors such as Pratima Aiyagari view star count as a quick proxy for project traction, and metrics like the ROSS index rank projects by star-growth rate. However, the more investors rely on this metric, the less reliable it becomes. Scholars like Stuart Geiger note that over time such indicators become self-defeating, echoing Campbell's Law (the more a quantitative indicator is used for decision-making, the more it is subject to corruption pressure) and Goodhart's Law (once a measure becomes a target, it ceases to be a good measure). The practice extends beyond funding: job seekers use star-rich projects to impress recruiters, and many companies evaluate candidates based on GitHub metrics.
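A toy example shows how easily a growth-based ranking can be moved. The formula below is a plain relative growth rate chosen for illustration, not the ROSS index's actual methodology, and the project names and star counts are invented.

```python
def growth_rate(start_stars, end_stars):
    """Relative star growth over a period, e.g. 0.3 for +30%."""
    if start_stars <= 0:
        raise ValueError("start_stars must be positive")
    return (end_stars - start_stars) / start_stars


def rank_by_growth(projects):
    """Sort project names by growth rate, highest first.

    `projects` maps a name to a (start_stars, end_stars) tuple.
    """
    return sorted(projects, key=lambda name: growth_rate(*projects[name]), reverse=True)


# Hypothetical organic numbers for two projects over the same period.
organic = {"project-a": (1000, 1300), "project-b": (200, 250)}

# Same period, but project-b buys 100 stars.
inflated = dict(organic)
inflated["project-b"] = (200, 250 + 100)

print(rank_by_growth(organic))   # project-a leads on organic growth
print(rank_by_growth(inflated))  # project-b leads after the purchase
```

Because small projects start from a low base, a modest purchase is enough to change the ordering, which is why a ranking that rewards growth rate is especially exposed to the manipulation described above.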
One More Thing
The rise of large AI models makes detecting fake accounts harder. Earlier fake accounts could be spotted through simple account features, but with tools like ChatGPT, perpetrators can generate realistic, varied comments, blurring the line between genuine and synthetic activity.
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
