Is GitHub Copilot Copying Protected Code? Unpacking AI Licensing Risks

A Texas A&M professor alleges that GitHub Copilot frequently suggests copyrighted code without attribution. The claim has sparked debate over AI‑generated code licensing, open‑source license compliance, and the effectiveness of Copilot’s public code filter, while GitHub engineers defend the tool and call for scalable solutions.

21CTO

Tim Davis, a professor from Texas A&M University’s Computer Science and Engineering department, claimed on Twitter that the AI‑powered programming assistant GitHub Copilot has generated large amounts of copyrighted code without proper attribution or compliance with LGPL licenses.

He provided examples where Copilot reproduced code snippets that closely matched his own earlier work, including comments such as “// sparse matrix transpose” and “/* place A(i,j) as entry C(j,i) */”. Davis argued that the similarity was unlikely to be coincidental, noted that another developer using Copilot had observed comparable results, and posted evidence of the copied code.

Alex Graveley, a principal engineer at GitHub and co‑creator of Copilot, responded that the code in question differed from the example Davis gave. He acknowledged some similarity but said the code was not identical, and he welcomed scalable solutions to the issue.

The discussion highlighted that the original code in question is open source under the LGPL 2.1 license, and that open source does not mean the code is free of copyright. Open‑source licenses grant different permissions and impose different obligations, and copying code without respecting them can violate the license terms.

System76 principal engineer Jeremy Soller warned that unauthorized reuse of open‑source code by Copilot could breach license requirements, especially when code under incompatible licenses is combined in one project.

According to GitHub, Copilot has a public code filter that compares each suggestion against public GitHub repositories and suppresses suggestions that match roughly 150 characters of existing code. However, Davis reported that the same issues persisted even after he turned off the “allow GitHub to use my code” option and set Copilot to reject suggestions that match public code.

The core problem is that the same open‑source code can appear in many projects, so developers can inadvertently incorporate copyrighted material whether or not Copilot is involved. Copilot’s FAQ states that code written with its assistance belongs to the developer, who is responsible for it, and that most suggestions are novel, though about 1% may closely match training data.

This raises the question of whether Copilot is truly generating code autonomously or merely retrieving existing open‑source snippets from the web.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

21CTO
