The generative AI models that power ChatGPT, Copilot, Gemini, and other assistants were built with mountains of training data. Microsoft will now begin using interactions with GitHub Copilot as another source of that information, unless you specifically opt out of the collection.
GitHub, the popular coding platform owned by Microsoft, announced today that interactions with GitHub Copilot will be used to “train and improve our AI models.” GitHub Copilot is the AI coding assistant built into Visual Studio Code, the GitHub website, the Copilot CLI tool (which competes with Claude Code), and other services. The collected data includes inputs and outputs, code snippets, comments and documentation, file names, repository structure, and other information.
If you’ve never used GitHub Copilot in the first place, this won’t change anything. However, if you used code completion in Visual Studio Code, asked Copilot a question on the GitHub website, or used another related AI feature, your interactions and code snippets could be collected.
Importantly, automatic data collection applies to both free and paid individual accounts. This includes Copilot Free, Copilot Pro, and Copilot Pro+ users, but not Copilot Business or Copilot Enterprise accounts.
The blog post explained that the initial AI models for GitHub Copilot were “built using a combination of publicly available data and handcrafted code samples” (an approach that didn’t go down well with everyone), and that the company has seen improvements from incorporating data from Microsoft employees. Now, GitHub hopes the service will get even better with more interactions used as training data.
GitHub said in the announcement: “This approach aligns with established industry practices and will improve model performance for all users. By participating, you will help our models better understand development workflows, provide more accurate and confident code pattern suggestions, and improve their ability to help you detect potential bugs before they reach production.”
How to opt out
You can turn off data collection from the Copilot features page in your GitHub account settings. After you log in to your account, look for the setting “Allow GitHub to use my data for AI model training” in the Privacy section.
You just need to set that dropdown to “Off” and you’re done. If you have multiple GitHub accounts, make sure you do this for each of your accounts.
Source: GitHub Blog





