Microsoft has publicly clarified that documents belonging to Microsoft 365 users are not used to train large language models (LLMs). The statement came after a controversial post went viral on X (formerly Twitter), raising suspicions about the company’s data collection.
The initial alert was published on November 24, 2024 by user nixCraft (@nixcraft), who wrote:
“Heads up: Microsoft Office, like many companies in recent months, has slyly turned on an ‘opt-out’ feature that scrapes your Word and Excel documents to train its internal AI systems. This setting is turned on by default, and you have to manually uncheck a box in order to opt…”
The post pointed to the “Connected Experiences” setting, accessible in the Microsoft 365 settings menu, suggesting that this feature was responsible for sending data from users’ documents to Microsoft.
Microsoft’s Official Response
With more than 536,000 views, the post generated enough buzz for Microsoft to respond. In a reply published on its Microsoft 365 account (@Microsoft365) on X on November 25, 2024, the company stated:
“In the M365 apps, we do not use customer data to train LLMs. This setting only enables features requiring internet access like co-authoring a document. https://t.co/o9DGn9QnHb”
Data collection for artificial intelligence continues to generate debate and divided opinions. Last year, for example, Japan concluded that AI training does not qualify as copyright infringement. This year, some countries filed complaints against X over the use of platform users’ data for AI training. Elon Musk’s network quickly complied with European legislation and halted the data collection.
Source: https://www.hardware.com.br/noticias/microsoft-desmente-rumor-dados-do-office-365-nao-treinam-modelos-de-ia.html