Imagine that you’re in a Zoom meeting and the conversation is going gangbusters. Now imagine that every word your group utters is being used to train large language AI models. Last week, Zoom users found that the fine print of the Terms of Service Agreement stated that Zoom could use their conversations to train its AIs.
StackDiary broke the story on August 11th, though the terms and conditions had been part of Zoom’s contracts for months. The news did not sit well with Zoom users. The downright scary wording in the Zoom Terms of Service reads, word for word:
“You agree to grant and hereby grant Zoom a perpetual, worldwide, non-exclusive, royalty-free, sublicensable, and transferable license and all other rights required or necessary to redistribute, publish, import, access, use, store, transmit, review, disclose, preserve, extract, modify, reproduce, share, use, display, copy, distribute, translate, transcribe, create derivative works, and process Customer Content and to perform all acts with respect to the Customer Content,” including for the purpose of “machine learning” and “artificial intelligence” for the “improvement of the services, software, or Zoom’s other products, services, and software.”
Yikes! Zoom responded swiftly and emphatically, reiterating in no uncertain terms that it does not use data from your conversations to train its AIs. The company blamed an “internal process flaw” (which is shorthand for someone F&**#$ up) and remedied the problem immediately.
Zoom does use your activity to gather what’s called “service-generated data.” It’s pretty standard practice in cloud-based conferencing. Telemetry data (such as which machines are logging onto a Zoom call), product usage data, diagnostic data, and more help improve the product.
The ping-pong back and forth between Zoom and concerned users that went on for most of the week is abating, as most communications crises do. Where things stand now: Zoom will not collect your data to train its AI models unless you opt in to one or both of its optional AI features, Zoom IQ Meeting Summary and Zoom IQ Team Chat Compose. When you fire up the AI summarizer or AI chat assistant, you (as the meeting administrator) have the option to disable “Data Sharing,” the feature that sends your call data (among other data points) to Zoom for possible AI training. As for your guests, the contract is between you and Zoom only.
Zoom became the poster child for AI training, raising the larger question of what other companies are doing about capturing your conversations to train their AI algorithms. Microsoft Teams just clarified what it will do with the AI data captured in your meetings. The New York Times preemptively hung out a shingle saying AIs will not be trained on its content, while Google granted itself permission to use public data to train its AIs.
AI hungers for human input in order to learn. That’s the AI rub: it needs your conversations to make itself smarter, yet who knows whether what you’ve said is even true, and your words could be ingested at the expense of your privacy, or even in violation of a confidentiality agreement.
What should you do?
- Time to start reading those Terms of Service agreements more carefully. Terms of Service writers need to cut the legalese and spell out what they are and aren’t doing.
- Address what happens when the meeting convener opts in to offer their conversation to train AIs, but the meeting attendees are clueless.
- It’s time to start discussing fair compensation. If we opt in to train a company’s AIs (like Zoom’s), should we receive some compensation? (Remember Andrew Yang’s Universal Basic Income? It was an idea before its time, but now it’s time for a serious discussion of how those who feed the LLM engines can reap some of the profits.) See the Worldcoin story below.
- Enterprises and plain old consumers may be treated differently when it comes to training AI; regulated verticals such as banking may need a different TOS than the rest of us.
At the very least, Zoom’s cautionary tale is a wake-up call. Finally, it’s dawning on us that we fuel large language models. It took us years to recognize that we had fueled social media ad revenues. Hopefully, this new revelation and resolution won’t take as long.