Amazon will pay the FTC a $25 million penalty and “overhaul its deletion practices and implement stringent privacy safeguards” to resolve charges that it violated the Children’s Online Privacy Protection Act (COPPA) in order to spruce up its AI.

Amazon’s voice interface Alexa has been in use in homes across the globe for years, and any parent who has one knows that kids love to play with it, make it tell jokes, even use it for its intended purpose, whatever that is. In fact it was so obviously useful to kids who can’t write or have disabilities that the FTC relaxed COPPA rules to accommodate reasonable usage: certain service-specific analysis of kids’ data, like transcription, was allowed as long as the data was not retained any longer than reasonably necessary.

It seems that Amazon may have taken a rather expansive view of the “reasonably necessary” timescale, keeping kids’ speech data more or less forever. As the FTC puts it:

Amazon retained children’s recordings indefinitely—unless a parent requested that this information be deleted, according to the complaint. And even when a parent sought to delete that information, the FTC said, Amazon failed to delete transcripts of what kids said from all its databases.

Geolocation data was also not deleted, a problem the company “repeatedly failed to fix.”

This went on for years: the FTC alleges that Amazon knew about it as early as 2018 but didn’t take action until September of the next year, after the agency gave it a helpful nudge.

That kind of timing usually indicates that a company would have continued with this practice forever. And apparently, due to “faulty fixes and process fiascos,” some of those practices did continue until 2022!

You may well ask, what is the point of having a bunch of recordings of kids talking to Alexa? Well, if you plan on having your voice interface talk to kids a lot, it sure helps to have a secret database of audio interactions that you can train your machine learning models on. And that’s how the FTC said Amazon justified its retention of this data.

FTC Commissioners Bedoya and Slaughter, as well as Chair Khan, wrote a statement accompanying the settlement proposal and complaint specifically to call out this one point:

The Commission alleges that Amazon kept kids’ data indefinitely to further refine its voice recognition algorithm. Amazon is not alone in apparently seeking to amass data to refine its machine learning models; right now, with the advent of large language models, the tech industry as a whole is sprinting to do the same.

Today’s settlement sends a message to all those companies: Machine learning is no excuse to break the law. Claims from businesses that data must be indefinitely retained to improve algorithms do not override legal bans on indefinite retention of data. The data you use to improve your algorithms must be lawfully collected and lawfully retained. Companies would do well to heed this lesson.

And so today we have the $25 million fine, which is of course negligible for a company of Amazon’s size. Complying with the other provisions of the proposed order is clearly what will give the company a headache. The FTC says the order would:

  • Prohibit Amazon from using geolocation, voice information, and children’s voice information subject to consumers’ deletion requests for the creation or improvement of any data product;
  • Require the company to delete inactive Alexa accounts of children;
  • Require Amazon to notify users about the FTC-DOJ action against the company;
  • Require Amazon to notify users of its retention and deletion practices and controls;
  • Prohibit Amazon from misrepresenting its privacy policies related to geolocation, voice and children’s voice information; and
  • Mandate the creation and implementation of a privacy program related to the company’s use of geolocation information.

This settlement is entirely independent of the FTC’s other action announced today, against Amazon subsidiary Ring. There is a certain common thread of “failing to implement basic privacy and security protections,” though.

In a statement, Amazon said: “While we disagree with the FTC’s claims and deny violating the law, this settlement puts the matter behind us.” The company also promises to “remove child profiles that have been inactive for more than 18 months,” which seems like an incredibly long time to retain that data. I’ve followed up with questions about that duration and whether the data will be used for ML training, and will update if I hear back.