Data Mining

DATA MINING….Suddenly, data mining is everywhere. Not only do we have the NSA’s domestic spying program, which most likely involves data mining of some kind, but apparently good ol’ TIA is back too. It’s just changed its name. The funny thing, though, is that it’s not clear what it’s changed its name to. Here is Mark Clayton of the Christian Science Monitor describing a program called ADVISE:

A major part of ADVISE involves data-mining, or “dataveillance,” as some call it….What sets ADVISE apart is its scope. It would collect a vast array of corporate and public online information, from financial records to CNN news stories, and cross-reference it against US intelligence and law-enforcement records.

….ADVISE “looks very much like TIA,” Mr. Tien of the Electronic Frontier Foundation writes in an e-mail. “There’s the same emphasis on broad collection and pattern analysis.”

And here is Michael Hirsh of Newsweek describing a program called Topsail:

Today, very quietly, the core of TIA survives with a new codename of Topsail….It is in programs like these that real data mining is going on and, considering the furor over TIA, with fewer intrusions on civil liberties than occur under the NSA surveillance program.

Hirsh suggests that data mining is a pretty useful technology, and I agree. He also suggests that there’s a lot of lousy data mining going on, largely because our intelligence community is broken and doesn’t have anything better to do. That may be right as well.

In the end, though, data mining is just a technology, neither inherently good nor bad. It’s the details that matter:

  • What information gets sucked into the system? Public information is fine; personal information that’s supposed to be private isn’t.

  • How is the information used? Data mining has a high error rate, which means that it should be used only to produce leads that are followed up by professionals. No one should ever be placed on a watchlist or a no-fly list based solely on what the system spits out.

  • Does it work? Or, as Hirsh suggests, do these systems become billion-dollar black holes of unworkable technology?

  • What kind of oversight is there? Security professionals are just doing their jobs when they create data mining systems, but the potential for abuse is high even if they think they’re following the rules. Congress needs to be intimately involved.
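On the error-rate point in particular, a little arithmetic shows why the output should be treated as leads and nothing more. The sketch below is purely illustrative: the population size, the number of real targets, and both rates are hypothetical numbers picked for the example, not figures from ADVISE, Topsail, or any real system, and `flagged_breakdown` is a made-up helper name.

```python
def flagged_breakdown(population, true_targets, hit_rate, false_alarm_rate):
    """Split the people a screening system flags into real hits and false alarms.

    hit_rate: fraction of real targets the system flags.
    false_alarm_rate: fraction of everyone else the system flags by mistake.
    Returns (true_hits, false_alarms, share_of_flags_that_are_real).
    """
    innocents = population - true_targets
    true_hits = true_targets * hit_rate
    false_alarms = innocents * false_alarm_rate
    share_real = true_hits / (true_hits + false_alarms)
    return true_hits, false_alarms, share_real


# Hypothetical inputs: 1 million people screened, 10 real targets,
# a system that catches 99% of targets and wrongly flags 1% of everyone else.
true_hits, false_alarms, share_real = flagged_breakdown(
    population=1_000_000,
    true_targets=10,
    hit_rate=0.99,
    false_alarm_rate=0.01,
)

print(f"real hits:     {true_hits:.1f}")
print(f"false alarms:  {false_alarms:.1f}")
print(f"share of flags that are real: {share_real:.4%}")
```

Under these made-up assumptions, about 10,000 innocent people get flagged alongside roughly 10 real targets, so only about one flag in a thousand is genuine. Different inputs give different answers, but whenever real targets are rare the false alarms tend to swamp the hits, which is the case for human follow-up before anyone lands on a list.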

This is a hot topic. I expect to see it crop up a lot in the news this year.