Data Aggregators

(Also known as web scrapers)

What is data aggregation?

There's a lot of data out there. You probably have so many accounts that you can't keep track of them all. Computers, however, are very good at keeping track of things.

Data aggregation is the process of scraping every accessible site for data in the hope of linking that data to true identities. Each site might know a little more, or slightly different, information about you. Data aggregators put the information from all these different sites into a single place.

What's the point?

Combining your data from many different sources creates a much more complete picture of who you are. That fuller profile is more desirable to anyone who wishes to target you, which creates demand and lets the aggregator charge a price for it.

They would tell you they're creating value out of nothing. But does it feel right to you?

Example: You sign up for Facebook, which asks for your name and birthday. Then you sign up for Amazon, which requires your home address, and Google, which wants your gender. You use the same email for all of them. A data aggregator collects data from all three and associates it with a single identity. Now anyone who has your email can purchase your name, birthday, address, and gender from a single source.
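To make the example concrete, here is a minimal sketch of how records from several sites could be merged when they share an email. The record layout, the sample values, and the function name aggregate_by_email are all hypothetical, chosen only for illustration; a real aggregator's pipeline is likely far more involved.

```python
from collections import defaultdict

# Hypothetical records, one per site, mirroring the example above.
site_records = [
    {"site": "facebook", "email": "Pat@Example.com",
     "name": "Pat Doe", "birthday": "1990-01-01"},
    {"site": "amazon", "email": "pat@example.com",
     "address": "1 Main St"},
    {"site": "google", "email": "pat@example.com ",
     "gender": "female"},
]


def aggregate_by_email(records):
    """Merge every record that shares an email into one profile."""
    profiles = defaultdict(dict)
    for record in records:
        # Normalize the email so trivial differences do not hide a match.
        key = record["email"].strip().lower()
        for field, value in record.items():
            if field not in ("site", "email"):
                profiles[key][field] = value
    return dict(profiles)


profiles = aggregate_by_email(site_records)
print(profiles["pat@example.com"])
```

All three records collapse into one profile holding the name, birthday, address, and gender, even though no single site knew all four.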

How to disrupt aggregation of your data

Unsurprisingly, if you reviewed the last lesson, data aggregation depends on cross-site identifiers. If you can break these links, the aggregator won't be able to connect your accounts as belonging to the same true identity, and will create multiple separate entries instead.

By using unique identifying information on every site you use, you can eliminate much of the aggregator's power.
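As a rough illustration of why this works, the sketch below groups records by email, once with a shared address and once with a different address per site. It assumes email is the only identifier being matched; the function name link_profiles and all sample values are hypothetical, and a real aggregator may also try to match on other fields such as names or phone numbers.

```python
from collections import defaultdict


def link_profiles(records):
    """Group records by normalized email, as a simple aggregator might."""
    profiles = defaultdict(dict)
    for record in records:
        key = record["email"].strip().lower()
        for field, value in record.items():
            if field != "email":
                profiles[key][field] = value
    return dict(profiles)


# Same email everywhere: everything links into one profile.
shared_email = [
    {"email": "pat@example.com", "birthday": "1990-01-01"},
    {"email": "pat@example.com", "address": "1 Main St"},
    {"email": "pat@example.com", "gender": "female"},
]

# A different (hypothetical) email per site: nothing links.
unique_emails = [
    {"email": "pat.social@example.com", "birthday": "1990-01-01"},
    {"email": "pat.shop@example.com", "address": "1 Main St"},
    {"email": "pat.search@example.com", "gender": "female"},
]

print(len(link_profiles(shared_email)))   # one combined profile
print(len(link_profiles(unique_emails)))  # three separate entries
```

With unique emails, each entry holds only the single fact its site already knew, so the combined picture never forms.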