After social media networks found out about Deep Social’s practices, they barred the company from using their APIs to collect data, but many other companies continue this type of operation. According to Comparitech, the exposed data contained names, contact information, personal information, images and statistics. A few hours after the incident was reported, Social Data took the databases down.
It’s easy to assume that a user might not have sensitive information in one social media profile, but scraping reveals multiple sources for one person. Compiling data from different sources creates a clearer image of the digital persona, showing trends, preferences, spending habits, political preferences, location and other information.
The Xiaoxintong database contains more than 340,000 records of:
- Mobile numbers, addresses and GPS locations
- Mobile numbers and names of users’ relatives and other “Guardians”
- Location tracks (including addresses and GPS coordinates)
- Hashed passwords
- SOS records and SOS record locations
- Personal IDs
Most of these (about 285,000) were for addresses, GPS coordinates and personal IDs. The second database (possibly from Shanghai Yanhua Smartech).
Besides the legal aspect, the biggest issue is that social media networks prohibit this kind of data gathering because it violates their user policies. This hasn’t stopped companies from gathering data, in part because it’s challenging to identify traffic from organizations such as Social Data.
While Social Data denies collecting data not already available online, the simple act of scraping and matching public data is not allowed. “Anyone could phish or contact any person that indicates telephone and email on his social network profile description in the same way even without the existence of the database,” said Social Data’s spokesperson in an email to Comparitech. “Social networks themselves expose the data to outsiders – that is their business – open public networks and profiles. Those users who do not wish to provide information, make their accounts private.”