"It’s bad that someone had this whole thing wide open," Troia says. "This is the first time I've seen all these social media profiles collected and merged with user profile information into a single database on this scale. From the perspective of an attacker, if the goal is to impersonate people or hijack their accounts, you have names, phone numbers, and associated account URLs. That's a lot of information in one place to get you started."
Troia found the server while looking for exposures with fellow security researcher Bob Diachenko on the web scanning services BinaryEdge and Shodan. The IP address for the server simply traced to Google Cloud Services, so Troia doesn't know who amassed the data stored there. He also has no way of knowing if anyone else found and downloaded the data before he did, but notes that the server was easy to find and access. WIRED checked six people's personal email addresses against the data set; four were there and returned accurate profiles. Troia reported the exposure to contacts at the Federal Bureau of Investigation. Within a few hours, he says, someone pulled the server and the exposed data offline. The FBI declined to comment for this story.
Of Unknown Origin
The data Troia discovered seems to be four datasets cobbled together. Three were labeled, perhaps by the server owner, as coming from a data broker based in San Francisco called People Data Labs. PDL claims on its website to have data on over 1.5 billion people for sale, including almost 260 million in the United States. It also touts more than a billion personal email addresses, more than 420 million LinkedIn URLs, more than a billion Facebook URLs and IDs, and more than 400 million phone numbers, including more than 200 million valid US cellphone numbers.
PDL cofounder Sean Thorne says that his company doesn't own the server that hosted the exposed data, an assessment Troia agrees with based on his limited visibility. It's also unclear how the records got there in the first place.
"The owner of this server likely used one of our enrichment products, along with a number of other data enrichment or licensing services," says Sean Thorne, cofounder of People Data Labs. "Once a customer receives data from us, or any other data providers, the data is on their servers and the security is their responsibility. We perform free security audits, consultations, and workshops with the majority of our customers."
Troia thinks it's unlikely that People Data Labs was breached, since it would be simpler to just buy data from the company. An attacker on a budget could also sign up for a free trial PDL advertises that offers 1,000 consumer profiles per month. "One thousand profiles to 1,000 burner accounts and you've got pretty much all of it," Troia points out.
One of the other data sets is labeled "OXY" and every record in it also contains an "OXY" tag. Troia speculates that this may refer to Wyoming-based data broker Oxydata, which claims to have 4 TB of data, including 380 million profiles on consumers and employees in 85 industries and 195 countries around the world. Martynas Simanauskas, Oxydata director of business to business sales, emphasized that Oxydata hasn't suffered a breach, and that it does not label its data with an "OXY" tag.