June 5, 2018

MyHeritage, an Israel-based genealogy and DNA testing company, disclosed today that a security researcher found on the Internet a file containing the email addresses and hashed passwords of more than 92 million of its users.

MyHeritage says it has no reason to believe other user data was compromised, and it is urging all users to change their passwords. It says sensitive customer DNA data is stored on IT systems that are separate from its user database, and that user passwords were “hashed” — or churned through a mathematical model designed to turn them into unique pieces of gibberish text that is (in theory, at least) difficult to reverse.

MyHeritage did not say in its blog post which method it used to obfuscate user passwords, but suggested that it had added some uniqueness to each password (beyond the hashing) to make them all much harder to crack.

“MyHeritage does not store user passwords, but rather a one-way hash of each password, in which the hash key differs for each customer,” wrote Omer Deutsch, MyHeritage’s chief information security officer. “This means that anyone gaining access to the hashed passwords does not have the actual passwords.”

The company said the security researcher who found the user database reported it on Monday, June 4. The file contained the email addresses and hashed passwords of 92,283,889 users who created accounts at MyHeritage up to and including Oct. 26, 2017, which MyHeritage says was “the date of the breach.”

MyHeritage added that it is expediting work on an upcoming two-factor authentication option that the company plans to make available to all MyHeritage users soon.

“This will allow users interested in taking advantage of it, to authenticate themselves using a mobile device in addition to a password, which will further harden their MyHeritage accounts against illegitimate access,” the blog post concludes.

MyHeritage has not yet responded to requests for comment and clarification on several points. I will update this post if that changes.

ANALYSIS

MyHeritage’s repeated assurances that nothing related to user DNA ancestry tests and genealogy data was impacted by this incident are not reassuring. Much depends on the strength of the hashing routine used to obfuscate user passwords.

Thieves can use open-source tools to crack large numbers of passwords that are scrambled by weaker hashing algorithms (MD5 and SHA-1, e.g.) with very little effort. Passwords jumbled by more advanced hashing methods — such as Bcrypt — are typically far more difficult to crack, but I would expect any breach victim who was using Bcrypt to disclose this and point to it as a mitigating factor in a cybersecurity incident.
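
To illustrate why fast, unsalted hashes fall so quickly, here is a minimal Python sketch of a dictionary attack against MD5. It is an illustration only: MyHeritage has not said which algorithm it used, and the helper names (`md5_hex`, `dictionary_attack`), the sample passwords and the tiny wordlist are all made up for this example.

```python
import hashlib


def md5_hex(password):
    """Return the unsalted MD5 hex digest of a password (weak by design here)."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def dictionary_attack(leaked_hashes, wordlist):
    """Map each leaked hash to its plaintext if the plaintext is in the wordlist.

    Because the hashes are unsalted, every guess is hashed once and then
    checked against all leaked hashes with a simple lookup.
    """
    lookup = {md5_hex(word): word for word in wordlist}
    return {h: lookup[h] for h in leaked_hashes if h in lookup}


# Hypothetical leaked hashes: one common password, one long passphrase.
leaked = [md5_hex("Password!1234"), md5_hex("correct horse battery staple")]

# Hypothetical (very small) attacker wordlist.
wordlist = ["password", "Password", "Password1234", "Password!1234"]

cracked = dictionary_attack(leaked, wordlist)
print(list(cracked.values()))  # ['Password!1234']
```

Real cracking tools work on the same principle with wordlists of billions of guesses, which is why the choice of hashing routine matters so much.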

In its blog post, MyHeritage says it enabled a unique “hash key” for each user password. It seems likely the company is talking about adding random “salt” to each password, which can be a very effective method for blunting large-scale password cracking attacks (if implemented properly).
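
As a rough sketch of what per-user salting can look like (and not a description of what MyHeritage actually does), the Python below stores a random salt next to a slow, salted hash of each password. It uses PBKDF2 from the standard library as a stand-in for Bcrypt; the iteration count, salt length and function names are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; real deployments tune this
SALT_BYTES = 16       # assumed salt length


def hash_password(password, salt=None):
    """Return (salt, digest). A fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# Two users who picked the same password end up with different stored hashes,
# so an attacker cannot crack both with a single precomputed lookup.
salt_a, digest_a = hash_password("Password!1234")
salt_b, digest_b = hash_password("Password!1234")
print(digest_a != digest_b)
print(verify_password("Password!1234", salt_a, digest_a))
```

The salt is not a secret and is stored alongside the hash. Its job is to force an attacker to attack each account separately rather than all 92 million at once.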

If indeed the MyHeritage user database was taken and stored by a malicious hacker (as opposed to inadvertently exposed by an employee), there is a good chance that the attackers will be trying to crack all user passwords. And if any of those passwords are crackable, the attackers will then of course get access to the more personal data on those users.

In light of this and the sensitivity of the data involved, it would seem prudent for MyHeritage to simply expire all existing passwords and force a password reset for all users, instead of relying on them to do it themselves at some point (hopefully, before any attackers might figure out how to crack the user password hashes).

Finally, it’s astounding that 92 million+ users thought it was okay to protect such sensitive data with just a username and password. And that MyHeritage is only now getting around to developing two-factor solutions.

It’s now 2018, and two-factor authentication is not a new security technology by any stretch. A word of advice: If a Web site you trust with sensitive personal or financial information doesn’t offer some form of multi-factor authentication, it’s time to shop around.

Check out 2fa.directory, and compare how your bank, email, Web/cloud hosting or domain name provider stacks up against the competition. If you find a competitor with better security, consider moving your data and business there.

Every company (including MyHeritage) likes to say that “your privacy and the security of your data are our highest priority.” Maybe it’s time we stopped patronizing companies that don’t outwardly demonstrate that priority.

For more on MyHeritage, check out this March 2018 story in The Atlantic about how the company recently mapped out a 13-million person family tree.

Update, June 6, 3:12 p.m. ET: MyHeritage just updated their statement to say that they are now forcing a password reset for all users. From the new section:

“To maximize the security of our users, we have started the process of expiring ALL user passwords on MyHeritage. This process will take place over the next few days. It will include all 92.3 million affected user accounts plus all 4 million additional accounts that have signed up to MyHeritage after the breach date of October 26, 2017.”

“As of now, we’ve already expired the passwords of more than half of the user accounts on MyHeritage. Users whose passwords were expired are forced to set a new password and will not be able to access their account and data on MyHeritage until they complete this. This procedure can only be done through an email sent to their account’s email address at MyHeritage. This will make it more difficult for any unauthorized person, even someone who knows the user’s password, to access the account.”

“We plan to complete the process of expiring all the passwords in the next few days, at which point all the affected passwords will no longer be usable to access accounts and data on MyHeritage. Note that other websites and services owned and operated by MyHeritage, such as Geni.com and Legacy Family Tree, have not been affected by the incident.”
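
For readers curious how an email-only reset of the kind described above is commonly built, here is a minimal Python sketch. It is not MyHeritage's implementation: the one-hour lifetime, the record layout and the function names are assumptions for illustration, and sending the email is left out.

```python
import hashlib
import hmac
import secrets
import time

TOKEN_TTL_SECONDS = 3600  # assumed lifetime of a reset link


def _token_hash(token):
    """Hash the token so the server never stores the emailed value itself."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def issue_reset_token(now=None):
    """Create a random token to email to the user, plus the record to store."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    record = {
        "token_hash": _token_hash(token),
        "expires_at": now + TOKEN_TTL_SECONDS,
    }
    return token, record


def redeem_reset_token(token, record, now=None):
    """Accept the token only if it has not expired and matches the stored hash."""
    now = time.time() if now is None else now
    if now > record["expires_at"]:
        return False
    return hmac.compare_digest(_token_hash(token), record["token_hash"])


token, record = issue_reset_token(now=1000.0)
print(redeem_reset_token(token, record, now=1500.0))       # True
print(redeem_reset_token("a-guess", record, now=1500.0))   # False
```

The point of the design is that knowing the old password is not enough: the person resetting the account also has to control the mailbox that receives the token.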


36 thoughts on “Researcher Finds Credentials for 92 Million Users of DNA Testing Firm MyHeritage”

  1. The Sunshine State

    Thanks for the heads up on this breach!

  2. Kallen Web Design

    “In light of this and the sensitivity of the data involved, it would seem prudent for MyHeritage to simply expire all existing passwords and force a password reset for all users…”

    Absolutely! This would pretty much assure that all this personal information stays “safe”. Why would companies who’ve experienced a breach resist this?

    1. Security Guy

      The reason any company would resist this kind of global change is that it could flood their support desk with customers who cannot log in for one reason or another. If even 0.1% of their customers have login problems after a global password reset, that would be 92,000 calls.

    2. Barry Greene

      Forcing a reset on all passwords would only work if the system was built with that feature. Given the number of incidents, an ability to force reset on all would be a prudent BCP.

    3. Petrus 1849

      Hacked, taken by disgruntled employee or sold by currently unhappy employee?

  3. Harry Stoner

    “Finally, it’s astounding that 92 million+ users thought it was okay to protect such sensitive data with just a username and password”

    There is nothing astounding about this. Every user of most web sites expects/wants nothing more than a userid and password. Yes, a small subset will seek out two factor solutions if provided.

    However Brian, as a public person with a big target on your back, I would expect you to use every means at your disposal to protect your accounts.

    1. SkunkWerks

      “Every user of most web sites expects/wants nothing more than a userid and password.”

      And that’s only about 50% true in 65-ish% of cases, I’d wager.

      User: “It says I need a username and password, not this sh** again! Well, I mean lessee, my username’s already my email, so now I need a password…”

      [Enters ‘password’ in the password field]

      Website: “Your password is not strong enough, please use a capital letter.”

      [User sighs, enters ‘Password’ instead.]

      Website: “Your password is not strong enough, please use one or more digits (0-9).”

      [Audible growling is heard, user tries ‘Password1234’]

      Website: “Your password is not strong enough, please use one or more special characters ( { ! , # ).”

      [A keyboard is slapped, user backspaces and tries ‘Password!1234’]

      Website: “Success, your account has been created, a verification email will be sent to your email address with further instructions.”

      User: “Hah! Showed that site!”

      [The sound of Smug reverberates for several minutes thereafter]

      I get what Brian is saying though. In 92 million accounts, you’d expect to find at least a few folks who thought securing this better was important. That said, I think attrition is reached when you realize the system doesn’t offer that kind of protection, and the people in account security you just emailed replied back with the textual equivalent of looking at you like you have three heads.

      Absolute attrition is reached when you realize most of the adjacent market reacts in the same way- because of the 92-million accounts, I’m pretty sure the “someones” who thought this could be done better- and voiced that concern- account for less than a fraction of a percentile.

      Basically put: the market responds to demand. There’s very little of that, because most people either don’t understand, don’t care, or more likely, both.

  4. Dennis

    Fair deal. But tbh, none of them support 2FA: 23andMe doesn’t support it, and neither does Ancestry.com.

    I’ll try to tweet at them and include this story. Maybe that will rock things a little bit.

  5. Donny

    One important thing getting lost is that only about 1.4 million users on MyHeritage have DNA profiles, not nearly all of those breached. Also, e-mail authentication is required for downloading raw DNA data, as it is on all of the major genetic genealogy sites with the exception of Family Tree DNA. Unfortunately 23andMe will let you view reports and individual variants without 2-factor authentication.

    MyHeritage’s announcement comes on the same day as the data breach, and is heavily in PR damage-control mode. Who hacks a DNA testing service, downloads login credentials just one time and leaves them hanging where others can find them for over half a year, without anybody even trying to use them in any way?

    Absent from the announcement is a statement that the leak has been found and plugged; instead there’s just a statement that they’re in the process of retaining a firm to investigate the potential means and scope of the breach. In this context it’s perhaps important to stress that two-factor authentication doesn’t protect against a database breach, it only protects your login credentials.

  6. vb

    The “chief information security officer” sounds like a marketing/PR person. If he’s unable to state the hashing algorithm or if salt was used, that is not a good sign. It sounds like he knows a couple of words like “hash” and “hash key”.

    1. Anon404

      That’s not true. He’s likely “unwilling” to say the hashing algorithm.

      1. Bob Brown

        Mitigating in favor of a PR person (or a clueless person) is the confusion between “hash key” and “salt.”

        One would expect, at the least, that they say, “a cryptographically secure hashing algorithm,” and trying to keep the algorithm secret is security by obscurity.

        1. Donny

          Not to go all lawyer on this, but the GDPR, to which the announcement specifically refers, requires in Art. 34(2) about disclosure: “The communication to the data subject referred to in paragraph 1 of this Article shall describe in clear and plain language the nature of the personal data breach […]”

          Consequently, the disclosure to users must not use jargon or technical terms that the average reader would not understand (from GDPR interpretations). This is a very good idea, as the intended audience aren’t computer security professionals. Recital 88 further limits disclosures that affect an ongoing investigation, obviously.

  7. jimmy

    The human genome is so large and so unexplored yet these ancestry testing companies only use a teeny tiny bit of it for their ancestry tests. Within these companies, your DNA is simply compared to other known samples – not really uniquely explored or identified. Therefore, these ancestry tests can and often will be wildly inaccurate and are more akin to high-tech tea leaves than definitive information.

    For example, if you send your DNA to 3 different companies, you may easily get 3 different results. Plus, now we know that no data is secure, including the data produced from your DNA sample.

    As inaccurate as these tests are for ancestry, they are deemed accurate enough to implicate you in a crime.

    TL;DR These tests are not accurate enough to tell you about your real ancestry but accurate enough to implicate you in a crime. No data is safe. Don’t volunteer your DNA to these companies.

    1. Donny

      The main reason many people use these sites is genealogy, the ability to find distant cousins who are lost to the books of history, which isn’t easy to do without a database to search against (Though doing just that is a hot topic of computer research). I expect the issue will quickly fade though, as we’re leaving our DNA everywhere we go, and at present time it would be easier and cheaper (and in many cases legal) for an adversary to sequence your DNA from a water bottle you discarded than obtain it via most data breaches. The real value in genetic data is other data that’s attached to it. You could say genealogical data itself is the risk, without which it would be impossible to connect one DNA profile to another.

      I’m sure we all love our privacy here, but thrashing the well-established science of population genetics (https://www.ncbi.nlm.nih.gov/pubmed/26857625 for example – no need to understand that, just know it exists) seems misplaced. Since DNA is inherited in large continuous tracts, the “teeny tiny bit” these companies test is actually too much for recent genetic ancestry, and needs to be thinned down for that use. There are no biologically forced populations, “races”, however and therefore the populations are defined differently by different companies.

      The argument is equivalent to “Don’t volunteer your data to navigator companies, they don’t even give the same route from each company”. Yet, for the company’s algorithms to work their magic properly, you usually need to share your location, even if it won’t guarantee you the best possible route plan in the industry. Of course, when people are additionally knowingly consenting to medical research, as many are doing, your admonishment becomes much closer to “Never donate blood to a company, they can detect diseases, sequence your genome and frame you with your blood while getting rich on it”.

      There are privacy and ethical concerns for sure, but as human beings we owe it to ourselves to get those right. So let’s continue to keep companies accountable, and push for the best technical solutions, instead of hiding our heads in the sand and saying that only governments deserve and should have access to our genome. Or perhaps you were thinking they will stop using it too?

      1. Russell Imrie

        *like* “The argument is equivalent to ‘Don’t volunteer your data to navigator companies, they don’t even give the same route from each company’.”

    2. Reader

      +1, the points you make are valid and thoughtful

  8. Robert Waters

    DNA, considered the final human data, has become healthcare data, business data and government data. Almost every science is seeking how it works, but the social sciences have already leaped ahead by attributing “social behavioral theories” to DNA’s quality per person (disease probability, intelligence, wealth creators, hiring metrics, etc.). If H.R.1313 ever passes from committee into Congress, allowing businesses to get ahold of DNA from employees and family members, then we’ll have problems unimaginable.

  9. Reader

    Brian,

    I think MyHeritage ought to be commended for their rapid disclosure. It’s unusual for an announcement to be made one day after a breach is discovered. That was pretty responsible of them.

    Regarding expiring passwords, do you have evidence that this company failed to use the best practices you advocate, i.e. salt, hash, bcrypt? I read this article twice and I’m unsure why you feel they should choose the nuclear option.

    Next, is it fair to criticize a company for not yet implementing 2fa, when most of its competitors also don’t use 2fa? Your own articles have pointed out how most commercially available 2fA methods have shortcomings.

    Finally, if they did expire all passwords, what could customers use to reset their password and prove their identities that hasn’t already been sold and traded on the dark web? Isn’t it worse to tell 92,000,000 customers to click on a link in their email?

    1. Bobby Jr

      Of course it’s fair to ask them to use 2FA when their competitors don’t. The argument you’re making probably didn’t work when you were a kid and so it shouldn’t now: “But Mooooommm, Sally didn’t eat her broccoli either!” was never and will never be a valid argument. What’s right is right.

      1. Reader

        Bobby,

        If Sally has a good reason to skip the broccoli, it would be wise to ask why.

        Perhaps broccoli provides inadequate nutrition to justify the nausea it induces. Plenty of kids absolutely hate broccoli and will skip the entire meal because of that fetid, vile vegetable.

        Sally may know more about the broccoli preparation. Perhaps it’s undercooked or poorly seasoned. Perhaps the broccoli is rotten.

        There’s plenty of evidence that 2fA frustrates the casual criminal, but little evidence it can stop focused, high-tech thieves.

        There’s plenty of evidence that 2fA leads to higher customer service costs and has no effect on marketing or sales.

        If Sally vomits every time she’s forced to eat broccoli, will you stop feeding her broccoli?

        Why, then, should a company implement 2fA when their competition doesn’t?

        1. Bobby Jr

          I dunno, to offer more protection than their competitor thereby differentiating them from the herd? To better protect their customers?

          1. Reader

            Bobby,

            Standing out from competitors is a good business strategy only if it increases profit. I am not aware of any examples where a consumer-oriented company increased profits by implementing 2fA.

            These genealogy companies are all selling a single product to a small portion of the population with questionable pedigree and disposable income.

            On what basis do they compete for DNA kits? Price.

            Could they compete based on having more complex security? No.

            Time and again, the public has made it abundantly clear that they don’t care about security of data.

            PASSWORD is consistently among the most used passwords.

            Most “social media” users and Gmail and Hotmail users don’t use 2fA, despite it being available for free, nor do they choose their providers based on 2fA. Banking customers overwhelmingly reject 2fA, when available, because they don’t want the hassle.

            2fA in the genealogy market will just cost them more in customer service, an expense. In a limited market, implementation of 2fA is a poor business strategy.

  10. M E Emberson

    I like your blog. However, I would prefer not to read it in Gmail just now. I have tried and tried for months to cancel using the cancel section in the notice of a new topic you send to Gmail.

    It invariably tells me that my Gmail address is not on the list, BUT this reply is sent to Gmail, to my address.

    To sum up: I get a message to an address which does not exist on your list, and therefore I cannot cancel it because it doesn’t exist… then another message appears again when there is a new topic.

    1. Bob Brown

      Did you perhaps use the “plus-hack” when you subscribed?

  11. Nick

    I called them and asked what hashing/salting algorithm they used and they wouldn’t tell me unless I gave them an account ID #. (Fortunately I don’t have one.)

    But perhaps if one of their customers calls and asks we may get the answer to this question?

  12. Andrew F Read

    I’ve had the opposite problem with 23andMe. I have been with them since 2011, but this year they locked me out of my own account saying my password is not recognized — even though it is the same one I’ve been using with them for 6+ years. And their customer ‘support’ is a computer program which can not handle whatever the problem is.

    So now my own data are secure from me. If someone can hack their system and finds what 23andMe thinks my password is, please send it to me.

  13. Bard of Bumperstickers

    I’m more concerned about that X-Files episode with everyone’s DNA, captured during inoculation at public school, stored in that remote hollowed-out mountain in West Virginia. Sounds like predictive programming. Don’t trust nothin’ . . .

    1. Bard of Bumperstickers

      “US Intelligence Developing Better Storage To HOARD YOUR DATA Based On Human DNA”:

      http://www.shtfplan.com/headline-news/us-intelligence-developing-better-storage-to-hoard-your-data-based-on-human-dna_06132018

      What could possibly go wrong with Uncle “Nurse Ratched” Sam having this info? The revolving door between Washington, K Street, Wall Street, Big Pharma, Big Insurance, Military-Industrial Complex, Lucy Van Pelt, etc., would NEVER pull the football away yet again, as John Q. “Charlie Brown” Citizen kicks trustingly and forgivingly . . . yet again! };^D
