
And half the DNA of all of the siblings and parents of the people that submitted their DNA, a quarter of their grandparents and grandchildren and so on. That's what I really hate about these companies, they get people to submit their DNA and the customers do not realize it isn't a decision that affects just them.
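To put rough numbers on that: the expected fraction of autosomal DNA you share with a relative halves with every meiosis on the path between you, summed over each common ancestor. The sketch below is only an illustration; the `RELATIVES` table and the function name are made up for this example, and apart from parent/child the figures are averages, not exact values for any given pair.

```python
# Expected share of autosomal DNA in common with a relative.
# Each path through a common ancestor contributes 0.5 ** (meioses on that path).
# The table below is a hypothetical illustration, not any company's data model.
RELATIVES = {
    "parent": [1],
    "child": [1],
    "full sibling": [2, 2],   # one path via the mother, one via the father
    "grandparent": [2],
    "grandchild": [2],
    "aunt/uncle": [3, 3],     # one path via each shared grandparent
    "first cousin": [4, 4],
}


def expected_shared_fraction(relationship: str) -> float:
    """Average fraction of autosomal DNA shared with the given relative."""
    return sum(0.5 ** steps for steps in RELATIVES[relationship])


if __name__ == "__main__":
    for name in RELATIVES:
        print(f"{name}: {expected_shared_fraction(name):.1%}")
```

So one customer's sample exposes, on average, about half of each parent's and sibling's autosomal DNA and about a quarter of each grandparent's.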


In terms of medical data, the amount of leakage is somewhat limited by the nature of DNA: because you inherit a random mix, you can't conclude anything about the medical status of your parents or other relatives.

By far the biggest practical knock-on effect is if you match someone who doesn't know their parentage (adoption/illegitimate children/etc.) and who can figure out their parentage as a result of that match.

Familial DNA crime searches are probably the next biggest, but they're still very rare at the moment and many of the DNA platforms don't allow them (GEDmatch was one of the few that did).


> and many of the DNA platforms don't allow them

I’m assuming that the stolen information doesn’t have this limitation.


> I’m assuming that the stolen information doesn’t have this limitation

Stolen information has provenance problems that make it difficult to use as evidence of any crime other than theft itself in any system with even rudimentary due process protections and presumption of innocence.

I mean, it's hardly as if you are going to be able to get the people who handled the data between the people who had it lawfully and the time it got to the police on the stand to attest to its integrity.

(That doesn't prevent its use in investigations, but it means that it would only lead to convictions in a contested case where the police used it to locate proof that was legally sufficient without the use of the DNA as evidence.)


Law enforcement is trained in information laundering and courts consider it a legitimate tactic.


It's easy to use this information to search for suspects, but not bring it up in court once you find other evidence.

https://en.wikipedia.org/wiki/Parallel_construction


Most law enforcement agencies typically use a handful of commercial companies (like Parabon NanoLabs, etc.) for these kinds of searches. It seems highly unlikely that any of them would risk using illegally obtained data, because it would put their entire business at risk.

(obviously if your threat model includes intelligence agencies, etc. then your calculus might be different)


And of course law enforcement has never before used illegally obtained evidence to construct a new trail that was plausible:

https://en.wikipedia.org/wiki/Parallel_construction


It's not about law enforcement using the data, it's about the viability of running a business that provides illegal hacking services for law enforcement.


Works for Hacking Team and NSO.


Honestly all I see is upside for the business. Are they even obliged to "show their work" for how they produce an identity and distance?


I’m cackling at the idea of GDPR compliant data thieves. “This data is only to be used for the purposes of: anything. The data controller is: whoever.”


If a child has 2 copies of a variant you know both parents have at least 1 copy.

You know what parent a male's X and Y came from.

You can use phasing and linkage to reconstruct parental haplotypes.
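The first inference can be sketched as a brute-force Mendelian consistency check. This is a minimal illustration under stated assumptions: an autosomal site with two alleles, correct genotype calls, and no new mutations. The function name is hypothetical, and it covers only the single-variant case, not phasing or linkage.

```python
from itertools import product


def possible_parent_genotypes(child_alt_copies: int) -> set[tuple[int, int]]:
    """Return (mother, father) alt-allele counts consistent with the child.

    Genotypes are counts of the variant allele: 0, 1, or 2.
    Assumes an autosomal biallelic site and no de novo mutation.
    """
    # Alt-allele counts a parent with a given genotype can pass on.
    transmissible = {0: {0}, 1: {0, 1}, 2: {1}}
    consistent = set()
    for mother, father in product(range(3), repeat=2):
        if any(
            m + f == child_alt_copies
            for m in transmissible[mother]
            for f in transmissible[father]
        ):
            consistent.add((mother, father))
    return consistent


if __name__ == "__main__":
    # A child with 2 copies: every consistent pair has at least 1 copy per parent.
    print(sorted(possible_parent_genotypes(2)))
```

For a child with two copies, the only consistent pairs are those where both parents carry at least one copy, which is the claim above. A child with one copy narrows things much less.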


> You know what parent a male's X and Y came from.

You can identify which parent any chromosome came from. They're all marked, and the same genetics may do sharply different things depending on whether it was inherited from the father or the mother.

Inability to recover this data has nothing to do with "the nature of DNA" -- the data is very much present in the DNA. It's unrecoverable because when we summarize DNA, we leave it out.


> You can identify which parent any chromosome came from. They're all marked, and the same genetics may do sharply different things depending on whether it was inherited from the father or the mother.

I did not know this. This sounds interesting! Can you provide any google search terms (or a link) where I can read more about this? (e.g. a name of what they are marked with) This surprises me. I thought that there was a process by which portions of the two copies of a chromosome get switched between the two. Is that right? How does that fit together with these markings?

(If these questions would be answered by searching for whatever search term or reading whatever link you provide, I would consider providing said search term or link to be answering these questions)


> Can you provide any google search terms (or a link) where I can read more about this? (e.g. a name of what they are marked with)

The term I know related to this is "methylation". https://en.wikipedia.org/wiki/DNA_methylation . I don't know all that much about it; I would not want to claim that methylation is the only such mechanism, or that this is the only information expressed by DNA methylation.

> I thought that there was a process by which portions of the two copies of a chromosome get switched between the two. Is that right? How does that fit together with these markings?

Yes, that's correct. "Crossing over" does not occur during ordinary cell division ("mitosis"), in which one of your cells divides into two of your cells -- your chromosomes should stay the same (except for new mutations) through your life.

But it does occur during meiosis, the process by which one of your cells divides into four sperm or four eggs (these are "gametes", and in terms of chromosome content they are only half-cells, not full cells). Your children's chromosomes may therefore differ from yours.

So the interaction between parental marking and crossing over would broadly look like:

1. You are going to produce four gametes.

2. Remove the parental marking (indicating the sex of the gamete's grandparent) from the cell undergoing meiosis.

3. Do the crossing over.

4. Apply parental marking indicating your own sex (the gamete's parent, rather than grandparent).

5. Divide into four cells.

I don't actually know where the unmarking and remarking occur in the process; maybe reality is more like 2435, or 3254. But both crossing over and applying correct parental marking are part of meiosis -- since meiosis produces a cell that belongs to your child rather than a cell that belongs to you, it's easy to know what kind of marking should be applied.


Ahhh, cool, thank you! That makes sense now, thanks!


Yes, but there's generally not much medical data you can infer from those.

You're right that you could reconstruct parental haplotypes, but that reveals a fairly limited amount of data, typically you'll share haplotypes with many millions of people.


> Yes, but there's generally not much medical data you can infer from those.

Not yet.


What would insurance companies do with the data though? If they knew you were predisposed to obesity and cancer due to this data, would they be kind enough to ignore that info?


Federal law prohibits health insurers from using DNA data for underwriting and pricing.


And if it didn't, the insurance companies could just demand a DNA test before underwriting any policies.


If we’re hypothetically considering what they could do if they weren’t one of the most regulated industries, they have exponentially better options for limiting their risk than requesting DNA.


It would be a sound business decision (who takes on unnecessary risk or costs willingly?), and yet another reason to support universal healthcare.


It’s not only medical data that’s of concern: nation states could also try to use the data to identify foreign agents/spies embedded in their country. Those are the ones without diplomatic cover.


Ignorant question here. How is this not regulated through HIPAA? Shouldn't the board members of this company face prison? DNA, a prosecutor could argue, is a unique health identifier.

"Access to equipment containing health information should be carefully controlled and monitored."

https://en.wikipedia.org/wiki/Health_Insurance_Portability_a...


People think of HIPAA as a generic cover-all medical privacy law for some reason.

It's not, not even close. It's a law that very narrowly applies mainly to insurance companies and healthcare entities that accept medical insurance.

As a general rule - if insurance is never involved HIPAA doesn't apply.

If you got a DNA test prescribed by your doctor for a diagnosis or even for genetic counseling then HIPAA applies. It's not the nature of the data, it's the nature of the organization dealing with the data.

I have no idea where this mass misunderstanding came from.


"if insurance is never involved HIPAA doesn't apply."

No. This is just plain false.

HIPAA applies when personally identifiable health information is shared/exchanged. And it applies whether the data is electronic or physical (paper).

(I am NOT saying DNA falls within the HIPAA guidelines.)


No, personally identifiable health information can be shared/exchanged without HIPAA applying. For example if I email my grandma information about my cancer diagnosis, Gmail isn't HIPAA compliant and doesn't need to be just because some people might use it to talk about their health. Grandma is also free to share my health information with impunity, she is free to, say, forward it to my boss because grandma doesn't have to abide by HIPAA either because she's a grandma.


Correct, you can personally share whatever information you like.

But a covered entity may not. And there are many covered entities which are not insurance related. That is all I was trying to say.


The privacy rule only applies to covered entities. If a covered entity works with a cloud provider, they sign a BAA. The cloud provider is not a covered entity.


HIPAA only applies to a specific list of covered entities... health providers, insurance, etc.

DNA services are not currently considered covered entities.

They should be, IMO, but I believe Congress would have to act.


More accurately, "DNA services for funsies" are not covered entities. Medical labs that sequence DNA in the realm of actual healthcare (and accept medical insurance) are covered entities.


What if they construe their DNA data not as health information, but as information like fingerprints?


"I'm standing here in this chalk circle where HIPAA does not apply, can't touch me, nyah nyah!" Sounds like that would work against a 5-year-old sibling, but that's rarely the case...


It's not covered. The ones at which we should be most angry are law enforcement officers using this information. This is simply a first step to the state collecting DNA on all citizens (see what it's done with fingerprints as an example.)


This is exactly how I feel about my friends/family having Facebook apps on their phone. I didn't consent to giving my contact info to Facebook. I wasn't given a choice.


Agreed. But at least I don't leak their data in return. I figure Facebook must have 99%+ coverage of the world's social graph by now, including all the holdouts. You may not have an account, but they know you exist, what you look like, what your phone number is and probably where you are just by observing the nodes that are still 'blank'. Shadow profiles should be assumed to be just as detailed as the rest. It's one reason why there are very few photographs of me online (or elsewhere). I'd like the option to go rogue one day to be open to me ;)


It is impossible to have an "expectation of privacy" over the DNA of your relatives. You need to live with it, and resolve your feelings to the reality of that situation.

You don't get a choice if your uncle, grandmother, aunt, niece or son share their DNA with law enforcement.


I'm not sure it's just feelings to be resolved. The problem is that even DNA matching, especially the SNP genotyping which tends to be used by consumer ancestry and heritage services, is not perfect. So your daughter can end up false-positively matching to a crime 20 years from now due to your uncle submitting his own DNA last year without you ever knowing. I'm not sure how anyone sufficiently familiar with the implications can just get over it and accept the reality of the situation. It is by no means an easy problem to solve.


That’s very true but it’s more of a criminal justice problem than a privacy problem. The same issue could happen (and has) with fingerprints or other biometric data.


Do they not realize? It's just served as a convenience for family trees and cure together services. It's the same mechanism as any "social app" that asks for access to all your phone contacts for "easy friend discovery": you may have never used TikTok or Facebook, but they already know quite a few things about you thanks to an acquaintance. I know DNA stuff seems scarier in the long run, but data from an address book is more easily exploitable.


So I should get the consent of my entire extended family before I ever submit my DNA to a service for analysis?


Given the likely very dire results, yes, I think you should. If your mother's insurance rate goes up because you got one of these DNA tests for Christmas, she should be involved in the decision to publish this data in the first place.


In the US, Congress has passed a law that explicitly makes that specific practice illegal: https://en.m.wikipedia.org/wiki/Genetic_Information_Nondiscr...

What workarounds insurance companies come up with to circumvent the spirit of the law and how well it can be enforced will be interesting.


And George W. Bush, a Republican, signed this into law. I remember thinking that strange at the time because I would have thought insurance companies would want to be able to use DNA information and the Republicans being more of a "big business" party would have supported that.

Also, I found out last time this discussion came up on HN that the law prevents it being used for regular insurance but does not apply to life insurance.


And life insurance could simply demand a DNA sample from you before underwriting a policy, just like they might demand a physical, so the whole "concern" is entirely moot.

Insurance is highly regulated, and insurance companies have specific legal ways to underwrite policies. The idea that life insurance companies are going to secretly use stolen data of uncertain provenance in their underwriting, instead of just making you submit a DNA sample, is, quite frankly, silly.


> What workarounds insurance companies come up with to ... will be interesting

If there's enough money to be made, I'm sure the Usual People will be persuaded to bend the law until it gives way.


I think so, yes. Otherwise you're sending large portions of their personally identifiable information to some sort of database without their consent.


Do you also think I should consult my identical twin for permission before uploading photos of myself to the internet? Why / why not?


Hardly a fair comparison. However the few identical twins I know have been very mindful of how their individual behavior affects the other.


Of course I realised the comparison would be somewhat controversial, it was actually the point of bringing it up. However, if you have the time I would appreciate it if you tried to articulate why you think the comparison is unfair, instead of just a general dismissal.


Your (twin’s) photo is unlikely to be used for:

* Identifying future medical risk factors

* Solving 30-year-old cold cases where DNA is the only evidence

* Identifying parentage in adoption cases


But my (twin's) photos could likely be used for:

* Linking them to the location of a crime using Clearview AI and similar scraping facial recognition services

* Creating fake but believable defamatory photos and videos, such as deepfakes

* Being scraped and used in fake profiles by spambots and other nefarious actors

* Being exploited as a tool in identity theft and identity fraud, via various kinds of social engineering.

Do you not consider some of these scenarios worthy of a similar amount of consideration?


I consider them to be unavoidable, barring some extreme off-the-grid efforts. Your photo is out there. Your DNA doesn’t have to be.


I'd argue not, since a single photograph contains much, much less information than a full DNA fingerprint.


Not for identifying or incriminating you, it doesn't, given the practical risk of how the information can be used.


That's a fair point. However, I'm not entirely sure I buy the premise. With the advent of deepfakes and internet-scraping facial recognition, I think a public photo collection of your entire likeness could be considered at least somewhat at risk for abuse, when compared to the risk that a confidential fingerprint with ~25% of your genes is leaked and then used against you.


Your data, your rules. I put my 23andMe raw data on GitHub (https://github.com/sbassi/MiGenomaSbassi) for the world to see and use without asking anybody in my family.


I strongly disagree. I consider it comparable to something like financial administration, in which an "expense" or "exchange" has two sides: me paying you, and you receiving the money.

It is not up to me to decide to just release such data, because it encodes other people's data too. If I were to release my financial records because "it's my data", I'd be exposing a lot of people, organisations and companies that I had interactions with.


But it is up to me to decide to release my financial records. All the parties I've dealt with have to expect the possibility (unless there is some signed agreement that prevents disclosing them).

With DNA I'm not so sure.


I'm pretty sure that if <insert ecommerce platform here> were to leak all their financial transactions, that would be considered a large data breach and a privacy infringement.

I am aware that "an ecommerce platform" is something else than "your personal finance", but the principle is the same: X shouldn't release other people's financial transactions just because those were done with X.


The federal Genetic Information Nondiscrimination Act does prohibit insurers from asking for or using your genetic information to make decisions about whether to sell you health insurance or how much to charge you. But those privacy protections don't apply to long-term-care policies, life insurance or disability insurance.

https://www.npr.org/sections/health-shots/2018/08/07/6360262...


The point is, it's not just your data. It's shared with other people who may not want it being publicised to all and sundry.


That's ridiculous. Should I not use my surname because I identify my parents, my brothers and some of my cousins? I think bodily autonomy applies here.


You are giving far more away with DNA information; it's not remotely the same.

The point is, at the very least it's a grey area, so to dismiss the counter points so airily as you have done on such a serious subject indicates - at best - a lack of reflection and respect for the rights of others.


No, but for having the results stored at some company.


This is really another example of a claim of "genetic exceptionalism", that genetic information has a special status among other sorts of personal information, that mostly is not true. Your personal information, broadly, is informative about your relatives, your friends, etc. This includes your personal health information, your personal financial information, your online habits, etc. Any time you share personal data, you are disclosing information about people associated with you, without their consent, that might be used against them. And often these other classes of personal data are more informative than genetic information.


I remember hearing about some dude in his 80s, arrested out of the blue for a murder he committed ~30 years ago.

The police used the crime scene's partial DNA and compared it to somebody's 23andMe sample.

Thanks a lot, grandson!


That's probably the story you are referring to: https://www.sciencemag.org/news/2018/10/we-will-find-you-dna...


Oh, wow, yep, that's the one. Thanks for the source!


Yeah, that was my biggest fear with these services. How do I stop my family members from falling for it? In the end, I can't and just have to live with their mistake (if they used these services).


Data breaches happen; that doesn't mean using a service is a mistake.


Bringing criminals to justice is a positive outcome.

Negative outcomes include:

1. Racist people persecuting people based on their ancestry, as determined from DNA data.

2. Police performing incorrect DNA database searches and falsely accusing people of crimes. Example: https://www.pbs.org/newshour/show/a-father-took-an-at-home-d...

3. Police misconstruing DNA evidence and falsely accusing people of crimes. For example, a person's DNA can appear at a crime scene if they rode in a Lyft before a perpetrator.

4. Criminals extorting parents of sperm-donor children: Pay us or we'll reveal to your kids that he's not their dad.

5. Criminals extorting unfaithful parents: Pay us or we'll tell him that the kid isn't his. Pay us or we'll tell her about the child born from your affair. Pay us or we'll tell your religious group about your child born not to your spouse.

6. Criminals extorting people about their expected health outcomes: Pay us or we'll tell the shareholders about your 50% chance of getting disease X in the next 5 years. Pay us or we'll tell her that you're likely infertile. Pay us or we'll tell your kid that they will probably die by age 30.

7. Criminals extorting folks who have changed their identities: asylees, stalking victims, protected witnesses, etc.

8. Oppressive governments persecuting relatives of escaped asylees: Your brother who disappeared actually went to country X. We can't punish him so we're punishing you.


That's an argument from utility, which is not how you should approach matters of ethics.


Your comment is instructive. Would you care to expound?


Here's a thought:

"This is a GDPR erasure request. Your site contains my PII by way of that of my father. Please erase this information and indicate that you have complied within 30 days."

Shall I try it?


Yes, please!


I wouldn't consider DNA to be secret information, given that you leave it everywhere you go.


There is a world of difference between

1. Having your DNA already in the database

2. Your DNA being out somewhere on the street where it could only be linked to you by name through a targeted reconnaissance effort


I think this is somewhat analogous to the privacy issues around Google Street View. Almost nobody thought the image of the front of their house was really private, but the idea of it being catalogued and searchable bothered more than a few. Removing the barrier of someone having to physically do the work to get that information at least made them feel more vulnerable.

Has Street View been a problem for the world in that way? I haven't personally experienced that. That's probably why the DNA database idea doesn't scare me. If you want to live in the world it's essentially impossible to keep your DNA a secret. It seems to me that eventually someone will pick it all up and organize it.


Your street view doesn’t contain your entire genetic record (including propensities towards disease, mental and physical, which could very easily be used to discriminate against you). So they’re not really comparable whatsoever.

And what is with this “this terrible thing X will happen eventually, so why not have it happen now?” argument I keep seeing nowadays? Your argument was quite literally: “Eventually someone will collect all your DNA”, so who cares if it’s now or later?


> Your street view doesn’t contain your entire genetic record (including propensities towards disease, mental and physical, which could very easily be used to discriminate against you).

Isn't this a form of victim blaming? How is this different than saying Black people should try to hide their skin color since in many cases they will be discriminated against because of it? We should be working to suppress the discrimination at its source, not its target.


You're right, working to reduce discrimination at source is undoubtedly worthwhile. But data does not exist in a vacuum - it is collected on behalf of, and used by, people.

Until we reach zero intolerance nirvana, you can't ignore that personal data collection at scale simplifies discrimination, and also opens up new methods for discriminating. Will there be benefits to society from personal data collection at scale? Of course. But there are also costs. There are plenty of examples of people whose ideas or products became used in unforeseen ways and regretted their actions.

Discrimination should be suppressed at source and systems that simplify its manifestation in the real world should be handled extra carefully.


I'm a little confused about what exactly the point of debate is here.

* Is your DNA a secret? I think the fact that you leave it everywhere means no.

* Should people be allowed to aggregate that information? It literally cannot be stopped so I think the point is moot.

I guess what I'm missing is any addressing of the reality of the situation. I'm guessing from the content of your reply that you think that the practice of cataloging DNA should be banned. Great. What happens when they do it anyway?


> Should people be allowed to aggregate that information? It literally cannot be stopped so I think the point is moot.

Just because you can't stop something doesn't mean you shouldn't even try. Otherwise we might skip having laws altogether.


I'm just looking for a helpful, actionable response. All I've seen so far is "X is bad" (not actionable) and "Let's ban X" (not helpful).

What good will it do you that there's an international ban on DNA databases when corporations use the impossible-to-stop one anyway to discriminate against and target you, or the police use it anyway to throw you in prison?

The most helpful course of action imo is to learn how best to cope with this new reality. How should we set our expectations when our DNA is public and searchable? Are there behaviors that would once be safe but will not be in the future? I think those are the more relevant questions.


To your first point, you can go out to the street and bring home someone’s random DNA, but there is no way you’d ever be able to know whose DNA it was.

... unless you were to look it up, maybe, in this leaked DNA database.

DNA is not inherently an identifier. It needs the lookup code in order to act as one. A database like this MAKES it no longer a secret.


I'm not talking about taking random samples off a sidewalk. I'm saying if you follow a person you know and collect something they've discarded, now they're in the database. Do that enough times and everyone's in it. That's the exact technique the police use to collect people's DNA without their consent.


> Is your DNA a secret? I think the fact that you leave it everywhere means no.

There is a complicated procedure to convert these skin flakes into data. Not everybody is able to do it, so even if it is not a secret, it is not exactly open data either.


Yes your DNA is a secret, just like your fingerprint is a secret.

Companies shouldn't be allowed to aggregate and resell that information. Hope the GDPR will give grounds to close shops doing that.

edit: typo DNS instead of DNA


> Yes your DNS is a secret, just like your fingerprint is a secret.

But is it really? I think the point being made here is that actually it is relatively easy to obtain someone's DNA. Is there a law that prevents someone who knows your name from picking up a discarded coffee cup and extracting your DNA? I think it's an interesting debate. Is your face private? Is the sound of your voice private? Those things are unique to you, but anybody that interacts with you will be exposed to those features, including possibly your DNA. I guess the concern is how the data is collected, what it is used for and, in the case of DNA, the impact it has on anybody that has a genetic link to us. I think it's fair to consider DNA in a separate category. There's only so much that can be deduced from your face as compared to DNA. It's tricky...


It's made me run away from at least one business when I saw that their office address was basically an obviously unoccupied 2up 2down hovel.


By targeted reconnaissance effort, do you mean trivial geographic correlation based on your phone's location data? If the Google Street View car had a DNA sequencer on the back and GPS, and recorded any fragments and their location, it could trivially reconstruct quite a bit. No one has done this yet, but it's utterly doable. DNA is not private information; it's the most public information you can imagine, and it is not controllable in any way that's meaningful to traditional thoughts on data privacy.


If an action requires less investment and provides the same value, it will happen more frequently — economics. A database lookup requires less investment than a targeted DNA harvesting, sequencing, and location correlation operation.


So because it is supposed to be trivial to identify people based on GPS, phone and DNA (which I don't believe), it doesn't matter if one gets his data into a DB, which gets leaked to the internet and then can be found/used by anyone? I don't think I follow your reasoning. I'll also state that DNA is hardly the most public information there is; surely your face/skin color/size/other physical characteristics are more public?


This is the same gap between being seen face to face in a public square and having a high resolution 3D scan of your body.

We've been OK with the former since the dawn of time; we're not happy with the latter being digitally shared around the world.


Not secret, but I would definitely consider it PII (personally identifiable information), which makes it subject to regulations such as GDPR.



