The Dark Side of DNA Data:

Exploring the National Security Implications of Aggregated Domestic Genomic Information

featured in the Master’s graduation showcase at Arizona State University

Center on the Future of War

School of Politics and Global Studies

Final research paper

I. Introduction

DNA testing has continued to garner attention over the last decade, and with it has come a stream of promises for research, medicine, and consumer services. Personalized medicine, a silver bullet for cold cases, early disease detection, and learning about family heritage are all selling points. This paper, however, focuses specifically on the national security risks presented by the aggregation of data generated and collected from domestic DNA testing kits. From a national security perspective, this aggregation can have profound implications for government personnel, jeopardize clandestine operations, and put specific populations at risk of biological weapons targeting.

In this paper we briefly discuss the process of DNA testing, including the various elements of domestic DNA testing and what there is to gain from it. We then analyze the issues it presents from the perspective of national security and evaluate the existing regulations and policies pertaining to privacy around storing DNA information. We end with security and policy recommendations based on the gaps identified and issues explored.

II. Background on DNA testing

DNA testing kits have continued to gain popularity for a variety of reasons. For some, it is the opportunity to connect with their heritage; for others, it could be a search for truths about their identity (searching for a parent, for example) or an investment in their future quality of life and wellness. The reasons for purchasing DNA kits are typically personal, and unfortunately the national security implications of the process receive little thought.

Before we get too far into the risks, we must first understand the process. Companies that offer these kits include Ancestry, 23andMe, and MyHeritage. The kit arrives with instructions; the recipient collects a saliva sample using the provided cotton swab, seals the swab in the provided packaging, and sends it back to the company, which forwards it to its lab. In about two months, the customer receives information about themselves, typically in the form of a dashboard where they can view ethnicity percentages and find details about their bloodline; the exact presentation depends on the company. The series of algorithms used is interesting but not relevant to this paper. In Ancestry's case, once the sample is processed and run through proprietary software, the company generates the consumer's results. Whether it retains this data depends on whether the customer agreed to let their DNA be used for “informed consent research.”

From Ancestry’s privacy statement:

“Neither your saliva nor the extracted DNA (together referred to as “Biological Samples”) are Personal Information under this Privacy Statement…Future testing may be done if you agree to our Informed Consent for Research or if you consent to other tests of your Biological Samples. If you do not consent to the storage of your Biological Sample, we will destroy your sample.”

The physical sample itself will be destroyed, but the extracted information does not appear to fall under the same procedure. Ancestry's DNA “network” contains the DNA information of 22 million people, and the company advertises it as the “world’s largest consumer DNA network” (Ancestry). This supports the notion that genomic information is stored long term and is not destroyed unless the customer requests it, a step that most individuals will likely overlook. “Vanderbilt University researchers found that 71 percent of companies used consumer information internally for purposes other than providing the results to consumers” (Roberts 2020).

At face value, it may not be clear what risks are associated with illegal access to genomic data or how that data can be used. The risks become clear once the applications of such data are understood. DNA data is not like a Social Security number or other information associated with a person; it is biometric, and a person is identifiable by it. “DNA presents privacy issues different from those involved in other biometrics collection ...[since] it can contain information about a person’s entire genetic make-up, including gender, familial relationships, ... race, health, disease history and predisposition to disease” (Lynch 2012). From a national security perspective, this can have profound implications for government personnel, jeopardize clandestine operations, and put specific populations at risk of biological weapons targeting. In 2015 the Office of Personnel Management (OPM) announced two incidents that compromised confidential data about federal government employees and other affiliated personnel. Devastatingly, it was determined that China stole “sensitive information, including the Social Security Numbers (SSNs) of 21.5 million individuals” from OPM databases used to store background investigation information (OPM). Combining personal information with genomic data produces a complete picture of an individual. John Demers, then head of the DOJ’s National Security Division, put the risk plainly, saying that this data “can be used from a counterintelligence perspective to either coerce you or convince you to help the Chinese,” and adding, “the worst case would be the development of some kind of biological weapon … if you had all of the data of a population, you might be able to see what the population is most vulnerable to.”

To make matters worse, combining genomic information with a complete background check can also identify an individual's closest living relatives and family circle, adding more pressure points through which that individual can be manipulated. Once an individual's DNA data is collected, today’s technologies make this identification straightforward. As William Evanina, former director of the National Counterintelligence and Security Center, stated, “sometimes Americans or people around the globe don't even know the value of their DNA, that it even has value — but it's your single, sole identifier of everything about you as a human being” (Dunleavy).

III. National Security concerns in DNA testing

As discussed previously, illegal access to highly sensitive data has proven to be less of an anomaly than one would hope. Healthcare data, including genomic information, is a particular target for China (DNI). According to the National Counterintelligence and Security Center (NCSC), the People's Republic of China (PRC) has been collecting large amounts of healthcare data from the U.S. and from other countries. Legitimate uses of this information can benefit society across the globe, especially when they aid in healthcare developments and treatments. However, a chief concern is the use of this information for nefarious purposes that aid human rights abuses, such as acts against specific minority groups and mass surveillance. China has pursued such abuses and continues to do so: "China has been brutally repressing Uighurs and other Muslims, forcing them into re-education camps reminiscent of Mao Zedong's Cultural Revolution" (Gehrke). Additionally, persecution of groups opposing China's government has been increasing at an alarming rate: "The PRC has a documented history of exploiting DNA for genetic surveillance and societal control of minority populations in Xinjiang, China" (NCSC 2-3).

The surveillance and control of the Uighurs, the Muslim minority in China's Xinjiang province, is a critical use case to consider, but what national security risks does this imply for the United States? Unfortunately, through various breaches and heavy exfiltration of data, the People’s Republic of China has obtained personally identifiable information (PII) for a majority of the United States population (NCSC 3). Combining the information collected, including genomic data, PII, and any other details the PRC has gathered, allows the Chinese government to target individuals precisely. This could include blackmail of high-level officials and the manipulation and extortion of an individual in a difficult situation. It also aids recruitment efforts and provides opportunities to locate foreign dissidents, among other unsavory uses.

Additionally, the NCSC reports long-term economic implications that affect the U.S. much more broadly. The mass acquisition of U.S. healthcare data has propelled China's development of healthcare-related artificial intelligence and advancements in precision medicine. Even though China is quick to collect any information available, it does not reciprocate and keeps its own version of this information highly protected. This one-sided relationship could allow China to develop new drugs and treatments sooner than the U.S., causing a shift in power in the industry and displacing the U.S. as the leader in biotechnology. These discoveries could benefit U.S. citizens if China chooses to make them available abroad, but that would also create a deep dependence on China (USCC).

The NCSC tries to express just how valuable DNA information is, particularly from a national security perspective, stating, "Your DNA is the most valuable thing you own…It is your unique genetic code and can enable tailored healthcare delivery to you. Losing your DNA is not like losing a credit card…you cannot replace your DNA. The loss of your DNA not only affects you, but your relatives and, potentially, generations to come."

An additional growing concern is the application of genomic information to "black biology," the use of biotechnology to create biological weapons. Ainscough paints a grim picture of the more sinister applications of gene therapy: although it shows promise in curing cystic fibrosis and diabetes, it can also be applied to "inserting pathogenic genes" to cause incurable illnesses (270). There is also experimentation with retroviruses, a type of virus that permanently integrates into human chromosomes; genetically manipulating such a virus with the intention of creating a bioweapon could be catastrophic (Ainscough 271). Put plainly, this amounts to global eugenic capabilities.

In addition to targeting humans or certain populations, there is also the threat of crops and livestock being targeted through similar weaponry, which could induce great hardship, economic losses, and potential famine in the U.S.

IV. Privacy concerns and Standards in DNA testing

Currently, there is a well-defined set of standards issued by the FBI for handling and storing DNA information that will be included in the Combined DNA Index System (CODIS). The program defines a standard for the support of criminal justice DNA databases and extends to the software used to run them (CODIS and NDIS 2022). This standard, however, is specific to law enforcement and does not appear to cover DNA information generated, stored, and maintained by private companies (NIST).

According to the National Human Genome Research Institute, direct-to-consumer (DTC) genetic testing is subject to limited regulation: “While many companies have robust privacy and informed consent policies, no federal laws prohibit companies from providing individuals’ genetic information to third parties.” The Federal Trade Commission can provide some level of protection to consumers by taking enforcement action if a company makes false or misleading statements about privacy and security, or if the company fails to protect an individual’s information (NHGRI).

With regard to de-identifying DNA data, meaning stripping the data set of personal identifiers, there has been considerable skepticism that this can be done successfully. Commenting on the Department of Defense memo urging military members not to take at-home DNA tests, Couzin-Frankel noted that “there have been some discoveries that it's often very difficult to truly de-identify data. And of course, Department of Defense may have other ideas that they're choosing not to share. But they obviously feel that there is some potential risk here, and they are trying to steer clear of it” (Shapiro).

Additionally, not all privacy policies are the same. For example, 23andMe requires customers to opt in and provide consent before it shares their data. However, this protection can fall away if the customer downloads their DNA information and then uploads it to another website. An example of this, provided by Segert, is GEDmatch. GEDmatch’s privacy policy is much looser: the site displays users' real names and is publicly searchable. The site gained notoriety when police used it to solve the Golden State Killer case (Segert).

DNA information falls into the same category as any other personal information and is not recognized as health data. The gap in privacy protection lies between those two categories. There are currently no restrictions preventing persons in sensitive government roles, or in roles important to national security, from taking direct-to-consumer DNA tests, and there are no additional healthcare-style protections specific to consumer DNA information.

V. Regulations

The Genetic Information Nondiscrimination Act (GINA) was adopted to prevent employers from discriminating against employees based on genetic information. GINA does not, however, apply to third-party direct-to-consumer testing companies like Ancestry and 23andMe or to the handling of the information after it is collected (Roberts 2020).

As discussed previously, de-identification as a solution to privacy concerns has not been successfully adopted and has received much skepticism, and for good reason: “it is not clear if this is entirely effective because genetic data is intrinsically identifying. This is because each person’s genome is unique and may be traced back to them, similar to a thumb print” (Segert). This supports the view that genomic data is biometric and should be regulated as such. Unfortunately, that does not seem to be the current reality. According to Ancestry’s privacy policy:

“Ancestry is not a covered entity under the Health Insurance Portability and Accountability Act (“HIPAA”), and as a result no data provided by you is subject to or protected by HIPAA.”

In one case study, researchers were able to infer the last names of participants using a small portion of their genetic data along with census-type information such as date of birth and home state (Segert). This shows that it is possible to re-identify an individual after the information has been de-identified.
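The linkage step in such a study can be sketched in a few lines of Python. This is a minimal illustration only, not the researchers' actual method: the record type, the `reidentify` function, the use of birth year rather than full date of birth, and every name in the sample data are hypothetical, and the surname is assumed to have already been inferred from the genetic data by some earlier step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublicRecord:
    """One row of a hypothetical public, census-style dataset."""
    name: str
    surname: str
    birth_year: int
    state: str


def reidentify(inferred_surname, birth_year, state, public_records):
    """Return the public records that match all three quasi-identifiers."""
    return [
        record
        for record in public_records
        if record.surname == inferred_surname
        and record.birth_year == birth_year
        and record.state == state
    ]


# Entirely fictional sample data for illustration.
PUBLIC_RECORDS = [
    PublicRecord("Alex Rivera", "Rivera", 1980, "AZ"),
    PublicRecord("Jordan Rivera", "Rivera", 1975, "AZ"),
    PublicRecord("Casey Rivera", "Rivera", 1980, "NM"),
    PublicRecord("Morgan Lee", "Lee", 1980, "AZ"),
]

# A "de-identified" genetic record: no name attached, but a surname inferred
# from the DNA (assumed here) plus birth year and state shipped as metadata.
surname_only = [r for r in PUBLIC_RECORDS if r.surname == "Rivera"]
matches = reidentify("Rivera", 1980, "AZ", PUBLIC_RECORDS)

print(len(surname_only))          # candidates from the surname alone
print([m.name for m in matches])  # candidates after adding year and state
```

In this toy data the surname alone leaves three candidates, while adding birth year and state leaves one. Each attribute is weak on its own; it is the combination that singles out a person, which is why removing names from a genetic data set does not by itself make the data anonymous.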

Additionally, there seem to be lax regulations around a company's ability to sell its customers' genetic information. As Segert explained, direct-to-consumer companies are able to offer their services at an affordable price point because “they can sell their customer’s genetic data to pharmaceutical companies for a profit. 23andMe, for example, has a contract to license customer data to the biotech giant Genentech for their research efforts into Parkinson’s disease.” In February 2021, 23andMe announced that it would go public through a merger with VG Acquisition Corp., a special purpose acquisition company backed by Richard Branson's Virgin Group (Paul). As the saying goes, follow the money. In this case, one has to ask what value a DNA testing company geared toward learning about one's ancestors holds for investors like Richard Branson, who are willing to back a deal valuing it at 3.5 billion U.S. dollars. The answer lies in the asset that a consumer DNA database represents and in the absence of regulations preventing companies from using and profiting from it.

There also seems to be a deficiency in the law regarding foreign entities' access to U.S. genetic information. There is currently nothing preventing a foreign company from purchasing a U.S. company that holds DNA data as a primary asset. This has already occurred in at least two documented instances. On December 4, 2020, Blackstone announced that it had acquired Ancestry for 4.7 billion dollars. Blackstone is a private equity firm with a stake in pharmaceuticals and healthcare-related businesses. Even though Blackstone is an American investment management company, the nature of its partnership structure and the companies it has acquired in the past make it a global entity (Karr). According to Bradsher, the Chinese government purchased a $3 billion nonvoting stake in the Blackstone Group, which muddies the questions of information transfer and ownership of genetic data discussed previously.

These are important factors to note when addressing regulatory concerns and determining how the data should be treated. Currently, since this information does not fall under HIPAA, it is covered by regulations that apply to general personal information. More specifically, as stated by the National Human Genome Research Institute, genetic tests are federally regulated on three criteria (analytical validity, clinical validity, and clinical utility) by the Food and Drug Administration (FDA), the Centers for Medicare and Medicaid Services (CMS), and the Federal Trade Commission (FTC). These regulations, however, do not appear to dictate privacy and data handling measures, nor do they address access and ownership by foreign entities.

VI. Solutions and Recommendations 

The solutions to the issues explored in this paper fall primarily within addressing regulation and oversight deficiencies. The problem is not a lack of transparency around genomic data but rather a failure to identify the data as biometric and to handle it with appropriate security and privacy standards, regulations, and procedures. Current measures have failed to secure DNA testing databases and to minimize exploitation of the information to the detriment of national security.

Ultimately, as with the responsibility for putting proper cybersecurity measures in place and adhering to globally recognized standards, the responsibility for protecting and properly handling DNA data falls on the entity that stores and maintains it. This should include preventing genomic information from being accessed by or transferred to foreign entities. Exceptions could perhaps be developed, for example for allied and partner countries with the consent of the individual whose data it is, and maybe even with a stipend for their contribution to science.

The ability of foreign entities like China to legally purchase U.S. DNA data needs to be banned immediately, especially given the total lack of reciprocity in sharing healthcare-related information.

Additionally, I recommend HIPAA be expanded to recognize and cover DNA data. This would help fill in some of the identified gaps and would alter the way this data can be handled.

I recommend businesses and institutions follow the “Framework for Responsible Sharing of Genomic and Health-Related Data” published by the Global Alliance for Genomics and Health. This toolkit lays out a clearly defined framework for the responsible sharing of human genomic information and other health-related data. Additionally, this guide adheres closely to Article 27 of the 1948 Universal Declaration of Human Rights. For reference, “Article 27 guarantees the rights of every individual in the world ‘to share in scientific advancement and its benefits’ ... and ... ‘to the protection of the moral and material interests resulting from any scientific … production of which [a person] is the author’” (Framework for Responsible Sharing of Genomic and Health-Related Data).

Finally, regulations and standards established around genomic data should be enforceable, with repercussions should an entity fail to meet them. Large fines and restrictions on business can be effective ways to curb abuse industry-wide. Public disclosure is also an effective way to inform customers about a business that refuses to comply with standards; perhaps a license should be required for any entity handling DNA information. Failure to meet the requirements would result in revocation of the license, which would be public information a consumer could consider before entrusting that company with their data.

VII. Areas for Further Study

This paper, though specific to national security risks, offers only a brief introduction to a few key issues of concern. Additional areas for further study include the societal risks and implications of commercial and government-accessible consumer DNA databases. What types of exploitation would be possible if companies were able to leverage DNA information in consumer targeting, for example by targeting a pre-diabetic population with specific products? This question is especially pressing because DNA data's status as a biometric collection point seems to be widely misunderstood.

A further investigation into the utility of biometric data in counterintelligence and the feasibility of bioweapon development and targeting capabilities is also merited. We need to understand the reality of global eugenics as well as the limitations of these types of threats. 

VIII. Conclusion

Key takeaways from this effort are understanding the relationship genomic data has with businesses and individuals, and then further understanding what inherent risks emerge. As we have identified, genomic data is biometric, and there are uses for this information that present threats to U.S. national security. It is not currently covered under HIPAA, and nothing regulates or prevents it from crossing borders or being purchased by foreign entities. Countries like China are able to legally acquire genomic data on U.S. citizens by purchasing companies that possess DNA databases as an asset. Since de-identification has not proven successful, we must also conclude that de-identified DNA data is still sensitive and presents the same risks to national security as the transfer and acquisition of identified DNA data.

Moving forward, changes need to be made to the way DNA data is recognized. It should first and foremost be treated as biometric information that cannot be stripped of personally identifiable information.

Monitoring and restrictions should be implemented to prevent the legal and illegal acquisition of U.S. DNA data by China and other adversarial nations that have made clear their intentions are not in line with the best interests of the U.S.

As the value of DNA data grows and more companies acquire a vested interest in it, regulations and safeguards will become harder to implement. It will already be challenging to implement and enforce new regulations and frameworks around this information, as companies around the world have been making multi-billion dollar investments in which large DNA databases are the primary asset. Genomic information has been pitched as providing the data necessary to unlock medical breakthroughs that would change the future of medicine. Though this is great from a medical research perspective, prioritizing U.S. national security will help ensure this can be the case for generations to come.


2022 Report to Congress of the U.S.-China Economic and Security Review Commission. USCC, Nov. 2022, https://www.uscc.gov/sites/default/files/2022-11/2022_Annual_Report_to_Congress.pdf

AncestryDNA Informed Consent. (n.d.). https://www.ancestry.com/dna/lp/informedconsent-v4-en

Ainscough, Michael J. Next Generation Bioweapons: Genetic Engineering and BW. US Air Force Counterproliferation Center Future Warfare Series No. 14. https://media.defense.gov/2019/Apr/11/2002115480/-1/-1/0/14NEXTGENBIOWEAPONS.PDF

Behind the Scenes: How Does AncestryDNA Work? (n.d.). https://www.ancestry.com/cs/dna-redirect/ancestry-dna-lab

Blackstone Inc. (2020, December 4). Blackstone Completes Acquisition of Ancestry®, Leading Online Family History Business, for $4.7 Billion. Blackstone. Retrieved February 25, 2023, from https://www.blackstone.com/news/press/blackstone-completes-acquisition-of-ancestry-leading-online-family-history-business-for-4-7-billion/

China's Collection of Genomic and Other Healthcare Data from America: Risks to Privacy and U.S. Economic and National Security. The National Counterintelligence and Security Center, Feb. 2021, www.dni.gov/files/NCSC/documents/SafeguardingOurFuture/NCSC_China_Genomics_Fact_Sheet_2021revision20210203.pdf

Bradsher, K. China Says It Made Blackstone Investment to Raise Returns. The New York Times. (n.d.). https://archive.nytimes.com/www.nytimes.com/ref/business/22blackstone.html

CODIS and NDIS Fact Sheet. (2022, August 3). Federal Bureau of Investigation. https://www.fbi.gov/how-we-can-help-you/dna-fingerprint-act-of-2005-expungement-policy/codis-and-ndis-fact-sheet

Company Facts | Ancestry Corporate. (n.d.). https://www.ancestry.com/corporate/about-ancestry/company-facts

Cybersecurity Incidents. (n.d.). U.S. Office of Personnel Management. https://www.opm.gov/cybersecurity/cybersecurity-incidents/

Dunleavy, Jerry. “Trump counterintelligence chief warns China is trying to acquire American DNA and health data” Washington Examiner, 01 Feb. 2021, https://www.washingtonexaminer.com/news/china-american-dna-health-counterintelligence-chief

Framework for Responsible Sharing of Genomic and Health-Related Data. (n.d.). https://www.ga4gh.org/genomic-data-toolkit/regulatory-ethics-toolkit/framework-for-responsible-sharing-of-genomic-and-health-related-data/

Gehrke, Joel. “State Department compares China to Nazi Germany in human rights briefing” Washington Examiner, 13 Mar. 2019, https://www.washingtonexaminer.com/news/china-american-dna-health-counterintelligence-chief

Karr, R. (2015, February 26). Understanding the Blackstone partnership structure. Market Realist. https://marketrealist.com/2015/02/understanding-blackstone-partnership-structure/

Lynch, J. 2012. From fingerprints to DNA: Biometric data collection in U.S. Immigrant communities and beyond. Immigration Policy Center: American Immigration Council. Available at papers.ssrn.com/sol3/papers.cfm?abstract_id=2134481 (accessed Feb 25, 2023).

Newman, L. H. (2019, December 23). The Worst Hacks of the Decade. WIRED. https://www.wired.com/story/worst-hacks-of-the-decade/

NHGRI. (2019, March 13). Regulation of Genetic Tests. Genome.gov. https://www.genome.gov/about-genomics/policy-issues/Regulation-of-Genetic-Tests

NIST. https://strbase.nist.gov/QAS/Final-FBI-Director-Databasing-Standards.pdf

Paul, K. (2021, February 11). Fears over DNA privacy as 23andMe plans to go public in deal with Richard Branson. The Guardian. https://www.theguardian.com/technology/2021/feb/09/23andme-dna-privacy-richard-branson-genetics

Privacy Statement - Ancestry.com. (n.d.). https://www.ancestry.com/c/legal/privacystatement

Privacy in Genomics. (n.d.). Genome.gov. https://www.genome.gov/about-genomics/policy-issues/Privacy

Roberts, B. C. (2020, July 23). Your Genetic Data Isn’t Safe. Consumer Reports. https://www.consumerreports.org/health-privacy/your-genetic-data-isnt-safe-direct-to-consumer-genetic-testing-a1009742549/

Segert, Julian. (2018, November 28). Understanding Ownership and Privacy of Genetic Data. Science in the News. https://sitn.hms.harvard.edu/flash/2018/understanding-ownership-privacy-genetic-data/

Shapiro, Ari. Pentagon Advises Members Of Armed Forces Not To Use Home DNA Testing Kits. NPR (2019, December 24). https://www.npr.org/2019/12/24/791205583/pentagon-advises-members-of-armed-forces-not-to-use-home-dna-testing-kits