Codify — Article

America’s Living Library Act creates a federal genomic biobank for National Park species

Establishes a USGS‑run pilot to sequence, store, and publish genomic data from organisms in National Park units while restricting physical transfers and limiting pre‑release AI access to vetted U.S. entities.

The Brief

The bill establishes the America’s Living Library Project, a pilot under the Department of the Interior (administered through USGS) to collect, sequence, and catalogue genomic material from animals, plants, fungi, and microbes found in selected National Park System units. The statute requires an interoperable, publicly accessible genomic database (with protections for personally identifiable information and sensitive locality data), domestic long‑term storage of physical samples, and interagency coordination with USDA, Smithsonian, NIH/NCBI, and the National Park Service.

This project is pitched as a conservation and research infrastructure program: it builds genomic baselines for species preservation and scientific discovery, creates curated physical and digital repositories, and authorizes funding for sequencing, curation, and data infrastructure. It also introduces explicit access controls and biosecurity safeguards — including limits on who may receive expedited, pre‑publication bulk access (restricted to U.S. entities not owned or controlled by foreign entities of concern) — creating a hybrid model of open science and controlled access that will matter to researchers, repositories, and AI developers alike.

At a Glance

What It Does

Creates a USGS program office to run a pilot that collects biological samples in National Park units, performs whole‑genome sequencing, and publishes genomic assemblies and metadata to a public database subject to privacy and sensitive‑location redactions. Physical samples are to be curated domestically in Smithsonian and USDA repositories under federal ownership.

Who It Affects

Impacts the Department of the Interior and USGS, the National Park Service, Smithsonian Institution and USDA germplasm/natural history repositories, NCBI for data hosting, academic and commercial researchers, domestic AI developers seeking bulk genomic data, and Indian Tribes, which must be consulted on program activities on or affecting tribal interests.

Why It Matters

The bill sets a federal precedent for large‑scale, centralized genomic collection and curation tied to public lands, couples data openness with security controls for downstream AI use, and creates long‑term obligations (staffing, storage, and standards) that shift costs and governance responsibilities onto federal agencies and designated repositories.

What This Bill Actually Does

The statute directs the Secretary of the Interior (acting through the USGS Director) to stand up a pilot program — the America’s Living Library Project — that sequences genomic material from organisms found on selected National Park lands. The program is designed to operate in close coordination with the National Park Service, USDA research and germplasm programs, the Smithsonian’s National Museum of Natural History, and NCBI.

Sampling must comply with existing statutory permitting and consultation regimes (for example, the Endangered Species Act, Migratory Bird Treaty Act, Lacey Act, and Marine Mammal Protection Act) and departmental policies, and the program is to leverage current monitoring and research efforts at park units when possible.

Data created by the program (long‑read sequence data, genome assemblies, annotations, and metadata) will be deposited into a publicly available genomic database. The bill requires the Secretary to withhold personally identifiable information and to prevent release of sensitive collection locality details, such as precise coordinates, when necessary to protect personnel, specimens, or resources.

It also requires adoption of interoperable data standards and cybersecurity safeguards, referencing applicable federal frameworks so that sequence data and computational services meet findability, accessibility, interoperability, and reusability principles and NIST security guidance.

Physical samples are treated as federal property that will be curated for the long term: the text requires samples identified for long‑term storage to be accessioned into the Smithsonian natural history collections and USDA germplasm repositories. The statute bars physical transfer, export, or loan of samples outside the United States and mandates that processing and curation occur in U.S. facilities.

The Secretary must evaluate species for long‑term storage priorities, segregate and impose enhanced biosafety controls on microbes or pathogens that pose heightened risk, and ensure appropriate containment and access limits for such material.

Program governance includes an office within USGS to run the pilot, authority to enter contracts and accept in‑kind contributions from biotech firms (subject to conflicts‑of‑interest vetting and limits on the types of in‑kind support), and a selection process for park units: five units must be chosen initially and 20 more later according to objective biological, operational, and public‑engagement criteria. The law requires tribal consultation, an implementation plan (including a pathway for expedited, non‑public bulk access for qualified U.S. entities), reporting at three years and at program end, a prescribed funding schedule for sequencing, storage, and data hosting, and a sunset of program authority after ten years.

The Five Things You Need to Know

1. The Secretary must select 5 National Park units within 180 days of enactment to start the pilot and 20 additional units no later than 2 years after enactment.

2. Physical samples collected under the program may not be transferred, exported, or loaned outside the United States, and all physical processing and curation must occur in U.S. facilities.

3. The implementation plan must include expedited, pre‑public bulk access pathways (such as APIs) for AI model development, limited to entities organized under U.S. law that are not owned or controlled by any foreign entity of concern.

4. The office may accept in‑kind contributions from biotechnology companies but must limit those to operational resources (materials, instrumentation, sequencing capacity) and implement vetting to prevent conflicts of interest.

5. The program authority automatically terminates 10 years after enactment; the bill authorizes specific appropriations for sequencing, museum and germplasm storage, and NCBI data hosting for fiscal years 2027–2031.
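
As a rough illustration of the eligibility condition in item 3, the screening logic could be sketched as below. This is a minimal sketch, not anything the bill prescribes: the `Applicant` record, its field names, and the function name are all hypothetical, and the real determination of "control" would turn on the bill's own definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Applicant:
    """Hypothetical record for an entity requesting expedited bulk access."""
    name: str
    organized_under_us_law: bool
    owned_by_foreign_entity_of_concern: bool
    controlled_by_foreign_entity_of_concern: bool


def eligible_for_expedited_access(applicant: Applicant) -> bool:
    """Return True only if both conditions described in the article hold."""
    if not applicant.organized_under_us_law:
        return False
    # Either ownership or control by a foreign entity of concern disqualifies.
    return not (
        applicant.owned_by_foreign_entity_of_concern
        or applicant.controlled_by_foreign_entity_of_concern
    )


# Illustrative applicants (invented names, not real organizations).
domestic_lab = Applicant("Example Genomics Lab", True, False, False)
controlled_firm = Applicant("Example Holdings", True, False, True)

print(eligible_for_expedited_access(domestic_lab))
print(eligible_for_expedited_access(controlled_firm))
```

The check is deliberately a strict conjunction: being organized under U.S. law is necessary but not sufficient, which mirrors how the article describes the restriction.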

Section-by-Section Breakdown

Section 2(a) — Definitions

Key terms that shape scope and controls

The bill defines critical concepts that determine who is in scope and what controls apply: 'control' (for screening ownership or influence), 'foreign entity of concern' (referenced to the Infrastructure Investment and Jobs Act), 'high‑priority species' (to be designated jointly by multiple agency heads), and 'office' and 'program' labels. Those definitions matter operationally because they trigger access restrictions (e.g., who can get expedited data) and influence procurement and partnership screening.

Sections 2(b)–2(c) — Establishment and Purpose

Pilot program under USGS to build a genomic library

The Secretary must establish the America’s Living Library Project as a pilot focused on collecting new genomic data from species found on selected National Park units. The statutory purpose is explicitly conservation and research: to facilitate whole‑genome sequencing and cataloging across taxa (animals, plants, fungi, microbes) found in parks. Practical implication: USGS has responsibility for program design and must ensure legal compliance with species protection statutes before sampling.

Section 2(d) — Interagency Coordination and Database

A public genomic database with curated metadata and safeguards

The provision requires coordination among DOI, NPS, USDA, Smithsonian, and NIH/NCBI to create a publicly accessible genomic database containing long‑read data, assemblies, annotations, and metadata. The Secretary retains discretion to withhold personally identifiable information and sensitive locality data; the Secretary must also integrate taxonomic systems (ITIS, PLANTS, WoRMS) where appropriate. For implementers, this creates dual tasks: adopt interoperable taxonomic and metadata standards, and put in place redaction and risk‑based release rules for sensitive content.
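
One way an implementer might approach the redaction task is sketched below. This is a hedged illustration only: the bill does not define a metadata schema or a coarsening rule, so the field names (`collector_name`, `collector_email`, `latitude`, `longitude`), the `sensitive` flag, and the choice to round coordinates to one decimal place are all assumptions.

```python
from typing import Any

# Hypothetical PII field names; the bill does not define a metadata schema.
PII_FIELDS = {"collector_name", "collector_email"}


def redact_record(
    record: dict[str, Any], sensitive: bool, decimals: int = 1
) -> dict[str, Any]:
    """Return a public copy of a sample record.

    PII fields are always dropped. Coordinates are rounded only when the
    record has been flagged as sensitive; the input record is not modified.
    """
    public = {k: v for k, v in record.items() if k not in PII_FIELDS}
    if sensitive:
        public["latitude"] = round(record["latitude"], decimals)
        public["longitude"] = round(record["longitude"], decimals)
        public["locality_generalized"] = True
    return public


# Illustrative record with invented values.
sample = {
    "taxon": "Example species",
    "collector_name": "A. Collector",
    "latitude": 44.4605,
    "longitude": -110.8281,
}
print(redact_record(sample, sensitive=True))
```

How coarse the rounding should be is exactly the kind of risk-based judgment the provision leaves to the Secretary; the `decimals` parameter only marks where that policy choice would plug in.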

Section 2(e) — Program Office, Authorities, and In‑Kind Rules

USGS office runs the program and can contract, accept vetted in‑kind help

The bill directs creation or designation of an office within USGS, authorized to hire staff, contract with other federal agencies and private parties, and accept in‑kind contributions from biotech firms limited to operational inputs (instruments, sequencing capacity). The office must implement conflict‑of‑interest vetting for contributions. This gives program managers flexibility to partner with industry but requires written policies to avoid vendor capture or preferential access tied to donations.

Section 2(f) — Selection of Park Units

Objective criteria, public notice, and phased roll‑out

Selection is phased: an initial 5 units must be chosen within 180 days and an additional 20 within two years, using objective criteria that include biological landscape, operational readiness, education value, research alignment, and geographic/ecological diversity. The Secretary must post notices explaining selections and rationale. For parks, this means early engagement will determine inclusion; for NPS management, readiness and local capacity will be a practical gating factor.

Section 2(g) — Data, Sampling, Sequencing, and Long‑Term Storage

Standards, biosafety, domestic curation, and no export of samples

The Secretary must adopt or develop data standards that follow FAIR (findable, accessible, interoperable, reusable) principles and meet applicable USGS and federal biological data policies. Cybersecurity safeguards must align with NIST SP 800‑53 and 800‑111. The statute requires species evaluations to identify samples for long‑term storage, designates Smithsonian and USDA as repositories (with accessioning and institutional curation duties), requires enhanced containment for risky microbes, and prohibits physical transfer or export of samples outside the U.S. Operationally, repositories will need to expand capacity and implement controlled‑access processes for high‑risk materials.

Sections 2(i)–2(l) — Implementation Plan, Reporting, Funding, and Termination

Plan, reporting cadence, specific appropriations, and a 10‑year sunset

Within 180 days the Secretary must deliver an implementation plan covering expansion, partnerships, and an expedited non‑public access pathway for qualifying U.S. entities (those not owned or controlled by foreign entities of concern). The Secretary must produce a preliminary methodological report at 3 years (including a plan for sustainable funding, potentially a graduated subscription model) and a final report at program end. The statute authorizes specified appropriations for sequencing, museum and germplasm storage, and NCBI data hosting for fiscal years 2027–2031 and terminates program authority 10 years after enactment. Practically, the funding schedule gives a predictable ramp but leaves longer‑term financing to policy recommendations and potential subscription mechanisms.
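
The deadlines described in this article can be laid out as simple date arithmetic. The sketch below is illustrative only: the enactment date is invented, the milestone labels are hypothetical, and it assumes the 3-year report is measured from enactment, which the summary above does not state explicitly.

```python
from datetime import date, timedelta


def add_years(start: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def program_milestones(enactment: date) -> dict[str, date]:
    """Compute the latest dates for each milestone named in the article."""
    return {
        "initial_5_units_selected": enactment + timedelta(days=180),
        "implementation_plan_due": enactment + timedelta(days=180),
        "additional_20_units_selected": add_years(enactment, 2),
        # Assumption: the 3-year report clock starts at enactment.
        "preliminary_report": add_years(enactment, 3),
        "authority_terminates": add_years(enactment, 10),
    }


# Invented enactment date, used only to show the calculation.
for label, due in program_milestones(date(2026, 1, 1)).items():
    print(f"{label}: {due.isoformat()}")
```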

Who Benefits and Who Bears the Cost

Every bill creates winners and losers. Here's who stands to gain and who bears the cost.

Who Benefits

  • Conservation scientists and ecologists: gain standardized genomic baselines and long‑term physical vouchers to inform species management, population genetics, and restoration planning.
  • U.S. academic researchers and public institutions: receive a curated, interoperable dataset (and potential expedited access pathways) that lowers barriers to comparative genomics, evolutionary studies, and ecosystem‑scale science.
  • Domestic AI and biotech developers organized under U.S. law: can access pre‑publication bulk data under the program’s expedited pathways if they pass ownership/control screening, enabling model training and commercial R&D within the U.S. legal boundary.
  • Smithsonian and USDA repositories: receive a federal mandate and authorized appropriations to expand curation and germplasm capacity, enhancing national collections and preservation infrastructure.
  • National Park Service and public audiences: benefit from new scientific outputs and public‑engagement opportunities tied to park ecosystems and education programming.

Who Bears the Cost

  • Department of the Interior / USGS: must staff, manage, and oversee the program office and compliance tasks (sampling permissions, data standards, cybersecurity), creating ongoing administrative burdens.
  • Smithsonian Institution and USDA germplasm programs: although funding is authorized, these repositories must scale facilities, operations, and biosafety controls to accept accessioned material and meet access management obligations.
  • Small academic labs and nonprofit researchers: may face future costs if the recommended graduated subscription model is adopted for bulk or prioritized data access, or if expedited services are priced.
  • Indian Tribes and tribal governments: will need to engage in government‑to‑government consultations and may see cultural, sovereignty, or benefit‑sharing issues arise when program sampling touches tribal lands or culturally important species.
  • Private biotech contributors: must undergo vetting for conflicts of interest and face limits on how in‑kind contributions can be structured, potentially constraining desirable public‑private collaborations.

Key Issues

The Core Tension

The central dilemma is balancing open, broadly useful scientific data and long‑term collection for conservation against risks from exposing sensitive site data, potential pathogen misuse, and unequal commercial access—choices that force trade‑offs between transparency, biosecurity, tribal sovereignty, and who gets first use of federally generated genetic information.

The bill stitches together three distinct policy aims—biodiversity conservation, open scientific data, and national security/privacy controls—and implementing them will require granular choices with real trade‑offs. For example, the program mandates public data release but also allows suppression of sensitive locality details; defining that threshold is inherently judgmental and will shape whether data remain useful for fine‑scale ecological work.

Similarly, segregating microbial material that requires enhanced biosafety creates categorical responsibilities for repositories (containment infrastructure, restricted access) that are expensive and operationally complex despite the appropriations provided.

Access controls raise their own tensions. The law creates expedited, pre‑publication bulk access strictly for U.S. entities not owned or controlled by foreign entities of concern and permits vetted in‑kind contributions from industry; these provisions aim to protect national security and prevent foreign exploitation of genetic resources but risk creating a two‑tier system where well‑resourced domestic firms and institutions gain privileged early access.

The subscription model and reports point to long‑term cost recovery, but the bill does not settle long‑term governance (pricing, licensing, or benefit‑sharing), nor does it fully resolve questions of Indigenous consent, data sovereignty, or downstream commercial use of publicly derived genomic resources.
