The bill creates a statutory pathway for independent, noncommercial research on the data held by large digital platforms. It tasks the National Science Foundation with approving “qualified research projects” and identifying the datasets necessary for those projects, while the Federal Trade Commission sets privacy and cybersecurity safeguards governing how platforms must deliver data and how researchers must handle it.
Platforms with at least 50 million U.S. monthly users would be required to provide specified, proportionate datasets unless doing so would impose an undue burden or create unacceptable security risks.
Beyond researcher access, the Act compels broad transparency reporting: searchable public repositories for highly disseminated public content, representative samplings of public posts, advertising disclosures, summaries of recommender/ranking algorithms and related company metrics, and content moderation statistics. It also creates a limited safe harbor protecting certain automated collection (including research accounts and browser extensions) used for journalism or research, while excluding use of the collected data to train large language models.
Enforcement and remedies are administered through the FTC’s existing unfair‑or‑deceptive practice framework; the Act also provides platform and researcher immunities when they comply with its terms.
At a Glance
What It Does
The NSF will establish an approval program for qualified research projects and identify what platform data is necessary; the FTC will define privacy and cybersecurity safeguards, require platforms to deliver qualified data, and conduct extensive rulemaking on public transparency about content, ads, algorithms, and moderation. The bill also creates immunity for compliant platforms and researchers and a separate safe harbor permitting certain automated collection for journalism and research.
Who It Affects
Large social platforms with ≥50 million U.S. monthly users, U.S. university and nonprofit researchers (excluding those affiliated with law‑enforcement/intelligence), the FTC and NSF, journalists using research accounts, and advertisers whose targeting and impressions must be reported.
Why It Matters
It converts long‑standing access requests and voluntary transparency practices into statutory obligations, forcing platforms to operationalize secure data delivery, public repositories, and algorithmic disclosures. That will change how researchers, watchdogs, and regulators study platform effects, while raising new compliance and privacy trade‑offs for platforms and institutions that handle sensitive datasets.
What This Bill Actually Does
The Act defines which companies are in scope and what counts as study‑worthy data. It targets platforms that operate user accounts, amplify user‑generated content, and deliver advertising, and that reach at least 50 million unique U.S. users in most months.
Important categories—direct private messages, biometric data, and precise geolocation—are carved out and cannot be designated as “qualified data.”
The National Science Foundation must, within a year, launch a program to solicit, review, and approve research proposals as qualified research projects based on scientific merit; proposals must identify affiliated U.S. universities or 501(c) nonprofits and meet institutional review board requirements (or an approved exemption). The NSF and FTC jointly determine what platform datasets are “necessary” and feasible for each approved project; they may allow consortia of researchers to undertake multi‑project work.
The FTC focuses on privacy and security: it prescribes safeguards tailored to the sensitivity of each dataset—encryption, delivery formats that limit reidentification, access and keystroke logs, and, where appropriate, secure physical or virtual environments.
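The statute does not prescribe any particular technical format, but a rough sketch of what a reidentification‑limiting delivery step with access logging could look like in practice is shown below; every name, key, and mechanism here is an illustrative assumption, not a statutory requirement.

```python
import hashlib
import hmac
import json
import time

# Hypothetical: assume one secret per approved project, held inside the secure environment.
PSEUDONYM_KEY = b"per-project-secret"

def pseudonymize(user_id: str) -> str:
    """Deterministic, hard-to-reverse token standing in for the platform's raw ID."""
    return hmac.new(PSEUDONYM_KEY, user_id.encode(), hashlib.sha256).hexdigest()

def prepare_record(record: dict) -> dict:
    """Keep only fields the safeguards would allow; pseudonymize the identifier."""
    return {
        "user": pseudonymize(record["user_id"]),
        "post_text": record["post_text"],
        "impressions": record["impressions"],
        # private messages, biometrics, and precise geolocation are never delivered
    }

def log_access(researcher: str, dataset: str, path: str = "access_log.jsonl") -> None:
    """Append-only access log of the kind the FTC safeguards contemplate."""
    with open(path, "a") as fh:
        fh.write(json.dumps({"ts": time.time(), "researcher": researcher,
                             "dataset": dataset}) + "\n")
```

A keyed hash lets records be joined within a delivered dataset without exposing raw platform identifiers, while the append‑only log is the sort of artifact a secure research environment could be required to retain.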
The Act requires prepublication review by the FTC to prevent disclosure of personal information or trade secrets and forbids government entities from compelling data from researchers who receive platform datasets under the program. Platforms that comply get a statutory safe harbor from suits arising solely from providing qualified data; researchers who misuse data face civil and criminal exposure.
Separately, the Act creates a safe harbor for automated collection of “publicly available information” (with enumerated exclusions) and the use of research accounts for journalism and research projects, so long as collectors adopt reasonable privacy protections, do not retain identifiable personal data unnecessary to the work, and do not use the data to train large language models.
The Commerce Department must issue clarifying regulations within 180 days for these provisions.
Finally, Section 9 gives the FTC broad rulemaking authority to require continuous public disclosures: searchable repositories of highly disseminated public content and content from major public accounts; representative samplings of public content weighted by impressions; an advertising repository with advertiser/payer, targeting parameters, reach, and impressions; semiannual reporting on recommender/ranking algorithm inputs, objectives, and company metrics; and content moderation and violating‑content statistics. The FTC may limit disclosures to protect privacy, trade secrets, or platform integrity and may scale requirements by platform size.
The Five Things You Need to Know
The bill applies to platforms that operate user accounts, serve user‑generated content, deliver advertising, and have at least 50 million unique monthly U.S. users.
The NSF must establish a research‑approval program within one year to designate “qualified research projects” and identify what data each approved project needs; institutional review board compliance is required.
The FTC prescribes dataset‑specific privacy and cybersecurity safeguards (encryption, delivery formats to prevent reidentification, access/keystroke logs) and may require secure physical or virtual environments and prepublication review to protect PII and trade secrets.
Compliant platforms receive a statutory safe harbor from state and federal causes of action arising solely from providing qualified data; qualified researchers are barred from reidentifying or commercially exploiting personal data and face civil and criminal penalties for violations.
Section 8 creates a separate safe harbor protecting certain automated collection of publicly available information and the use of research accounts for journalism and research—subject to privacy‑protection requirements and an explicit exclusion for training large language models.
Section-by-Section Breakdown
Every bill we cover gets an analysis of its key sections.
Key definitions and scope
This section sets the gatekeeping definitions: a “platform” is a site or app that permits user accounts, user‑generated content, and ad delivery and reaches ≥50 million U.S. monthly users. It defines “personal information” as any information linkable to a consumer or device, and explicitly excludes private DMs, biometric identifiers, and precise geospatial data from what can become “qualified data.” For implementers this matters because the definitional floor determines which companies must build infrastructure and which types of datasets are off‑limits from the outset.
NSF approval process and FTC safeguard role
The NSF launches and manages the scientific review and approval of research proposals and identifies the data elements necessary for each approved project; the FTC evaluates privacy and cybersecurity risks and prescribes safeguards. The statute requires joint guidelines, opportunities for platform comment and appeal, and criteria to assess whether researchers can comply with safeguards. Notably, determinations about qualified projects are not subject to judicial review—making the NSF/FTC process effectively final for project approval.
Platform obligations, continuity, notice, and safe harbor
Platforms must provide qualified data under the FTC’s safeguard terms and may not cut off access during an ongoing project unless they reasonably believe the safeguards have been breached; if they do, they must notify the FTC, which will promptly review. Platforms get immunity from state or federal suits that arise solely from releasing required datasets when they comply. The provision preserves platform authority to act immediately to protect safety or security, creating a carve‑out for incident response.
Researcher rules and penalties
Qualified researchers may use data only for the approved, noncommercial research purpose and must follow FTC‑mandated safeguards; they may not reidentify, disclose, or commercialize personal information, and may not share datasets with third parties. Intentional, reckless, or negligent breaches can be referred to DOJ or state authorities and carry civil and criminal exposure, tying researcher behavior to traditional enforcement channels rather than purely administrative remedies.
Reporting to Congress
NSF and FTC must produce a joint report within 24 months and annually thereafter listing each qualified project, the researchers and affiliations, platforms compelled to provide data, categories of datasets delivered, and the FTC’s safeguard terms. This creates a public record that can be used to audit program scope and to inform future legislative or regulatory changes.
Safe harbor for automated collection and research accounts
This separate safe harbor shields automated collection of publicly available information and the creation and use of research accounts for journalism and research from platform causes of action that rely on breach of terms of service, provided collectors take reasonable privacy measures, limit retention of identifiable data, and do not use the data to train large language models. The Commerce Department must issue clarifying regulations within 180 days; importantly, the statute defines “publicly available” and lists exclusions such as nonconsensual intimate images and biometrics.
FTC rulemaking for broad transparency disclosures
This sprawling section authorizes the FTC to require platforms to publish searchable public repositories and APIs for highly disseminated content and major public accounts, representative public‑content samplings weighted by impressions, an advertising repository (advertiser, payer, targeting parameters, reach, impressions), semiannual disclosures about recommender/ranking algorithms (signals, optimization objectives, company metrics), content‑moderation statistics, and data dictionaries to help researchers navigate available datasets. The FTC must balance disclosure with privacy, trade‑secret protection, and platform‑integrity risks and may scale requirements by platform size.
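To make the ad‑repository idea concrete, here is a minimal sketch of what one machine‑readable record might contain, based on the fields the section enumerates; the class name, field names, and values are hypothetical, since the actual schema is left to FTC rulemaking.

```python
from dataclasses import dataclass

@dataclass
class AdRecord:
    """Hypothetical shape of one advertising-repository entry (illustrative only)."""
    ad_id: str
    advertiser: str      # entity shown in the ad
    payer: str           # entity that paid for the placement
    targeting: dict      # targeting parameters chosen by the advertiser
    reach: int           # unique accounts reached
    impressions: int     # total times the ad was served

example = AdRecord(
    ad_id="2024-000123",
    advertiser="Example Brand",
    payer="Example Holdings LLC",
    targeting={"age": "18-34", "interest": "fitness"},
    reach=250_000,
    impressions=1_200_000,
)

# A machine-readable repository lets researchers derive their own metrics,
# for example average impressions per account reached.
print(example.impressions / example.reach)  # 4.8
```

The point of a repository in this form is that watchdogs can compute such derived metrics themselves rather than relying on figures the platform chooses to publish.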
Who Benefits and Who Bears the Cost
Every bill creates winners and losers. Here's who stands to gain and who bears the cost.
Who Benefits
- Academic and nonprofit researchers: Gain a formal, FTC‑backed pathway to access platform datasets that are otherwise proprietary or technically inaccessible, plus published data dictionaries and APIs that make independent analyses feasible and reproducible.
- Journalists and public‑interest organizations: Receive a statutory safe harbor for certain automated collection and research accounts, enabling investigations into algorithmic amplification, advertising influence, and content moderation without as much exposure to contract or computer‑access litigation.
- Policymakers, regulators, and watchdogs: Obtain annually compiled reports and ongoing public repositories and algorithmic summaries that support oversight, evidence‑based policymaking, and audits of platform effects on public discourse and public health.
Who Bears the Cost
- Large platforms (≥50M U.S. users): Face engineering and operational costs to inventory data, build secure delivery environments or APIs, implement FTC safeguard requirements, produce ad/content repositories, and manage review/appeals and prepublication processes, plus potential competitive risk from disclosure of company metrics.
- Research institutions and universities: Must build or fund compliance infrastructure (secure storage, access controls, logs), expand IRB processes to align with FTC safeguards, and manage legal exposure for researchers, increasing administrative overhead for grants and projects.
- FTC and NSF (and Commerce): Federal agencies must scale staffing, technical review capacity, and inspection/enforcement resources to evaluate projects, set safeguards, construct usable public reporting standards, and adjudicate platform appeals—an unfunded or underfunded workload risk.
Key Issues
The Core Tension
The central trade‑off is between creating rigorous, reproducible access to platform data for independent scrutiny and protecting individual privacy, platform security, and proprietary information: greater transparency and research utility require more data access and retention, which increases privacy and trade‑secret risks; stricter safeguards protect privacy and platform integrity but raise costs, slow research, and risk administrative bottlenecks that could blunt the law’s transparency aims.
The bill resolves certain access problems but leaves sharp implementation questions. The NSF/FTC joint process centralizes approval, yet the statute bars judicial review of project determinations—this expedites decisions but concentrates discretion and could reduce transparency if appeal paths are limited.
Prepublication review by the FTC intends to guard PII and trade secrets, but it creates a potential chokepoint: timing and criteria for review will materially affect research timelines and could chill work on politically sensitive topics.
Operationally, platforms and researchers must reconcile competing demands: reproducibility requires retention of sufficient data to allow replication, while privacy and FTC safeguards may require deletion or heavily redacted outputs. The Act permits the FTC to withhold disclosures to protect trade secrets or platform integrity, but it gives no clear formula for balancing those claims against public‑interest transparency.
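One common form of redaction is publishing only aggregate counts with small cells suppressed; the sketch below (the threshold and categories are assumptions, not anything the Act specifies) shows why such outputs protect individuals but cannot be exactly replicated.

```python
from collections import Counter

def redacted_counts(rows, key, min_cell=10):
    """Aggregate rows by `key`, suppressing any cell smaller than `min_cell`.

    Outputs like this protect individual users, but a later replicator cannot
    recover the exact underlying distribution from the published table.
    """
    counts = Counter(row[key] for row in rows)
    return {k: (v if v >= min_cell else "<suppressed>") for k, v in counts.items()}

rows = [{"topic": "health"}] * 42 + [{"topic": "elections"}] * 7
print(redacted_counts(rows, "topic"))
# {'health': 42, 'elections': '<suppressed>'}
```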
The safe harbor for automated collection is a pragmatic recognition of journalistic practice, yet its boundaries—what counts as “reasonable measures” to avoid identifying individuals, how research accounts are monitored, and how to prevent gaming—are left to Commerce rulemaking and will determine whether the safe harbor is protective or easily abused.
Finally, the Act’s exclusions (no LLM training under the safe harbor, no biometrics or precise geolocation as qualified data) answer some risks but raise cross‑border and derivative questions: aggregate or inferred datasets may still support sensitive inferences; companies may argue “undue burden” to deny access; researchers could face state or civil negligence claims despite statutory immunities; and excluding law‑enforcement‑ and intelligence‑affiliated researchers protects the program’s independence but forecloses certain kinds of public‑interest collaborations.