Epstein Email Archive

About This Archive

This is a searchable archive of Jeffrey Epstein-related emails and documents released by the House Republican Oversight Committee on November 12, 2025.

  • Scanned Pages: 23,124
  • Documents: 2,887
  • Date Range: 2010-2019

What's in the Archive?

The original data release includes:

  • Scanned Images: 23,124 individual page scans (TIFF format)
  • Metadata: Document titles, dates, sender/recipient information, and Bates numbering
  • Text Files: Text extractions provided by the House Oversight Committee
  • Native Files: Original documents in their native formats (PDFs, etc.)

The documents consist primarily of email printouts, correspondence, and related materials that were part of the congressional investigation. Each page has been scanned and made available for public review.

What We've Added

To make this collection more accessible and searchable, we've enhanced it with:

Searchable PDFs

Every multi-page document has been compiled into a single PDF with embedded OCR (optical character recognition) text. This means you can:

  • Download entire documents as PDFs
  • Search within PDFs using your PDF reader
  • Copy and paste text from the documents

Full-Text Search

We've performed OCR on all 23,124 pages and created a searchable database. You can:

  • Search for any word or phrase across the entire collection
  • Use advanced search to filter by sender, date range, or multiple keywords
  • See highlighted snippets showing where your search terms appear
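As an illustration of the highlighted snippets mentioned above, here is a minimal sketch of how an excerpt could be cut around a match and marked up. The function name `make_snippet`, the `radius` parameter, and the `<mark>` markers are assumptions made for this example, not the site's actual code.

```python
import re


def make_snippet(text, term, radius=40, mark=("<mark>", "</mark>")):
    """Return an excerpt of `text` around the first match of `term`.

    Every case-insensitive match inside the excerpt is wrapped in the
    `mark` pair. Returns None when the term does not occur at all.
    """
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    match = pattern.search(text)
    if match is None:
        return None

    # Keep `radius` characters of context on each side of the first match.
    start = max(match.start() - radius, 0)
    end = min(match.end() + radius, len(text))
    excerpt = text[start:end]

    highlighted = pattern.sub(lambda m: mark[0] + m.group(0) + mark[1], excerpt)

    # Ellipses signal that the excerpt was cut out of a longer page.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + highlighted + suffix


if __name__ == "__main__":
    page = "The flight to Palm Beach leaves at noon."
    print(make_snippet(page, "palm beach", radius=5))
```

A real web application would also need to HTML-escape the page text before inserting the markers; that step is left out here to keep the sketch short.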

Search Features & Tips

  • Whole Word Matching: Check "Match whole words only" to search for "NSA" without matching words that merely contain those letters, such as "transactions" or "unsafe"
  • Short Terms & Special Characters: Search terms shorter than 4 characters (like "CIA", "FBI", "44") or containing special characters (like "+44" or email addresses) automatically fall back to a more flexible pattern-matching search
  • Exact Phrases: Use quotes for exact phrases, e.g., "palm beach" or "lolita express"
  • Wildcards: SQL wildcards are supported in the flexible search used for short terms and special characters. Use % to match any sequence of characters (e.g., %maxwell%) or _ to match a single character
  • Multiple Terms: In Advanced Search, use Pattern 1 and Pattern 2 to find documents containing both terms (AND logic)
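The rules above can be sketched as a small routing function. This is a hedged illustration only: the table and column names (`pages`, `ocr_text`), the function names, and the exact way phrases are handled are assumptions, not the archive's real implementation. The `%s` placeholders follow the parameter style used by common Python MySQL drivers.

```python
import re

# Assumed threshold, taken from the "under 4 characters" rule described above.
MIN_FULLTEXT_LEN = 4


def build_search(term):
    """Return (sql, params) for a single search term.

    Short terms and terms with special characters go to a LIKE query;
    everything else goes to a MATCH...AGAINST full-text query.
    """
    term = term.strip()
    is_phrase = len(term) > 2 and term.startswith('"') and term.endswith('"')
    bare = term[1:-1] if is_phrase else term
    has_special = re.search(r"[^A-Za-z0-9\s]", bare) is not None

    if len(bare) < MIN_FULLTEXT_LEN or has_special:
        # Flexible path: substring match. A user-supplied % is kept as-is;
        # otherwise the term is wrapped so it matches anywhere in the text.
        # Any _ in the term acts as a single-character wildcard.
        pattern = bare if "%" in bare else f"%{bare}%"
        return "SELECT id FROM pages WHERE ocr_text LIKE %s", [pattern]

    if is_phrase:
        # Boolean mode treats a double-quoted string as an exact phrase.
        sql = "SELECT id FROM pages WHERE MATCH(ocr_text) AGAINST (%s IN BOOLEAN MODE)"
        return sql, [f'"{bare}"']

    sql = "SELECT id FROM pages WHERE MATCH(ocr_text) AGAINST (%s IN NATURAL LANGUAGE MODE)"
    return sql, [bare]


def contains_whole_word(text, term):
    """Whole-word check for alphanumeric terms, using regex word boundaries."""
    return re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE) is not None


if __name__ == "__main__":
    for example in ["CIA", "+44", "%maxwell%", '"palm beach"', "flight"]:
        print(example, "->", build_search(example))
    print(contains_whole_word("wire transactions", "NSA"))
```

Passing the term as a bound parameter rather than formatting it into the SQL string keeps user input out of the query text.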

Enhanced Navigation

  • Browse by Date: Explore documents chronologically
  • Correspondent Directory: See the most active email participants
  • Random Discovery: Jump to random documents or pages to explore the collection
  • Document Viewer: View scanned pages with easy page-by-page navigation

How to Use the Site

Quick Search

From the home page, just type what you're looking for (names, topics, keywords) and hit Search. For example:

  • Search for maxwell to find mentions of Ghislaine Maxwell
  • Search for "lolita express" (with quotes) for exact phrase matches
  • Search for flight to find travel-related correspondence

Advanced Search

Click the "Advanced Search" button for more options:

  • Search multiple patterns: Find documents containing both "flight" AND "caribbean"
  • Filter by sender: Find all emails from a specific person
  • Filter by date: Narrow results to a specific time period
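To show how these options might combine, the following sketch assembles one parameterized query from two patterns, a sender filter, and a date range, joined with AND. The `documents` table and its `full_text`, `sender`, and `sent_date` columns are hypothetical names chosen for this example.

```python
from datetime import date


def build_advanced_search(pattern1=None, pattern2=None, sender=None,
                          date_from=None, date_to=None):
    """Return (sql, params); every supplied filter must match (AND logic)."""
    clauses = []
    params = []

    # Each non-empty pattern adds its own full-text condition.
    for pattern in (pattern1, pattern2):
        if pattern:
            clauses.append("MATCH(d.full_text) AGAINST (%s IN NATURAL LANGUAGE MODE)")
            params.append(pattern)

    if sender:
        clauses.append("d.sender LIKE %s")
        params.append(f"%{sender}%")

    if date_from:
        clauses.append("d.sent_date >= %s")
        params.append(date_from.isoformat())

    if date_to:
        clauses.append("d.sent_date <= %s")
        params.append(date_to.isoformat())

    # With no filters at all, fall back to a condition that is always true.
    where = " AND ".join(clauses) if clauses else "1=1"
    sql = f"SELECT d.id FROM documents AS d WHERE {where} ORDER BY d.sent_date"
    return sql, params


if __name__ == "__main__":
    sql, params = build_advanced_search("flight", "caribbean",
                                        date_from=date(2012, 1, 1))
    print(sql)
    print(params)
```

Only the fixed clause text is formatted into the SQL string; every user-supplied value travels separately in `params`.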

Browse Documents

  • By Date: See all documents organized chronologically
  • By Correspondent: View the most active email participants and click to see their correspondence
  • Random Exploration: Use the "Random Doc" or "Random Page" buttons in the header to discover documents serendipitously

View Documents

When viewing a document:

  • Page navigation: Use the bottom navigation bar to move through multi-page documents
  • View PDF: Click the "View PDF" button to download or view the searchable PDF version
  • Document info: See metadata like sender, recipient, date, and subject line (when available)

Understanding the Data

Bates Numbers

Each page has a unique identifier called a Bates number (e.g., "HOUSE_OVERSIGHT_010486"). Multi-page documents have a range of Bates numbers. This is a standard legal document numbering system.
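For readers who want to work with these identifiers programmatically, here is a small sketch that splits a Bates number into its prefix and page number and expands a first/last pair into the full range. It assumes every identifier follows the prefix-plus-zero-padded-digits shape of the example above; the function names are made up for illustration.

```python
import re

# Uppercase letters and underscores, followed by a run of digits.
BATES_RE = re.compile(r"^(?P<prefix>[A-Z_]+)(?P<number>\d+)$")


def parse_bates(bates):
    """Split a Bates number into (prefix, number, digit_width)."""
    match = BATES_RE.match(bates)
    if match is None:
        raise ValueError(f"not a recognized Bates number: {bates!r}")
    digits = match.group("number")
    return match.group("prefix"), int(digits), len(digits)


def bates_range(first, last):
    """List every Bates number from `first` to `last`, inclusive."""
    prefix, start, width = parse_bates(first)
    last_prefix, end, _ = parse_bates(last)
    if prefix != last_prefix or end < start:
        raise ValueError("range endpoints do not belong together")
    # Re-apply the zero padding so the output matches the original format.
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


if __name__ == "__main__":
    print(parse_bates("HOUSE_OVERSIGHT_010486"))
    print(bates_range("HOUSE_OVERSIGHT_010486", "HOUSE_OVERSIGHT_010488"))
```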

Document Quality

The documents are scans of printed emails and correspondence. Quality varies:

  • Some pages are crisp and clear
  • Others may be faded, have handwritten notes, or include redactions
  • OCR accuracy depends on the original scan quality

What's NOT in Here

This archive contains the documents as released by the House Republican Oversight Committee. We have not:

  • Edited or altered any content
  • Added commentary or analysis
  • Removed or redacted any material

Source & License

Original Source: House Republican Oversight Committee (November 12, 2025 release)

Public Domain: These documents are works of the U.S. Government and are in the public domain under 17 U.S.C. § 105. This means they can be freely used, reproduced, distributed, and adapted for any purpose without permission.

Archive Purpose: This site exists for research, journalistic, and archival purposes: to make these public records more accessible and searchable.

Technical Details

For those interested, this archive is built with:

  • OCR Processing: Tesseract OCR via OCRmyPDF for text extraction
  • Database: MySQL with full-text search indexes for fast queries
  • Web Application: Python Flask with responsive HTML/CSS/JavaScript frontend
  • Search: Natural language search using MySQL's MATCH...AGAINST
  • Hosting: Apache web server with mod_wsgi
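As a rough idea of what the OCR step could look like, the sketch below builds and runs an `ocrmypdf` command that adds a text layer to a PDF and writes the recognized text to a sidecar file. The file names, the choice of English as the OCR language, and the helper names are assumptions; the archive's actual pipeline and options are not documented here.

```python
import subprocess


def ocr_command(input_pdf, output_pdf, sidecar_txt, language="eng"):
    """Build the ocrmypdf command line for one document.

    --language selects the Tesseract language pack; --sidecar writes the
    recognized text to a separate plain-text file alongside the PDF.
    """
    return [
        "ocrmypdf",
        "--language", language,
        "--sidecar", sidecar_txt,
        input_pdf,
        output_pdf,
    ]


def run_ocr(input_pdf, output_pdf, sidecar_txt):
    """Run OCR; raises CalledProcessError if ocrmypdf exits with an error."""
    subprocess.run(ocr_command(input_pdf, output_pdf, sidecar_txt), check=True)


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the call.
    print(" ".join(ocr_command("document.pdf", "document_ocr.pdf", "document.txt")))
```

The sidecar text file is one plausible source for the text that gets loaded into the search database, though that link is an assumption rather than something stated above.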

Note: This is an independent archive created to facilitate public access to government documents. It is not affiliated with or endorsed by the House Oversight Committee or any government entity.