Documents
Upload, organize, review, and analyze your contract documents.
Overview
The Documents section is where you manage all your contract documents. Upload documents, organize them into matters, and access detailed review, comparison, and analytics tools.
Uploading documents
Drag and drop PDF or Word documents onto the document list, or click the upload area to browse. ContractRabbit automatically processes each document through a multi-stage pipeline:
- Format conversion — Documents are converted into a rich structured format that preserves paragraph hierarchy, list numbering, table layouts, bold/italic formatting, and indentation — formatting details that carry legal significance
- Preprocessing — The document is normalized: empty paragraphs are collapsed, split tables are merged, inconsistent list levels are unified, inline page numbers are stripped, and space-based indentation is converted to structural indentation. This ensures consistent analysis regardless of how the document was authored
- Clause identification — The document is broken into logical clauses and sections using a legal-aware parser that correctly handles abbreviations like "Inc.", "LLC.", and "L.P.", section references like "Section 12.7", and semicolons in enumerated lists, all cases where general-purpose NLP tools produce incorrect splits
- Multi-document detection — If the file contains multiple distinct legal documents (e.g., an NDA with attached exhibits and schedules), ContractRabbit automatically detects document boundaries using signature block patterns, exhibit markers, and structural analysis, then classifies each sub-document independently
- Classification — Each clause is embedded as a 1536-dimensional vector and classified against your corpus using nearest-neighbor voting. The system learns from your existing labeled documents without explicit training
- Attribute extraction — Two layers run in sequence: document-level extraction identifies parties, effective dates, governing law, duration, and other headline attributes; then clause-level extraction processes every clause to find specific dollar amounts, dates, durations, entities, citations, and locations
- Scoring — Each clause is scored for party favorability, and aggregate scores are computed per document and across your corpus
You can upload multiple documents at once. Processing status is shown on each document row with real-time progress notifications.
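To illustrate why the clause identification stage needs legal-aware splitting, here is a minimal sketch. It is not ContractRabbit's parser: the function name `split_sentences` and the abbreviation list are hypothetical, and the approach (a regular expression plus an abbreviation lookup) is only one simple way to avoid splitting after "Inc." or inside "Section 12.7". It does not handle semicolons in enumerated lists.

```python
import re

# Hypothetical abbreviation list, for illustration only.
ABBREVIATIONS = {"inc.", "llc.", "l.p.", "corp.", "ltd.", "co.", "no."}

def split_sentences(text: str) -> list[str]:
    """Split on sentence-ending periods, skipping known abbreviations.

    A period only counts as a candidate boundary when it is followed by
    whitespace and a capital letter, so decimals such as '12.7' never split.
    """
    sentences, start = [], 0
    for match in re.finditer(r"\.\s+(?=[A-Z])", text):
        # Last whitespace-separated token ending at this period, e.g. 'Inc.'
        token = text[start:match.start() + 1].split()[-1].lower()
        if token in ABBREVIATIONS:
            continue  # the period belongs to an abbreviation
        sentences.append(text[start:match.start() + 1].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

sample = ("Widget Co. Ltd. and Acme Inc. Holdings agree to Section 12.7 "
          "of the MSA. Each party shall comply.")
for sentence in split_sentences(sample):
    print(sentence)
```

A naive splitter that breaks on every period followed by a space would cut this sample after "Co." and "Inc.", which is the kind of incorrect split described above.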
Document list
The main document list shows all your documents with sortable columns:
- Name — Document filename
- Document Type — Automatically classified (e.g., NDA, Service Agreement, Employment Agreement)
- Party / Counterparty — Extracted parties with corporate enrichment data
- Source — Where the document came from
- Lifecycle Stage — Current stage (Drafting, Internal Review, etc.)
- Created / Updated — Timestamps
Use the search bar and filters to narrow results by document type, party, counterparty, jurisdiction, effective date, or custom attributes.
Bulk operations
Select multiple documents to:
- Delete documents
- Reprocess documents (re-run the full extraction pipeline)
- Update metadata
- Extract parties
Document detail tabs
Click any document to open its detail view with the following tabs:
Review
Read through the clause-structured document with AI-powered analysis. Key features:
- Version control — Select and compare different versions with a dropdown showing commit history
- Change highlighting — Word-level diffs between versions
- Document tabs — Navigate between multiple documents detected within the file (e.g., the main agreement vs. Exhibit A)
- Alignment panel — Select a standard and generate clause-by-clause alignment recommendations that appear as inline tracked changes
- Accept/reject workflow — Review each AI recommendation individually, with feedback that trains future alignment sessions
- Export — Download the document as a clean final version or as a redline with tracked changes
- Sections panel — Jump to specific sections using the hierarchical clause tree
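As a rough illustration of the word-level diffs mentioned under change highlighting, the sketch below compares two versions of a clause using Python's standard `difflib` module. The function name `word_diff` and the whitespace tokenization are assumptions made for this example; the product's actual diff algorithm is not described here.

```python
import difflib

def word_diff(old: str, new: str) -> list[tuple[str, str]]:
    """Return (op, text) pairs where op is 'equal', 'delete', or 'insert'."""
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            result.append(("equal", " ".join(a[i1:i2])))
            continue
        # A 'replace' opcode is reported as a deletion followed by an insertion.
        if i2 > i1:
            result.append(("delete", " ".join(a[i1:i2])))
        if j2 > j1:
            result.append(("insert", " ".join(b[j1:j2])))
    return result

for op, text in word_diff("within thirty days of notice",
                          "within sixty days of written notice"):
    print(f"{op:>6}: {text}")
```

Comparing word lists rather than characters keeps each change aligned with whole words, which is closer to how a redline reads.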
Metadata
View extracted metadata and corpus scoring:
- Corpus scoring — Quadrant charts showing party vs. counterparty favorability
- Radar analysis — Drill down into detailed score breakdowns by clause category
- Key metrics — Summary cards with important document data points (effective date, duration, governing law, etc.)
Compare
Compare the current document against others in your corpus:
- Side-by-side cohort comparison
- Filter by attributes to find similar documents
- See how terms stack up across your portfolio
Analytics
Visual analytics for the document and related documents:
- Timeline charts — Effective date distribution and temporal trends
- Jurisdiction analysis — Geographic distribution of governing law and forums
- Dynamic filtering to explore different dimensions
Edit
Full document editing capabilities:
- Rich text editor with change tracking (Track Changes or Direct Edit mode, configurable per team or per matter)
- Section-based navigation with drag-and-drop reordering
- Version history with restore capability
- Changes panel for reviewing modifications before saving
Timeline
Complete audit trail of the document's lifecycle:
- Stage transitions with timestamps
- Who made each change and their role
- Visual progression through the workflow
How clause classification works
When ContractRabbit processes a document, every clause and section is embedded as a 1536-dimensional vector. These embeddings are compared against cluster centroids, which are average vectors computed from all previously classified clauses of each type. The nearest centroid determines the clause label.
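A minimal sketch of nearest-centroid labeling follows. The helper names, the toy 3-dimensional vectors, and the use of cosine similarity as the distance measure are all assumptions for illustration; the text above does not state which distance the classifier uses.

```python
import math

def mean_vector(vectors):
    """Component-wise average of equally sized vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]

def cosine_similarity(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def nearest_centroid(embedding, centroids):
    """Return the label whose centroid is most similar to the embedding."""
    return max(centroids,
               key=lambda label: cosine_similarity(embedding, centroids[label]))

# Toy 3-dimensional vectors stand in for real clause embeddings.
labeled = {
    "confidentiality": [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],
    "termination": [[0.0, 1.0, 0.0], [0.0, 0.9, 0.1]],
}
centroids = {label: mean_vector(vectors) for label, vectors in labeled.items()}

print(nearest_centroid([0.7, 0.3, 0.0], centroids))  # confidentiality
```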
This means classification improves over time: as you process more documents, centroids become more representative and new documents are classified more accurately. There is no manual training step — the system learns continuously from your corpus.
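One way a centroid can keep improving without a separate training step is a running mean that is updated each time a newly classified clause is added. The sketch below is an assumption about how this could work, not a description of ContractRabbit's internals; the `Centroid` class is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Centroid:
    """Running average of all embeddings assigned to one clause type."""
    vector: list[float]
    count: int = 0

    def add(self, embedding: list[float]) -> None:
        # Incremental mean: new_mean = old_mean + (x - old_mean) / n
        self.count += 1
        self.vector = [c + (x - c) / self.count
                       for c, x in zip(self.vector, embedding)]

centroid = Centroid(vector=[0.0, 0.0])
for embedding in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    centroid.add(embedding)

print(centroid.count, centroid.vector)
```

After the three updates the centroid equals the plain average of the three embeddings, so no stored history of past vectors is needed to keep it current.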
For complex clauses with enumerated sub-items, ContractRabbit decomposes them into parent clauses and child subclauses. Each subclause inherits context from its prefix (e.g., "Each party shall:") and is classified independently. If all subclauses receive the same label as the parent, they are collapsed back to avoid redundancy.
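The decompose-then-collapse behavior can be sketched as follows. The function `classify_with_subclauses` and the keyword-based `toy_classify` are hypothetical stand-ins; in the real pipeline the label would come from the embedding-based classifier, not from keyword matching.

```python
from typing import Callable

def toy_classify(text: str) -> str:
    """Hypothetical keyword classifier standing in for the real one."""
    lowered = text.lower()
    if "confidential" in lowered:
        return "confidentiality"
    if "indemnif" in lowered:
        return "indemnification"
    return "general"

def classify_with_subclauses(prefix: str, items: list[str],
                             classify: Callable[[str], str]) -> dict:
    """Classify a parent clause and its enumerated sub-items."""
    parent_label = classify(prefix + " " + " ".join(items))
    # Each subclause inherits context from the shared prefix.
    child_labels = [classify(f"{prefix} {item}") for item in items]
    if all(label == parent_label for label in child_labels):
        # Every child agrees with the parent, so collapse to avoid redundancy.
        return {"label": parent_label, "children": []}
    return {"label": parent_label, "children": list(zip(items, child_labels))}

uniform = classify_with_subclauses(
    "Each party shall:",
    ["keep the Confidential Information secret;",
     "return all Confidential Information on request."],
    toy_classify,
)
mixed = classify_with_subclauses(
    "Each party shall:",
    ["keep the Confidential Information secret;",
     "indemnify the other party for third-party claims."],
    toy_classify,
)
print(uniform)
print(mixed)
```

In the first call both sub-items share the parent's label and are collapsed; in the second, the differing sub-item labels are kept so each subclause can be reviewed on its own.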
Hybrid search
When you search across documents, ContractRabbit combines five signals in a single query:
- Lexical (BM25) — Traditional keyword matching with legal-aware normalization (e.g., "§" maps to "section")
- Vector (cosine similarity) — Semantic matching using clause embeddings, so "terminate the agreement" matches "end the contract"
- Taxonomy label — Matches clauses classified under a specific node in the clause taxonomy
- Structured attributes — Filters on extracted dates, amounts, durations, etc.
- Party favorability — Weights results by how favorable they are to a given party
Different search profiles weight these signals differently — discovery mode emphasizes breadth, while favorability mode emphasizes party scoring.
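To make the idea of search profiles concrete, the sketch below combines the five signals as a weighted sum. Everything specific here is an assumption: the weights are invented, each signal is assumed to be normalized to the range 0 to 1, and structured attributes are treated as a score even though in practice they may act as hard filters.

```python
SIGNALS = ("lexical", "vector", "taxonomy", "attributes", "favorability")

# Hypothetical weights; the actual profile weights are not documented here.
PROFILES = {
    "discovery": {"lexical": 0.30, "vector": 0.40, "taxonomy": 0.15,
                  "attributes": 0.10, "favorability": 0.05},
    "favorability": {"lexical": 0.15, "vector": 0.20, "taxonomy": 0.10,
                     "attributes": 0.05, "favorability": 0.50},
}

def hybrid_score(signals: dict, profile: str) -> float:
    """Weighted sum of per-signal scores, each assumed to lie in [0, 1]."""
    weights = PROFILES[profile]
    return sum(weights[name] * signals.get(name, 0.0) for name in SIGNALS)

def rank(candidates: list[dict], profile: str) -> list[dict]:
    """Order candidate clauses from highest to lowest combined score."""
    return sorted(candidates,
                  key=lambda c: hybrid_score(c["signals"], profile),
                  reverse=True)

candidates = [
    {"id": "A", "signals": {"lexical": 0.9, "vector": 0.9, "taxonomy": 1.0,
                            "attributes": 1.0, "favorability": 0.10}},
    {"id": "B", "signals": {"lexical": 0.5, "vector": 0.6, "taxonomy": 1.0,
                            "attributes": 1.0, "favorability": 0.95}},
]

print([c["id"] for c in rank(candidates, "discovery")])     # ['A', 'B']
print([c["id"] for c in rank(candidates, "favorability")])  # ['B', 'A']
```

The same two candidates swap places depending on the profile, which is the effect of weighting the signals differently.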