Entities & Resolution
How Moongraph extracts entities and merges duplicates.
This page explains what entity extraction does and why entity resolution matters for building useful knowledge graphs.
What entities are
Entities are the "things" mentioned in your documents. During graph creation, Moongraph identifies:
- People — Names of individuals
- Organizations — Companies, agencies, institutions
- Locations — Places, addresses, regions
- Dates — Specific dates, time periods
- Events — Named occurrences
- Concepts — Topics, ideas, domains
Each entity has:
- A label (the name as it appears)
- A type (Person, Organization, etc.)
- Properties (additional attributes like title, nationality, founding date)
- Source attribution (which documents and chunks mention it)
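As a rough illustration of the four parts listed above, an entity record could be modeled as follows. This is a sketch only: the class and field names (`Entity`, `SourceRef`, `document_id`, `chunk_id`) are assumptions made for the example, not Moongraph's actual schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceRef:
    """One place an entity was mentioned (hypothetical shape)."""
    document_id: str
    chunk_id: str


@dataclass
class Entity:
    """Illustrative entity record; all field names are assumptions."""
    label: str                                    # the name as it appears
    type: str                                     # e.g. "Person", "Organization"
    properties: dict[str, str] = field(default_factory=dict)
    sources: list[SourceRef] = field(default_factory=list)


smith = Entity(
    label="John Smith",
    type="Person",
    properties={"title": "CEO"},
    sources=[SourceRef(document_id="doc-1", chunk_id="chunk-3")],
)
```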
Why resolution matters
The same real-world entity often appears in different forms:
| Variations | Same entity? |
|---|---|
| "John Smith", "J. Smith", "Dr. John R. Smith" | Yes |
| "Acme Corporation", "Acme Corp", "ACME" | Yes |
| "Washington DC", "Washington, D.C.", "the capital" | Yes |
Without resolution, your graph would have three separate "John Smith" nodes instead of one. His relationships and source attributions would be split across those nodes, making connections harder to trace.
Entity resolution identifies when different mentions refer to the same real-world thing and merges them into a single node. The result is a cleaner, more useful graph.
How resolution works
During graph creation, resolution runs after entity extraction:
- Extraction — Entities are identified in each document chunk
- Resolution — Similar entities are grouped and merged
- Storage — The deduplicated graph is saved
Resolution uses several signals:
- Name similarity (string matching, fuzzy matching)
- Context (mentions that appear in similar contexts may refer to the same thing)
- Type consistency (a Person is unlikely to be the same as an Organization)
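To make two of these signals concrete, here is a minimal sketch that combines fuzzy name similarity with a type check. It uses Python's standard `difflib` as a stand-in; the function names and the `0.7` threshold are arbitrary choices for illustration, and nothing here describes the matching Moongraph actually performs. The context signal is left out for brevity.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a similarity score between 0 and 1 for two labels, ignoring case."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def likely_same(a: tuple[str, str], b: tuple[str, str], threshold: float = 0.7) -> bool:
    """Compare two (label, type) pairs; the threshold is an illustrative value."""
    (label_a, type_a), (label_b, type_b) = a, b
    if type_a != type_b:
        # Type consistency: a Person is unlikely to be an Organization.
        return False
    return name_similarity(label_a, label_b) >= threshold


same = likely_same(("J. Smith", "Person"), ("John Smith", "Person"))
nickname = likely_same(("Bobby", "Person"), ("Robert", "Person"))
```

In this sketch `same` is true because the two labels share most of their characters, while `nickname` is false: "Bobby" and "Robert" have little string overlap, which is why pure name similarity is not enough for nicknames.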
When entities are merged, non-canonical labels become aliases on the merged entity. All relationships and source attributions are combined.
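The merge step described above can be sketched as a function that keeps one canonical label, turns the remaining labels into aliases, and takes the union of relationships and sources. The dictionary keys and the tuple shapes for relationships and sources are hypothetical, chosen only to keep the example small.

```python
def merge_entities(entities: list[dict], canonical_label: str) -> dict:
    """Merge entity dicts into one record; the keys used here are assumptions."""
    aliases: set[str] = set()
    relationships: set[tuple[str, str]] = set()
    sources: set[tuple[str, str]] = set()
    for entity in entities:
        aliases.add(entity["label"])
        aliases.update(entity.get("aliases", []))
        relationships.update(entity.get("relationships", []))
        sources.update(entity.get("sources", []))
    # Every label except the canonical one becomes an alias.
    aliases.discard(canonical_label)
    return {
        "label": canonical_label,
        "aliases": sorted(aliases),
        "relationships": sorted(relationships),
        "sources": sorted(sources),
    }


merged = merge_entities(
    [
        {"label": "Acme Corporation",
         "relationships": [("EMPLOYS", "John Smith")],
         "sources": [("doc-1", "chunk-2")]},
        {"label": "Acme Corp", "sources": [("doc-2", "chunk-7")]},
        {"label": "ACME",
         "relationships": [("LOCATED_IN", "Washington DC")],
         "sources": [("doc-1", "chunk-2")]},
    ],
    canonical_label="Acme Corporation",
)
```

Using sets means a source or relationship shared by several of the merged entities appears only once on the result.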
When resolution works well
- Names have clear overlap ("J. Smith" and "John Smith")
- Entities appear in similar contexts
- Documents have consistent formatting
- Text quality is good (clean OCR, no typos)
When resolution struggles
- Completely different surface forms (nicknames like "Bobby" vs. "Robert")
- Common names where multiple people share the same name
- Poor document quality (OCR errors making names unrecognizable)
- Entities mentioned in completely different contexts
Manual merging
When automatic resolution misses a connection, you can merge entities manually:
- Select entities in the Entities tab
- Click Merge Selected
- Choose a canonical label
- Confirm the merge
See Merge Duplicate Entities for step-by-step instructions.
Manual merging is irreversible. All selected entities become one, with other labels saved as aliases.
Properties
Entities can have properties—key-value attributes extracted from text:
| Entity | Property | Value |
|---|---|---|
| John Smith | title | CEO |
| John Smith | nationality | American |
| Acme Corp | founded | 1985 |
| Acme Corp | industry | Technology |
When resolution merges entities, their properties are combined. If the same property exists with different values, both are preserved.
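One way to picture this rule is a merge that stores each property as a list of distinct values. The sketch below is an assumption about representation, not a description of how Moongraph stores properties; `merge_properties` is a hypothetical helper.

```python
def merge_properties(*property_maps: dict[str, str]) -> dict[str, list[str]]:
    """Combine property maps, keeping every distinct value for each key."""
    merged: dict[str, list[str]] = {}
    for props in property_maps:
        for key, value in props.items():
            values = merged.setdefault(key, [])
            if value not in values:   # identical values are stored once
                values.append(value)
    return merged


combined = merge_properties(
    {"title": "CEO", "nationality": "American"},
    {"title": "Chairman", "nationality": "American"},
)
```

Here the conflicting `title` values are both kept, while the matching `nationality` value is stored once.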
Source attribution
Every entity links back to where it was mentioned:
- Which documents
- Which chunks within those documents
- Which pages (if page information is available)
This provenance lets you verify extractions and cite sources in your work.
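As an illustration of working with this provenance, the sketch below groups an entity's mentions by document so each extraction can be checked against its source. The `Mention` type, its field names, and the sample file names are invented for the example; the page is optional because page information is not always available.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Mention(NamedTuple):
    """One mention of an entity (hypothetical shape)."""
    document: str
    chunk: str
    page: Optional[int] = None  # None when no page information exists


def chunks_by_document(mentions: list[Mention]) -> dict[str, list[tuple[str, Optional[int]]]]:
    """Group (chunk, page) pairs under the document they came from."""
    grouped: dict[str, list[tuple[str, Optional[int]]]] = defaultdict(list)
    for mention in mentions:
        grouped[mention.document].append((mention.chunk, mention.page))
    return dict(grouped)


mentions = [
    Mention("report.pdf", "chunk-3", page=2),
    Mention("notes.txt", "chunk-1"),
    Mention("report.pdf", "chunk-9", page=5),
]
by_document = chunks_by_document(mentions)
```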
Related
- Knowledge Graphs — Why graphs are useful
- Merge Duplicate Entities
- Graphs Reference — Entities tab details