Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
[0.0.0-beta.2] - 2025-11-25
Added
GEDCOM Import (lib)
- GEDCOM 5.5.1 support - Import standard GEDCOM 5.5.1 files
- GEDCOM 7.0 support - Import GEDCOM 7.0 with new features
- GEDCOM 5.5.5 support - Import GEDCOM 5.5.5 specification samples
- Two-pass conversion - Entities first, then families for proper relationship handling
- Evidence chain mapping - GEDCOM SOUR tags → GLX Citations → GLX Assertions
- Place hierarchy building - Parse place strings into hierarchical Place entities
- Geographic coordinates - Extract MAP/LATI/LONG coordinates from GEDCOM
- Shared notes - Support for both GEDCOM 7.0 SNOTE and GEDCOM 5.5.1 NOTE records
- External IDs - Import GEDCOM 7.0 EXID tags (wikitree, familysearch, etc.)
- Comprehensive test coverage - 33 GEDCOM test files (5.5.1, 5.5.5, 7.0) successfully imported
- Large file support - Tested with files containing thousands of persons and events
- Edge case handling - Empty families, self-marriages, same-sex marriages, unknown genders
- Character encoding support - ASCII, UTF-8, Windows CP1252 (CRLF and LF)
GLX Serializer (lib)
- Single-file serialization - Convert GLX archives to single YAML files
- Multi-file serialization - Entity-per-file structure with random IDs
- Archive loading - Load both single-file and multi-file GLX archives
- Vocabulary embedding - Embed standard vocabularies using go:embed
- Vocabulary loading from directory - Load vocabularies from multi-file archives
- ID generation - Random 8-character hex IDs for entity filenames
- EntityWithID wrapper - Preserve entity IDs in multi-file format using _id field
- Collision detection - Retry logic for filename generation
- Configurable validation - Optional validation before serialization
- 12 standard vocabularies embedded in binary
- Round-trip preservation - Single→Multi→Single conversions preserve all data
CLI Commands (glx)
glx import- Import GEDCOM files to GLX format- Single-file and multi-file output formats
- Optional vocabulary inclusion (default: true)
- Optional validation (default: true)
- Verbose mode with import statistics
- Supports both GEDCOM 5.5.1 and 7.0
glx split- Convert single-file GLX to multi-file format- Splits archive into entity-per-file structure
- Includes standard vocabularies
- Preserves entity IDs
glx join- Convert multi-file GLX to single-file format- Combines multi-file archive into single YAML
- Restores entity IDs from _id fields
Schema Enhancements
- Properties field added to 5 entity types for extensibility:
- Source - Store GEDCOM ABBR, EXID, custom tags
- Citation - Store event type cited, role, entry date
- Repository - Store FAX, additional contacts, EXID
- Media - Store crop coordinates, alternative titles, EXID
- Assertion - Store assertion metadata
- Backward compatible - Properties fields are optional with omitempty
Project Organization
.claude/plans/directory for all planning documentsCLAUDE.mdproject context guide for AI assistants- Plans README documenting all planning files and current status
- Moved all planning docs from
docs/to.claude/plans/
Vocabularies & Standards
- Developer documentation - Comprehensive GEDCOM Import Developer Guide
- Architecture and conversion flow
- Entity mapping details
- ID generation (current incremental + future random)
- GEDCOM 5.5.1 vs 7.0 differences
- Malformed line recovery strategies
- Testing and debugging guides
- User documentation - Updated Migration from GEDCOM Guide
- Automated import instructions
- Testing and validation procedures
- Import result expectations
Fixed
GEDCOM Import
- Malformed line recovery - Parser now handles MyHeritage export bug
- Recovers from NOTE fields with missing CONT/CONC prefixes
- Gracefully imports files with HTML-formatted notes
- Test case: queen.ged (4,683 persons, line 15903 missing CONT prefix)
- Family event handling - Added missing ANUL, DIVF, EVEN to case statement
- Place type references - Fixed gedcom_place.go to use "state" instead of "state_province"
Vocabularies
- Event types vocabulary - Fixed probate description ("Probate of estate" not "of will")
- Place types vocabulary - Removed duplicate state_province alias (use "state" instead)
- Schema categories - Updated allowed categories in vocabulary schemas
- Event types: Added "legal", "migration"; changed "custom" → "other"
- Place types: Added "institution"; changed "custom" → "other"
- Source types vocabulary - Added to embedded vocabularies (was missing)
Code Quality
- Clean architecture - Removed file I/O from library layer
- Moved importGEDCOMFromFile to test helpers (gedcom_test_helpers.go)
- CLI handles file operations, lib works with io.Reader
- Better separation of concerns
- File organization - Renamed gedcom_7_0.go → gedcom_shared.go (more accurate)
Testing & CI
- Multi-file vocabulary loading - Fixed LoadMultiFile to properly load vocabularies from directory
- Vocabulary preservation - Vocabularies now correctly preserved in round-trip conversions
- CI test coverage - Updated GitHub Actions to explicitly run all tests
- Large file tests (habsburg.ged: 34,020 persons)
- Added 15-minute timeout for comprehensive test runs
- No tests skipped in CI (no -short flag)
- Test documentation - Fixed queen.ged README with correct software attribution
- GEDCOM TITL handling - Now uses proper
PersonPropertyTitleconstant instead of hardcoded string - GEDCOM name fields - Only populate
name.fieldsfrom explicit GEDCOM substructure tags (GIVN, SURN, etc.), not inferred from parsing the name string - Test data consistency - All testdata files updated to use unified name format
Removed
Attribute Event Types
- Removed attribute-type events from schema - Events are now strictly discrete occurrences with participants
- Removed from event.schema.json enum:
residence,occupation,title,nationality,religion,education - Removed
censusfrom event-types.glx vocabulary - These attributes are now represented as temporal properties on Person entities
- Removed from event.schema.json enum:
- Removed CENS (Census) event handling - Census records are skipped during GEDCOM import (TODO: re-implement as citations supporting property assertions)
- Converted RESI (Residence) to temporal property - GEDCOM RESI tags now create temporal
residenceproperties on Person entities instead of events
Quality Ratings Support
- Removed
quality_ratingsvocabulary - The GEDCOM 0-3 Quality Assessment scale was removed from the GLX specification- Deleted
quality-ratings.glxvocabulary file - Deleted
quality-ratings.schema.jsonschema file - Removed
qualityfield from Citation entity - Removed
QualityRatingtype from Go code
- Deleted
- Removed auto-generated assertion confidence - GEDCOM imports no longer auto-populate assertion confidence levels
- Confidence levels should reflect researcher judgment, not be inferred from QUAY values
- GEDCOM QUAY tags are now preserved in citation notes (e.g.,
GEDCOM QUAY: 2)
Assertion Entity Fields
- Removed
evidence_typefield - Evidence quality classification belongs on citations, not assertions - Removed
typefield - Redundant withclaimfield andtagsfor categorization - Removed
research_notesfield - Consolidated into singlenotesfield
Provenance Fields (All Entities)
- Removed
modified_at,modified_by,created_at,created_byfields - Redundant with git history; usegit logandgit blameinstead
Changed
Person Properties Schema
- Unified
nameproperty - Replaced fragmented name properties with single unified property- Old: Separate
given_name,family_nameproperties - New: Single
nameproperty withvalueand optionalfieldsbreakdown - Format:
name: { value: "John Smith", fields: { given: "John", surname: "Smith" } } - Supports temporal lists for name changes over time
- Fields include:
prefix,given,nickname,surname_prefix,surname,suffix
- Old: Separate
- Added
titleproperty - Nobility or honorific titles (temporal, like occupation)- Properly handles GEDCOM TITL tag imports
- Added
PersonPropertyTitleconstant
Vocabulary Updates
- person_properties vocabulary - Updated to reflect unified name structure
nameproperty now includesfieldssub-schema for structured breakdown- Added
titleproperty definition
Other
- Documentation structure - Separated user docs (docs/) from planning docs (.claude/plans/)
Technical Details
GEDCOM Import Coverage:
- 100% critical features implemented
- 94% high-priority features implemented
- PRODUCTION-READY status
- Comprehensive gap analysis completed
Serializer Features:
- Uses crypto/rand for ID generation
- 32 bits of randomness per ID (4.3 billion possible values)
- Collision probability: ~1 in 400,000 with 10,000 entities
- EntityWithID wrapper pattern for multi-file format
- All 12 standard vocabularies embedded with go:embed
Testing:
- All existing tests passing
- 48 new test cases for serializer
- 33 GEDCOM files tested for import (100% coverage of test files)
- Full round-trip serialization/deserialization tests
- Vocabulary preservation tests for both single-file and multi-file formats
- Comprehensive unit and integration tests
- Large file stress tests (3000+ persons, 4000+ events)
[0.0.0-beta.1] - 2025-11-18
Fixed
- Fixed GitHub release workflow to build on beta tags (
v*.*.*-beta*pattern) - Fixed VitePress build by adding
shikidependency towebsite/package.json
Changed
- Removed roadmap section from README (no longer maintaining public roadmap)
Removed
- Removed archive folder containing old planning documents
[0.0.0-beta.0] - 2025-11-14
Added
Specification & Standards
- Complete GENEALOGIX specification defining modern, evidence-first genealogy data standard
- 9 core entity types with full JSON Schema definitions:
- Person (individuals with biographical properties)
- Relationship (family connections with types and dates)
- Event (life events with sources and locations)
- Assertion (evidence-backed claims with quality assessment)
- Citation (evidence references with source quotations)
- Source (primary/secondary evidence documentation)
- Repository (physical storage information)
- Place (geographic locations with coordinate data)
- Participant (individuals involved in events)
- Repository-owned controlled vocabularies for extensibility
- Git-native architecture for version control and collaboration
- YAML-based human-readable format with schema validation
CLI Tool (glx)
glx init: Initialize new GLX repositories with optional single-file modeglx validate: Comprehensive validation with:- Schema compliance checking against JSON Schemas
- Cross-reference integrity verification across all files
- Vocabulary constraint validation
- Detailed error reporting with file/line locations
glx check-schemas: Utility for verifying schema metadata and structure- Support for both directory-based and single-file archives
- Cross-file entity resolution and validation
Documentation & Examples
- Comprehensive specification documentation (6 core documents)
- Complete examples demonstrating various use cases:
- Minimal single-file archive
- Basic family structure with multiple generations
- Complete family with all entity types
- Participant assertions workflow
- Temporal properties and date ranges
- Development guides covering:
- Architecture and design decisions
- Schema development practices
- Testing framework and test suite structure
- Local development environment setup
- User guides including:
- Quick-start guide for new users
- Best practices and recommendations
- Common pitfalls and troubleshooting
- Manual migration guide for converting from GEDCOM format
- Glossary of key terms and concepts
Testing & Quality Assurance
- Comprehensive test suite with:
- Valid example fixtures demonstrating correct usage
- Invalid example fixtures testing error handling
- Cross-reference validation tests
- Vocabulary constraint tests
- Schema compliance validation tests
- Automated CI/CD pipeline using GitHub Actions
- Full code coverage reporting
Project Infrastructure
- Apache 2.0 open-source license
- Community guidelines and code of conduct
- Contributing guidelines for developers
- GitHub issue and discussion templates
- Development container configuration for consistent environments
- Pre-configured VitePress documentation site