Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased
Added
- SECURITY.md safe-harbor clause and coordinated-disclosure embargo policy — Adds the two elements the OpenSSF OSPS Baseline Level 2 "coordinated vulnerability disclosure" control was missing. A Safe Harbor section (adapted from the disclose.io Core Terms, a widely used open-source safe-harbor framework) assures good-faith researchers we will not pursue or support legal action under anti-hacking laws such as the US CFAA for accidental, good-faith policy violations, and sets the guardrails (only test your own machine and data, minimise impact, report before disclosing) that keep research inside that harbor. A Coordinated Disclosure and Embargo section publishes the advisory when the fix ships and treats 90 days as a backstop (the outer limit beyond which an issue should not stay embargoed indefinitely; extendable by mutual agreement), names who is inside the embargo (maintainers via the private GitHub Security Advisory draft, plus the reporter), states the embargo-conduct expectations on both sides, and documents how downstream consumers are notified — a published GitHub Security Advisory with a requested CVE that lands in the GitHub Advisory Database and the Go vulnerability database, alerting go-glx/glx users through govulncheck, dependency-review, and Dependabot. SECURITY-POSTURE.md is updated in step: the Level 2 coordinated-disclosure row flips from partial to met, clearing the last Level 2 gap, so the headline self-attestation advances from Level 1 (most Level 2 met) to Level 2 (most Level 3 met), leaving #269 (SBOM) as the sole remaining gap. Closes #424.
- Binary archive cache for fast repeated loading (glx cache) — Every command that loads a multi-file archive re-parses every .glx YAML file on each run; on large archives that is tens of seconds and gigabytes of peak memory per invocation. A new opt-in binary cache persists the fully parsed archive to {archive}/.glx/cache.bin (via encoding/gob) so subsequent loads deserialize a blob instead of re-parsing YAML. New glx cache build [--force], glx cache clean, and glx cache status subcommands manage it; read commands (summary, query, timeline, analyze, stats, vitals, places, cite, cluster, coverage, diff, duplicates, evidence, path, search, report, export, and the ancestor/descendant tree) transparently use a fresh cache when one is present. The cache is disposable and best-effort: any detected problem (missing, stale, corrupt, version/struct mismatch, implausibly large, decode failure) silently falls back to the authoritative YAML parse, so a missing, stale, or unreadable cache is ignored rather than trusted. The cache file is size-capped and decoded through a bounded reader before any gob decode, since gob is not hardened against untrusted input (a hostile cache could otherwise be planted in a shared repo). Staleness is decided solely by a stat-only (path, size, mtime) fingerprint over all .glx files (no file reads); the archive's git HEAD commit and clean state are also recorded (read in-process via the pure-Go github.com/go-git/go-git/v5 library, so no git binary on PATH is required) and surfaced by glx cache status, but do not affect staleness. As with any mtime-based cache, a content edit that preserves a file's size and mtime is the one change the fingerprint cannot see; glx cache build/glx cache clean forces a refresh in that corner case. GLX_CACHE=auto additionally builds the cache on the first miss; GLX_CACHE=off bypasses it entirely. glx init now git-ignores .glx/, and write/transform commands keep parsing YAML directly so their output is never affected by the cache.
Scoped to multi-file (directory) archives; single-file archives and parse-time memory reduction are out of scope. (#197)
- Release binaries now carry SLSA build-provenance attestations — The release.yml workflow runs actions/attest (pinned to the v4.1.0 commit SHA) after GoReleaser, taking GoReleaser's checksums.txt as subject-checksums so a single attestation covers every released archive. The provenance is signed keyless via the GitHub OIDC token and stored on GitHub, letting downloaders verify a binary was built by this repo's CI with gh attestation verify --owner genealogix <file>. This is on top of the existing cosign keyless signing of the checksums file (the two are complementary: cosign proves the checksums file's integrity, the attestation proves build provenance per SLSA). Required adding the attestations: write permission to the release job. actions/attest-build-provenance is only a composite wrapper around actions/attest that forwards empty predicate inputs, so calling actions/attest directly — which defaults to the SLSA provenance predicate when no predicate is supplied — is equivalent. SBOM generation is tracked separately in #269. (#256)
- NOTICE file for Apache 2.0 third-party attribution — Adds a top-level NOTICE file declaring the GENEALOGIX (GLX) copyright and attributing the third-party Go components statically linked into the distributed glx binary, grouped by direct vs. transitive dependency with each component's SPDX license, copyright holder(s), and source URL. The component list is derived from the modules actually compiled into the binary (go version -m), not the source-tree go.mod, so test-only dependencies such as github.com/stretchr/testify (which are not redistributed) are intentionally omitted. The file is also added to the GoReleaser archive files list so it ships alongside LICENSE in every release archive. (#427)
- glx migrate --rename-ssn-to-national-id — New opt-in flag on glx migrate that renames the legacy US-centric ssn person property to the internationalized national_id (the vocabulary rename in #532), so pre-rename archives can adopt the new key without hand-editing.
It moves person.properties.ssn → national_id, re-points person-subject assertions, and renames an inlined person_properties.ssn definition (multi-file archives also pick up the corrected key when the standard vocabulary file is regenerated on write). A person or vocabulary already carrying national_id is never overwritten — the legacy ssn is left in place with a warning so the user can reconcile by hand. Non-person-subject assertions (an event/relationship may legitimately carry a custom ssn property) are untouched. Idempotent. GEDCOM compatibility is unaffected: both keys map to the SSN tag. (#532)
- Seven new standard source types in source-types.glx — Added family_bible (Family Bible records of births, marriages, deaths — often the only pre-civil-registration source for US colonial/antebellum research), gravestone (tombstone/headstone inscriptions, cemetery memorials), dna_test (genetic test results — autosomal, Y-DNA, mitochondrial DNA), memoir (autobiographies, diaries, personal journals), manuscript (unpublished manuscripts or typescripts), map (historical maps, atlases, survey plats), and social_media (social media posts and profiles), growing the standard vocabulary from 18 to 25 types. Source type is an open vocabulary, so archives could already add these as custom types; standardizing the common ones keeps classification consistent across archives and unlocks GEDCOM round-trip. The parallel SourceType* constants in go-glx/constants.go were extended to match, and the GEDCOM SOUR.TYPE import mapping now recognizes the aliases bible/gravestone/tombstone/headstone/dna/memoir/diary/manuscript/map/social (explicit-value mapping, no false-positive risk) plus identity mappings for the canonical keys family_bible/dna_test/social_media so all seven types survive a GLX→GEDCOM 7.0→GLX round-trip (the exporter writes Source.Type verbatim as the TYPE value; without the identity entries these three would re-import as other). Title-based inference picks up distinctive phrases (e.g. "family bible", "headstone", "dna test", "survey plat", checked before the broad "land" keyword) while deliberately avoiding the bare word "map" so place names like "Mapleton" do not misclassify. The source entity spec and the vocabularies reference were updated in step. (#563)
- Deterministic schema/code drift detection (make check-code-drift) — A new Go tool under tools/driftcheck compares the go-glx type definitions (via reflection) against the JSON Schemas in specification/schema/v1, catching structural drift — missing/extra fields, yaml-tag mismatches, omitempty-vs-required mismatches, and type-family mismatches — on every PR with no API keys. It is the deterministic counterpart to the LLM-based /check-code-drift slash command: zero false positives by design, with ambiguous/semantic cases still deferred to the LLM checker. It resolves cross-file and in-file $refs, recurses into nested types (Participant, EntityRef oneOf, Submitter, Search), handles the custom NoteList/DateString marshalers, and compares the unified VocabularyEntry against the union of every type-vocabulary schema. Known, by-design drift is suppressed by reading .claude/drift-allowlist.yaml (matching the same per-symbol identity the slash commands use). Wired into make check, runs as a dedicated validate-spec CI job, and is guarded by an in-repo regression test that fails if the committed types and schemas ever diverge. The closed-issue staleness gate noted in #797 needs issue-state lookups and remains future work. (#673)
- assertion-workflow example now demonstrates conflicting evidence — The example's README repeatedly promised conflicting-evidence support ("Doesn't capture conflicting evidence", "Documenting conflicting evidence", the "Conflicts: Hidden vs Documented" table row), but the archive.glx contained only corroborating assertions. Added a genuine conflict on Robert's birth year: a new citation-robert-birth-lore (family oral history placing his birth in 1956) backs a disproven, low-confidence assertion-robert-birth-lore that contradicts the proven birth-certificate assertion (1955-03-22) — two assertions on the same property of the same event with different values, resolved in favor of the primary source.
A new "Conflicting Evidence (and Its Resolution)" README section walks through the proven/disproven/disputed status pattern, demonstrating GPS element 4 (resolution of conflicting evidence). (#575)
- .claude/drift-allowlist.yaml — Machine-readable, self-validating allowlist of known, per-symbol drift suppressions for the /check-*-drift slash commands, replacing inline "known drift" prose with an auditable file. Each entry is either permanent (by-design) or a temporary deferral that must reference a tracking issue; this is enforced by .claude/drift-allowlist.schema.json via the new make check-drift-allowlist target, wired into make check and the validate-spec CI workflow. The closed-issue staleness gate (failing when a temporary entry's tracking issue is closed) is left to the deterministic drift checkers tracked by #673 / #295. (#797)
- glx validate --stdin --entity-type — Validate a single entity snippet piped on stdin against its entity-type schema, without writing a temp file or wrapping it in archive shape. echo '<yaml>' | glx validate --stdin --entity-type person runs structural validation only (Pass 1 / JSON Schema — no cross-reference resolution) and exits 1 on a structural failure or malformed-YAML input (content was piped but is unparseable — a content failure, like a structural one), 2 on a bad invocation (unknown or missing --entity-type, stray path arguments, empty stdin — nothing piped, or combining --stdin with --report); the --entity-type allow-list is derived by reflection from glxlib.AllEntityTypes (entity singulars such as person) and the GLXFile vocabulary collection keys (event_types, place_types, confidence_levels, …), so a vocabulary entry is validated against its own schema — e.g. a place_types entry's category: administrative passes against the place-type enum instead of being rejected by the event-type one. New vocabularies added to GLXFile are picked up automatically with no parallel list to maintain.
Lets the check-* drift skills validate documentation fragments directly instead of the mktemp/cat/rm temp-file dance. (#910, #1016)
- Deterministic drift-check CI layer (.github/workflows/drift-checks.yml) — Two new no-LLM checks complement the existing make check-code-drift (#673): spec↔schema field parity (scripts/drift-checks/spec-schema-drift.mjs) compares the field tables in specification/4-entity-types/*.md against each schema/v1/*.schema.json on both presence and required/optional, warn-only (DRIFT_STRICT=1 makes it blocking) (#309); and schema↔schema backward-compatibility (specification/schema-compat.mjs) diffs PR-changed schemas against the base branch with json-schema-diff-validator and hard-fails on backward-incompatible changes (#311). validate-schemas.mjs additionally validates every 5-standard-vocabularies/*.glx against its vocabulary schema (#839), and make check-code-drift findings now carry go-glx/types.go:NN source lines via a new I/O-free AST extractor (tools/internal/structdump), satisfying #676 item 11 (#795). Both Node checks expose an I/O-free core (compareEntity, classifySchemaChange) covered by fixture-based node:test unit tests (make test-scripts, wired into make check and the validate-spec CI job), so the hand-rolled markdown-table parser and the breaking-vs-compatible classifier are pinned rather than only exercised against the live tree. #309 remains open: its acceptance criterion is failing CI on drift, so it tracks the future flip of the warn-only parity check to DRIFT_STRICT=1 once the parser is proven. (#309, #311, #839, #795, #1016)
- Two new standard relationship types in relationship-types.glx — Added civil_union (legally recognized union distinct from marriage in its jurisdiction) and common_law_marriage (marriage established by cohabitation and repute rather than ceremony or registration), growing the standard vocabulary from 20 to 22 types. Both are legally distinct from marriage in many jurisdictions and exist as separate categories in GEDCOM X (CivilUnion, CommonLawMarriage couple facts) and Gramps (CIVIL UNION), so archives previously had to collapse them into marriage or partner. The parallel RelationshipType* constants in go-glx/constants.go were extended to match, and the vocabularies reference prose was updated in step. Neither type declares a gedcom: mapping — GEDCOM 7.0 has no dedicated tags, and the importer's documented behavior (see #544) of collapsing MARR TYPE "civil union"/"common law" into a marriage relationship plus a marriage_type event property is unchanged; declaring MARR on these entries would make the GEDCOM→GLX reverse lookup ambiguous. Whether GEDCOM import should map recognized MARR TYPE values onto the new relationship types (which would also require widening the export-side FAM reconstruction filter that currently emits only marriage relationships) is a separate behavioral decision, deliberately not made here. (#556)
Changed
- JSON schemas migrated from draft-07 to JSON Schema 2020-12 — Every *.schema.json under specification/schema/ (the meta-schema, all entity schemas, and all vocabulary schemas), plus the repo-internal .claude/drift-allowlist.schema.json and .claude/skills/check-suite/findings.schema.json, now declares "$schema": "https://json-schema.org/draft/2020-12/schema". The draft-07-isms were migrated mechanically: definitions → $defs (with all #/definitions/... $refs rewritten), and the Place schema's latitude↔longitude co-requirement moved from array-form dependencies to dependentRequired. Validation behavior is unchanged — the renames are identity-preserving, and the Go validator (santhosh-tekuri/jsonschema/v6, swapped in by #268) already speaks 2020-12 natively. The Node tooling went dialect-aware: validate-schemas.mjs and validate-drift-allowlist.mjs import ajv's 2020 class (the default ajv export is draft-07-only), the meta-schema now requires the 2020-12 dialect declaration, and tools/driftcheck reads $defs. Because json-schema-diff-validator (the schema↔schema backward-compatibility gate, #311) predates 2020-12 and would flag the keyword renames as breaking removals, schema-compat.mjs gained an exported normalizeDialect() that canonicalizes both sides to draft-07 keywords before diffing — dialect renames diff as no-ops while real structural breaks still fail the gate (unit-tested in schema-compat.test.mjs). The check-code-drift and check-schema-drift skills now state the dialect explicitly so LLM-assisted checks don't suggest draft-07 idioms, and ADR-0007 records the decision (2020-12 now; TypeSpec-as-upstream deferred, not rejected). Consumers validating archives against the published schemas need a 2020-12-capable validator; most maintained validators read both drafts. Closes #794.
- PR template corrected and given DCO / AI-disclosure reminders — .github/PULL_REQUEST_TEMPLATE.md no longer claims the title "Description must start with an uppercase letter": lint-pr-title.yml sets no subjectPattern, so casing is not CI-enforced (a lowercase subject passes today). The top comment now names lint-pr-title as the source of truth for the allowed type list and notes that on squash-merge this title is normally the commit subject. A grouped "before submitting" comment block adds two reminders the template was missing for policies that live in CONTRIBUTING.md: signing off every commit per the DCO (with the git rebase --signoff recovery one-liner) and disclosing substantial AI assistance via an Assisted-by: trailer (kept SHOULD-level, no mandatory checkbox). The CHANGELOG reminder now mentions the required issue/PR reference, the "Related issues" hint distinguishes a closing keyword from a bare #123, "What and why" points spec/data-model changes at the proposal + ADR gate, and "Testing" asks for a screenshot on website/UI changes. The template stays lean (prose/HTML-comment prompts, no rendered checklist) — deliberately not adopting PR-template forms (GitHub has none in 2026), multiple templates, a long checklist, or a bespoke drift check. Automating the DCO check itself is tracked in #1057. (#1056)
- Auto-generated event titles no longer append the year in parentheses — GEDCOM import titles drop the trailing (YEAR) suffix: Birth of John Smith (1850) becomes Birth of John Smith, and Marriage of John Smith and Jane Doe (1850) becomes Marriage of John Smith and Jane Doe. The year was redundant with the event's own date field, and was frequently wrong: it came from ExtractFirstYear, which only reliably parses dates the importer normalized to ISO, so non-ISO dates (full month names, mixed case, numeric DD/MM/YYYY) surfaced the day-of-month instead of the year (e.g. Birth of John Smith (15)). The parenthetical was an implementation choice in #99, not an acceptance criterion of the originating feature request (#86). GenerateEventTitle no longer takes a date argument. The underlying ExtractFirstYear mis-parse (which also affects temporal validation, duplicate detection, and privacy cutoffs) is tracked separately in #1025. (#1024)
- ssn person property renamed to national_id — The standard person property had a US-specific label ("Social Security Number") contradicting its generic description ("National identification number"), making GLX read as US-centric for UK/Canada/Australia researchers whose records use NI numbers, SINs, or TFNs. The property key is now national_id with label "National Identification Number" and a description listing international examples. GEDCOM compatibility is preserved — the gedcom: "SSN" mapping is unchanged, so importing a GEDCOM SSN tag now populates national_id and exporting national_id still emits SSN (GEDCOM 7's SSN tag is itself defined as US-only). Archives using the old ssn key should rename it to national_id. (#532)
- golangci-lint now enables the std-error-handling exclusion preset — golangci-lint v2 ships its built-in exclusion presets disabled by default (v1's exclude-use-default: true has no direct v2 equivalent), so without the preset errcheck flags any unchecked Close/Flush/print/Remove/Setenv return that isn't manually suppressed. The std-error-handling preset suppresses those — ignoring them is idiomatic Go (e.g. defer f.Close() on a read-only handle) — making the manual _ = f.Close() and //nolint:errcheck // CLI output workarounds the codebase had accumulated unnecessary going forward. The change is behavior-neutral on the current tree (errcheck already reported zero issues, because every such call was hand-suppressed); it only removes future friction. The other three v2 presets are intentionally left off: common-false-positives filters only gosec and is deferred to the gosec-enablement work in #374, comments is redundant with revive's already-disabled package-comments, and legacy covers only unsafe.Pointer casts the codebase does not use. (#630)
- source-properties.glx descriptions rewritten to be self-contained — The abbreviation, call_number, events_recorded, and agency property descriptions no longer reference GEDCOM tags as their definition (e.g. "(from GEDCOM ABBR tag)"); each now stands on its own for readers unfamiliar with GEDCOM. The GEDCOM origin remains captured in each property's structured gedcom: field. publication_info carried the identical "(from GEDCOM PUBL tag)" pattern and was cleaned up in the same pass for consistency (not listed in the issue, but the same defect). (#558)
- make build-cli now builds with GoReleaser's -trimpath and stripped ldflags — the target builds with -trimpath and -ldflags "-s -w -X main.version=$(VERSION)" instead of a bare go build, so local binaries report a version (glx --version), strip debug symbols (~27% smaller), and drop local filesystem paths for reproducibility.
A new VERSION variable defaults to dev (matching the glx/cli_commands.go fallback) and can be overridden, e.g. make build-cli VERSION=0.1.0-local. This aligns the target with GoReleaser on stripping, -trimpath, and version injection; GoReleaser additionally injects commit/date (added in #384), which the Makefile omits to keep local builds git-independent. (#440)
- /check-code-drift and /check-schema-drift now read the drift allowlist instead of hardcoding known-drift notes inline. Removed the stale Metadata.Notes "known drift point" note — that drift was already resolved by widening glx-file.schema.json metadata.notes to oneOf[string, array], which is exactly the go-stale-and-mislead failure mode #797 targets. (#797)
- README.md restructured to surface Install/Quick Start above the fold — Installation and Quick Start now sit immediately under the badges, so first-time visitors see "how to use it" before "why to use it" (the previous order buried Quick Start at line 143 behind a 90-line marketing section). The "Why GENEALOGIX?" comparison was shortened — the comparison table is kept, but the inline GEDCOM-vs-GLX code blocks and the three "Beyond Exchange" subsections were dropped or linked out to Core Concepts. The redundant "What is GENEALOGIX?" section was renamed "Features" (bullet list preserved). A new flat "Documentation" link list replaces the old "Quick Links" header and absorbs the doc-related entries that were buried under "Community & Support". "Community & Support" itself was compacted from eight emoji-headed subsections into a single table covering Issues, Discussions, Chat, Mailing list, Contributing, Code of Conduct, Security, and Releases. The "Getting Help" walkthrough (numbered For-Users / For-Developers steps), the "Project Status" beta-10 feature checklist, and the "Acknowledgments" prose were dropped outright rather than folded into the table. Length dropped from 335 to 201 lines (-40%).
No link contracts removed except a pre-existing broken link to glx/tests (that directory does not exist; the Go tests live alongside their source as *_test.go). Closes #472.
- Pinned six third-party GitHub Actions to release commit SHAs — golangci/golangci-lint-action (lint.yml), lycheeverse/lychee-action and peter-evans/create-issue-from-file (lychee.yml), goreleaser/goreleaser-action (release.yml), DavidAnson/markdownlint-cli2-action (lint-markdown.yml), and codecov/codecov-action (validate-spec.yml) were floating on bare @vN major tags; each is now pinned to its latest release's 40-character commit SHA with a trailing # vX.Y.Z comment, matching the repo's existing SHA pins (golang/govulncheck-action, amannn/action-semantic-pull-request) and the .github/CLAUDE.md "always pin to vX.Y.Z, never @vN" policy. (codecov/codecov-action was not in #963's enumerated list but surfaced during the repo-wide sweep the issue calls for, and the OpenSSF Scorecard Pinned-Dependencies check scores it too.) Mutable tags were the attack vector in the March 2025 tj-actions/changed-files (CVE-2025-30066, ~23k repos) and reviewdog/action-setup (CVE-2025-30154) supply-chain compromises, where existing tags were rewritten to point at malicious commits; SHA pins are immutable and were untouched. The existing Dependabot github-actions entry keeps the pinned SHAs (and their version comments) current. First-party actions/* floats remain the repo's accepted convention and are out of scope. Closes #963.
- Release job no longer restores a Go build cache writable by pull_request runs (cache: false) — release.yml's Set up Go step set cache: true, restoring the same actions/setup-go Go build cache that pull_request-triggered workflows (security.yml's govulncheck/gosec, validate-spec.yml) populate. Because the release job is privileged — it holds id-token: write for cosign keyless signing — restoring a cache a lower-trust run can write is a cache-poisoning path: a poisoned entry would taint the binaries upstream of signing, so provenance and signatures wouldn't catch it (the recognized cache-poisoning class; cf. the Angular Mar-2026 / TanStack May-2026 advisories). The step now sets cache: false; the release job already does a clean go mod download, so the cache bought little on a cold tagged build. This is defense-in-depth given Actions cache scope isolation (a fork PR can't directly write the release run's cache — a poisoned entry would first have to land on main), removing the trust-boundary bridge entirely. A general rule (id-token: write must never coexist with a pull_request-reachable cache) was added to the .github/CLAUDE.md release section; wiring zizmor's cache-poisoning audit into CI (#928) would catch regressions generically. (#1051)
- Added a top-level read-only permissions: default to the auto-update-branches.yml workflow — the workflow set permissions: only at the job level, leaving the top-level GITHUB_TOKEN scope at the repository default. OpenSSF Scorecard flags this under Token-Permissions ("no topLevel permission defined"): a future job added without its own permissions: block would silently inherit the broad default. A top-level permissions: contents: read least-privilege default is now declared; the single update-branches job keeps its explicit contents: write / pull-requests: write grant, so behavior is unchanged. Matches the top-level read-only default already used by lint.yml and release.yml. (PR #979)
- Added a top-level read-only permissions: default to the auto-resolve-conflicts.yml workflow — the sibling fix to #979. This workflow (triggered by auto-update-branches.yml) likewise set permissions: only at the job level, leaving the top-level GITHUB_TOKEN scope at the repository default — the same OpenSSF Scorecard Token-Permissions finding ("no topLevel permission defined"). A top-level permissions: contents: read least-privilege default is now declared; the single resolve-conflicts job keeps its explicit contents: write / pull-requests: read grant, so behavior is unchanged. This was the last workflow in the repo missing a top-level permissions: block. (PR #1054)
- validate-examples CI job now validates each example in a parallel matrix instead of a sequential shell loop — The validate-spec workflow built the CLI and then looped over docs/examples/*/ with ./bin/glx validate, which ran sequentially, stopped at the first failure (default set -e), and reported every example under one opaque step. It is now three jobs: a build-cli job builds the binary once, uploads it as an artifact, and emits the example list as a dynamic matrix (discovered via find docs/examples + jq, so newly added example directories are picked up with no workflow edit — preserving the loop's auto-discovery); a validate-example matrix job downloads the binary and validates one example per leg with fail-fast: false, giving per-example pass/fail in the Actions UI and running them in parallel; and a small validate-examples aggregator job keeps that exact status-check name (required by the repo ruleset) stable regardless of how the matrix expands. The matrix value is staged through env: before use in run: per the repo's workflow-injection guidance. Closes #352.
- govulncheck is now version-pinned via a Go 1.24 tool directive — Added a dedicated ci-tools/ module (ci-tools/go.mod) whose tool directive pins govulncheck (v1.3.0), making it the single source of truth for CI and local runs. The security.yml govulncheck job drops the third-party golang/govulncheck-action (which installed govulncheck@latest — unpinned) and now runs go tool -modfile=ci-tools/go.mod govulncheck -format sarif ./..., keeping the existing Code Scanning SARIF upload. A separate module is used deliberately so the tool's transitive dependencies stay out of the root go.mod/go.sum and are never inherited by consumers importing go-glx as a library; it lives in ci-tools/ rather than tools/ because tools/ already holds the first-party tools/driftcheck package (part of the main module), and a tools/go.mod would have pulled that package into the isolated tool module. A new make vulncheck target runs the same pinned tool locally; a ci-tools-tidy-check target (and matching validate-spec CI step) keeps ci-tools/go.mod tidy, and Dependabot's gomod ecosystem now watches /ci-tools so the pin stays current. gosec stays on its existing version-pinned go install ...@v2.22.4: its autofix package pulls the Google generative-ai-go/Cloud SDK/gRPC/OpenTelemetry tree, whose (unreachable, build-time-only) advisories would otherwise make dependency-review block every PR. golangci-lint (pinned via .golangci-lint-version, see #272) and goreleaser likewise remain on their dedicated actions. (#480)
- /check-schema-drift slash command emits a machine-readable findings block plus provenance, cross-reference, and false-positive fixes — Parallels the structured-output work for the sibling check-* commands.
(1) A findings-json fenced block is appended after the prose report — valid JSON with one object per drift carrying scope (entity | archive_root | vocabulary_type | vocabulary_property | meta), target, schema_path, spec_path, json_pointer, category, severity (critical | major | minor | info, rubric tracked by #838), drift_direction (spec_to_schema | schema_to_spec), and a message — so the eval harness (#796) can compute precision/recall against a deterministic fence instead of free-form markdown. (2) A Provenance header records the commit SHA, run timestamp, the schema files actually visited (to catch silently dropped files), and the make check-schemas exit status. (3) A Cross-Reference with Known Issues section layers an open-issue scan on top of the existing file-backed .claude/drift-allowlist.yaml so triaged drift is not re-surfaced. (4) The Output Format template now covers all four scopes (entity / archive root / type vocabulary / property vocabulary) with fixed headings rather than only ## [Entity Type]. (5) The "Required Fields Alignment" check now excludes the Entity ID (map key) table row — a parent-map keying constraint, not an entity property — removing a per-entity false positive. (6) A "Files to ignore" subsection excludes embed.go, READMEs, and any non-.schema.json file. (7) Field-description comparison is replaced with a two-tier structural-vs-wording rule to stop scoring paraphrase as drift. The prompt's allowed-tools frontmatter gains Bash(git rev-parse:*), Bash(date -u:*), Bash(make check-schemas:*), and Bash(gh issue list:*); .claude/settings.json gains the matching Bash(date -u:*) entry (the others were already allowed). Closes #837.
- gosec security job now uploads SARIF to Code Scanning — The gosec job in the Security workflow ran gosec -quiet ./... and emitted findings only to the Actions log, so warnings never reached the GitHub Security tab or showed as PR annotations.
It now runs gosec -fmt sarif -out gosec.sarif ./... and uploads the result via github/codeql-action/upload-sarif (SHA-pinned to v4.36.1 per the repo's action-pinning policy), gaining a security-events: write permission and the same if: always() && (…head.repo.full_name == github.repository) guard as the govulncheck job in the same workflow — making the two security tools consistent and surfacing gosec findings inline on changed files and in the Security tab. The pre-existing floating upload-sarif@v4 reference in the govulncheck job was SHA-pinned to the same commit in the same pass so both references stay consistent. Closes #343.
- The five check-* drift commands migrated from slash commands to skills — .claude/commands/check-{code,schema,docs}-drift.md, check-examples.md, and check-spec.md become .claude/skills/check-*/SKILL.md (#833), sharing one output contract (.claude/skills/check-suite/findings.schema.json — a single object-wrapper findings-json shape every skill emits) and one severity rubric (.claude/skills/check-suite/severity-rubric.md, the critical | major | minor | info scale). The agreed per-skill improvements are folded in during the move: check-docs-drift drives a dynamic docs/** glob with ADRs excluded and cross-references Known Issues (#829, #800, #831; supersedes #298); check-schema-drift adopts the shared rubric and enumerates schemas dynamically (#838, #986); check-code-drift folds in the agreed #676 prompt improvements and #771's accuracy fixes; check-examples keeps and expands its Step 1 and points Step 4 at the CI cross-reference, treating westeros as an external pointer (#834, #835, #303, #836); check-spec makes the RFC 2119 check tooling-only, adds cross-cutting checks, and delegates snippet validation to glx validate with mktemp-safe cleanup (#314, #315, #847, #849). All five carry model: claude-opus-4-8 frontmatter — the originating slash commands pinned the older claude-opus-4-7, and the value is carried over and refreshed to the current recommended Opus (it had been dropped from four of the five during the move). This records intent, not an enforced runtime pin: unlike a slash command, a skill runs inline in the active conversation, so the model: key does not switch models mid-session — it documents the top-tier model these accuracy-sensitive drift checks are meant to run under, and is forward-compatible if the skills later adopt context: fork. To actually run them on Opus today, invoke them from an Opus session (see .claude/skills/check-suite/README.md).
The eval harness (#796) is deliberately not part of this migration: its CI F1-gate conflicts with the "no LLM runs in CI" decision and it needs separate scoping, but its prerequisites (uniform findings-json output and the shared rubric) ship here, so it can be picked up standalone. Invocation moves from /check-* to the skills surface; this is documented in .claude/skills/check-suite/README.md. (#833, #1016)
Fixed
website dependencies bumped to clear the npm audit CI gate — vite 7.3.2 → 7.3.5 (lockfile only) patches a high-severity advisory (GHSA-fx2h-pf6j-xcff) and a transitive launch-editor advisory (GHSA-v6wh-96g9-6wx3) that were failing the Security workflow's npm audit --audit-level=moderate job on every PR. (PR #1102)
- Both
PreToolUsehooks now actually fire (dead-matcher fix) — Both hooks in.claude/settings.jsonused amatchercontaining special characters ("Bash(git commit*)"from #655 and"Bash(gh api*)"from #912). Claude Code compiles any such matcher to a JavaScript regex and tests it against the tool name ("Bash"), not the command, so neither ever matched and both hooks were dead code. They are now consolidated under a single"matcher": "Bash"group (tool name, exact) with the command pattern moved to each handler'siffield —if: "Bash(git commit:*)"andif: "Bash(gh api:*)"— which uses permission-rule syntax against the Bash subcommand, per the documented 2026 hooks schema. (#869, #1014)- The golangci-lint handler's inlined shell one-liner was extracted to
scripts/claude-hooks/pre-commit-golangci.sh (set -euo pipefail, standalone-runnable, shellcheck-clean) so the logic is reviewable and testable, and it resolves the project root via the documented ${CLAUDE_PROJECT_DIR} rather than the undocumented ${CLAUDE_WORKING_DIRECTORY}. It stays advisory (golangci-lint's exit 1 is non-blocking for PreToolUse); promoting it to a blocking gate is tracked separately in #870. Fixes #869.
- The
gh apimutation gate (scripts/claude-hooks/gh-api-gate.{py,sh}) had been dead since it was introduced (same defect, never shipped in a release), which would have re-enabled auto-approval of destructivegh apicalls (GraphQL mutations, REST writes, ref deletion) granted by theBash(gh api graphql:*)/Bash(gh api repos/genealogix/glx:*)allow rules — the exact bypass #911/#912 set out to close. With the matcher fixed the gate fires: provably read-only calls stay prompt-free, writes prompt (ask), and writes to the git-refs endpoint are hard-blocked (deny). (Note: a read whose output is captured via command substitution —ISSUE_ID=$(gh api graphql …)— now prompts, because the static parser floors any shell-dynamic command toask; run the bare query if you want it prompt-free.) Its invocation was harmonized to the documented${CLAUDE_PROJECT_DIR}form, dropping the undocumented${CLAUDE_WORKING_DIRECTORY}fallback; the fail-closed|| exit 2is preserved. A duplicateBash(gofmt:*)allow entry was removed in the same pass. Fixes #1014.
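
The dead-matcher defect described above can be reproduced in a few lines. This is a minimal sketch, not Claude Code's actual implementation: Python's `re` module stands in for the JavaScript regex engine (the two agree on these simple patterns), the helper name `matcher_fires` is hypothetical, and whether the real engine searches or fully matches is an assumption. Either way, the old matchers cannot match the bare tool name.

```python
import re


def matcher_fires(matcher: str, tool_name: str) -> bool:
    """Hypothetical helper: treat the matcher as a regex and test it
    against the tool name only (never against the command line)."""
    return re.search(matcher, tool_name) is not None


# Old matchers: the parentheses form a regex group, so the pattern needs
# the literal text "Bashgit commi..." and can never match plain "Bash".
assert not matcher_fires("Bash(git commit*)", "Bash")
assert not matcher_fires("Bash(gh api*)", "Bash")

# New matcher: the plain tool name matches, so the hook group fires and
# each handler's `if` rule can then filter on the command itself.
assert matcher_fires("Bash", "Bash")
```

The sketch only shows why the old configuration was dead code; the command-level filtering done by the `if` field is not modeled here.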
0.0.0-beta.11 - 2026-05-27
Added
CLI
Added
glx evidence <person> <property>command — Lays out every assertion for one person+property side-by-side, grouped by asserted value, for weighing conflicting evidence during active (brickwall) research. Each value lists its supporting reports (citation ID and resolved source title) with confidence, the report count, and the best confidence in the group; values are ranked by report count then confidence, and a closing line names the best-supported value — or flags an inconclusive tie when the leading values match on both. Place, person, and event reference values resolve to the referenced entity's name; everything else is shown verbatim. The person argument accepts an exact ID or a name search; the property is matched exactly with a case-insensitive fallback. Supports--archiveand--format text|json. Complementsglx analyze(which only emits one-line conflict warnings) and the proposedglx proof(#118, summaries of resolved questions). Closes #144glx analyzesibling-birthplace-outlier consistency check —analyze --check consistencynow flags children whose birthPlaceIDdiffers from a strict majority (>50%) of their siblings'. Runs per parent over any sibling group of 3+ children with recorded birthplaces. Severity defaults tomediumand is elevated tohighwhen the outlier's parent-child relationship is hypothetical (any assertion about that relationship with confidencetentative,low, ordisputed) or when the majority is reinforced by matching a parent's own birthplace while the outlier matches neither. Catches data errors, temporary relocations, and incorrect parent-child assignments that previously passed silently. Distinct from #154 (cross-source disagreement about a single person). 
(#158)glx export --format jsonld— Exports a GLX archive as a single self-contained JSON-LD document using a Schema.org-aligned@context(Person→schema:Person,Event→schema:Event,Place→schema:Place,Repository→schema:ArchiveOrganization,Source→schema:CreativeWork,Media→schema:MediaObject;Citation,Relationship,Assertion, andParticipationuse theglx:namespace). The canonical context lives atspecification/jsonld/glx-context.jsonldand is embedded inline in every export so output is self-contained — no network access required to resolve it. Closes #291glx export --privatize-living— Opt-in flag that redacts living persons before writing the exported file. Works for every output format (GEDCOM 5.5.1 / 7.0 and JSON-LD). A person is treated as living when theirliving: trueproperty is set, or — under the fallback heuristic — when no death, burial or cremation event is recorded for them and their most recent known birth year is less than 100 years before the current wall clock (when a person has multiple, conflicting birth events the most recent parseable year is used, so the filter errs toward redaction). Redaction replaces the person's name withLiving, strips all other properties (occupation, residence, religion, sex, etc.), and drops their notes. On every event whose subject is a living person, the date / place / notes / properties are blanked. On every relationship that names any living participant, the relationship's notes / properties are cleared and the referenced start / end events are fully redacted — this covers marriages between living spouses, whose participants carry the non-subjectspouserole and would otherwise leak DATE / PLAC / NOTE through theFAMrecord. On every other event that names a living person as a non-subject participant, per-participant Properties and Notes are scrubbed so fields likename_as_recordeddo not leak. Assertions whose subject or participant is a living person are dropped. 
Free-text fields on Sources / Citations / Repositories / Media are out of scope and must be redacted by hand. Redaction operates on the loaded archive before either exporter runs, so the same guarantees apply to GEDCOM and JSON-LD output; event types and family structure are preserved (GEDCOM FAM, FAMS, FAMC reconstruct; JSON-LD Relationship and Participation nodes still link) so the export stays valid. Closes #288.GEDZIP (
.gdz) import —glx importnow auto-detects.gdzarchives by extension and ingests them end-to-end: GEDCOM 7.0 parsing plus extraction of bundled media files into the archive'smedia/files/directory. Reuses the existing GEDCOM importer and media-copy pipeline. ZIP entry paths are validated to prevent zip-slip; backslashes, NUL bytes, absolute paths, Windows volume prefixes, symlink-mode entries, and entries whose cleaned case-folded destination paths collide are all rejected. Per-archive entry count is capped at 100,000, and each archive entry's decompressed size is capped at 512 MiB to bound decompression-bomb exposure during extraction. As defense-in-depth, still avoid runningglx importon untrusted archives without an external resource limit. (#41, #775)Added
glx addcommand family — Create GLX entities (person, place, event, repository, source, citation, relationship, assertion) from CLI flags instead of hand-writing YAML. Each subcommand derives an entity ID from the descriptive flags (overridable with--id), validates supplied vocabulary keys against the archive's vocabularies and entity references against existing IDs, and refuses to overwrite an existing entity unless--forceis supplied with--id. Derived IDs auto-suffix-2,-3, … on collision. The created entity ID is echoed on a dedicated machine-output stream (IOStreams.MachineOut) that is NOT silenced by--quiet, soid=$(glx --quiet add person --given Anna --archive .)shell substitution works for scripted workflows. Supports--dry-runfor preview and--skip-validateto skip the whole-archive validation pass after writing (vocab and reference checks still run). Closes #712.Added
--quiet/-qglobal flag — Suppresses non-error stdout output (success messages, warnings) on commands that have been migrated to the IOStreams abstraction (#679). Currently honored byvalidate,link, and the media-copy step ofimport. Errors continue to go to stderr.--quietis a no-op on commands not yet migrated; remaining runners will adopt it as they're individually migrated. (#681)Added
glx linkcommand — Create a GLX citation, and (when needed) a FamilySearch repository and source, from a FamilySearch ARK URL. Offline URL-parse MVP of #87: no network I/O, no authentication required. The citation captures the canonical URL, today's date as the accessed date, and the ARK identifier as a structuredexternal_idsentry whosefields.typematches the URI form produced by GEDCOM 7 EXID import (https://www.familysearch.org/ark:/61903/), keeping FS-imported GEDCOM files andglx link-generated citations format-compatible. Accepts full URLs (https://www.familysearch.org/ark:/61903/...), URLs withoutwww, and bare ARK identifiers. Requires exactly one of--source(attach to existing) or--create-source <title>(mint a new source). Supports--text,--locator, and--dry-run. Idempotent — re-running with the same ARK is a no-op once the deterministic citation IDcitation-familysearch-<slug>exists. Follow-ups for the remaining #87 scope (unauthenticated HTTP fetch, OAuth PKCE, GEDCOM X JSON extraction, person/event/relationship import) are tracked separately (#87)Hidden
glx docssubcommand — Generates per-command Markdown reference for every (non-hidden) Cobra command into a configurable output directory (default./docs/cli/). Used bymake docs-cliand thedocs-driftCI workflow. Addsgithub.com/spf13/cobra/docto the dependency graph (transitive via existing cobra dependency). (#299)glx migrate --rename-gender-to-sex— New opt-in flag onglx migratethat renames the legacygenderperson property (and related assertions and inlined vocabulary entries) tosex, completing the two-field-model split in pre-v1.0 archives (#528).glx migrate --confidence-disputed-to-status— New opt-in flag onglx migratethat moves the legacyconfidence: disputedsignal tostatus: disputedper the confidence-vs-status separation (#516). For each assertion withconfidence: disputed: clears theconfidencefield, and either setsstatus: disputed(when status is empty), leaves the existing status alone (when set todisputedalready, or — with a warning — to any other value such asproven/speculative), so the user's research state is never overwritten. Idempotent. (#516)glx migrate --source-description-to-property— New opt-in flag onglx migratethat moves a Source's legacy top-leveldescriptionfield intoproperties.description, completing the structural-field-to-vocabulary-property consolidation for pre-v1.0 archives (#667). An explicitproperties.descriptionis never overwritten; the flag is idempotent.glx migrate --media-description-to-property— New opt-in flag onglx migratethat moves a Media's legacy top-leveldescriptionfield intoproperties.description, mirroring the Source treatment from #667 for the remaining structural-vs-property inconsistency on Media (#894). An explicitproperties.descriptionis never overwritten; the flag is idempotent. 
(PR #933)glx merge --previewwith cross-archive duplicate detection — Preview mode now detects potential duplicate persons across source and destination archives using 7-signal similarity scoring (name, birth/death year and place, shared relationships and events). Configurable via--threshold(default 0.6). Replaces the previous--dry-runflag, which has been removed. (#702, part of #94)Added
IOStreamstype for testable CLI output — Newglx/iostreams.gointroducesIOStreams{Out, ErrOut io.Writer}following kubectl's minimal pattern.validatePaths()migrated to accept*IOStreamsinstead of writing directly toos.Stdout/os.Stderr; all 18 validation tests useTestIOStreams()with buffer capture instead ofos.Pipe()hacks. Foundation for incremental migration of remaining runners (#682, Fixes #678)glx searchandglx queryrecognize ResearchLog and Study entities — Both commands' entity-type lists hardcoded the pre-beta.11 9 types and could not return matches from the two new entity types added in this release.glx searchgainssearchResearchLogs(covers every Search entry inside a log viasearches[i].fieldpaths) andsearchStudies;glx querygainsqueryResearchLogs(status-aware one-line listing) andqueryStudies(type/status-aware listing). The diff summarizer (go-glx/diff.go) likewise gainedsummarizeResearchLogandsummarizeStudyso per-entity diff output is no longer a bare ID. Closes the remaining open items in #881. (PR #933)Added
glx merge-persons <keep-id> <drop-id>command — Consolidates two person entities afterglx duplicatesflags them as the same individual. The keep-id is retained; the drop-id's properties, notes, and cross-references (event/relationship/assertion participants and assertion subjects) are folded into it in a single atomic operation, then the drop-id person file is removed. Replaces the previous 5–10 hand edits per merge. Multi-value list properties (includingexternal_ids) are unioned with deep-equal deduplication; single-value conflicts default to keep's value and are reported, with--keep-newest/--keep-oldestavailable to resolve dated conflicts by date. Notes combine per--notes-strategy(append|prefer-keep|prefer-drop, defaultappend). Supports--archiveand--dry-run. Closes #718
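
The property-folding rule for `glx merge-persons` (union multi-value lists with deep-equal deduplication, keep the keep-id's value on single-value conflicts and report them) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the Go implementation: the function name `merge_properties`, the dict-based property shape, and the sample property values are all hypothetical, and the `--keep-newest` / `--keep-oldest` and notes strategies are not modeled.

```python
from typing import Any


def merge_properties(keep: dict[str, Any], drop: dict[str, Any]):
    """Fold drop's properties into keep's (hypothetical helper).

    Returns (merged, conflicts). List values are unioned with deep-equal
    de-duplication; conflicting single values keep the `keep` side and
    are reported in `conflicts`.
    """
    merged = dict(keep)
    conflicts = []
    for key, drop_value in drop.items():
        if key not in merged:
            merged[key] = drop_value
            continue
        keep_value = merged[key]
        if isinstance(keep_value, list) and isinstance(drop_value, list):
            union = list(keep_value)
            for item in drop_value:
                if item not in union:  # == on dicts and lists compares deeply
                    union.append(item)
            merged[key] = union
        elif keep_value != drop_value:
            conflicts.append((key, keep_value, drop_value))
    return merged, conflicts


# Made-up sample data, for illustration only.
keep = {"occupation": "farmer", "external_ids": [{"type": "a", "value": "1"}]}
drop = {
    "occupation": "miller",
    "external_ids": [{"type": "a", "value": "1"}, {"type": "b", "value": "2"}],
    "religion": "lutheran",
}
merged, conflicts = merge_properties(keep, drop)
print(merged)
print(conflicts)
```

Reporting conflicts instead of silently choosing a side mirrors the changelog's description that single-value conflicts "default to keep's value and are reported".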
Documentation
README.md CLI Commands section —
README.md now includes a "CLI Commands" section after the Quick Start that lists every glx subcommand grouped into Archive Management, Import & Export, Exploration, Data Entry, Analysis, and Shell completion, each with a one-line description and a link to the full CLI reference. Closes #466
Developer Certificate of Origin (DCO) policy —
CONTRIBUTING.md now documents the DCO 1.1 and requires contributors to sign off commits with git commit -s. Closes #409
SECURITY-POSTURE.md self-attestation — New repo-root document attesting GLX's compliance with OpenSSF OSPS Baseline v2026.02.19 at Level 1, with most Level 2 controls also met. Tracks outstanding gaps (#269 SBOM, #424 safe-harbor/embargo, #256 SLSA provenance) and notes EU Cyber Resilience Act applicability for adopters. Surfaced on the website at /development/security-posture and linked from README.md and SECURITY.md. Closes #425
Architecture Decision Records (ADRs) — Added
docs/decisions/ directory with an ADR template, an index, and six foundational ADRs covering YAML as the archive file format, the evidence-first data model (Repository → Source → Citation → Assertion), archive-owned vocabularies, Git-native archives, flexible entity IDs, and the go-glx library's no-I/O rule. CONTRIBUTING.md now describes the ADR practice and when to write one. Closes #416
CI
docs-drift workflow — New .github/workflows/docs-drift.yml builds the CLI, runs make docs-cli, and fails if git diff --exit-code -- docs/cli/ reports any difference. Catches PRs that change a Cobra command without regenerating its Markdown page. Triggered on changes to **.go, Makefile, or docs/cli/**. (#299)
Scheduled
lychee external-link check — New lychee.yml workflow runs weekly (Mondays 08:17 UTC) and on workflow_dispatch, validating every external URL referenced in specification/**/*.md, docs/**/*.md, and root-level *.md. Broken URLs are reported by creating or updating a single GitHub issue titled "Broken external links detected"; the workflow never blocks PRs. Internal relative links continue to be validated on every PR by scripts/check-links.sh. (#316)
Cosign keyless signing of release checksums —
.goreleaser.yml now signs checksums.txt with cosign keyless (Sigstore / OIDC) at release time, producing checksums.txt.sigstore.json alongside the manifest. Users verify with cosign verify-blob, passing --certificate-identity-regexp '^https://github\.com/genealogix/glx/\.github/workflows/release\.yml@refs/tags/' and --certificate-oidc-issuer 'https://token.actions.githubusercontent.com' so the signature binds to this repo's release workflow — without those flags, verify-blob validates the Fulcio chain, signature, and Rekor inclusion proof but does not constrain which OIDC identity (workflow) or which OIDC issuer the cert attests to, so any valid keyless cosign signature from any GitHub Actions workflow would pass. See SECURITY-POSTURE.md for the full command. Signing the checksum file transitively covers every release artifact via its SHA-256. Complements the planned SLSA build provenance (#256): signing proves authenticity, provenance proves how the binary was built. (#387)
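
To see what the `--certificate-identity-regexp` value quoted above constrains, the pattern can be tried against a few candidate certificate identities. This is a hedged sketch: Python's `re` stands in for the Go regexp engine cosign uses (the pattern behaves the same in both), the helper `identity_accepted` is hypothetical, and the sample identities, including the tag name and the `example/other` repository, are made up for illustration.

```python
import re

# Pattern copied from the changelog entry above.
IDENTITY_RE = re.compile(
    r"^https://github\.com/genealogix/glx/\.github/workflows/release\.yml@refs/tags/"
)


def identity_accepted(identity: str) -> bool:
    """Hypothetical helper: does the certificate identity satisfy the pattern?"""
    return IDENTITY_RE.search(identity) is not None


# Illustrative identities (assumed shapes, not taken from a real certificate).
release_tag = (
    "https://github.com/genealogix/glx/.github/workflows/release.yml"
    "@refs/tags/v0.0.0-beta.11"
)
other_repo = (
    "https://github.com/example/other/.github/workflows/release.yml"
    "@refs/tags/v1.0.0"
)
branch_run = (
    "https://github.com/genealogix/glx/.github/workflows/release.yml"
    "@refs/heads/main"
)

assert identity_accepted(release_tag)
assert not identity_accepted(other_repo)   # different repository
assert not identity_accepted(branch_run)   # not a tag-triggered release
```

The sketch covers only the identity constraint; the issuer check and the signature, Fulcio, and Rekor verification are performed by `cosign verify-blob` itself.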
Project Infrastructure
- Added
public/.well-known/security.txt — RFC 9116 machine-readable security-contact file served at /.well-known/security.txt. Points security scanners and researchers to the GitHub Security Advisories report channel and SECURITY.md policy. (#271)
- Added
.github/SUPPORT.md — Surfaces GitHub's "Support resources" link on the new-issue flow, directing support questions to Discussions, Discord, and the mailing list instead of the issue tracker. (#423)
- PR template changelog reminder —
.github/PULL_REQUEST_TEMPLATE.md now ends with an HTML-comment reminder to update CHANGELOG.md for user-facing changes (Added/Changed/Fixed/Removed). (#363)
- PR template Review focus section —
.github/PULL_REQUEST_TEMPLATE.md now includes a "Review focus" section after "Related issues", prompting PR authors to state what reviewers should pay attention to (e.g. "API design", "correctness", or "trivial change"). (#362)
- Added
public/robots.txt and VitePress sitemap — VitePress now emits /sitemap.xml (hostname https://genealogix.dev), and public/robots.txt allows all crawlers and points to it. srcExclude was also added to keep vendored node_modules/**/*.md out of the build and sitemap. (#282)
Specification
ResearchLogentity type — New entity for tracking research investigations, including searches that found nothing (negative evidence). Each log carries an optionalsubject(Person/Event/Relationship/Place EntityRef),objective,status, and an embedded list ofSearchentries (one per query) that reference therepository/sourcesearched, theresult(found/not_found/inconclusive/partial/not_searched), and thecitationproduced when something was located. Two new standard vocabularies —search_result_typesandresearch_log_status_types— back the validated string fields.ResearchLog.DateandSearch.Dateparticipate in the same date-format validation as Event/Source/Media/Assertion dates. The shape supports the Genealogical Proof Standard requirement for a "reasonably exhaustive search" by promoting "searched X, target absent" from freeform notes to a queryable structure. No GEDCOM mapping (research process is researcher-state, not source data). CLI commands (glx log add/list/report) are tracked separately. Closes #224, supersedes #89.livingperson property — New standard person property (value_type: boolean, no GEDCOM mapping) used as an explicit privacy opt-in marker. When set totrue, every export privacy filter (currentlyglx export --privatize-living) must treat the person as living; whenfalse, the filter must treat the person as not living and skip the date-based heuristic. Closes #288.possibly_same_personrelationship type — New standard relationship type for linking two person records that may refer to the same individual but cannot yet be confirmed; pair with an Assertion subject-referencing the relationship to recordconfidenceand supporting citations. No direct GEDCOM mapping. Closes #227.Studyentity type for research-project scope — New first-class entity that formally declares the scope of a research project within an archive: One Place Studies, One Name Studies, family reconstructions, descendancy/ancestry studies, and brick-wall investigations. 
Fields:title,type(validated against newstudy_typesvocabulary),status(validated against newstudy_statusesvocabulary),date_range(GLX date string, e.g.,FROM 1610 TO 1875),places(Place refs),sources(Source refs),properties(vocabulary-extensible), andnotes. Replaces the prior workaround of recording scope informally in Placenotes, making study scope machine-readable so tooling can report coverage and progress. GLX-native; no GEDCOM equivalent. Closes #226.associateandboarderrelationship types;associate,household_head,boarderparticipant roles — Standard FAN (Friends, Associates, Neighbors) research types:associatefor durable social connections (friends, colleagues, repeat witnesses, migration companions) andboarderfor non-family household members in an asymmetric head-of-household + dependent relationship (boarder, lodger, ward, servant). Distinct fromhousemate(symmetric cohabitation) andapprenticeship(trade-training). Three new relationship-only participant roles complete the model. Existingneighborandemploymenttypes already covered two of the four scenarios from #171; this fills the remaining gaps. Enables FAN-traversal use cases includingglx cluster(#111),glx migrations(#114), andanalyzeFAN detection (#159). Closes #171.external_idsproperty added toplace_properties— Standard property for cross-system place identifiers (GeoNames, Wikidata, OpenStreetMap, etc.), mirroring the existingexternal_idspattern onperson,source,citation, andrepositoryproperties. Multi-value with atypefield for the issuing authority. Maps to GEDCOM 7.0PLAC.EXID. Closes #536name_as_recordedproperty added toevent_propertiesandrelationship_properties— Standard structured property for the participant's name exactly as written in the source backing the event or relationship. 
Captures source-specific renderings (Latin-genitive forms in 18th-century parish registers, abbreviations, transcription quirks) onevent.participants[].propertiesandrelationship.participants[].propertiesrather than as a temporal variant on the Person entity. Mirrors the field shape ofperson_properties.name(prefix,given,surname_prefix,surname,suffix). Closes #714repository-types.glxGEDCOM mappings added — Addedgedcom:fields to nine of the ten standard repository types (archive,library,church,database→online,museum,registry,historical_society→society,university,government_agency→government);otheris intentionally left unmapped as the fallback target for unrecognized values. The hardcodedgedcomRepositoryTypeMappingmap ingo-glx/constants.gohas been removed and the mapping is now read from the vocabulary at import time, bringing repository-types in line with how every other vocabulary handles GEDCOM tags. Closes #555.- Enslavement relationship metadata model enriched — The
enslavementrelationship type's description inrelationship-types.glxnow documents the per-enslaver temporal-bounds pattern and inherited-enslavement modeling (children of enslaved mothers). Two new participant roles (enslaver,enslaved_person) are added toparticipant-roles.glxand exposed as Go constants ingo-glx/constants.go. A newlegal_statusrelationship property (vocabulary_type: legal_statuses) is backed by a newlegal_statusesstandard vocabulary (chattel,indentured,debt_bondage,apprenticeship) — filespecification/5-standard-vocabularies/legal-statuses.glxplus matching JSON schema, embedded ingo-glxalongside the other standard vocabularies and validated through the existingvocabulary_typevalidation path.specification/4-entity-types/relationship.mdgains an "Enslavement Relationship" usage section covering the simple, multiple-enslavers-over-a-lifetime, and inherited-enslavement patterns, with further-reading links to Enslaved.org and Beyond Kin. Closes #547. unresearchedassertion status value — Addedunresearchedto the documented common values for the assertionstatusfield, distinguishing claims that have not yet been investigated fromspeculativeclaims that have weak evidence. Status remains free-text by design; this only extends the documented examples and adds a worked usage in theassertion-workflowexample archive. Closes #649.confidence-levels.glxrankfield — Optional integer field on confidence-level vocabulary entries that establishes upgrade/downgrade ordering. The four standard levels now carryrank: 0-3(low=0, disputed=1, medium=2, high=3), matching the previously-hardcodedconfidenceRankmap ingo-glx/diff.go. Archives that extend the vocabulary with custom levels (e.g.very_high,tentative) can now participate inglx diffconfidence-upgrade detection by supplying arankvalue; previously these custom levels were silently ignored by the upgrade/downgrade counters. The hardcoded ordering remains as a fallback for archives loaded without their vocabulary (e.g. 
fixtures that construct *GLXFile directly), so behavior is unchanged for the standard levels. Closes #517.
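
The rank-based upgrade/downgrade detection described for confidence-levels.glx can be sketched as below. This is an illustrative Python sketch, not the `go-glx/diff.go` code: the function names, the dict shape used for vocabulary entries, and the exact fallback behavior (use the hardcoded table only when no vocabulary is loaded) are assumptions drawn from the entry's wording. The fallback ranks are the four values listed in that entry.

```python
# Ranks listed in the changelog entry for the standard levels.
FALLBACK_RANKS = {"low": 0, "disputed": 1, "medium": 2, "high": 3}


def rank_table(vocabulary=None):
    """Hypothetical helper: ranks from the vocabulary, else the fallback."""
    if not vocabulary:
        return dict(FALLBACK_RANKS)
    return {
        name: entry["rank"]
        for name, entry in vocabulary.items()
        if "rank" in entry
    }


def classify_transition(old, new, vocabulary=None):
    """Return 'upgrade', 'downgrade', 'unchanged', or 'unknown'."""
    ranks = rank_table(vocabulary)
    if old not in ranks or new not in ranks:
        return "unknown"  # levels without a rank cannot be ordered
    if ranks[new] > ranks[old]:
        return "upgrade"
    if ranks[new] < ranks[old]:
        return "downgrade"
    return "unchanged"


# A made-up extended vocabulary: "very_high" supplies a rank, "legacy" does not.
custom = {
    "low": {"rank": 0},
    "medium": {"rank": 2},
    "high": {"rank": 3},
    "very_high": {"rank": 4},
    "legacy": {},
}
assert classify_transition("high", "very_high", custom) == "upgrade"
assert classify_transition("high", "legacy", custom) == "unknown"
assert classify_transition("medium", "low") == "downgrade"
```

A custom level takes part in the comparison only once it carries a `rank`, which is the behavior change the entry describes.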
Tooling
make fix and make fix-diff targets for Go 1.26 modernizers — make fix runs golangci-lint run --fix across the repository to apply modernizer auto-fixes; make fix-diff previews the changes without writing. Lets contributors adopt new modernizer rules in one batch instead of hand-editing each call site. (#789, Closes #442)
Pre-commit hooks via
lefthook — lefthook.yml defines pre-commit jobs that, when Go files are staged, run golangci-lint (flagging only issues introduced since HEAD) and, when JS/Vue files under website/.vitepress/ are staged, run eslint on those staged files. Install with make install-hooks. Catches lint issues locally before they reach CI; skip the hooks for a single commit with LEFTHOOK=0 git commit .... (#280)
Tests
- Round-trip validation tests for example archives —
go-glx/example_archives_roundtrip_test.go walks every archive under docs/examples/ (single-file or multi-file), runs it through deserialize → re-serialize, validates each entity in the re-emitted output against its per-entity JSON schema (person.schema.json, event.schema.json, etc.), and asserts that the parsed-input YAML map equals the parsed-output map. The map-level comparison catches omitempty drops that struct equality cannot detect. (#296)
- Regression tests pinning
value_type enforcement on temporal properties — go-glx/validation_temporal_test.go now covers all three temporal-value shapes (simple value, single {value, date} object, list of objects) for properties declaring value_type: integer, asserting a warning is emitted when the value's runtime type doesn't match. Locks in behavior already implemented in validateTemporalItem. Closes #668.
- Regression tests for
glx search repository address fields — glx/search_runner_test.go now populates the shared search fixture with a Repository entity and asserts that searchArchive matches against state, postal_code, and country. Locks in behavior already implemented in searchRepositories (glx/search_runner.go). Closes #619.
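
The three temporal-value shapes named in the value_type regression-test entry above (simple value, single {value, date} object, list of objects) can be illustrated with a small type check. This is a Python sketch under stated assumptions, not the Go `validateTemporalItem` logic: the function name `integer_type_warnings`, the warning text, and the sample property name are hypothetical.

```python
def integer_type_warnings(prop: str, raw) -> list[str]:
    """Hypothetical check: warn when a value declared as integer is not one.

    Accepts a simple value, a single {"value": ..., "date": ...} object,
    or a list mixing either form.
    """
    items = raw if isinstance(raw, list) else [raw]
    warnings = []
    for item in items:
        value = item.get("value") if isinstance(item, dict) else item
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            warnings.append(
                f"{prop}: expected integer, got {type(value).__name__} ({value!r})"
            )
    return warnings


# Shape 1: simple value.
assert integer_type_warnings("household_size", 5) == []
# Shape 2: single object with a date.
assert integer_type_warnings("household_size", {"value": 5, "date": "1850"}) == []
# Shape 3: list of objects; only the string value triggers a warning.
mixed = [{"value": 5, "date": "1850"}, {"value": "six", "date": "1860"}]
assert len(integer_type_warnings("household_size", mixed)) == 1
```

Normalizing every shape to a list first keeps the type check in one place, which is one plausible way to make all three shapes behave consistently.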
Changed
devcontainer: move provisioning into the prebuild-cached lifecycle phases —
go mod downloadand bothnpm ciruns (specification/,website/) moved frompostCreateCommandtoupdateContentCommand, and the golangci-lint install moved toonCreateCommand;postCreateCommandremoved (no user-scoped steps).postCreateCommandis not executed during a Codespaces prebuild, so the previous placement re-ran every dependency download on each codespace creation. The new mapping bakes the tool install into the prebuild and refreshes the content-derived deps whengo.sum/package-lock.jsonchange. With prebuilds disabled the same steps still run during container creation, just in earlier lifecycle phases, so the provisioned environment is functionally unchanged. (#883)glx analyzeconsolidates parent-child census suggestions — When both a parent and a minor child (under 18 at the census year) are missing the same US federal census, the parent's suggestion now lists the children covered (— would also cover: <Child Name> (~<age>), ...) and the children's independent suggestions for that year are suppressed. A single search of the parent's record covers everyone in the household, so consolidating the suggestion makes the research direction clearer. Children remain independently suggested when the parent already has a census event for that year. Closes #161.basic-familyexample renamed to use descriptive person IDs — Replaced role-based IDs (person-mother,person-father,person-bob) with name-based IDs (person-mary-thompson,person-robert-thompson,person-alice-thompson,person-robert-thompson-jr) to match the project's example-best-practice guidance and the convention already used incomplete-family/. Also corrected the son's name to "Robert Thompson Jr." with a structuredsuffix: "Jr."field, matching the README's "Robert Jr." description. Person filenames were renamed in lockstep with their IDs, andrel-parent-bob.glx/rel-parent-bobwere renamed torel-parent-robert-jr.glx/rel-parent-robert-jrto drop the lingering nickname. 
Closes #577.CLI reference is now auto-generated — Replaced the ~1,200-line manually-maintained "Commands" section in
glx/README.mdwith a short pointer paragraph linking to the auto-generated pages underdocs/cli/. The VitePress/cliroute now serves a hand-writtendocs/cli/index.mdoverview instead of the README; per-command pages live at/cli/glx_init,/cli/glx_validate, etc. (file-per-command, replacing the previous single-page anchor links like/cli#glx-init). The website sidebar inwebsite/.vitepress/config.jswas rewritten accordingly and now also covers the previously-unlistedduplicates,coverage,diff,rename, andcompletioncommands. (#299)BREAKING: Removed
disputedfrom the standardconfidence_levelsvocabulary —disputedwas conflating two distinct axes: confidence reflects evidence quality (high / medium / low), while disputed describes a conclusion state (sources conflict, resolution unclear). Disputed is now documented on the assertionstatusfield alongsideproven,speculative, anddisproven. The standard vocabulary is now 3 entries (high / medium / low); archives may still extend the vocabulary with custom levels, includingdisputedif they prefer that wire-format. Wire-format breaking: archives that retainconfidence: disputedagainst the standard vocabulary will produce an out-of-vocabulary warning. The numeric ranking ingo-glx'sconfidenceRank(used to detect upgrades/downgrades inglx diff) and the per-level display order inglx reportandglx statsarelow=0, medium=2, high=3(rank 1 left open for custom levels such astentative); pre-existingdisputedvalues fall through to the "unknown level" branches and no longer count as confidence transitions. Archives predating this change can be migrated viaglx migrate --confidence-disputed-to-status. Closes #516.BREAKING: Split
genderproperty intosex(recorded) andgender(identity) — The existinggenderproperty conflated "recorded sex" (GEDCOMSEX) with "self-identified gender identity". Introduced a newsex_typesvocabulary (male,female,unknown,not_recorded,other) bound to a newsexperson property that maps to GEDCOMSEX. Thegender_typesvocabulary now covers identity values (male,female,nonbinary,other) bound to the repurposedgenderproperty (no direct GEDCOM mapping — GEDCOM 7.0 defers identity toFACT). GEDCOM import/export, HUSB/WIFE assignment, census parsing, and CLI readers (vitals,summary) all updated to prefersexwith legacygenderfallback. Archives predating this split can be migrated viaglx migrate --rename-gender-to-sex. Wire-format breaking: archives usinggender: "unknown"produce an out-of-vocabulary warning post-split (unknownnow lives only insex_types). Resolves #528. Closes #518. Closes #534.BREAKING: Demoted Source
descriptionfrom a top-level structural field to theproperties.descriptionvocabulary property — resolving the cross-entity structural-vs-property inconsistency, since Relationship, Event, and Place already modeldescriptionas a property.descriptionis removed from the top level ofsource.schema.jsonand added to thesource-properties.glxstandard vocabulary; the GoSource.Descriptionstruct field is removed, and GEDCOMSOUR.TEXT/SOUR.NOTEimport/export now round-trips throughproperties.description. A backward-compatibility loader folds any legacy top-leveldescription:intoproperties.descriptionon read (an explicitproperties.descriptionalways wins), so existing archives are never silently truncated;glx validatestill flags the old top-level form against the schema. Wire-format breaking: archives with a top-level sourcedescription:can be migrated viaglx migrate --source-description-to-property. In JSON-LD export, Sourcedescriptionnow serializes under theglx:namespace (consistent with the other property-based entities) rather thanschema:description. Closes #667.BREAKING: Demoted Media
descriptionfrom a top-level structural field to theproperties.descriptionvocabulary property — finishing the cross-entity consolidation #667 began for Source. Media was the last entity still modellingdescriptionas a structural field; it now joins Source, Relationship, Event, and Place in modelling description as a vocabulary property.descriptionis removed from the top level ofmedia.schema.jsonand added to themedia-properties.glxstandard vocabulary; the GoMedia.Descriptionstruct field is removed. A backward-compatibility loader folds any legacy top-leveldescription:intoproperties.descriptionon read (an explicitproperties.descriptionalways wins), so existing archives are never silently truncated;glx validatestill flags the old top-level form against the schema. Wire-format breaking: archives with a top-level mediadescription:can be migrated viaglx migrate --media-description-to-property. In JSON-LD export, Mediadescriptionnow serializes under theglx:namespace (consistent with the other property-based entities) rather thanschema:description. Also corrected the Media GEDCOM mapping documentation:OBJE.NOTEhas always routed to thenotesfield on import (notdescription), and themedia.mdmapping table is now consistent with the importer. Closes #894. (PR #933)participant-roles.glxclarifiedprincipalandsubjectas synonyms — The previous wording describedsubjectas "preferred over 'principal'", which was misleading about the intended contract. The two role descriptions now state thatprincipalis the canonical term andsubjectis an accepted synonym.glx initand GEDCOM import emitprincipal; some paths (e.g., census import's empty-role default) emitsubject. Tooling treats both as equivalent and no data migration is required. Fixes #523.person-properties.glxGEDCOM mappings completed — Addedgedcom:tags toname(NAME),sex(SEX), andresidence(RESI); documentedgender,ethnicity,race, andprimary_nameas having no direct GEDCOM equivalent via inline comments. 
Closes #533.relationship-types.glxGEDCOM mapping coverage documented — Documentedstep_parent,godparent,partner,guardian,neighbor,coworker,housemate,apprenticeship,employment,enslavement, andrelativeas having no direct GEDCOM equivalent via inline comments, with rationale forstep_parent(PEDIsealedis mapped to genericparent_childperspecification/4-entity-types/relationship.mdandgo-glx/gedcom_converter.go),godparent(GEDCOM 7.0ASSO ROLE GODPis consumed as a baptism event participant role, not a standalone relationship), andpartner(MARR TYPEis collapsed into themarriagerelationship plus amarriage_typeevent property by the importer). Closes #544.SECURITY.mdSecurity Measures section reorganized into three subsections covering all five active controls — Vulnerability Scanning, Dependency Management, and Code Scanning replace the previous brief flat list (govulncheck,gosec, weekly-scan note). Newly documented:npm audit,Dependabot, anddependency-review-action. Closes #475source-properties.glxclarifiedurlvs citationurl— The previous description didn't distinguish the source-level URL (collection or database landing page) from the per-record citation URL, leaving readers looking for "where to put a permalink to a specific record" with no signal. The description now names the source-level intent and redirects record-specific URLs to the citationurlproperty. Closes #560.Devcontainer and CI use
.golangci-lint-versionas the single source of truth for the linter version — The new top-level.golangci-lint-versionfile (currentlyv2.12.2) is consumed by.github/workflows/lint.yml(viagolangci-lint-action'sversion-file:input) and by the devcontainer'sgo-toolsonCreateCommand(viatr -d '[:space:]' < .golangci-lint-version, validated against^v[0-9]+\.[0-9]+\.[0-9]+$before URL interpolation as defense-in-depth, with atrapto clean up the install-script temp file on every exit path). Stops the silent CI/devcontainer drift that previously had the workflow floatingv2.11while the container was frozen atv2.11.4. The lint workflow'spaths:filter now also includes.golangci-lint-versionand.github/workflows/lint.ymlitself, so version bumps and workflow edits get exercised by CI on the PR that introduces them. The devcontainer also now callsmake install-hooksat create time, so fresh Codespaces have lefthook pre-commit hooks installed without contributor intervention; theMakefileinstall-hookstarget now pins lefthook viaLEFTHOOK_VERSION ?= v2.1.8(overridable) instead of@latest, keeping container creations reproducible. VS Code customizations gaindbaeumer.vscode-eslint,eslint.workingDirectories: ["website"],editor.formatOnSave: true, and per-languageeditor.defaultFormattersettings so in-editor feedback matchesmake lintand CI.CONTRIBUTING.md's "all other tooling pre-configured" sentence is narrowed:goreleaseris release/maintainer-only and intentionally not bundled..gitattributespins LF line endings on the new file so a Windows checkout withcore.autocrlf=truecan't sneak a\rinto the install URL. Closes #272.Place schema enforces
`latitude` ↔ `longitude` co-dependency — A Place may carry both coordinates or neither, but setting one without the other is meaningless (a point requires two axes). The schema now expresses this with a draft-07 `dependencies` clause, so `glx validate` and `make check-schemas` reject half-set coordinate pairs. Archives with both or neither are unaffected; the `*float64` Place struct representation is unchanged. Closes #509.
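The both-or-neither rule can also be mirrored in Go against pointer-typed coordinates. This is a minimal sketch rather than the project's validator: the `place` struct and `coordsValid` function are hypothetical names, and only the `*float64` representation is taken from the entry above.

```go
package main

import "fmt"

// place is a hypothetical stand-in for the Place struct; only the
// *float64 coordinate representation comes from the changelog.
type place struct {
	Latitude  *float64
	Longitude *float64
}

// coordsValid reports whether the coordinates are both set or both unset,
// which is the co-dependency the schema clause expresses.
func coordsValid(p place) bool {
	return (p.Latitude == nil) == (p.Longitude == nil)
}

func main() {
	lat, lon := 50.48, 8.66
	fmt.Println(coordsValid(place{}))                                // neither set
	fmt.Println(coordsValid(place{Latitude: &lat, Longitude: &lon})) // both set
	fmt.Println(coordsValid(place{Latitude: &lat}))                  // half-set pair
}
```

Comparing the two nil checks for equality keeps the rule symmetric, so a missing latitude and a missing longitude are rejected by the same expression.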
go-glx
Unified 9 vocabulary structs into a single `VocabularyEntry` type — `EventType`, `ParticipantRole`, `ConfidenceLevel`, `RelationshipType`, `PlaceType`, `SourceType`, `RepositoryType`, `MediaType`, and `GenderType` have been replaced by one `VocabularyEntry` struct carrying the union of their optional fields (`GEDCOM`, `Category`, `MimeType`, `AppliesTo`). The YAML wire format (`.glx` files on disk) is unchanged; consumers of `*EventType` etc. must switch to `*VocabularyEntry`. Closes #504
Unified slug generation under an options-based `Slugify` API in `go-glx` — Added `glxlib.Slugify(s, opts...)` with `WithSlugPrefix`, `WithSlugSuffix`, `WithSlugMaxLength`, and `WithSlugFallback`, plus the ergonomic wrapper `glxlib.EntityID(prefix, text)`. Replaces the two duplicate slug helpers in main (`go-glx/census.go::slugifyString` and `glx/link_runner.go::slugifyForID`) and the two parallel truncation paths (`truncateID`/`truncateIDWithSuffix` in `go-glx/census.go`, `trimToMaxLen` in `glx/link_runner.go`). Also exports `EntityIDPrefix{Person,Relationship,Event,Place,Source,Citation,Repository,Assertion,Media}` and the canonical `MinEntityIDLength`/`MaxEntityIDLength` constants from `go-glx`; the duplicates that lived in `glx/validator.go` and the entity-prefix string literals scattered across `glx/add_runner.go`, `glx/link_runner.go`, `glx/migrate_runner.go`, `go-glx/datagen.go`, and `go-glx/gedcom_utils.go` now resolve through these. Closes #720
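An options-based slug API of this kind usually follows Go's functional-options pattern. The sketch below is illustrative only: the option names are borrowed from the entry above, but their signatures, the defaults (a 64-character cap and an `unknown` fallback, both mentioned elsewhere in this changelog), and the simplified ASCII-only slug rule are assumptions, and transliteration is omitted entirely.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// slugConfig holds the settings the options mutate (hypothetical).
type slugConfig struct {
	prefix   string
	maxLen   int
	fallback string
}

// SlugOption is a functional option; the real type may differ.
type SlugOption func(*slugConfig)

func WithSlugPrefix(p string) SlugOption   { return func(c *slugConfig) { c.prefix = p } }
func WithSlugMaxLength(n int) SlugOption   { return func(c *slugConfig) { c.maxLen = n } }
func WithSlugFallback(f string) SlugOption { return func(c *slugConfig) { c.fallback = f } }

var nonSlug = regexp.MustCompile(`[^a-z0-9]+`)

// Slugify lowercases s, collapses non-alphanumeric runs to hyphens, then
// applies the prefix and the length cap.
func Slugify(s string, opts ...SlugOption) string {
	cfg := slugConfig{maxLen: 64, fallback: "unknown"} // assumed defaults
	for _, opt := range opts {
		opt(&cfg)
	}
	body := strings.Trim(nonSlug.ReplaceAllString(strings.ToLower(s), "-"), "-")
	if body == "" {
		body = cfg.fallback // nothing survived the strip
	}
	id := body
	if cfg.prefix != "" {
		id = cfg.prefix + "-" + body
	}
	if len(id) > cfg.maxLen {
		// Trim from the tail and drop any hyphen left dangling by the cut.
		id = strings.TrimRight(id[:cfg.maxLen], "-")
	}
	return id
}

func main() {
	fmt.Println(Slugify("Mary Thompson", WithSlugPrefix("person")))
	fmt.Println(Slugify("???", WithSlugPrefix("place"), WithSlugFallback("unnamed")))
	fmt.Println(Slugify("Robert Thompson Jr.", WithSlugMaxLength(10)))
}
```

Keeping the configuration in one struct means a single truncation path serves every caller, which is the duplication the entry describes removing.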
Dependencies
- Replaced JSON Schema validator library — Migrated
`glx/validator.go` (Pass 1 structural validation) and `go-glx/example_archives_roundtrip_test.go` from the unmaintained `github.com/xeipuuv/gojsonschema v1.2.0` (last release 2020) to the actively maintained `github.com/santhosh-tekuri/jsonschema/v6 v6.0.2`. Behavior preserved: same compile-once schema cache, same `[]string` issue list. Drops the stale `xeipuuv/gojsonpointer` and `xeipuuv/gojsonreference` indirect dependencies (both pinned to 2018/2019 pseudo-versions). Error messages from invalid GLX files are reworded by the new library; no external code parses these strings. Closes #675; supersedes #268.
Developer Experience
Tightened markdownlint `MD033` from disabled to strict + per-file allow-list — `.markdownlint-cli2.jsonc` now leaves `MD033` (no-inline-html) at its strict default. The only page in the linted scope (`specification/**/*.md`, `docs/**/*.md`, root `*.md`) that needs raw HTML, `specification/5-standard-vocabularies/README.md` (which embeds `<script setup>` and `<YamlFile />`), declares its own allow-list inline via a `markdownlint-configure-file` directive. (`website/standard-vocabularies.md` uses the same constructs but is outside the lint globs.) Stray inline HTML anywhere else (e.g., an accidental `<div>`, `<iframe>`, or even a `<script>` outside that one page) is once again caught by lint. Closes #740
Concurrency groups added to PR-triggered workflows —
validate-spec.yml,check-links.yml,security.yml,dependency-review.yml, andlint-pr-title.ymlnow declare workflow-level concurrency groups withcancel-in-progress: true, so pushing multiple commits to a PR branch in quick succession cancels stale runs and reduces wasted CI minutes.auto-resolve-conflicts.yml,auto-update-branches.yml, andlint.ymlalready had appropriate concurrency configs and are unchanged. Closes #346gocritic now runs
performancetag checks —.golangci.ymladdsenabled-tags: [performance]to thegocriticsettings, surfacinghugeParam(parameters >80 bytes passed by value),rangeValCopy(large value copies in range loops),appendCombine(sequentialappendchains),equalFold(strings.ToUpper(x) == "Y"→strings.EqualFold), andstringXbytes(string(b1) != string(b2)→bytes.Equal). Existing call paths flagged inglx/{analyze,link,migrate,query,report,vitals}_runner.goandgo-glx/{census,datagen,gedcom_encoding,gedcom_individual,gedcom_name}.gowere updated to take pointers, combine appends, or use the byte-equality helper. Thego-glx.PersonName.FormatFullNamevalue receiver is preserved (and the line marked//nolint:gocritic) to keepgetName().FormatFullName()callers compiling. Note: the issue (#379) literally proposed[diagnostic, style, performance], but on golangci-lint v2.11 only theperformancetag is genuinely new —diagnosticis the upstream default andstylewould have been a separate scope (53 additional findings). Closes #379. Supersedes the gocritic portion of #361.nolintlint enforces
//nolintdirective discipline —.golangci.ymlenables thenolintlintlinter withrequire-explanation: trueandrequire-specific: true, so every//nolintdirective must name the specific linter(s) it suppresses and carry an explanatory comment. Annotated the 12 previously-bare complexity suppressions (gocyclo/gocogniton the GEDCOM conversion functions plusvalidatePaths) — narrowing one of them (detectGEDCOMVersion) togocognitonly, sincegocyclodoes not fire there — and removed two directives that suppressed nothing: a dead//nolint:gocycloonloadVocabulary, and a//nolint:errcheckinarchive_io_test.gowhose statement was refactored tot.Chdir(errcheck is path-excluded for_test.go).allow-unused: trueis set for now: nolintlint's unused-directive check additionally surfaced 15 pre-existing redundant//nolint:mnddirectives, tracked for cleanup in #902. Closes #375. Supersedes the nolintlint portion of #361./check-docs-driftslash command invokes the real validator on extracted YAML snippets — Step 2 ("Example Code Blocks in Documentation") of.claude/commands/check-docs-drift.mdpreviously asked the LLM to mentally simulate JSON-schema validation of YAML examples embedded in docs. It now runsmktemp -d /tmp/glx-drift-XXXXXX, captures the printed path, uses the Write tool to save the snippet to<TMPDIR>/snippet.glx(avoiding all shell-quoting concerns around YAML bodies containing$, backticks, or quotes), runs./bin/glx validateon the file, and cleans up the scratch directory per-snippet viarm -rf <TMPDIR>(literal path, so the command unambiguously matches the allow-list entry and concurrent runs of the slash command don't race on a shared cleanup glob). The prompt'sallowed-toolsfrontmatter gains theWritetool plusBash(mktemp -d /tmp/glx-drift-*:*),Bash(./bin/glx validate:*), andBash(rm -rf /tmp/glx-drift-*:*)(all narrowly scoped);.claude/settings.jsonpermissions.allowgains the matchingmktempandrmentries. 
Validator exit code is reported as a single deterministic verdict (exit 0 → passed structural + semantic validation; any non-zero → CRITICAL, with the validator's stderr surfaced verbatim for human-review triage) — the prompt deliberately does NOT ask the LLM to second-guess the validator by inspecting top-level YAML keys, since that would re-introduce the exact LLM-simulation anti-pattern this change removes. #910 tracks--stdin --entity-typeflags so fragment-shape doc snippets (which currently trip(root): additional propertieserrors even when correct) can be validated directly. Closes #832.
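The per-snippet flow described above (scratch directory, write the snippet, run the validator, map the exit code to one verdict, clean up) could be expressed in Go roughly as follows. This is a sketch under assumptions, not the slash command itself, which is a prompt rather than Go code: `verdict` and `checkSnippet` are hypothetical names, and the `PASSED` label is illustrative.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// verdict maps an exit code to a single deterministic outcome:
// 0 passes, anything else is CRITICAL.
func verdict(exitCode int) string {
	if exitCode == 0 {
		return "PASSED"
	}
	return "CRITICAL"
}

// checkSnippet writes one snippet into its own scratch directory, runs
// "<validator> validate <file>", and removes the directory afterwards.
// It returns the verdict and the validator's stderr verbatim.
func checkSnippet(validator string, snippet []byte) (string, string, error) {
	dir, err := os.MkdirTemp("", "glx-drift-")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir) // per-snippet cleanup, no shared glob

	path := filepath.Join(dir, "snippet.glx")
	if err := os.WriteFile(path, snippet, 0o600); err != nil {
		return "", "", err
	}

	var stderr bytes.Buffer
	cmd := exec.Command(validator, "validate", path)
	cmd.Stderr = &stderr

	code := 0
	if err := cmd.Run(); err != nil {
		var exitErr *exec.ExitError
		if !errors.As(err, &exitErr) {
			return "", "", err // validator could not be started at all
		}
		code = exitErr.ExitCode()
	}
	return verdict(code), stderr.String(), nil
}

func main() {
	// The snippet content here is a placeholder, not a known-valid GLX file.
	v, stderr, err := checkSnippet("./bin/glx", []byte("persons: {}\n"))
	if err != nil {
		fmt.Println("could not run validator:", err)
		return
	}
	fmt.Println(v)
	fmt.Print(stderr)
}
```

Because each call owns its own directory, concurrent runs cannot delete each other's files, which is the race the literal-path cleanup avoids.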
Removed
Removed `glx merge --dry-run` — The `--dry-run` flag on `glx merge` (added in beta.10, #264) has been removed in favor of `--preview`, which supersedes it with richer output including cross-archive duplicate detection. (#702)
BREAKING: Removed unused random filename helpers from `go-glx` — `GenerateRandomID`, `GenerateEntityFilename`, `GenerateUniqueFilename`, and the `ErrUniqueFilenameFailed` sentinel are deleted. They were superseded by `EntityIDToFilename` (#699) and had no remaining callers in `go-glx`; the `glx migrate` deprecated-property path that used `GenerateRandomID` for synthetic `event-<8hex>` IDs now does so via a small private helper in `glx/migrate_runner.go`. External consumers of `github.com/genealogix/glx/go-glx` that imported any of the four removed symbols will need to migrate to `EntityIDToFilename` (for filename derivation) or vendor the deleted helpers. Closes #697
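For consumers moving off the removed helpers, deterministic filename derivation might look like the sketch below. It assumes the lowercase-plus-`.glx` rule and the case-insensitive collision check that this changelog gives for the multi-file serializer; `filenameFor` and `assignFilenames` are hypothetical names and do not reproduce the signature of `EntityIDToFilename`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// filenameFor derives a filename from an entity ID alone, so repeated
// writes of unchanged data produce identical paths.
func filenameFor(entityID string) string {
	return strings.ToLower(entityID) + ".glx"
}

// assignFilenames maps each ID to its filename and fails when two
// distinct IDs differ only by case.
func assignFilenames(ids []string) (map[string]string, error) {
	sorted := append([]string(nil), ids...)
	sort.Strings(sorted) // stable order gives stable error messages

	owner := make(map[string]string, len(sorted)) // filename -> first ID seen
	out := make(map[string]string, len(sorted))   // ID -> filename
	for _, id := range sorted {
		name := filenameFor(id)
		if prev, ok := owner[name]; ok && prev != id {
			return nil, fmt.Errorf("IDs %q and %q both map to %s", prev, id, name)
		}
		owner[name] = id
		out[id] = name
	}
	return out, nil
}

func main() {
	names, err := assignFilenames([]string{"person-mary-thompson", "place-boston"})
	fmt.Println(names["place-boston"], err)

	_, err = assignFilenames([]string{"Person-A", "person-a"})
	fmt.Println(err)
}
```

Deriving the name from the ID, rather than from a random value, is what keeps diffs empty when no data changed.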
Fixed
Entity-ID slugs now NFC-normalize input before the German digraph layer —
`go-glx/id_slug.go::slugifyOnce` ran `germanSlugReplacer.Replace(s)` directly on the input, which keys on the precomposed German umlauts (`ü` = U+00FC, …). Decomposed input (a base letter followed by U+0308 COMBINING DIAERESIS, as produced by some macOS/HFS+ filesystem sources and a few clipboard paths) does not match those keys, so the umlaut slipped past the digraph layer and was then reduced to a bare vowel by the subsequent NFKD + combining-mark strip (`München` → `munchen` instead of `muenchen`). The pipeline now composes to NFC first, so digraph expansion is robust to input normalization form. Closes #896
Non-ASCII characters in entity-ID slugs now transliterate to ASCII instead of being stripped — Any code path that derives an entity ID from free text (
glx link --create-source,glx add {person,place,event,...}, census import) previously ran a[^a-z0-9]+regex after lowercasing, which collapsed every non-ASCII character to a hyphen.glx link --create-source "Deutschland, ausgewählte evangelische Kirchenbücher 1500-1971"mintedsource-deutschland-ausgew-hlte-evangelische-kirchenb-cher-1500-1— theä/übecame lone hyphens, leaving partial gibberish. The slugifier now applies a two-layer transliteration before the regex: a small replacer for German digraphs (ä→ae, ö→oe, ü→ue, ß→ss, plus uppercase forms) followed by NFKD decomposition with combining-mark stripping (covers Mn/Mc/Me categories, soé→e, ñ→n, å→a, and compatibility characters that decompose to ASCII such as the ligaturefi→fi). The same input now mintssource-deutschland-ausgewaehlte-evangelische-kirchenbuecher-1500— the full transliterated body would bedeutschland-ausgewaehlte-evangelische-kirchenbuecher-1500-1971(62 chars), but after the 7-charsource-prefix the 64-char entity-ID cap trims-1971from the tail, leaving a 57-char body. Net: the surviving slug body is readable, deterministic, and matches German orthographic convention, even when the prefix budget forces some truncation. Non-Latin scripts (CJK, Arabic, etc.) still fall through the regex strip tounknown(or to a deterministic hash fallback at the call site) — fixing that is out of scope. Closes #720RenameEntitynow traversesResearchLogcross-references —go-glx/rename.go::findEntityType,moveMapKey, andupdateAllRefsall skipped theResearchLogsmap, so renaming a research log was rejected as "not found" and renaming a referenced Person/Event/Relationship/Place/Repository/Source/Citation left stale IDs insideResearchLog.Subject(a*EntityRef— guarded for nil), each embeddedSearch'sRepositoryID/SourceID/CitationID, and the log-levelCitationsslice. Rename now handles all of these. 
Raised by Copilot review on PR #818.glx diffnow reportsResearchLogadditions, modifications, and removals —go-glx/diff.go::DiffArchivesandentityTypeOrder, andglx/diff_runner.go::sortedEntityTypes, all omittedEntityTypeResearchLogs, so changes to research logs were invisible in CLI and library diffs. All three lists now includeresearch_logs(sorted aftermediato match the GLXFile field order). Raised by Copilot review on PR #818.JSON Schema vocabulary refs now point to the inner type definition — Every vocabulary property in
specification/schema/v1/glx-file.schema.json(event_types,relationship_types,place_types,repository_types,participant_roles,media_types,confidence_levels,source_types,sex_types,gender_types,legal_statuses,study_types,study_statuses,search_result_types,research_log_status_types, all*_properties— 23 entries in total) previously$ref-ed the entire vocabulary schema file, which requires a wrapper key (e.g.{ "event_types": { ... } }). Under standard JSON Schema semantics this means each entry's value (e.g.{ label: "Birth", category: "lifecycle" }) would fail to validate. The Go validator papered over this with aresolveVocabularyRefheuristic that walked into the wrapper to extract the inner entry schema, but tools that follow the spec literally produced spurious errors. Each ref now points at the nameddefinitions/<Name>TypeDefinition(ordefinitions/PropertyDefinitionfor property vocabularies) inside the wrapper schema, matching standard JSON Schema semantics.glx/validator.go::resolveFileRefnow honours a#/…JSON Pointer fragment on a file ref so the Go validator follows the same path. Raised by Copilot review on PR #818.Multi-file serializer emits forward-slash archive paths on every platform —
serializeEntitiesWrappedand the vocabulary writer ingo-glx/serializer.gopreviously usedfilepath.Jointo build map keys for the multi-file output (e.g.,persons/person-001.glx). On Windows this produced backslash-separated keys (persons\person-001.glx), breaking lookups in tests and consumers that expect archive-style relative paths. The serializer now usespath.Join(always/) and the deserializer correspondingly usespath.Extfor extension detection. Closes #817glx summaryresolves theresidenceperson-property to its place name — Under "Life Events", aresidencevalue (a place reference perreference_type: placesinperson-properties) printed as the raw entity ID (e.g.place-pohlgoens), while place references on vital events resolved to the place name (e.g.Pohl-Göns) in the same output.printLifeEventsSectionnow resolves any person-property declaredreference_type: placesthroughresolvePlaceName— covering plain-string, single-{value, date}, and temporal-list shapes (the(date)annotation is preserved) — and falls back to the raw ID when the referenced place is missing. Detection is vocabulary-driven, so a non-place property whose value happens to match a place ID is still shown verbatim. Single-file archives, which (unlike directory archives) do not merge standard vocabularies on load, now do so in the summary load path, so residence resolves regardless of archive form. Closes #897glx mergenow carries source media binaries into the destination, and the safe-write swap no longer destroys the destination's existing media binaries — Two bugs in one for multi-file archives. (1)mergeArchivespreviously merged only YAML entities and never touchedmedia/files/*, so source binaries were silently left behind and every Media URI imported from the source pointed at a missing file. 
(2)safeWriteMultiFileArchive(used byglx merge,glx migrate,glx rename, andglx merge-persons) wrote a fresh tmpDir, swapped it into place, and then deleted the backup — and becausemedia/is inarchiveManagedTopLeveltherestoreForeignEntriesstep did NOT carry the originalmedia/files/*binaries across, leaving every Media URI in the destination dangling regardless of merge. The merge runner now plans source binary copies against the destination's existingmedia/files/before the entity merge, deduplicating by content (skip when source and destination files have identical SHA-256) and renaming on name collisions (photo.jpg→photo-2.jpg, matching the GEDCOM importer's convention) with corresponding URI rewrites on the merged Media entities. Source binaries are copied after the safe-write completes; planned copies whose Media entity didn't survive the merge (ID conflict — destination's version kept) are skipped rather than left orphaned.safeWriteMultiFileArchivenow moves<backup>/media/files/into the freshly-written destination as part of the swap, andremoveStaleBackuprefuses to delete a.bakthat still contains unrecovered media binaries (media/files/*) — the same recovery posture already in place for foreign top-level files. Single-file destinations cannot hold binaries, so source binaries are not copied in that case and the runner emits a stderr warning so the user knows the merged Media URIs will dangle. Closes #593glx duplicatesrenormalizes scoring across the dimensions that actually have data — The 7-signal scoring model ingo-glx/duplicates.gopreviously summedweight * scoreover all dimensions with a fixed denominator of 1.0, so a name-only exact-match pair (e.g. parents identified on a child's birth record but otherwise unresearched) capped at the name weight (0.30) and silently fell below the 0.60 default threshold even when the names matched perfectly. 
The scoring functions now report aHasDataflag in addition to score/detail, andscorePairrenormalizes by dividing the weighted sum by the sum of weights of only the dimensions that carry comparable data on both sides. A dimension that compared and disagreed (e.g. birth years differ by >2, places differ, peers exist but don't overlap) keepsHasData: trueand remains in the denominator, so real disagreement is still penalized. Fully-documented pairs are byte-identical to the old algorithm (effective weight 1.0 → division by 1.0); name-only / partial-data pairs now score on the fraction of available evidence they share, surfacing the 18th-c. German parish-register and similar brick-wall pairs that motivated this issue.DuplicateSignalgains ahas_dataJSON field (additive, existing consumers unaffected). Closes #716glx init --create-test-data Nnow produces a valid archive —GenerateTestData::generateEvidenceChainminted anAssertionwithProperty: "birth"set but noValue, violating the schema rule thatvalueis required whenpropertyis present. Every freshly-initialized test archive failedglx validatewithmissing property 'value'on every generated assertion — the most common smoke test a new user runs against the CLI. The generator now emits an existential assertion (noproperty, novalue) attesting that the birth event occurred, which matches the semantics of "this event happened, here's the source" and validates cleanly. Closes #935. (PR #933)glx searchextracts values from temporal-list ([]any) properties —searchPropsnow enumerates property values via a newpropertyScalarshelper (sibling to the existingpropertyScalar), so list-shaped property values likeoccupation: [{value: farmer, date: 1850}, ...]and any other canonical GLX temporal-list shape are searchable. Previously only plain strings and single-map ({value, date}) shapes matched; temporal lists were silently dropped, soglx search blacksmithagainst an archive that recorded occupation as a list returned nothing. 
Thefmt.Sprint-on-map non-determinism that originally motivated the issue was already fixed in PR #252; this change closes the remaining shape-coverage gap and unifies property-value extraction with the rest of the package. Closes #618complete-familyexample: redundantsourceson assertions —docs/examples/complete-family/assertions/assertion-john-birth.glx,assertion-john-birthplace.glx, andassertion-marriage.glxeach carried both acitations:list and asources:list pointing at the same sources reachable through those citations. Perspecification/4-entity-types/assertion.md,citationsis the preferred evidence path when citations carry sub-location details (locator,text_from_source), which all three citations here do. Removed the redundantsourcesfield on each assertion; the evidence chain Repository → Source → Citation → Assertion remains intact through the citations. Closes #579glx duplicatesnow suppresses generationally-implausible pairs — When candidate person A is documented asrole=parentin some year P and candidate person B has a known birth year Y outside the plausible parenthood window[P-100, P-15](i.e., B was born within 15 years before P, after P, or more than 100 years before P), the pair is now scored as 0 (impossible to be the same person) instead of being scored solely on name + place similarity. Previously a putative father with no birth data and his newborn son could score identically to the genuine same-generation candidate; the suppressed pair now drops below any reasonable threshold and is annotated with anAge plausibilitysignal in the breakdown explaining which years were incompatible. Closes #717assertion-workflowexample uses flatjurisdictionstrings instead of place hierarchy —docs/examples/assertion-workflow/archive.glxdefinedplace-boston,place-new-york, andplace-cambridgewith ajurisdiction: "Massachusetts, United States"property string instead of theparent:chain documented inspecification/4-entity-types/place.md. 
The example now addsplace-united-states,place-massachusetts, andplace-new-york-stateand links the cities viaparent:, matching the canonical pattern incomplete-family. Closes #574complete-familymarriage-date assertion targeted relationship instead of event —docs/examples/complete-family/assertions/assertion-marriage.glxassertedstarted_onon the marriage relationship, duplicating the date already carried by the marriage Event entity (event-marriage-1875) and contradicting the canonical events-first pattern PR #360 established for birth/death dates. The assertion now targetsevent: event-marriage-1875withproperty: date, so the example teaches the canonical pattern. Closes #580assertion-workflowREADME evidence chain diagram contradicted its own worked example —docs/examples/assertion-workflow/README.mdshowed the chain ending atProperty, but the worked example beneath it (and the YAML earlier in the file) already targets an Event for birth-date evidence per the events-first pattern PR #360 established. The diagram now branches at the Assertion node to show both endpoints — a property (e.g.name,occupation) or an Event (e.g.date,place) — matching the assertion subject model. Closes #571personSex/displayableGenderIdentity/pronounForhandle temporal shapes —sexandgenderare both declaredtemporal: trueinperson-properties, so archives may store them as{value, date}maps or[{value, date}, ...]lists. Previously the CLI ran these throughpropertyString, whichfmt.Sprint-ed non-string values and produced useless display strings likemap[date:1850 value:male]— also breaking the legacy-gender fallback, the identity-vs-duplicate predicate, and pronoun selection (every temporal value silently fell through to they/their). A newpropertyScalarhelper extracts the canonical scalar from string / single-map / list shapes. 
(PR #742)glx migrate --rename-gender-to-sexno longer skips on emptysex—isPostSplitArchivepreviously treated the mere presence of thesexkey as a post-split signal, so archives withsex: "",sex: {}, orsex: []incorrectly skipped the migration while their real data still lived ingender. The check now requires a meaningful scalar value. (PR #742)glx migrate --rename-gender-to-sexdetects custom identity values — The post-split guard previously only flagged the canonicalgender: nonbinarymarker. An archive already using post-split semantics with a custom identity value (e.g.gender: two-spiritor any non-legacy-sex vocabulary key) and nosexset would have its identity values silently migrated intosex. The guard now skips migration for any non-legacy-sex value in either person properties or person-subject assertions. (PR #742)glx migrate --rename-gender-to-sexpreserves all vocabulary fields — Moving inlined pre-splitgender_typesentries intosex_typespreviously copied onlyLabel,Description, andGEDCOM, silently droppingCategory,AppliesTo, andMimeTypefrom custom user entries. The migration now clones the fullVocabularyEntry(with a deep copy ofAppliesTo). (PR #742)
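Several fixes in this section depend on reading one scalar out of a property that may be a plain string, a `{value, date}` map, or a list of such maps. A minimal sketch of that extraction follows. `scalarOf` is a hypothetical analogue of `propertyScalar`, not its actual code, and it makes two assumptions: decoded YAML maps arrive as `map[string]any`, and for lists the first entry that yields a value wins.

```go
package main

import "fmt"

// scalarOf extracts one string from the three shapes a temporal property
// may take. The boolean is false when no meaningful value is present.
func scalarOf(v any) (string, bool) {
	switch t := v.(type) {
	case string:
		return t, t != ""
	case map[string]any:
		// Single {value, date} shape: recurse into the "value" key.
		return scalarOf(t["value"])
	case []any:
		// Temporal-list shape. Assumption: first usable entry wins; a real
		// helper might prefer the latest-dated entry instead.
		for _, item := range t {
			if s, ok := scalarOf(item); ok {
				return s, true
			}
		}
	}
	return "", false
}

func main() {
	fmt.Println(scalarOf("male"))
	fmt.Println(scalarOf(map[string]any{"value": "male", "date": 1850}))
	fmt.Println(scalarOf([]any{map[string]any{"value": "female", "date": 1850}}))
	fmt.Println(scalarOf(map[string]any{})) // empty map carries no value
}
```

Returning a presence flag alongside the string lets a caller treat `""`, `{}`, and `[]` as absent rather than formatting them for display, which is the distinction the empty-`sex` migration fix relies on.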
Specification Tooling
- Removed unused `js-yaml` runtime dependency from `specification/package.json`. It was never imported by the validation tooling or any script under `specification/`. (PR #742)
GEDCOM Import
Non-geographic
PLACvalues flagged and skipped — GEDCOMs exported by legacy genealogy software routinely stuff non-place strings into thePLACtag — placeholders (Unknown,?,N/A,Private), status markers (Unmarried,Deceased,Stillborn), and event-circumstance prose (Died in childbirth,Killed in action). The importer previously accepted any non-empty string and minted a first-classPlaceentity for it, which then misclassified throughinferPlaceType(e.g."Unknown"became acity) and polluted the place graph. A newnonGeographicPLACReasoncheck rejects a tight set of sentinel values plus circumstance-phrase patterns (only when the value collapses to a single non-empty comma-separated component, to avoid false-positives on real hierarchies like"Unknown County, Texas") and emits anImportWarningtaggedPLACexplaining what was ignored. The event keeps its other subrecords (date, citations, notes) — only the place is dropped. Closes #493Empty
SEXtag maps tonot_recorded— A present-but-empty GEDCOMSEXtag (e.g.1 SEXwith no value) now maps tosex: not_recordedrather thansex: unknown.unknownis reserved forSEX U(source consulted but sex could not be determined); an empty tag represents absent-from-source data per GEDCOM 5.5.5. (#528)Preserve GEDCOM 7.0 extension tags during round-trip — Vendor-specific extension subrecords (underscore-prefixed tags like
_FSFTID,_GENEALOGY_SITE,_PROFILE) inside INDI and FAM records were previously recognized but dropped on import, silently losing data on every round-trip. They are now preserved on the relevant entity underproperties.gedcom_extensionsas a list of{tag, value, subrecords}records (recursive shape preserves duplicate child tags and nesting), and re-emitted on GEDCOM export. Top-level0 _CUSTOMrecords and SCHMA URI registry preservation remain follow-ups. (#289)GEDCOM export handles temporal
sex/gender - exportPerson (SEX emission) and getPersonSex (HUSB/WIFE inference) previously read sex/gender via getStringProperty, which only accepts plain strings. A temporal archive (sex: {value: male, date: 1850} or sex: [{value, date}, ...]) would silently get SEX omitted from its INDI record and fall back to first/second order for HUSB/WIFE. A new getScalarProperty helper extracts the scalar from all three shapes and is used at the sex/gender call sites; non-temporal fields (citation locator, source publication info, etc.) continue to use getStringProperty so their type check still catches data errors. (PR #742)
- Multi-file serializer generates deterministic filenames - Entity filenames are now derived from entity IDs (strings.ToLower(entityID) + ".glx") instead of random 8-char hex. Previously, every write generated new random filenames, causing massive git diffs even when no data changed. Case-insensitive collisions (e.g., Person-A and person-a) are detected and reported as errors. Fixes #694
- GEDCOM export emits events whose principal participant uses
role: subject - buildPersonEventsIndex filtered on participant.Role == ParticipantRolePrincipal, silently dropping events from the per-person event index whenever the participant used the documented synonym subject (or left the role unset). The filter now uses the existing isSubjectRole helper, which accepts principal, subject, and "" consistently with FindPersonEvent, the census import default, and vitals_runner. Fixes #523.
- glx migrate, glx rename, and glx merge no longer delete non-archive files - The crash-safe write path (safeWriteMultiFileArchive, added in #598) swapped a fresh archive directory into place and then unconditionally removed the original via .bak, wiping any top-level entry the serializer didn't produce, including .git/, README.md, CLAUDE.md, .claude/, and arbitrary user content. Since GLX archives are designed to live inside git repositories, every invocation against a real archive silently destroyed git history and project docs. The swap now preserves every top-level entry that isn't in the managed set (metadata.glx, vocabularies/, persons/, events/, relationships/, places/, sources/, citations/, repositories/, media/, assertions/). Test coverage added for foreign-file preservation. Fixes #692
- research_logs/ and studies/ recognized by safe-write swap and glx init scaffolding - The CLI side hadn't been updated when the ResearchLog (#224) and Study (#226) entity types were added: archiveManagedTopLevel in glx/archive_io.go and the directory list in glx/init_runner.go only enumerated the older 9 entity types. As a result, running glx migrate, glx rename, or glx merge against an archive containing research_logs/ or studies/ would treat those directories as foreign during the safe-write swap (and the single-file glx init template omitted both keys). Introduced a single source of truth, glxlib.AllEntityTypes in go-glx/constants.go, and refactored the safe-write managed set, both glx init scaffolding paths, and three test fixtures to derive from it so a new entity type updates every consumer in one place. glx init --create-test-data N now also mints one Study (umbrella scope) and one ResearchLog (paired with the first generated citation) so the generated archive exercises all 11 entity types. (PR #933)
- Pin
ossf/scorecard-action to v2.4.3 - The action does not publish a floating v2 major-version tag, so the workflow failed at the "Prepare all required actions" step on every run since it was introduced in #753. v2.4.3 is the latest published tag; unlike the other actions in the file (@v6, @v7, @v4), scorecard-action requires a minor.patch pin until upstream ships a floating major. (#779)
- glx vitals limited to core vital records only - Removed non-vital events (census, marriage, residence, etc.) from glx vitals output. Vitals now shows exactly Name, Sex, Birth, Christening, Death, Burial; other life events remain available via glx summary and glx timeline (#685, Fixes #644)
- Witness events excluded from vital records display - glx vitals and glx summary vital sections now show vital events (birth, christening, death, burial) only where the person is a principal/subject participant. Witnessing a christening (or other vital event) no longer appears as the person's own vital. Non-vital "Life Events" in glx summary continue to show all participant roles (#686, Fixes #647)
- go.mod/go.sum pinned to LF line endings - Added eol=lf for go.mod and go.sum in .gitattributes. Go tools always write LF (see golang/go#31870); without this rule, Windows users got CRLF on checkout, which caused go mod tidy -diff false positives in CI (#684, Fixes #638)
- /check-code-drift slash command covers all type categories - Expanded the drift-detection checklist to cover Metadata, Submitter, EntityRef, NoteList, all 9 vocabulary structs, and FieldDefinition. Added type-mapping entries for NoteList (oneOf), DateString (alias), *Submitter (pointer). New "Vocabulary Struct Types" section enumerates the 9 vocab types and their schemas. Documentation only (#689, Fixes #674)
- VitePress vocabularies data loader includes
legal-statuses - The new Legal Statuses vocabulary section in 5-standard-vocabularies/README.md referenced vocabularies['legal-statuses'], but website/.vitepress/data/vocabularies.data.js was not updated when the vocabulary file was added, so the <YamlFile> block would render undefined on the published site. Loader now reads legal-statuses.glx alongside the other 21 vocabularies. (PR #933)
- Metadata.notes schema now accepts string or string array - glx-file.schema.json previously restricted top-level metadata.notes to a bare string, while the Go Metadata.Notes field is a NoteList that accepts both forms (matching every other entity's notes field). Multi-note metadata round-tripped through Go but failed schema validation. The schema now mirrors the other entities (oneOf: [string, array]). No archive data changes required. (PR #933)
- complete-family example now demonstrates all 11 entity types - Added studies/study-smith-yorkshire.glx (family reconstruction scope) and research_logs/research-log-john-smith-birth.glx (an investigation with two Search entries that produced the existing birth citations). The archive's vocabularies/ directory gains the four new standard vocabularies (study-types, study-statuses, search-result-types, research-log-status-types) so the additions validate against archive-owned vocab. Closes a gap noted by /check-examples where Study and ResearchLog had no example coverage. (PR #933)
- complete-family example filenames now match entity IDs - Renamed 11 single-entity files (person-jane-smith.glx → person-jane-smith-1876.glx, source-census.glx → source-census-1851.glx, etc.) so the archive matches the deterministic EntityIDToFilename convention introduced in #694. Previously, round-tripping the archive through glx's multi-file serializer would rename these files on every write. (PR #933)
- GEDCOM
STATE tag accepted in repository ADDR import - The repository address parser previously only recognized STAE, so any GEDCOM produced with the STATE variant lost its state/province value on import. Both spellings are now accepted; export continues to emit the spec-canonical STAE. (#821)
- GEDCOM 7 NO tag now creates a disproven assertion instead of a no_* person property - The importer previously rendered 1 NO BIRT and similar negative assertions as fabricated person properties like no_birth: true, which polluted the property vocabulary and lost the negative-assertion semantics. The importer now creates an Event of the corresponding type and a paired Assertion with status: disproven (plus citations and the GEDCOM DATE/PHRASE when present), matching the GEDCOM 7 spec for the NO structure. (#688, Fixes #659)
- glx migrate/rename/merge retry os.Rename on Windows for transient AV/indexer locks - The crash-safe write path (safeWriteMultiFileArchive, #598) failed intermittently on Windows when antivirus or the search indexer briefly held a handle on the freshly-written tmpdir during the swap. Rename now retries on ERROR_ACCESS_DENIED/ERROR_SHARING_VIOLATION with exponential backoff (1 ms initial, doubled with jitter and capped at 500 ms per sleep, 2 s total window) before surfacing the error. (#701, Fixes #698, #700)
- glx validate <file.glx> now runs semantic validation on single files - The single-file code path previously ran only JSON-schema (Pass 1) validation, silently skipping deprecated-property warnings, date-format checks, vocabulary-membership checks, and cross-entity reference integrity. glx validate file.glx now runs the full validator (Pass 1 + Pass 2), matching the directory-archive code path. (#687, Fixes #658)
- copyMediaFiles output routed through IOStreams; trailing single BLOB char rejected - The GEDCOM import media-copy step wrote progress messages directly to os.Stdout, bypassing --quiet and the IOStreams abstraction (#679). Output now flows through IOStreams.Out. As part of the same audit, the BLOB decoder gained an ErrInvalidBlobChar sentinel and now rejects a trailing single Base64 character (which has no valid 6-bit decoding) instead of silently emitting a partial byte. (#695)
0.0.0-beta.10 - 2026-04-11
Added
CLI
- Added glx search command - Full-text search across all entity types (persons, events, places, sources, citations, repositories, assertions, relationships, media). Case-insensitive by default with --case-sensitive flag, --type filter, and results grouped by entity type (#252)
- Added glx merge command - Combine two GLX archives by merging all content from a source into a destination. Identical vocabulary entries are silently skipped; true conflicts are reported. Supports both single-file and multi-file archives, with --dry-run for preview (#264, #609)
- Added glx migrate command - Converts deprecated person properties (born_on, born_at, died_on, died_at, buried_on, buried_at) to birth/death/burial Event entities. Creates new events when none exist, merges date/place into existing events when they do, converts property assertions to event assertions, and removes the deprecated properties (#360, Fixes #645)
- Added --phonetic flag to glx query persons --name - Soundex phonetic matching finds names that sound alike regardless of spelling (Miller/Myller/Mueller, Smith/Smyth). Supports multi-word queries matching any word against any name word (#262)
- Crash-safe writes for migrate and rename commands - Multi-file archive writes now use a temp directory + atomic swap, preventing archive corruption on interrupted writes (e.g., power loss, disk full). Closes #597
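As a rough illustration of the Soundex matching mentioned in the --phonetic entry above, here is a minimal sketch of standard American Soundex with an any-word-against-any-word comparison. The function names soundex and phoneticMatch are hypothetical, and whether glx uses exactly this Soundex variant is an assumption; the entry only says "Soundex".

```go
package main

import (
	"fmt"
	"strings"
)

// codes maps consonants to their Soundex digit. Vowels, H, W and Y are
// absent, so a lookup for them yields the zero byte.
var codes = map[rune]byte{
	'B': '1', 'F': '1', 'P': '1', 'V': '1',
	'C': '2', 'G': '2', 'J': '2', 'K': '2', 'Q': '2', 'S': '2', 'X': '2', 'Z': '2',
	'D': '3', 'T': '3',
	'L': '4',
	'M': '5', 'N': '5',
	'R': '6',
}

// soundex returns the four-character American Soundex code of a single word,
// or "" if the word contains no ASCII letters.
func soundex(word string) string {
	var out []byte
	var prev byte
	for _, r := range strings.ToUpper(word) {
		if r < 'A' || r > 'Z' {
			continue
		}
		code := codes[r]
		if len(out) == 0 {
			out = append(out, byte(r)) // keep the first letter as-is
			prev = code
			continue
		}
		if r == 'H' || r == 'W' {
			continue // H and W do not separate equal codes
		}
		if code != 0 && code != prev {
			out = append(out, code)
			if len(out) == 4 {
				break
			}
		}
		prev = code // a vowel resets prev, so a repeated code is kept
	}
	for len(out) > 0 && len(out) < 4 {
		out = append(out, '0')
	}
	return string(out)
}

// phoneticMatch reports whether any word of query sounds like any word of name.
func phoneticMatch(query, name string) bool {
	for _, q := range strings.Fields(query) {
		qc := soundex(q)
		if qc == "" {
			continue
		}
		for _, n := range strings.Fields(name) {
			if qc == soundex(n) {
				return true
			}
		}
	}
	return false
}

func main() {
	fmt.Println(soundex("Miller"), soundex("Myller"), soundex("Mueller"))
	fmt.Println(phoneticMatch("Smyth", "John Smith"))
}
```

Under this variant Miller, Myller, and Mueller all reduce to M460, and Smith and Smyth to S530, which matches the pairs listed in the entry.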
Date Handling
- Non-Gregorian calendar support - GEDCOM calendar escape sequences (@#DJULIAN@, @#DHEBREW@, @#DFRENCH R@) are now preserved as calendar prefixes on DateString values (e.g., JULIAN 1731-03-15). Previously, calendar designations were silently discarded. Gregorian remains the default (no prefix). Includes full roundtrip support on GEDCOM export (#564)
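A minimal sketch of how the escape-to-prefix mapping in the entry above might look. The helper name splitCalendarEscape and the lookup table are hypothetical; the FRENCH_R spelling is taken from the year-extraction entry elsewhere in this changelog. The sketch only separates the calendar from the rest of the value and does not convert the remaining date text to ISO form.

```go
package main

import (
	"fmt"
	"strings"
)

// calendarEscapes maps GEDCOM 5.5.1 calendar escapes to an assumed
// DateString prefix. Gregorian maps to "" because it is the default.
var calendarEscapes = map[string]string{
	"@#DGREGORIAN@": "",
	"@#DJULIAN@":    "JULIAN",
	"@#DHEBREW@":    "HEBREW",
	"@#DFRENCH R@":  "FRENCH_R",
}

// splitCalendarEscape strips a leading calendar escape from a raw GEDCOM
// date value and returns the calendar prefix plus the remaining date text.
func splitCalendarEscape(raw string) (calendar, rest string) {
	raw = strings.TrimSpace(raw)
	for esc, cal := range calendarEscapes {
		if strings.HasPrefix(raw, esc) {
			return cal, strings.TrimSpace(strings.TrimPrefix(raw, esc))
		}
	}
	return "", raw // no escape: Gregorian by default
}

func main() {
	cal, rest := splitCalendarEscape("@#DJULIAN@ 15 MAR 1731")
	fmt.Printf("%q %q\n", cal, rest)
}
```

Keeping the calendar as a separate return value means an exporter could reverse the lookup to re-emit the original escape.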
GEDCOM Import
- Place-less RESI records preserved - GEDCOM RESI records without a PLAC sub-record (e.g., bare RESI Y or RESI with only DATE/TYPE) are now imported as residence Event entities instead of being silently dropped. RESI records with a PLAC continue to import as temporal person properties. Fixes #488
- ASSO subrecords imported as event participants - Witnesses, officiants, godparents, and other associated persons from GEDCOM ASSO/RELA records are now imported as event participants with correct roles. Supports both individual and family events. Unknown roles preserved in participant notes (#589)
- Separate NOTE records preserved through roundtrip - Notes changed from a single concatenated string to a list (NoteList). Import appends separate notes instead of concatenating; export emits each as a separate GEDCOM NOTE record, preserving original note boundaries. Backwards-compatible with existing archives (#584)
Changelog
- Use [Unreleased] header - Changed changelog to use [Unreleased] instead of pre-committed version number, per Keep a Changelog specification. Fixes #388
Changed
- glx coverage JSON output keys renamed - born_on/born_at/died_on/died_at renamed to birth_date/birth_place/death_date/death_place to match event-based data model. This is a breaking change for scripts parsing the JSON output (#568)
Fixed
- Unrecognized SEX values preserved - Non-standard or extension GEDCOM SEX values (e.g., custom values, or values such as N whose meaning varies between GEDCOM 5.5.5 Not Recorded and 7.0 Nonbinary) are now lowercased and preserved as-is instead of being silently mapped to unknown. Validation will warn about out-of-vocabulary values (#588)
- Correct year extraction from Hebrew and French Republican dates - ExtractFirstYear now uses calendar-aware extraction, finding the last digit sequence for HEBREW and FRENCH_R dates where the year appears last. Previously, HEBREW 15 TSH 5765 would extract 15 (the day) instead of 5765. Also handles range dates (BET...AND, FROM...TO) correctly (#590)
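The calendar-aware rule in the year-extraction entry above (take the last digit run for HEBREW and FRENCH_R dates, the first otherwise) could be sketched like this. yearOf is a hypothetical helper, not the library's ExtractFirstYear, and it deliberately ignores the range forms (BET...AND, FROM...TO) and day-first GEDCOM dates that the real function is said to handle.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

var digitRun = regexp.MustCompile(`\d+`)

// yearOf picks the year out of a single prefixed date string. For calendars
// that write the year last it uses the final digit run; otherwise it assumes
// a year-first form such as 1731-03-15 and uses the first run.
func yearOf(date string) (int, bool) {
	runs := digitRun.FindAllString(date, -1)
	if len(runs) == 0 {
		return 0, false
	}
	pick := runs[0]
	if strings.HasPrefix(date, "HEBREW ") || strings.HasPrefix(date, "FRENCH_R ") {
		pick = runs[len(runs)-1]
	}
	year, err := strconv.Atoi(pick)
	if err != nil {
		return 0, false
	}
	return year, true
}

func main() {
	fmt.Println(yearOf("HEBREW 15 TSH 5765"))
	fmt.Println(yearOf("JULIAN 1731-03-15"))
}
```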
Import
- GEDCOM OBJE without FILE now preserves metadata — Previously, OBJE records with no FILE reference were silently dropped during import. Now the media entity is created with an empty URI, preserving all metadata (TITL, NOTE, FORM). Validation catches the missing URI downstream. (#492)
Developer Experience
- devcontainer: remove abandoned ajv-cli and install actual npm deps - Replaced the unused global ajv-cli install with parallel postCreateCommand that runs go mod download, pins golangci-lint v2.11.4, and installs specification/ and website/ npm dependencies. Removed unused Docker extension, added YAML and markdownlint extensions, added forwardPorts for VitePress dev server (#326, #327)
CI
- auto-resolve-conflicts: deduplicate changelog section headers after conflict resolution - The sed-based "keep both sides" merge produced duplicate ###/#### headers when both sides added entries to the same section. Added an awk deduplication pass that merges entries under a single header (#331)
- auto-resolve and auto-update workflows can no longer race - Both workflows now share a branch-maintenance concurrency group. Auto-resolve triggers on workflow_run after auto-update succeeds instead of independently on push to main. Auto-update cron reduced from every 30 min to daily; auto-resolve cron removed (runs only after auto-update) (#332)
- release workflow no longer fails when Discord webhook is unconfigured - Guard the Discord announcement step with an empty-check on the webhook secret so missing secrets skip the step instead of failing the release job. Also switch to curl -sf for proper HTTP error handling (#342)
- CODEOWNERS: activate rules with real usernames and fix stale paths - All rules were commented out and several paths were stale (/schema/, /test-suite/, /examples/). Activated with individual usernames, updated paths to match current directory structure (#330)
- Release workflow: exclude GEDCOM test fixtures from line-ending normalization - GoReleaser was failing with "git is in a dirty state" because * text=auto wanted to renormalize GEDCOM test fixtures that had been deliberately excluded from repo-wide LF normalization in #656 without a matching -text attribute. Added glx/testdata/gedcom/**/*.ged -text in .gitattributes so all GEDCOM fixtures are preserved byte-for-byte. testdata/gedcom README.md docs (not fixtures) renormalized to LF
Specification
- Relationship GEDCOM mapping table column header - The "GLX Field" column actually listed relationship type values; renamed to "GLX Relationship Type" for accuracy (PR #670)
- Calendar Prefix glossary placement - Moved "Calendar Prefix" entry from the ## D section to the ## C section where it belongs alphabetically (PR #670)
- Date keywords list: clarify AND - AND is a connector inside BET YYYY AND YYYY, not a standalone keyword; updated validation description to reflect this (PR #670)
- Assertion key-properties mention evidence alternatives - Entity Types index now lists citations/sources/media for Assertion key properties instead of citations alone (PR #670)
- Assertion required-field table: evidence requirement row - Simplified the "at least one of citations, sources, or media" row and added a link to the Evidence Requirement section (PR #670)
- specification/CLAUDE.md entity-type guide - "Adding a New Entity Type" steps now include specific file paths (types.go, vocabulary file, JSON schema, entity spec, README card, glossary, CHANGELOG) (PR #670)
- Vocabulary field-definition table: document value_type - Added value_type row to the "Field Definition Structure" table in vocabularies.md since it's used by standard vocabularies (crop.fields, external_ids.fields) (PR #670)
- Removed unused category: "occupation" from event-type examples - The category field was shown in four custom-event-type examples but no standard vocabulary defines or uses occupation as a category. Removed from 2-core-concepts.md, 4-entity-types/vocabularies.md, 5-standard-vocabularies/README.md, and website/standard-vocabularies.md (PR #670)
- Added glossary entries for Participant Assertion, Per-Participant Properties, and Temporal Existential Assertion - These concepts were used in entity specs but missing from the glossary (PR #670)
Schema
- notes field accepts string or array across all 9 entity schemas - The spec was updated in this release to document notes: string | string[] (matching the Go NoteList type), but the schemas still enforced type: "string", causing validation failures on any archive using array-form notes. Fixed in assertion.schema.json, citation.schema.json, event.schema.json, media.schema.json, person.schema.json, place.schema.json, relationship.schema.json, repository.schema.json, and source.schema.json (12 occurrences total, including participant sub-objects in assertion/event/relationship)
- FieldDefinition supports value_type in all 8 property-vocabulary schemas - The spec documents value_type as a valid key on structured field components (used by media-properties.glx crop sub-fields with value_type: integer), but the schemas only documented label and description. Added value_type: string with the standard enum (string | date | integer | boolean) to citation-properties, event-properties, media-properties, person-properties, place-properties, relationship-properties, repository-properties, and source-properties schemas
- participant-roles.schema.json documents applies_to and gedcom - The standard participant-roles.glx uses applies_to: [event | relationship] on most roles (15+ entries), but the schema did not document the field. Added both applies_to (array of enum strings with uniqueItems) and gedcom (string) to ParticipantRoleDefinition. Related to #499
- Added gedcom field to 5 vocabulary schemas - confidence-levels.schema.json (→ QUAY, #515), media-types.schema.json (→ MEDI), participant-roles.schema.json (→ ASSO.RELA, #524), repository-types.schema.json (#555), and source-types.schema.json (#561) now document the optional gedcom key that the other 11 vocabulary schemas already support, closing the schema support gap for in-flight work on these mappings
Go Types
- Vocabulary structs gain GEDCOM field - ConfidenceLevel, ParticipantRole, SourceType, RepositoryType, and MediaType structs in go-glx/types.go now carry GEDCOM string yaml:"gedcom,omitempty" matching the schema additions. Previously, a .glx file using gedcom: on one of these vocabularies would have the value silently dropped on round-trip. PlaceType was evaluated but not updated; there is no natural GEDCOM mapping for place types and no open issue tracking one
- FieldDefinition gains ValueType field - Added ValueType string yaml:"value_type,omitempty" to FieldDefinition so structured-property field components (used by media-properties.glx crop sub-fields) round-trip correctly through the Go types
Removed
Person Properties
- BREAKING: Removed born_on, born_at, died_on, died_at person properties. Birth and death information now lives exclusively on Event entities of type birth/death. Use glx migrate to convert existing archives (#360)
0.0.0-beta.9 - 2026-03-29
Added
Supply Chain Security
- Dependency review on PRs - dependency-review-action blocks PRs that introduce dependencies with moderate+ vulnerabilities
- Renovate lockfile maintenance - weekly lockfile refresh keeps transitive dependencies at latest allowed versions
- govulncheck SARIF integration - vulnerability results now upload to GitHub Code Scanning for richer triage
- npm audit in CI - website dependencies are audited for known vulnerabilities on every push and PR
CLI
- Added glx path command - Find the shortest relationship path between two people using breadth-first search. Traverses all relationship types (parent-child, marriage, sibling, godparent, etc.). Supports --max-hops to limit search depth and --json for machine-readable output
- Added glx cluster command - FAN (Friends, Associates, Neighbors) club analysis for brickwall research. Cross-references census households, shared events, and place overlap to identify associates of a target person. Ranks associates by connection strength with compound scoring. Supports --place, --before, --after filters and --json output
- Added glx census add command - Bulk census import helper that generates GLX entities from a structured YAML template. Reads census year, location, household members, and citation details to produce person records, a census event with participants, source/citation entities, and evidence-based assertions. Supports matching members to existing archive persons by ID or name, --dry-run preview, and FAN notes
- Added conflicts analysis category to glx analyze - Detects assertions with conflicting values for the same person/property combination (e.g., multiple conflicting birthplaces). Reports the number of distinct values and their confidence levels. Use --check conflicts to run independently. Fixes #156
- Analyze flags duplicate given names among siblings - glx analyze now detects when a parent's children share the same given name, which may indicate incorrect family reconstruction, a "replacement child" pattern, or a middle-name situation. Skips the pattern when earlier child died before the later was born. Fixes #164
- Added --subject filter to glx query assertions - Filter assertions by subject entity ID or person name substring. Matches any subject type by exact ID; for person subjects, also matches by case-insensitive name search. Fixes #150
- Added --birthplace filter to glx query persons - Filter persons by birthplace using place ID or name substring (case-insensitive). Matches against both born_at value and resolved place name. Fixes #141
- Analyze flags uncited claims in notes - glx analyze evidence checks now detect assertion notes that reference sources (e.g., "per county history," "census shows") without a corresponding citation. Fixes #162
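The breadth-first search behind the glx path entry at the top of this list can be sketched as below, under the assumption that relationships have already been flattened into an undirected adjacency map keyed by person ID. shortestPath and the map shape are hypothetical and only show the technique; a maxHops of 0 is treated here as "no limit", which is also an assumption.

```go
package main

import "fmt"

// shortestPath returns the person IDs on a shortest path from one person to
// another, or nil if none exists within maxHops edges (0 means unlimited).
func shortestPath(adj map[string][]string, from, to string, maxHops int) []string {
	if from == to {
		return []string{from}
	}
	// prev records how each person was first reached; "" marks the start.
	prev := map[string]string{from: ""}
	frontier := []string{from}
	for hops := 0; len(frontier) > 0 && (maxHops <= 0 || hops < maxHops); hops++ {
		var next []string
		for _, cur := range frontier {
			for _, nb := range adj[cur] {
				if _, seen := prev[nb]; seen {
					continue
				}
				prev[nb] = cur
				if nb == to {
					// Walk back to the start, then reverse.
					path := []string{to}
					for p := cur; p != ""; p = prev[p] {
						path = append(path, p)
					}
					for i, j := 0, len(path)-1; i < j; i, j = i+1, j-1 {
						path[i], path[j] = path[j], path[i]
					}
					return path
				}
				next = append(next, nb)
			}
		}
		frontier = next
	}
	return nil
}

func main() {
	adj := map[string][]string{
		"person-a": {"person-b"},
		"person-b": {"person-a", "person-c"},
		"person-c": {"person-b", "person-d"},
		"person-d": {"person-c"},
	}
	fmt.Println(shortestPath(adj, "person-a", "person-d", 0))
}
```

Because the search expands one hop level at a time, the first time the target is reached is guaranteed to be along a path with the fewest relationships, and the hop limit falls out of the loop counter.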
Changed
- Life History narrative mentions children - glx summary now includes children in the biographical narrative, listed by given name in birth order (e.g., "She had three children: Harriett, Elijah, and Mary."). Fixes #153
Fixed
- Validate catches dangling property references - glx validate now detects when property values like born_at, died_at, or residence reference non-existent entities. Previously, standard vocabularies were not loaded during validation, so property reference checks were silently skipped. Fixes #147
- Consistent date display across timeline and summary - ISO dates like 1860-07-17 now render as July 17, 1860 in timeline tabular output, summary vital events, and life events. Previously, dates appeared in whichever format they were stored (GEDCOM or ISO), creating inconsistent mixed output. Fixes #139
- Stats lists duplicate entity IDs - glx stats now lists the specific duplicate IDs in its warning, consistent with glx analyze. Fixes #177
- Validate and archive loading skip non-.glx files - glx validate and archive loading now only process files with the .glx extension. Previously, .yaml and .yml files in the archive directory were also parsed, causing spurious validation errors on non-GLX files like .wikitree.yml. Fixes #178
- Windows compatibility for symlinked vocabulary files - Archive loading now resolves Git symlink placeholders on Windows, where symlinks are stored as text files containing the target path. Previously, ~35 tests failed on Windows because example archives contain symlinked vocabulary files. Fixes #206
- Analyze flags missing marriage events per spouse - glx analyze now checks each spouse relationship independently instead of checking for any marriage event. Persons with multiple spouses where one has an event and another doesn't are now correctly flagged with the specific spouse name. Fixes #166
- Places command detects person property references - glx places no longer reports places as "Unreferenced" when they are used in person properties (born_at, died_at, buried_at, residence). Also checks assertion values for place-reference properties. Handles string, structured map, and temporal list property shapes. Fixes #145
- Analyze checks citations for census coverage - glx analyze now checks assertions' citations and sources (not just census event entities) when determining whether a census year is covered. Previously, census records documented only via citations were still suggested as missing, contradicting glx coverage output. Fixes #140
- BEF date prefix respected in census suggestions - glx analyze and glx coverage now treat BEF <year> death dates as exclusive upper bounds. A person with died_on: "BEF 1870" no longer gets 1870 census suggestions. Fixes #165
- Summary shows marriages in chronological order - glx summary now sorts spouses by full marriage date (earliest first, using the same date sort key as glx timeline) instead of relationship ID order. Correctly orders marriages within the same year. Undated marriages sort after dated ones. The Life History narrative also reflects the correct order. Fixes #136
- Life History narrative formats ISO dates as readable text - Dates like 1863-06-18 now render as "on June 18, 1863" instead of "in 1863-06-18". Handles full dates, year-month, and prefixed dates (ABT, BEF, AFT)
- Census suggestions capped at plausible lifespan - glx analyze and glx coverage no longer suggest census years beyond birth_year + 100 when no death date is known. Previously, a person born ~1832 would get suggestions for 1940 and 1950 censuses. Fixes #130
- Burial events infer death for census suggestions - When died_on is not set but a burial event exists, the burial date is used as the death upper bound for census suggestions. Prevents suggesting post-death censuses for persons with burial records but no explicit death date. Fixes #134
- 1890 census annotated as mostly destroyed - glx coverage and glx analyze now note that the 1890 US Census was mostly destroyed in a 1921 fire, so researchers don't waste time searching for non-existent records. Fixes #131
- Timeline includes person's own birth and death - glx timeline now synthesizes birth/death entries from born_on/died_on person properties when no corresponding event entity exists. Previously these were omitted, making the person's own vital events the only events missing from their timeline. Fixes #142
0.0.0-beta.8 - 2026-03-15
Added
CLI
- Added glx analyze command - Automated research gap analysis engine that cross-references all entities in a GLX archive to surface evidence gaps (missing dates, no parents, no events), evidence quality issues (unsupported assertions, single-source persons, orphaned citations/sources), chronological inconsistencies (death before birth, parent younger than child, implausible lifespan), and research suggestions (census years to search, vital records to locate). Supports --check to run a single category, --format json for machine-readable output, and person filtering by ID or name
- Added glx diff command - Compare two GLX archive states with genealogy-aware diffing. Shows added, modified, and removed entities with field-level detail, confidence upgrade/downgrade tracking, and new evidence metrics. Supports summary, verbose, short, and JSON output modes. Use --person to filter changes for a specific person
- Added glx coverage command - Show source coverage matrix for a person, listing expected records (US census, vital, probate, land, military, church) and which are present vs missing. Flags high-priority gaps like the 1880 census. Supports --json output
- Added glx duplicates command - Detect potential duplicate person records using a weighted scoring model (name similarity with Levenshtein distance and nickname matching, birth/death year proximity, place match, shared relationships and events). Supports person-specific filtering and JSON output. Automatically skips persons already linked by relationships
Library
- Exported ExtractFirstYear and ExtractPropertyYear - Year-extraction utilities are now public API for use by CLI commands and external consumers
Validation
- Moved temporal consistency checks to glx analyze - Death before birth, parent younger than child, and marriage before birth checks are now part of the analyze command's consistency category instead of the validator, keeping glx validate focused on structural and referential integrity
Standard Vocabularies
- Added vocabulary_type to property definitions - Properties can now reference a controlled vocabulary (e.g., vocabulary_type: gender_types) instead of a free-form value_type. Validation warns on out-of-vocabulary values. Mutually exclusive with value_type and reference_type
- Added gender_types vocabulary - First vocabulary-constrained property type. Standard entries: male, female, unknown, other, with GEDCOM SEX mappings. GEDCOM export now looks up gender→SEX via the vocabulary before falling back to hardcoded mappings
- Added marriage_type event property - Classification of marriage (civil, religious, common-law). Was used in GEDCOM import/export but missing from standard vocabulary
- Added primary_name person property - Simple display name fallback when structured name property is not available. Was used in event titles and data generation but missing from standard vocabulary
- Added blob_size media property - Size in bytes of inline binary data from GEDCOM 5.5.1 BLOB records. Was used in GEDCOM media import but missing from standard vocabulary
Changed
- GEDCOM encoding conversion now streams for charmap encodings - CP1252/ISO-8859-1 decoding uses transform.NewReader instead of reading the entire file into memory. Only ANSEL (which requires combining-mark reordering) buffers the full file. UTF-8 files pass through with near-zero overhead
- ANSEL converter handles multiple combining diacriticals - Consecutive combining marks preceding a base letter are now all buffered and emitted after the base letter in Unicode order, instead of only handling a single combining mark
Fixed
- GEDCOM import now converts non-UTF-8 encodings - Files with CHAR ANSI (Windows-1252), CHAR cp1252, CHAR ANSEL, or CHAR ISO-8859-1 are now automatically converted to UTF-8 during import. Previously, non-ASCII characters (German umlauts, accented letters, copyright symbols) were stored as raw bytes, producing !!binary YAML tags, garbled event titles, and {"type":"Buffer"} place names in the web UI
- GEDCOM date import mangled when day-of-month matches level number - Dates like 2 AUG 1944 (day 2) were imported as 2 DATE 2 AUG 1944 because the parser's value extraction matched the level number instead of the actual value. Fixed by walking past tokens positionally instead of using string search
- Date year extraction now handles 1–3 digit years - Year extraction previously hardcoded a 4-digit assumption (\d{4}), silently ignoring dates like 800, 476, or ABT 476. All four extraction sites (query filtering, timeline sorting, temporal validation, event titles) now support 1–4 digit years. Day-of-month values (e.g., 15 in 15 MAR 1850) are correctly disambiguated. Timeline sort keys are zero-padded to 4 digits for proper chronological ordering. Fixes #108
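To illustrate the last entry, here is a small sketch of 1 to 4 digit year extraction with day-of-month disambiguation and a zero-padded sort key. It is not the project's implementation: firstYear and sortKey are hypothetical names, the rule "a number directly followed by a GEDCOM month abbreviation is a day" is an assumed heuristic, and only space-separated GEDCOM-style dates are covered.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// months holds the GEDCOM Gregorian month abbreviations.
var months = map[string]bool{
	"JAN": true, "FEB": true, "MAR": true, "APR": true, "MAY": true, "JUN": true,
	"JUL": true, "AUG": true, "SEP": true, "OCT": true, "NOV": true, "DEC": true,
}

// firstYear returns the first 1-4 digit number that is not a day of month.
func firstYear(date string) (int, bool) {
	tokens := strings.Fields(strings.ToUpper(date))
	for i, tok := range tokens {
		n, err := strconv.Atoi(tok)
		if err != nil || n < 0 || len(tok) > 4 {
			continue // keyword such as ABT, or not a plausible year
		}
		if i+1 < len(tokens) && months[tokens[i+1]] {
			continue // a number followed by a month name is a day
		}
		return n, true
	}
	return 0, false
}

// sortKey zero-pads the year to four digits so that plain string comparison
// orders early years correctly ("0476" sorts before "0800" and "1850").
func sortKey(date string) string {
	year, ok := firstYear(date)
	if !ok {
		return ""
	}
	return fmt.Sprintf("%04d", year)
}

func main() {
	for _, d := range []string{"15 MAR 1850", "800", "ABT 476"} {
		fmt.Println(d, "->", sortKey(d))
	}
}
```

Without the padding, a string sort would place "800" after "1850", which is the ordering problem the entry describes.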
0.0.0-beta.7 - 2026-03-10
Added
CLI
- Added glx export command - Export GLX archives to GEDCOM 5.5.1 or 7.0 format. Supports both single-file and multi-file archives as input. Reconstructs GEDCOM FAM records from GLX relationships, converts dates/places/names back to GEDCOM format, and preserves sources, repositories, media, citations, and notes. Use --format 70 for GEDCOM 7.0 output
- Added glx timeline command - Display chronological events for a person, including direct events and family events (spouse/child births, parent deaths) via relationship traversal. Supports --no-family flag to exclude family events; undated events shown in a separate section
- Added glx summary command - Comprehensive person profile showing identity, vital events, life events, family (spouses, parents, siblings), other relationships, and an auto-generated life history narrative
- Added glx ancestors and glx descendants commands - Display ancestor/descendant trees using box-drawing characters. Traverses parent-child relationships with --generations flag to limit depth. Handles biological, adoptive, foster, and step-parent types with cycle detection
- Added glx vitals command - Display vital records (name, sex, birth, christening, death, burial) for a person by ID or name search, plus any other life events they participated in
- Added glx cite command - Generate formatted citation text from structured fields (source title, type, repository, URL, accessed date, locator), eliminating repetitive manual citation_text writing
- Added --source and --citation filters to glx query assertions - Filter assertions by source or citation ID to find all claims derived from a specific source
- Improved glx query persons --name to search all name variants - Now matches across birth names, married names, maiden names, and as-recorded variants (temporal name lists), not just the primary name. Results show alternate names with "aka:" suffix
Event Entity
- Added optional title field - Human-readable label for events (e.g., "1860 Census — Webb Household"). Auto-generated on GEDCOM import (e.g., "Birth of Robert Webb (1815)", "Marriage of John Smith and Jane Doe (1850)")
GEDCOM Import
- Non-standard date preservation - BCE dates, Julian/Hebrew/French Republican calendar dates, and dual-year dates are preserved as raw strings instead of being dropped
- TITL with DATE/PLAC sub-records - Title properties with dates and places are stored as temporal list items and roundtrip correctly
- Empty OCCU with PLAC fallback - OCCU records with empty values but PLAC sub-records now extract the place text as the occupation value
- HEAD-level NOTE preservation - Notes on GEDCOM HEAD records are now imported and exported
- Family-level RESI import - RESI records under FAM are now distributed to both spouses as residence properties
- Family-level NOTE import/export - NOTE records on FAM are now stored on the relationship's Notes field and roundtrip correctly
GEDCOM Export
- Inline SOUR citations on individual events - Birth, death, burial, and other individual events now preserve SOUR citations during export
- Single-spouse family marriages - FAM records with only HUSB or WIFE now export marriage relationships and events instead of being silently dropped
- Multiple MARR events per family - Families with multiple MARR records now preserve all marriage events
- Marriage TYPE export - Marriage marriage_type property now exported as TYPE sub-record on MARR
- Family event TYPE/properties export - Family events (EVEN, ENGA, etc.) now export event_subtype and other event properties (TYPE, CAUS, AGE) that were previously lost
- HEAD metadata roundtrip - LANG, FILE, COPR sub-records from the original GEDCOM HEAD are now preserved through import/export
- Single-value RESI export - RESI stored as scalar (not list) now exports correctly instead of being silently dropped
- Multi-family children placed in all matching families - Children belonging to multiple FAM records (e.g., birth family + step-family) are now placed in all matching families instead of only the first match
Validation
- Added temporal consistency checks - Validator now warns on: death year before birth year, parent born after child, marriage event before participant's birth. Reported as warnings since dates are often estimates
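As a rough illustration of the three checks listed above, the sketch below compares plain years and returns warnings instead of errors. The person type, the function names, and the use of 0 for "unknown year" are all hypothetical; the real validator's data model is not shown in this changelog.

```go
package main

import "fmt"

// person is a hypothetical, simplified record; a year of 0 means unknown.
type person struct {
	ID        string
	BirthYear int
	DeathYear int
}

// checkLifespan warns when a known death year precedes a known birth year.
func checkLifespan(p person) []string {
	if p.BirthYear != 0 && p.DeathYear != 0 && p.DeathYear < p.BirthYear {
		return []string{fmt.Sprintf("%s: death year %d is before birth year %d", p.ID, p.DeathYear, p.BirthYear)}
	}
	return nil
}

// checkParentChild warns when a parent's birth year is after the child's.
func checkParentChild(parent, child person) []string {
	if parent.BirthYear != 0 && child.BirthYear != 0 && parent.BirthYear > child.BirthYear {
		return []string{fmt.Sprintf("%s: born after child %s", parent.ID, child.ID)}
	}
	return nil
}

// checkMarriage warns when a marriage year precedes a participant's birth.
func checkMarriage(p person, marriageYear int) []string {
	if p.BirthYear != 0 && marriageYear != 0 && marriageYear < p.BirthYear {
		return []string{fmt.Sprintf("%s: marriage in %d is before birth in %d", p.ID, marriageYear, p.BirthYear)}
	}
	return nil
}

func main() {
	parent := person{ID: "person-a", BirthYear: 1850, DeathYear: 1820}
	child := person{ID: "person-b", BirthYear: 1840}

	var warnings []string
	warnings = append(warnings, checkLifespan(parent)...)
	warnings = append(warnings, checkParentChild(parent, child)...)
	warnings = append(warnings, checkMarriage(child, 1835)...)
	for _, w := range warnings {
		fmt.Println("warning:", w)
	}
}
```

Skipping any comparison where a year is unknown keeps the checks quiet on incomplete data, in line with reporting these as warnings because dates are often estimates.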
Documentation
- Added Westeros example archive - Large-scale example featuring 790+ characters from A Song of Ice and Fire with full evidence chains, 200+ custom vocabulary types, and temporal properties. Hosted at github.com/genealogix/glx-archive-westeros
- Added Hands-On CLI Guide - Step-by-step walkthrough of every glx command using the Westeros demo archive, with real output examples
Fixed
- SOUR citation duplication on multi-value properties - Assertion-based SOUR references now filter by matching value, preventing N×N duplication when a person has multiple values for TITL, OCCU, etc.
0.0.0-beta.6 - 2026-03-08
Added
CLI
- Added glx places command - Analyze places for ambiguity and completeness: flags duplicate names, missing coordinates, missing types, hierarchy gaps, and unreferenced places with canonical hierarchy paths
- Added glx query command - Filter and list entities from a GLX archive with type-specific flags: --name, --born-before, --born-after for persons; --type, --before, --after for events; --confidence, --status for assertions
- Added glx stats command - Summary dashboard showing entity counts, assertion confidence distribution, and entity coverage for quick feedback on archive health
Build & Release
- Added make release-snapshot target - Build cross-platform binaries locally without publishing, using GoReleaser snapshot mode
- Updated release workflow to latest action versions - actions/checkout@v4 (with fetch-depth: 0 for proper changelog), actions/setup-go@v5, goreleaser/goreleaser-action@v6
Person Entity
- Added name variation tracking - Expanded the name.fields.type classification field with standard values for alternate spellings, abbreviations, and as-recorded forms (aka, maiden, anglicized, professional, as_recorded). Added documentation and examples for representing name variations like "R. Webb" vs. "Robert Webb"
Standard Vocabularies
- Added original_place_name citation property - Records the verbatim place name from a source before normalization to a place entity (e.g., "The Town Of Oakdale" vs the normalized place reference)
- Added relationship types - neighbor, coworker, housemate for census/social records; apprenticeship, employment, enslavement, relative for occupational and generic kinship relationships
- Added event types - legal_separation, taxation, voter_registration for legal/administrative events; military_service, stillborn, affiliation for service periods, stillbirths, and memberships
- Added source types population_register, tax_record, notarial_record - Common European and colonial record types
- Expanded military source type description - Now includes draft registrations and muster rolls
Participant Object
- Added properties to participants - Participants across events, relationships, and assertions can now carry per-participant properties like age_at_event, enabling shared events (census, passenger lists) to record individual data without creating separate events per person
- Participant properties validated against parent entity vocabulary - Event participant properties validated against event_properties, relationship participant properties against relationship_properties, assertion participant properties against event_properties
Assertion Entity
- Added existential assertions - Assertions no longer require property or participant; an assertion with only subject and evidence asserts the entity's existence, optionally at a specific date (#26)
GEDCOM Import
- Import HEAD metadata - GEDCOM HEAD record fields (export date, source file, copyright, language, source system/version/corporation, GEDCOM version, character set, notes) are now stored in a metadata section on the GLX archive instead of being discarded after logging
- Import SUBM metadata - GEDCOM SUBM submitter information (name, address, phone, email, website) is now stored in metadata.submitter on the GLX archive
Data Model
- Added Metadata type - New top-level metadata field on GLX archives for storing import provenance information
- Added Submitter type - Nested within metadata to hold submitter contact details
Changed
Specification
- Removed hard-coded vocabulary counts - Replaced "N standardized type codes" with descriptive text to prevent stale counts as vocabularies grow
- Improved custom type example - Custom event type example now shows defining custom participant roles (apprentice, master) alongside the custom event type
- Clarified subject participant role - Documented as preferred over principal
Fixed
- Fixed confidence levels example format - Core concepts example now uses the correct label/description structure instead of simple key-value strings
- Fixed citation GEDCOM mapping - Corrected invalid SOUR.CITN.EXID tag to SOUR.EXID
- Fixed core-concepts.md formatting - Property Vocabularies heading was merging with preceding table
- Fixed glossary Secondary Evidence example - Replaced "census records" (primary evidence) with "published indexes, compiled genealogies"
0.0.0-beta.5 - 2026-03-06
Added
Standard Vocabularies
- Added url and accessed properties for digital sources - Sources can now record a url property, and citations can record an accessed date for when an online source was last verified (#21)
- Added race person property - Temporal string property for recording racial classifications as they appear in historical documents such as census records (#24)
- Added url and external_ids citation properties - Citations can now record a direct URL to cited material and external identifiers (e.g., FamilySearch ARK) for record-level specificity (#23)
- Added type field to external_ids property - All external_ids properties (person, source, citation, repository) now support a structured fields.type to record the issuing authority (e.g., FamilySearch URI from GEDCOM EXID.TYPE) (#32)
- Added type field to name property - Name property now supports a fields.type to classify name usage (e.g., birth, married, alias) (#25)
Assertion Entity
- Added status field to assertion entity - Assertions can now record a research status (e.g., proven, disproven, speculative) independently of confidence, allowing researchers to distinguish between certainty and verification state (#27)
GEDCOM Import
- Import NAME.TYPE subfield - GEDCOM NAME.TYPE values (BIRTH, MARRIED, AKA, etc.) are now lowercased and stored in the name property's type field (#25)
- Import EXID on citations - GEDCOM 7.0 EXID tags on source citations are now imported as external_ids citation properties (#32)
- Structured EXID import - GEDCOM EXID.TYPE is now stored in fields.type instead of being concatenated into the ID string; applies to all entity types (#32)
Fixed
- Multiple GEDCOM NAME records no longer silently dropped (#29) - When a person has multiple NAME records (birth name, married name, etc.), all names are now stored as a temporal list instead of only keeping the last one
- FAM event processing no longer depends on HUSB/WIFE tag order (#15) - Family events (CENS, ENGA, MARB, etc.) are now collected in a first pass and processed after spouse IDs are extracted, so GEDCOM tag order no longer matters
- Census NOTE no longer discarded when SOUR exists (#30) - NOTE text on CENS records is now appended to existing citation notes when SOUR sub-records are present, instead of being silently lost
- Marriage/divorce events use start_event/end_event instead of properties - GEDCOM MARR and DIV events are now correctly linked to relationships via the top-level start_event and end_event fields, eliminating non-vocabulary marriage_event/divorce_event property warnings
- Append residence on PLAC-without-DATE instead of overwriting - When residence came from a GEDCOM RESI tag or census-derived CENS data with a PLAC but no DATE, the residence property was previously overwritten; it is now appended (#22)
0.0.0-beta.4 - 2026-03-04
Added
Standard Vocabularies
- Added township place type - Township is a common administrative division in U.S. census and land records, distinct from town (a geographic settlement vs. a civil subdivision of a county) (#16)
Fixed
Validation
- Suggest correct vocabulary key on hyphen/underscore mismatch - When a reference fails validation due to a hyphen/underscore swap (e.g., birth_date vs birth-date), the error message now suggests the correct key (#19)
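One way such a suggestion could be produced is to compare keys after folding hyphens into underscores. The sketch below is a hedged illustration; suggestKey and normalize are hypothetical names, not the validator's actual functions.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize folds hyphens to underscores so that keys differing only in
// that separator compare equal.
func normalize(key string) string {
	return strings.ReplaceAll(key, "-", "_")
}

// suggestKey returns the vocabulary key that matches ref once hyphens and
// underscores are treated as equivalent. It reports false when ref is
// already a valid key or when no such key exists.
func suggestKey(ref string, vocab []string) (string, bool) {
	for _, key := range vocab {
		if key == ref {
			return "", false // exact match: nothing to suggest
		}
	}
	for _, key := range vocab {
		if normalize(key) == normalize(ref) {
			return key, true
		}
	}
	return "", false
}

func main() {
	vocab := []string{"birth_date", "death_date"}
	if key, ok := suggestKey("birth-date", vocab); ok {
		fmt.Printf("unknown key %q, did you mean %q?\n", "birth-date", key)
	}
}
```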
CLI
- Show directory contents in glx init non-empty error - When glx init fails because the target directory is not empty, the error message now lists up to 5 files found (e.g., .DS_Store, .git), helping users diagnose unexpected blockers like hidden files or sync artifacts (#18)
- Remove self-referencing replace directive that blocks go install - The go.mod contained a no-op self-referencing replace directive that prevented go install github.com/genealogix/glx/glx@latest from working (#17)
GEDCOM Import
- Deduplicate evidence references - When a GEDCOM record references the same source multiple times, extractEvidence() and extractEventDetails() now skip IDs already seen, preventing duplicate entries that violate unique constraints in downstream consumers (#13)
Documentation & Website
- Fix dead links and website issues - Rewrote 83 dead links across the site to point to GitHub URLs and VitePress paths, added solid background to navbar on home page, and fixed module path resolution (#10)
- Fix Go Report Card link - Corrected badge link in CLI README to point to the repository root (#11)
0.0.0-beta.3 - 2026-02-10
Added
Census Event Type
- Added census event type to standard vocabulary - Census enumeration events (CENS GEDCOM tag) now included in event-types.glx
Schema Embeds
- CitationPropertiesSchema and SourcePropertiesSchema embed variables - Completes the pattern established by all other vocabulary schema embeds in embed.go
GEDCOM Import: Eliminate Meaningless Citations
- Bare source references no longer create empty citation entities - When a GEDCOM SOUR tag references a source without any citation-level detail (no PAGE, DATA, TEXT, QUAY, NOTE, or OBJE subrecords), the assertion or event now references the source directly via the sources field instead of creating a citation that only contains a source reference
- Added PropertySources constant for event/relationship properties
Changed
Assertion Entity Improvements
Renamed claim to property
- Renamed claim field to property - The field name now matches the vocabulary terminology (property vocabularies)
- Updated JSON schema, Go types (Assertion.Claim → Assertion.Property), all specification examples, example archives, test data, and terminology throughout docs
- Renamed test directories: assertion-unknown-claim → assertion-unknown-property, assertion-participant-and-claim → assertion-participant-and-property, invalid-assertion-claims → invalid-assertion-properties
Typed Subject Reference
- Changed subject from string to typed reference object - Prevents entity ID collisions in large archives
- Must specify exactly one of: person, event, relationship, or place
- Before: subject: person-john-smith → After: subject: { person: person-john-smith }
- Added EntityRef Go type with Type() and ID() helper methods
- Updated validation to ensure exactly one field is set and referenced entity exists
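A typed reference with the "exactly one field" rule might look roughly like the sketch below. Only the names EntityRef, Type(), and ID() come from the entries above; the field layout, the fields helper, and the Validate method are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// EntityRef is a hypothetical sketch of a typed reference: exactly one
// of the four fields should be non-empty.
type EntityRef struct {
	Person       string
	Event        string
	Relationship string
	Place        string
}

// fields lists the (type name, id) pairs in a fixed order.
func (r EntityRef) fields() [][2]string {
	return [][2]string{
		{"person", r.Person},
		{"event", r.Event},
		{"relationship", r.Relationship},
		{"place", r.Place},
	}
}

// Type returns the name of the first field that is set, or "" if none is.
func (r EntityRef) Type() string {
	for _, f := range r.fields() {
		if f[1] != "" {
			return f[0]
		}
	}
	return ""
}

// ID returns the ID held by the first field that is set, or "" if none is.
func (r EntityRef) ID() string {
	for _, f := range r.fields() {
		if f[1] != "" {
			return f[1]
		}
	}
	return ""
}

// Validate reports an error unless exactly one field is set.
func (r EntityRef) Validate() error {
	set := 0
	for _, f := range r.fields() {
		if f[1] != "" {
			set++
		}
	}
	if set != 1 {
		return errors.New("subject must set exactly one of person, event, relationship, place")
	}
	return nil
}

func main() {
	ref := EntityRef{Person: "person-john-smith"}
	fmt.Println(ref.Type(), ref.ID(), ref.Validate())
}
```

Because the entity type travels with the ID, a person and a place that happen to share an ID string can no longer be confused.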
Media as Assertion Evidence
- Added media as a third evidence option for assertions - Assertions can now reference media entities directly as evidence, alongside citations and sources
- Useful for direct visual evidence like gravestone photos, handwritten documents, or family photographs
- JSON schema anyOf evidence constraint updated to include media
Temporal date Field
- Added date field to assertions - Assertions can now specify a date or date range indicating when the asserted property value applies, enabling precise temporal targeting for properties like occupation, residence, and religion that change over time
- Added Date field to Assertion Go struct and date property to assertion JSON schema
- Assertion value field is now required when property is present
Vocabulary Consolidation
Adoption Modeling
- Removed redundant adoption relationship type - Use adoptive-parent-child relationship type instead
- Clarified adoption semantics: adoption event type records the legal proceeding; adoptive-parent-child relationship type models the ongoing bond
- Removed RelationshipTypeAdoption constant from Go code
Godparent Modeling
- Clarified godparent dual usage - Participant role godparent for event participation (baptism sponsor); relationship type godparent for the ongoing bond
- Added godchild participant role for use in godparent relationships
Type System
Unified Participant Type
- Unified participant types - Consolidated EventParticipant, RelationshipParticipant, and AssertionParticipant into single Participant struct
  - All three had identical structure: person, role, notes fields
  - Event.Participants, Relationship.Participants, and Assertion.Participant now all use the unified type
Property Vocabularies
Media Properties
- New media-properties.glx vocabulary - Standard properties for media entities:
  - subjects - People depicted or referenced in the media (multi-value)
  - width, height - Dimensions in pixels for images/video
  - duration - Duration in seconds for audio/video
  - file_size - File size in bytes
  - crop - Crop coordinates as integers (top, left, width, height)
  - medium - Physical medium type (photograph, document, film)
  - original_filename - Original filename before import
  - photographer - Person who created the media
  - location - Place where the media was created
- Added Properties field to Media struct and MediaProperties to GLXFile
Repository Properties
- New repository-properties.glx vocabulary - Standard properties for repository entities:
  - phones - Phone numbers for the repository (multi-value)
  - emails - Email addresses for the repository (multi-value)
  - fax - Fax number
  - access_hours - Hours of operation or access availability
  - access_restrictions - Any restrictions on access (appointment required, subscription, etc.)
  - holding_types - Types of materials held as YAML arrays (multi-value)
  - external_ids - External identifiers from other systems like FamilySearch, WikiTree (multi-value)
- Added RepositoryProperties to GLXFile
- Moved contact fields (phone, email) from direct entity fields to properties
Citation Properties
- New citation-properties.glx vocabulary - Standard properties for citation entities:
  - locator - Location within source (consolidates former page and locator direct fields; GEDCOM PAGE)
  - text_from_source - Transcription or excerpt of relevant text (moved from direct entity field)
  - source_date - Date when the source recorded the information (from GEDCOM DATA.DATE)
- Added Properties field to Citation struct, CitationProperties to GLXFile, and vocabulary specification section
Source Properties
- New source-properties.glx vocabulary - Standard properties for source entities:
  - abbreviation - Short reference name (from GEDCOM ABBR)
  - call_number - Repository catalog number (from GEDCOM CALN)
  - events_recorded - Types of events documented by this source (multi-value, from GEDCOM EVEN)
  - agency - Responsible agency (from GEDCOM AGNC)
  - coverage - Geographic/temporal scope of source content
  - external_ids - External system identifiers (multi-value)
- Added Properties field to Source struct, SourceProperties to GLXFile, and source-properties.schema.json
Multi-Value Property Support
- Added multi_value field to PropertyDefinition - Properties can now be marked as supporting multiple values
- Validation correctly handles array values for multi-value properties
GEDCOM Import
Media/OBJE Import
- Implemented inline OBJE handling for all record types - Media references and embedded OBJE records on individuals, events, sources, families, submitters, census records, and person property tags are now imported (previously only marriage events and top-level OBJE were handled)
- Added handleOBJE shared helper for XRef references, GEDCOM 7.0 @VOID@ pointers, and embedded OBJE
- Added BLOB data handling, URL-type multimedia import, and OBJE processing in extractEventDetails
- Torture test media import improved from 2 to 32 entities (100% coverage)
Media File Import
- Media files are now copied into the archive during GEDCOM import - Relative FILE paths copied to media/files/; BLOB data decoded and written to files
- Media URIs rewritten to archive-relative paths; URL and absolute path references left as-is
- Filename deduplication with counter suffixes; missing source files produce warnings, not errors
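Filename deduplication with counter suffixes could be sketched as below. The uniqueName helper and the "-N" suffix placed before the extension are assumptions; the changelog does not state the exact suffix format the importer uses.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// uniqueName returns name unchanged if it is unused; otherwise it inserts
// a counter before the extension until an unused name is found. The chosen
// name is recorded in used. The "-N" suffix format is an assumption.
func uniqueName(name string, used map[string]bool) string {
	ext := filepath.Ext(name)
	base := strings.TrimSuffix(name, ext)
	candidate := name
	for n := 1; used[candidate]; n++ {
		candidate = fmt.Sprintf("%s-%d%s", base, n, ext)
	}
	used[candidate] = true
	return candidate
}

func main() {
	used := map[string]bool{}
	for i := 0; i < 3; i++ {
		fmt.Println(uniqueName("photo.jpg", used))
	}
}
```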
Census (CENS) Support
- Implemented CENS tag handling for individual and family records - Census records treated as evidence sources, not events
- Each CENS creates a Source (type: census) and Citation; extracts PLAC for temporal residence property
- Family-level CENS applies census data to both husband and wife
- Added createPropertyAssertionWithCitations() helper
Vocabulary-Driven Tag Resolution
- Added gedcom field to PropertyDefinition struct - Property vocabulary entries can now declare their corresponding GEDCOM tag
- Added GEDCOM tag mappings to all 6 property vocabularies (person, event, citation, source, repository, media)
- Added external_ids to person-properties.glx and event detail properties (age_at_event, cause, event_subtype) to event-properties.glx
- Added GEDCOMIndex reverse lookup infrastructure; replaced hardcoded mappings with vocabulary-driven lookups
- Added gedcom field and fields/FieldDefinition to all 8 property vocabulary JSON schemas
- Updated vocabulary specification documentation with gedcom field and GEDCOM column
Evidence and Citation Handling
- Assertions require citations - Assertions are now only created when SOUR tags are present
- Embedded citation support - SOURCE_CITATION without pointer creates synthetic Source entity
- Properties-based storage - Source, media, and citation tags now stored in vocabulary-defined properties instead of notes
- Citation linkage on media - SOUR on OBJE now properly links via citation.Media
Validation
- Place hierarchy cycle detection - Validates that place parent references don't form cycles (e.g., A -> B -> C -> A). Reports exactly one error per cycle with the full cycle path in the error message.
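Since each place has at most one parent, cycles can be found by walking parent links and remembering which places have already been fully explored. The sketch below is one possible approach under that assumption; findCycles and the map-based input are hypothetical, not the validator's actual code.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// findCycles takes a map from place ID to parent ID ("" for no parent)
// and returns each cycle exactly once, as a path that starts and ends on
// the same place.
func findCycles(parent map[string]string) [][]string {
	ids := make([]string, 0, len(parent))
	for id := range parent {
		ids = append(ids, id)
	}
	sort.Strings(ids) // deterministic order, so reports are stable

	done := make(map[string]bool) // places explored by an earlier walk
	var cycles [][]string
	for _, start := range ids {
		index := make(map[string]int) // position of each place in this walk
		var path []string
		for cur := start; cur != "" && !done[cur]; cur = parent[cur] {
			if i, seen := index[cur]; seen {
				cycle := append([]string{}, path[i:]...)
				cycles = append(cycles, append(cycle, cur))
				break
			}
			index[cur] = len(path)
			path = append(path, cur)
		}
		for _, id := range path {
			done[id] = true
		}
	}
	return cycles
}

func main() {
	parent := map[string]string{"a": "b", "b": "c", "c": "a", "d": "a", "e": ""}
	for _, c := range findCycles(parent) {
		fmt.Println("place hierarchy cycle:", strings.Join(c, " -> "))
	}
}
```

Marking every place on a finished walk as done is what keeps a cycle from being reported again when a later walk starts on, or leads into, the same loop.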
Place Entity
- Moved jurisdiction, place_format, and alternative_names to properties - Now stored as vocabulary-defined properties instead of dedicated entity fields. alternative_names simplified from AlternativeName/DateRange types to a temporal, multi-value string property.
Relationship Entity
- Consolidated description into properties.description - Removed as a top-level field
Source Entity
- Consolidated creator field into authors - Removed creator from spec, schema, and Go types
Library Package Restructuring
- Moved core library from glx/lib/ to go-glx/ - The library is now at the repository root for clean external imports
- Renamed package from lib to glx - External consumers import as glxlib "github.com/genealogix/glx/go-glx" and use glxlib.GLXFile, glxlib.NewSerializer(), etc.
- Updated all CLI files to use new import path and glxlib. qualifier
CLI
- Changed glx import default format - Now defaults to multi-file (-f multi) instead of single-file
JSON Schema URLs
- Standardized schema $id URLs - All JSON schemas now use consistent GitHub raw content URLs; removed references to schema.genealogix.io and genealogix.org domains
Documentation
- Rewrote Migration from GEDCOM guide - Expanded from a skeleton to a comprehensive guide covering all supported GEDCOM tags, CLI flags, field mapping tables, common challenges, troubleshooting, and GEDCOM 5.5.1 vs 7.0 differences
- Clarified vocabulary file location is flexible - Spec, quickstart, and vocabulary docs now emphasize that vocabulary files can live anywhere in the archive, not only in vocabularies/
- Streamlined Introduction - Simplified 1-introduction.md from 120 to 63 lines
- Restructured Core Concepts - Reorganized 2-core-concepts.md to emphasize flexibility; new section order: Archive-Owned Vocabularies → Entity Relationships → Data Types → Properties → Assertions → Evidence Chain → Collaboration
- Merged Data Types into Core Concepts - Integrated 6-data-types.md as section 3; deleted standalone file
- Added Glossary to specification - Moved from docs/guides/glossary.md to specification/6-glossary.md with "Property" and "Temporal Property" definitions
- Updated table of contents and fixed broken links after restructuring
- Removed .md extensions from ~40 internal links for VitePress compatibility
- Standardized GEDCOM mapping table headers across all 8 entity type files
- Added Properties sections to place.md and relationship.md
- Standardized entity file structure across all entity type docs
- Added Schema Reference sections to event, relationship, place, citation, and repository entity docs
- Added naming convention note (hyphens for file/entry names, underscores for YAML section keys) to core concepts
- Moved "Change Tracking with Git" section before "Next Steps" in core-concepts
- Removed 59 file path comments from YAML code blocks
- Standardized validation rules to reference vocabularies with links
- Added participants to all event examples that were missing the required field
- Enhanced VitePress sidebar - Core Concepts promoted to its own collapsible sidebar section with 8 direct anchor links
- Updated quickstart.md - Examples updated to reflect schema changes
- Updated best-practices.md - Assertion examples updated to use typed subject reference and property field
Fixed
Specification
- Fixed Place hierarchy example that used duplicate YAML top-level keys
- Fixed examples using incorrect field names throughout specification (description → notes, value → notes, file: → uri:, death_year → died_on, married_on → born_on, residence_dates → residence, registration_district → district)
- Fixed assertion example using invalid date format (circa 1825 → ABT 1825)
- Removed undocumented birth_surname from person name example
- Fixed broken anchor link in repository.md (#repository-properties → #repository-properties-vocabulary)
- Standardized all event examples to use subject role consistently (replaced remaining principal usages)
- Fixed Event date field type from string/object to string (object form was never documented)
- Fixed Event See Also to say Person "participates in events" instead of "contains event references"
- Fixed broken relative links in 1-introduction.md and specification/README.md
- Fixed residence reference type example in 2-core-concepts.md to use temporal format
- Added minimum participant count (at least 2) to relationship fields table
- Removed stale Created At and Created By glossary entries
- Fixed glossary Event and Event Type definitions that incorrectly included occupation and residence
- Fixed labels: "Event/Fact" → "Event", "living status" → "birth/death dates"
- Replaced living: true boolean example with non-misleading property names
- Replaced "occupation" with "immigration" as event type example in 3 locations
- Fixed Event key properties ("description" → "notes") and Media key properties ("file path" → "URI") in entity-types README
- Fixed place types count from 14 to 15; added missing locality to place-types.glx standard vocabulary
- Fixed vocabulary directory structure example in core-concepts
GEDCOM Import
- Repository deduplication - Repositories with the same name and location are now deduplicated during import
- Dependency-ordered record processing - Records now grouped by type and processed in dependency order
- Repository-to-source linking - Sources now correctly link to their repository even when REPO records appear after SOUR records in the file
- NOTE reference resolution - Shared NOTE records now resolved to actual text content during import
- CONT/CONC text continuation - Long text fields spanning multiple lines now properly combined
- CR line ending support - GEDCOM files using CR-only line endings (old Mac Classic format) now import correctly
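For the CONT/CONC item above, the GEDCOM convention is that CONT starts a new line of text while CONC continues the current line with no separator. A minimal sketch, assuming the sub-lines have already been parsed into a hypothetical continuation type:

```go
package main

import (
	"fmt"
	"strings"
)

// continuation is a hypothetical, already-parsed CONT or CONC sub-line.
type continuation struct {
	Tag   string // "CONT" or "CONC"
	Value string
}

// joinText combines a value with its continuation lines: CONT starts a
// new line, CONC appends directly to the current line. Other tags are
// ignored.
func joinText(value string, subs []continuation) string {
	var b strings.Builder
	b.WriteString(value)
	for _, s := range subs {
		switch s.Tag {
		case "CONT":
			b.WriteByte('\n')
			b.WriteString(s.Value)
		case "CONC":
			b.WriteString(s.Value)
		}
	}
	return b.String()
}

func main() {
	text := joinText("Born in a sm", []continuation{
		{Tag: "CONC", Value: "all town."},
		{Tag: "CONT", Value: "Second line."},
	})
	fmt.Println(text)
}
```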
Code Quality & Robustness
- unmarshalVocab now returns error on missing YAML key - Previously silently returned nil when the expected top-level key was absent, causing downstream validation to think no vocabulary entries exist
- appendMediaID safe type assertion - Now handles []any (from YAML deserialization) instead of panicking on a bare type assertion to []string
- extensionFromMimeType deterministic output - MIME types with multiple extensions (.jpg/.jpeg, .tif/.tiff) now return a consistent preferred extension instead of random map iteration order
- Directory emptiness check error handling - isDirectoryEmpty now only treats io.EOF as "empty", not all errors (permissions, I/O failures now properly reported)
- Media file copy error handling - copyMediaFile now checks os.IsNotExist before fallback to URL-decoded paths, preserving original errors for permissions/disk issues
- BLOB character validation - decodeGEDCOMBlob now validates characters are in valid GEDCOM BLOB range ('.' to 'm') before decoding, preventing silent corruption
- EXID ID validation - GEDCOM external ID extraction now validates id field exists before use, skipping entries without usable IDs
- Event Properties initialization - extractEventDetails now ensures event.Properties map is initialized before writing, preventing panics
- Archive validation wiring - LoadArchiveWithOptions now correctly passes schemaValidate flag to serializer for referential integrity validation
- Property vocabulary documentation - Fixed value_type and reference_type field requirements (marked "No*" instead of "Yes*" to match "exactly one required" constraint)
- Test assertion completeness - TestRunValidate_MediaFileMissing now captures stdout and verifies warning is actually produced
- glx validate single file behavior - Validating a single file now only validates that file's structure instead of loading the entire current directory. Cross-reference validation is skipped for single files with a warning message. Directory validation still performs full cross-reference checks.
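The safe type assertion item can be illustrated with a type switch that accepts both slice shapes. This is a sketch of the general pattern only; toStringSlice is a hypothetical helper, and silently skipping non-string elements is an assumption about the desired behaviour.

```go
package main

import "fmt"

// toStringSlice converts a decoded value to []string without panicking.
// YAML decoders commonly produce []any for lists, so a bare v.([]string)
// assertion would panic on that input.
func toStringSlice(v any) []string {
	switch vals := v.(type) {
	case []string:
		return vals
	case []any:
		out := make([]string, 0, len(vals))
		for _, item := range vals {
			// Assumption: non-string elements are skipped rather than rejected.
			if s, ok := item.(string); ok {
				out = append(out, s)
			}
		}
		return out
	default:
		return nil
	}
}

func main() {
	decoded := []any{"media-1", "media-2", 42}
	ids := append(toStringSlice(decoded), "media-3")
	fmt.Println(ids)
}
```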
Removed
- Removed glx check-schemas CLI command - Moved to make check-schemas Makefile target; this is a repo-internal dev tool, not a user-facing command
Citation Entity
- Removed data_date, page, locator, and text_from_source direct fields - consolidated into properties
- Removed citation, coverage, and creator direct fields (creator consolidated into authors)
Event Entity
- Removed description field (use properties.description) and tags field
0.0.0-beta.2 - 2025-11-25
Added
GEDCOM Import (lib)
- GEDCOM 5.5.1 support - Import standard GEDCOM 5.5.1 files
- GEDCOM 7.0 support - Import GEDCOM 7.0 with new features
- GEDCOM 5.5.5 support - Import GEDCOM 5.5.5 specification samples
- Two-pass conversion - Entities first, then families for proper relationship handling
- Evidence chain mapping - GEDCOM SOUR tags → GLX Citations → GLX Assertions
- Place hierarchy building - Parse place strings into hierarchical Place entities
- Geographic coordinates - Extract MAP/LATI/LONG coordinates from GEDCOM
- Shared notes - Support for both GEDCOM 7.0 SNOTE and GEDCOM 5.5.1 NOTE records
- External IDs - Import GEDCOM 7.0 EXID tags (wikitree, familysearch, etc.)
- Comprehensive test coverage - 33 GEDCOM test files (5.5.1, 5.5.5, 7.0) successfully imported
- Large file support - Tested with files containing thousands of persons and events
- Edge case handling - Empty families, self-marriages, same-sex marriages, unknown genders
- Character encoding support - ASCII, UTF-8, Windows CP1252 (CRLF and LF)
GLX Serializer (lib)
- Single-file serialization - Convert GLX archives to single YAML files
- Multi-file serialization - Entity-per-file structure with random IDs
- Archive loading - Load both single-file and multi-file GLX archives
- Vocabulary embedding - Embed standard vocabularies using go:embed
- Vocabulary loading from directory - Load vocabularies from multi-file archives
- ID generation - Random 8-character hex IDs for entity filenames
- EntityWithID wrapper - Preserve entity IDs in multi-file format using _id field
- Collision detection - Retry logic for filename generation
- Configurable validation - Optional validation before serialization
- 12 standard vocabularies embedded in binary
- Round-trip preservation - Single→Multi→Single conversions preserve all data
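Random 8-character hex IDs with retry on collision could be generated along the lines below. The function names, the taken callback, and the attempt limit are hypothetical; the changelog states only that IDs are random 8-character hex strings and that filename generation retries on collision.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
)

// randomID returns 8 hex characters, i.e. 4 random bytes.
func randomID() (string, error) {
	buf := make([]byte, 4)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// newUniqueID keeps generating IDs until taken reports the ID is free,
// giving up after maxAttempts tries.
func newUniqueID(taken func(string) bool, maxAttempts int) (string, error) {
	for i := 0; i < maxAttempts; i++ {
		id, err := randomID()
		if err != nil {
			return "", err
		}
		if !taken(id) {
			return id, nil
		}
	}
	return "", errors.New("could not generate a unique ID")
}

func main() {
	existing := map[string]bool{}
	id, err := newUniqueID(func(id string) bool { return existing[id] }, 10)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	existing[id] = true
	fmt.Println("person-" + id + ".glx")
}
```

The "person-" prefix and ".glx" suffix in main are for illustration only.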
CLI Commands (glx)
- glx import - Import GEDCOM files to GLX format
  - Single-file and multi-file output formats
  - Optional vocabulary inclusion (default: true)
  - Optional validation (default: true)
  - Verbose mode with import statistics
  - Supports both GEDCOM 5.5.1 and 7.0
- glx split - Convert single-file GLX to multi-file format
  - Splits archive into entity-per-file structure
  - Includes standard vocabularies
  - Preserves entity IDs
- glx join - Convert multi-file GLX to single-file format
  - Combines multi-file archive into single YAML
  - Restores entity IDs from _id fields
Schema Enhancements
- Properties field added to 5 entity types for extensibility:
- Source - Store GEDCOM ABBR, EXID, custom tags
- Citation - Store event type cited, role, entry date
- Repository - Store FAX, additional contacts, EXID
- Media - Store crop coordinates, alternative titles, EXID
- Assertion - Store assertion metadata
- Backward compatible - Properties fields are optional with omitempty
Project Organization
- .claude/plans/ directory for all planning documents
- CLAUDE.md project context guide for AI assistants
- Plans README documenting all planning files and current status
- Moved all planning docs from docs/ to .claude/plans/
Vocabularies & Standards
- Developer documentation - GEDCOM import docs in glx/lib/doc.go
- User documentation - Updated Migration from GEDCOM Guide
- Automated import instructions
- Testing and validation procedures
- Import result expectations
Fixed
GEDCOM Import
- Malformed line recovery - Parser now handles MyHeritage export bug
- Recovers from NOTE fields with missing CONT/CONC prefixes
- Gracefully imports files with HTML-formatted notes
- Test case: queen.ged (4,683 persons, line 15903 missing CONT prefix)
- Family event handling - Added missing ANUL, DIVF, EVEN to case statement
- Place type references - Fixed gedcom_place.go to use "state" instead of "state_province"
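One way to picture the malformed line recovery described above: any line that does not begin with a GEDCOM level number is treated as a continuation of the preceding value and rewritten as a CONT line. The sketch below is only an illustration of that idea under stated assumptions; `startsWithLevel` and `repairLines` are hypothetical helpers, not the parser's real functions, and a note line that happens to start with a digit and a space would be misread by this simple check.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// startsWithLevel reports whether line begins with a level number followed
// by a space, e.g. "1 NOTE ..." or "2 CONT ...".
func startsWithLevel(line string) bool {
	i := 0
	for i < len(line) && line[i] >= '0' && line[i] <= '9' {
		i++
	}
	return i > 0 && i < len(line) && line[i] == ' '
}

// repairLines rewrites lines that lack a level prefix as CONT lines
// attached to the most recent well-formed line.
func repairLines(input string) []string {
	var out []string
	contLevel := 1 // level a repaired CONT line should use
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		line := sc.Text()
		if startsWithLevel(line) {
			fields := strings.SplitN(line, " ", 3)
			level, _ := strconv.Atoi(fields[0])
			if len(fields) > 1 && (fields[1] == "CONT" || fields[1] == "CONC") {
				contLevel = level // stay at the same continuation level
			} else {
				contLevel = level + 1 // continuations sit one level deeper
			}
			out = append(out, line)
			continue
		}
		if len(out) == 0 {
			out = append(out, line) // nothing to attach to; pass through
			continue
		}
		out = append(out, fmt.Sprintf("%d CONT %s", contLevel, line))
	}
	return out
}

func main() {
	in := "1 NOTE First line\n<p>stray html</p>\n1 NAME John /Smith/\n"
	for _, l := range repairLines(in) {
		fmt.Println(l)
	}
}
```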
Vocabularies
- Event types vocabulary - Fixed probate description ("Probate of estate" not "of will")
- Place types vocabulary - Removed duplicate state_province alias (use "state" instead)
- Schema categories - Updated allowed categories in vocabulary schemas
- Event types: Added "legal", "migration"; changed "custom" → "other"
- Place types: Added "institution"; changed "custom" → "other"
- Source types vocabulary - Added to embedded vocabularies (was missing)
Code Quality
- Clean architecture - Removed file I/O from library layer
- Moved importGEDCOMFromFile to test helpers (gedcom_test_helpers.go)
- CLI handles file operations, lib works with io.Reader
- Better separation of concerns
- File organization - Renamed gedcom_7_0.go → gedcom_shared.go (more accurate)
Testing & CI
- Multi-file vocabulary loading - Fixed LoadMultiFile to properly load vocabularies from directory
- Vocabulary preservation - Vocabularies now correctly preserved in round-trip conversions
- CI test coverage - Updated GitHub Actions to explicitly run all tests
- Large file tests (habsburg.ged: 34,020 persons)
- Added 15-minute timeout for comprehensive test runs
- No tests skipped in CI (no -short flag)
- Test documentation - Fixed queen.ged README with correct software attribution
- GEDCOM TITL handling - Now uses proper PersonPropertyTitle constant instead of hardcoded string
- GEDCOM name fields - Only populate name.fields from explicit GEDCOM substructure tags (GIVN, SURN, etc.), not inferred from parsing the name string
- Test data consistency - All testdata files updated to use unified name format
Removed
Attribute Event Types
- Removed attribute-type events from schema - Events are now strictly discrete occurrences with participants
- Removed from event.schema.json enum: residence, occupation, title, nationality, religion, education
- Removed census from event-types.glx vocabulary
- These attributes are now represented as temporal properties on Person entities
- Removed CENS (Census) event handling - Census records are skipped during GEDCOM import (TODO: re-implement as citations supporting property assertions)
- Converted RESI (Residence) to temporal property - GEDCOM RESI tags now create temporal residence properties on Person entities instead of events
Quality Ratings Support
- Removed quality_ratings vocabulary - The GEDCOM 0-3 Quality Assessment scale was removed from the GLX specification
- Deleted quality-ratings.glx vocabulary file
- Deleted quality-ratings.schema.json schema file
- Removed quality field from Citation entity
- Removed QualityRating type from Go code
- Removed auto-generated assertion confidence - GEDCOM imports no longer auto-populate assertion confidence levels
- Confidence levels should reflect researcher judgment, not be inferred from QUAY values
- GEDCOM QUAY tags are now preserved in citation notes (e.g., GEDCOM QUAY: 2)
Assertion Entity Fields
- Removed evidence_type field - Evidence quality classification belongs on citations, not assertions
- Removed type field - Redundant with claim field and tags for categorization
- Removed research_notes field - Consolidated into single notes field
Provenance Fields (All Entities)
- Removed modified_at, modified_by, created_at, created_by fields - Redundant with git history; use git log and git blame instead
Changed
Person Properties Schema
- Unified name property - Replaced fragmented name properties with single unified property
- Old: Separate given_name, family_name properties
- New: Single name property with value and optional fields breakdown
- Format: name: { value: "John Smith", fields: { given: "John", surname: "Smith" } }
- Supports temporal lists for name changes over time
- Fields include: prefix, given, nickname, surname_prefix, surname, suffix
- Added title property - Nobility or honorific titles (temporal, like occupation)
- Properly handles GEDCOM TITL tag imports
- Added PersonPropertyTitle constant
Vocabulary Updates
- person_properties vocabulary - Updated to reflect unified name structure
- name property now includes fields sub-schema for structured breakdown
- Added title property definition
Other
- Documentation structure - Separated user docs (docs/) from planning docs (.claude/plans/)
Technical Details
GEDCOM Import Coverage:
- 100% critical features implemented
- 94% high-priority features implemented
- PRODUCTION-READY status
- Comprehensive gap analysis completed
Serializer Features:
- Uses crypto/rand for ID generation
- 32 bits of randomness per ID (4.3 billion possible values)
- Collision probability: ~1 in 400,000 for each new ID generated against 10,000 existing entities (about 1.2% that any two of 10,000 IDs collide)
- EntityWithID wrapper pattern for multi-file format
- All 12 standard vocabularies embedded with go:embed
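As a rough check of the collision figures for 32-bit IDs, assuming IDs are drawn uniformly at random, the two relevant quantities can be computed directly: the chance that one new ID hits any of n existing IDs (n / 2^32), and the birthday-bound chance that any two of n IDs match. The function names below are illustrative only.

```go
package main

import (
	"fmt"
	"math"
)

const idSpace = 1 << 32 // 8 hex characters = 32 bits

// nextCollision is the chance that one new random ID matches
// any of n existing, distinct IDs.
func nextCollision(n float64) float64 { return n / idSpace }

// anyCollision approximates the chance that at least two of n random IDs
// match, using the birthday bound 1 - exp(-n(n-1)/(2N)).
func anyCollision(n float64) float64 {
	return 1 - math.Exp(-n*(n-1)/(2*idSpace))
}

func main() {
	n := 10000.0
	fmt.Printf("next ID collides: 1 in %.0f\n", 1/nextCollision(n))
	fmt.Printf("any pair collides: %.2f%%\n", 100*anyCollision(n))
	// prints:
	// next ID collides: 1 in 429497
	// any pair collides: 1.16%
}
```

The gap between the two numbers is why retry-on-collision logic matters for larger archives even though each individual draw is very unlikely to collide.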
Testing:
- All existing tests passing
- 48 new test cases for serializer
- 33 GEDCOM files tested for import (100% coverage of test files)
- Full round-trip serialization/deserialization tests
- Vocabulary preservation tests for both single-file and multi-file formats
- Comprehensive unit and integration tests
- Large file stress tests (3000+ persons, 4000+ events)
0.0.0-beta.1 - 2025-11-18
Fixed
- Fixed GitHub release workflow to build on beta tags (v*.*.*-beta* pattern)
- Fixed VitePress build by adding shiki dependency to website/package.json
Changed
- Removed roadmap section from README (no longer maintaining public roadmap)
Removed
- Removed archive folder containing old planning documents
0.0.0-beta.0 - 2025-11-14
Added
Specification & Standards
- Complete GENEALOGIX specification defining modern, evidence-first genealogy data standard
- 9 core entity types with full JSON Schema definitions:
- Person (individuals with biographical properties)
- Relationship (family connections with types and dates)
- Event (life events with sources and locations)
- Assertion (evidence-backed claims with quality assessment)
- Citation (evidence references with source quotations)
- Source (primary/secondary evidence documentation)
- Repository (physical storage information)
- Place (geographic locations with coordinate data)
- Participant (individuals involved in events)
- Repository-owned controlled vocabularies for extensibility
- Git-native architecture for version control and collaboration
- YAML-based human-readable format with schema validation
CLI Tool (glx)
- glx init: Initialize new GLX repositories with optional single-file mode
- glx validate: Comprehensive validation with:
- Schema compliance checking against JSON Schemas
- Cross-reference integrity verification across all files
- Vocabulary constraint validation
- Detailed error reporting with file/line locations
- glx check-schemas: Utility for verifying schema metadata and structure
- Support for both directory-based and single-file archives
- Cross-file entity resolution and validation
Documentation & Examples
- Comprehensive specification documentation (6 core documents)
- Complete examples demonstrating various use cases:
- Minimal single-file archive
- Basic family structure with multiple generations
- Complete family with all entity types
- Participant assertions workflow
- Temporal properties and date ranges
- Development guides covering:
- Architecture and design decisions
- Schema development practices
- Testing framework and test suite structure
- Local development environment setup
- User guides including:
- Quick-start guide for new users
- Best practices and recommendations
- Common pitfalls and troubleshooting
- Manual migration guide for converting from GEDCOM format
- Glossary of key terms and concepts
Testing & Quality Assurance
- Comprehensive test suite with:
- Valid example fixtures demonstrating correct usage
- Invalid example fixtures testing error handling
- Cross-reference validation tests
- Vocabulary constraint tests
- Schema compliance validation tests
- Automated CI/CD pipeline using GitHub Actions
- Full code coverage reporting
Project Infrastructure
- Apache 2.0 open-source license
- Community guidelines and code of conduct
- Contributing guidelines for developers
- GitHub issue and discussion templates
- Development container configuration for consistent environments
- Pre-configured VitePress documentation site