Changes in version 0.3.8

Bug fixes

- brreg_sync() now populates paategninger state. The enheter bulk download (CSV) only carries påtegninger as a boolean presence flag, not the annotation content, so extract_paategninger() previously always produced empty state and brreg_annotations() returned nothing after a sync. The bootstrap now reads the flag (column annotations) and fetches the actual annotation content per flagged entity from the enheter endpoint. New internal helper fetch_entity_paategninger().

Changes in version 0.3.7

New features

- New bundled dataset [annotation_infotypes]: maps brreg påtegning infotype codes to English descriptions. Sourced from the brreg API reference (NAVN, FADR) and codes observed in live data (role codes used for missing-role annotations); unknown codes pass through.
- brreg_annotations() gains a translate argument. With translate = TRUE an infotype_desc column with English descriptions (from annotation_infotypes) is added after infotype. Default FALSE, so existing behaviour is unchanged.

Documentation

- field_dict documentation now lists numeric among the coercion types (added in 0.3.6 for capital_shares).

Changes in version 0.3.6

Bug fixes

- field_dict: capital_shares (kapital.antallAksjer) is now typed numeric instead of integer. The share count exceeds the 32-bit integer range for large-cap entities (e.g. Equinor ASA, 2,556,807,512 shares), so coerce_types() silently produced NA with an "NAs introduced by coercion to integer range" warning. It now retains the value, consistent with the other kapital.* fields.
- Bundled role_types and role_groups: Norwegian role names (Observatør, Regnskapsfører, Forretningsfører, FFØR, Helse, miljø og sikkerhet) are now stored as UTF-8 escapes in data-raw/build_dictionaries.R and saved with UTF-8 encoding marking. String values are byte-identical to before; this only clears the R CMD check "non-ASCII strings" data warning.

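The capital_shares overflow described in the 0.3.6 bug fixes can be reproduced in base R. The following is a minimal sketch, assuming the raw value arrives as a character string; the variable names are illustrative and are not part of the package.

```r
# Illustrative only: these variable names are assumptions, not package internals.
shares_raw <- "2556807512"            # share count quoted in the 0.3.6 entry

# R integers are 32-bit, so the largest representable value is 2147483647.
int_max <- .Machine$integer.max

# Integer coercion overflows and yields NA (with a coercion warning).
as_int <- suppressWarnings(as.integer(shares_raw))

# Double-precision numeric holds the value exactly.
as_num <- as.numeric(shares_raw)

print(c(integer = as_int, numeric = as_num))
```

Doubles represent every whole number up to 2^53 exactly, which is why a numeric column is sufficient for share counts of this size.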
Internal

- read_changelog(): the arrow branch now references the partition column as .data$sync_date and imports the rlang::.env pronoun, removing the "no visible binding for global variable" check note. Runtime behaviour is unchanged.
- Tests: field_dict invariants are aligned with the v0.3.5 dictionary, in which api_path is the unique key (multiple API-spelling variants map to a single col_name) and numeric is an accepted type.

Changes in version 0.3.4

Roller CDC: field-level change detection

- bootstrap_state() now initializes the CDC cursor to the current tip (max event ID) at bootstrap time via get_cdc_tip(). Previously the cursor remained at 0 after bootstrap, causing the first CDC poll to replay the entire event history (~4.4M roller events, ~24M enheter events). This caused OOM and timeout failures on Cloud Run.
- bootstrap_state(roller_method = "cdc") skips the roller totalbestand download entirely. It writes an empty state table and builds state incrementally via per-org brreg_roles() calls. This reduces bootstrap from 32 GiB RAM / 30+ minutes to <10 seconds and negligible memory.
- paginate_cdc() gains a max_pages parameter and a safety guard: if cursor_id == 0 (no prior sync), pagination caps at 5 pages and emits a cli::cli_warn(). This is a belt-and-suspenders guard that should never trigger after a correct bootstrap.
- diff_roller_state() (new, exported): computes field-level diffs between two flattened roller state tibbles. Returns a long-format changelog with change_type (entry/exit/change), field, value_from, value_to. Roles are keyed by a composite of (org_nr, role_group_code, role_code, holder_id) where holder_id is derived from person_id (person-held) or entity:{org_nr} (entity-held).
- brreg_sync(roller_method = "bulk"): new default strategy for roller CDC. Downloads the full totalbestand (~131 MB), diffs against stored state, and writes a field-level changelog. Replaces the per-org API pattern for full-register syncs.
- brreg_sync(roller_method = "cdc"): per-org API fallback for sub-daily syncs. Fetches current roles via brreg_roles() for each CDC event, diffs per-org. Slower but provides per-event timestamp attribution.
- flatten_roles() gains 4 new columns: deregistered (avregistrert), ordering (rekkefolge), elected_by (valgtAv$kode), group_modified (sistEndret as Date).
- brreg_board_summary() now excludes resigned and deregistered roles from all counts and gains n_employee_elected (count of board members with a non-NA elected_by value).

Performance

- flatten_roles_bulk_fast() (internal): vectorized two-pass flatten for bulk totalbestand. Pre-allocates vectors and fills by index. 4.1× faster than the per-entity flatten_roles() path (4,192 vs 1,028 roles/sec).
- read_roles_json() (internal): dispatches to yyjsonr when available (10× parse speed, 70× lower memory vs jsonlite). yyjsonr added to Suggests.
- lookup_role_vec() and lookup_role_group_vec() (internal): vectorized code-to-label lookups replacing per-row match().

Bug fixes

- extract_entity_name() no longer returns NA for entity-held roles. The brreg API returns enhet.navn as a JSON array ["ERNST & YOUNG AS"], not a named object. jsonlite parses this as an unnamed list, which the old code did not handle. Added unnamed list branch.
- read_roles_json() now decompresses .gz files to a temp file before passing to yyjsonr. yyjsonr cannot read gzipped files directly; the previous code crashed with a buffer allocation error on the 131 MB totalbestand.
- paginate_cdc_bounded() (internal) caps roller CDC pagination at 5 pages (50K events) when using roller_method = "bulk". The previous unbounded paginate_cdc() fetched the entire CDC history (1.1M+ events) from cursor 0 on first bootstrap, causing 30-minute timeouts.
- parse_sync_page() no longer produces tibble column size mismatches when CDC pages contain events without endringer (Ny/Sletting).
  raw_changes[[i]] <- NULL was deleting list elements instead of preserving NULL placeholders (R double-bracket assignment semantics). Fix: list() as empty placeholder. Affected all enheter and underenheter sync since v0.3.2.
- add_role_key() no longer crashes on 0-row tibbles. Previously, case_when(df$person_id ...) received NULL instead of NA when passed a 0-column tibble from a 404 API response.
- apply_roller_events_cdc() skips diff_roller_state() when both old and new state are empty (entity not in state AND 404 from API).

Changes in version 0.3.3

Bug fixes

- brreg_update_fields() no longer silently drops Ny, Sletting, and Fjernet CDC events. Events with no endringer array now emit a synthetic row with operation = NA, field = NA, new_value = NA, preserving event metadata for downstream filtering and counting. Previously, filtering brreg_update_fields() output for change_type == "Ny" returned zero rows.
- flatten_page_patches() and parse_patch() now handle RFC 6902 move operations correctly: the value is written to the destination path and a synthetic remove row is emitted for the source path (from $from). copy operations already worked but now follow the same explicit dispatch path.
- The roxygen docstring for brreg_update_fields() no longer carries stale references to RcppSimdJson and parallel processing (both removed in 0.3.2). Documentation now accurately describes the sequential fetch-and-flatten loop and the synthetic row behaviour for Ny/Sletting/Fjernet events.

Field dictionary

- field_dict grows from 62 to 70 rows.
  New entries:
  - fravalgRevisjonDato → audit_exemption_date (Date)
  - fravalgRevisjonBeslutningsDato → audit_exemption_decision_date (Date)
  - registreringsdatoMerverdiavgiftsregisteretEnhetsregisteret → vat_registration_date_er (Date)
  - registreringsdatoAntallAnsatteEnhetsregisteret → employee_reg_date_er (Date)
  - registreringsdatoAntallAnsatteNavAaregisteret → employee_reg_date_nav (Date)
  - oppstartsdato → start_date (Date): underenhet operations start date
  - registrertIPartiregisteret → in_party_register (logical)
  - respons_klasse → response_class (character): API response metadata class

Sync engine

- find_state_column() gains mappings for all 8 new field_dict entries. CDC field changes for audit exemption dates, employee registration dates, VAT registration date in Enhetsregisteret, underenhet start dates, and party register membership are no longer silently skipped during brreg_sync().

Changes in version 0.3.2

Event-sourcing sync engine

- brreg_sync(): maintains a local mirror of the Enhetsregisteret by applying incremental CDC events to persistent parquet state files. On first run, bootstraps from bulk download. Subsequent runs poll from the last cursor position. Write ordering (changelog → state → cursor) ensures crash-safe idempotent replay.
- brreg_sync_status(): displays state file sizes, cursor positions, last sync time, and changelog partition count.
- Four state files maintained: enheter.parquet, underenheter.parquet, roller.parquet, paategninger.parquet.
- Hive-partitioned changelog under state/changelog/sync_date=.../ for efficient date-range queries via arrow::open_dataset().

Registry annotations (påtegninger)

- brreg_annotations(): query the påtegninger state table by org_nr and/or infotype code. Påtegninger are registry-level annotations about entity data quality and are the earliest formal signal of entity distress, preceding forced dissolution by weeks to months.
- brreg_annotation_summary(): count entities with active annotations grouped by infotype.
- Påtegninger treated as a conceptually distinct fourth data stream alongside enheter, underenheter, and roller.

Unified change tracking

- brreg_changes(): query the changelog for field-level mutations across all four streams. Filter by track (field names), registry, change_type, date range, and org_nr.
- brreg_change_summary(): count changes by registry, type, field.
- brreg_flows() now auto-detects the changelog when called with no arguments: brreg_flows() reads from the sync changelog, brreg_flows(data) uses the original bulk + CDC path.

Changes in version 0.3.1

New functions

- brreg_network(): build entity ego-network graphs as tbl_graph objects. Depth 0 (seed only), depth 1 (sub-units, children, roles, legal roles via API), depth 2 (board interlocks via local bulk data). Extensible collector pattern for future relationship types.
- brreg_underenheter(): convenience wrapper to get all sub-units (BEDR/AAFY) belonging to a parent entity.
- brreg_children(): get child enheter in the organisational hierarchy (e.g. Stortinget → Riksrevisjonen).
- brreg_status(): check local bulk data availability for all three registry types.

Changes

- brreg_entity() now defaults to registry = "auto", trying enheter first then falling back to underenheter on 404. Output gains a registry column. Explicit registry = "enheter" or registry = "underenheter" skips the fallback.
- Bulk data resolution uses Arrow lazy-load for all three types (was roller-only). Session cache in .brregEnv avoids re-reading parquet files across repeated calls. Per-type lazy pipeline in depth-2 expansion early-exits when no new entities are discovered.

Infrastructure

- Docker CI matrix simplified to R 4.4.1 only (R 4.3.3 image was never built; multi-version coverage via standard R-CMD-check).

Changes in version 0.3.0

Documentation

- pkgdown site with 10 reference groups deployed to GitHub Pages.
- 5 vignettes: Getting started, Norwegian business data, Building firm panels, Corporate governance research, Package architecture.
- ARCHITECTURE.md (390 lines) documenting full data flow.
- CONTRIBUTING.md, CODE_OF_CONDUCT.md, GitHub issue templates.
- Hex sticker logo at man/figures/logo.svg.
- Lifecycle experimental badge in README.
- r-universe registration for binary installs.
- Install instructions updated: pak (recommended), r-universe, remotes.

Test coverage

- New test files for brreg_manifest(), brreg_replay(), brreg_series(), and as_brreg_tsibble().

Changes in version 0.2.0

New features

Snapshot engine

- brreg_snapshot() downloads and saves dated bulk register extracts as Hive-partitioned Parquet files. Supports type = "enheter", "underenheter", and "roller" (via /roller/totalbestand). Raw .gz files are preserved alongside processed Parquet for provenance.
- brreg_import() adds user-supplied historical CSVs as snapshot partitions, normalizing column names via field_dict.
- brreg_snapshots() lists available snapshots with dates, sizes, and paths.
- brreg_open() opens the partitioned dataset as a lazy Arrow Dataset (requires the arrow package).
- brreg_data_dir() returns the snapshot store path. Override with options(brreg.data_dir = "/custom/path").
- brreg_cleanup() prunes old partitions by count or age.

Provenance manifest

- brreg_manifest() reads the JSON provenance catalog recording every download: endpoint URL, download timestamp, Last-Modified header (data vintage date), ETag, file hash, record count, and file paths. Used to bridge snapshots to CDC updates without gaps.

Panel construction

- brreg_panel() constructs firm x period panels at annual, quarterly, monthly, or custom cadence from accumulated snapshots. Uses LOCF (last observation carried forward) date resolution.
- brreg_replay() reconstructs register state at arbitrary dates from a base snapshot + CDC update stream.
  Applies Ny/Endring/Sletting events chronologically via a dplyr::rows_upsert() pattern.
- brreg_events() diffs two snapshots: entries, exits, field changes with both old and new values.
- brreg_series() computes aggregate time series for any combination of variables (.vars), summary functions (.fns), and grouping columns (by). Output columns named {variable}_{function}.
- as_brreg_tsibble() converts panel or series output to tsibble with regular = FALSE for the tidyverts ecosystem.

Entity and sub-entity access

- brreg_entity() and brreg_search() gain registry = "underenheter" parameter for sub-entity (establishment) lookups and search.
- brreg_roles_legal() performs reverse role lookup: what roles does entity X hold in other entities (parent, shareholder, partner).

Bulk downloads

- brreg_download() supports type = "roller" via /roller/totalbestand (131 MB gzipped JSON).
- brreg_download(format = "json") downloads the full JSON bulk for enheter/underenheter via /enheter/lastned.
- Algorithmic JSON unnesting flattens all list columns to atomic types (character vectors collapsed, data frames serialized, HAL links dropped). Shared rename_and_coerce() pipeline for both CSV and JSON paths.

Concordance

- brreg_harmonize_kommune() remaps municipality codes across Norway's 2020 municipal reform and 2024 county reversals using SSB KLASS.
- brreg_harmonize_nace() remaps NACE SN2007 to SN2025 via SSB KLASS.

Governance research

- brreg_board_network() builds director interlock networks as tbl_graph objects (requires tidygraph).
- brreg_survival_data() prepares firm survival data with time-to-event and right-censoring indicators compatible with survival::Surv().

CDC updates

- brreg_updates(type = "roller") fetches role change events in CloudEvents format (different schema from enheter/underenheter).

Dependency changes

- Added jsonlite to Imports (roller bulk JSON parsing).
- Added nanoparquet (>= 0.3.0), tsibble to Suggests.
- Removed igraph, plm, fixest, collapse, sf, ggraph from Suggests.

Changes in version 0.1.0

- Initial release.
- Entity lookup (brreg_entity()), filtered search (brreg_search()), board/officer roles (brreg_roles(), brreg_board_summary()).
- Full register bulk download (brreg_download()).
- Incremental CDC updates (brreg_updates()).
- Code-to-label translation (brreg_label(), get_brreg_dic()).
- Organization number validation (brreg_validate()).
- Reference datasets: field_dict, legal_forms, role_types, role_groups.
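
The 0.1.0 entry lists organization number validation without describing the check itself. As an illustration only, the sketch below implements the standard MOD-11 check-digit rule for nine-digit Norwegian organization numbers. The helper name is_valid_org_nr() is hypothetical, and whether brreg_validate() works this way internally is an assumption.

```r
# Hypothetical helper for illustration; this is not the package's brreg_validate().
is_valid_org_nr <- function(x) {
  x <- as.character(x)
  # Must be exactly nine digits.
  if (!grepl("^[0-9]{9}$", x)) return(FALSE)

  digits  <- as.integer(strsplit(x, "")[[1]])
  weights <- c(3L, 2L, 7L, 6L, 5L, 4L, 3L, 2L)

  # Weighted sum of the first eight digits, reduced modulo 11.
  remainder <- sum(digits[1:8] * weights) %% 11L
  check     <- if (remainder == 0L) 0L else 11L - remainder

  # A computed check value of 10 cannot be a single digit, so it is invalid.
  check != 10L && check == digits[9]
}

is_valid_org_nr("923609016")  # weighted sum 126, remainder 5, check digit 6
is_valid_org_nr("923609017")  # same prefix, wrong check digit
```

Inputs are passed as strings here because as.character() on a large round number can produce scientific notation, which would fail the nine-digit pattern.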