FastEHR.dataloader.tokenizers_local.base

Classes

TokenizerBase

Base class for custom tokenizers

Module Contents

class FastEHR.dataloader.tokenizers_local.base.TokenizerBase

Base class for custom tokenizers

property vocab_size
property fit_description
static event_frequency(meta_information, include_measurements=True, include_diagnoses=True) → polars.DataFrame

Get a polars DataFrame with three columns: event, count, and relative frequency.

Returns

┌──────────────────────────┬─────────┬───────────┐
│ EVENT                    ┆ COUNT   ┆ FREQUENCY │
│ ---                      ┆ ---     ┆ ---       │
│ str                      ┆ u32     ┆ f64       │
╞══════════════════════════╪═════════╪═══════════╡
│ <event name 1>           ┆ n1      ┆ p1        │
│ <event name 2>           ┆ n2      ┆ p2        │
│ …                        ┆ …       ┆ …         │
└──────────────────────────┴─────────┴───────────┘
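
The three columns can be illustrated with a plain-Python sketch. It is not the library's implementation: `event_frequency_sketch` is a hypothetical helper, it takes a flat list of event names rather than `meta_information`, and the ordering by descending count is an assumption.

```python
from collections import Counter


def event_frequency_sketch(events):
    """Return (event, count, frequency) rows for a flat list of event names.

    Hypothetical helper for illustration only; the real method reads
    meta_information and returns a polars.DataFrame.
    """
    counts = Counter(events)
    total = sum(counts.values())
    # Relative frequency is each event's count divided by the total count.
    # Sorting by descending count is an assumption, not documented behaviour.
    return [(event, n, n / total) for event, n in counts.most_common()]


rows = event_frequency_sketch(["bmi", "bmi", "hypertension", "bmi"])
for event, count, frequency in rows:
    print(event, count, frequency)
```

Each row corresponds to one line of the EVENT / COUNT / FREQUENCY table, and the frequencies sum to 1 across all events.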

abstract fit(event_counts: polars.DataFrame, **kwargs)
encode(sequence: list[str])

Take a list of strings and output a list of integers.

decode(sequence: list[str])
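
To show how `fit`, `encode`, `decode`, and `vocab_size` might fit together, here is a minimal sketch. `ToyTokenizer` is a hypothetical stand-in and does not subclass `TokenizerBase`. Its `fit` takes a list of `(event, count)` pairs instead of a polars DataFrame, the reserved `PAD` and `UNK` tokens are assumptions, and `decode` is assumed to map integer ids back to event names even though the signature above annotates its argument as `list[str]`.

```python
class ToyTokenizer:
    """Hypothetical stand-in that mimics the documented interface."""

    def __init__(self):
        self._stoi = {}  # event name -> integer id
        self._itos = {}  # integer id -> event name

    @property
    def vocab_size(self):
        return len(self._stoi)

    def fit(self, event_counts):
        # event_counts: list of (event, count) pairs (assumed shape).
        # Reserved tokens come first, then events by descending count.
        ordered = sorted(event_counts, key=lambda row: -row[1])
        tokens = ["PAD", "UNK"] + [event for event, _ in ordered]
        self._stoi = {token: i for i, token in enumerate(tokens)}
        self._itos = {i: token for token, i in self._stoi.items()}

    def encode(self, sequence):
        # Events not seen during fit map to the UNK id (an assumption).
        unk = self._stoi["UNK"]
        return [self._stoi.get(token, unk) for token in sequence]

    def decode(self, sequence):
        # Assumed inverse of encode: integer ids back to event names.
        return [self._itos[i] for i in sequence]


tokenizer = ToyTokenizer()
tokenizer.fit([("bmi", 3), ("hypertension", 1)])
encoded = tokenizer.encode(["bmi", "unseen_event", "hypertension"])
decoded = tokenizer.decode(encoded)
```

In this sketch an unseen event round-trips to `UNK`, so `decode(encode(x))` only reproduces `x` for events that were present when the tokenizer was fitted.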