FastEHR.database.collector
==========================

.. py:module:: FastEHR.database.collector


Classes
-------

.. autoapisummary::

   FastEHR.database.collector.SQLiteDataCollector


Module Contents
---------------

.. py:class:: SQLiteDataCollector(db_path: str)

   A class to interface with an SQLite database to collect and collate patient records.

   This class provides functionality for extracting structured patient data,
   aggregating medical events, and computing metadata for pre-processing from an
   SQLite database.

   Inherits from:

   * :class:`Static` - Handles static patient data, such as birth year and ethnicity.
   * :class:`Diagnoses` - Handles diagnosis-related records.
   * :class:`Measurements` - Handles event-based measurements, which may optionally
     include an associated value.

   Attributes
   ----------
   db_path : str
       Path to the SQLite database file.
   connection : sqlite3.Connection
       SQLite connection object, initialized when ``connect()`` is called.
   cursor : sqlite3.Cursor
       Cursor for executing SQL queries.

   Methods
   -------
   connect()
       Establish the SQLite database connection.
   disconnect()
       Close the SQLite database connection.
   _extract_distinct()
       Extracts distinct values of a given column across multiple tables.
   _extract_AGG()
       Performs grouped aggregations over tables.
   _t_digest_values()
       Uses the t-digest algorithm to approximate percentiles of a given measurement.
   _generate_lazy_by_distinct()
       Generates Polars LazyFrames for distinct patient or practice identifiers.
   _collate_lazy_tables()
       Merges static and dynamic patient records into a single LazyFrame.
   get_meta_information()
       Collects metadata from the SQLite database, including distributions of
       diagnoses and measurements.

   Initializes the SQLiteDataCollector.

   Parameters
   ----------
   db_path : str
       Path to the SQLite database file.


   .. py:attribute:: db_path


   .. py:attribute:: connection
      :value: None


   .. py:attribute:: cursor
      :value: None


   .. py:method:: connect()

      Establishes a connection to the SQLite database.

      If the connection is already established, this method does nothing.

      Raises
      ------
      sqlite3.Error
          If an error occurs while connecting to the database.


   .. py:method:: disconnect()

      Closes the SQLite database connection.

      This method ensures that both the connection and cursor are properly closed.


   .. py:method:: get_meta_information(practice_ids: Optional[list] = None, static: bool = True, diagnoses: bool = True, measurement: bool = True) -> dict

      Collects metadata from the SQLite database, such as distributions of
      diagnoses and measurements.

      Parameters
      ----------
      practice_ids : list, optional
          List of practice IDs to filter metadata collection (default is None).
      static : bool, optional
          Whether to collect static patient information (default is True).
      diagnoses : bool, optional
          Whether to collect diagnosis-related metadata (default is True).
      measurement : bool, optional
          Whether to collect measurement-related metadata (default is True).

      Returns
      -------
      dict
          A dictionary containing metadata tables.
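
The connection lifecycle documented for ``connect()`` and ``disconnect()`` (a
``connect()`` that does nothing when a connection already exists, and a
``disconnect()`` that closes both the cursor and the connection) can be sketched
with the standard ``sqlite3`` module. The ``ConnectionLifecycle`` class below is a
hypothetical stand-in written for illustration only; it is not FastEHR's
implementation. The in-memory database path and the choice to reset both
attributes to ``None`` after closing are assumptions.

.. code-block:: python

   import sqlite3


   class ConnectionLifecycle:
       """Hypothetical stand-in mirroring the documented connect/disconnect contract."""

       def __init__(self, db_path: str):
           self.db_path = db_path
           self.connection = None  # set by connect()
           self.cursor = None      # set by connect()

       def connect(self) -> None:
           # Do nothing if a connection is already established.
           if self.connection is not None:
               return
           # sqlite3.connect raises sqlite3.Error (or a subclass) on failure.
           self.connection = sqlite3.connect(self.db_path)
           self.cursor = self.connection.cursor()

       def disconnect(self) -> None:
           # Close the cursor first, then the connection.
           # Resetting both to None is an assumption of this sketch.
           if self.cursor is not None:
               self.cursor.close()
               self.cursor = None
           if self.connection is not None:
               self.connection.close()
               self.connection = None


   collector = ConnectionLifecycle(":memory:")  # in-memory database for the demo
   collector.connect()
   first_connection = collector.connection
   collector.connect()  # second call leaves the existing connection in place
   same_connection = collector.connection is first_connection
   collector.disconnect()

With FastEHR installed, the same pattern would presumably apply to
``SQLiteDataCollector``: construct it with a database path, call ``connect()``,
request metadata with ``get_meta_information()``, then call ``disconnect()``.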