# Loading data

pykinbiont represents growth curves as {py:class}`~pykinbiont.GrowthData` (shared time grid) or {py:class}`~pykinbiont.IrregularGrowthData` (per-curve time points).

## GrowthData

`GrowthData` stores a matrix of `n_curves × n_timepoints` optical density (OD) values on a **shared** time grid. This matches the typical output of a plate reader.

### From a CSV file

The expected CSV layout is: **first column = time**, remaining columns = wells/curves, with column headers used as labels:

```
time,Well_A1,Well_A2,Well_B1
0.0,0.012,0.011,0.010
0.5,0.014,0.013,0.012
...
```

```python
from pykinbiont import GrowthData

data = GrowthData.from_csv("plate_reader.csv")
print(f"{len(data.labels)} curves, {len(data.times)} time points")
```

### From a pandas DataFrame

```python
import pandas as pd
from pykinbiont import GrowthData

df = pd.read_csv("plate_reader.csv")
data = GrowthData.from_dataframe(df)
```

`from_dataframe` treats the first column as time regardless of its name.

### From NumPy arrays

```python
import numpy as np
from pykinbiont import GrowthData

times = np.linspace(0, 20, 100)   # shape (100,)

# Placeholder OD traces for illustration only; substitute your measured curves
curve_A1 = 0.01 + 1.0 / (1.0 + np.exp(-(times - 8.0)))
curve_A2 = 0.01 + 0.8 / (1.0 + np.exp(-(times - 10.0)))
curves = np.stack([curve_A1, curve_A2])   # shape (2, 100)

data = GrowthData(
    curves=curves,
    times=times,
    labels=["A1", "A2"],
)
```

The `curves` array must be 2-D with shape `(n_curves, n_timepoints)`.

### Subsetting

Select a subset of wells by label:

```python
subset = data[["Well_A1", "Well_B1"]]
print(subset.labels)   # ["Well_A1", "Well_B1"]
```

## IrregularGrowthData

Use `IrregularGrowthData` when curves have **different time points** (e.g., multiple experiments merged, or manual sampling at unequal intervals). pykinbiont automatically resamples all curves onto a shared `[0, 1]` union grid via linear interpolation.
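To make the resampling step concrete, here is a minimal NumPy-only sketch of one way such a grid could be built. It is not pykinbiont's implementation. It assumes that time is min-max normalised using the global minimum and maximum across all curves, and that the grid runs from 0 to 1 in increments of `step`; both are assumptions, and the helper name `resample_to_unit_grid` is hypothetical.

```python
import numpy as np


def resample_to_unit_grid(raw_times, raw_curves, step=0.01):
    """Hypothetical helper: normalise time to [0, 1], then interpolate linearly.

    Assumption: normalisation uses the global time range over all curves.
    """
    t_min = min(t.min() for t in raw_times)
    t_max = max(t.max() for t in raw_times)

    # Evenly spaced grid on [0, 1]; step=0.01 gives 101 points
    n_grid = int(round(1.0 / step)) + 1
    grid = np.linspace(0.0, 1.0, n_grid)

    rows = []
    for t, y in zip(raw_times, raw_curves):
        t_norm = (t - t_min) / (t_max - t_min)
        # np.interp interpolates linearly and holds the end values constant
        # outside the sampled range of each curve
        rows.append(np.interp(grid, t_norm, y))
    return grid, np.vstack(rows)


times_A = np.array([0.0, 1.0, 2.5, 5.0, 10.0, 20.0])
times_B = np.array([0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
od_A = np.array([0.01, 0.02, 0.05, 0.20, 0.80, 1.10])
od_B = np.array([0.01, 0.015, 0.03, 0.08, 0.35, 0.90, 1.15])

grid, resampled = resample_to_unit_grid([times_A, times_B], [od_A, od_B])
print(resampled.shape)   # (2, 101)
```

Under these assumptions, the second curve stops at normalised time 0.8, so its last measured value is carried forward to the end of the grid. How pykinbiont treats such tails is not specified here. In pykinbiont the resampling is handled by the `IrregularGrowthData` constructor, as in the library example that follows.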

```python
import numpy as np
from pykinbiont import IrregularGrowthData

# Each curve has its own time vector
times_A = np.array([0.0, 1.0, 2.5, 5.0, 10.0, 20.0])
times_B = np.array([0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
od_A = np.array([0.01, 0.02, 0.05, 0.20, 0.80, 1.10])
od_B = np.array([0.01, 0.015, 0.03, 0.08, 0.35, 0.90, 1.15])

igd = IrregularGrowthData(
    raw_curves=[od_A, od_B],
    raw_times=[times_A, times_B],
    labels=["Strain_A", "Strain_B"],
    step=0.01,   # union grid resolution in normalised [0,1] time
)

print(igd.curves.shape)   # (2, n_grid), resampled
print(igd.times[:5])      # normalised [0,1] grid
```

The original data is preserved in `igd.raw_curves` and `igd.raw_times`. `fit()` and `preprocess()` accept `IrregularGrowthData` directly.

## Accessing arrays

| Attribute | Shape | Description |
|---|---|---|
| `data.curves` | `(n, T)` | OD matrix (read-only) |
| `data.times` | `(T,)` | Shared time grid (read-only) |
| `data.labels` | `list[str]` | Curve identifiers |
| `data.clusters` | `(n,)` or `None` | Cluster assignments (after `preprocess`) |
| `data.centroids` | `(k, T)` or `None` | Cluster centroids (after `preprocess`) |
| `data.wcss` | `float` or `None` | Within-cluster sum of squares |
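
As a reading aid for the shapes in the table, the within-cluster sum of squares is conventionally the sum of squared distances from each curve to its assigned centroid. The sketch below computes that quantity with NumPy on made-up arrays of the documented shapes. It assumes cluster assignments are zero-based integer indices into the rows of the centroid matrix, which may not match how pykinbiont encodes them, and the function name `wcss_from_arrays` is hypothetical.

```python
import numpy as np


def wcss_from_arrays(curves, clusters, centroids):
    """Hypothetical helper: sum of squared curve-to-centroid residuals.

    curves:    (n, T) OD matrix
    clusters:  (n,) zero-based cluster index per curve (assumption)
    centroids: (k, T) one centroid curve per cluster
    """
    # Fancy indexing picks the assigned centroid for every curve: shape (n, T)
    residuals = curves - centroids[clusters]
    return float(np.sum(residuals ** 2))


# Toy data: four curves, two time points, two clusters
curves = np.array([[0.0, 1.0], [0.2, 1.2], [1.0, 0.0], [1.0, 0.4]])
clusters = np.array([0, 0, 1, 1])
centroids = np.array([[0.1, 1.1], [1.0, 0.2]])   # per-cluster means

wcss = wcss_from_arrays(curves, clusters, centroids)
print(round(wcss, 6))   # 0.12
```

Whether `data.wcss` uses exactly this definition (for example, with or without any normalisation) is not stated on this page, so treat the sketch as an illustration of the array shapes and not as a specification.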