Loading data#
pykinbiont represents growth curves as GrowthData (shared time grid)
or IrregularGrowthData (per-curve time points).
GrowthData#
GrowthData stores OD values as an n_curves × n_timepoints matrix, with every curve sampled on the same shared time grid.
This matches the typical output of a plate reader.
From a CSV file#
The expected CSV layout is: first column = time, remaining columns = wells/curves, with column headers used as labels:
time,Well_A1,Well_A2,Well_B1
0.0,0.012,0.011,0.010
0.5,0.014,0.013,0.012
...
from pykinbiont import GrowthData
data = GrowthData.from_csv("plate_reader.csv")
print(f"{len(data.labels)} curves, {len(data.times)} time points")
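If your instrument exports data in another shape, one option is to write the expected layout yourself before calling from_csv. The sketch below uses only the standard library; write_plate_csv is a hypothetical helper introduced for illustration, not part of pykinbiont, and it assumes every well has exactly one OD value per time point.

```python
import csv
import io


def write_plate_csv(handle, times, wells):
    """Write a time column followed by one column per well.

    `wells` maps a label to a sequence of OD values, one per time point.
    Hypothetical helper, not part of pykinbiont.
    """
    labels = list(wells)
    for label in labels:
        if len(wells[label]) != len(times):
            raise ValueError(
                f"{label}: expected {len(times)} values, got {len(wells[label])}"
            )
    writer = csv.writer(handle, lineterminator="\n")
    # Header row: the time column first, then the well labels
    writer.writerow(["time", *labels])
    # One row per time point
    for i, t in enumerate(times):
        writer.writerow([t, *(wells[label][i] for label in labels)])


buffer = io.StringIO()
write_plate_csv(
    buffer,
    times=[0.0, 0.5],
    wells={
        "Well_A1": [0.012, 0.014],
        "Well_A2": [0.011, 0.013],
        "Well_B1": [0.010, 0.012],
    },
)
csv_text = buffer.getvalue()
print(csv_text)
```

Passing an open file handle instead of the StringIO buffer writes the same layout to disk.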
From a pandas DataFrame#
import pandas as pd
from pykinbiont import GrowthData
df = pd.read_csv("plate_reader.csv")
data = GrowthData.from_dataframe(df)
from_dataframe treats the first column as time regardless of its name.
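Because the time column is identified by position, a DataFrame whose time column sits elsewhere needs reordering first. A minimal sketch in plain pandas follows; move_time_first and the column name "hours" are hypothetical and used only for illustration.

```python
import pandas as pd


def move_time_first(df: pd.DataFrame, time_col: str) -> pd.DataFrame:
    """Return a copy of `df` with `time_col` as the first column.

    Hypothetical helper, not part of pykinbiont.
    """
    if time_col not in df.columns:
        raise KeyError(f"no column named {time_col!r}")
    others = [c for c in df.columns if c != time_col]
    # Selecting with a list of column names returns a reordered copy
    return df[[time_col, *others]]


raw = pd.DataFrame(
    {
        "Well_A1": [0.012, 0.014],
        "hours": [0.0, 0.5],  # time column is not first here
        "Well_A2": [0.011, 0.013],
    }
)
ordered = move_time_first(raw, "hours")
print(list(ordered.columns))
```

The reordered frame can then be handed to GrowthData.from_dataframe as shown above.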
From NumPy arrays#
import numpy as np
from pykinbiont import GrowthData
times = np.linspace(0, 20, 100) # shape (100,)
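# Placeholder OD traces for illustration; substitute your measured curves
curve_A1 = 0.01 * np.exp(0.20 * times)
curve_A2 = 0.01 * np.exp(0.15 * times)
curves = np.stack([curve_A1, curve_A2]) # shape (2, 100)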
data = GrowthData(
curves=curves,
times=times,
labels=["A1", "A2"],
)
The curves array must be 2-D with shape (n_curves, n_timepoints).
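A common mistake is passing a transposed matrix or a single 1-D curve. The following sketch checks the shape requirement in plain NumPy before the arrays reach GrowthData; check_curve_matrix is a hypothetical helper, and promoting a single 1-D curve to one row is a choice made for this example, not documented library behaviour.

```python
import numpy as np


def check_curve_matrix(curves, times):
    """Validate that `curves` has shape (n_curves, n_timepoints).

    Hypothetical helper, not part of pykinbiont.
    """
    curves = np.asarray(curves, dtype=float)
    times = np.asarray(times, dtype=float)
    if times.ndim != 1:
        raise ValueError(f"times must be 1-D, got shape {times.shape}")
    if curves.ndim == 1:
        # Assumption for this sketch: treat a lone curve as a one-row matrix
        curves = curves[np.newaxis, :]
    if curves.ndim != 2:
        raise ValueError(f"curves must be 2-D, got shape {curves.shape}")
    if curves.shape[1] != times.shape[0]:
        raise ValueError(
            f"curves has {curves.shape[1]} columns but times has "
            f"{times.shape[0]} points; is the matrix transposed?"
        )
    return curves, times


times = np.linspace(0, 20, 100)
single_curve = 0.01 * np.exp(0.1 * times)  # placeholder data
curves, times = check_curve_matrix(single_curve, times)
print(curves.shape)
```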
Subsetting#
Select a subset of wells by label:
subset = data[["Well_A1", "Well_B1"]]
print(subset.labels) # ["Well_A1", "Well_B1"]
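For readers working with the raw arrays, label-based selection amounts to mapping each label to its row index. The sketch below shows that idea in plain NumPy; select_by_label is a hypothetical helper and does not describe how pykinbiont implements subsetting internally.

```python
import numpy as np


def select_by_label(curves, labels, wanted):
    """Return the rows of `curves` whose labels appear in `wanted`, in that order.

    Hypothetical helper, not part of pykinbiont.
    """
    index = {label: i for i, label in enumerate(labels)}
    missing = [w for w in wanted if w not in index]
    if missing:
        raise KeyError(f"unknown labels: {missing}")
    rows = [index[w] for w in wanted]
    # Integer-list indexing copies the selected rows
    return curves[rows], list(wanted)


labels = ["Well_A1", "Well_A2", "Well_B1"]
curves = np.array(
    [
        [0.012, 0.014],
        [0.011, 0.013],
        [0.010, 0.012],
    ]
)
sub_curves, sub_labels = select_by_label(curves, labels, ["Well_A1", "Well_B1"])
print(sub_labels, sub_curves.shape)
```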
IrregularGrowthData#
Use IrregularGrowthData when curves are sampled at different time points (e.g., several experiments
merged, or manual sampling at unequal intervals). pykinbiont automatically resamples all curves by
linear interpolation onto a shared union grid in normalised [0, 1] time.
import numpy as np
from pykinbiont import IrregularGrowthData
# Each curve has its own time vector
times_A = np.array([0.0, 1.0, 2.5, 5.0, 10.0, 20.0])
times_B = np.array([0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
od_A = np.array([0.01, 0.02, 0.05, 0.20, 0.80, 1.10])
od_B = np.array([0.01, 0.015, 0.03, 0.08, 0.35, 0.90, 1.15])
igd = IrregularGrowthData(
raw_curves=[od_A, od_B],
raw_times=[times_A, times_B],
labels=["Strain_A", "Strain_B"],
step=0.01, # union grid resolution in normalised [0,1] time
)
print(igd.curves.shape) # (2, n_grid) — resampled
print(igd.times[:5]) # normalised [0,1] grid
The original data is preserved in igd.raw_curves and igd.raw_times.
fit() and preprocess() accept IrregularGrowthData directly.
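To make the resampling step concrete, here is a minimal NumPy sketch of linear interpolation onto a normalised grid. It is not pykinbiont's implementation: the function name resample_to_unit_grid is hypothetical, and the sketch assumes that time is normalised by the global minimum and maximum across all curves and that the grid is regularly spaced with the given step. Note that np.interp holds the end values constant outside a curve's sampled range.

```python
import numpy as np


def resample_to_unit_grid(raw_times, raw_curves, step=0.01):
    """Interpolate each curve onto a regular grid over normalised [0, 1] time.

    Hypothetical sketch; the library's actual grid construction may differ.
    """
    t_min = min(float(np.min(t)) for t in raw_times)
    t_max = max(float(np.max(t)) for t in raw_times)
    n_grid = int(round(1.0 / step)) + 1
    grid = np.linspace(0.0, 1.0, n_grid)
    rows = []
    for t, y in zip(raw_times, raw_curves):
        # Assumption: normalise with the global time range of all curves
        t_norm = (np.asarray(t, dtype=float) - t_min) / (t_max - t_min)
        # np.interp is piecewise linear and clamps outside [t_norm[0], t_norm[-1]]
        rows.append(np.interp(grid, t_norm, y))
    return grid, np.vstack(rows)


times_A = np.array([0.0, 1.0, 2.5, 5.0, 10.0, 20.0])
times_B = np.array([0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
od_A = np.array([0.01, 0.02, 0.05, 0.20, 0.80, 1.10])
od_B = np.array([0.01, 0.015, 0.03, 0.08, 0.35, 0.90, 1.15])

grid, resampled = resample_to_unit_grid([times_A, times_B], [od_A, od_B])
print(resampled.shape)
```

Under these assumptions Strain_B, which ends at 16 time units, stays at its last measured value over the final fifth of the grid; whether the library extrapolates, clamps, or masks that region is not stated here.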
Accessing arrays#
| Attribute | Shape | Description |
|---|---|---|
| curves | (n_curves, n_timepoints) | OD matrix (read-only) |
| times | (n_timepoints,) | Shared time grid (read-only) |
| labels | | Curve identifiers |
| | | Cluster assignments (after …) |
| | | Cluster centroids (after …) |
| | | Within-cluster sum of squares |