API reference

Complete reference for all public classes and functions in pykinbiont.

Data containers

GrowthData

class pykinbiont.GrowthData(curves, times, labels, clusters=None, centroids=None, wcss=None)

Bases: object

Container for growth curves at common time points.

Parameters:
  • curves (ndarray) – Shape (n_curves, n_timepoints), float64. Each row is one curve.

  • times (ndarray) – Shape (n_timepoints,), float64. Shared time grid.

  • labels (list[str]) – Identifier per curve, length n_curves.

  • clusters (Optional[ndarray]) – Cluster assignment per curve (1-based), shape (n_curves,). Populated by preprocess() when cluster=True. None until then.

  • centroids (Optional[ndarray]) – Per-cluster shape centroids in z-normalised space, shape (n_clusters, n_timepoints). Populated alongside clusters.

  • wcss (Optional[float]) – Within-cluster sum of squares from k-means. None until clustering runs.
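To make the relationship between clusters, centroids, and wcss concrete, here is a minimal sketch in plain Python. It assumes that z-normalisation means per-curve zero mean and unit population standard deviation, and that wcss is the sum of squared distances from each normalised curve to its assigned centroid. Both are assumptions for illustration; the helper names are hypothetical and are not part of pykinbiont.

```python
import math

def z_normalise(curve):
    """Rescale one curve to zero mean and unit (population) standard deviation."""
    mean = sum(curve) / len(curve)
    std = math.sqrt(sum((y - mean) ** 2 for y in curve) / len(curve))
    return [(y - mean) / std for y in curve]  # a constant curve would divide by zero

def within_cluster_ss(curves, clusters, centroids):
    """Sum of squared distances from each z-normalised curve to its centroid."""
    total = 0.0
    for curve, k in zip(curves, clusters):
        centroid = centroids[k - 1]  # cluster ids are 1-based, lists are 0-based
        total += sum((y - c) ** 2 for y, c in zip(z_normalise(curve), centroid))
    return total

# Two curves with the same shape (one is twice the other) and one decreasing curve.
curves = [[0.1, 0.2, 0.4], [0.2, 0.4, 0.8], [0.9, 0.5, 0.1]]
clusters = [1, 1, 2]
centroids = [z_normalise(curves[0]), z_normalise(curves[2])]
wcss = within_cluster_ss(curves, clusters, centroids)
```

Because the first two curves differ only by scale, they coincide after normalisation, so the sketch yields a wcss of essentially zero.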

curves: numpy.ndarray
times: numpy.ndarray
labels: list[str]
clusters: numpy.ndarray | None = None
centroids: numpy.ndarray | None = None
wcss: float | None = None
static from_csv(path)

Load from CSV. First column = times; remaining columns = curves.

Return type:

GrowthData

Parameters:

path (str)
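The column layout can be illustrated with the standard csv module. This is a sketch of the layout only, not the loader pykinbiont uses. It assumes a header row whose first cell names the time column and whose remaining cells are the curve labels; read_growth_csv is a hypothetical helper.

```python
import csv
import io

def read_growth_csv(text):
    """Split CSV text into (times, labels, curves) using the documented column layout."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    labels = header[1:]                      # assumed: header row holds curve labels
    times = [float(row[0]) for row in body]  # first column = times
    # Transpose the remaining columns so that each curve becomes one row.
    curves = [[float(row[j]) for row in body] for j in range(1, len(header))]
    return times, labels, curves

sample = "time,A1,A2\n0.0,0.05,0.04\n1.0,0.10,0.07\n2.0,0.21,0.15\n"
times, labels, curves = read_growth_csv(sample)
```

The result has one row per curve and one column per time point, matching the (n_curves, n_timepoints) shape described for curves.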

static from_dataframe(df)

Load from DataFrame. First column = times; remaining = curves.

Return type:

GrowthData

Parameters:

df (pandas.DataFrame)

__getitem__(labels)

Return a new GrowthData with only the requested curves.

Return type:

GrowthData

Parameters:

labels (list[str])

IrregularGrowthData

class pykinbiont.IrregularGrowthData(raw_curves, raw_times, labels, curves=None, times=None, step=0.01, clusters=None, centroids=None, wcss=None)

Bases: object

Growth curves with per-curve irregular time points.

Resampling to a shared [0,1] union grid is performed automatically at construction time (pure Python / numpy, no Julia required).

Parameters:
  • raw_curves (list[ndarray]) – Original OD values, one 1-D float64 array per curve.

  • raw_times (list[ndarray]) – Original (un-normalised) time points, one 1-D float64 array per curve.

  • labels (list[str]) – Identifier per curve.

  • curves (Optional[ndarray]) – Resampled matrix on the [0,1] union grid, shape (n_curves, n_grid). Set automatically — do not pass manually.

  • times (Optional[ndarray]) – The [0,1] union grid, shape (n_grid,). Set automatically — do not pass manually.

  • step (float) – Union grid resolution (default 0.01).

  • clusters (Optional[ndarray]) – Populated by preprocess() when cluster=True.

  • centroids (Optional[ndarray]) – Populated by preprocess() when cluster=True.

  • wcss (Optional[float]) – Populated by preprocess() when cluster=True.
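The resampling step can be pictured with a dependency-free sketch. It makes three assumptions that the text above does not confirm: each curve's times are min-max normalised to [0, 1], the grid is evenly spaced with 1/step + 1 points, and values are interpolated linearly. unit_grid and resample are hypothetical names, and the input times must be strictly increasing.

```python
from bisect import bisect_right

def unit_grid(step=0.01):
    """Evenly spaced grid on [0, 1]; integer stepping avoids float drift."""
    n = round(1.0 / step)
    return [i / n for i in range(n + 1)]

def resample(raw_times, raw_values, grid):
    """Min-max normalise the times to [0, 1], then interpolate linearly onto grid."""
    t0, t1 = raw_times[0], raw_times[-1]
    ts = [(t - t0) / (t1 - t0) for t in raw_times]
    out = []
    for g in grid:
        # Index of the segment [ts[i], ts[i + 1]] that contains g.
        i = min(max(bisect_right(ts, g) - 1, 0), len(ts) - 2)
        w = (g - ts[i]) / (ts[i + 1] - ts[i])
        out.append(raw_values[i] + w * (raw_values[i + 1] - raw_values[i]))
    return out

grid = unit_grid(0.25)  # coarse grid so the result is easy to inspect
curve_a = resample([0.0, 2.0, 4.0], [0.1, 0.3, 0.5], grid)
curve_b = resample([1.0, 2.0, 9.0], [0.2, 0.4, 0.4], grid)
```

Both curves end up with the same length as the grid even though their original time points differ, which is what allows them to be stacked into one (n_curves, n_grid) matrix.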

raw_curves: list[numpy.ndarray]
raw_times: list[numpy.ndarray]
labels: list[str]
curves: numpy.ndarray
times: numpy.ndarray
step: float
clusters: numpy.ndarray | None
centroids: numpy.ndarray | None
wcss: float | None
__getitem__(labels)
Return type:

IrregularGrowthData

Parameters:

labels (list[str])

Configuration

FitOptions

class pykinbiont.FitOptions(smooth=False, smooth_method='lowess', smooth_pt_avg=7, boxcar_window=5, lowess_frac=0.05, gaussian_h_mult=2.0, gaussian_time_grid=None, average_replicates=False, blank_subtraction=False, blank_value=0.0, blank_from_labels=False, correct_negatives=False, negative_method='remove', negative_threshold=0.01, scattering_correction=False, calibration_file='', scattering_method='interpolation', cut_stationary_phase=False, stationary_percentile_thr=0.05, stationary_pt_smooth_derivative=10, stationary_win_size=5, stationary_thr_od=0.02, cluster=False, n_clusters=3, cluster_trend_test=True, cluster_prescreen_constant=False, cluster_tol_const=1.5, cluster_q_low=0.05, cluster_q_high=0.95, cluster_exp_prototype=False, kmeans_n_init=10, kmeans_max_iters=300, kmeans_tol=1e-06, kmeans_seed=0, loss='RE', multistart=False, n_restart=50, aic_correction=True, pt_smooth_derivative=7, opt_params=<factory>)

Bases: object

All configuration for preprocessing and fitting.

Every field mirrors the Julia FitOptions struct exactly — the Julia docs serve as the authoritative reference for field semantics.

Parameters:
  • smooth (bool)

  • smooth_method (str)

  • smooth_pt_avg (int)

  • boxcar_window (int)

  • lowess_frac (float)

  • gaussian_h_mult (float)

  • gaussian_time_grid (numpy.ndarray | None)

  • average_replicates (bool)

  • blank_subtraction (bool)

  • blank_value (float)

  • blank_from_labels (bool)

  • correct_negatives (bool)

  • negative_method (str)

  • negative_threshold (float)

  • scattering_correction (bool)

  • calibration_file (str)

  • scattering_method (str)

  • cut_stationary_phase (bool)

  • stationary_percentile_thr (float)

  • stationary_pt_smooth_derivative (int)

  • stationary_win_size (int)

  • stationary_thr_od (float)

  • cluster (bool)

  • n_clusters (int)

  • cluster_trend_test (bool)

  • cluster_prescreen_constant (bool)

  • cluster_tol_const (float)

  • cluster_q_low (float)

  • cluster_q_high (float)

  • cluster_exp_prototype (bool)

  • kmeans_n_init (int)

  • kmeans_max_iters (int)

  • kmeans_tol (float)

  • kmeans_seed (int)

  • loss (str)

  • multistart (bool)

  • n_restart (int)

  • aic_correction (bool)

  • pt_smooth_derivative (int)

  • opt_params (dict)

smooth: bool = False
smooth_method: str = 'lowess'
smooth_pt_avg: int = 7
boxcar_window: int = 5
lowess_frac: float = 0.05
gaussian_h_mult: float = 2.0
gaussian_time_grid: numpy.ndarray | None = None
average_replicates: bool = False
blank_subtraction: bool = False
blank_value: float = 0.0
blank_from_labels: bool = False
correct_negatives: bool = False
negative_method: str = 'remove'
negative_threshold: float = 0.01
scattering_correction: bool = False
calibration_file: str = ''
scattering_method: str = 'interpolation'
cut_stationary_phase: bool = False
stationary_percentile_thr: float = 0.05
stationary_pt_smooth_derivative: int = 10
stationary_win_size: int = 5
stationary_thr_od: float = 0.02
cluster: bool = False
n_clusters: int = 3
cluster_trend_test: bool = True
cluster_prescreen_constant: bool = False
cluster_tol_const: float = 1.5
cluster_q_low: float = 0.05
cluster_q_high: float = 0.95
cluster_exp_prototype: bool = False
kmeans_n_init: int = 10
kmeans_max_iters: int = 300
kmeans_tol: float = 1e-06
kmeans_seed: int = 0
loss: str = 'RE'
multistart: bool = False
n_restart: int = 50
aic_correction: bool = True
pt_smooth_derivative: int = 7
opt_params: dict

ModelSpec

class pykinbiont.ModelSpec(models, params, lower=None, upper=None)

Bases: object

Which models to fit and their initial parameters.

Parameters:
  • models (list[Any]) – List of AbstractGrowthModel instances.

  • params (list[list[float]]) – Initial parameter guess per model (empty list [] for LogLinModel/DDDEModel).

  • lower (Optional[list[Optional[list[float]]]]) – Per-model lower bounds; None slot means unconstrained for that model.

  • upper (Optional[list[Optional[list[float]]]]) – Per-model upper bounds; None slot means unconstrained for that model.
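The slot-by-slot alignment of models, params, lower, and upper can be sketched as a small consistency check. This is a hypothetical helper working on plain lists, not validation that pykinbiont is known to perform; the strings in models are stand-ins for real model instances.

```python
def check_spec(models, params, lower=None, upper=None):
    """Verify that params and optional bounds line up slot by slot with models."""
    if len(params) != len(models):
        raise ValueError("need one parameter list per model")
    for name, bounds in (("lower", lower), ("upper", upper)):
        if bounds is None:
            continue  # no bounds supplied at all
        if len(bounds) != len(models):
            raise ValueError(f"{name} needs one slot per model")
        for guess, slot in zip(params, bounds):
            # A None slot leaves that model unconstrained.
            if slot is not None and len(slot) != len(guess):
                raise ValueError(f"{name} bound length must match the guess")
    return True

models = ["logistic", "loglin"]   # stand-ins for model instances
params = [[0.05, 1.2, 0.4], []]   # empty guess for the parameter-free model
lower = [[0.0, 0.0, 0.0], None]
upper = [[1.0, 5.0, 2.0], None]
ok = check_spec(models, params, lower, upper)
```

The second model shows the two special cases described above: an empty parameter list and a None bounds slot.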

models: list[Any]
params: list[list[float]]
lower: list[list[float] | None] | None = None
upper: list[list[float] | None] | None = None

Models

AbstractGrowthModel

class pykinbiont.AbstractGrowthModel

Bases: object

Base class for all growth models. Do not instantiate directly.

NLModel

class pykinbiont.NLModel(name, func=None, param_names=<factory>)

Bases: AbstractGrowthModel

Non-linear (closed-form) growth model.

Parameters:
  • name (str) – Unique identifier used in results.

  • func (Optional[Callable]) – Python callable with signature (p, t) -> y where p is a 1-D array of parameters and t is a scalar or 1-D array of time points. None for built-in registry models (Julia has the function).

  • param_names (list[str]) – Human-readable name for each parameter.
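A callable with the (p, t) -> y signature might look like the sketch below, which uses a closed-form logistic curve. The parameter order [n0, n_max, mu] is an illustrative choice, and plain lists stand in for 1-D arrays so the example needs only the standard library.

```python
import math

def logistic(p, t):
    """Closed-form logistic curve; p = [n0, n_max, mu]."""
    n0, n_max, mu = p

    def at(ti):
        return n_max / (1.0 + (n_max / n0 - 1.0) * math.exp(-mu * ti))

    # Accept a single time point or a sequence of them.
    if isinstance(t, (int, float)):
        return at(t)
    return [at(ti) for ti in t]

p = [0.05, 1.0, 0.8]
y0 = logistic(p, 0.0)               # value at the first time point
ys = logistic(p, [0.0, 5.0, 50.0])  # values on a small time grid
```

At t = 0 the curve returns the initial value n0, and for large t it approaches n_max, which is a quick sanity check for any custom function of this form.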

name: str
func: Callable | None = None
param_names: list[str]

ODEModel

class pykinbiont.ODEModel(name, func=None, param_names=<factory>, n_eq=1)

Bases: AbstractGrowthModel

ODE growth model (SciML in-place form f!(du, u, p, t)).

Parameters:
  • name (str) – Unique identifier used in results.

  • func (Optional[Callable]) – Python callable with signature f(du, u, p, t) (in-place). None for built-in registry models.

  • param_names (list[str]) – Human-readable name for each parameter.

  • n_eq (int) – Number of state equations.
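The in-place form means the function writes derivatives into du instead of returning them. The sketch below shows a one-equation logistic ODE in that form, together with a fixed-step Euler loop that exists only to exercise it. The parameter order [mu, n_max] and the euler helper are assumptions for illustration; they say nothing about which solver is actually used.

```python
def logistic_ode(du, u, p, t):
    """In-place right-hand side: writes dN/dt into du[0]."""
    mu, n_max = p
    du[0] = mu * u[0] * (1.0 - u[0] / n_max)

def euler(f, u0, p, t_end, dt=0.01):
    """Fixed-step explicit Euler integrator, only to exercise the in-place form."""
    u, du, t = list(u0), [0.0] * len(u0), 0.0
    for _ in range(round(t_end / dt)):
        f(du, u, p, t)  # f fills du; its return value is ignored
        u = [ui + dt * dui for ui, dui in zip(u, du)]
        t += dt
    return u

final = euler(logistic_ode, [0.05], [0.8, 1.0], t_end=30.0)
```

Here the state vector has length 1, which corresponds to n_eq=1; a system with more equations would fill du[1], du[2], and so on.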

name: str
func: Callable | None = None
param_names: list[str]
n_eq: int = 1

LogLinModel

class pykinbiont.LogLinModel

Bases: AbstractGrowthModel

Sentinel for log-linear (exponential phase) fitting. No parameters.

DDDEModel

class pykinbiont.DDDEModel(max_degree=4, lambda_min=-5.0, lambda_max=-1.0, lambda_step=0.5)

Bases: AbstractGrowthModel

Data-Driven Differential Equation discovery model (sparse regression).

Parameters:
  • max_degree (int) – Maximum polynomial degree in the candidate basis.

  • lambda_min (float) – log₁₀ of the minimum STLSQ sparsity threshold.

  • lambda_max (float) – log₁₀ of the maximum STLSQ sparsity threshold.

  • lambda_step (float) – Step in log₁₀ space between threshold values.

Note: requires DataDrivenDiffEq + DataDrivenSparse + ModelingToolkit in the Julia environment. Manage these with Pkg.add in Julia.
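Since the three lambda parameters are given in log₁₀ space, the candidate thresholds they describe can be listed with a short sketch. It assumes both endpoints are included in the sweep, which the text above does not state; stlsq_thresholds is a hypothetical helper.

```python
def stlsq_thresholds(lambda_min=-5.0, lambda_max=-1.0, lambda_step=0.5):
    """Candidate sparsity thresholds 10**e for e from lambda_min to lambda_max."""
    n = round((lambda_max - lambda_min) / lambda_step)
    exponents = [lambda_min + i * lambda_step for i in range(n + 1)]
    return [10.0 ** e for e in exponents]

thresholds = stlsq_thresholds()  # uses the documented defaults
```

Under that assumption, the defaults give nine thresholds running from 1e-5 up to 1e-1.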

max_degree: int = 4
lambda_min: float = -5.0
lambda_max: float = -1.0
lambda_step: float = 0.5

MODEL_REGISTRY

pykinbiont.MODEL_REGISTRY = {}

dict subclass that populates itself from Julia on first access.
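The general pattern of a dict that fills itself on first access can be sketched as follows. LazyRegistry and fake_loader are hypothetical; the real registry loads its entries from Julia, and its internals may differ from this sketch.

```python
class LazyRegistry(dict):
    """dict that fills itself from a loader the first time it is read."""

    def __init__(self, loader):
        super().__init__()
        self._loader = loader
        self._loaded = False

    def _ensure(self):
        # Run the loader once, on the first read access.
        if not self._loaded:
            self.update(self._loader())
            self._loaded = True

    def __getitem__(self, key):
        self._ensure()
        return super().__getitem__(key)

    def __contains__(self, key):
        self._ensure()
        return super().__contains__(key)

    def __iter__(self):
        self._ensure()
        return super().__iter__()

    def __len__(self):
        self._ensure()
        return super().__len__()

calls = []

def fake_loader():
    """Stand-in for the expensive lookup; records how often it runs."""
    calls.append(1)
    return {"logistic": "placeholder"}

registry = LazyRegistry(fake_loader)
```

Creating the registry does not run the loader; the first membership test, lookup, iteration, or len() call does, and later accesses reuse the loaded entries.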

Results

GrowthFitResults

class pykinbiont.GrowthFitResults(data, results)

Bases: object

Top-level result returned by fit().

Parameters:
  • data (GrowthData | IrregularGrowthData)

  • results (list[CurveFitResult])

data: GrowthData | IrregularGrowthData
results: list[CurveFitResult]
to_dataframe()

Summary table: one row per curve with best model, AIC, params.

Return type:

DataFrame

__iter__()
Return type:

Iterator[CurveFitResult]

__len__()
Return type:

int

__getitem__(i)
Return type:

CurveFitResult

Parameters:

i (int)

CurveFitResult

class pykinbiont.CurveFitResult(label, best_model, best_params, param_names, best_aic, fitted_curve, times, loss, all_results)

Bases: object

Fitting result for a single growth curve.

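Parameters:
  • label (str)

  • best_model (str)

  • best_params (list[float])

  • param_names (list[str])

  • best_aic (float)

  • fitted_curve (numpy.ndarray)

  • times (numpy.ndarray)

  • loss (float)

  • all_results (list[dict])

label: str
best_model: str
best_params: list[float]
param_names: list[str]
best_aic: float
fitted_curve: numpy.ndarray
times: numpy.ndarray
loss: float
all_results: list[dict]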

Functions

fit

pykinbiont.fit(data, spec, opts=None)

Fit every model in spec to every curve in data and, for each curve, select the best model by AICc.

Preprocessing (smoothing, blank subtraction, clustering, …) is applied according to opts before fitting.

Parameters:
  • data (Union[GrowthData, IrregularGrowthData]) – GrowthData or IrregularGrowthData container.

  • spec (ModelSpec) – Models to try and their initial parameters.

  • opts (Optional[FitOptions]) – All preprocessing and fitting configuration. Defaults to FitOptions() (no preprocessing, log-linear model selection).

Returns:

One CurveFitResult per curve plus the post-preprocessing GrowthData (with cluster assignments if clustering was requested).

Return type:

GrowthFitResults
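Model selection by AICc can be illustrated with the textbook least-squares form of the criterion. This is a generic sketch, not the formula pykinbiont is confirmed to use; the library's loss function and AIC definition may differ. The candidate names and residual sums of squares are made-up inputs.

```python
import math

def aicc(rss, n_points, n_params):
    """Least-squares AIC with the small-sample (AICc) correction term."""
    aic = n_points * math.log(rss / n_points) + 2 * n_params
    return aic + 2 * n_params * (n_params + 1) / (n_points - n_params - 1)

def pick_best(candidates, n_points):
    """Return the candidate (name, rss, n_params) with the lowest AICc."""
    return min(candidates, key=lambda c: aicc(c[1], n_points, c[2]))

# Made-up fits: the second model fits slightly better but uses one more parameter.
candidates = [("logistic", 0.020, 3), ("gompertz", 0.019, 4)]
best = pick_best(candidates, n_points=20)
```

In this made-up case the small gain in fit does not offset the penalty for the extra parameter, so the three-parameter candidate is selected.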

preprocess

pykinbiont.preprocess(data, opts)

Apply the preprocessing pipeline to data and return a new GrowthData.

When opts.cluster=True, the returned GrowthData has .clusters, .centroids, and .wcss populated.

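Parameters:
  • data (Union[GrowthData, IrregularGrowthData]) – Input curves.

  • opts (FitOptions) – Preprocessing configuration.

Returns: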

Preprocessed data (new instance, input is never modified).

Return type:

Union[GrowthData, IrregularGrowthData]

save_results

pykinbiont.save_results(results, dir, prefix='kinbiont')

Write fitting results to CSV files inside dir.

Three files are written:

  • <prefix>_summary.csv: one row per curve (best model, AICc, params).

  • <prefix>_fitted_curves.csv: long-format (label, time, observed, fitted).

  • <prefix>_all_models.csv: one row per (curve, candidate model).

Parameters:
  • results (GrowthFitResults) – GrowthFitResults returned by fit().

  • dir (str) – Directory path (created if it does not exist).

  • prefix (str) – File name prefix. Default "kinbiont".

Return type:

dict[str, str]

Returns:

dict with keys "summary", "fitted_curves", "all_models" pointing to the written file paths.
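The naming scheme for the three output files can be sketched with a small helper that builds the expected paths from a directory and prefix. expected_paths is hypothetical and only mirrors the documented file name pattern; whether the real function returns relative or absolute paths is not stated here.

```python
import os

def expected_paths(directory, prefix="kinbiont"):
    """Map each result key to a path following the <prefix>_<key>.csv pattern."""
    keys = ("summary", "fitted_curves", "all_models")
    return {key: os.path.join(directory, f"{prefix}_{key}.csv") for key in keys}

paths = expected_paths("out")
```

With the default prefix, the "summary" entry points at a file named kinbiont_summary.csv inside the chosen directory.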

configure

pykinbiont.configure(project_path)

Point pykinbiont at a local KinBiont.jl development clone.

Registers the local directory as a dev package with juliapkg so that Julia will load your source tree instead of the registry-installed version. The path is persisted to ~/.config/pykinbiont/config.json and applied automatically on every subsequent Python / kernel start, so you only need to call this once per machine.

Important

If Julia is already running in the current kernel this call has no immediate effect — restart the kernel so juliacall can pick up the new environment before calling fit() or preprocess().

Parameters:

project_path (str) – Path to a local KinBiont.jl source directory.

Return type:

None

init

pykinbiont.init(project_path=None)

Start Julia and load Kinbiont.

Calling this explicitly is optional — all conversion and fitting functions trigger it automatically on first use.

Parameters:

project_path (Optional[str]) – Optional path to a local KinBiont.jl directory. Equivalent to calling configure(project_path) before init().

Return type:

object

get_jl

pykinbiont.get_jl()

Return the Julia Main module, auto-initialising if needed.

Return type:

object