Overview
TimeDataModel is a lightweight Python library for working with time series data. It provides
a metadata-rich container whose underlying DataFrame is optional: the same TimeSeries class
covers both data-bearing series and metadata-only declarations. The container is backed by
Polars internally and is interoperable with pandas, NumPy, Polars, and PyArrow. The library
also ships enums and geographic types for annotating your data.
The library is designed for energy, weather, and forecasting workflows where data arrives from many sources at different times — but it is general enough for any domain that works with timestamped numerical data.
Core data structures
TimeSeries
A single univariate time series. When data is attached, the underlying Polars DataFrame
always has a "value" column and one or more timestamp columns whose layout is determined by
the inferred DataShape (see below).
import pandas as pd
from timedatamodel import TimeSeries, Frequency

df = pd.DataFrame({
    "valid_time": pd.date_range("2024-01-01", periods=48, freq="h", tz="UTC"),
    "value": [float(i) for i in range(48)],
})

ts = TimeSeries.from_pandas(
    df,
    frequency=Frequency.PT1H,
    name="wind_power",
    unit="MW",
)
Supported metadata fields: name, unit, data_type, timeseries_type, frequency,
timezone, description.
Key operations:
| Operation | Example |
|---|---|
| Slicing | |
| Unit conversion | |
| Format export | |
| Format import | |
| Validation | |
Metadata-only TimeSeries
Construct a TimeSeries with df=None to declare the structure of a series before
any data exists, which is useful for cataloging or registering series in advance. All
metadata fields (name, unit, data_type, timeseries_type, frequency, timezone,
description) are available; data-bearing methods (to_pandas, head, convert_unit,
…) raise ValueError. Use ts.has_df to check whether a DataFrame is attached.
from timedatamodel import TimeSeries, DataType

# Metadata-only: no DataFrame yet.
ts = TimeSeries(
    name="wind_power",
    unit="MW",
    data_type=DataType.FORECAST,
)
assert ts.has_df is False
assert ts.shape is None

# Later, once data arrives, build a fresh data-bearing instance.
# `df` stands for the DataFrame that has since arrived; it is not defined in this snippet.
ts_with_data = TimeSeries(df, name=ts.name, unit=ts.unit, data_type=ts.data_type)
DataShape is inferred from the DataFrame at construction time; for metadata-only
instances, shape is None.
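The metadata-only pattern can be illustrated without the library. The sketch below uses a hypothetical stand-in class, SeriesSketch, whose rows attribute plays the role of the optional DataFrame; the class, its fields, and the error message are assumptions made for illustration and are not the library's implementation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class SeriesSketch:
    """Hypothetical stand-in for a metadata-rich series (not the library class)."""

    name: str
    unit: Optional[str] = None
    rows: Optional[Sequence[tuple]] = None  # plays the role of the optional DataFrame

    @property
    def has_df(self) -> bool:
        return self.rows is not None

    def head(self, n: int = 5) -> list:
        # Data-bearing methods refuse to run on a metadata-only declaration.
        if self.rows is None:
            raise ValueError(f"{self.name!r} is metadata-only; no data attached")
        return list(self.rows[:n])


# Declare the series first, with metadata only.
declared = SeriesSketch(name="wind_power", unit="MW")
try:
    declared.head()
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Once data arrives, build a fresh instance that reuses the declared metadata.
loaded = SeriesSketch(
    name=declared.name,
    unit=declared.unit,
    rows=[("2024-01-01T00:00Z", 1.5)],
)
```

Building a new instance rather than mutating the declaration mirrors the example above, where ts_with_data is created from the metadata of ts.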
Identity-shaped concepts (labels, tags) and spatial context (locations) belong to
consumer layers (e.g. energydatamodel, energydb), not to time-series metadata
itself.
Data shapes
The DataShape enum controls which timestamp columns are present in the underlying DataFrame.
This lets a single class cover standard time series as well as versioned and audit-trail
patterns:
| Shape | Columns | Use case |
|---|---|---|
| | | Standard time series |
| | | Record when each value was produced |
| | | Track when a value was revised |
| | | Full audit trail |
All timestamp columns are stored as pl.Datetime("us", time_zone="UTC").
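To make the storage convention concrete, the sketch below uses only the standard library to show what normalizing to UTC at microsecond precision amounts to: an offset-aware timestamp is converted to UTC and expressed as integer microseconds since the Unix epoch. The input timestamp is a made-up example, and the library itself relies on Polars for this step rather than on code like this.

```python
from datetime import datetime, timedelta, timezone

# A reading stamped with a fixed UTC+01:00 offset (hypothetical input).
local = datetime(2024, 1, 1, 13, 0, tzinfo=timezone(timedelta(hours=1)))

# Normalize to UTC so every timestamp shares one time zone.
as_utc = local.astimezone(timezone.utc)

# Microseconds since the Unix epoch: the integer a microsecond-precision
# datetime column holds for this instant.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
micros = (as_utc - epoch) // timedelta(microseconds=1)
```

Both local and as_utc denote the same instant; only the representation changes, which is why values from sources in different time zones can be compared directly after normalization.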
Frequency
The Frequency enum defines ISO 8601 duration values. Three are calendar-based (their
exact duration depends on the calendar); the rest are fixed-interval:
| Value | Description | Calendar-based |
|---|---|---|
| | 1 year | yes |
| | 3 months (quarter) | yes |
| | 1 month | yes |
| | 1 week | |
| | 1 day | |
| | 1 hour | |
| | 30 minutes | |
| | 15 minutes | |
| | 10 minutes | |
| | 5 minutes | |
| | 1 minute | |
| | 1 second | |
| | No fixed frequency | |
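The difference between calendar-based and fixed-interval frequencies can be shown with the standard library alone. In this sketch, a one-hour step is a constant timedelta, whereas a one-month step needs calendar arithmetic because months differ in length. The helper names month_length and add_months are hypothetical and are not part of the library.

```python
import calendar
from datetime import datetime, timedelta, timezone


def month_length(year: int, month: int) -> timedelta:
    """Length of one calendar month, which varies by month and leap year."""
    return timedelta(days=calendar.monthrange(year, month)[1])


def add_months(ts: datetime, months: int) -> datetime:
    """Step by whole calendar months, clamping the day to the month's end."""
    total = ts.month - 1 + months
    year = ts.year + total // 12
    month = total % 12 + 1
    day = min(ts.day, calendar.monthrange(year, month)[1])
    return ts.replace(year=year, month=month, day=day)


start = datetime(2024, 1, 1, tzinfo=timezone.utc)

# Fixed interval: every step is the same timedelta.
one_hour = timedelta(hours=1)
hourly = [start + i * one_hour for i in range(3)]

# Calendar-based: consecutive steps span different amounts of time.
monthly = [add_months(start, i) for i in range(3)]
monthly_gaps = [later - earlier for earlier, later in zip(monthly, monthly[1:])]
```

Here the gap from January to February 2024 is 31 days and the gap from February to March is 29 days, so no single timedelta can represent a monthly step.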
DataType taxonomy
Every time series can carry a DataType annotation describing the nature of the data. The
taxonomy has two roots and 10 leaf types:
ACTUAL
├── OBSERVATION ── directly measured or recorded values
└── DERIVED ────── computed from observations (e.g. aggregated measurements)
CALCULATED
├── ESTIMATION
│   ├── FORECAST ─────── future predictions with a known issue time
│   ├── PREDICTION ───── statistical/ML predictions (general)
│   ├── SCENARIO ─────── what-if analyses under assumed conditions
│   ├── SIMULATION ───── outputs from physical or numerical models
│   └── RECONSTRUCTION ─ hindcasts or gap-filled historical data
└── REFERENCE
    ├── BASELINE ──────── reference scenario for comparison
    ├── BENCHMARK ─────── industry-standard or best-practice reference
    └── IDEAL ─────────── theoretical optimum or design values
Each DataType member exposes .parent, .children, .is_leaf, and .root properties.
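One way to picture how such hierarchy properties can work is a plain Enum whose members record their parent. The sketch below is a hypothetical stand-in named DataTypeSketch that copies the tree above; how the library actually stores the hierarchy is not stated here, so the tuple-valued members are purely an assumption.

```python
from enum import Enum


class DataTypeSketch(Enum):
    """Hypothetical stand-in; each value is (own name, parent name or None)."""

    ACTUAL = ("ACTUAL", None)
    OBSERVATION = ("OBSERVATION", "ACTUAL")
    DERIVED = ("DERIVED", "ACTUAL")
    CALCULATED = ("CALCULATED", None)
    ESTIMATION = ("ESTIMATION", "CALCULATED")
    FORECAST = ("FORECAST", "ESTIMATION")
    PREDICTION = ("PREDICTION", "ESTIMATION")
    SCENARIO = ("SCENARIO", "ESTIMATION")
    SIMULATION = ("SIMULATION", "ESTIMATION")
    RECONSTRUCTION = ("RECONSTRUCTION", "ESTIMATION")
    REFERENCE = ("REFERENCE", "CALCULATED")
    BASELINE = ("BASELINE", "REFERENCE")
    BENCHMARK = ("BENCHMARK", "REFERENCE")
    IDEAL = ("IDEAL", "REFERENCE")

    @property
    def parent(self):
        parent_name = self.value[1]
        return None if parent_name is None else DataTypeSketch[parent_name]

    @property
    def children(self):
        return [m for m in DataTypeSketch if m.value[1] == self.name]

    @property
    def is_leaf(self):
        return not self.children

    @property
    def root(self):
        # Walk up the parent links until a member without a parent is reached.
        node = self
        while node.parent is not None:
            node = node.parent
        return node


leaves = [m.name for m in DataTypeSketch if m.is_leaf]
roots = [m.name for m in DataTypeSketch if m.parent is None]
```

With this stand-in, DataTypeSketch.FORECAST.root resolves to CALCULATED via ESTIMATION, and the leaf and root counts match the two roots and 10 leaf types described above.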
Geographic types
Geographic primitives are exported from the package for consumer layers (e.g.
energydatamodel) that need to attach spatial context to entities. They are no longer
attached to TimeSeries itself.
GeoLocation — a single point defined by latitude and longitude:
from timedatamodel import GeoLocation
loc = GeoLocation(latitude=59.33, longitude=18.07) # Stockholm
Provides distance_to() (Haversine), bearing_to(), midpoint(), and offset().
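For readers unfamiliar with the Haversine formula behind distance_to(), here is a minimal standalone sketch of the great-circle distance between two latitude/longitude points. The function name haversine_km, the kilometre unit, the Earth radius constant, and the Gothenburg coordinates are assumptions for illustration; the units and radius used by the library are not stated in this overview.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumed constant


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


stockholm = (59.33, 18.07)
gothenburg = (57.71, 11.97)  # approximate coordinates, chosen for illustration
distance = haversine_km(*stockholm, *gothenburg)
```

The formula treats the Earth as a sphere, so results differ slightly from ellipsoidal geodesic distances; for the two cities above the sketch gives roughly 400 km.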
GeoArea — a polygon region (requires the [geo] extra):
from timedatamodel import GeoArea
area = GeoArea.from_coordinates([(59.0, 17.5), (59.0, 18.5), (59.5, 18.5), (59.5, 17.5)])
Provides contains_point(), overlaps(), centroid, and bounding_box().
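A point-in-polygon test such as contains_point() can be sketched with the classic ray-casting algorithm. The version below treats latitude and longitude as planar coordinates, which is a simplifying assumption that is reasonable only for small regions. It is a standalone illustration, not the library's implementation, which depends on the [geo] extra.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees


def contains_point(polygon: List[Point], lat: float, lon: float) -> bool:
    """Ray casting: count polygon edges crossed by a ray from the point."""
    inside = False
    n = len(polygon)
    for i in range(n):
        lat_a, lon_a = polygon[i]
        lat_b, lon_b = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat_a > lat) != (lat_b > lat):
            # Longitude at which the edge crosses the point's latitude.
            crossing = lon_a + (lat - lat_a) * (lon_b - lon_a) / (lat_b - lat_a)
            if lon < crossing:
                inside = not inside
    return inside


# The same rectangle as in the GeoArea example above.
area = [(59.0, 17.5), (59.0, 18.5), (59.5, 18.5), (59.5, 17.5)]
stockholm_inside = contains_point(area, 59.33, 18.07)
far_west_inside = contains_point(area, 57.71, 11.97)
```

An odd number of crossings means the point lies inside; points exactly on an edge are a boundary case that this sketch does not treat specially.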
Other types
DataPoint
A NamedTuple with two fields: timestamp (a datetime) and value (a float or None).
TimeSeriesType
Describes the index structure of a time series:
FLAT: standard non-overlapping timestamps
OVERLAPPING: overlapping windows (e.g. rolling forecasts with a (issue_time, valid_time) multi-index)
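The overlapping case is easiest to see with a small example. The sketch below builds two rolling forecast runs keyed by (issue_time, valid_time) and then collapses them into a flat series by keeping the most recently issued value for each valid_time. The ForecastRow tuple and the latest-issue rule are assumptions chosen for illustration; the library may offer other ways to flatten overlapping data.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, NamedTuple, Optional


class ForecastRow(NamedTuple):
    """Hypothetical row type keyed by (issue_time, valid_time)."""

    issue_time: datetime
    valid_time: datetime
    value: Optional[float]


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
hour = timedelta(hours=1)

# Two forecast runs issued one hour apart, each covering the next three hours.
rows = [
    ForecastRow(t0 + run * hour, t0 + (run + lead) * hour, float(10 * run + lead))
    for run in range(2)
    for lead in range(1, 4)
]

# Overlapping: some valid_time values appear under more than one issue_time.
valid_times = [row.valid_time for row in rows]
has_overlap = len(set(valid_times)) < len(valid_times)

# Flatten by letting later issue times overwrite earlier ones.
latest: Dict[datetime, ForecastRow] = {}
for row in sorted(rows, key=lambda r: r.issue_time):
    latest[row.valid_time] = row
flat = sorted(latest.values(), key=lambda r: r.valid_time)
flat_values = [row.value for row in flat]
```

After flattening, each valid_time occurs exactly once, which corresponds to the FLAT structure described above.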
Next steps
Basic Usage — quick-start examples for common operations
Examples — hands-on notebooks
API Reference — full API reference