Helping functions for Forecasting

`helpers`

Helper methods for forecasting.

approximate_index(dataset, findvalue)

Return the index of findvalue in dataset, using an optimized search. This assumes a dataset with continuous, increasing values; typically these are timestamps.

Parameters:
    dataset (list) – a continuous list of values (e.g. timestamps)
    findvalue (int) – the value whose index is to be found
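The search strategy itself is not documented here. As a rough illustration only, a lookup on sorted data can be sketched with the standard `bisect` module. The name `approximate_index_sketch` and the choice to return the nearest entry when there is no exact match are assumptions, not the library's confirmed behaviour.

```python
from bisect import bisect_left


def approximate_index_sketch(dataset, findvalue):
    """Return the index of the entry in `dataset` closest to `findvalue`.

    Hypothetical stand-in for approximate_index(); assumes `dataset`
    is sorted in increasing order (e.g. timestamps).
    """
    if not dataset:
        raise ValueError("dataset must not be empty")
    pos = bisect_left(dataset, findvalue)
    if pos == 0:
        return 0
    if pos == len(dataset):
        return len(dataset) - 1
    # Pick whichever neighbour is closer to findvalue.
    before, after = dataset[pos - 1], dataset[pos]
    return pos if after - findvalue < findvalue - before else pos - 1


timestamps = [0, 600, 1200, 1800, 2400]
index = approximate_index_sketch(timestamps, 1250)
```

Binary search needs only the "increasing" half of the assumption. An implementation that also relies on the values being evenly spaced could guess the position arithmetically and then correct it.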

cached_data(name, data_function=None, max_age=0)

Store and retrieve data from a cache on the filesystem. The function first tries to retrieve the cached data. If there is none, or the data is too old, data_function is called and its result is stored in the cache.

Parameters:
    name (string) – name of the cache file
    data_function (function) – a function that produces the data to be stored. If it is `None` and the cache is invalid, cached_data returns `None`.
    max_age (int) – the maximum age (real time) in seconds that the cache may reach before it becomes invalid
Returns:	data or `None`
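The caching behaviour described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the cache directory (`CACHE_DIR`), the `.cache` suffix, the use of `pickle`, and the use of the file's modification time as its age are all assumptions.

```python
import os
import pickle
import tempfile
import time

# Assumption: the real cache location is not documented; use the temp dir.
CACHE_DIR = tempfile.gettempdir()


def cached_data_sketch(name, data_function=None, max_age=0):
    """Hypothetical stand-in for cached_data()."""
    path = os.path.join(CACHE_DIR, name + ".cache")
    if os.path.exists(path):
        # Age of the cache in real time, based on the file's mtime.
        age = time.time() - os.path.getmtime(path)
        if age <= max_age:
            with open(path, "rb") as handle:
                return pickle.load(handle)
    # Cache is missing or too old.
    if data_function is None:
        return None
    data = data_function()
    with open(path, "wb") as handle:
        pickle.dump(data, handle)
    return data
```

Under these assumptions, the first call with `max_age=60` and a producer function computes and stores the data, and a second call within a minute reads the stored copy without calling the producer again.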

interpolate_year(day)

Interpolates a day of the year to a value where 1 = winter and 0 = summer.

Input: int between 0 and 365
Output: float between 0 and 1
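The shape of the interpolation curve is not stated. One plausible reading, shown purely as an assumption, is a cosine that peaks at day 0 (midwinter in the northern hemisphere) and reaches its minimum at midyear. The name `interpolate_year_sketch` is hypothetical.

```python
import math


def interpolate_year_sketch(day):
    """Map a day of the year (0..365) to a float in [0, 1].

    Assumption: a cosine curve with 1 at the start of the year (winter)
    and 0 at midyear (summer). The real function may use another shape.
    """
    return (math.cos(2 * math.pi * day / 365.0) + 1) / 2
```

A piecewise-linear ramp would satisfy the same description; the cosine is chosen here only because it is smooth at both extremes.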

perdelta(start, end, delta)

Generator function that yields dates. It works like range(start, stop, step), but for dates.

Parameters:
    start, end (datetime) – dates between which to iterate
    delta (timedelta) – the step width
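A generator with these semantics can be sketched in a few lines. The name `perdelta_sketch` is hypothetical, and the sketch assumes a positive delta and, as with range, an exclusive end.

```python
from datetime import datetime, timedelta


def perdelta_sketch(start, end, delta):
    """Yield datetimes from `start` up to (but excluding) `end`.

    Hypothetical stand-in for perdelta(); assumes `delta` is positive.
    """
    current = start
    while current < end:
        yield current
        current += delta


days = list(perdelta_sketch(datetime(2014, 1, 1),
                            datetime(2014, 1, 4),
                            timedelta(days=1)))
```

Here `days` holds 1, 2 and 3 January 2014; the end date itself is not produced.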

`dataloader`

class DataLoader

This class reads data from CSV files formatted in a specific way. The files are cached in memory to enable fast re-reads.

classmethod load_from_file(filepath, column_name, delim='\t', date_name='Datum', sampling_interval=600)

Load a time series from a CSV file. This assumes that the CSV is formatted in the following way:

Date header	Row Header1	Row Header2	Row Header N
Timestamp0	Row1Value0	Row2Value0	RowNValue0
Timestamp1	...	...	...

If the values in the file are not sampled evenly because the file contains skips, blackouts, etc., the data will be resampled evenly by copying certain data (see evenly_sampled()).

Parameters:
    column_name (string) – the name of the column (in the CSV) to retrieve
    delim (string) – the delimiter between values of a row. Default is Tab.
    date_name (string) – the name of the header of the date column
    sampling_interval (int) – the interval, in seconds, at which the data in the file is sampled
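Reading one column from a file in this layout can be sketched with the standard `csv` module. This is an illustration under assumptions: the function name `load_column_sketch`, integer timestamps, float values, the column names `Power` and `Heat`, and the returned dictionary layout are all hypothetical, and the sketch omits the caching and resampling that the real method performs.

```python
import csv
import io


def load_column_sketch(handle, column_name, delim="\t", date_name="Datum"):
    """Read the date column and one value column from a delimited file.

    Hypothetical helper; assumes integer timestamps and float values.
    """
    reader = csv.DictReader(handle, delimiter=delim)
    dates, values = [], []
    for row in reader:
        dates.append(int(row[date_name]))
        values.append(float(row[column_name]))
    return {date_name: dates, column_name: values}


# Made-up sample data in the tab-separated layout described above.
sample = "Datum\tPower\tHeat\n0\t1.5\t10.0\n600\t2.5\t11.0\n"
series = load_column_sketch(io.StringIO(sample), "Power")
```

The sketch takes an open file handle rather than a path so that it can be tried on in-memory text.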

classmethod evenly_sampled(data, date_name='Datum', sampling_interval=600)

Returns a version of data in which every value has a corresponding timestamp that is roughly sampling_interval seconds after the previous one. This is a maximum interval: if the data contains values closer together than sampling_interval, no action is taken.

The data used to fill gaps is chosen heuristically. The method is designed specifically for electrical data: it takes the value from one week earlier if present, otherwise the value from one day earlier, and falls back to the last value if neither is available.

Parameters:
    data (dict) – dictionary with column names as keys and column data as values
    date_name (string) – name of the date column
    sampling_interval (int) – the number of seconds between consecutive samples
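The gap-filling fallback (one week ago, then one day ago, then the last value) can be sketched for a single column. This is a simplified illustration, not the actual method: the name `fill_gaps_sketch` is hypothetical, it takes two parallel lists instead of the documented dictionary, and it only finds earlier values whose timestamps match exactly.

```python
WEEK = 7 * 24 * 3600  # seconds
DAY = 24 * 3600       # seconds


def fill_gaps_sketch(dates, values, sampling_interval=600):
    """Insert samples wherever two timestamps are more than
    `sampling_interval` seconds apart.

    Hypothetical stand-in for evenly_sampled(), reduced to one column.
    """
    if not dates:
        return [], []
    out_dates, out_values = [dates[0]], [values[0]]
    known = {dates[0]: values[0]}  # timestamp -> value seen so far
    for date, value in zip(dates[1:], values[1:]):
        # Fill the gap in front of this sample, one interval at a time.
        while date - out_dates[-1] > sampling_interval:
            fill_date = out_dates[-1] + sampling_interval
            if fill_date - WEEK in known:
                fill_value = known[fill_date - WEEK]
            elif fill_date - DAY in known:
                fill_value = known[fill_date - DAY]
            else:
                fill_value = out_values[-1]
            out_dates.append(fill_date)
            out_values.append(fill_value)
            known[fill_date] = fill_value
        out_dates.append(date)
        out_values.append(value)
        known[date] = value
    return out_dates, out_values


dates, values = fill_gaps_sketch([0, 600, 2400], [1.0, 2.0, 5.0])
```

Samples that are closer together than sampling_interval pass through unchanged, matching the "maximum interval" behaviour described above.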
