Accessing Timeseries Data in Python

Streaming access to timeseries packages in Python

Note: this functionality is currently in Beta and is subject to change as we improve and mature the service.

The Pennsieve Python client can use the Pennsieve Agent to stream timeseries data directly into Python Data Frames. The following diagram broadly outlines what happens when you request a range of data using the Python client (or the Agent CLI).

Simplified process for streaming time-series data into Python data frames. When an uploaded file is processed on Pennsieve, it is stored as a set of compressed blocks with a time-range index. A request from the client for a specific range downloads the corresponding blocks, caches those locally, and streams the data into a Python Data Frame.
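The lookup-and-cache step described above can be sketched in a few lines. This is an illustrative model only, not the Pennsieve Agent's actual implementation: the `Block` class, the `blocks_for_range` and `load_blocks` helpers, and the storage keys are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One compressed block and the time range it covers (hypothetical model)."""
    start: int  # inclusive start time
    end: int    # exclusive end time
    key: str    # hypothetical storage key for the block


def blocks_for_range(index, start, end):
    """Return the blocks whose [start, end) interval overlaps the request."""
    return [b for b in index if b.start < end and b.end > start]


def load_blocks(index, start, end, cache, fetch):
    """Fetch overlapping blocks, reusing any that are already cached locally."""
    payloads = []
    for block in blocks_for_range(index, start, end):
        if block.key not in cache:
            cache[block.key] = fetch(block.key)  # download only on a cache miss
        payloads.append(cache[block.key])
    return payloads


# A toy index of three consecutive blocks.
index = [
    Block(0, 1000, "block-0"),
    Block(1000, 2000, "block-1"),
    Block(2000, 3000, "block-2"),
]

downloads = []  # records which keys were actually "downloaded"


def fake_fetch(key):
    downloads.append(key)
    return f"data for {key}"


cache = {}
first = load_blocks(index, 1500, 2500, cache, fake_fetch)
second = load_blocks(index, 1500, 2500, cache, fake_fetch)  # served from cache
```

In this model, a repeated request for the same range triggers no new downloads, which is the behavior a force-refresh option would override.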

Example Workflow in Python

First, make sure the Pennsieve library is installed, then import it and create a Pennsieve object.

from pennsieve import Pennsieve
p = Pennsieve()

Make sure you are using the correct Pennsieve profile, then set the dataset you want to work with.

p.use_dataset('N:dataset:a3718f3e-b312-48a9-b549-xxxxxxxxxxxx')

Request the list of channels associated with a package.

channels = p.timeseries.getChannels(p.dataset, 'N:package:d2a472e2-f9ce-43dc-b05b-xxxxxxxxxxxx', True)

The response includes useful information such as the start and end times of each channel. Next, request a range of data on a specific channel. You can specify whether the start and end times you provide are absolute or relative to the beginning of the data, and whether to force a refresh of the cached data. To return all channels, pass an empty array for the channel IDs.

res = p.timeseries.getRangeForChannels(
    p.dataset, 'N:package:d2a472e2-f9ce-43dc-b05b-xxxxxxxxxxxx', [channels[0].id],
    1441492006480000, 1443657207096000, False, False)
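The start and end values in the request above look like microseconds since the Unix epoch. That unit is an assumption based on the magnitude of the numbers, so confirm it against the channel start and end times returned for your package. Under that assumption, the following sketch converts between timezone-aware `datetime` objects and integer timestamps; `to_epoch_micros` and `from_epoch_micros` are hypothetical helper names, not part of the Pennsieve client.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_micros(dt):
    """Convert a timezone-aware datetime to integer microseconds since the epoch."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Dividing two timedeltas with // yields an exact integer.
    return (dt - EPOCH) // timedelta(microseconds=1)


def from_epoch_micros(micros):
    """Convert integer microseconds since the epoch to a UTC datetime."""
    return EPOCH + timedelta(microseconds=micros)


# Assuming microsecond units, this equals the start value used in the request above.
start = to_epoch_micros(datetime(2015, 9, 5, 22, 26, 46, 480000, tzinfo=timezone.utc))
start_as_datetime = from_epoch_micros(start)
```

Integer arithmetic is used throughout to avoid the rounding errors that can appear when multiplying floating-point `timestamp()` values.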

Finally, you can inspect the returned data frame.
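As a minimal sketch of that inspection, assuming the result is a pandas `DataFrame`: the column name and values below are made-up stand-ins so the example runs without a Pennsieve connection, and `summarize_frame` is a hypothetical helper. The actual layout of the returned frame may differ.

```python
import pandas as pd


def summarize_frame(df):
    """Collect a few basic facts about a data frame for a quick sanity check."""
    return {
        "rows": len(df),
        "columns": list(df.columns),
        "empty": df.empty,
    }


# Stand-in for the result of the range request (hypothetical column and values).
res = pd.DataFrame({"channel_a": [0.1, 0.2, 0.3]})

summary = summarize_frame(res)
print(summary)
print(res.head())  # first few rows of the frame
```

An empty frame usually means the requested range does not overlap the channel's start and end times, so checking `empty` first is a cheap way to catch a wrong time range.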