CVec Client Library

The "cvec" package is the Python SDK for CVector Energy.

Getting Started

Installation

Assuming that you have a supported version of Python installed, you can first create a venv with:

python -m venv .venv

Then, activate the venv:

. .venv/bin/activate

Then, you can install cvec from PyPI with:

pip install cvec

Using cvec

Import the cvec package. We will also use the datetime module.

import cvec
from datetime import datetime

Construct the CVec client. The host and api_key can be passed as constructor parameters or read from the environment variables CVEC_HOST and CVEC_API_KEY:

cvec = cvec.CVec()
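
The lookup order described above (an explicit argument first, then the environment variable) can be sketched with the standard library alone. The helper resolve_setting below is hypothetical and not part of the SDK; raising ValueError when nothing is set is an assumption, and the real constructor may behave differently.

```python
import os


def resolve_setting(explicit, env_name, environ=None):
    """Return the explicit value if given, else the environment variable.

    Hypothetical helper for illustration; not part of the cvec SDK.
    """
    environ = os.environ if environ is None else environ
    if explicit is not None:
        return explicit
    value = environ.get(env_name)
    if value is None:
        # Assumption: a missing setting is treated as an error.
        raise ValueError(f"{env_name} is not set and no value was passed")
    return value


# A fake environment is used so nothing real is read or changed.
fake_env = {
    "CVEC_HOST": "https://your-cvec-host.com",
    "CVEC_API_KEY": "your-api-key",
}
host = resolve_setting(None, "CVEC_HOST", fake_env)
api_key = resolve_setting("explicit-key", "CVEC_API_KEY", fake_env)
print(host, api_key)
```

In practice this means CVEC_HOST and CVEC_API_KEY can be exported once in the shell, and the constructor can then be called without arguments.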

Spans

A span is a period of interest, such as an experiment, a baseline recording session, or an alarm. Currently, a span is defined implicitly as a period during which a given metric has a constant value.

The newest span for a metric does not have an end time, since it has not ended yet (or has not ended by the finish of the queried period).

To get the spans on the metric mygroup/myedge/node since 10am on 2025-05-14, run:

for span in cvec.get_spans("mygroup/myedge/node", start_at=datetime(2025, 5, 14, 10, 0, 0)):
    print("%s\t%s" % (span.value, span.raw_start_at))

The output will be like:

offline   2025-05-19 16:28:02.130000+00:00
starting  2025-05-19 16:28:01.107000+00:00
running   2025-05-19 15:29:28.795000+00:00
stopping  2025-05-19 15:29:27.788000+00:00
offline   2025-05-19 14:14:43.752000+00:00
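
Because spans arrive newest first and the newest one has no end time, totalling the time spent in each state needs a cutoff for the open span. The sketch below assumes objects with the value, raw_start_at, and raw_end_at attributes documented for Span later in this README; time_per_value is a hypothetical helper, and SimpleNamespace objects stand in for real spans.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace


def time_per_value(spans, now):
    """Sum the duration of each span, grouped by span.value.

    The newest span has raw_end_at=None, so `now` is used as its end.
    """
    totals = defaultdict(timedelta)
    for span in spans:
        end = span.raw_end_at if span.raw_end_at is not None else now
        totals[span.value] += end - span.raw_start_at
    return dict(totals)


# Stand-ins for Span objects, newest first; the timestamps are made up.
utc = timezone.utc
t0 = datetime(2025, 5, 19, 14, 0, tzinfo=utc)
t1 = datetime(2025, 5, 19, 15, 0, tzinfo=utc)
t2 = datetime(2025, 5, 19, 15, 30, tzinfo=utc)
spans = [
    SimpleNamespace(value="offline", raw_start_at=t2, raw_end_at=None),
    SimpleNamespace(value="running", raw_start_at=t1, raw_end_at=t2),
    SimpleNamespace(value="offline", raw_start_at=t0, raw_end_at=t1),
]
totals = time_per_value(spans, now=datetime(2025, 5, 19, 16, 0, tzinfo=utc))
for value, total in totals.items():
    print(value, total)
```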

Metrics

A metric is a named set of time-series data points pertaining to a particular resource (for example, the value reported by a sensor). Metrics can have numeric or string values. Boolean values are mapped to 0 and 1. The get_metrics function returns a list of metric metadata.

To get all of the metrics that changed value between 10am and 11am on 2025-05-14, run:

for item in cvec.get_metrics(start_at=datetime(2025, 5, 14, 10, 0, 0), end_at=datetime(2025, 5, 14, 11, 0, 0)):
    print(item.name)

Example output:

mygroup/myedge/compressor01/status
mygroup/myedge/compressor01/interlocks/emergency_stop
mygroup/myedge/compressor01/stage1/pressure_out/psig
mygroup/myedge/compressor01/stage1/temp_out/c
mygroup/myedge/compressor01/stage2/pressure_out/psig
mygroup/myedge/compressor01/stage2/temp_out/c
mygroup/myedge/compressor01/motor/current/a
mygroup/myedge/compressor01/motor/power_kw

Metric Data

The main content for a metric is a set of points where the metric value changed. These are returned with columns for name, time, value_double, value_string.

To get all of the value changes for all metrics between 10am and 11am on 2025-05-14, run:

cvec.get_metric_data(start_at=datetime(2025, 5, 14, 10, 0, 0), end_at=datetime(2025, 5, 14, 11, 0, 0))

Example output:

                                                        name                             time  value_double value_string
0      mygroup/myedge/mode                                   2025-05-14 10:10:41.949000+00:00     24.900000     starting
1      mygroup/myedge/compressor01/interlocks/emergency_stop 2025-05-14 10:27:24.899000+00:00     0.0000000         None
2      mygroup/myedge/compressor01/stage1/pressure_out/psig  2025-05-14 10:43:38.282000+00:00     123.50000         None
3      mygroup/myedge/compressor01/stage1/temp_out/c         2025-05-14 10:10:41.948000+00:00     24.900000         None
4      mygroup/myedge/compressor01/motor/current/a           2025-05-14 10:27:24.897000+00:00     12.000000         None
...                                   ...                              ...           ...          ...
46253  mygroup/myedge/compressor01/stage1/temp_out/c         2025-05-14 10:59:55.725000+00:00     25.300000         None
46254  mygroup/myedge/compressor01/stage2/pressure_out/psig  2025-05-14 10:59:56.736000+00:00     250.00000         None
46255  mygroup/myedge/compressor01/stage2/temp_out/c         2025-05-14 10:59:57.746000+00:00     12.700000         None
46256  mygroup/myedge/compressor01/motor/current/a           2025-05-14 10:59:58.752000+00:00     11.300000         None
46257  mygroup/myedge/compressor01/motor/power_kw            2025-05-14 10:59:59.760000+00:00     523.40000         None

[46258 rows x 4 columns]

Pandas Data Frames

Use the get_metric_arrow function to efficiently load data into a pandas DataFrame like this:

import pandas as pd
import pyarrow as pa

reader = pa.ipc.open_file(cvec.get_metric_arrow(names=["tag1", "tag2"]))
df = reader.read_pandas()

Adding Metric Data

To add new metric data points, you create a list of MetricDataPoint objects and pass them to add_metric_data. Each MetricDataPoint should have a name, a time, and either a value_double (for numeric values) or a value_string (for string values).

from datetime import datetime
from cvec.models import MetricDataPoint

# Assuming 'cvec' client is already initialized

# Create some data points
data_points = [
    MetricDataPoint(
        name="mygroup/myedge/compressor01/stage1/temp_out/c",
        time=datetime(2025, 7, 29, 10, 0, 0),
        value_double=25.5,
    ),
    MetricDataPoint(
        name="mygroup/myedge/compressor01/status",
        time=datetime(2025, 7, 29, 10, 0, 5),
        value_string="running",
    ),
]

# Add the data points to CVec
cvec.add_metric_data(data_points)
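
When readings arrive as plain Python values, choosing between value_double and value_string can be done mechanically. The sketch below builds the keyword arguments for each point without importing the SDK. The helper point_fields is hypothetical, and storing booleans as 0.0 or 1.0 follows the mapping described in the Metrics section.

```python
from datetime import datetime


def point_fields(name, time, value):
    """Build keyword arguments for one data point from a raw reading.

    Numbers go to value_double, strings to value_string, and booleans
    are stored as 0.0 or 1.0. Hypothetical helper; not part of the SDK.
    """
    fields = {"name": name, "time": time}
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        fields["value_double"] = 1.0 if value else 0.0
    elif isinstance(value, (int, float)):
        fields["value_double"] = float(value)
    elif isinstance(value, str):
        fields["value_string"] = value
    else:
        raise TypeError(f"unsupported value type: {type(value).__name__}")
    return fields


sample_time = datetime(2025, 7, 29, 10, 0, 0)
readings = {
    "mygroup/myedge/compressor01/stage1/temp_out/c": 25.5,
    "mygroup/myedge/compressor01/status": "running",
    "mygroup/myedge/compressor01/interlocks/emergency_stop": False,
}
rows = [point_fields(name, sample_time, value) for name, value in readings.items()]
for row in rows:
    print(row)
```

With the SDK installed, each dictionary could then be expanded into a data point, for example MetricDataPoint(**row), before the list is passed to add_metric_data.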

CSV Import Tool

The repository includes a command-line script for importing CSV data into CVec. The script is located at scripts/csv_import.py.

Usage

python scripts/csv_import.py [options] csv_file

Options

  • csv_file: Path to the CSV file to import (required)
  • --prefix PREFIX: Prefix to add to metric names (separated by '/')
  • --host HOST: CVec host URL (overrides CVEC_HOST environment variable)
  • --api-key API_KEY: CVec API key (overrides CVEC_API_KEY environment variable)

CSV Format

The CSV file must have:

  • A header row with column names
  • A timestamp column (case-insensitive: "timestamp", "Timestamp", etc.)
  • One or more metric columns

Example CSV:

timestamp,rain_rate,actual_inflow,predicted_inflow
2025-01-01 00:00:00,0.5,100.2,95.8
2025-01-01 01:00:00,1.2,150.5,145.3
2025-01-01 02:00:00,0.8,120.1,118.7

Examples

# Basic import
python scripts/csv_import.py data.csv

# Add prefix to metric names (rain_rate becomes "weather/rain_rate")
python scripts/csv_import.py data.csv --prefix "weather"

# Specify CVec connection details
python scripts/csv_import.py data.csv --host "https://your-cvec-host.com" --api-key "your-api-key"

The script automatically:

  • Detects numeric vs string values
  • Supports multiple timestamp formats
  • Provides detailed progress information
  • Handles errors gracefully
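
To make the CSV format concrete, the sketch below shows one way such a file could be turned into (name, timestamp, value) tuples: the timestamp column is found case-insensitively, the optional prefix is joined with '/', and each cell is treated as numeric if it parses as a float. This is an illustration only and not the implementation in scripts/csv_import.py; csv_to_points is a hypothetical name, and timestamps are left as strings rather than parsed.

```python
import csv
import io


def csv_to_points(text, prefix=None):
    """Turn CSV text into (name, timestamp, value) tuples.

    Illustrative only: the real script's parsing rules may differ.
    """
    reader = csv.DictReader(io.StringIO(text))
    # Find the timestamp column regardless of case.
    ts_col = next(c for c in reader.fieldnames if c.lower() == "timestamp")
    points = []
    for row in reader:
        for column, raw in row.items():
            if column == ts_col:
                continue
            name = f"{prefix}/{column}" if prefix else column
            try:
                value = float(raw)  # numeric value
            except ValueError:
                value = raw  # keep as a string value
            points.append((name, row[ts_col], value))
    return points


sample = """timestamp,rain_rate,actual_inflow,predicted_inflow
2025-01-01 00:00:00,0.5,100.2,95.8
2025-01-01 01:00:00,1.2,150.5,145.3
"""
points = csv_to_points(sample, prefix="weather")
print(points[0])
```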

CVec Class

The SDK provides an API client class named CVec with the following functions.

__init__(?host, ?api_key, ?default_start_at, ?default_end_at)

Set up the SDK with the given host and API key. The host and API key are loaded from the environment variables CVEC_HOST and CVEC_API_KEY if they are not given as arguments to the constructor. The tenant ID is automatically fetched from the host's /config endpoint. The default_start_at and default_end_at arguments can provide a default query time interval for API methods.
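
The interaction between per-call arguments and the class defaults can be pictured as a simple fallback. The class below is a hypothetical stand-in, not the SDK's code: it assumes that a per-call value takes priority over the default, and that None on either side means the query is unbounded there.

```python
from datetime import datetime


class QueryDefaults:
    """Hypothetical stand-in showing per-call values overriding defaults."""

    def __init__(self, default_start_at=None, default_end_at=None):
        self.default_start_at = default_start_at
        self.default_end_at = default_end_at

    def interval(self, start_at=None, end_at=None):
        # A per-call argument wins; otherwise fall back to the default.
        # None in the result means "unbounded" on that side.
        start = start_at if start_at is not None else self.default_start_at
        end = end_at if end_at is not None else self.default_end_at
        return start, end


defaults = QueryDefaults(default_start_at=datetime(2025, 5, 14, 10, 0, 0))
print(defaults.interval())
print(defaults.interval(start_at=datetime(2025, 5, 15), end_at=datetime(2025, 5, 16)))
```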

get_spans(name, ?start_at, ?end_at, ?limit)

Return time spans for a metric. Spans are generated from value changes that occur after start_at (if specified) and before end_at (if specified). If start_at is None (e.g., not provided as an argument and no class default default_start_at is set), the query for value changes is unbounded at the start. Similarly, if end_at is None, the query is unbounded at the end.

Each Span object in the returned list represents a period where the metric's value is constant and has the following attributes:

  • value: The metric's value during the span.
  • name: The name of the metric.
  • raw_start_at: The timestamp of the value change that initiated this span's value. This will be greater than or equal to the query's start_at if one was specified.
  • raw_end_at: The timestamp marking the end of this span's constant value. For the newest span, the value is None. For other spans, it is the raw_start_at of the immediately newer span, which is the previous span in the list because the list is sorted newest first.
  • id: Currently None. In a future version of the SDK, this will be the span's unique identifier.
  • metadata: Currently None. In a future version, this can be used to store annotations or other metadata related to the span.

Returns a list of Span objects, sorted in descending chronological order (newest span first). If no relevant value changes are found, an empty list is returned.
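
The relationship between value changes and spans can be sketched as follows. This is not the SDK's implementation: SpanSketch is a stand-in that carries only some of the documented attributes, and spans_from_changes is a hypothetical helper that assumes the input is a list of (time, value) pairs.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Union


@dataclass
class SpanSketch:
    """Stand-in with the documented Span attributes (id and metadata omitted)."""

    name: str
    value: Union[float, str]
    raw_start_at: datetime
    raw_end_at: Optional[datetime]


def spans_from_changes(name, changes):
    """Build spans, newest first, from (time, value) changes in any order."""
    ordered = sorted(changes, key=lambda change: change[0], reverse=True)
    spans = []
    newer_start = None  # the newest span has no end time
    for time, value in ordered:
        spans.append(SpanSketch(name, value, time, newer_start))
        newer_start = time  # this span's start ends the next, older span
    return spans


changes = [
    (datetime(2025, 5, 19, 14, 0, 0), "offline"),
    (datetime(2025, 5, 19, 15, 0, 0), "starting"),
    (datetime(2025, 5, 19, 15, 0, 5), "running"),
]
spans = spans_from_changes("mygroup/myedge/node", changes)
for span in spans:
    print(span.value, span.raw_start_at, span.raw_end_at)
```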

get_metric_data(?names, ?start_at, ?end_at)

Return all data points within a given [start_at, end_at) interval, optionally selecting a given list of metric names. The return value is a pandas DataFrame with four columns: name, time, value_double, value_string. One row is returned for each metric value transition.

add_metric_data(data_points, ?use_arrow)

Add multiple metric data points to the database.

  • data_points: A list of MetricDataPoint objects to add.
  • use_arrow: An optional boolean. If True, data is sent to the server using the more efficient Apache Arrow format. This is recommended for large datasets. Defaults to False.

get_metrics(?start_at, ?end_at)

Return a list of metrics that had at least one transition in the given [start_at, end_at) interval. All metrics are returned if no start_at and end_at are given.
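
The [start_at, end_at) notation means the start bound is included and the end bound is excluded. A minimal sketch of that test is shown below; in_interval is a hypothetical helper, and treating None as unbounded is an assumption carried over from the get_spans description.

```python
from datetime import datetime


def in_interval(time, start_at=None, end_at=None):
    """True if start_at <= time < end_at; None means unbounded on that side."""
    if start_at is not None and time < start_at:
        return False
    if end_at is not None and time >= end_at:
        return False
    return True


start = datetime(2025, 5, 14, 10, 0, 0)
end = datetime(2025, 5, 14, 11, 0, 0)
print(in_interval(start, start, end))  # a transition exactly at the start bound
print(in_interval(end, start, end))    # a transition exactly at the end bound
```

One consequence of the half-open convention is that back-to-back queries, such as 10am to 11am followed by 11am to 12pm, should not count the same transition twice.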

get_modeling_metrics(?start_at, ?end_at)

Fetch modeling metrics from the modeling database. This method returns a list of available modeling metrics that had transitions in the specified time range.

  • start_at: Optional start date for the query range (uses class default if not specified)
  • end_at: Optional end date for the query range (uses class default if not specified)

Returns a list of Metric objects containing modeling metrics.

get_modeling_metrics_data(?names, ?start_at, ?end_at)

Fetch actual data values from modeling metrics within a time range. This method returns the actual data points (values) for the specified modeling metrics, similar to get_metric_data() but for the modeling database.

  • names: Optional list of modeling metric names to filter by
  • start_at: Optional start time for the query (uses class default if not specified)
  • end_at: Optional end time for the query (uses class default if not specified)

Returns a list of MetricDataPoint objects containing the actual data values.

get_modeling_metrics_data_arrow(?names, ?start_at, ?end_at)

Fetch actual data values from modeling metrics within a time range in Apache Arrow format. This method returns the actual data points (values) for the specified modeling metrics in Arrow IPC format, which is more efficient for large datasets.

  • names: Optional list of modeling metric names to filter by
  • start_at: Optional start time for the query (uses class default if not specified)
  • end_at: Optional end time for the query (uses class default if not specified)

Returns Arrow IPC format data that can be read using pyarrow.ipc.open_file().

get_eav_tables()

Get all EAV (Entity-Attribute-Value) tables for the tenant. EAV tables store semi-structured data where each row represents an entity with flexible attributes.

Returns a list of EAVTable objects, each containing:

  • id: The table's UUID
  • tenant_id: The tenant ID
  • name: Human-readable table name
  • created_at: When the table was created
  • updated_at: When the table was last updated

Example:

tables = cvec.get_eav_tables()
for table in tables:
    print(f"{table.name} (id: {table.id})")

get_eav_columns(table_id)

Get all columns for a specific EAV table.

  • table_id: The UUID of the EAV table

Returns a list of EAVColumn objects, each containing:

  • eav_table_id: The parent table's UUID
  • eav_column_id: The column's ID (used for queries)
  • name: Human-readable column name
  • type: Data type ("number", "string", or "boolean")
  • created_at: When the column was created

Example:

columns = cvec.get_eav_columns("00000000-0000-0000-0000-000000000000")
for column in columns:
    print(f"  {column.name} ({column.type}, id: {column.eav_column_id})")
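
When the ID-based query method select_from_eav_id is used, a lookup from column names to column IDs is often handy. The sketch below builds one from objects that expose the name and eav_column_id attributes listed above. SimpleNamespace objects stand in for EAVColumn, the IDs are made up, and column_ids_by_name is a hypothetical helper.

```python
from types import SimpleNamespace


def column_ids_by_name(columns):
    """Map each column's name to its eav_column_id."""
    return {column.name: column.eav_column_id for column in columns}


# Stand-ins for EAVColumn objects; the IDs are made up.
columns = [
    SimpleNamespace(name="Weight", type="number", eav_column_id="abcd"),
    SimpleNamespace(name="Status", type="string", eav_column_id="efgh"),
]
ids = column_ids_by_name(columns)
print(ids["Weight"], ids["Status"])
```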

select_from_eav(table_name, ?column_names, ?filters)

Query pivoted data from EAV tables using human-readable names. This is the recommended method for most use cases as it allows you to work with table and column names instead of UUIDs.

  • table_name: Name of the EAV table to query
  • column_names: Optional list of column names to include. If None, all columns are returned.
  • filters: Optional list of EAVFilter objects to filter results

Each EAVFilter must use column_name and can specify:

  • column_name: The column name to filter on (required)
  • numeric_min: Minimum numeric value (inclusive)
  • numeric_max: Maximum numeric value (exclusive)
  • string_value: Exact string value to match
  • boolean_value: Boolean value to match

Returns a list of dictionaries (maximum 1000 rows), each representing a row with an id field and fields for each requested column.

Example:

from cvec import CVec, EAVFilter

# Query with filters
filters = [
    EAVFilter(column_name="Weight", numeric_min=100, numeric_max=200),
    EAVFilter(column_name="Status", string_value="ACTIVE"),
]

rows = cvec.select_from_eav(
    table_name="Production Data",
    column_names=["Date", "Weight", "Status"],
    filters=filters,
)

for row in rows:
    print(row)
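
The filter bounds are easy to get wrong at the edges: numeric_min is inclusive and numeric_max is exclusive. The local sketch below applies the same rules to a list of dictionaries so the edge cases can be checked without a server. FilterSketch is a stand-in, not the SDK's EAVFilter. Two behaviors are assumptions rather than documented facts: filters are combined with AND, and a row with a missing value never matches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FilterSketch:
    """Stand-in with the documented EAVFilter fields (name-based variant)."""

    column_name: str
    numeric_min: Optional[float] = None
    numeric_max: Optional[float] = None
    string_value: Optional[str] = None
    boolean_value: Optional[bool] = None


def matches(row, flt):
    """Check one row (a dict) against one filter."""
    value = row.get(flt.column_name)
    if value is None:
        return False  # assumption: a missing value never matches
    if flt.numeric_min is not None and value < flt.numeric_min:
        return False  # minimum is inclusive
    if flt.numeric_max is not None and value >= flt.numeric_max:
        return False  # maximum is exclusive
    if flt.string_value is not None and value != flt.string_value:
        return False
    if flt.boolean_value is not None and value != flt.boolean_value:
        return False
    return True


def select(rows, filters):
    """Keep rows that satisfy every filter (assumed to combine with AND)."""
    return [row for row in rows if all(matches(row, f) for f in filters)]


sample_rows = [
    {"id": "r1", "Weight": 100, "Status": "ACTIVE"},
    {"id": "r2", "Weight": 200, "Status": "ACTIVE"},
    {"id": "r3", "Weight": 150, "Status": "IDLE"},
]
sample_filters = [
    FilterSketch(column_name="Weight", numeric_min=100, numeric_max=200),
    FilterSketch(column_name="Status", string_value="ACTIVE"),
]
selected = select(sample_rows, sample_filters)
print([row["id"] for row in selected])
```

Here a Weight of exactly 100 passes the minimum, while a Weight of exactly 200 is rejected by the maximum.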

select_from_eav_id(table_id, ?column_ids, ?filters)

Query pivoted data from EAV tables using table and column IDs directly. This is a lower-level method for cases where you already have the UUIDs and want to avoid name lookups.

  • table_id: UUID of the EAV table to query
  • column_ids: Optional list of column IDs to include. If None, all columns are returned.
  • filters: Optional list of EAVFilter objects to filter results

Each EAVFilter must use column_id and can specify:

  • column_id: The column ID to filter on (required)
  • numeric_min: Minimum numeric value (inclusive)
  • numeric_max: Maximum numeric value (exclusive)
  • string_value: Exact string value to match
  • boolean_value: Boolean value to match

Returns a list of dictionaries (up to 1000), each representing a row. Each row has a field for each column plus an id (the "row ID").

Example:

from cvec import EAVFilter

filters = [
    EAVFilter(column_id="abcd", numeric_min=100, numeric_max=200),
    EAVFilter(column_id="efgh", string_value="ACTIVE"),
]

rows = cvec.select_from_eav_id(
    table_id="00000000-0000-0000-0000-000000000000",
    column_ids=["abcd", "efgh", "ijkl"],
    filters=filters,
)

for row in rows:
    print(f"ID: {row['id']}, Values: {row}")
