Title: IoT Sensor Telemetry Protocol
Version: 0.90 (breakingly unstable, until 1.00)
Status: Running Code
Created: 2026-02-07
Authors: Matthew Gream
Licence: Attribution-ShareAlike 4.0 International
https://creativecommons.org/licenses/by-sa/4.0
Location: https://libiotdata.org
https://github.com/matthewgream/libiotdata
This document specifies a bit-packed telemetry protocol for battery- and transmission-constrained IoT sensor systems, with particular emphasis on LoRa-based remote environmental monitoring. It can be deployed in both point-to-point and mesh-relay topologies.
The protocol has a reference implementation in C (libiotdata) which is the
normative source for any ambiguity in this specification. In the tradition of
RFC 1, the specification is informed by running code.
Discussion of this document and the reference implementation takes place on the project's GitHub repository.
- 1. Introduction
- 2. Conventions and Terminology
- 3. Design Principles
- 4. Packet Structure Overview
- 5. Header
- 6. Presence Bytes
- 7. Variant Definitions
- 8. Field Encodings
- 8.1. Battery
- 8.2. Link
- 8.3. Environment
- 8.4. Solar
- 8.5. Depth
- 8.6. Flags
- 8.7. Position
- 8.8. Datetime
- 8.9. Temperature (standalone)
- 8.10. Pressure (standalone)
- 8.11. Humidity (standalone)
- 8.12. Wind (bundle)
- 8.13. Wind Speed (standalone)
- 8.14. Wind Direction (standalone)
- 8.15. Wind Gust (standalone)
- 8.16. Rain (bundle)
- 8.17. Rain Rate (standalone)
- 8.18. Rain Size (standalone)
- 8.19. Air Quality (bundle)
- 8.20. Air Quality Index (standalone)
- 8.21. Air Quality PM (standalone)
- 8.22. Air Quality Gas (standalone)
- 8.23. Radiation (bundle)
- 8.24. Radiation CPM (standalone)
- 8.25. Radiation Dose (standalone)
- 8.26. Clouds
- 8.27. Image
- 9. TLV Data
- 10. Canonical JSON Representation
- 11. Receiver Considerations
- 12. Packet Size Reference
- 13. Implementation Notes
- 14. Security Considerations
- 15. Future Work
- 16. Versioning and Forward Compatibility
- Appendix A. 6-Bit Character Table
- Appendix B. Quantisation Worked Examples
- Appendix C. Complete Encoder Example
- Appendix D. Transmission Medium Considerations
- Appendix E. System Implementation Considerations
- Appendix F. Example Weather Station Output
- Appendix G. Mesh Protocol
- Appendix H. System Architecture Considerations
- Appendix I. Comparison with Alternative Encodings and Embedded Libraries
- Appendix J. Known Limitations and Open Issues
Remote environmental monitoring systems — weather stations, snow depth sensors, ice thickness gauges — are frequently deployed in locations without mains power or wired connectivity. These devices are constrained along three axes simultaneously:
- Power: battery and/or small solar panel, particularly in locations with limited winter daylight.
- Communications: LoRa, 802.11ah, SigFox, cellular SMS, low-frequency RF, or similar low-power point-to-point, wide-area, or mesh networks with effective payload limits of tens of bytes per transmission. Regulatory limits on transmission time (typically 1% duty cycle in EU ISM bands) mean that every byte transmitted has a direct cost in time-on-air and energy.
- Compute: small, inexpensive embedded microcontrollers running at tens of megahertz with tens or hundreds of kilobytes of RAM and program storage, where code size and complexity are real constraints, and where operating system and protocol support is limited or absent.
Existing serialisation approaches — JSON, Protobuf, CBOR, even raw C structs — waste bits on byte alignment, field delimiters, schema metadata, or fixed-width fields for data that could be represented in far fewer bits.
The IoT Sensor Telemetry Protocol (iotdata) addresses this by defining a bit-packed wire format where each field is quantised to the minimum number of bits required for its operational range and resolution. A typical weather station packet — battery, link quality, temperature, pressure, humidity, wind speed, direction and gust, rain rate and drop size, solar irradiance and UV index, plus 8 flag bits — fits in 16 bytes. A full-featured packet adding air quality, cloud cover, radiation CPM and dose, position latitude and longitude, and timestamp fits in 32 bytes.
The protocol is designed for transmit-only devices. There is no negotiation, handshake, or acknowledgement at this layer. A sensor wakes, encodes its readings, transmits, and sleeps. Transmissions are typically infrequent (minutes to hours), bursty, and rely on lower-layer integrity (checksums or CRC) without lower-layer reliability (retransmission or acknowledgement).
The protocol can be deployed in point-to-point arrangements, where edge devices transmit directly to one or more gateways, or in a mesh arrangement, where intermediate relays automatically and periodically (re)configure to determine primary and backup paths to gateways. Edge devices need no awareness of the mesh protocol and can operate identically with or without it.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Bit numbering: All bit diagrams in this document use MSB-first (big-endian) bit order. Bit 7 of a byte is the most significant bit and is transmitted first. Multi-bit fields are packed MSB-first: the most significant bit of a field occupies the earliest bit position in the stream.
Bit offset: Bit positions within a packet are numbered from 0, starting at the MSB of the first byte. Bit 0 is the MSB of byte 0; bit 7 is the LSB of byte 0; bit 8 is the MSB of byte 1; and so on.
Byte boundaries: Fields are NOT byte-aligned unless they happen to fall on a byte boundary. The packet is a continuous bit stream; byte boundaries have no structural significance. The final byte is zero-padded in its least-significant bits if the total bit count is not a multiple of 8.
Quantisation: The process of mapping a continuous or large-range value to a
reduced set of discrete steps that fit in fewer bits. All quantisation in this
protocol uses round() (round half away from zero), unless otherwise specified,
and can be carried out as floating-point or integer-only.
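The integer-only form of round-half-away-from-zero can be sketched as below. This is an illustrative sketch rather than the reference implementation's code: the helper names `div_round` and `temp_quantise` are hypothetical, and the temperature example assumes the input arrives in hundredths of a degree Celsius and uses the 0.25°C step and -40°C offset of the temperature encoding given later in this document.

```c
#include <assert.h>
#include <stdint.h>

/* Integer division rounding half away from zero; den must be positive.
 * Hypothetical helper, not part of libiotdata. */
static int32_t div_round(int32_t num, int32_t den)
{
    assert(den > 0);
    if (num >= 0)
        return (num + den / 2) / den;
    return -((-num + den / 2) / den);
}

/* Temperature in hundredths of a degree C -> 9-bit step count.
 * Matches round((temp_c + 40.0) / 0.25) for in-range input. */
static uint16_t temp_quantise(int32_t temp_centi)
{
    assert(temp_centi >= -4000 && temp_centi <= 8000);
    return (uint16_t)div_round(temp_centi + 4000, 25);
}
```

For example, 21.37°C (2137 hundredths) quantises to step 245, which decodes to 21.25°C.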
The following principles guided the protocol design:
- Bit efficiency over simplicity. Every field is quantised to the minimum bits that preserve operationally useful resolution. There is no byte-alignment padding between fields.
- Presence flags over fixed structure. Optional fields are indicated by presence bits, so a battery-only packet is 6 bytes and a full-telemetry packet is 24 bytes; the same protocol serves both.
- Variants over negotiation. Different sensor types (snow, ice, weather) can prioritise different fields in the compact first presence byte. The 4-bit variant field in the header selects the field mapping. No runtime negotiation is needed.
- Source-agnostic fields. Position may come from GNSS, WiFi geolocation, cell tower triangulation, or static configuration. Datetime may come from GNSS, NTP, or a local RTC. The wire encoding is the same regardless of source.
- Extensibility via TLV. Diagnostic data, firmware metadata, and user-defined payloads use a trailing TLV (type-length-value) section that does not affect the fixed field layout. TLV entries are typically intended for system data rather than sensor data.
- Encode-only on the sensor. The encoder is small enough for resource-constrained MCUs. JSON serialisation and other server-side features are optional and can be excluded from embedded builds. The reference implementation can build to 1 KB and non-reference implementations to less than 512 bytes.
- Transport-delegated integrity. The protocol carries no checksum, CRC, length field, or encryption. These functions are delegated to the underlying medium (LoRa CRC, LoRaWAN MIC, cellular security, etc.). A redundant CRC would cost 16-32 bits, which is significant when the entire payload may be 46 bits. Packet loss is tolerated: the sequence number (Section 5) enables detection without requiring retransmission.
- No global interoperability. It is expressly not a goal to support interoperability between implementations, e.g. between vendors. Rather, the design intends to provide an optimal framework and reference for a given deployment across a suite of devices. Interoperability may be a goal for future versions.
An iotdata packet consists of the following sections, in order:
+--------+------------+-------------+------------+
| Header | Presence | Data Fields | TLV Fields |
| 32 bits| 8 to 32 b. | variable | optional |
+--------+------------+-------------+------------+
All sections are packed as a continuous bit stream with no alignment gaps between them.
- Header (32 bits): Always present. Identifies the variant, station, and sequence number.
- Presence (8 to 32 bits): Always present. One to four presence bytes, chained via extension bits, indicate which data fields and TLV data follow.
- Data fields (variable): Zero or more sensor data fields, packed in the order defined by the variant's field table.
- TLV fields (variable, optional): Zero or more type-length-value data entries.
The minimum valid packet is 5 bytes (header + one presence byte with no fields set), though such a packet carries no sensor data and serves only as a heartbeat. In practice the minimum useful packet is 6 bytes (header + presence + battery = 46 bits).
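The continuous MSB-first packing described above can be illustrated with a small bit writer. This is a minimal sketch under stated assumptions, not the libiotdata encoder: the `bitw_t` type and the `bitw_*` function names are hypothetical, and the caller is assumed to supply a buffer large enough for the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: not part of libiotdata's public API. */
typedef struct {
    uint8_t *buf;    /* output buffer, zeroed by bitw_init */
    size_t   cap;    /* capacity in bytes */
    size_t   bitpos; /* next bit offset; 0 is the MSB of byte 0 */
} bitw_t;

static void bitw_init(bitw_t *w, uint8_t *buf, size_t cap)
{
    memset(buf, 0, cap); /* zeroing supplies the final-byte padding */
    w->buf = buf;
    w->cap = cap;
    w->bitpos = 0;
}

/* Append the low `nbits` bits of `value`, most significant bit first. */
static void bitw_put(bitw_t *w, uint32_t value, unsigned nbits)
{
    assert(nbits <= 32);
    assert(w->bitpos + nbits <= w->cap * 8);
    while (nbits-- > 0) {
        if ((value >> nbits) & 1u)
            w->buf[w->bitpos / 8] |= (uint8_t)(0x80u >> (w->bitpos % 8));
        w->bitpos++;
    }
}

/* Bytes occupied so far, counting a partially filled final byte. */
static size_t bitw_bytes(const bitw_t *w)
{
    return (w->bitpos + 7) / 8;
}
```

Writing a 32-bit header, one 8-bit presence byte, and a 6-bit battery field with this writer leaves `bitpos` at 46 and `bitw_bytes` at 6, matching the minimum useful packet size; the last two bits of the final byte stay zero.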
The header is always the first 32 bits of a packet.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Var | Station ID | Sequence |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Variant (4 bits, offset 0): Index into the variant field table (Section 7). Values 0-14 are usable for sensor-oriented data; value 15 is RESERVED for mesh protocol control messages. A device that is not mesh-capable SHOULD reject a packet with variant 15.
Station ID (12 bits, offset 4): Identifies the transmitting station. Range 0-4095. Station IDs are assigned by the deployment operator; this protocol does not define an allocation mechanism.
Sequence (16 bits, offset 16): Monotonically increasing packet counter, wrapping from 65535 to 0. The receiver MAY use this to detect lost packets. The wrap-around is expected and MUST NOT be treated as an error.
The header could be reduced to 24 bits by a reduction in the Station ID (from 12 to 8 bits, saving 4 bits) and the Sequence (from 16 to 12 bits, saving 4 bits). This would retain station diversity (at 256, rather than 4096) and loss detection (at 4096 packet window, rather than 65536). Such a modification is not contemplated in this version of the protocol.
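Because the 32-bit header is byte-aligned at the start of the packet, it can be packed and unpacked with plain shifts. The sketch below follows the 4/12/16-bit layout of this version of the protocol; the `hdr_t` type and function names are hypothetical, and `seq_lost` is one possible way for a receiver to count missed packets across the 65535-to-0 wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory form of the header; not a libiotdata type. */
typedef struct {
    uint8_t  variant;  /* 4 bits, 0-15 */
    uint16_t station;  /* 12 bits, 0-4095 */
    uint16_t sequence; /* 16 bits, wraps 65535 -> 0 */
} hdr_t;

static void hdr_pack(const hdr_t *h, uint8_t out[4])
{
    assert(h->variant <= 15 && h->station <= 4095);
    out[0] = (uint8_t)((h->variant << 4) | (h->station >> 8));
    out[1] = (uint8_t)(h->station & 0xFF);
    out[2] = (uint8_t)(h->sequence >> 8);
    out[3] = (uint8_t)(h->sequence & 0xFF);
}

static hdr_t hdr_unpack(const uint8_t in[4])
{
    hdr_t h;
    h.variant  = (uint8_t)(in[0] >> 4);
    h.station  = (uint16_t)(((in[0] & 0x0F) << 8) | in[1]);
    h.sequence = (uint16_t)((in[2] << 8) | in[3]);
    return h;
}

/* Packets missed between two received sequence numbers, modulo 65536.
 * The wrap from 65535 to 0 yields 0 lost packets, not an error. */
static uint16_t seq_lost(uint16_t prev, uint16_t cur)
{
    return (uint16_t)(cur - prev - 1u);
}
```

The modulo arithmetic cannot distinguish a gap of exactly 65536 packets from no gap, so a receiver that cares about very long outages would need an additional timeout.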
Immediately following the header, one or more presence bytes indicate which data fields are included in the packet. Presence bytes form an extension chain: each byte has an extension bit that, when set, indicates another presence byte follows.
7 6 5 4 3 2 1 0
+---+---+---+---+---+---+---+---+
|Ext|TLV| S5| S4| S3| S2| S1| S0|
+---+---+---+---+---+---+---+---+
- Ext (bit 7): Extension flag. If set, Presence Byte 1 follows immediately. If clear, no further presence bytes exist.
- TLV (bit 6): TLV data flag. If set, one or more TLV entries (Section 9) follow after all data fields. Builds excluding TLV might use this as a Data field, but this is not contemplated in this version of the protocol.
- S0-S5 (bits 5-0): Data fields 0 through 5. Each bit, when set, indicates that the corresponding field (as defined by the variant's field table) is present in the packet. The field data appears in field order: S0 first, then S1, S2, and so on.
Subsequent presence bytes (Presence Byte 1 onwards) are present only when the Ext bit in the preceding presence byte is set.
7 6 5 4 3 2 1 0
+---+---+---+---+---+---+---+---+
|Ext| S6| S5| S4| S3| S2| S1| S0|
+---+---+---+---+---+---+---+---+
- Ext (bit 7): Extension flag. If set, another presence byte follows. This allows chaining of an arbitrary number of presence bytes.
- S0-S6 (bits 6-0): Data fields for this presence byte. The first extension byte (pres1) carries fields 6-12, the second (pres2) carries fields 13-19, and so on.
The maximum number of data fields available depends on the number of presence bytes:
| Presence Bytes | Total Data Fields | Formula |
|---|---|---|
| 1 (pres0) | 6 | 6 |
| 2 (pres0+1) | 13 | 6 + 7 |
| 3 (pres0+1+2) | 20 | 6 + 7 + 7 |
| 4 (pres0+1+2+3) | 27 | 6 + 7 + 7 + 7 |
The reference implementation supports up to 4 presence bytes (27 data fields). In practice, the default weather station variant uses 2 presence bytes for 12 data fields. It is unlikely that an implementation would practically require more than 2-3 presence bytes.
The encoder only emits the minimum number of presence bytes needed for the fields actually present. If all set fields fit in pres0 (fields 0-5), no extension bytes are emitted, even if the variant defines fields in pres1. This optimisation reduces packet size for common transmissions that include only the most frequently updated fields.
Data fields are packed in strict field order. First, all set fields from Presence Byte 0 are packed in order S0, S1, ..., S5. Then, if Presence Byte 1 is present, fields S6 through S12 are packed. The TLV section (if present) always comes last, after all data fields.
The meaning of each field position — which sensor field type it represents — is determined entirely by the variant table (Section 7).
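The presence chain can be walked byte by byte, since it starts on a byte boundary directly after the 32-bit header. The following is a sketch only: `presence_t`, `presence_parse`, and `presence_bytes_needed` are hypothetical names, the flat bitmask (bit i set means data field i is present) is an illustrative convention, and the 4-byte cap mirrors the limit stated for the reference implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PRES_BYTES 4u /* limit stated for the reference implementation */

typedef struct {
    uint32_t fields;  /* bit i set => data field i present (0-26) */
    int      has_tlv; /* TLV flag from Presence Byte 0 */
    unsigned nbytes;  /* presence bytes consumed */
} presence_t;

/* Parse the chain at p (the byte after the header).
 * Returns 0 on success, -1 if the chain is truncated or too long. */
static int presence_parse(const uint8_t *p, size_t len, presence_t *out)
{
    unsigned base = 0;
    assert(p != NULL && out != NULL);
    out->fields = 0;
    out->has_tlv = 0;
    out->nbytes = 0;
    for (;;) {
        if (out->nbytes >= MAX_PRES_BYTES || out->nbytes >= len)
            return -1;
        uint8_t b = p[out->nbytes];
        if (out->nbytes == 0) {
            out->has_tlv = (b >> 6) & 1;             /* bit 6: TLV */
            out->fields |= (uint32_t)(b & 0x3F);      /* S0-S5 */
            base = 6;
        } else {
            out->fields |= (uint32_t)(b & 0x7F) << base; /* S0-S6 */
            base += 7;
        }
        out->nbytes++;
        if (!(b & 0x80))                              /* Ext clear: done */
            return 0;
    }
}

/* Minimum presence bytes an encoder needs for a given field bitmask. */
static unsigned presence_bytes_needed(uint32_t fields)
{
    unsigned n = 1;
    fields >>= 6;
    while (fields) {
        n++;
        fields >>= 7;
    }
    return n;
}
```

A decoder would then iterate over the set bits of `fields` in ascending order, look up each field type in the variant table, and read that field's bits from the stream.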
The variant field in the header selects a field mapping that determines which field type occupies each presence bit position. This mechanism allows different sensor types to prioritise their most commonly transmitted fields in Presence Byte 0, while less frequent fields (such as position and datetime) occupy later presence bytes and only trigger extension bytes when actually transmitted.
All field encodings (Section 8) are universal and independent of variant. The variant affects only which encoding type is associated with which field position, and which label is used in human-readable output and JSON serialisation.
Fields may be repeated, such as to specify multiple temperature entries which have different meanings (for example, the temperature of the microcontroller vs. the temperature of the environment). This is supported by the protocol, but not the current reference implementation (which will be modified at some future date to do so).
In the reference implementation, each variant is defined as:
typedef struct {
    iotdata_field_type_t type;  /* encoding type for this field */
    const char *label;          /* JSON key and display label */
} iotdata_field_def_t;

typedef struct {
    const char *name;
    uint8_t num_pres_bytes;
    iotdata_field_def_t fields[IOTDATA_MAX_DATA_FIELDS];
} iotdata_variant_def_t;

The fields[] array is flat: entries 0-5 map to Presence Byte 0, entries 6-12
to Presence Byte 1, entries 13-19 to Presence Byte 2, and so on. Unused trailing
fields should have type IOTDATA_FIELD_NONE.
The built-in default variant (variant 0) is a general-purpose weather station
layout. It is enabled by defining IOTDATA_VARIANT_MAPS_DEFAULT at compile
time. It is illustrative and not mandated for this use case: there are no
standardised variants, as global interoperability is not a goal.
| Pres Byte | Field | Type | Label | Bits |
|---|---|---|---|---|
| 0 | S0 | BATTERY | battery | 6 |
| 0 | S1 | LINK | link | 6 |
| 0 | S2 | ENVIRONMENT | environment | 24 |
| 0 | S3 | WIND | wind | 22 |
| 0 | S4 | RAIN | rain | 12 |
| 0 | S5 | SOLAR | solar | 14 |
| 1 | S6 | CLOUDS | clouds | 4 |
| 1 | S7 | AIR_QUALITY_INDEX | air_quality | 9 |
| 1 | S8 | RADIATION | radiation | 28 |
| 1 | S9 | POSITION | position | 48 |
| 1 | S10 | DATETIME | datetime | 24 |
| 1 | S11 | FLAGS | flags | 8 |
This layout prioritises the most commonly transmitted weather data (battery, environment, wind, rain, solar, link quality) in Presence Byte 0, minimising packet size for routine transmissions. The less frequently updated fields (position, datetime, radiation) are placed in Presence Byte 1 and only add to the packet when present.
Note that the weather station variant uses the ENVIRONMENT, WIND, RAIN, and RADIATION bundle types (see Sections 8.3, 8.12, 8.16, 8.23) rather than their individual component types. See Section 8 for a discussion of when to use bundled vs individual field types.
Applications can define their own variant tables at compile time using the
IOTDATA_VARIANT_MAPS and IOTDATA_VARIANT_MAPS_COUNT defines. This completely
replaces the default variant table.
/* Define custom variants */
const iotdata_variant_def_t my_variants[] = {
    [0] = {
        .name = "soil_sensor",
        .num_pres_bytes = 1,
        .fields = {
            { IOTDATA_FIELD_BATTERY,     "battery" },
            { IOTDATA_FIELD_LINK,        "link" },
            { IOTDATA_FIELD_TEMPERATURE, "soil_temp" },
            { IOTDATA_FIELD_HUMIDITY,    "soil_moist" },
            { IOTDATA_FIELD_DEPTH,       "soil_depth" },
            { IOTDATA_FIELD_NONE,        NULL },
        },
    },
};

Compile with:

cc -DIOTDATA_VARIANT_MAPS=my_variants -DIOTDATA_VARIANT_MAPS_COUNT=1 ...

Custom variants may use any combination of the available field types and may place them in any field position. Up to 15 variants can be registered as variant IDs 0-14, with variant 15 reserved for the mesh protocol (see Appendix G).
| Variant | Name | Pres Bytes | Fields | Notes |
|---|---|---|---|---|
| 0 | weather_station | 2 | 12 | Default (built-in) |
| 1-14 | (application) | — | — | User-defined via custom maps |
| 15 | MESH PROTOCOL | — | — | Mesh protocol (Appendix G) |
A receiver encountering an unknown variant SHOULD NOT process the packet and SHOULD flag it as using an unknown variant (see Section 11.4).
Each field type has a specified bit layout that is independent of which presence field it occupies. Fields are always packed MSB-first.
The protocol provides over 20 built-in field types. Some of these exist in both individual and bundled forms, to aid efficiency for cases where like data (e.g. temperature, pressure and humidity) are always concurrently measured and transmitted.
- Environment (Section 8.3) is a convenience bundle that packs temperature, pressure, and humidity into a single 24-bit field. The same three measurements are also available as individual field types: Temperature (8.9), Pressure (8.10), and Humidity (8.11). The encodings and quantisation are identical.
- Wind (Section 8.12) is a convenience bundle that packs wind speed, direction, and gust into a single 22-bit field. The same three measurements are also available as individual field types: Wind Speed (8.13), Wind Direction (8.14), and Wind Gust (8.15). The encodings and quantisation are identical.
- Rain (Section 8.16) is a convenience bundle that packs rain rate and rain size into a single 12-bit field. The same two measurements are also available as individual field types: Rain Rate (8.17) and Rain Size (8.18). The encodings and quantisation are identical.
- Air Quality (Section 8.19) is a convenience bundle that packs air quality index, air quality PM, and air quality gas into a single multi-bit field. The same three measurements are also available as individual field types: Air Quality Index (8.20), Air Quality PM (8.21), and Air Quality Gas (8.22). The encodings and quantisation are identical.
- Radiation (Section 8.23) is a convenience bundle that packs radiation CPM and radiation dose into a single 28-bit field. The same two measurements are also available as individual field types: Radiation CPM (8.24) and Radiation Dose (8.25). The encodings and quantisation are identical.
A variant definition chooses which form to use. The default weather station variant uses many of the bundled forms as the sensors generate the entire bundle of values concurrently. A custom variant might use the individual forms to include only the specific measurements it needs, or to place them in different priority positions, or where they are sourced from different sensors at different times. For example, the commonly used BME280/680 sensor can generate temperature, pressure and humidity readings concurrently.
Note that at this point, some bundles have no standalone forms, such as the Solar bundle with Irradiance and Ultraviolet measurements. This may be addressed in future versions of this protocol.
6 bits total.
0 1 2 3 4 5
+---+---+---+---+---+---+
| Level |Chg|
| (5 bits) |(1)|
+---+---+---+---+---+---+
Level (5 bits): Battery charge level, quantised from 0-100% to 0-31.
Encode: q = round(level_pct / 100.0 * 31.0)
Decode: level_pct = round(q / 31.0 * 100.0)
Resolution: ~3.2 percentage points.
Charging (1 bit): 1 = charging, 0 = discharging/not charging.
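The battery formulas above can be carried out without floating point. The sketch below is illustrative only: the function names are hypothetical, the input is assumed to be a whole percentage, and the 6-bit result is returned right-aligned in a byte, with the level in the upper five bits and the charging flag in the lowest bit as in the diagram.

```c
#include <assert.h>
#include <stdint.h>

/* Encode the 6-bit battery field: 5-bit level followed by 1 charging bit. */
static uint8_t battery_encode(unsigned level_pct, int charging)
{
    assert(level_pct <= 100);
    unsigned q = (level_pct * 31u + 50u) / 100u; /* round(level / 100 * 31) */
    return (uint8_t)((q << 1) | (charging ? 1u : 0u));
}

/* Recover the charge level in percent from the 6-bit field. */
static unsigned battery_decode_level(uint8_t field)
{
    unsigned q = (field >> 1) & 0x1Fu;
    return (q * 100u + 15u) / 31u;               /* round(q / 31 * 100) */
}

/* Recover the charging flag from the 6-bit field. */
static int battery_decode_charging(uint8_t field)
{
    return field & 1;
}
```

A 75% reading while charging encodes as level step 23 with the flag set (0x2F) and decodes as 74%, which is within the stated resolution of about 3.2 percentage points.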
6 bits total.
0 1 2 3 4 5
+---+---+---+---+---+---+
| RSSI | SNR |
| (4 bits) | (2) |
+---+---+---+---+---+---+
RSSI (4 bits): Range: -120 to -60 dBm. Resolution: 4 dBm (15 steps).
Encode: q = (rssi_dbm - (-120)) / 4
Decode: rssi_dbm = -120 + q * 4
SNR (2 bits): Range: -20 to +10 dB. Resolution: 10 dB (3 steps, 4 values: -20, -10, 0, +10).
Encode: q = round((snr_db - (-20.0)) / 10.0)
Decode: snr_db = -20.0 + q * 10.0
This field is source-agnostic: while designed for LoRa link metrics, the same encoding is suitable for 802.11ah or other low-power RF links with comparable RSSI and SNR ranges.
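A possible integer-only encoding of the link field is sketched below. The function names are hypothetical, and the clamping of out-of-range readings is an assumption of this sketch: the text above defines the ranges but does not say how an encoder should treat values outside them.

```c
#include <assert.h>
#include <stdint.h>

/* Encode the 6-bit link field: 4-bit RSSI step followed by 2-bit SNR step. */
static uint8_t link_encode(int rssi_dbm, int snr_db)
{
    /* Assumption: out-of-range readings are clamped, not rejected. */
    if (rssi_dbm < -120) rssi_dbm = -120;
    if (rssi_dbm > -60)  rssi_dbm = -60;
    if (snr_db < -20)    snr_db = -20;
    if (snr_db > 10)     snr_db = 10;
    unsigned rq = (unsigned)(rssi_dbm + 120) / 4u;   /* truncating, 0-15 */
    unsigned sq = (unsigned)(snr_db + 20 + 5) / 10u; /* rounded, 0-3 */
    assert(rq <= 15 && sq <= 3);
    return (uint8_t)((rq << 2) | sq);
}

static int link_decode_rssi(uint8_t field)
{
    return -120 + (int)((field >> 2) & 0x0Fu) * 4;
}

static int link_decode_snr(uint8_t field)
{
    return -20 + (int)(field & 0x03u) * 10;
}
```

For instance, -97 dBm with an SNR of 7 dB encodes as 0x17 and decodes as -100 dBm and +10 dB, showing the truncation on RSSI and the rounding on SNR.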
24 bits total.
0 1 2
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Temperature | Pressure | Humidity |
| (9 bits) | (8 bits) | (7 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Temperature (9 bits): Range: -40.00°C to +80.00°C. Resolution: 0.25°C (480 steps, 9 bits = 512 values).
Encode: q = round((temp_c - (-40.0)) / 0.25)
Decode: temp_c = -40.0 + q * 0.25
Pressure (8 bits): Range: 850 to 1105 hPa. Resolution: 1 hPa (255 steps).
Encode: q = pressure_hpa - 850
Decode: pressure_hpa = q + 850
Humidity (7 bits): Range: 0 to 100%. Resolution: 1% (7 bits = 128 values, 0-100 used).
Encode/Decode: direct (no quantisation needed).
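The three components can be combined into the 24-bit field with two shifts. This is a sketch, not the reference encoder: `env_t`, `env_pack`, and `env_unpack` are hypothetical names, temperature is assumed to be supplied in hundredths of a degree Celsius, and the packed value is returned right-aligned in a 32-bit integer ready to be written MSB-first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reading type; not part of libiotdata. */
typedef struct {
    int32_t  temp_centi;   /* hundredths of a degree C, -4000..8000 */
    uint16_t pressure_hpa; /* 850..1105 */
    uint8_t  humidity_pct; /* 0..100 */
} env_t;

/* Pack temperature (9 bits), pressure (8 bits), humidity (7 bits). */
static uint32_t env_pack(const env_t *e)
{
    assert(e->temp_centi >= -4000 && e->temp_centi <= 8000);
    assert(e->pressure_hpa >= 850 && e->pressure_hpa <= 1105);
    assert(e->humidity_pct <= 100);
    /* Offset by 40 C, then round to the nearest 0.25 C step. */
    uint32_t tq = (uint32_t)(e->temp_centi + 4000 + 12) / 25u;
    uint32_t pq = (uint32_t)(e->pressure_hpa - 850);
    uint32_t hq = e->humidity_pct;
    return (tq << 15) | (pq << 7) | hq;
}

static env_t env_unpack(uint32_t v)
{
    env_t e;
    e.temp_centi   = (int32_t)((v >> 15) & 0x1FFu) * 25 - 4000;
    e.pressure_hpa = (uint16_t)(((v >> 7) & 0xFFu) + 850u);
    e.humidity_pct = (uint8_t)(v & 0x7Fu);
    return e;
}
```

A reading of 21.37°C, 1013 hPa, and 55% packs to 0x7AD1B7 and unpacks to 21.25°C, 1013 hPa, and 55%.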
14 bits total.
0 1
0 1 2 3 4 5 6 7 8 9 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Irradiance | UV Idx |
| (10 bits) | (4 bits)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Irradiance (10 bits): Range: 0 to 1023 W/m². Resolution: 1 W/m². Direct encoding.
UV Index (4 bits): Range: 0 to 15. Direct encoding.
10 bits total.
Range: 0 to 1023 cm. Resolution: 1 cm. Direct encoding.
This is a generic depth field. The variant label determines its semantic meaning (snow depth, ice thickness, water level, etc.). The wire encoding is identical regardless of label.
8 bits total.
0 1 2 3 4 5 6 7
+---+---+---+---+---+---+---+---+
| Flags (8 bits) |
+---+---+---+---+---+---+---+---+
General-purpose bitmask. Bit assignments are deployment-specific and are not defined by this protocol. Example uses include: low battery warning, sensor fault indicators, tamper detection, or configuration acknowledgement flags.
48 bits total.
0 1 2
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Latitude (24 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Longitude (24 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Latitude (24 bits): Range: -90.0° to +90.0°.
Encode: q = round((lat - (-90.0)) / 180.0 * 16777215.0)
Decode: lat = q / 16777215.0 * 180.0 + (-90.0)
Resolution: 180.0 / 16777215 ≈ 0.00001073° ≈ 1.19 metres at the equator.
Longitude (24 bits): Range: -180.0° to +180.0°.
Encode: q = round((lon - (-180.0)) / 360.0 * 16777215.0)
Decode: lon = q / 16777215.0 * 360.0 + (-180.0)
Resolution: 360.0 / 16777215 ≈ 0.00002146° ≈ 2.39 metres at the equator, reducing with cos(latitude).
This field is source-agnostic. The position may originate from a GNSS receiver, WiFi geolocation, cell tower triangulation, or static configuration. The protocol does not indicate the source or its accuracy; see Section 11.2 and 11.3 for discussion.
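The latitude and longitude formulas can be written directly in floating point, as sketched below. The function names and the `POS_MAX_Q` constant are hypothetical. Rounding is done by adding 0.5 and truncating, which is equivalent to round() here because the shifted operand is never negative, and it avoids a dependency on the maths library.

```c
#include <assert.h>
#include <stdint.h>

#define POS_MAX_Q 16777215.0 /* 2^24 - 1, the largest 24-bit step */

static uint32_t lat_encode(double lat)
{
    assert(lat >= -90.0 && lat <= 90.0);
    return (uint32_t)((lat + 90.0) / 180.0 * POS_MAX_Q + 0.5);
}

static double lat_decode(uint32_t q)
{
    return (double)q / POS_MAX_Q * 180.0 - 90.0;
}

static uint32_t lon_encode(double lon)
{
    assert(lon >= -180.0 && lon <= 180.0);
    return (uint32_t)((lon + 180.0) / 360.0 * POS_MAX_Q + 0.5);
}

static double lon_decode(uint32_t q)
{
    return (double)q / POS_MAX_Q * 360.0 - 180.0;
}
```

The round-trip error is at most half a step: about 0.0000054° in latitude and 0.0000107° in longitude.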
24 bits total.
0 1 2
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ticks (24 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Ticks (24 bits): Time offset from January 1 00:00:00 UTC of the current year, measured in 5-second ticks.
Encode: ticks = seconds_from_year_start / 5
Decode: seconds = ticks * 5
Maximum value: 16,777,215 ticks = 83,886,075 seconds ≈ 970.9 days.
Resolution: 5 seconds.
The year is NOT transmitted. The receiver resolves the year using its own clock; see Section 11.1 for the year resolution algorithm.
This field is source-agnostic. The time may originate from a GNSS receiver, NTP synchronisation, or a local RTC. The protocol does not indicate the source or its drift characteristics; see Section 11.3.
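The tick arithmetic can be sketched as follows. The function names are hypothetical, and the encoder is assumed to already know the zero-based day of the year in UTC; year resolution on the receiver is covered by Section 11.1 and is not attempted here.

```c
#include <assert.h>
#include <stdint.h>

#define TICK_SECONDS 5u
#define TICKS_MAX    0xFFFFFFu /* largest 24-bit value */

/* day_of_year is zero-based (0 = 1 January); all inputs are UTC. */
static uint32_t datetime_encode(unsigned day_of_year, unsigned hour,
                                unsigned minute, unsigned second)
{
    assert(day_of_year < 366 && hour < 24 && minute < 60 && second < 60);
    uint32_t secs = day_of_year * 86400u + hour * 3600u
                  + minute * 60u + second;
    return secs / TICK_SECONDS; /* truncates to the 5-second tick */
}

/* Seconds since the start of the year represented by a tick count. */
static uint32_t datetime_decode_seconds(uint32_t ticks)
{
    assert(ticks <= TICKS_MAX);
    return ticks * TICK_SECONDS;
}
```

For 7 February at 12:34:56 UTC (day index 37), the offset is 3,242,096 seconds, which encodes as 648,419 ticks and decodes as 3,242,095 seconds.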
9 bits total.
Range: -40.00°C to +80.00°C. Resolution: 0.25°C (480 steps, 9 bits = 512 values).
Encode: q = round((temp_c - (-40.0)) / 0.25)
Decode: temp_c = -40.0 + q * 0.25
This is the same encoding as the temperature component of the Environment bundle (Section 8.3). Use this standalone type in variants that need temperature without pressure and humidity.
8 bits total.
Range: 850 to 1105 hPa. Resolution: 1 hPa (255 steps).
Encode: q = pressure_hpa - 850
Decode: pressure_hpa = q + 850
This is the same encoding as the pressure component of the Environment bundle (Section 8.3).
7 bits total.
Range: 0 to 100%. Resolution: 1% (7 bits = 128 values, 0-100 used).
Encode/Decode: direct (no quantisation needed).
This is the same encoding as the humidity component of the Environment bundle (Section 8.3).
22 bits total.
0 1 2
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Speed | Direction | Gust |
| (7 bits) | (8 bits) | (7 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
This is a convenience bundle that packs wind speed, direction, and gust speed into a single field. The component encodings are identical to the standalone Wind Speed (8.13), Wind Direction (8.14), and Wind Gust (8.15) types.
Speed (7 bits): Range: 0 to 63.5 m/s. Resolution: 0.5 m/s.
Encode: q = round(speed_ms / 0.5)
Decode: speed_ms = q * 0.5
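Direction (8 bits): Range: 0° to just under 360° (true bearing). Resolution: ~1.41° (360/256); the largest encodable value is 255 × 360/256 ≈ 358.6°, and inputs that round to 360° wrap to 0°.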
Encode: q = round(direction_deg / 360.0 * 256.0) & 0xFF
Decode: direction_deg = q / 256.0 * 360.0
Gust (7 bits): Range: 0 to 63.5 m/s. Resolution: 0.5 m/s.
Encode/Decode: same as Speed.
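The wind bundle can be assembled in integer arithmetic as sketched below. The function names are hypothetical, speeds are assumed to be supplied in tenths of a metre per second, and direction in whole degrees. The `& 0xFF` on the direction step implements the wrap from 360° back to 0° that the encode formula implies.

```c
#include <assert.h>
#include <stdint.h>

/* Speed or gust in tenths of m/s -> 7-bit step of 0.5 m/s. */
static uint32_t wind_speed_q(unsigned speed_dms)
{
    assert(speed_dms <= 635);
    return (speed_dms + 2u) / 5u;          /* round(speed_ms / 0.5) */
}

/* Direction in whole degrees (0-360) -> 8-bit step of 360/256 degrees. */
static uint32_t wind_dir_q(unsigned deg)
{
    assert(deg <= 360);
    return ((deg * 256u + 180u) / 360u) & 0xFFu; /* 360 wraps to 0 */
}

/* Pack speed (7 bits), direction (8 bits), gust (7 bits) into 22 bits. */
static uint32_t wind_pack(unsigned speed_dms, unsigned deg, unsigned gust_dms)
{
    return (wind_speed_q(speed_dms) << 15)
         | (wind_dir_q(deg) << 7)
         | wind_speed_q(gust_dms);
}
```

A reading of 5.2 m/s from 270° gusting to 8.7 m/s packs to 0x056011, which decodes as 5.0 m/s, 270°, and 8.5 m/s.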
7 bits total.
Range: 0 to 63.5 m/s. Resolution: 0.5 m/s.
Encode: q = round(speed_ms / 0.5)
Decode: speed_ms = q * 0.5
Same encoding as the speed component of the Wind bundle (8.12).
8 bits total.
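Range: 0° to just under 360° (true bearing). Resolution: ~1.41° (360/256); the largest encodable value is 255 × 360/256 ≈ 358.6°, and inputs that round to 360° wrap to 0°.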
Encode: q = round(direction_deg / 360.0 * 256.0) & 0xFF
Decode: direction_deg = q / 256.0 * 360.0
Same encoding as the direction component of the Wind bundle (8.12).
7 bits total.
Range: 0 to 63.5 m/s. Resolution: 0.5 m/s.
Encode: q = round(gust_ms / 0.5)
Decode: gust_ms = q * 0.5
Same encoding as the gust component of the Wind bundle (8.12).
12 bits total.
0 1
0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+
| Rate | Size |
| (8 bits) | (4) |
+-+-+-+-+-+-+-+-+-+-+-+-+
This is a convenience bundle that packs rain rate and size into a single field. The component encodings are identical to the standalone Rain Rate (8.17), and Rain Size (8.18) types.
Rate (8 bits):
Range: 0 to 255 mm/hr. Resolution: 1 mm/hr. Direct encoding.
Size (4 bits):
Range: 0 to 6.0 mm. Resolution: 0.25 mm.
Encode: q = round(rain_size / 0.25)
Decode: rain_size = q * 0.25
8 bits total.
Range: 0 to 255 mm/hr. Resolution: 1 mm/hr. Direct encoding.
4 bits total.
Range: 0 to 6.0 mm. Resolution: 0.25 mm.
Encode: q = round(rain_size / 0.25)
Decode: rain_size = q * 0.25
Variable length (minimum 21 bits).
This is a convenience bundle that packs air quality index, particulate matter, and gas readings into a single field. The component encodings are identical to the standalone Air Quality Index (8.20), Air Quality PM (8.21), and Air Quality Gas (8.22) types.
+-----------+-----------+-----------+
| AQ Index | AQ PM | AQ Gas |
| (9 bits) | (4+ bits) | (8+ bits) |
+-----------+-----------+-----------+
The three sub-fields are packed in order: index, PM, and gas. Each sub-field includes its own presence mask, so absent PM channels and gas slots consume no bits beyond the mask itself.
Minimum: 9 (index) + 4 (PM mask, no channels) + 8 (gas mask, no slots) = 21 bits. Typical SEN55 full reading: 9 + 36 + 24 = 69 bits.
9 bits total.
Range: 0 to 500 AQI (Air Quality Index). Resolution: 1 AQI. Direct encoding (9 bits = 512 values, 0-500 used).
4 to 36 bits total (variable).
0
0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+- - - - -+
|P|P|P|P| ch0 | ch1 ... (8 bits per present channel)
|1|25|4|10| |
+-+-+-+-+-+-+-+-+- - - - -+
4-bit presence mask followed by 8 bits for each present PM channel. Resolution: 5 µg/m³.
Presence mask (4 bits):
- Bit 0: PM1 present
- Bit 1: PM2.5 present
- Bit 2: PM4 present
- Bit 3: PM10 present
Each channel (8 bits):
Range: 0 to 1275 µg/m³. Resolution: 5 µg/m³ (255 steps).
Encode: q = value_ugm3 / 5
Decode: value_ugm3 = q * 5
The 5 µg/m³ resolution matches the ±5 µg/m³ precision of typical laser-scattering PM sensors (e.g. Sensirion SEN55, Plantower PMS5003).
Typical sensors output all four channels simultaneously; a presence mask of 0xF (all present) with 4 × 8 = 32 data bits is the common case, giving 36 bits total.
8 to 84 bits total (variable).
0
0 1 2 3 4 5 6 7 8 9 ...
+-+-+-+-+-+-+-+-+- - - - - - -+
|V|N|C|C|H|O|R|R| slot0 | slot1 ...
|O|O|O|O|C|3|6|7| |
|C|X|2| |H| | | | |
+-+-+-+-+-+-+-+-+- - - - - - -+
8-bit presence mask followed by data for each present gas slot. Each slot has a fixed bit width and resolution determined by its position in the mask.
Presence mask (8 bits):
- Bit 0: VOC Index
- Bit 1: NOx Index
- Bit 2: CO₂
- Bit 3: CO
- Bit 4: HCHO (formaldehyde)
- Bit 5: O₃ (ozone)
- Bit 6: Reserved
- Bit 7: Reserved
Slot encodings:
| Slot | Gas | Bits | Resolution | Range | Unit |
|---|---|---|---|---|---|
| 0 | VOC | 8 | 2 index pts | 0-510 | idx |
| 1 | NOx | 8 | 2 index pts | 0-510 | idx |
| 2 | CO₂ | 10 | 50 ppm | 0-51,150 | ppm |
| 3 | CO | 10 | 1 ppm | 0-1,023 | ppm |
| 4 | HCHO | 10 | 5 ppb | 0-5,115 | ppb |
| 5 | O₃ | 10 | 1 ppb | 0-1,023 | ppb |
| 6 | Rsvd | 10 | — | — | — |
| 7 | Rsvd | 10 | — | — | — |
Encode: q = value / resolution
Decode: value = q * resolution
VOC and NOx index slots carry Sensirion SGP4x-style algorithm indices (1-500 typical). The 2-point resolution is well within the ±15/±50 index point device-to-device variation.
CO₂ at 50 ppm resolution covers the full SCD4x range (0-40,000 ppm) and exceeds its ±40 ppm + 5% accuracy.
HCHO at 5 ppb resolution matches the ~10 ppb accuracy of typical electrochemical formaldehyde sensors (e.g. Sensirion SEN69C, Dart WZ-S).
A typical Sensirion SEN55 station (VOC + NOx) sends 8 + 8 + 8 = 24 bits. A SEN66 station (VOC + NOx + CO₂) sends 8 + 8 + 8 + 10 = 34 bits.
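Because the air quality fields are mask-driven, a decoder has to compute their length from the presence masks before it can locate the next field. The sketch below reproduces the size arithmetic given above. The function names are hypothetical, and it assumes that bit i of each mask, held as an integer, corresponds to channel or slot i in the lists above.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bits of the Air Quality PM field for a 4-bit presence mask. */
static unsigned aq_pm_bits(uint8_t mask)
{
    assert(mask <= 0x0F);
    unsigned bits = 4;                    /* the mask itself */
    for (unsigned i = 0; i < 4; i++)
        if (mask & (1u << i))
            bits += 8;                    /* 8 bits per present channel */
    return bits;
}

/* Size in bits of the Air Quality Gas field for an 8-bit presence mask. */
static unsigned aq_gas_bits(uint8_t mask)
{
    /* Slot widths from the slot encoding table: VOC, NOx, then 10-bit slots. */
    static const uint8_t slot_bits[8] = { 8, 8, 10, 10, 10, 10, 10, 10 };
    unsigned bits = 8;                    /* the mask itself */
    for (unsigned i = 0; i < 8; i++)
        if (mask & (1u << i))
            bits += slot_bits[i];
    return bits;
}

/* Size in bits of the Air Quality bundle: 9-bit index, PM, then gas. */
static unsigned aq_bundle_bits(uint8_t pm_mask, uint8_t gas_mask)
{
    return 9u + aq_pm_bits(pm_mask) + aq_gas_bits(gas_mask);
}
```

With all four PM channels and the VOC and NOx slots present, the bundle comes to 69 bits, matching the SEN55 figure; with both masks empty it is the 21-bit minimum.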
28 bits total.
0 1 2
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| CPM | Dose |
| (14 bits) | (14 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
This is a convenience bundle that packs radiation CPM and dose into a single field. The component encodings are identical to the standalone Radiation CPM (8.24) and Radiation Dose (8.25) types.
CPM (14 bits):
Range: 0 to 16383 counts per minute (CPM). Resolution: 1 CPM. Direct encoding.
This field carries the raw count rate from a Geiger-Müller tube or similar radiation detector.
Dose (14 bits):
Range: 0 to 163.83 µSv/h. Resolution: 0.01 µSv/h (16,383 steps).
Encode: q = round(dose_usvh / 0.01)
Decode: dose_usvh = q * 0.01
This field carries the computed dose rate. The relationship between CPM and dose rate is detector-specific and is not defined by this protocol.
14 bits total.
Range: 0 to 16383 counts per minute (CPM). Resolution: 1 CPM. Direct encoding.
This field carries the raw count rate from a Geiger-Müller tube or similar radiation detector.
14 bits total.
Range: 0 to 163.83 µSv/h. Resolution: 0.01 µSv/h (16,383 steps).
Encode: q = round(dose_usvh / 0.01)
Decode: dose_usvh = q * 0.01
This field carries the computed dose rate. The relationship between CPM and dose rate is detector-specific and is not defined by this protocol.
4 bits total.
Range: 0 to 8 okta. Resolution: 1 okta. Direct encoding (4 bits = 16 values, 0-8 used).
Clouds measures cloud cover in okta (eighths of sky covered), following the standard meteorological convention where 0 = clear sky and 8 = fully overcast.
Variable length. Minimum 2 bytes (length + control), maximum 256 bytes (length + control + 254 bytes of pixel data).
0 1
0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- ... -+
| Length (8) | Control (8) | Pixel Data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- ... -+
This is the only variable-length data field in the protocol. The length byte at the start tells the decoder how many additional bytes follow, allowing the field to be skipped without understanding its contents.
Length (8 bits): Number of bytes that follow the length byte, including the control byte and all pixel data. Range: 1-255.
A length of 1 indicates a control byte only with no pixel data. This is not useful in practice but is legal.
The total field size in bytes is 1 + Length. The total field size in bits is
(1 + Length) × 8.
Control (8 bits): Describes the pixel format, image dimensions, compression method, and flags. The decoder reads this byte to determine how to interpret all subsequent bytes.
0 1 2 3 4 5 6 7
+---+---+---+---+---+---+---+---+
| Format| Size | Comp | Flags |
| (2) | (2) | (2) | (2) |
+---+---+---+---+---+---+---+---+
Format (bits 7-6): Pixel depth.
| Value | Name | Bits/pixel | Description |
|---|---|---|---|
| 0 | BILEVEL | 1 | Black and white (1-bit per pixel) |
| 1 | GREY4 | 2 | 4-level greyscale |
| 2 | GREY16 | 4 | 16-level greyscale |
| 3 | — | — | Reserved |
For BILEVEL, each pixel is a single bit: 0 = black, 1 = white. Pixels are packed MSB-first within each byte, left-to-right across each row, rows top-to-bottom.
For GREY4, each pixel is 2 bits: 0 = black, 1 = dark grey, 2 = light grey, 3 = white. Pixels are packed MSB-first, four pixels per byte.
For GREY16, each pixel is 4 bits (one nibble): 0 = black, 15 = white. Pixels are packed high-nibble-first, two pixels per byte.
Size (bits 5-4): Image dimensions (width × height).
| Value | Dimensions | Pixels | Raw bytes (1bpp) | Raw bytes (4bpp) |
|---|---|---|---|---|
| 0 | 24 × 18 | 432 | 54 | 216 |
| 1 | 32 × 24 | 768 | 96 | 384 |
| 2 | 48 × 36 | 1,728 | 216 | 864 |
| 3 | 64 × 48 | 3,072 | 384 | 1,536 |
All sizes use a 4:3 aspect ratio. The size tier determines both width and height; non-standard dimensions are not supported.
Compression (bits 3-2): Compression method applied to pixel data.
| Value | Name | Description |
|---|---|---|
| 0 | RAW | Uncompressed pixel data |
| 1 | RLE | Run-length encoding (Section 8.27.1) |
| 2 | HEATSHRINK | Heatshrink LZSS (Section 8.27.2) |
| 3 | — | Reserved |
Flags (bits 1-0):
| Bit | Name | Description |
|---|---|---|
| 1 | FRAGMENT | This image is a fragment; more fragments follow |
| 0 | INVERT | Display with inverted polarity (0=white, 1=black) |
The FRAGMENT flag enables multi-packet image transmission for cases where the pixel data exceeds the available payload. Fragments share the same control byte; the receiver reassembles using the packet sequence number and station_id. For v1, single-frame images (FRAGMENT = 0) are the expected case.
The INVERT flag indicates that the pixel sense is reversed. This is useful for difference-frame images where motion pixels are naturally encoded as 1 (white on black background). The flag allows the display layer to render with the correct visual polarity without the encoder needing to invert the pixel data.
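The control byte layout described above (Format in bits 7-6, Size in bits 5-4, Compression in bits 3-2, FRAGMENT in bit 1, INVERT in bit 0) can be unpacked with a few shifts and masks. The struct and function names below are illustrative assumptions, not part of the reference implementation.

```c
#include <stdint.h>

typedef struct {
    uint8_t format;       /* 0 = BILEVEL, 1 = GREY4, 2 = GREY16, 3 = reserved */
    uint8_t size;         /* 0 = 24x18, 1 = 32x24, 2 = 48x36, 3 = 64x48 */
    uint8_t compression;  /* 0 = RAW, 1 = RLE, 2 = HEATSHRINK, 3 = reserved */
    uint8_t fragment;     /* flag bit 1: more fragments follow */
    uint8_t invert;       /* flag bit 0: inverted polarity */
} image_control_t;

/* Split the control byte into its five sub-fields. */
image_control_t image_control_unpack(uint8_t ctrl) {
    image_control_t c;
    c.format      = (ctrl >> 6) & 0x03;
    c.size        = (ctrl >> 4) & 0x03;
    c.compression = (ctrl >> 2) & 0x03;
    c.fragment    = (ctrl >> 1) & 0x01;
    c.invert      = ctrl & 0x01;
    return c;
}

/* Rebuild the control byte from its sub-fields. */
uint8_t image_control_pack(const image_control_t *c) {
    return (uint8_t)(((c->format & 0x03) << 6) |
                     ((c->size & 0x03) << 4) |
                     ((c->compression & 0x03) << 2) |
                     ((c->fragment & 0x01) << 1) |
                     (c->invert & 0x01));
}
```

Under this layout, a BILEVEL 32 × 24 image with RLE compression and both flags clear has the control byte 0x14.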
The Image field defines a container for a rectangular pixel grid. It does not specify what the pixels represent. The sensor implementation decides what is most informative — a full-frame downscale, a cropped region-of-interest around detected motion, a background-subtracted difference mask, a depth map, or any other rectangular image. The field carries the result; the semantics are a property of the sensor and variant, not the encoding.
Unlike all other data fields in the protocol, Image has a variable bit width. The decoder handles this as follows:
- The presence bit for the Image slot is set.
- The decoder reads the first byte (Length).
- The decoder consumes Length additional bytes.
- Decoding continues at the next field's bit offset.
Implementations that do not support Image can skip the field by reading the
length byte and advancing by Length bytes, without interpreting the control
byte or pixel data. This preserves forward compatibility: a decoder compiled
without Image support can still decode all other fields in the packet.
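The skip procedure can be sketched as below. Because fields are bit-packed, the Image field may start at any bit offset, so the length byte is read bit by bit. The helper names are hypothetical, MSB-first bit order is assumed, and treating a zero length as malformed is an assumption based on the stated 1-255 range.

```c
#include <stddef.h>
#include <stdint.h>

/* Read n bits (n <= 8), MSB-first, starting at bit offset pos. */
static uint8_t read_bits(const uint8_t *buf, size_t pos, unsigned n) {
    uint8_t v = 0;
    for (unsigned i = 0; i < n; i++, pos++) {
        uint8_t bit = (buf[pos / 8] >> (7 - (pos % 8))) & 1u;
        v = (uint8_t)((v << 1) | bit);
    }
    return v;
}

/* Skip an Image field that starts at bit offset pos.
 * Returns the bit offset of the next field, or 0 if the field is
 * malformed or runs past the end of the buffer. */
size_t image_skip(const uint8_t *buf, size_t buf_bits, size_t pos) {
    if (pos + 8 > buf_bits) return 0;            /* no room for the length byte */
    size_t length = read_bits(buf, pos, 8);
    if (length == 0) return 0;                   /* legal range is 1-255 */
    size_t end = pos + 8 + length * 8;           /* length byte + Length bytes */
    return (end <= buf_bits) ? end : 0;
}
```

The caller does not inspect the control byte or pixel data at all; it only needs the returned offset to continue decoding.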
When Compression = RLE, the pixel data is encoded as a sequence of run-length pairs. Each pair is a single byte:
0 1 2 3 4 5 6 7
+---+---+---+---+---+---+---+---+
|Val| Run Length (7) |
+---+---+---+---+---+---+---+---+
For BILEVEL format, Val (bit 7) is the pixel value (0 or 1) and Run Length (bits 6-0) is the number of consecutive pixels with that value, minus 1 (range 1-128 pixels per run).
For GREY4 and GREY16 formats, the encoding switches to a byte-pair scheme: the first byte is a raw pixel value (2 or 4 bits, zero-padded to 8 bits) and the second byte is the run count minus 1. This produces 2 bytes per run but handles the wider pixel values cleanly.
Runs that exceed 128 pixels (BILEVEL) or 256 pixels (greyscale) are split into consecutive run entries with the same value.
The decoder reconstructs the pixel grid left-to-right, top-to-bottom, consuming runs until width × height pixels have been produced.
RLE is particularly effective for BILEVEL images with large uniform regions, such as background-subtracted motion frames, where compression ratios of 2:1 to 6:1 are typical.
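A minimal sketch of the BILEVEL run decoder follows. For clarity it expands to one byte per pixel rather than the packed raw form, and it rejects a run that would overflow the grid; both choices, and the function name, are assumptions of this example rather than behaviour specified above.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode BILEVEL RLE into one byte per pixel (0 or 1).
 * Returns the number of pixels written, or 0 if the input is truncated
 * or a run would exceed the width x height grid. */
size_t rle_bilevel_decode(const uint8_t *in, size_t in_len,
                          uint8_t *pixels, size_t n_pixels) {
    size_t produced = 0;
    for (size_t i = 0; i < in_len && produced < n_pixels; i++) {
        uint8_t val = in[i] >> 7;                  /* bit 7: pixel value */
        size_t run = (size_t)(in[i] & 0x7F) + 1;   /* bits 6-0: run length minus 1 */
        if (run > n_pixels - produced) return 0;   /* run overflows the grid */
        for (size_t k = 0; k < run; k++) pixels[produced++] = val;
    }
    return (produced == n_pixels) ? produced : 0;
}
```

For a 32 × 24 image the caller would pass n_pixels = 768. The bytes 0x02, 0x81, 0x00 decode to three black pixels, two white pixels, and one black pixel.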
When Compression = HEATSHRINK, the pixel data (in its raw packed form) has been compressed using the heatshrink LZSS algorithm.
The heatshrink parameters are fixed by this protocol and MUST NOT be varied per-packet:
- Window size: 8 (256-byte window)
- Lookahead size: 4 (16-byte lookahead)
These parameters are chosen for minimal RAM usage at the decoder (approximately 256 bytes for decompression state) while still providing useful compression. The decoder does not need to be told the parameters; they are implicit in the field type.
Heatshrink is most useful for GREY4 and GREY16 formats where pixel data has more entropy than BILEVEL and simple RLE is less effective.
The length byte (8 bits) limits the field value to 255 bytes after the length byte itself: 1 control byte plus up to 254 bytes of pixel data.
The following table shows which format/size combinations fit within 254 bytes without compression:
| Size | BILEVEL (1bpp) | GREY4 (2bpp) | GREY16 (4bpp) |
|---|---|---|---|
| 24 × 18 | 54 B ✓ | 108 B ✓ | 216 B ✓ |
| 32 × 24 | 96 B ✓ | 192 B ✓ | 384 B ✗ |
| 48 × 36 | 216 B ✓ | 432 B ✗ | 864 B ✗ |
| 64 × 48 | 384 B ✗ | 768 B ✗ | 1,536 B ✗ |
Combinations marked ✗ require compression to fit. In practice, BILEVEL at 32 × 24 (96 bytes raw, typically 40-60 bytes with RLE) is the recommended default for single-frame LoRa transmission. It provides sufficient resolution to distinguish human silhouettes, vehicles, and animals while leaving substantial room for other iotdata fields in the same packet.
The LoRa payload limit (222 bytes at SF7/125kHz, 115 bytes at SF9, 51 bytes at SF10) further constrains the practical combinations. For higher spreading factors, 24 × 18 BILEVEL with RLE is the safest choice.
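The raw sizes in the table follow directly from width × height × bits-per-pixel / 8. A small helper, with hypothetical names, that reproduces the table and the 254-byte check might look like this:

```c
#include <stdint.h>

/* Raw (uncompressed) pixel bytes for a format/size code pair,
 * or 0 for a reserved format or an invalid size code. */
uint16_t image_raw_bytes(uint8_t format, uint8_t size) {
    static const uint8_t bpp[3] = {1, 2, 4};       /* BILEVEL, GREY4, GREY16 */
    static const uint8_t w[4] = {24, 32, 48, 64};
    static const uint8_t h[4] = {18, 24, 36, 48};
    if (format > 2 || size > 3) return 0;
    return (uint16_t)((uint32_t)w[size] * h[size] * bpp[format] / 8);
}

/* Non-zero when the raw pixel data fits within the 254-byte field limit. */
int image_fits_raw(uint8_t format, uint8_t size) {
    uint16_t n = image_raw_bytes(format, size);
    return n != 0 && n <= 254;
}
```

An encoder could use such a check to decide whether compression is mandatory for the chosen format and size before attempting to build the field.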
- Default choice: BILEVEL format, 32 × 24 size, RLE compression. This produces 40-60 byte thumbnails for typical motion frames, fits in a single LoRa packet at lower spreading factors (SF7-SF9), and requires trivial encode/decode logic. At SF10 and above, the 51-byte payload limit makes 24 × 18 the safer choice.
- ROI cropping: If the sensor detects motion in a small region of the camera frame, cropping to that region before downscaling preserves more detail than downscaling the entire frame. The Image field does not carry crop coordinates; these are a property of the sensor's processing pipeline, not the transport encoding.
- Difference frames: For background-subtracted motion images, set the INVERT flag if the natural encoding is white-on-black (motion pixels = 1). The resulting BILEVEL image compresses exceptionally well with RLE due to large background regions.
- Greyscale use: GREY16 at 24 × 18 with heatshrink (216 bytes raw, typically 100-150 bytes compressed) provides a richer visual at the cost of decode complexity. Use when the MCU has sufficient resources and the additional visual detail is valuable.
- Multi-frame spanning: The FRAGMENT flag enables splitting a large thumbnail across multiple packets. The gateway reassembles fragments using {station_id, sequence} ordering. This adds complexity and fragility (any lost fragment invalidates the image) and is not recommended for v1 deployments.
In the canonical JSON output, the Image field is represented as a structured
object under its variant label (e.g. "image", "thumbnail", "motion_image",
depending on the variant map definition):
{
"image": {
"format": "bilevel",
"size": "32x24",
"compression": "rle",
"fragment": false,
"invert": false,
"pixels": "base64-encoded-pixel-data"
}
}
The gateway performs decompression before base64-encoding the pixels field, so
downstream consumers receive uniform raw pixel data regardless of the
compression method used on the wire.
- format: One of "bilevel", "grey4", "grey16".
- size: One of "24x18", "32x24", "48x36", "64x48".
- compression: One of "raw", "rle", "heatshrink".
- fragment: Boolean.
- invert: Boolean.
- pixels: Base64-encoded decompressed pixel data.
The compression field records the wire method for diagnostics but is not
needed for rendering.
The TLV (Type-Length-Value) section provides an extensible mechanism for diagnostic data, firmware metadata, user-defined payloads, and future sensor metadata. It is present only when the TLV bit (bit 6 of Presence Byte 0) is set. By preference, it should not be used for sensor data per se: such data should have a designated field type.
The TLV section begins immediately after the last data field, at whatever bit offset that field ended. There is no alignment padding.
Each TLV entry begins with a 16-bit header:
0 1
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
|Fmt| Type (6) |Mor| Length (8) |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
Format (1 bit): 0 = raw bytes. 1 = packed 6-bit string.
Type (6 bits): Application-defined type identifier, range 0-63. See Section 9.4 for types.
More (1 bit): 1 = another TLV entry follows this one. 0 = this is the last TLV entry.
Length (8 bits): For raw format: number of data bytes (0-255). For string format: number of characters (0-255).
When Format = 0, the data section is Length bytes (Length × 8 bits), packed
MSB-first with no alignment.
Total TLV entry size: 16 + (Length × 8) bits.
When Format = 1, each character is encoded as 6 bits using the character table in Appendix A. This saves 25% compared to 8-bit ASCII for the supported character set (alphanumeric plus space).
Total TLV entry size: 16 + (Length × 6) bits.
Characters outside the 6-bit table MUST NOT be transmitted. An encoder MUST reject strings containing unencodable characters.
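The 16-bit header and the two size formulas can be sketched as follows. The struct and function names are illustrative assumptions; the header is taken as a 16-bit value whose most significant bit is the Format bit, matching the left-to-right order of the diagram.

```c
#include <stdint.h>

typedef struct {
    uint8_t is_string;  /* Format: 0 = raw bytes, 1 = packed 6-bit string */
    uint8_t type;       /* application-defined type, 0-63 */
    uint8_t more;       /* 1 = another TLV entry follows */
    uint8_t length;     /* bytes (raw) or characters (string) */
} tlv_header_t;

/* Split a 16-bit TLV header: Fmt(1) | Type(6) | More(1) | Length(8). */
tlv_header_t tlv_header_unpack(uint16_t hdr) {
    tlv_header_t h;
    h.is_string = (hdr >> 15) & 0x01;
    h.type      = (hdr >> 9) & 0x3F;
    h.more      = (hdr >> 8) & 0x01;
    h.length    = hdr & 0xFF;
    return h;
}

/* Build the 16-bit header from its fields. */
uint16_t tlv_header_pack(const tlv_header_t *h) {
    return (uint16_t)(((h->is_string & 0x01) << 15) |
                      ((h->type & 0x3F) << 9) |
                      ((h->more & 0x01) << 8) |
                      h->length);
}

/* Total entry size in bits: 16 header bits plus 8 bits per raw byte
 * or 6 bits per packed character. */
uint32_t tlv_entry_bits(const tlv_header_t *h) {
    return 16u + (uint32_t)h->length * (h->is_string ? 6u : 8u);
}
```

A decoder that does not recognise a type can still step over the entry by advancing tlv_entry_bits() from the start of its header, and can stop once it has processed an entry whose More bit is 0.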
| Type | Name | Format | Description |
|---|---|---|---|
| 0x01-0x0F | (reserved) | — | Reserved for globally designated TLVs |
| 0x01 | VERSION | string | Firmware and hardware version identification |
| 0x02 | STATUS | raw | Uptime, lifetime uptime, restart count and reason |
| 0x03 | HEALTH | raw | CPU temperature, supply voltage, heap, active time |
| 0x04 | CONFIG | string | Configuration key-value pairs |
| 0x05 | DIAGNOSTIC | string | Free-form diagnostic message |
| 0x06 | USERDATA | string | User interaction event |
| 0x08-0x0F | (reserved) | — | Reserved for future globally designated TLVs |
| 0x10-0x1F | (reserved) | — | Reserved for future quality/metadata TLVs |
| 0x20-0x3F | (available) | — | Available for proprietary TLVs |
Types 0x01-0x0F are reserved for globally designated types, as specified in, and extended by, this document. They have encoding functions provided in the reference implementation. Types 0x10-0x1F are reserved for sensor metadata (see Section 11.3 and Section 15) and may have future reference implementation support. Types 0x20 onwards are available for application use.
The following TLV types are globally designated and have fixed semantics across all variants and deployments. Implementations SHOULD use these types for their intended purpose to aid interoperability between sensors, gateways, and downstream consumers.
All global TLV types are optional. A sensor includes them when the information is available and the payload budget permits. The recommended transmission strategy varies by type:
- VERSION: Once at boot (first packet after restart).
- STATUS: Every Nth packet (e.g. every 10th), or periodically.
- HEALTH: Less frequently (e.g. every 50th), or when significantly changed.
- CONFIG: Once at boot, or after configuration changes.
- DIAGNOSTIC: When a notable condition occurs.
- USERDATA: When a user interaction event occurs.
Variable length, string format.
Identifies the firmware and hardware versions running on the device. This is essential for fleet management: knowing which devices are running which firmware version after an OTA campaign, or identifying hardware revisions with known issues.
The content uses the same space-delimited key-value convention as Config (Section 9.5.4), encoded with the 6-bit packed character set (Appendix A):
KEY1 VALUE1 KEY2 VALUE2 ...
Recommended keys:
| Key | Description |
|---|---|
| FW | Firmware version (build number or encoded version) |
| HW | Hardware revision |
| BL | Bootloader version |
| ID | Device model or type identifier |
| SN | Serial number or unique identifier |
Examples:
- FW 142 HW 3: firmware build 142, hardware revision 3
- FW 20401 HW 2 BL 5: firmware 2.4.1 (encoded as 20401), hardware rev 2, bootloader 5
- ID SNOWV2 FW 38 HW 1: device model SNOWV2, firmware 38
- FW 12 HW 1 SN A04F: with serial number
The key namespace is the same as Config: application-defined, short uppercase
identifiers. The keys listed above are recommendations, not requirements. A
minimal implementation may send only FW and HW.
Since version information is static within a boot cycle, this TLV is typically sent only in the first packet after a restart. The gateway or upstream system can cache it per station_id.
Since the 6-bit character set does not include dots or hyphens, semantic version
strings such as 2.4.1 cannot be encoded directly. Recommended alternatives:
- Concatenated digits: 20401 for 2.4.1 (convention: major×10000 + minor×100 + patch, so minor and patch are each limited to two digits).
- Plain build number: 142 (monotonically increasing).
- Separate keys: FWMAJ 2 FWMIN 4 FWPAT 1 (verbose but explicit).
The build number approach is simplest and sufficient for most deployments.
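For deployments that prefer the concatenated-digits form, the 2.4.1 to 20401 example corresponds to major×10000 + minor×100 + patch. A hypothetical helper, assuming two decimal digits each for minor and patch, could be:

```c
#include <stdint.h>

/* Encode a semantic version as concatenated digits, e.g. 2.4.1 -> 20401.
 * Returns 0 if minor or patch does not fit in two decimal digits
 * (the two-digit limit is an assumption drawn from the example). */
uint32_t version_concat(uint32_t major, uint32_t minor, uint32_t patch) {
    if (minor > 99 || patch > 99) return 0;
    return major * 10000u + minor * 100u + patch;
}
```

The resulting number contains only digits, so it can be sent as the FW value without leaving the 6-bit character set.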
JSON representation:
{
"type": 1,
"format": "version",
"data": {
"FW": "142",
"HW": "3"
}
}
The gateway parses the space-delimited tokens into key-value pairs, identical to the Config JSON representation.
9 bytes, raw format.
Reports device boot lifecycle: how long since last restart, how long the device has been alive in total across all boots, how many times it has restarted, and why the most recent restart occurred.
 0                   1                   2
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Session Uptime (24 bits)            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Lifetime Uptime (24 bits)           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      Restarts (16 bits)       |  Reason (8)   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Session Uptime (3 bytes, uint24, big-endian): Time since the most recent boot, measured in 5-second ticks. This matches the resolution and encoding of the Datetime field (Section 8.8).
Encode: ticks = uptime_seconds / 5
Decode: seconds = ticks * 5
Maximum: 16,777,215 ticks = 83,886,075 seconds ≈ 970.9 days.
Lifetime Uptime (3 bytes, uint24, big-endian): Total accumulated uptime across all boots since first commissioning, measured in 5-second ticks. Same encoding as session uptime.
This value requires non-volatile storage (NVS, EEPROM, or flash). The device persists the accumulated total periodically (e.g. every hour or at shutdown) and adds the current session uptime when encoding the TLV.
Devices that do not track lifetime uptime MUST transmit 0x000000. The receiver interprets this as "not tracked" rather than "zero uptime".
Restarts (2 bytes, uint16, big-endian): Total number of device starts since first commissioning, including the current boot. Wraps at 65535. A value of 1 indicates the device has never restarted since first power-on.
Reason (1 byte, uint8): Reason for the most recent restart. Bit 7 determines the interpretation:
- Bit 7 clear (0x00-0x7F): Globally defined reason codes, specified by this protocol. All implementations MUST use these values for the corresponding conditions.
- Bit 7 set (0x80-0xFF): Vendor-specific or device-specific reason codes. The interpretation depends on the device type and firmware. Receivers that do not recognise a vendor-specific code SHOULD display it as a numeric value.
Globally defined reason codes:
| Value | Name | Description |
|---|---|---|
| 0x00 | UNKNOWN | Reason not available or not determined |
| 0x01 | POWER_ON | Cold boot (initial power application) |
| 0x02 | SOFTWARE | Intentional software-initiated reset |
| 0x03 | WATCHDOG | Watchdog timer expiry |
| 0x04 | BROWNOUT | Supply voltage dropped below threshold |
| 0x05 | PANIC | Unrecoverable software fault or exception |
| 0x06 | DEEPSLEEP | Wake from deep sleep (normal operation) |
| 0x07 | EXTERNAL | External reset pin or button |
| 0x08 | OTA | Reset following over-the-air firmware update |
| 0x09-0x7F | (reserved) | Reserved for future globally defined reasons |
Most microcontrollers expose the reset reason register at boot. For example,
ESP32 provides esp_reset_reason() and STM32 provides __HAL_RCC_GET_FLAG().
The encoder maps the platform-specific value to the nearest globally defined
code where possible, or to a vendor-specific code (0x80+) for platform-specific
conditions that have no global equivalent.
The DEEPSLEEP reason (0x06) is expected in normal operation for battery-powered sensors that sleep between transmission cycles. A high restart count with DEEPSLEEP reason is healthy; a high restart count with WATCHDOG or PANIC reason indicates a fault.
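Putting the Status layout together, a sketch of an encoder for the 9-byte payload is shown below. The function name is hypothetical and saturating the 24-bit tick counters at their maximum is an assumption; the specification above does not state overflow behaviour for the uptime fields.

```c
#include <stdint.h>

/* Write a 24-bit value big-endian into three bytes. */
static void put_be24(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)(v >> 16);
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)v;
}

/* Encode the 9-byte Status payload from uptimes in seconds.
 * Pass lifetime_s = 0 when lifetime uptime is not tracked. */
void status_tlv_encode(uint8_t out[9], uint32_t session_s, uint32_t lifetime_s,
                       uint16_t restarts, uint8_t reason) {
    uint32_t session = session_s / 5;             /* 5-second ticks */
    uint32_t lifetime = lifetime_s / 5;
    if (session > 0xFFFFFF) session = 0xFFFFFF;   /* saturate: an assumption */
    if (lifetime > 0xFFFFFF) lifetime = 0xFFFFFF;
    put_be24(out, session);
    put_be24(out + 3, lifetime);
    out[6] = (uint8_t)(restarts >> 8);            /* uint16 big-endian */
    out[7] = (uint8_t)restarts;
    out[8] = reason;                              /* e.g. 0x03 = WATCHDOG */
}
```

With a session uptime of 86400 s, a lifetime uptime of 1209600 s, 12 restarts and reason WATCHDOG, the payload is 00 43 80 03 B1 00 00 0C 03.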
JSON representation:
{
"type": 2,
"format": "status",
"data": {
"session_uptime": 86400,
"lifetime_uptime": 1209600,
"restarts": 12,
"reason": "watchdog"
}
}
The gateway destructures the 9-byte raw data into named fields. Uptime values
are converted to seconds (ticks × 5) for the JSON output. A lifetime_uptime of 0
is omitted from the JSON or represented as null to indicate "not tracked". The
reason field is a lowercase string using the name column from the reason table
for globally defined codes (0x00-0x7F), or the numeric value for vendor-specific
codes (e.g. "reason": 131).
7 bytes, raw format.
Reports runtime hardware state: thermal, electrical, memory, and duty cycle metrics. These change during operation and are useful for detecting overheating, power supply issues, memory leaks, and validating power budgets.
 0               1               2
 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| CPU Temp (8)  |      Supply Voltage (16)      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Free Heap (16)         |  Active (16)  :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: Active cont.  |
+-+-+-+-+-+-+-+-+
CPU Temperature (1 byte, int8, signed): Internal die temperature in degrees Celsius. Range: -40 to +85°C. Resolution: 1°C.
Most MCUs have an internal temperature sensor: ESP32 provides
temperatureRead(), STM32 provides an internal ADC channel. The reading
reflects die temperature, which is typically 5-15°C above ambient depending on
workload and packaging.
Devices without an internal temperature sensor MUST transmit 0x7F (127). This value is outside the normal operating range and the receiver interprets it as "not available".
Supply Voltage (2 bytes, uint16, big-endian): Raw supply rail voltage in millivolts. Range: 0-65535 mV.
This is distinct from the Battery field (Section 8.1) which reports a percentage level. Supply voltage provides absolute electrical data: solar panel output voltage, regulator headroom, voltage sag under transmit load, or direct battery voltage before any regulation.
For devices powered via a regulated 3.3V rail, this may be a fixed value and is less informative. For solar-powered devices with a wide input range, this is a key diagnostic.
Free Heap (2 bytes, uint16, big-endian): Remaining free heap memory in bytes. Range: 0-65535.
ESP32 provides esp_get_free_heap_size(). For devices with more than 65535
bytes free, report 65535 (capped). A steadily decreasing free heap over time
indicates a memory leak.
Devices without dynamic memory allocation or without a mechanism to query free heap MUST transmit 0xFFFF (65535). Since this is also the cap value, the receiver treats it as "healthy or not tracked".
Session Active (2 bytes, uint16, big-endian): Accumulated time spent in active state (not in deep sleep) since the most recent boot, measured in 5-second ticks.
Maximum: 65535 ticks = 327675 seconds ≈ 91.0 hours.
The firmware increments this counter each time it wakes from sleep, accumulating the duration of each active period. Comparing session active to session uptime (from Status, Section 9.5.2) yields the duty cycle:
duty_cycle = session_active / session_uptime
A sensor with 86400s session uptime but 200 active ticks (1000s) has a duty cycle of ~1.2%, confirming that power budgets are being met.
For devices that do not sleep (always-on gateways, relay nodes), session active equals session uptime and this field provides no additional information. Such devices may omit the Health TLV or set session active to 0x0000 to indicate "not tracked".
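The Health layout and the duty-cycle calculation can be sketched as below. The function names are hypothetical; the free-heap cap follows the text above, while saturating the active-time counter at 65535 ticks is an assumption.

```c
#include <stdint.h>

/* Encode the 7-byte Health payload. Pass cpu_temp_c = 127 when no
 * internal temperature sensor is available. */
void health_tlv_encode(uint8_t out[7], int8_t cpu_temp_c, uint16_t supply_mv,
                       uint32_t free_heap, uint32_t active_s) {
    uint32_t ticks = active_s / 5;                 /* 5-second ticks */
    if (free_heap > 0xFFFF) free_heap = 0xFFFF;    /* cap as described above */
    if (ticks > 0xFFFF) ticks = 0xFFFF;            /* saturate: an assumption */
    out[0] = (uint8_t)cpu_temp_c;                  /* int8, two's complement */
    out[1] = (uint8_t)(supply_mv >> 8);
    out[2] = (uint8_t)supply_mv;
    out[3] = (uint8_t)(free_heap >> 8);
    out[4] = (uint8_t)free_heap;
    out[5] = (uint8_t)(ticks >> 8);
    out[6] = (uint8_t)ticks;
}

/* Duty cycle from Health session-active ticks and Status session-uptime
 * ticks. Returns 0.0 when the uptime is zero. */
double duty_cycle(uint32_t active_ticks, uint32_t uptime_ticks) {
    return uptime_ticks ? (double)active_ticks / (double)uptime_ticks : 0.0;
}
```

For the example in the text, duty_cycle(200, 17280) evaluates to roughly 0.0116, where 17280 ticks is the 86400 s session uptime.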
JSON representation:
{
"type": 3,
"format": "health",
"data": {
"cpu_temp": 34,
"supply_mv": 3842,
"free_heap": 42816,
"session_active": 1050
}
}
The gateway destructures the 7-byte raw data into named fields. The cpu_temp
is signed degrees Celsius. The supply_mv is millivolts. The free_heap is
bytes. The session_active is converted to seconds (ticks × 5).
A cpu_temp of 127 is omitted from the JSON or represented as null to
indicate "not available".
Variable length, string format.
Reports current device configuration as space-delimited key-value pairs, encoded using the 6-bit packed character set (Appendix A).
The content is a sequence of alternating tokens separated by single spaces:
KEY1 VALUE1 KEY2 VALUE2 ...
Odd-position tokens (1st, 3rd, 5th, ...) are keys. Even-position tokens (2nd, 4th, 6th, ...) are values. The total token count MUST be even (every key has a corresponding value).
Keys and values MUST NOT contain spaces. Keys and values may use any length and any mix of characters available in the 6-bit character set. Short uppercase identifiers are recommended for keys to minimise wire size, but this is a convention, not a requirement.
Examples:
- TX 30 SF 7 PW 14 CH 23: radio configuration
- INT 10 BAT LOW: 10-second interval, battery threshold LOW
- MODE NORMAL THRESH 50: operating mode and threshold
- FW 142 HW 3: firmware version 142, hardware revision 3
The key namespace is application-defined and not standardised by this protocol. Different sensor types may use different keys. The receiver presents the pairs as-is; it does not need to understand the key semantics.
In the rare case where configuration values contain characters outside the 6-bit set, raw format (Format = 0) MAY be used with 8-bit ASCII bytes following the same space-delimited convention. This should be avoided where possible.
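On the receiving side, the alternating key-value rule can be checked while tokenising the decoded string. The sketch below uses the standard strtok(), which modifies its input and silently collapses repeated spaces; a stricter parser would reject those. The type and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *key;
    const char *value;
} kv_pair_t;

/* Split a decoded Config string in place into key-value pairs.
 * Returns the number of pairs, or -1 if the token count is odd
 * or exceeds max_pairs. */
int config_parse(char *text, kv_pair_t *pairs, int max_pairs) {
    int tokens = 0;
    char *tok = strtok(text, " ");
    while (tok != NULL) {
        if (tokens / 2 >= max_pairs) return -1;    /* too many pairs */
        if (tokens % 2 == 0)
            pairs[tokens / 2].key = tok;           /* odd position: key */
        else
            pairs[tokens / 2].value = tok;         /* even position: value */
        tokens++;
        tok = strtok(NULL, " ");
    }
    return (tokens % 2 == 0) ? tokens / 2 : -1;    /* every key needs a value */
}
```

The same routine would serve for the Version TLV, since it shares the Config convention.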
JSON representation:
{
"type": 4,
"format": "config",
"data": {
"TX": "30",
"SF": "7",
"PW": "14",
"CH": "23"
}
}
The gateway parses the space-delimited tokens into alternating key-value pairs and presents them as a JSON object. Both keys and values are strings.
Variable length, string format.
A free-form diagnostic message from the device. This is the device's mechanism for reporting conditions that do not map to any structured field: error messages, warning strings, state transitions, or any other human-readable diagnostic information.
The message is encoded using the 6-bit packed character set (Appendix A). This covers uppercase alphanumeric characters, digits, and space — sufficient for diagnostic messages.
Examples:
- SENSOR FAULT I2C
- LOW SIGNAL
- SD FULL
- LORA TX FAIL 3
- BME280 CRC ERR
There is no structure imposed on the message content. The protocol does not
define severity levels, error codes, or categories. Conventions such as
prefixing with a subsystem name (I2C, LORA, SD) are recommended but not
required.
In the rare case where a diagnostic message contains characters outside the 6-bit set, raw format (Format = 0) MAY be used with 8-bit ASCII bytes. This should be avoided where possible as it increases the wire size by 33%.
Multiple diagnostic messages may be sent by chaining TLV entries using the More bit. Each entry carries one message.
JSON representation:
{ "type": 5, "format": "string", "data": "SENSOR FAULT I2C" }
The format field reflects the wire encoding ("string" or "raw" in the
exceptional case). The data field is always a decoded text string regardless
of wire format.
Variable length, string format.
Reports a user-initiated event or interaction, encoded using the 6-bit packed character set (Appendix A).
This covers any event that originates from physical user interaction with the device rather than from automated sensor readings: button presses, switch changes, mode selections, tamper detection, or manual triggers.
The content is a free-form string describing the event. Examples:
- BTN A: button A pressed
- BTN B LONG: button B long-press
- MODE 2: user selected operating mode 2
- TAMPER: enclosure tamper switch triggered
- ARM: user armed the device
- CAL START: user initiated calibration
- DOOR OPEN: door sensor triggered
No structure is imposed on the message content. The sensor firmware defines the event vocabulary appropriate to its hardware and application.
In the rare case where an event description contains characters outside the 6-bit set, raw format (Format = 0) MAY be used with 8-bit ASCII bytes. This should be avoided where possible.
JSON representation:
{ "type": 6, "format": "string", "data": "BTN A" }
The format field reflects the wire encoding. The data field is a decoded
text string.
Gateways and servers typically convert binary packets to JSON for storage,
forwarding, and human inspection. The reference implementation provides
bidirectional conversion (iotdata_decode_to_json and
iotdata_encode_from_json) with the following canonical mapping.
The JSON field names are derived from the variant's field labels, so the same
binary encoding may produce different JSON keys depending on variant. For
example, a default weather station variant produces "wind" as a bundled JSON
object, while a custom variant using individual wind fields produces separate
"wind_speed", "wind_direction", and "wind_gust" keys. Similarly, the
"depth" field type may produce "snow_depth", "soil_depth", or any other
label depending on the variant definition.
{
"variant": 0,
"station": 42,
"sequence": 1234,
"packed_bits": 120,
"packed_bytes": 15,
"battery": {
"level": 84,
"charging": false
},
"link": {
"rssi": -96,
"snr": 10.0
},
"environment": {
"temperature": 21.5,
"pressure": 1013,
"humidity": 45
},
"wind": {
"speed": 5.0,
"direction": 180,
"gust": 8.5
},
"rain": {
"rate": 3,
"size": 2.5
},
"solar": {
"irradiance": 850,
"ultraviolet": 7
}
}
TLV entries are represented as an array under "data". Each entry contains
"type", "format", and "data" fields.
The "type" field is the numeric TLV type identifier. The "format" field
indicates how the "data" field should be interpreted.
TLV types that have a defined JSON representation (Section 9.5) are destructured
by the gateway into structured objects or decoded strings. The "format" field
reflects the structured type rather than the wire encoding:
| Type | Format | data contains |
|---|---|---|
| 0x01 | "version" | Structured object: firmware/hardware key-values |
| 0x02 | "status" | Structured object: uptimes, restarts, reason |
| 0x03 | "health" | Structured object: temp, voltage, heap, active |
| 0x04 | "config" | Structured object: key-value pairs |
| 0x05 | "string" | Decoded diagnostic text string |
| 0x06 | "string" | Decoded userdata text string |
Example with all global types:
{
"data": [
{
"type": 1,
"format": "version",
"data": {
"FW": "142",
"HW": "3"
}
},
{
"type": 2,
"format": "status",
"data": {
"session_uptime": 86400,
"lifetime_uptime": 1209600,
"restarts": 12,
"reason": "watchdog"
}
},
{
"type": 3,
"format": "health",
"data": {
"cpu_temp": 34,
"supply_mv": 3842,
"free_heap": 42816,
"session_active": 1050
}
},
{
"type": 4,
"format": "config",
"data": {
"TX": "30",
"SF": "7",
"PW": "14"
}
},
{ "type": 5, "format": "string", "data": "LOW SIGNAL" },
{ "type": 6, "format": "string", "data": "BTN A" }
]
}
Note that a single packet would not typically contain all of these. A normal transmission might include only sensor data fields with no TLV entries at all, or one or two TLV entries such as Status and a Diagnostic message. Packets may also contain repeated entries, for example, multiple Diagnostic or Userdata TLVs.
TLV types that do not have a defined JSON representation — including proprietary types (0x20+), reserved types, and any type the gateway does not recognise — fall back to a generic encoding based on the wire format bit:
| Wire format | "format" | "data" contains |
|---|---|---|
| raw (0) | "raw" | Base64-encoded byte string |
| string (1) | "string" | Decoded text string |
Examples:
{
"data": [
{ "type": 32, "format": "raw", "data": "A0b4901=" },
{ "type": 33, "format": "string", "data": "HELLO WORLD" }
]
}
This ensures that all TLV entries are representable in JSON even if the gateway has no knowledge of the type's semantics. The raw Base64 or decoded string is passed through for downstream consumers to interpret.
The "format" field serves as a discriminator for how to parse the "data"
field. The complete set of values:
| Value | data type | Source |
|---|---|---|
| "raw" | string (Base64) | Fallback for unrecognised raw TLV types |
| "string" | string (text) | String-format TLVs (wire or defined) |
| "version" | object | Version TLV (0x01) |
| "status" | object | Status TLV (0x02) |
| "health" | object | Health TLV (0x03) |
| "config" | object | Config TLV (0x04) |
Note that "string" appears both as the defined format for Diagnostic (0x05)
and Userdata (0x06), and as the fallback for unrecognised string-format TLVs.
This is intentional — the representation is identical in both cases (a plain
text string), so no distinction is needed.
The JSON representation MUST support lossless round-trip conversion: encoding a packet to binary, decoding to JSON, re-encoding from JSON, and comparing the resulting binary MUST produce an identical byte sequence. The reference implementation test suite verifies this property.
This section describes algorithms and considerations that apply to the receiving side (gateway, server, or any device decoding packets).
The datetime field encodes seconds from the start of the current year but does not transmit the year. The receiver MUST resolve the year using the following algorithm:
- Let T_rx be the receiver's current UTC time.
- Let Y be the year component of T_rx.
- Decode the datetime field to obtain S seconds from year start.
- Compute T_decoded = Y-01-01T00:00:00Z + S seconds.
- If T_decoded is more than 6 months in the future relative to T_rx, subtract one year: T_decoded = (Y-1)-01-01T00:00:00Z + S.
This handles the year boundary: a packet timestamped December 31 and received January 1 is correctly attributed to the previous year.
The 24-bit field at 5-second resolution supports approximately 971 days, so the encoding does not wrap within a single year.
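The year-resolution algorithm can be sketched without relying on platform calendar functions. In the example below all names are hypothetical, times are Unix seconds in UTC, and "6 months" is taken as 183 days, which is an assumption since the threshold is not defined more precisely above.

```c
#include <stdint.h>

#define SECS_PER_DAY 86400LL
/* "6 months" is taken as 183 days here; the exact threshold is an assumption. */
#define SIX_MONTHS_S (183LL * SECS_PER_DAY)

/* Days from 1970-01-01 to January 1 of year y (proleptic Gregorian). */
static int64_t days_to_year_start(int64_t y) {
    int64_t p = y - 1;                          /* complete years before y */
    return 365 * p + p / 4 - p / 100 + p / 400 - 719162;
}

/* Calendar year containing Unix time t (t >= 0 assumed). */
static int64_t year_of(int64_t t) {
    int64_t days = t / SECS_PER_DAY;
    int64_t y = 1970 + days / 366;              /* estimate that never overshoots */
    while (days_to_year_start(y + 1) <= days) y++;
    return y;
}

/* Resolve a datetime field (5-second ticks from year start) against the
 * receiver clock t_rx and return the absolute Unix time. */
int64_t datetime_resolve(int64_t t_rx, uint32_t ticks) {
    int64_t y = year_of(t_rx);
    int64_t s = (int64_t)ticks * 5;
    int64_t t = days_to_year_start(y) * SECS_PER_DAY + s;
    if (t - t_rx > SIX_MONTHS_S)                /* too far ahead: previous year */
        t = days_to_year_start(y - 1) * SECS_PER_DAY + s;
    return t;
}
```

A packet stamped 31 December 2025 23:59:55 (6,307,199 ticks) and received ten seconds after midnight on 1 January 2026 first decodes to a time almost a year ahead, so the correction step moves it back into 2025.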
The accuracy of the decoded timestamp depends on the accuracy of the transmitter's time source. For GNSS-synchronised devices this is typically sub-second; for free-running RTC devices it may drift by seconds per day. See Section 11.3.
The position field (Section 8.7) encodes latitude and longitude without indicating the source. The practical accuracy varies significantly by source:
- GNSS (GPS/Galileo/GLONASS): typically 2-5 metre accuracy, so the ~1.2 m quantisation of the 24-bit encoding is not the limiting factor.
- WiFi geolocation: typically 15-50 metre accuracy. The quantisation error is negligible relative to the source error.
- Cell tower: typically 100-1000 metre accuracy.
- Static configuration: the operator programmes the known coordinates at deployment time. Accuracy depends on the method used (surveyed, map click, etc.).
In a closed system where the operator controls all devices and knows their position sources, this ambiguity is acceptable. For open or interoperable systems, the source and accuracy SHOULD be communicated via a sensor metadata TLV (Section 11.3).
Similarly, for fixed-position sensors, the position field is typically transmitted once at startup or periodically at a low rate, not on every packet. The receiver SHOULD cache the last known position for a station and associate it with subsequent packets that omit position.
The core protocol deliberately omits sensor metadata such as:
- Sensor type (NTC thermistor, BME280, SHT40, etc.)
- Measurement accuracy or precision class
- Position source (GNSS, static config, etc.)
- Time source (GNSS, NTP, free-running RTC, etc.)
- Calibration date or coefficients
In a closed system — where one operator controls all devices and the gateway software — this information is known out-of-band. The operator knows that station 42 uses a BME280 for environment readings with ±1°C accuracy, has a static position programmed at deployment, and synchronises time via NTP. No wire overhead is needed.
For interoperable systems — where devices from different manufacturers or deployments share a common receiver — sensor metadata becomes important. The protocol reserves TLV types 0x10-0x1F for future standardised metadata TLVs that could convey:
- Source type per field (e.g. "position source = GNSS" or "temperature sensor = BME280")
- Accuracy class or error bounds
- Calibration metadata
The design of these metadata TLVs is deferred to a future revision of this specification. Implementers requiring interoperability before that revision MAY use application-defined TLV types (0x20-) for this purpose, with the understanding that these are not standardised.
This approach follows a deliberate design philosophy: add wire overhead only when it is needed. A snow depth sensor transmitting to its own gateway every 15 minutes on a coin cell battery should not pay the cost of metadata bytes that the receiver already knows.
A receiver encountering a variant number that it does not have a table entry for SHOULD:
- Fall back to variant 0's field mapping for decoding.
- Flag the packet as using an unknown variant in its output (e.g. a warning in the print output or a field in the JSON).
- NOT reject the packet, since the field encodings are universal and the data is likely still meaningful.
In the reference implementation, iotdata_get_variant() returns variant 0's
table as a fallback for any unknown variant number.
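A minimal sketch of such a lookup is shown below. The types and names (`example_variant_t`, `example_lookup_variant`) are hypothetical and much simpler than the library's own variant table, and the second table entry is made up for illustration. The `strict` parameter is an assumption of this sketch: it selects between the fallback described here and outright rejection, for receivers that discard unknown variants (Section 11.6).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified variant table entry (not the library's type). */
typedef struct {
    uint8_t     id;
    const char *name;
} example_variant_t;

static const example_variant_t example_variants[] = {
    { 0, "weather_station" },
    { 1, "example_custom" }, /* made-up entry for illustration */
};

/* Looks up a variant by ID. On a miss, *unknown is set so the caller can
 * flag the packet; the result is variant 0 (lenient) or NULL (strict). */
static const example_variant_t *example_lookup_variant(uint8_t id, bool strict,
                                                       bool *unknown)
{
    size_t n = sizeof(example_variants) / sizeof(example_variants[0]);
    assert(unknown != NULL);
    *unknown = false;
    for (size_t i = 0; i < n; i++)
        if (example_variants[i].id == id)
            return &example_variants[i];
    *unknown = true;
    return strict ? NULL : &example_variants[0];
}
```

A caller in lenient mode would decode with the returned table and copy the `unknown` flag into its print or JSON output.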
Receivers should be aware of the quantisation errors inherent in the encoding. These are systematic and deterministic — not noise — and SHOULD be accounted for in any downstream processing.
| Field | Bits | Range | Resolution | Max quant error |
|---|---|---|---|---|
| Battery level | 5 | 0-100% | ~3.23% | ±1.6% |
| Link RSSI | 4 | -120 to -60 dBm | 4 dBm | ±2 dBm |
| Link SNR | 2 | -20 to +10 dB | 10 dB | ±5 dB |
| Temperature | 9 | -40 to +80°C | 0.25°C | ±0.125°C |
| Pressure | 8 | 850-1105 hPa | 1 hPa | ±0.5 hPa |
| Humidity | 7 | 0-100% | 1% | ±0.5% |
| Wind speed | 7 | 0-63.5 m/s | 0.5 m/s | ±0.25 m/s |
| Wind direction | 8 | 0-355° | ~1.41° | ±0.7° |
| Wind gust | 7 | 0-63.5 m/s | 0.5 m/s | ±0.25 m/s |
| Rain rate | 8 | 0-255 mm/hr | 1 mm/hr | ±0.5 mm/hr |
| Rain size | 4 | 0-6.0 mm/d | 0.25 mm/d | ±0.5 mm/d |
| Solar Irradiance | 10 | 0-1023 W/m² | 1 W/m² | ±0.5 W/m² |
| Solar UV Index | 4 | 0-15 | 1 | ±0.5 |
| Clouds | 4 | 0-8 okta | 1 okta | ±0.5 okta |
| AQ Index | 9 | 0-500 AQI | 1 AQI | ±0.5 AQI |
| AQ PM channels | 8 | 0-1275 µg/m³ | 5 µg/m³ | ±2.5 µg/m³ |
| AQ Gas VOC idx | 8 | 0-510 | 2 idx pts | ±1 idx pt |
| AQ Gas NOx idx | 8 | 0-510 | 2 idx pts | ±1 idx pt |
| AQ Gas CO₂ | 10 | 0-51,150 ppm | 50 ppm | ±25 ppm |
| AQ Gas CO | 10 | 0-1,023 ppm | 1 ppm | ±0.5 ppm |
| AQ Gas HCHO | 10 | 0-5,115 ppb | 5 ppb | ±2.5 ppb |
| AQ Gas O₃ | 10 | 0-1,023 ppb | 1 ppb | ±0.5 ppb |
| Radiation CPM | 16 | 0-65535 CPM | 1 CPM | ±0.5 CPM |
| Radiation dose | 14 | 0-163.83 µSv/h | 0.01 µSv/h | ±0.005 µSv/h |
| Depth | 10 | 0-1023 cm | 1 cm | ±0.5 cm |
| Latitude | 24 | -90° to +90° | ~0.00001073° | ~0.6 m |
| Longitude | 24 | -180° to +180° | ~0.00002146° | ~1.2 m (eq) |
| Datetime | 24 | 0-83.9M seconds | 5 s | ±2.5 s |
These quantisation errors are generally smaller than the measurement uncertainty of the sensors themselves. For example, a typical BME280 temperature sensor has ±1°C accuracy, well above the 0.125°C quantisation error.
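To make the "Max quant error" column concrete, the sketch below quantises and dequantises a temperature using the range and resolution from the table. It assumes a linear mapping (value = minimum + raw × resolution) with round-to-nearest; the normative formulae are those in Section 8, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptor for a linearly quantised field. */
typedef struct {
    float min;  /* value represented by raw == 0 */
    float step; /* resolution: value change per raw step */
} example_quant_t;

/* Temperature row of the table: -40 degC minimum, 0.25 degC resolution. */
static const example_quant_t EXAMPLE_TEMPERATURE = { -40.0f, 0.25f };

/* Round-to-nearest quantisation (value must not be below the minimum). */
static uint32_t example_quantise(const example_quant_t *q, float value)
{
    assert(value >= q->min);
    return (uint32_t)((value - q->min) / q->step + 0.5f);
}

static float example_dequantise(const example_quant_t *q, uint32_t raw)
{
    return q->min + (float)raw * q->step;
}

/* Worst-case error of round-to-nearest is half a step. */
static float example_max_error(const example_quant_t *q)
{
    return q->step / 2.0f;
}
```

Under these assumptions a reading of 21.6°C is transmitted as raw value 246 and decoded as 21.5°C, an error of 0.1°C, which is inside the ±0.125°C bound.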
The encode formulae in Section 8 define behaviour for values within the stated range of each field. The protocol does not mandate a specific behaviour for out-of-range inputs (e.g. a temperature of -45°C when the defined range is -40°C to +80°C, or a wind speed of 70 m/s when the maximum is 63.5 m/s).
Implementations SHOULD adopt one of the following strategies, applied consistently across all field types:
1. Clamp to range. Values below the minimum are encoded as the minimum; values above the maximum are encoded as the maximum. This preserves the invariant that every input produces a valid encoded value. Clamped values are silently distorted: the receiver cannot distinguish a clamped reading from a genuine reading at the boundary.
2. Reject and omit. The encoder refuses to encode the field and clears its presence bit. The receiver sees the field as absent rather than as a potentially misleading value. This is appropriate for safety-critical deployments where an out-of-range reading may indicate a sensor fault.
3. Clamp and flag. As (1), but the encoder also sets a deployment-specific flag bit (Section 8.6) or emits a DIAGNOSTIC TLV indicating the condition.
The reference implementation uses strategy (1): all values are clamped to the representable range with no indication to the receiver. Deployments that require out-of-range detection SHOULD use the Flags field or a DIAGNOSTIC TLV to communicate the condition.
For fields with a defined "not available" sentinel (e.g. CPU Temperature = 0x7F in the Health TLV), the sentinel MUST NOT be produced by clamping. If the clamped value would collide with the sentinel, the encoder MUST use strategy (2) instead.
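The three strategies can be sketched as a single range check applied before quantisation. This is an illustration only: the enum, the function name and the way the flag is reported are hypothetical, and the reference implementation (which always clamps) is not structured this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    EXAMPLE_RANGE_CLAMP,         /* strategy (1): clamp silently */
    EXAMPLE_RANGE_REJECT,        /* strategy (2): omit the field */
    EXAMPLE_RANGE_CLAMP_AND_FLAG /* strategy (3): clamp and signal */
} example_range_policy_t;

/* Returns true if the field should be encoded, with the value to encode
 * in *out; returns false if the field should be omitted (presence bit
 * cleared). *flag is set when a value was clamped under strategy (3), so
 * the caller can set a flag bit or emit a DIAGNOSTIC TLV. */
static bool example_apply_range(float value, float min, float max,
                                example_range_policy_t policy,
                                float *out, bool *flag)
{
    assert(out != NULL && flag != NULL && min <= max);
    *flag = false;
    if (value >= min && value <= max) {
        *out = value; /* in range: encode unchanged */
        return true;
    }
    if (policy == EXAMPLE_RANGE_REJECT)
        return false;
    *out = (value < min) ? min : max;
    *flag = (policy == EXAMPLE_RANGE_CLAMP_AND_FLAG);
    return true;
}
```

The sentinel rule would sit on top of this: if the clamped value would quantise to a "not available" sentinel, the caller falls back to the reject path.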
The protocol is designed for environments where packet corruption is handled at the link layer (LoRa CRC, LoRaWAN MIC, cellular integrity checks). However, decoders may encounter malformed packets due to firmware bugs, version mismatches, partial reception on links without CRC, or deliberate fuzzing. This section defines the expected decoder behaviour.
A decoder that encounters any condition it cannot resolve MUST discard the entire packet. Partial decoding — where some fields are extracted and others are silently skipped or defaulted — is NOT RECOMMENDED, as it can produce internally inconsistent records (e.g. a wind direction without a wind speed, or a position from a different transmission cycle than the temperature).
A decoder MAY log or count discarded packets for diagnostic purposes. The discard reason SHOULD be made available to the operator.
The following conditions MUST result in packet discard:
- Packet too short. A packet shorter than 5 bytes (32-bit header + 1 presence byte) is not a valid iotdata packet.
- Unknown variant. A variant ID that does not appear in the decoder's variant table. See also Section 11.4. The decoder cannot determine field widths or ordering without a variant definition, so no fields can be extracted.
- Truncated fields. The presence bits indicate a field is present, but the remaining packet data is insufficient to contain it. This typically indicates corruption or a version mismatch where the receiver's field table does not match the transmitter's.
- Truncated TLV. The TLV bit (bit 6 of Presence Byte 0) is set, but the remaining data after the last data field is insufficient to contain a valid TLV header (16 bits), or a TLV entry's length field extends past the end of the packet.
- Extension byte overflow. A presence byte chain exceeds the decoder's maximum supported depth (4 bytes in the reference implementation). A decoder SHOULD discard rather than attempt to process an unexpectedly deep presence chain, as it may indicate corruption of the extension bits.
The following conditions are anomalous but not fatal. A decoder SHOULD process the packet and MAY flag the anomaly:
- Quantised value at range boundary. A decoded value at exactly the minimum or maximum of its defined range is valid. It may represent a clamped out-of-range input (see Section 11.5), but the decoder cannot distinguish this from a genuine boundary reading.
- Out-of-range quantised value. A raw quantised value that exceeds the number of defined steps (e.g. humidity = 120 in a 7-bit field with range 0–100) indicates corruption or a version mismatch. The decoder SHOULD clamp the value to the defined range and MAY flag the anomaly. Discarding is also acceptable.
- Sequence number discontinuity. A gap in the sequence number indicates lost packets, not a malformed packet. The receiver SHOULD track and report gaps but MUST NOT discard the current packet.
- Unknown TLV type. A TLV entry with an unrecognised type code is not an error. The decoder MUST skip the entry using its length field and continue processing subsequent TLV entries and SHOULD preserve the entry in its generic form (Section 10) for downstream consumers.
- Trailing bytes. If all presence-indicated fields and TLV entries have been decoded and bytes remain in the packet, the decoder SHOULD ignore the trailing data. This allows future protocol extensions to append data without breaking existing decoders.
Decoders MUST validate buffer bounds before every field read. The bit-packing
functions in the reference implementation accept a max_bits parameter and
return an error if a read would exceed it. Implementations that omit bounds
checking risk buffer overflows from crafted or corrupted packets.
A decoder operating on untrusted input (e.g. a gateway receiving packets from unknown stations) SHOULD treat all packets as potentially malformed and MUST NOT assume that a valid header implies a valid payload.
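The bounds-checking requirement can be illustrated with a small bit reader that refuses any read extending past the end of the packet. This is a sketch, not the library's bit-packing code: the function name and signature are hypothetical, and MSB-first bit order within each byte is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reads `nbits` (1..32) from `buf`, starting at bit offset *pos, most
 * significant bit first (assumed ordering). Returns 0 on success and
 * advances *pos; returns -1 without touching *pos or *out if the read
 * would pass `max_bits`. */
static int example_bits_read(const uint8_t *buf, size_t max_bits, size_t *pos,
                             unsigned nbits, uint32_t *out)
{
    assert(buf != NULL && pos != NULL && out != NULL);
    if (nbits == 0 || nbits > 32 || *pos > max_bits || nbits > max_bits - *pos)
        return -1; /* truncated field: caller discards the packet */
    uint32_t v = 0;
    for (unsigned i = 0; i < nbits; i++, (*pos)++) {
        unsigned bit = (unsigned)(buf[*pos / 8] >> (7 - (*pos % 8))) & 1u;
        v = (v << 1) | bit;
    }
    *out = v;
    return 0;
}
```

A decoder built on such a reader sets `max_bits` to eight times the received length and treats any -1 as the "truncated fields" condition.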
The following table shows exact bit and byte counts for common packet configurations, using variant 0 (weather_station).
| Scenario | Fields | Bits | Bytes |
|---|---|---|---|
| Heartbeat (no data) | header + pres0 | 40 | 5 |
| Minimal (battery only) | + battery | 46 | 6 |
| Battery + environment | + battery, environment | 70 | 9 |
| Typical pres0 (bat+env+wind+rain) | + battery, environment, wind, rain | 104 | 13 |
| Full pres0 (all 6 fields) | + battery, env, wind, rain, solar, link | 124 | 16 |
| Full station (all 12 fields) | + all 12 field types (pres0 + pres1) | 253 | 32 |
For comparison, the equivalent data in JSON would typically be 200-600 bytes, and in a packed C struct with byte alignment would be 40-60 bytes.
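The byte counts in the table follow from summing the bit widths and rounding up to a whole byte. The sketch below reproduces that arithmetic; the per-field widths are inferred from the bit widths in the Section 11.5 table and from the differences between rows above (for example battery = 46 - 40 = 6 bits), and are not taken from the variant definition. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bit widths inferred from the tables in this section (assumptions). */
enum {
    EXAMPLE_BITS_HEADER      = 32,
    EXAMPLE_BITS_PRESENCE    = 8,
    EXAMPLE_BITS_BATTERY     = 6,  /* 46 - 40 in the table above */
    EXAMPLE_BITS_ENVIRONMENT = 24, /* 9 + 8 + 7 */
    EXAMPLE_BITS_WIND        = 22, /* 7 + 8 + 7 */
    EXAMPLE_BITS_RAIN        = 12, /* 8 + 4 */
    EXAMPLE_BITS_SOLAR       = 14, /* 10 + 4 */
    EXAMPLE_BITS_LINK        = 6   /* 4 + 2 */
};

/* Total bits for a packet with one presence byte and the given fields. */
static size_t example_packet_bits(const unsigned *field_bits, size_t n)
{
    assert(field_bits != NULL || n == 0);
    size_t bits = EXAMPLE_BITS_HEADER + EXAMPLE_BITS_PRESENCE;
    for (size_t i = 0; i < n; i++)
        bits += field_bits[i];
    return bits;
}

/* Packets occupy whole bytes, so round up. */
static size_t example_bits_to_bytes(size_t bits)
{
    return (bits + 7) / 8;
}
```

Summing all six first-presence-byte fields gives 124 bits, which rounds up to the 16 bytes shown in the "Full pres0" row.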
The reference implementation (libiotdata) is written in C11 and targets both
embedded systems (ESP32-C3, STM32, nRF52, Raspberry Pi) and Linux
gateways/servers. It consists of:
- iotdata.h: Public API, constants, and type definitions.
- iotdata.c: Encoder, decoder, JSON, print, and dump.
- tests/test_default.c: Test suite for the default variant.
- tests/test_custom.c: Test suite for custom variant maps.
- tests/test_failures.c: Test suite for failure modes.
- tests/test_version.c: Smoke test for build versions.
- tests/test_example.c: Test example for a periodic weather station.
- Makefile: Builds the libiotdata.a static library and tests.
Build:
make # Build library and both test suites
make tests # Build and run default and custom tests
make test-example # Build and run example test
make test-versions # Build and run versions tests
make lib # Build static library only
make minimal # Measure minimal encoder-only build
Dependencies: C11 compiler, libm, and cJSON (optional, only required for
JSON serialisation).
The encoder uses a "store then pack" strategy:
- iotdata_encode_begin() initialises the context with header values and a buffer pointer.
- iotdata_encode_battery(), iotdata_encode_environment(), etc. validate inputs and store native-typed values in the context. Fields may be added in any order.
- iotdata_encode_end() performs all bit-packing in a single pass, consulting the variant field table to determine field order and presence byte layout.
This separation means that the encoder validates eagerly (at the encode_*()
call site where the developer can handle the error) and packs lazily (in one
pass, knowing the complete field set).
/* Example: encode a weather station packet */
#define IOTDATA_VARIANT_MAPS_DEFAULT
#include "iotdata.h"
iotdata_encoder_t enc;
uint8_t buf[64];
size_t len;
iotdata_encode_begin(&enc, buf, sizeof(buf), 0, 42, seq++);
iotdata_encode_battery(&enc, 84, false);
iotdata_encode_link(&enc, -95, 8.5f);
iotdata_encode_environment(&enc, 21.5f, 1013, 45);
iotdata_encode_wind(&enc, 5.2f, 180.0f, 8.7f);
iotdata_encode_rain(&enc, 3, 15); // x10 units
iotdata_encode_solar(&enc, 850, 7);
iotdata_encode_end(&enc, &len);
/* buf[0..len-1] is now a 16-byte packet */
The library supports extensive compile-time configuration to minimise code size and memory usage on constrained targets.
Variant selection:
| Define | Effect |
|---|---|
| IOTDATA_VARIANT_MAPS_DEFAULT | Enable built-in weather station variant |
| IOTDATA_VARIANT_MAPS=<sym> | Use custom variant map array |
| IOTDATA_VARIANT_MAPS_COUNT=<n> | Number of entries in custom map |
Field support compilation:
| Define | Effect |
|---|---|
| IOTDATA_ENABLE_SELECTIVE | Only compile elements explicitly enabled below |
| IOTDATA_ENABLE_BATTERY | Compile battery field |
| IOTDATA_ENABLE_LINK | Compile link field |
| IOTDATA_ENABLE_ENVIRONMENT | Compile environment bundle |
| IOTDATA_ENABLE_TEMPERATURE | Compile temperature field |
| IOTDATA_ENABLE_PRESSURE | Compile pressure field |
| IOTDATA_ENABLE_HUMIDITY | Compile humidity field |
| IOTDATA_ENABLE_WIND | Compile wind bundle |
| IOTDATA_ENABLE_WIND_SPEED | Compile wind speed field |
| IOTDATA_ENABLE_WIND_DIR | Compile wind direction field |
| IOTDATA_ENABLE_WIND_GUST | Compile wind gust field |
| IOTDATA_ENABLE_RAIN | Compile rain bundle |
| IOTDATA_ENABLE_RAIN_RATE | Compile rain rate field |
| IOTDATA_ENABLE_RAIN_SIZE | Compile rain size field |
| IOTDATA_ENABLE_SOLAR | Compile solar field |
| IOTDATA_ENABLE_CLOUDS | Compile clouds field |
| IOTDATA_ENABLE_AIR_QUALITY | Compile air quality field |
| IOTDATA_ENABLE_RADIATION | Compile radiation bundle |
| IOTDATA_ENABLE_RADIATION_CPM | Compile radiation CPM field |
| IOTDATA_ENABLE_RADIATION_DOSE | Compile radiation dose field |
| IOTDATA_ENABLE_DEPTH | Compile depth field |
| IOTDATA_ENABLE_POSITION | Compile position field |
| IOTDATA_ENABLE_DATETIME | Compile datetime field |
| IOTDATA_ENABLE_FLAGS | Compile flags field |
| IOTDATA_ENABLE_TLV | Compile TLV fields |
When IOTDATA_ENABLE_SELECTIVE is defined, only the element types with their
corresponding IOTDATA_ENABLE_xxx defined will be compiled. When
IOTDATA_VARIANT_MAPS_DEFAULT is defined (without IOTDATA_ENABLE_SELECTIVE),
all elements used by the default weather station variant are automatically
enabled. When neither is defined, all elements are compiled.
In particular, avoidance of the TLV element will save considerable footprint.
Functional subsetting:
| Define | Effect |
|---|---|
| IOTDATA_NO_DECODE | Exclude decoder functions (also excludes print and JSON encoder) |
| IOTDATA_NO_ENCODE | Exclude encoder functions (also excludes JSON decoder) |
| IOTDATA_NO_PRINT | Exclude print functions |
| IOTDATA_NO_DUMP | Exclude dump functions |
| IOTDATA_NO_JSON | Exclude JSON functions |
| IOTDATA_NO_TLV_SPECIFIC | Exclude TLV specific type handling |
| IOTDATA_NO_CHECKS_STATE | Exclude state checking logic |
| IOTDATA_NO_CHECKS_TYPES | Exclude type checking logic |
| IOTDATA_NO_ERROR_STRINGS | Exclude error strings (and iotdata_strerror) |
These allow building an encoder-only image for a sensor node (smallest possible footprint) or a decoder-only image for a gateway.
The JSON encoding functions have a dependency on the decoder (decode from wire format and encode into JSON), and the JSON decoding functions are equivalently dependent on the encoder (decode from JSON and encode into wire format). The print functions, for brevity, are also dependent upon the decoder. The dump functions work directly upon the wire format buffer and are not dependent on either the encoder or decoder.
Be aware that IOTDATA_NO_CHECKS_STATE disables verification of non-null
iotdata_encoder_t* pointers and of the ordering of the encoding calls (i.e. that begin
must be first, followed by individual encode_ functions, before a final end).
This is moderately safe: it is acceptable to keep these checks enabled during
development and to disable them for production. It also disables null checks for
buffers passed into the dump, print and JSON functions.
IOTDATA_NO_CHECKS_TYPES disables verification of type boundaries on calls to
encode_ functions, for example that temperatures passed are between the
quantisable minimum and maximum values. This is less safe, but results only in
bad data (and badly quantised data) being passed over the wire; in particular, bad
readings obtained from sensors are no longer caught at the call site. This option
also disables length checking of TLV encoded strings (worst case, they are
truncated) and character validity checking of TLV encoded strings (worst case,
invalid characters are transmitted as spaces).
Unless there are considerable space constraints, such as on Class 1
microcontrollers (Appendix E), it is not recommended to engage either of the
NO_CHECKS options.
Floating-point control:
| Define | Effect |
|---|---|
| (default) | double for position, float for other fields |
| IOTDATA_NO_FLOATING_DOUBLES | Use float instead of double everywhere |
| IOTDATA_NO_FLOATING | Integer-only mode: all values as scaled integers |
In integer-only mode (IOTDATA_NO_FLOATING), temperature is passed as
degrees×100 (e.g. 2250 for 22.50°C), wind speed as m/s×100, radiation dose as
µSv/h×100, position as degrees×10^7, and SNR as dB×10. This eliminates all
floating-point dependencies. Future implementations SHOULD utilise this
multiple-of-ten approach.
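As an illustration of how scaled integers avoid floating point, the sketch below quantises a temperature given in degrees×100 to the 0.25°C step using integer arithmetic only. The offset, step and round-to-nearest behaviour are assumptions consistent with the range and resolution listed in Section 11.5; they are not taken from the reference implementation, and the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_TEMP_MIN_X100  (-4000) /* -40.00 degC */
#define EXAMPLE_TEMP_MAX_X100  (8000)  /* +80.00 degC */
#define EXAMPLE_TEMP_STEP_X100 (25)    /* 0.25 degC resolution */

/* Degrees x100 -> raw step count, rounded to nearest, integer-only. */
static uint16_t example_temp_quantise_x100(int32_t t_x100)
{
    assert(t_x100 >= EXAMPLE_TEMP_MIN_X100 && t_x100 <= EXAMPLE_TEMP_MAX_X100);
    return (uint16_t)((t_x100 - EXAMPLE_TEMP_MIN_X100 + EXAMPLE_TEMP_STEP_X100 / 2)
                      / EXAMPLE_TEMP_STEP_X100);
}

/* Raw step count -> degrees x100. */
static int32_t example_temp_dequantise_x100(uint16_t raw)
{
    return EXAMPLE_TEMP_MIN_X100 + (int32_t)raw * EXAMPLE_TEMP_STEP_X100;
}
```

With these assumptions 2250 (22.50°C) maps to raw value 250 and back to 2250, and the largest in-range input, 8000, maps to 480, which fits in the 9-bit field.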
The test-versions target builds each of the versions across the Functional
subsetting and Floating-point control options, including a combined NO_JSON and
NO_FLOATING version. This is intended as a build smoke test to verify
compilation control paths. Note that the combined version will, on x86
platforms, force the compiler to reject floating-point operations, so as to
ensure they are not latent in the implementation.
The minimal and minimal-esp32 targets yield object files for the purpose of
establishing minimal build sizes (with a comparison to full build sizes) using
the host (minimal) or cross-compiler (minimal-esp32) tools.
The following measurements are from GCC on x86-64, aarch64 and ESP32-C3 using
the minimal build target. With space optimisation, the minimal implementation
is less than 1KB on the embedded target.
| Configuration | x86-64 -O6 | x86-64 -Os | aarch64 -O6 | aarch64 -Os | esp32-c3 -O6 | esp32-c3 -Os |
|---|---|---|---|---|---|---|
| Full library (all elements, encode + decode + JSON) | ~85 KB | ~29 KB | ~87 KB | ~31 KB | ~67 KB | ~19 KB |
| Encoder-only, battery + environment only | ~5.5 KB | ~1.1 KB | ~5.4 KB | ~1.1 KB | ~5.0 KB | ~0.7 KB |
--- Full library ---
gcc -Wall -Wextra -Wpedantic -Werror -Wcast-align -Wcast-qual -Wstrict-prototypes -Wold-style-definition -Wcast-align -Wcast-qual -Wconversion -Wfloat-equal -Wformat=2 -Wformat-security -Winit-self -Wjump-misses-init -Wlogical-op -Wmissing-include-dirs -Wnested-externs -Wpointer-arith -Wredundant-decls -Wshadow -Wstrict-overflow=2 -Wswitch-default -Wundef -Wunreachable-code -Wunused -Wwrite-strings -O6 -DIOTDATA_VARIANT_MAPS_DEFAULT -c iotdata.c -o iotdata_full.o
text data bss dec hex filename
85866 2112 4096 92074 167aa iotdata_full.o
--- Minimal encoder (battery + environment, integer-only) ---
gcc -Wall -Wextra -Wpedantic -Werror -Wcast-align -Wcast-qual -Wstrict-prototypes -Wold-style-definition -Wcast-align -Wcast-qual -Wconversion -Wfloat-equal -Wformat=2 -Wformat-security -Winit-self -Wjump-misses-init -Wlogical-op -Wmissing-include-dirs -Wnested-externs -Wpointer-arith -Wredundant-decls -Wshadow -Wstrict-overflow=2 -Wswitch-default -Wundef -Wunreachable-code -Wunused -Wwrite-strings -O6 -mno-sse -mno-mmx -mno-80387 \
-DIOTDATA_NO_DECODE \
-DIOTDATA_ENABLE_SELECTIVE -DIOTDATA_ENABLE_BATTERY -DIOTDATA_ENABLE_ENVIRONMENT \
-DIOTDATA_NO_JSON -DIOTDATA_NO_DUMP -DIOTDATA_NO_PRINT \
-DIOTDATA_NO_FLOATING -DIOTDATA_NO_ERROR_STRINGS -DIOTDATA_NO_CHECKS_STATE -DIOTDATA_NO_CHECKS_TYPES \
-c iotdata.c -o iotdata_minimal.o
Minimal object size:
text data bss dec hex filename
5559 32 0 5591 15d7 iotdata_minimal.o
0000000000000000 l O .data.rel.ro.local 0000000000000010 _iotdata_field_ops
0000000000000018 l O .data.rel.ro.local 0000000000000008 _iotdata_field_def_battery
0000000000000010 l O .data.rel.ro.local 0000000000000008 _iotdata_field_def_environment
0000000000000000 l d .data.rel.ro.local 0000000000000000 .data.rel.ro.local
--- Full library ---
gcc -Wall -Wextra -Wpedantic -Werror -Wcast-align -Wcast-qual -Wstrict-prototypes -Wold-style-definition -Wcast-align -Wcast-qual -Wconversion -Wfloat-equal -Wformat=2 -Wformat-security -Winit-self -Wjump-misses-init -Wlogical-op -Wmissing-include-dirs -Wnested-externs -Wpointer-arith -Wredundant-decls -Wshadow -Wstrict-overflow=2 -Wswitch-default -Wundef -Wunreachable-code -Wunused -Wwrite-strings -Os -DIOTDATA_VARIANT_MAPS_DEFAULT -c iotdata.c -o iotdata_full.o
text data bss dec hex filename
29603 2112 4096 35811 8be3 iotdata_full.o
--- Minimal encoder (battery + environment, integer-only) ---
gcc -Wall -Wextra -Wpedantic -Werror -Wcast-align -Wcast-qual -Wstrict-prototypes -Wold-style-definition -Wcast-align -Wcast-qual -Wconversion -Wfloat-equal -Wformat=2 -Wformat-security -Winit-self -Wjump-misses-init -Wlogical-op -Wmissing-include-dirs -Wnested-externs -Wpointer-arith -Wredundant-decls -Wshadow -Wstrict-overflow=2 -Wswitch-default -Wundef -Wunreachable-code -Wunused -Wwrite-strings -Os -mno-sse -mno-mmx -mno-80387 \
-DIOTDATA_NO_DECODE \
-DIOTDATA_ENABLE_SELECTIVE -DIOTDATA_ENABLE_BATTERY -DIOTDATA_ENABLE_ENVIRONMENT \
-DIOTDATA_NO_JSON -DIOTDATA_NO_DUMP -DIOTDATA_NO_PRINT \
-DIOTDATA_NO_FLOATING -DIOTDATA_NO_ERROR_STRINGS -DIOTDATA_NO_CHECKS_STATE -DIOTDATA_NO_CHECKS_TYPES \
-c iotdata.c -o iotdata_minimal.o
Minimal object size:
text data bss dec hex filename
1101 32 0 1133 46d iotdata_minimal.o
0000000000000000 l O .data.rel.ro.local 0000000000000010 _iotdata_field_ops
0000000000000018 l O .data.rel.ro.local 0000000000000008 _iotdata_field_def_battery
0000000000000010 l O .data.rel.ro.local 0000000000000008 _iotdata_field_def_environment
0000000000000000 l d .data.rel.ro.local 0000000000000000 .data.rel.ro.local
--- ESP32-C3 full library (no JSON) ---
riscv32-esp-elf-gcc -march=rv32imc -mabi=ilp32 -Os -DIOTDATA_NO_JSON -c iotdata.c -o iotdata_esp32c3_full.o
text data bss dec hex filename
19513 0 0 19513 4c39 iotdata_esp32c3_full.o
--- ESP32-C3 minimal encoder (battery + environment, integer-only) ---
riscv32-esp-elf-gcc -march=rv32imc -mabi=ilp32 -Os \
-DIOTDATA_NO_DECODE \
-DIOTDATA_ENABLE_SELECTIVE -DIOTDATA_ENABLE_BATTERY -DIOTDATA_ENABLE_ENVIRONMENT \
-DIOTDATA_NO_JSON -DIOTDATA_NO_DUMP -DIOTDATA_NO_PRINT \
-DIOTDATA_NO_FLOATING -DIOTDATA_NO_ERROR_STRINGS -DIOTDATA_NO_CHECKS_STATE -DIOTDATA_NO_CHECKS_TYPES \
-c iotdata.c -o iotdata_esp32c3_minimal.o
Minimal object size:
text data bss dec hex filename
768 0 0 768 300 iotdata_esp32c3_minimal.o
00000000 l d .data 00000000 .data
The test-example target compiled with gcc -fstack-usage -Os on x86-64
illustrates per-function stack frames; nested calls accumulate.
| Function | Stack (bytes) | Notes |
|---|---|---|
| iotdata_dump_to_string | 5872 | iotdata_dump_t on stack |
| iotdata_dump_to_file | 5872 | iotdata_dump_t on stack |
| iotdata_decode_to_json | 2768 | iotdata_decoded_t + cJSON |
| iotdata_print_to_string | 2224 | iotdata_decoded_t on stack |
| iotdata_print_to_file | 2208 | iotdata_decoded_t on stack |
| iotdata_encode_from_json | 416 | Encoder context + JSON parsing |
| iotdata_dump_build | 192 | Dynamic, bounded |
| iotdata_encode_begin | < 64 | - |
| iotdata_encode_end | < 64 | - |
| iotdata_decode | < 128 | - |
The dump and print functions dominate because they allocate iotdata_dump_t
(5.8 KB) or iotdata_decoded_t (2.2 KB) on the stack. These are
gateway/diagnostic functions not intended for constrained devices. The macro
IOTDATA_MAX_DUMP_ENTRIES defaults to 48, and may be reduced to tune down this
size. Note that iotdata_decoded_t contains a complete set of all decoded
variables, which in this example is the weather station variant with expensive
TLV entries.
The encode path — encode_begin, field calls, encode_end — peaks well under
500 bytes total stack depth. On a Class 2 device with 20 KB RAM, this leaves
ample room for the RTOS stack, radio driver, and application logic.
To reduce stack usage on memory-constrained targets, allocate iotdata_dump_t
or iotdata_decoded_t as a static or global rather than calling the convenience
wrappers which declare them locally.
The variant table in the reference implementation is a compile-time array.
Adding a new variant requires defining a variant_def_t entry with the desired
field mapping and compiling with the IOTDATA_VARIANT_MAPS and
IOTDATA_VARIANT_MAPS_COUNT defines. No changes to the encoder or decoder logic
are needed — the dispatch mechanism automatically handles any valid variant
table entry.
If a new encoding type is needed (not just a relabelling of an existing type), the implementer must:
- Add a new IOTDATA_FIELD_* enum value.
- Implement the six per-field functions (pack, unpack, json_add, json_read, dump, print).
- Add a case to each of the six dispatcher functions.
- Add the appropriate constants and quantisation helpers.
This protocol provides no confidentiality, integrity, or authentication mechanisms at the packet level. It is designed for environments where these properties are provided at other layers:
- LoRaWAN provides AES-128 encryption and message integrity checks at the MAC layer.
- TLS/DTLS may be used for IP-based transports.
- Physical security may be sufficient for isolated deployments on private land.
Specific risks to consider:
- Replay attacks: An attacker could retransmit captured packets. The sequence number provides detection (not prevention) of replayed packets, but only if the receiver tracks per-station sequence state.
- Spoofing: Station IDs are not authenticated. An attacker within radio range could transmit packets with a forged station ID.
- Eavesdropping: The wire format is not encrypted. Sensor readings (temperature, position, etc.) are transmitted in the clear.
Deployments with security requirements MUST use an appropriate underlying transport that provides the needed properties.
The following items are identified for future revisions:
- Sensor metadata TLVs (types 0x10-0x1F): Standardised TLV formats for conveying sensor type, accuracy class, time source, position source, and calibration metadata. This would enable interoperability between devices from different manufacturers or deployments without prior out-of-band knowledge.
- Quality indicator fields: Per-field quality/confidence indicators (e.g. GNSS fix quality, HDOP, number of satellites). These would likely use the reserved TLV type range.
- Extended header (variant 15): A future header format with more variant bits, larger station ID space, or additional structural fields.
- Implementation singularity limitation: The wire format supports multiple instances of the same field type in different slots (e.g. two independent temperature readings). The current reference implementation uses fixed named storage in the encoder/decoder structs, limiting each field type to one instance. A future implementation could decouple field type from field storage, allowing the variant map to bind each slot to an independent storage location.
This document defines version 1 of the IoT Sensor Telemetry Protocol. The protocol does not carry an explicit version field in the packet header. Version identification relies on the combination of variant ID and the field table known to the receiver.
This is a deliberate design choice. A version field would cost 2–4 bits in every packet — significant when the minimum useful packet is 46 bits. The trade-off is that version negotiation and graceful version coexistence are not supported at the wire level.
The protocol's compatibility properties differ by component:
Within a variant definition (fully compatible). Adding or removing optional fields within an existing variant does not break compatibility. A transmitter that begins including a new field (e.g. adding air quality to a weather station that previously omitted it) is handled transparently by the presence bit mechanism. Receivers that understand the variant's field table will decode the new field; the packet is self-describing within the scope of the variant.
New variant definitions (forward compatible). A transmitter using a new variant ID (e.g. variant 3 for a soil sensor) produces packets that existing receivers cannot decode, because the receiver does not have the field table for variant 3. The receiver MUST discard such packets (Section 11.4, Section 11.6). This is the intended behaviour — variants are deployment-specific, and receivers are expected to be configured with the variant tables relevant to their deployment.
Field encoding changes (incompatible). Any change to a field type's bit width, quantisation formula, or semantic meaning is a breaking change. A receiver using the old encoding will silently produce incorrect values. There is no mechanism to detect this at the wire level. Such changes MUST be accompanied by a new variant ID, ensuring that old receivers discard the packet rather than misinterpret it.
Header changes (incompatible). Any change to the header layout (variant field width, station ID width, sequence width, or total header length) breaks all existing encoders and decoders. Such changes are not contemplated for version 1 and would constitute a new protocol version, distinguishable only by out-of-band means (e.g. separate radio channel, different LoRaWAN port, or application-layer framing).
A receiver MUST be configured — at compile time or runtime — with the set of variant definitions it is expected to decode. A receiver MUST reject packets with variant IDs not in its configured set (Section 11.4).
A receiver SHOULD NOT attempt heuristic detection of unknown field layouts. Because fields are bit-packed with no delimiters or self-describing type tags, misalignment by even one bit corrupts all subsequent fields in the packet.
When a deployment upgrades field definitions or introduces new variants, the following procedure is RECOMMENDED:
- Update receivers first. Gateways and servers are updated with the new variant tables before any transmitter firmware is changed. This ensures that new-format packets are understood upon arrival.
- Update transmitters. Sensors are updated via OTA or physical access. The transition period, where some sensors use the old variant and others use the new, is handled naturally, since each packet carries its variant ID and receivers can decode both.
- Retire old variants. Once all transmitters have been updated, old variant definitions may be removed from receiver configurations. This is optional; retaining them costs only the memory for the field table.
For breaking changes (field encoding modifications), the old and new encodings MUST use different variant IDs. This allows both to coexist during the transition.
The mesh protocol (Appendix G) uses a separate versioning strategy. Mesh control packets are identified by variant ID 15 and dispatched by the ctrl_type field. Reserved ctrl_type values (0x7–0xF) MUST be silently discarded by nodes that do not recognise them, allowing incremental deployment of new mesh packet types. See Appendix G, Section J.7 for details.
If a future protocol revision requires an explicit version field, the following mechanisms are available without breaking the v1 header layout:
- Variant-based versioning. Reserve one or more variant IDs (e.g. 14) for "versioned payload" packets where the first N bits after the presence bytes carry a version identifier. Existing v1 receivers discard variant 14 packets as unknown.
- TLV-based version advertisement. A VERSION TLV (type 0x01) already exists for firmware identification. A similar mechanism could carry a protocol version, though this is available only to receivers that successfully decode the packet, which is a circular dependency for breaking changes.
- Out-of-band signalling. LoRaWAN FPort, MQTT topic, HTTP header, or other transport-layer metadata can indicate the protocol version without consuming payload bits.
The packed string format (TLV Format = 1) encodes each character as 6 bits using the following table:
| Value | Char | Value | Char | Value | Char | Value | Char |
|---|---|---|---|---|---|---|---|
| 0 | space | 16 | p | 32 | 5 | 48 | L |
| 1 | a | 17 | q | 33 | 6 | 49 | M |
| 2 | b | 18 | r | 34 | 7 | 50 | N |
| 3 | c | 19 | s | 35 | 8 | 51 | O |
| 4 | d | 20 | t | 36 | 9 | 52 | P |
| 5 | e | 21 | u | 37 | A | 53 | Q |
| 6 | f | 22 | v | 38 | B | 54 | R |
| 7 | g | 23 | w | 39 | C | 55 | S |
| 8 | h | 24 | x | 40 | D | 56 | T |
| 9 | i | 25 | y | 41 | E | 57 | U |
| 10 | j | 26 | z | 42 | F | 58 | V |
| 11 | k | 27 | 0 | 43 | G | 59 | W |
| 12 | l | 28 | 1 | 44 | H | 60 | X |
| 13 | m | 29 | 2 | 45 | I | 61 | Y |
| 14 | n | 30 | 3 | 46 | J | 62 | Z |
| 15 | o | 31 | 4 | 47 | K | 63 | (rsvd) |
Value 63 is reserved for a future escape mechanism to extend the character set.
The corresponding encode/decode functions in the reference implementation:
static inline int char_to_6bit(char c) {
if (c == ' ') return 0;
if (c >= 'a' && c <= 'z') return 1 + (c - 'a');
if (c >= '0' && c <= '9') return 27 + (c - '0');
if (c >= 'A' && c <= 'Z') return 37 + (c - 'A');
return -1; /* unencodable */
}
static inline char sixbit_to_char(uint8_t val) {
if (val == 0) return ' ';
if (val >= 1 && val <= 26) return 'a' + (val - 1);
if (val >= 27 && val <= 36) return '0' + (val - 27);
if (val >= 37 && val <= 62) return 'A' + (val - 37);
return '?';
}
Input: 75%
q = round(75 / 100.0 * 31.0) = round(23.25) = 23
Decoded: round(23 / 31.0 * 100.0) = round(74.19) = 74%
Error: 1 percentage point
Input: -15.25°C
q = round((-15.25 - (-40.0)) / 0.25) = round(24.75 / 0.25) = round(99.0) = 99
Decoded: -40.0 + 99 * 0.25 = -40.0 + 24.75 = -15.25°C
Error: 0.00°C (exact)
Latitude:
q = round((59.334591 - (-90.0)) / 180.0 * 16777215)
= round(149.334591 / 180.0 * 16777215)
= round(0.829636617 * 16777215)
= round(13918991.6) = 13918992
Decoded: 13918992 / 16777215.0 * 180.0 + (-90.0)
= 0.829636653 * 180.0 - 90.0
= 149.334597 - 90.0 = 59.334597°
Error: 0.000006° ≈ 0.67 m
Longitude:
q = round((18.063240 - (-180.0)) / 360.0 * 16777215)
= round(198.063240 / 360.0 * 16777215)
= round(0.550175667 * 16777215)
= round(9230415.2) = 9230415
Decoded: 9230415 / 16777215.0 * 360.0 + (-180.0)
= 0.550175631 * 360.0 - 180.0
= 198.063227 - 180.0 = 18.063227°
Error: 0.000013° ≈ 0.72 m (at 59°N, cos correction)
Input: Day 5, 12:00:00 (432,000 + 43,200 = 475,200 seconds from year start)
ticks = 475200 / 5 = 95040
Decoded: 95040 * 5 = 475200 seconds
Error: 0 seconds (exact, since input is a multiple of 5)
Input: Day 5, 12:00:03 (475,203 seconds — not a multiple of 5)
ticks = 475203 / 5 = 95040 (integer division, truncated)
Decoded: 95040 * 5 = 475200 seconds
Error: 3 seconds (truncation towards zero)
Note: the encoder uses integer division (truncation), not rounding, for the datetime field. This means the decoded time is always ≤ the actual time, with a maximum error of 4 seconds.
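The worked examples above can be reproduced with a few small helpers. This is a sketch under stated assumptions rather than the reference implementation's code: the function names are hypothetical, rounding is done with a local helper instead of `<math.h>`, and the bit widths (5-bit battery, 0.25°C temperature steps from -40°C, 24-bit position, 5-second datetime ticks) are taken from the examples themselves.

```c
#include <stdint.h>

/* Round-to-nearest for non-negative values, avoiding a <math.h> dependency. */
static uint32_t round_nonneg(double x) { return (uint32_t)(x + 0.5); }

/* Battery: 0-100 % mapped onto 5 bits (0-31). */
static uint8_t battery_quantise(uint8_t pct) { return (uint8_t)round_nonneg(pct / 100.0 * 31.0); }
static uint8_t battery_dequantise(uint8_t q) { return (uint8_t)round_nonneg(q / 31.0 * 100.0); }

/* Temperature: 0.25 C steps, offset from -40 C. */
static uint16_t temperature_quantise(double c) { return (uint16_t)round_nonneg((c + 40.0) / 0.25); }
static double temperature_dequantise(uint16_t q) { return -40.0 + q * 0.25; }

/* Position: latitude and longitude each mapped onto 24 bits. */
static uint32_t latitude_quantise(double deg) { return round_nonneg((deg + 90.0) / 180.0 * 16777215.0); }
static uint32_t longitude_quantise(double deg) { return round_nonneg((deg + 180.0) / 360.0 * 16777215.0); }
static double latitude_dequantise(uint32_t q) { return q / 16777215.0 * 180.0 - 90.0; }
static double longitude_dequantise(uint32_t q) { return q / 16777215.0 * 360.0 - 180.0; }

/* Datetime: 5-second ticks from year start; integer division truncates. */
static uint32_t datetime_quantise(uint32_t seconds) { return seconds / 5u; }
static uint32_t datetime_dequantise(uint32_t ticks) { return ticks * 5u; }
```

Calling `battery_dequantise(battery_quantise(75))` gives 74, and `datetime_dequantise(datetime_quantise(475203))` gives 475200, matching the round-trip losses shown above.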
The following example from the reference implementation test suite demonstrates encoding a full weather station telemetry packet:
#define IOTDATA_VARIANT_MAPS_DEFAULT
#include "iotdata.h"
/* Encode a full weather station packet (variant 0) */
void encode_full_packet(uint8_t *buf, size_t buf_size, size_t *out_len)
{
iotdata_encoder_t enc;
iotdata_encode_begin(&enc, buf, buf_size, 0, 42, 50000);
/* Pres0 fields — most common, smallest packet when only these */
iotdata_encode_battery(&enc, 95, true);
iotdata_encode_link(&enc, -76, 10.0f);
iotdata_encode_environment(&enc, -2.75f, 1005, 95);
iotdata_encode_wind(&enc, 12.0f, 270, 18.5f);
iotdata_encode_rain(&enc, 3, 15); // x10 units
iotdata_encode_solar(&enc, 450, 7);
/* Pres1 fields — trigger extension byte */
iotdata_encode_cloud(&enc, 6);
iotdata_encode_air_quality_index(&enc, 75);
iotdata_encode_radiation(&enc, 100, 0.50f);
iotdata_encode_position(&enc, 59.334591, 18.063240);
iotdata_encode_datetime(&enc, 3251120);
iotdata_encode_flags(&enc, 0x42);
iotdata_encode_end(&enc, out_len);
/* Result: 32 bytes for all 12 fields */
}
Decoding on the receiver side:
/* Decode and inspect */
iotdata_decoded_t dec;
iotdata_decode(buf, len, &dec);
printf("Station %u: %.2f°C, %u hPa, wind %.1f m/s @ %u°\n",
dec.station, dec.temperature, dec.pressure,
dec.wind_speed, dec.wind_direction);
/* Or decode to JSON for forwarding */
char *json;
iotdata_decode_to_json(buf, len, &json);
/* ...forward json to MQTT, database, etc... */
free(json);
The iotdata protocol is designed so that a useful telemetry packet typically fits within a single link-layer frame on the target medium. Fragmenting a packet across multiple frames defeats the core design goals: each additional frame incurs a separate preamble, MAC header, and, critically on duty-cycle-regulated media, a separate transmission window. On LoRa at SF12, a single 24-byte frame takes approximately 1.5 seconds to transmit; two frames would take over 3 seconds, consume twice the energy, and halve the effective reporting rate under duty cycle constraints.
Implementers SHOULD select fields per packet to remain within the target medium's payload limit. The presence flag mechanism makes this straightforward: each packet is self-describing, so the receiver correctly handles any combination of fields without prior negotiation.
LoRa is the primary target medium. At 125 kHz bandwidth with coding rate 4/5 and explicit header, the time-on-air for representative iotdata packet sizes is:
| Packet | Bytes | SF7 | SF8 | SF9 | SF10 | SF11 | SF12 |
|---|---|---|---|---|---|---|---|
| Minimal (battery) | 6 | 36 ms | 62 ms | 124 ms | 248 ms | 496 ms | 991 ms |
| Typical (b+e+d) | 10 | 41 ms | 72 ms | 144 ms | 289 ms | 578 ms | 991 ms |
| With link+flags | 12 | 41 ms | 82 ms | 144 ms | 289 ms | 578 ms | 1.16 s |
| Full telemetry | 24 | 62 ms | 113 ms | 206 ms | 371 ms | 823 ms | 1.48 s |
(Computed using the Semtech AN1200.13 formula with 8-symbol preamble and low data rate optimisation enabled for SF11/SF12.)
All iotdata packets (6-24 bytes) fit well within the raw LoRa PHY payload limit of 255 bytes at any spreading factor. The binding constraint is not payload size but time-on-air and duty cycle.
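The time-on-air figures in the table above can be reproduced with integer arithmetic. The following is a minimal sketch of the Semtech formula, assuming 125 kHz bandwidth, coding rate 4/5, explicit header, an 8-symbol preamble, and low data rate optimisation for SF11 and SF12 as stated; it additionally assumes the payload CRC is enabled, which is not stated in the text but is consistent with the tabulated values. The function name is hypothetical.

```c
#include <stdint.h>

/* Time-on-air in microseconds for one LoRa frame at 125 kHz bandwidth.
 * Valid for sf in 7..12. Settings are fixed to the assumptions listed
 * in the accompanying text. */
static uint32_t lora_time_on_air_us(uint8_t sf, uint8_t payload_len)
{
    const int32_t cr = 1;           /* coding rate 4/5 */
    const int32_t crc = 1;          /* assumption: payload CRC enabled */
    const int32_t implicit_hdr = 0; /* explicit header */
    const int32_t preamble = 8;     /* programmed preamble symbols */
    const int32_t ldro = (sf >= 11) ? 1 : 0;

    /* Symbol time = 2^SF / BW; at 125 kHz this is exactly 2^SF * 8 us. */
    const uint32_t tsym_us = (1u << sf) * 8u;

    int32_t num = 8 * payload_len - 4 * sf + 28 + 16 * crc - 20 * implicit_hdr;
    int32_t den = 4 * (sf - 2 * ldro);
    int32_t blocks = (num > 0) ? (num + den - 1) / den : 0; /* ceiling */
    int32_t payload_syms = 8 + blocks * (cr + 4);

    /* Total symbols = preamble + 4.25 + payload, counted in quarter-symbols. */
    uint32_t quarter_syms = (uint32_t)(4 * (preamble + payload_syms) + 17);
    return quarter_syms * (tsym_us / 4u);
}
```

For example, `lora_time_on_air_us(12, 24)` evaluates to 1482752 µs, which rounds to the 1.48 s shown for a full telemetry packet at SF12.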
In the EU868 ISM band, the regulatory duty cycle limit is typically 1%. This means a device must remain silent for 99× the transmission duration after each packet. The implications are significant:
| Packet | Bytes | SF7 | SF12 |
|---|---|---|---|
| Minimal | 6 | 1 per 4 s | 1 per 1.7 min |
| Typical | 10 | 1 per 4 s | 1 per 1.7 min |
| Full telemetry | 24 | 1 per 6 s | 1 per 2.5 min |
The difference between a 10-byte typical packet and a 24-byte full packet at SF12 is the difference between transmitting every 1.7 minutes and every 2.5 minutes, or equivalently ~36 vs ~24 transmissions per hour. The savings from bit-packing are therefore not merely aesthetic: they translate directly into reporting frequency, battery life, or both.
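The duty-cycle arithmetic behind these figures is a single division. The sketch below uses hypothetical function names and expresses the duty-cycle limit in tenths of a percent (10 = 1%), so that the same helper could cover sub-bands with other limits; that parameterisation is an assumption made for illustration.

```c
#include <stdint.h>

/* Minimum spacing between packet starts, in microseconds, so that
 * time-on-air stays within the duty-cycle limit. duty_permille is the
 * limit in tenths of a percent (10 = 1%). Returns 0 if the limit is 0. */
static uint64_t duty_cycle_min_interval_us(uint32_t time_on_air_us,
                                           uint32_t duty_permille)
{
    if (duty_permille == 0)
        return 0;
    return ((uint64_t)time_on_air_us * 1000u) / duty_permille;
}

/* Whole packets per hour permitted at that spacing (0 if undefined). */
static uint32_t max_packets_per_hour(uint32_t time_on_air_us,
                                     uint32_t duty_permille)
{
    uint64_t interval = duty_cycle_min_interval_us(time_on_air_us, duty_permille);
    if (interval == 0)
        return 0;
    return (uint32_t)(3600000000ULL / interval);
}
```

With a 991 ms time-on-air and a 1% limit the minimum spacing is 99.1 s; with 1.48 s it is 148 s.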
Spreading factor selection is an implementation decision that balances range against airtime. SF7 provides the shortest airtime but the least range; SF12 provides maximum range (approximately 10-15 km line of sight) at the cost of 32× the symbol duration, which works out to roughly 24-28× the airtime for the packet sizes tabulated above. The iotdata protocol is agnostic to spreading factor: the same packet is valid regardless of the underlying modulation parameters.
LoRaWAN adds a MAC layer on top of LoRa, providing device management, adaptive data rate (ADR), and AES-128 encryption with message integrity. The MAC overhead consumes approximately 13 bytes of the LoRa PHY payload (MHDR, DevAddr, FCtrl, FCnt, FPort, MIC), reducing the available application payload.
The maximum LoRaWAN application payload by data rate (EU868):
| Data Rate | Modulation | Max payload | Full iotdata (24B) | Headroom |
|---|---|---|---|---|
| DR0 | SF12/125 | 51 bytes | fits | 27 B |
| DR1 | SF11/125 | 51 bytes | fits | 27 B |
| DR2 | SF10/125 | 51 bytes | fits | 27 B |
| DR3 | SF9/125 | 115 bytes | fits | 91 B |
| DR4 | SF8/125 | 222 bytes | fits | 198 B |
| DR5 | SF7/125 | 222 bytes | fits | 198 B |
Full iotdata telemetry (24 bytes) fits comfortably at all LoRaWAN data rates, with at least 27 bytes of headroom even at the lowest data rate.
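A transmitter that changes data rate at runtime (for example under ADR) can check a packet against the limits in the table above before sending. This is a minimal sketch with hypothetical names; the payload limits are copied from the EU868 table in this section and would need to be replaced for other regions.

```c
#include <stdint.h>

/* EU868 maximum application payload by data rate (DR0..DR5), as tabulated
 * in this section. */
static const uint8_t eu868_max_payload[6] = {51, 51, 51, 115, 222, 222};

/* Returns the spare bytes left after a packet of packet_len bytes at the
 * given data rate, or -1 if the data rate is out of range or the packet
 * does not fit. */
static int lorawan_headroom(unsigned data_rate, unsigned packet_len)
{
    if (data_rate >= 6u)
        return -1;
    if (packet_len > eu868_max_payload[data_rate])
        return -1;
    return (int)eu868_max_payload[data_rate] - (int)packet_len;
}
```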
Note that the AWS LoRaWAN documentation identifies 11 bytes as the safe universal application payload across all global frequency plans and data rates. The iotdata protocol's typical packet (battery + environment + depth) is 10 bytes, falling within this universal limit.
LoRaWAN's AES-128 encryption and MIC address the security considerations discussed in Section 14. Deployments using LoRaWAN inherit these protections without any additional work at the iotdata protocol layer.
Sigfox imposes the tightest constraints of any common LPWAN medium: a maximum uplink payload of 12 bytes and a limit of 140 messages per day (approximately one every 10 minutes for uniform distribution).
| Packet configuration | Bytes | Fits Sigfox? |
|---|---|---|
| Minimal (battery) | 6 | ✓ |
| Typical (bat+env+depth) | 10 | ✓ |
| With link+flags | 12 | ✓ (exact) |
| With position | 19 | ✗ |
| Full telemetry | 24 | ✗ |
The protocol's core telemetry packets fit within the 12-byte Sigfox limit. Position and datetime, which require the extension byte and add 6-9 bytes, do not fit alongside a full complement of sensor fields.
For Sigfox deployments, implementers SHOULD use a field rotation strategy: transmit core telemetry (battery, environment, depth) on every message, and rotate in less-frequently-needed fields across separate messages. For example:
- Every 10 minutes: battery + environment + depth (10 bytes)
- Once per hour: battery + position (12 bytes)
- Once per day: battery + datetime + flags (10 bytes)
The presence flag mechanism supports this natively — each packet is self-describing, so the receiver assembles a complete picture from multiple packets without any out-of-band configuration.
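One way to express such a rotation is a function that maps a transmission slot counter to a set of fields. The sketch below is illustrative only: the field flag names and `sigfox_select_fields` are hypothetical, and it assumes the hourly and daily packets replace the core packet in their slot rather than being sent in addition to it, so the message count does not grow.

```c
#include <stdint.h>

/* Hypothetical field-selection flags for this sketch only. */
#define SK_FIELD_BATTERY     0x01u
#define SK_FIELD_ENVIRONMENT 0x02u
#define SK_FIELD_DEPTH       0x04u
#define SK_FIELD_POSITION    0x08u
#define SK_FIELD_DATETIME    0x10u
#define SK_FIELD_FLAGS       0x20u

/* Chooses the fields for transmission slot number `slot`.
 * slots_per_hour / slots_per_day say how often the rotated packets
 * appear; a value of 0 disables that rotation. */
static uint8_t sigfox_select_fields(uint32_t slot, uint32_t slots_per_hour,
                                    uint32_t slots_per_day)
{
    if (slots_per_day != 0 && slot % slots_per_day == 0)
        return SK_FIELD_BATTERY | SK_FIELD_DATETIME | SK_FIELD_FLAGS;
    if (slots_per_hour != 0 && slot % slots_per_hour == 0)
        return SK_FIELD_BATTERY | SK_FIELD_POSITION;
    return SK_FIELD_BATTERY | SK_FIELD_ENVIRONMENT | SK_FIELD_DEPTH;
}
```

The slot spacing itself should be chosen so that the total number of messages per day stays within the Sigfox daily limit.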
Sigfox provides its own authentication and anti-replay mechanisms at the network level, but does not encrypt the payload. Implementers requiring payload confidentiality on Sigfox must implement application-layer encryption within the 12-byte constraint.
IEEE 802.11ah operates in the sub-GHz ISM bands (typically 868 MHz in Europe, 902-928 MHz in the US) and targets IoT applications with range up to 1 km. Unlike LoRa and Sigfox, it is IP-based and supports standard Ethernet-class MSDUs (up to 1500 bytes payload per frame), with A-MPDU aggregation for larger transfers.
Packet size is not a meaningful constraint for iotdata on 802.11ah. However, the efficiency argument still applies:
- Power consumption scales with transmission duration. 802.11ah introduced a reduced MAC header (18 bytes vs 28 bytes in legacy 802.11) specifically to reduce overhead for small IoT payloads. A 10-byte iotdata payload benefits from this optimisation more than a 200-byte JSON payload would.
- EU duty cycle regulations apply to the sub-GHz bands used by 802.11ah, though the specific constraints differ from LoRa (802.11ah typically uses listen-before-talk rather than pure duty cycle limits).
- Contention in dense deployments is reduced by shorter frame durations, improving effective throughput for all stations.
The iotdata payload would typically be carried as a UDP datagram within the 802.11ah frame. The receiver-side JSON conversion is well suited to 802.11ah gateways, which have IP connectivity and typically run on more capable hardware.
Cellular technologies provide reliable, wide-area connectivity with operator-managed infrastructure. Three cellular transports are relevant for iotdata:
NB-IoT and LTE-M are purpose-built cellular IoT standards. NB-IoT supports payloads of approximately 1600 bytes per message; LTE-M supports standard IP MTU sizes. Payload size is not a constraint. Both provide encryption, integrity, and authentication at the network layer, fully addressing the security considerations of Section 14.
SMS is a widely overlooked but practical transport for low-rate telemetry. The GSM 03.40 specification defines an 8-bit binary encoding mode (selected via the TP-DCS field) that provides 140 bytes of raw octet payload per message — more than enough for any iotdata packet. Binary SMS is sent via AT commands in PDU mode, which every GSM/3G/4G modem supports (SIM800, u-blox SARA, Quectel, etc.).
| Cellular transport | Max payload | Full iotdata | IP stack needed |
|---|---|---|---|
| NB-IoT | ~1600 B | ✓ | Yes |
| LTE-M | ~MTU | ✓ | Yes |
| SMS (8-bit binary) | 140 B | ✓ | No |
SMS has several properties that distinguish it from IP-based cellular transports:
- Near-universal coverage. SMS operates on the GSM signalling channel and works in 2G-only areas where NB-IoT or LTE-M may not be deployed.
- Store-and-forward. The SMSC holds messages if the receiver is temporarily unreachable, providing inherent buffering that IP-based transports must implement at the application layer.
- No IP stack. The sensor MCU needs only a UART connection to a GSM modem and a handful of AT commands (AT+CMGS in PDU mode). This significantly reduces firmware complexity compared to a full IP/CoAP/DTLS stack.
- No data plan. SMS-only SIM plans are available at low cost, avoiding the complexity of cellular data provisioning.
- Fallback resilience. SMS uses the control plane rather than the data plane, so it typically remains functional during network congestion that would affect data services.
The primary disadvantages of SMS are per-message cost (unlike LoRa which is free, or bulk-metered data plans), latency (typically 1-30 seconds, occasionally longer during congestion), and receiving-side complexity (the gateway requires either a GSM modem or an SMS-to-HTTP gateway service). SMS provides no payload encryption; content is visible to the carrier network.
The bit-packing efficiency of iotdata remains beneficial across all cellular transports for two reasons:
- Energy per byte. Cellular radio transmission energy is roughly proportional to transmission time. Shorter payloads mean shorter active radio periods and longer battery life.
- Data and message cost. For IP-based transports, reducing payload from 200 bytes (JSON) to 10 bytes (iotdata) reduces per-message data consumption by 95%. For SMS, keeping packets within a single 140-byte message avoids the overhead and cost of concatenated multi-part SMS.
SMS is particularly well suited as a fallback transport: a sensor that normally transmits via LoRa could fall back to SMS when connectivity is lost or when an alarm condition requires guaranteed delivery via a different path. The same encoded packet is valid on both transports without modification.
| Medium | Max payload | Full iotdata | Primary constraint | Encryption | Notes |
|---|---|---|---|---|---|
| LoRa | 255 B | ✓ | Duty cycle / airtime | No | Primary target medium |
| LoRaWAN | 51-222 B | ✓ | Duty cycle / airtime | AES-128 | Managed network |
| Sigfox | 12 B | Partial | Hard payload limit | Auth only | Field rotation needed |
| 802.11ah | 1500 B | ✓ | Duty cycle (EU) / power | WPA2/3 | IP-based, UDP transport |
| NB-IoT | ~1600 B | ✓ | Energy / data cost | Yes | Operator infrastructure |
| LTE-M | ~MTU | ✓ | Energy / data cost | Yes | Operator infrastructure |
| SMS | 140 B | ✓ | Per-message cost | No | Fallback / universal coverage |
The protocol's presence flag mechanism makes medium-aware field selection a runtime decision rather than a compile-time decision. The same encoder can produce a 10-byte packet for Sigfox and a 24-byte packet for LoRa, with the receiver handling both identically.
The iotdata protocol is designed to run on a wide range of microcontrollers (MCUs), but the appropriate implementation strategy varies significantly by device class:
| Class | Examples | RAM | Flash | FPU | Typical role |
|---|---|---|---|---|---|
| 1 | PIC16F, ATtiny, MSP430G | 256B-2KB | 4-16KB | No | Sensor (encode only) |
| 2 | PIC18F, STM32L0, nRF52 | 2-64KB | 32-256KB | No* | Sensor (encode + basic decode) |
| 3 | ESP32-C3, STM32F4, RP2040 | 256-520KB | 384KB-16MB | Varies† | Sensor + gateway |
| 4 | Raspberry Pi, Linux SBC | 256MB+ | SD/eMMC | Yes | Gateway / server |
*nRF52840 has an FPU; most Class 2 devices do not.
†STM32F4 has a hardware FPU; ESP32-C3 and RP2040 do not.
The reference implementation currently targets Class 3 and 4 devices. Class 1 and 2 devices require implementation strategies discussed below. The design is specifically intended to be extended to them.
The reference implementation's data structures have the following sizes (measured on a 64-bit platform; 32-bit targets will be smaller due to pointer size):
| Structure | Size | Purpose |
|---|---|---|
| iotdata_encoder_t | ~300 B | Encoder context (all fields + TLV pointers) |
| iotdata_decoded_t | ~2000 B | Decoded packet (includes TLV data buffers) |
The encoder context (~300 bytes) is dominated by the TLV pointer array (8 entries × 2 pointers × 8 bytes = 128 bytes on 64-bit). The core sensor fields occupy approximately 60 bytes. On a 32-bit MCU:
- Full encoder context: ~200 bytes
- Encoder context without TLV: ~72 bytes
- Core sensor fields alone: ~50 bytes
The decoded struct (~2000 bytes) is dominated by the TLV data buffers (8 entries × ~256 bytes = 2048 bytes). This structure is designed for gateway/server use and is NOT appropriate for Class 1 or 2 devices. A minimal decoder that ignores TLV data needs approximately 60 bytes.
TLV support can be excluded from the encoder; on resource-constrained targets this is the single largest RAM saving available.
The current encoder uses a "store then pack" strategy:
iotdata_encode_begin(&enc, buf, sizeof(buf), variant, station, seq);
iotdata_encode_battery(&enc, 84, false); /* stores values */
iotdata_encode_environment(&enc, 21.5f, 1013, 45);
iotdata_encode_end(&enc, &out_len); /* packs all at once */
Advantages:
- Fields can be added in any order: the variant table determines wire order at pack time.
- Validation happens eagerly, at the encode_*() call site.
- The presence bytes are computed once the full field set is known, avoiding backfill.
Disadvantage:
- The full encoder context must be held in RAM simultaneously: all field values plus the output buffer. On Class 3+ devices this is negligible; on Class 1 devices (256 bytes RAM) it is prohibitive without stripping.
For severely RAM-constrained devices (Class 1), a pack-as-you-go encoder could eliminate the context struct entirely. The encoder would write bits directly to the output buffer as each field is supplied.
The challenge is that the presence bytes (at bit offsets 32-39 and optionally
40-47) must appear in the wire format before the data fields they describe, but
the encoder does not know which fields will be present until all encode_*()
calls have been made.
Two approaches can resolve this:
Approach A: Presence byte backfill. Reserve the presence byte positions in the output buffer (write zeros), then pack each field's bits immediately. After the last field, go back and write the correct presence bytes. This requires fields to be supplied in strict field order (S0, S1, S2...) so that the bit cursor advances correctly.
/* Pseudocode for pack-as-you-go with backfill */
write_header(buf, variant, station, seq); /* bits 0-31 */
pres0_offset = 32; /* remember position */
skip_bits(8); /* reserve pres0 */
/* caller must add fields in field order */
add_battery(buf, &cursor, level, charging); /* pack immediately */
add_environment(buf, &cursor, t, p, h);
add_depth(buf, &cursor, cm);
/* after all fields: */
backfill_presence(buf, pres0_offset, fields_present);
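The pseudocode above can be turned into compilable C along the following lines. This is a sketch under stated assumptions, not part of the reference implementation: all names are hypothetical, the presence bit positions (battery = bit 5, environment = bit 3) follow the variant 0 layout used elsewhere in this document, and the caller is assumed to supply already-quantised values and a zeroed buffer.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t *buf;     /* caller-zeroed output buffer */
    uint16_t bit_pos; /* next bit to write */
    uint8_t pres0;    /* presence bits collected so far */
} sketch_packer_t;

/* Write the low n bits of val, most significant bit first. */
static void sketch_put_bits(sketch_packer_t *p, uint32_t val, uint8_t n)
{
    while (n-- > 0) {
        if ((val >> n) & 1u)
            p->buf[p->bit_pos >> 3] |= (uint8_t)(0x80u >> (p->bit_pos & 7u));
        p->bit_pos++;
    }
}

static void sketch_begin(sketch_packer_t *p, uint8_t *zeroed_buf,
                         uint8_t variant, uint16_t station, uint16_t sequence)
{
    p->buf = zeroed_buf;
    p->bit_pos = 0;
    p->pres0 = 0;
    sketch_put_bits(p, variant, 4);
    sketch_put_bits(p, station, 12);
    sketch_put_bits(p, sequence, 16);
    p->bit_pos += 8; /* reserve presence byte 0 (bits 32-39), still zero */
}

/* Mark a field present; refuse it if this field or a later one was
 * already packed, which enforces strict field order. */
static bool sketch_claim(sketch_packer_t *p, uint8_t bit)
{
    if (p->pres0 & (uint8_t)((bit << 1) - 1u))
        return false;
    p->pres0 |= bit;
    return true;
}

static bool sketch_add_battery(sketch_packer_t *p, uint8_t level_q, bool charging)
{
    if (!sketch_claim(p, 0x20u))
        return false;
    sketch_put_bits(p, level_q, 5);
    sketch_put_bits(p, charging ? 1u : 0u, 1);
    return true;
}

static bool sketch_add_environment(sketch_packer_t *p, uint16_t temp_q,
                                   uint8_t press_q, uint8_t humid_q)
{
    if (!sketch_claim(p, 0x08u))
        return false;
    sketch_put_bits(p, temp_q, 9);
    sketch_put_bits(p, press_q, 8);
    sketch_put_bits(p, humid_q, 7);
    return true;
}

/* Backfill the reserved presence byte and return the packet length. */
static uint8_t sketch_end(sketch_packer_t *p)
{
    p->buf[4] = p->pres0; /* header is 32 bits, so pres0 is byte 4 */
    return (uint8_t)((p->bit_pos + 7u) >> 3);
}
```

Because the header is exactly 32 bits, the reserved presence byte is byte-aligned and the backfill is a single store. Supporting the optional second presence byte would require knowing up front whether any extension field will be added, which this sketch does not attempt.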
Approach B: Two-pass encode. First pass: call all encode_*() functions
which simply set bits in a fields_present bitmask (1 byte of RAM). Second
pass: iterate the variant field table and pack only the fields that are present.
This requires the field values to be available again on the second pass, either
from global/static variables or by re-reading the sensors.
Trade-offs:
| Property | Store-then-pack | Pack-as-you-go (A) | Two-pass (B) |
|---|---|---|---|
| RAM (no TLV) | ~72 B + buf | ~4 B + buf | ~4 B + buf* |
| Field order | Any | Strict field order | Any |
| Code complexity | Low | Medium | Medium |
| Re-read sensors | No | No | Yes |
| Suitable for Class 1 | Marginal | Yes | Yes |
*Two-pass requires field values to be available on the second pass, either stored elsewhere or re-read from hardware.
The reference implementation uses store-then-pack because it is the most developer-friendly and the target devices (ESP32-C3, Class 3) have ample RAM. Implementers targeting Class 1 devices SHOULD consider pack-as-you-go with backfill. Approach A may be provided in a future version of the reference implementation.
The reference implementation factors all per-field operations into static inline functions specifically to enable compile-time stripping:
#ifdef IOTDATA_ENABLE_SOLAR
static inline void pack_solar(uint8_t *buf, size_t *bp,
const iotdata_encoder_t *enc) {
bits_write(buf, bp, enc->solar_irradiance, 10);
bits_write(buf, bp, enc->solar_ultraviolet, 4);
}
static inline void unpack_solar(...) { ... }
/* ...4 more functions... */
#endif
Each field type has 6 functions (pack, unpack, json_add, json_read, dump, print). On an embedded sensor that only transmits battery, environment, and depth:
| Component | Functions | Approx. code size |
|---|---|---|
| Core (header, presence, bits) | — | ~1 KB |
| Battery (6 functions) | 6 | ~400 B |
| Environment (6 functions) | 6 | ~500 B |
| Depth (6 functions) | 6 | ~350 B |
| Included total | 18 | ~2.2 KB |
| Solar (excluded) | 6 | -400 B |
| Link quality (excluded) | 6 | -350 B |
| Flags (excluded) | 6 | -300 B |
| Position (excluded) | 6 | -500 B |
| Datetime (excluded) | 6 | -400 B |
| JSON functions (excluded) | ~20 | -2 KB |
| Print/dump (excluded) | ~20 | -1.5 KB |
A fully stripped sensor-only build (encode path only, 3 field types, no JSON/print/dump) can fit in approximately 2-3 KB of flash. This is achievable on Class 1 devices.
Several encode functions accept floating-point parameters:
- iotdata_encode_environment() takes float temperature
- iotdata_encode_position() takes double latitude/longitude
- iotdata_encode_link() takes float SNR
On MCUs without a hardware FPU (most Class 1 and many Class 2 devices), floating-point arithmetic is emulated in software, which is both slow (~50-100 cycles per operation) and adds code size (~2-5 KB for soft-float library).
For Class 1 targets, implementers SHOULD consider:
- Integer-only temperature API: Accept temperature as a fixed-point integer in units of 0.25°C offset from -40°C. The caller performs q = (temp_raw - (-40*4)) and passes q directly. No floating point needed.
- Integer-only position API: Accept pre-quantised 24-bit latitude and longitude values. The caller uses the GNSS receiver's native integer output and scales appropriately.
- Integer-only SNR: Accept SNR as an integer in dB.
The reference implementation uses float/double for developer convenience on
Class 3+ targets. However, the compile-time options
IOTDATA_NO_FLOATING_DOUBLES and IOTDATA_NO_FLOATING can be used to remove
floating-point operations and provide replacement integer-only entry points,
such as temperature encoders that take values multiplied by 100, e.g. 1520 to
represent a temperature of 15.20°C.
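The integer-only approach can be sketched as follows. These helpers are illustrative and are not the entry points enabled by the compile-time options above: the names are hypothetical, temperature is assumed to arrive in centidegrees, and position is assumed to arrive in integer microdegrees (GNSS receivers differ in their native units, so the scaling constants would need adapting).

```c
#include <stdint.h>

/* Temperature in centidegrees -> 9-bit code, 0.25 C steps from -40 C.
 * Adding 12 before dividing by 25 rounds to the nearest step. */
static uint16_t temp_q_from_centi(int16_t temp100)
{
    return (uint16_t)(((int32_t)temp100 + 4000 + 12) / 25);
}

/* Latitude in microdegrees (-90000000..90000000) -> 24-bit code. */
static uint32_t lat_q_from_microdeg(int32_t lat_udeg)
{
    uint64_t shifted = (uint64_t)((int64_t)lat_udeg + 90000000);
    return (uint32_t)((shifted * 16777215u + 90000000u) / 180000000u);
}

/* Longitude in microdegrees (-180000000..180000000) -> 24-bit code. */
static uint32_t lon_q_from_microdeg(int32_t lon_udeg)
{
    uint64_t shifted = (uint64_t)((int64_t)lon_udeg + 180000000);
    return (uint32_t)((shifted * 16777215u + 180000000u) / 360000000u);
}
```

The 64-bit intermediate is needed because the product of the shifted coordinate and 16777215 exceeds 32 bits. On 8-bit targets where 64-bit arithmetic is costly, the caller may prefer to pre-scale on the GNSS side instead.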
The reference implementation has the following dependency profile:
| Component | Dependencies | Required for |
|---|---|---|
| Encoder | <stdint.h>, <stdbool.h>, <stddef.h>, <math.h> | All builds |
| Decoder | Same as encoder | Gateway / bidirectional |
| JSON conversion | libcjson | Gateway / server |
| Print / dump | <stdio.h> | Debug / gateway |
The core encoder has no external library dependencies. The <math.h> dependency
is for round() and floor() in the quantisation functions; on platforms where
<math.h> is unavailable or expensive, these can be replaced with integer
arithmetic equivalents:
/* Integer-only round for non-negative values */
static inline uint16_t int_round(uint32_t num, uint32_t denom) {
return (uint16_t)((num + denom / 2) / denom);
}
The libcjson dependency exists only for the JSON serialisation functions and
SHOULD be excluded from embedded builds via #ifdef IOTDATA_NO_JSON.
The encoder and decoder are designed for stack allocation only. No
malloc() or free() calls are made in any encode or decode path. This is
critical for:
- Bare-metal systems without a heap allocator.
- RTOS environments (FreeRTOS, Zephyr) where heap fragmentation must be avoided and stack sizes are fixed.
- Safety-critical systems where dynamic allocation is prohibited by coding standards (MISRA C, etc.).
The JSON conversion functions (iotdata_decode_to_json) do allocate heap memory
via cJSON_CreateObject() and return a malloc'd string. These functions are
gateway/server-only and are not intended for embedded use.
The bit-packing functions operate on individual bytes via bit manipulation and
are endian-agnostic. The bits_write() and bits_read() functions address
the output buffer byte-by-byte, computing bit offsets explicitly. No multi-byte
loads or stores are used in the packing/unpacking path.
This means the same code runs correctly on:
- Little-endian ARM, Xtensa and RISC-V cores (STM32, RP2040, ESP32 family)
- Big-endian PIC (in standard configuration)
- Any other byte-addressable architecture
No byte-swap operations are needed when moving between platforms.
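The byte-at-a-time technique can be illustrated with a pair of helpers. This sketch is not the reference implementation's bits_write() and bits_read(); the names and signatures are hypothetical, and the bit order (most significant bit first within each byte) is assumed from the packing code shown elsewhere in this document.

```c
#include <stdint.h>

/* Write the low n bits of val into buf at *bit_pos, MSB first.
 * buf must be zeroed beforehand; only single-byte accesses are used. */
static void sketch_bits_write(uint8_t *buf, uint16_t *bit_pos,
                              uint32_t val, uint8_t n)
{
    while (n-- > 0) {
        if ((val >> n) & 1u)
            buf[*bit_pos >> 3] |= (uint8_t)(0x80u >> (*bit_pos & 7u));
        (*bit_pos)++;
    }
}

/* Read n bits from buf at *bit_pos, MSB first, advancing the cursor. */
static uint32_t sketch_bits_read(const uint8_t *buf, uint16_t *bit_pos,
                                 uint8_t n)
{
    uint32_t val = 0;
    while (n-- > 0) {
        uint32_t bit = (uint32_t)(buf[*bit_pos >> 3] >> (7u - (*bit_pos & 7u))) & 1u;
        val = (val << 1) | bit;
        (*bit_pos)++;
    }
    return val;
}
```

Since every access is to a single `uint8_t`, the resulting byte sequence depends only on the bit cursor and the value, never on the host's native word layout.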
The encoder's encode_end() function performs a single linear pass over the
variant field table and packs all present fields. The execution time is bounded
and predictable:
- Minimum (battery only): ~50 bit operations
- Maximum (all fields + TLV): ~300 bit operations
Each bit operation is a constant-time shift-and-mask. There are no loops with data-dependent iteration counts (except TLV string encoding, which iterates over the string length). The encoder is suitable for use in interrupt service routines or time-critical sections, though calling it from a main-loop context is more typical.
Raspberry Pi / Linux (Class 4): The full implementation runs unmodified, including JSON conversion, print, dump, and all field types. Typically used as a gateway, receiving packets via LoRa HAT or USB-connected radio module, decoding to JSON, and forwarding via MQTT, HTTP, or database insertion.
ESP32-C3 (Class 3, primary target): The reference implementation runs
unmodified. The ESP32-C3 has 400 KB SRAM and 4 MB flash but no hardware FPU —
both single- and double-precision arithmetic is software-emulated. Use
IOTDATA_NO_FLOATING for best performance. Both the encoder and decoder,
including JSON functions, fit comfortably. The ESP-IDF build system supports
#ifdef stripping via menuconfig or sdkconfig defines.
STM32L0 (Class 2): With 20 KB RAM and 64-192 KB flash, both the encoder
context (328 bytes) and decoded struct (2176 bytes) fit on the stack
comfortably. Exclude JSON, print, and dump functions. Use
-ffunction-sections -fdata-sections -Wl,--gc-sections to strip unused code.
PIC18F (Class 2): Similar constraints to STM32L0 but with typically less RAM
(2-4 KB) and a more limited C compiler. The reference implementation's use of
static inline functions may need to be adjusted (some PIC compilers do not
inline effectively). Consider the pack-as-you-go approach (Section E.4) for
minimal RAM usage.
PIC16F (Class 1): With as little as 256 bytes of RAM, the full encoder
context does not fit. Use pack-as-you-go with backfill (Section E.4, Approach
A), integer-only APIs (Section E.6), and aggressive #ifdef stripping (Section
E.5). Target flash budget: 2-3 KB for a battery + environment + depth encoder.
On devices with as little as 256 bytes of RAM (PIC16F, ATtiny), the library's table-driven architecture is unnecessary overhead. The following self-contained function encodes a weather station packet with battery, environment, and two TLV entries directly into a caller-provided buffer. No structs, no function pointers, no library linkage — just bit-packing arithmetic. The full weather station packet requires a buffer of no more than 32 bytes (without TLV). If really necessary, the implementation can avoid buffers and use ws_bits to write directly to an output (e.g. serial port).
#include <stdint.h>
#define WS_VARIANT 0 /* Built-in weather station */
#define WS_STATION 1 /* 0 to 4095 */
#define WS_PRES0_FIELDS 6 /* battery, link, environment, wind, rain, solar */
#define WS_PRES_EXT 0x80
#define WS_PRES_TLV 0x40
static void ws_bits(uint8_t *buf, uint16_t *bp, uint32_t val, uint8_t n) {
for (int8_t i = n - 1; i >= 0; i--, (*bp)++)
if (val & (1UL << i))
buf[*bp >> 3] |= (1U << (7 - (*bp & 7)));
}
/*
* Encodes: battery, environment, one raw TLV, one string TLV.
* All integer arithmetic. No floating point. No malloc.
*
* Parameters:
* buf — output buffer, must be zeroed by caller, >= 32 bytes
* sequence — 16-bit sequence number
* batt_pct — battery level 0-100
* charging — 0 or 1
* temp100 — temperature in centidegrees (-4000 = -40.00 C)
* press_hpa — pressure in hPa (850-1105)
* humid_pct — humidity 0-100
* tlv0_type — first TLV type (0-63)
* tlv0_data — first TLV raw data pointer
* tlv0_len — first TLV data length
* tlv1_type — second TLV type (0-63)
* tlv1_str — second TLV string (6-bit charset: a-z 0-9 A-Z space)
* tlv1_len — second TLV string length
*
* Returns: packet size in bytes
*/
static uint8_t ws_encode(
uint8_t *buf, uint16_t sequence,
uint8_t batt_pct, uint8_t charging,
int16_t temp100, uint16_t press_hpa, uint8_t humid_pct,
uint8_t tlv0_type, const uint8_t *tlv0_data, uint8_t tlv0_len,
uint8_t tlv1_type, const char *tlv1_str, uint8_t tlv1_len)
{
uint16_t bp = 0;
/* --- Header: 4-bit variant + 12-bit station + 16-bit sequence --- */
ws_bits(buf, &bp, WS_VARIANT, 4);
ws_bits(buf, &bp, WS_STATION, 12);
ws_bits(buf, &bp, sequence, 16);
/* --- Presence byte 0: ext=0, tlv=1, battery=1, link=0,
environment=1, wind=0, rain=0, solar=0 --- */
/* bit 7: ext = 0 (no pres1)
* bit 6: tlv = 1 (TLV present)
* bit 5: battery = 1
* bit 4: link = 0
* bit 3: environment = 1
* bit 2: wind = 0
* bit 1: rain = 0
* bit 0: solar = 0
*/
ws_bits(buf, &bp, 0x68, 8); /* 0b01101000 */
/* --- Battery: 5-bit level + 1-bit charging --- */
ws_bits(buf, &bp, ((uint32_t)batt_pct * 31 + 50) / 100, 5);
ws_bits(buf, &bp, charging ? 1 : 0, 1);
/* --- Environment: 9-bit temp + 8-bit pressure + 7-bit humidity --- */
ws_bits(buf, &bp, (uint32_t)((temp100 - (-4000)) + 12) / 25, 9);
ws_bits(buf, &bp, (uint32_t)(press_hpa - 850), 8);
ws_bits(buf, &bp, (uint32_t)humid_pct, 7);
/* --- TLV 0: raw --- */
ws_bits(buf, &bp, 0, 1); /* format: raw */
ws_bits(buf, &bp, tlv0_type, 6); /* type */
ws_bits(buf, &bp, 1, 1); /* more: yes */
ws_bits(buf, &bp, tlv0_len, 8); /* length */
for (uint8_t i = 0; i < tlv0_len; i++)
ws_bits(buf, &bp, tlv0_data[i], 8);
/* --- TLV 1: 6-bit string --- */
ws_bits(buf, &bp, 1, 1); /* format: string */
ws_bits(buf, &bp, tlv1_type, 6); /* type */
ws_bits(buf, &bp, 0, 1); /* more: no */
ws_bits(buf, &bp, tlv1_len, 8); /* length */
for (uint8_t i = 0; i < tlv1_len; i++) {
char c = tlv1_str[i];
uint8_t v = (c == ' ') ? 0 :
(c >= 'a' && c <= 'z') ? 1 + (c - 'a') :
(c >= '0' && c <= '9') ? 27 + (c - '0') :
(c >= 'A' && c <= 'Z') ? 37 + (c - 'A') : 0;
ws_bits(buf, &bp, v, 6);
}
return (uint8_t)((bp + 7) >> 3);
}
Resource usage: This function requires approximately 20 bytes of stack (loop counters, bit pointer, temporaries) plus the output buffer (or not, if directly writing to serial output). The code compiles to under 400 bytes on PIC18F or AVR. The caller can reuse the output buffer between transmissions.
Adapting for other variants: Copy the function, change the presence byte
constant, and add or remove the field sections. Each field is a self-contained
block of ws_bits calls; Section 8 of the protocol document gives the bit
widths and quantisation formulae for every field type.
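As a rough illustration of what one such field block computes, the sketch below isolates the battery and temperature quantisation used in ws_encode and pairs each with an inverse. The helper names (q_batt, dq_batt, q_temp, dq_temp), the input clamping, and the decode formulae are assumptions made for this example; they are inferred from the encode arithmetic above and are not taken from libiotdata.

```c
#include <stdint.h>

/* Hypothetical helpers: names are illustrative, not part of libiotdata. */

/* Battery: 0-100 % mapped onto 5 bits (0-31), rounded to nearest. */
static uint8_t q_batt(uint8_t pct)
{
    if (pct > 100) pct = 100;               /* clamp is an assumption */
    return (uint8_t)(((uint32_t)pct * 31 + 50) / 100);
}

/* Assumed inverse: 5-bit level back to a percentage, rounded to nearest. */
static uint8_t dq_batt(uint8_t raw)
{
    return (uint8_t)(((uint32_t)(raw & 0x1F) * 100 + 15) / 31);
}

/* Temperature: centidegrees, offset -40.00 C, 0.25 C steps, 9 bits. */
static uint16_t q_temp(int16_t temp100)
{
    int32_t t = temp100;
    if (t < -4000) t = -4000;               /* clamp is an assumption */
    if (t > 8000)  t = 8000;
    return (uint16_t)((t + 4000 + 12) / 25);
}

/* Assumed inverse: 9-bit code back to centidegrees. */
static int16_t dq_temp(uint16_t raw)
{
    return (int16_t)((int32_t)raw * 25 - 4000);
}
```

With these helpers, a battery reading of 85 % encodes to level 26, which decodes to 84 %, and +14.48 C encodes to 218, which decodes to +14.50 C. Both agree with the simulator dump that follows.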
The test-example target generates pseudo sensor data simulating a weather
station to illustrate quantisation effects and ancillary (dump, print and JSON)
functionality.
╔══════════════════════════════════════════════════╗
║ iotdata weather station simulator ║
║ Station 42 — variant 0 (weather_station) ║
║ 30s reports / 5min full reports with position ║
║ Press Ctrl-C to stop ║
╚══════════════════════════════════════════════════╝
────────────────────────────────────────────────────────────────────────────────
** Packet #1 [17:29:08] *** 5-minute report (with position/datetime) ***
────────────────────────────────────────────────────────────────────────────────
** Sensor values:
battery: 85.2%
link: -85 dBm SNR 4.8 dB
temperature: +14.75 °C
pressure: 1013 hPa
humidity: 55 %
wind: 4.1 m/s @ 172° (gust 8.7 m/s)
rain: 3 mm/hr, 0.5 mm/d
solar: 393 W/m² UV 3
clouds: 4 okta
air quality: 41 AQI
radation: 22 CPM, 0.10 µSv/h
position: 59.334588, 18.063240
datetime: 3518948 s from year start
flags: 0x01
** Binary (32 bytes):
00 2A 00 01 BF 7E D2 26 DD 1B 71 0F 44 40 C5 89
34 14 80 2C 00 56 A3 18 84 66 C2 78 55 E9 68 08
** Diagnostic dump:
Offset Len Field Raw Decoded Range
------ --- ----- --- ------- -----
0 4 variant 0 0 0-14 (15=rsvd)
4 12 station 42 42 0-4095
16 16 sequence 1 1 0-65535
32 8 presence[0] 191 0xbf ext|tlv|6 fields
40 8 presence[1] 126 0x7e ext|7 fields
48 5 battery_level 26 84% 0..100%%, 5b quant
53 1 battery_charging 0 discharging 0/1
54 4 link_rssi 8 -88 dBm -120..-60, 4dBm
58 2 link_snr 2 0 dB -20..+10, 10dB
60 9 temperature 219 14.75 C -40..+80C, 0.25C
69 8 pressure 163 1013 hPa 850..1105 hPa
77 7 humidity 55 55% 0..100%%
84 7 wind_speed 8 4.0 m/s 0..63.5, 0.5m/s
91 8 wind_direction 122 172 deg 0..355, ~1.4deg
99 7 wind_gust 17 8.5 m/s 0..63.5, 0.5m/s
106 8 rain_rate 3 3 mm/hr 0..255 mm/hr
114 4 rain_size 1 0.4 mm/d 0..6.3 mm/d
118 10 solar_irradiance 393 393 W/m2 0..1023 W/m2
128 4 solar_ultraviolet 3 3 0..15
132 4 clouds 4 4 okta 0..8 okta
136 9 air_quality 41 41 AQI 0..500 AQI
145 14 radiation_cpm 22 22 CPM 0..65535 CPM
159 14 radiation_dose 10 0.10 uSv/h 0..163.83, 0.01
173 24 latitude 13918992 59.334592 -90..+90
197 24 longitude 9230415 18.063230 -180..+180
221 24 datetime 703789 day 40 17:29:05 (3518945s) 5s res
245 8 flags 1 0x01 8-bit bitmask
Total: 253 bits (32 bytes)
** Decoded:
Station 42 seq=1 var=0 (weather_station) [253 bits, 32 bytes]
battery: 84% (discharging)
link: -88 dBm RSSI, 0 dB SNR
environment: 14.75 C, 1013 hPa, 55%
wind: 4.0 m/s, 172 deg, gust 8.5 m/s
rain: 3 mm/hr, 0.4 mm/d
solar: 393 W/m2, UV 3
clouds: 4 okta
air_quality: 41 AQI
radiation: 22 CPM, 0.10 uSv/h
position: 59.334592, 18.063230
datetime: day 40 17:29:05 (3518945s)
flags: 0x01
** JSON:
{"variant":0,"station":42,"sequence":1,"packed_bits":253,"packed_bytes":32,"battery":{"level":84,"charging":false},"link":{"rssi":-88,"snr":0},"environment":{"temperature":14.75,"pressure":1013,"humidity":55},"wind":{"speed":4,"direction":172,"gust":8.5},"rain":{"rate":3,"size":4},"solar":{"irradiance":393,"ultraviolet":3},"clouds":4,"air_quality":41,"radiation":{"cpm":22,"dose":0.099999994039535522},"position":{"latitude":59.334592183506032,"longitude":18.06323039908591},"datetime":3518945,"flags":1}
────────────────────────────────────────────────────────────────────────────────
** Packet #2 [17:29:38] 30-second report
────────────────────────────────────────────────────────────────────────────────
** Sensor values:
battery: 84.9%
link: -85 dBm SNR 5.5 dB
temperature: +14.48 °C
pressure: 1013 hPa
humidity: 55 %
wind: 3.6 m/s @ 171° (gust 7.2 m/s)
rain: 5 mm/hr, 0.0 mm/d
solar: 390 W/m² UV 3
** Binary (16 bytes):
00 2A 00 02 3F D2 36 D5 1B 70 EF 43 81 41 86 30
** Diagnostic dump:
Offset Len Field Raw Decoded Range
------ --- ----- --- ------- -----
0 4 variant 0 0 0-14 (15=rsvd)
4 12 station 42 42 0-4095
16 16 sequence 2 2 0-65535
32 8 presence[0] 63 0x3f ext|tlv|6 fields
40 5 battery_level 26 84% 0..100%%, 5b quant
45 1 battery_charging 0 discharging 0/1
46 4 link_rssi 8 -88 dBm -120..-60, 4dBm
50 2 link_snr 3 10 dB -20..+10, 10dB
52 9 temperature 218 14.50 C -40..+80C, 0.25C
61 8 pressure 163 1013 hPa 850..1105 hPa
69 7 humidity 55 55% 0..100%%
76 7 wind_speed 7 3.5 m/s 0..63.5, 0.5m/s
83 8 wind_direction 122 172 deg 0..355, ~1.4deg
91 7 wind_gust 14 7.0 m/s 0..63.5, 0.5m/s
98 8 rain_rate 5 5 mm/hr 0..255 mm/hr
106 4 rain_size 0 0.0 mm/d 0..6.3 mm/d
110 10 solar_irradiance 390 390 W/m2 0..1023 W/m2
120 4 solar_ultraviolet 3 3 0..15
Total: 124 bits (16 bytes)
** Decoded:
Station 42 seq=2 var=0 (weather_station) [124 bits, 16 bytes]
battery: 84% (discharging)
link: -88 dBm RSSI, 10 dB SNR
environment: 14.50 C, 1013 hPa, 55%
wind: 3.5 m/s, 172 deg, gust 7.0 m/s
rain: 5 mm/hr, 0.0 mm/d
solar: 390 W/m2, UV 3
** JSON:
{"variant":0,"station":42,"sequence":2,"packed_bits":124,"packed_bytes":16,"battery":{"level":84,"charging":false},"link":{"rssi":-88,"snr":10},"environment":{"temperature":14.5,"pressure":1013,"humidity":55},"wind":{"speed":3.5,"direction":172,"gust":7},"rain":{"rate":5,"size":0},"solar":{"irradiance":390,"ultraviolet":3}}
The iotdata mesh protocol extends the reach of sensor networks by allowing dedicated relays to forward sensor data across multiple relays toward one or more gateways. The protocol is designed to be seamless — existing sensors require no firmware changes, the system works without mesh infrastructure, and relays can be inserted into a live deployment to fill coverage gaps.
The mesh layer is carried within the existing iotdata wire format using variant ID 15 (0x0F) for all control-plane traffic. This means mesh packets share the same 4-byte header structure as sensor data, can coexist on the same radio channel, and are handled by the same receive path up to the point of variant dispatch. Relay nodes have a dedicated station ID and can also convey sensor data under that ID.
LoRa radio links between sensors and gateways are subject to terrain, vegetation, buildings, and seasonal variation. A sensor that works reliably in winter may become intermittent when foliage returns in spring. A sensor placed in a valley or behind a structure may never reach the gateway directly. Increasing transmit power or antenna height is not always practical or permitted.
Rather than requiring all sensors to participate in a mesh network (which adds complexity, power consumption, and firmware requirements), the protocol introduces a separate class of mesh-aware relays that transparently extend range. Sensors remain simple transmit-only devices. The mesh is an overlay infrastructure.
The protocol defines three roles. A single physical device may implement one or two of these roles simultaneously.
Sensor — A device that periodically transmits iotdata-encoded packets containing measurement data. Sensors are transmit-only, fire-and-forget. They have no awareness of the mesh, do not listen for packets, and do not participate in routing. A sensor's firmware is identical whether or not mesh infrastructure is deployed. Sensors use iotdata variant IDs 0–14 as defined by their measurement type. Sensors are typically power constrained.
Relay — A mesh-aware device that listens for both sensor packets and mesh control packets. Its primary function is to forward sensor data toward a gateway when the sensor cannot reach the gateway directly. Relay nodes form a self-organising tree topology rooted at gateways, using periodic beacon messages to discover routes. A relay treats sensor payloads as opaque byte sequences — it never inspects or interprets measurement fields. A relay may optionally also function as a sensor (dual-role), transmitting its own measurement data (e.g. position, battery level, environment) using a standard iotdata variant alongside its mesh traffic on variant 15. Relays have higher power demand than sensors, but are still unlikely to be mains powered.
Gateway — A mesh-aware device that receives sensor data (directly or via relays) and delivers it to upstream systems for processing, storage, and display. Gateways originate beacon messages that define the routing topology. A deployment may include multiple gateways for redundancy or to cover a wide area. Each gateway is identified by a unique station ID. Gateways perform duplicate suppression — if the same sensor packet arrives both directly and via a relay, only the first arrival is processed. Gateways are typically mains powered and likely to be connected to network and internet infrastructure.
| Capability | Sensor | Relay | Gateway |
|---|---|---|---|
| Transmits own sensor data | yes | optional (dual-role) | no |
| Listens for packets | no | yes | yes |
| Forwards sensor data | no | yes | no (endpoint) |
| Participates in mesh routing | no | yes | yes (root) |
| Originates beacons | no | no | yes |
| Rebroadcasts beacons | no | yes | no |
| Requires iotdata field knowledge | yes (own variant) | no (opaque relay) | yes (all variants) |
| Firmware changes for mesh | none | mesh-specific | mesh additions |
The mesh layer is an optional enhancement, not a prerequisite. A deployment consisting only of sensors and gateways works exactly as it does today. Mesh infrastructure can be added incrementally — deploying a relay between a struggling sensor and the gateway immediately improves reliability without touching the sensor.
Mesh control packets use iotdata variant ID 15 (0x0F). This reserves the final variant slot for mesh traffic while leaving variants 0–14 available for sensor data definitions. The 4-byte iotdata header (variant, station_id, sequence) is shared by all packet types, meaning mesh packets are structurally valid iotdata packets with a different interpretation of the payload.
Byte 4, which serves as a presence bitmap in sensor variants, serves as a control type field in variant 15 packets. The upper nibble identifies the mesh packet type (4 bits, supporting up to 16 types). This allows the receive path to branch on variant ID alone: variants 0–14 route to the sensor data decoder, variant 15 routes to the mesh handler.
Relays never inspect the contents of sensor packets beyond the 4-byte iotdata header. The header is read only to extract the originating sensor's station_id and sequence number for duplicate suppression. All remaining bytes are treated as an opaque blob, copied verbatim during forwarding. This means the mesh layer has zero coupling to field definitions, variant suites, encoding formats, or any future changes to the iotdata measurement schema.
The sole structural dependency is the position and size of station_id (12 bits at bytes 0–1) and sequence (16 bits at bytes 2–3) in the iotdata header. This is the most stable contract in the protocol and is not expected to change.
Each gateway originates its own beacon stream identified by its station_id
(carried as gateway_id in the beacon). Relays independently track which
gateway trees they belong to and select the best gateway by cost (relay count),
breaking ties by received signal strength. If a gateway fails, its beacons
cease; relays in its tree time out after a configurable number of missed
beacon rounds and automatically adopt an alternative gateway's tree.
The mesh uses a simplified distance-vector approach where each relay knows its cost (number of relays to reach the gateway) and forwards data toward lower-cost neighbours. This is conceptually similar to RPL (RFC 6550, Routing Protocol for Low-Power and Lossy Networks) but dramatically simplified — no full topology state, no Directed Acyclic Graph computation, no IPv6 dependency. Each node stores only its parent, a backup parent, and a small neighbour table.
Topology is built through periodic beacon propagation from gateways outward.
Gateway (cost=0)
│
│ BEACON (gateway_id=G, generation=N, cost=0)
│
▼
Relay A hears beacon, adopts Gateway as parent, sets cost=1
│
│ BEACON (gateway_id=G, generation=N, cost=1) [after random 1–5s jitter]
│
▼
Relay B hears Relay A's rebroadcast, adopts Relay A as parent, sets cost=2
│
│ BEACON (gateway_id=G, generation=N, cost=2) [after random 1–5s jitter]
│
▼
...continues outward until no new nodes hear the beacon
Gateways transmit beacons at a regular interval. No default is mandated, because the right interval depends on the reporting period and density of the sensor network, but 60 seconds is a reasonable figure. Each beacon carries a generation counter that increments per round. Relays compare incoming beacons against their current state:
- Newer generation (modular comparison within half the 12-bit range): update parent if cost is equal or better.
- Same generation, lower cost: adopt the new sender as parent.
- Same generation, equal or higher cost: suppress — do not rebroadcast.
The random rebroadcast jitter (1–5 seconds) prevents synchronised retransmission from nodes that hear the same beacon simultaneously, reducing collisions in dense areas.
Sensor data flows inward from sensors toward gateways, relayed transparently by relays.
Sensor S transmits raw iotdata packet (variant=V, station=S, seq=N)
│
│ [raw packet, no mesh awareness]
│
├──────────────────────┐
▼ ▼
Gateway (hears directly) Relay A (hears sensor)
│ │
│ process normally │ wrap in FORWARD, send to parent
│ │
│ ▼
│ Gateway (receives FORWARD)
│ │
│ │ unwrap inner packet
│ │ dedup: {S, N} already seen? → discard
│ │ otherwise process normally
▼ ▼
[sensor data processed once]
When a relay hears a raw sensor packet (any variant 0–14), it waits a short random backoff (200–1000ms). If during that backoff it hears another relay forward the same packet (identified by matching origin station and sequence), it suppresses its own forward. This Trickle-style suppression reduces redundant airtime in areas where multiple relays overlap.
If no suppression occurs, the relay wraps the raw sensor packet in a FORWARD control message (variant 15, ctrl_type 0x1) addressed to its parent and transmits. The parent, if another relay, repeats the process — unwrap, dedup, re-wrap with its own header, forward to its parent — until the packet reaches a gateway.
Each FORWARD is acknowledged by the receiving parent to confirm delivery.
Relay A Relay B (A's parent)
│ │
│──── FORWARD (seq=X) ───────>│
│ │
│<──── ACK (fwd_station=A, ───│
│ fwd_seq=X) │
│ │
[clear retry timer] [forward inner packet upstream]
If no ACK is received within a timeout (recommended 500ms for high frequency sensor networks, up to 15-30 seconds for low frequency networks), the sender retries up to a configurable number of attempts (recommended: 3). After exhausting retries, the sender marks its parent as unreliable, promotes its backup parent (if available), and retransmits the FORWARD to the new parent. If no backup parent is available, the node broadcasts a ROUTE_ERROR and enters an orphaned state, listening for beacons to reattach to the tree.
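The retry and failover decision described above can be sketched as a small state function. This is a minimal illustration under stated assumptions: the type and function names (fwd_retry_t, fwd_on_ack_timeout) are hypothetical, "retries" is taken to mean retransmissions after the first send, and the retry budget is assumed to reset when the backup parent is promoted.

```c
#include <stdint.h>

typedef enum {
    FWD_RETRANSMIT,        /* resend the FORWARD to the current parent */
    FWD_SWITCH_TO_BACKUP,  /* promote backup parent, resend to it */
    FWD_ORPHAN             /* broadcast ROUTE_ERROR, listen for beacons */
} fwd_action_t;

typedef struct {
    uint8_t retries;       /* retransmissions made to the current parent */
    uint8_t max_retries;   /* configurable; 3 is the recommended value */
    uint8_t have_backup;   /* non-zero if a backup parent is known */
} fwd_retry_t;

/* Called each time the ACK timer for a pending FORWARD expires. */
static fwd_action_t fwd_on_ack_timeout(fwd_retry_t *s)
{
    if (s->retries < s->max_retries) {
        s->retries++;
        return FWD_RETRANSMIT;
    }
    if (s->have_backup) {
        s->have_backup = 0;   /* the backup now becomes the parent */
        s->retries = 0;       /* assumed: fresh budget for the new parent */
        return FWD_SWITCH_TO_BACKUP;
    }
    return FWD_ORPHAN;
}
```

The caller is expected to clear the entry as soon as a matching ACK arrives, so this function only runs on the failure path.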
When a relay loses all upstream paths, it broadcasts a ROUTE_ERROR so downstream nodes can immediately reroute rather than waiting for beacon timeout.
Relay B (was Relay C's parent) Relay C (child of B)
│ │
[B loses its parent] │
│ │
│──── ROUTE_ERROR ──────────>│
│ (reason=parent_lost) │
│ │
[C immediately seeks alternative parent from neighbour table]
This converts a multi-minute outage (waiting for 3 missed beacon rounds × 60s = 180s) into sub-second failover in the best case.
Relays periodically send NEIGHBOUR_REPORT messages upstream to the gateway, providing a snapshot of their local topology view. These reports are forwarded like any other data (wrapped in FORWARD by upstream relays). The gateway aggregates reports from all relays to build a complete network topology graph, enabling operators to visualise the mesh, identify weak links, and plan node placement.
In a future protocol revision, the gateway may send PING messages routed downstream toward a specific target node. The target responds with a PONG that routes back upstream. This provides on-demand reachability confirmation and round-trip-time measurement without waiting for the target's next scheduled data or neighbour report transmission.
| Byte | Bits | Field |
|---|---|---|
| 0 | [7:4] | variant_id (4 bits: 0–14 = sensor data, 15 = mesh control) |
| 0–1 | [3:0]+[7:0] | station_id (12 bits: 0–4095) |
| 2–3 | [15:0] | sequence (16 bits, big-endian) |
| 4 | [7:0] | presence bitmap (variants 0–14), or ctrl_type + payload (variant 15) |
The variant and station_id are packed into a 4+12 bit structure:
byte[0] = (variant << 4) | (station_id >> 8)
byte[1] = station_id & 0xFF
This packing primitive recurs throughout the mesh protocol wherever a 4-bit field is paired with a 12-bit station_id or generation counter.
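A minimal sketch of the 4+12 packing primitive and its inverse follows. The function names (pack_4_12, unpack_hi4, unpack_lo12) are hypothetical and chosen for this example only.

```c
#include <stdint.h>

/* Pack a 4-bit value and a 12-bit value into two bytes, MSB first. */
static void pack_4_12(uint8_t out[2], uint8_t hi4, uint16_t lo12)
{
    out[0] = (uint8_t)(((hi4 & 0x0F) << 4) | ((lo12 >> 8) & 0x0F));
    out[1] = (uint8_t)(lo12 & 0xFF);
}

/* Recover the 4-bit field (variant or ctrl_type). */
static uint8_t unpack_hi4(const uint8_t in[2])
{
    return (uint8_t)(in[0] >> 4);
}

/* Recover the 12-bit field (station_id, gateway_id, and so on). */
static uint16_t unpack_lo12(const uint8_t in[2])
{
    return (uint16_t)(((in[0] & 0x0F) << 8) | in[1]);
}
```

Packing variant 0 with station 42 yields the bytes 00 2A, which matches the first two bytes of the simulator packets shown earlier.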
All mesh control packets share this structure:
| Byte | Bits | Field | Notes |
|---|---|---|---|
| 0–1 | 4+12 | 0xF \| sender_station | The mesh node transmitting this packet |
| 2–3 | 16 | sender_seq | Mesh sequence counter (separate from any sensor data sequence if dual-role) |
| 4 | [7:4] | ctrl_type | Mesh packet type (0x0–0xF) |
| 4 | [3:0] | type-specific | Upper nibble of first payload field |
The lower 4 bits of byte 4 and all bytes from byte 5 onward are control-type-specific. Fields pack as a bitstream from byte 4, MSB-first, with no padding except where explicitly noted.
Originated by gateways, rebroadcast by relays. Flows outward from gateway.
| Byte | Bits | Field | Range | Notes |
|---|---|---|---|---|
| 0–1 | 4+12 | 0xF \| sender_station | 0–4095 | Who (re)broadcast this copy |
| 2–3 | 16 | sender_seq | 0–65535 | |
| 4–5 | 4+12 | ctrl=0x0 \| gateway_id | 0–4095 | Originating gateway |
| 6 | 8 | cost | 0–255 | 0 at gateway, +1 per relay |
| 7 | 4+4 | flags \| generation[11:8] | flags: b0 = accepting forwards, b1–b3 reserved | |
| 8 | 8 | generation[7:0] | 0–4095 | Beacon round counter |
Total: 9 bytes.
Byte packing detail:
buf[4] = (0x0 << 4) | (gateway_id >> 8)
buf[5] = gateway_id & 0xFF
buf[6] = cost
buf[7] = (flags << 4) | ((generation >> 8) & 0x0F)
buf[8] = generation & 0xFF
Generation uses wraparound comparison: beacon A is newer than B if
(A - B) mod 4096 is in the range 1–2047. At a 60-second beacon interval,
generation wraps every ~68 hours.
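The byte layout above can be sketched as an encode/decode pair. The struct and function names (mesh_beacon_t, mesh_beacon_encode, mesh_beacon_decode) are assumptions for illustration and are not drawn from the reference implementation.

```c
#include <stdint.h>

#define MESH_VARIANT     0x0F
#define MESH_CTRL_BEACON 0x0
#define MESH_BEACON_LEN  9

typedef struct {
    uint16_t sender_station;  /* 12 bits */
    uint16_t sender_seq;      /* 16 bits */
    uint16_t gateway_id;      /* 12 bits */
    uint8_t  cost;            /* 0 at gateway, +1 per relay */
    uint8_t  flags;           /* 4 bits, b0 = accepting forwards */
    uint16_t generation;      /* 12 bits */
} mesh_beacon_t;

/* Serialise a BEACON into exactly MESH_BEACON_LEN bytes. */
static void mesh_beacon_encode(uint8_t buf[MESH_BEACON_LEN],
                               const mesh_beacon_t *b)
{
    buf[0] = (uint8_t)((MESH_VARIANT << 4) | ((b->sender_station >> 8) & 0x0F));
    buf[1] = (uint8_t)(b->sender_station & 0xFF);
    buf[2] = (uint8_t)(b->sender_seq >> 8);
    buf[3] = (uint8_t)(b->sender_seq & 0xFF);
    buf[4] = (uint8_t)((MESH_CTRL_BEACON << 4) | ((b->gateway_id >> 8) & 0x0F));
    buf[5] = (uint8_t)(b->gateway_id & 0xFF);
    buf[6] = b->cost;
    buf[7] = (uint8_t)(((b->flags & 0x0F) << 4) | ((b->generation >> 8) & 0x0F));
    buf[8] = (uint8_t)(b->generation & 0xFF);
}

/* Parse a BEACON. Returns 1 on success, 0 if the frame is not a beacon. */
static int mesh_beacon_decode(const uint8_t *buf, uint8_t len, mesh_beacon_t *b)
{
    if (len < MESH_BEACON_LEN) return 0;
    if ((buf[0] >> 4) != MESH_VARIANT) return 0;
    if ((buf[4] >> 4) != MESH_CTRL_BEACON) return 0;
    b->sender_station = (uint16_t)(((buf[0] & 0x0F) << 8) | buf[1]);
    b->sender_seq     = (uint16_t)((buf[2] << 8) | buf[3]);
    b->gateway_id     = (uint16_t)(((buf[4] & 0x0F) << 8) | buf[5]);
    b->cost           = buf[6];
    b->flags          = (uint8_t)(buf[7] >> 4);
    b->generation     = (uint16_t)(((buf[7] & 0x0F) << 8) | buf[8]);
    return 1;
}
```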
Wraps a raw sensor packet for relay toward the gateway.
| Byte | Bits | Field | Range | Notes |
|---|---|---|---|---|
| 0–1 | 4+12 | 0xF \| sender_station | 0–4095 | This relay |
| 2–3 | 16 | sender_seq | 0–65535 | |
| 4 | 4+4 | ctrl=0x1 \| ttl[7:4] | | |
| 5 | 4+4 | ttl[3:0] \| 0 | 0–255 | 4-bit pad aligns inner packet to byte boundary |
| 6+ | 8×N | inner_packet | | Raw iotdata bytes, opaque |
Total: 6 + N bytes.
Byte packing detail:
buf[4] = (0x1 << 4) | (ttl >> 4)
buf[5] = (ttl & 0x0F) << 4 /* lower nibble is zero pad */
memcpy(&buf[6], inner_packet, N) /* byte-aligned, no shifting */
The 4-bit pad at byte 5 lower nibble ensures the inner packet starts at a byte boundary (offset 6). This is a deliberate trade-off: the pad may cause up to 11 bits of wasted space in the worst case (as the inner packet may already have up to 7 bits wasted in the final byte alignment), but avoids requiring every relay to bit-shift the entire opaque payload. For relay hot-path performance (just a memcpy), this is the right choice. The pad nibble is reserved for future use (e.g. priority, retry count).
Inner packet length is derived from the radio layer: N = rx_packet_len - 6.
For duplicate suppression, the relay reads bytes 6–9 of the radio frame (the inner packet's iotdata header) to extract the originating sensor's station_id and sequence:
origin_station = ((buf[6] & 0x0F) << 8) | buf[7]
origin_sequence = (buf[8] << 8) | buf[9]
No FORWARD nesting occurs. Each relay creates a fresh FORWARD with its own sender_station and sender_seq. The inner_packet bytes are always the original sensor transmission, regardless of how many relay hops have occurred.
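A minimal sketch of the wrap step and the origin extraction used for duplicate suppression follows. The function names (mesh_forward_wrap, mesh_forward_origin), the length checks, and the return conventions are assumptions for this example.

```c
#include <stdint.h>
#include <string.h>

#define MESH_VARIANT      0x0F
#define MESH_CTRL_FORWARD 0x1
#define MESH_FWD_HDR_LEN  6
#define IOTDATA_HDR_LEN   4

/* Wrap a raw sensor packet in a FORWARD. Returns frame length, 0 on error. */
static size_t mesh_forward_wrap(uint8_t *out, size_t out_cap,
                                uint16_t sender_station, uint16_t sender_seq,
                                uint8_t ttl,
                                const uint8_t *inner, size_t inner_len)
{
    if (inner_len < IOTDATA_HDR_LEN || out_cap < MESH_FWD_HDR_LEN + inner_len)
        return 0;
    out[0] = (uint8_t)((MESH_VARIANT << 4) | ((sender_station >> 8) & 0x0F));
    out[1] = (uint8_t)(sender_station & 0xFF);
    out[2] = (uint8_t)(sender_seq >> 8);
    out[3] = (uint8_t)(sender_seq & 0xFF);
    out[4] = (uint8_t)((MESH_CTRL_FORWARD << 4) | (ttl >> 4));
    out[5] = (uint8_t)((ttl & 0x0F) << 4);      /* low nibble: zero pad */
    memcpy(&out[MESH_FWD_HDR_LEN], inner, inner_len);
    return MESH_FWD_HDR_LEN + inner_len;
}

/* Read the dedup key from the inner header. Returns 1 on success. */
static int mesh_forward_origin(const uint8_t *frame, size_t frame_len,
                               uint16_t *station, uint16_t *sequence)
{
    if (frame_len < MESH_FWD_HDR_LEN + IOTDATA_HDR_LEN)
        return 0;
    *station  = (uint16_t)(((frame[6] & 0x0F) << 8) | frame[7]);
    *sequence = (uint16_t)((frame[8] << 8) | frame[9]);
    return 1;
}
```

A relay that re-forwards an incoming FORWARD would pass &frame[6] and frame_len - 6 as the inner packet, so the inner bytes stay untouched at every relay.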
Relay-by-relay acknowledgement of a received FORWARD.
| Byte | Bits | Field | Range | Notes |
|---|---|---|---|---|
| 0–1 | 4+12 | 0xF \| sender_station | 0–4095 | Parent sending the ACK |
| 2–3 | 16 | sender_seq | 0–65535 | |
| 4–5 | 4+12 | ctrl=0x2 \| fwd_station | 0–4095 | Child whose FORWARD is being ACKed |
| 6–7 | 16 | fwd_seq | 0–65535 | Child's sender_seq from the FORWARD |
Total: 8 bytes.
Broadcast by a relay that has lost all upstream paths.
| Byte | Bits | Field | Range | Notes |
|---|---|---|---|---|
| 0–1 | 4+12 | 0xF \| sender_station | 0–4095 | Orphaned node |
| 2–3 | 16 | sender_seq | 0–65535 | |
| 4 | 4+4 | ctrl=0x3 \| reason | 0–15 | 0=parent_lost, 1=overloaded, 2=shutdown |
Total: 5 bytes. The minimum possible mesh packet — just the common header with a reason code.
Reason codes:
| Value | Meaning |
|---|---|
| 0x0 | parent_lost — all upstream links failed |
| 0x1 | overloaded — too many children, shedding load |
| 0x2 | shutdown — graceful node shutdown |
| 0x3–0xF | reserved |
Periodic topology snapshot sent upstream to the gateway.
Header:
| Byte | Bits | Field | Range | Notes |
|---|---|---|---|---|
| 0–1 | 4+12 | 0xF \| sender_station | 0–4095 | Reporting node |
| 2–3 | 16 | sender_seq | 0–65535 | |
| 4–5 | 4+12 | ctrl=0x4 \| parent_id | 0–4095 | Current parent (0xFFF if orphaned) |
| 6 | 8 | my_cost | 0–255 | Reporting node's cost |
| 7 | 6+2 | num_neighbours \| gateway_id[11:10] | 0–63 | Number of neighbour entries that follow |
| 8 | 8 | gateway_id[9:2] | 0–4095 | Current active gateway tree |
| 9 | 2 | gateway_id[1:0] | | |
Neighbour entry (3 bytes each):
| Offset | Bits | Field | Range | Notes |
|---|---|---|---|---|
| +0 | 8 | cost | 0–255 | Neighbour's advertised cost |
| +1–2 | 4+12 | rssi_q4 \| station_id | | RSSI quantised to 4 bits + station_id |
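Total: 9 bytes + 2 bits of header, plus 3 bytes per neighbour entry. Rounded up to whole bytes, that is 10 + 3N bytes on the wire.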
RSSI quantisation uses 5 dBm steps from a floor of −120 dBm:
| rssi_q4 | Approximate dBm |
|---|---|
| 0 | ≤ −120 |
| 1 | −115 |
| 2 | −110 |
| ... | ... |
| 10 | −70 |
| 15 | ≥ −45 |
Encode: rssi_q4 = clamp((rssi_dbm + 120) / 5, 0, 15). Decode:
rssi_dbm ≈ (rssi_q4 × 5) − 120.
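The encode and decode formulae translate directly into integer C. The function names (rssi_to_q4, q4_to_rssi) are hypothetical. C integer division truncates toward zero, which is harmless here because any negative intermediate is clamped to 0 anyway.

```c
#include <stdint.h>

/* Quantise an RSSI in dBm to 4 bits: 5 dB steps from a -120 dBm floor. */
static uint8_t rssi_to_q4(int16_t rssi_dbm)
{
    int q = (rssi_dbm + 120) / 5;
    if (q < 0)  q = 0;
    if (q > 15) q = 15;
    return (uint8_t)q;
}

/* Approximate inverse: lower edge of the quantisation step, in dBm. */
static int16_t q4_to_rssi(uint8_t q4)
{
    return (int16_t)((q4 & 0x0F) * 5 - 120);
}
```

For example, -88 dBm quantises to 6, which decodes to -90 dBm, so the round trip can lose up to 4 dB.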
Example sizes:
| Neighbours | Total bytes |
|---|---|
| 4 | 22 |
| 8 | 34 |
| 16 | 58 |
| 32 | 106 |
| 63 | 199 |
All fit within standard LoRa maximum payload sizes (222 bytes at SF7/125kHz, up to 255 at lower spreading factors).
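The example sizes above can be reproduced with a short calculation. This sketch assumes that neighbour entries follow the 74-bit header (9 bytes + 2 bits) as a contiguous bitstream, per the general packing rule, and that the final byte is padded. The helper name neighbour_report_bytes is hypothetical.

```c
#include <stdint.h>

#define NR_HEADER_BITS 74u   /* 9 bytes + 2 bits */
#define NR_ENTRY_BITS  24u   /* 8-bit cost + 4-bit rssi_q4 + 12-bit station_id */
#define NR_MAX_ENTRIES 63u   /* limit of the 6-bit num_neighbours field */

/* On-air size in bytes of a NEIGHBOUR_REPORT with the given entry count. */
static uint16_t neighbour_report_bytes(uint8_t num_neighbours)
{
    uint32_t bits;
    if (num_neighbours > NR_MAX_ENTRIES)
        num_neighbours = NR_MAX_ENTRIES;
    bits = NR_HEADER_BITS + NR_ENTRY_BITS * (uint32_t)num_neighbours;
    return (uint16_t)((bits + 7u) / 8u);    /* round up to whole bytes */
}
```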
Gateway-originated reachability test, routed downstream toward a target node.
| Byte | Bits | Field | Range | Notes |
|---|---|---|---|---|
| 0–1 | 4+12 | 0xF \| sender_station | 0–4095 | Current forwarding relay |
| 2–3 | 16 | sender_seq | 0–65535 | |
| 4–5 | 4+12 | ctrl=0x5 \| target_id | 0–4095 | Destination node |
| 6 | 8 | ttl | 0–255 | Decremented per relay on downstream path |
| 7 | 8 | ping_id | 0–255 | Correlates with PONG |
Total: 8 bytes.
Response to PING, flows upstream toward the gateway.
| Byte | Bits | Field | Range | Notes |
|---|---|---|---|---|
| 0–1 | 4+12 | 0xF \| sender_station | 0–4095 | Responding node |
| 2–3 | 16 | sender_seq | 0–65535 | |
| 4–5 | 4+12 | ctrl=0x6 \| gateway_id | 0–4095 | Route back to originating gateway |
| 6 | 8 | relays | 0–255 | Incremented each relay on return path |
| 7 | 8 | ping_id | 0–255 | Echoed from PING |
Total: 8 bytes.
Reserved for future use. Relays receiving an unrecognised ctrl_type should silently discard the packet.
| ctrl | Name | Direction | Bytes | Version |
|---|---|---|---|---|
| 0x0 | BEACON | outward (gateway → relays) | 9 | v1 |
| 0x1 | FORWARD | inward (relays → gateway) | 6 + N | v1 |
| 0x2 | ACK | single relay (parent → child) | 8 | v1 |
| 0x3 | ROUTE_ERROR | broadcast | 5 | v1 |
| 0x4 | NEIGHBOUR_REPORT | inward (relays → gateway) | 9.2 + 3N | v1 |
| 0x5 | PING | outward (gateway → target) | 8 | v2 |
| 0x6 | PONG | inward (target → gateway) | 8 | v2 |
| 0x7–0xF | reserved | — | — | — |
A relay maintains the following state in RAM. Total memory footprint is under 512 bytes for typical configurations.
Routing state:
- parent: station_id, cost, RSSI, last beacon time (8 bytes)
- backup_parent: same structure (8 bytes)
- my_cost: current relay count to gateway (1 byte)
- my_gateway: gateway_id of the tree this node belongs to (2 bytes)
- beacon_generation: most recently processed generation (2 bytes)
Neighbour table (up to 63 entries):
- Per entry: station_id, cost, RSSI, last_heard timestamp (8 bytes each)
- Typical: 8–16 entries = 64–128 bytes
- Entries expire after a configurable timeout (recommended: 5× beacon interval)
Duplicate suppression ring (32–64 entries):
- Per entry: origin_station_id (12 bits) + origin_sequence (16 bits) = 4 bytes packed
- Ring of 64 entries = 256 bytes
- FIFO: oldest entry evicted when ring is full
Forward retry queue (4–8 entries):
- Per entry: pending FORWARD packet buffer, retry count, timestamp of last attempt, parent at time of send
- Entries cleared on ACK receipt or after max retries
initialise:
    listen for beacons to join a tree
    set status = orphaned
on receive packet:
    if variant == 15:
        switch (ctrl_type):
            BEACON: process_beacon()
            FORWARD: unwrap, dedup, re-wrap, forward to parent
            ACK: match against forward retry queue, clear entry
            ROUTE_ERROR: if sender is my parent, trigger parent reselection
            other: discard
    else:
        // raw sensor packet (variant 0–14)
        schedule_forward(packet) // backoff, dedup, wrap, send to parent
periodic timers:
    beacon rebroadcast: on beacon receipt, after 1–5s random jitter
    forward retry: check pending queue, retransmit if ACK timeout
    parent timeout: if no beacon for 3 rounds, orphan and reselect
    neighbour report: send report upstream every N minutes
    own sensor readings: if dual-role, encode and transmit own data
An existing iotdata gateway requires three additions to support mesh:
Beacon origination: Every N seconds (60 default), transmit a BEACON with cost=0 and an incrementing generation counter. The gateway_id is the gateway's own station_id.
FORWARD handling: On receiving a variant 15 packet with ctrl_type 0x1, extract the inner packet starting at byte 6 and process it through the normal iotdata receive path (decode, store, display). Send an ACK back to the FORWARD's sender.
Duplicate suppression: Maintain a ring buffer of recently-seen {station_id, sequence} pairs. Check every incoming sensor packet (whether received directly or unwrapped from a FORWARD) against this ring. Discard duplicates, keeping the first arrival.
The existing iotdata decode path for variants 0–14 is completely untouched.
Duplicate suppression is critical because the same sensor packet may arrive at a gateway via multiple paths: directly, via one relay, or via different relay chains. Without dedup, every measurement would be recorded multiple times.
The dedup key is {origin_station_id, origin_sequence}, extracted from the iotdata header of the original sensor packet. Both relays and gateways maintain dedup rings:
- At the relays: prevents forwarding the same sensor packet twice (e.g. two relays both hear the same sensor and both forward upstream — the upstream relay deduplicates).
- At the gateway: prevents processing the same data twice when it arrives both directly and via relay.
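A ring buffer of 64 entries is sufficient for most deployments. With 16 sensors transmitting every 5–15 seconds (roughly 1–3 packets per second in aggregate), the ring covers approximately 20–60 seconds of history, which comfortably exceeds the few seconds a packet spends in flight across the mesh. The ring is FIFO: the oldest entry is evicted when the buffer is full.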
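A minimal sketch of such a dedup ring follows. The names (dedup_ring_t, dedup_check_and_add), the 32-bit packed key, and the linear scan are assumptions for this example; a zero-initialised struct is an empty ring.

```c
#include <stdint.h>

#define DEDUP_RING_SIZE 64

typedef struct {
    uint32_t keys[DEDUP_RING_SIZE];  /* (station_id << 16) | sequence */
    uint8_t  count;                  /* valid entries, up to ring size */
    uint8_t  head;                   /* next slot to overwrite (oldest) */
} dedup_ring_t;

/* Returns 1 if {station, seq} was seen recently; otherwise records it
 * (evicting the oldest entry when full) and returns 0. */
static int dedup_check_and_add(dedup_ring_t *r, uint16_t station, uint16_t seq)
{
    uint32_t key = ((uint32_t)(station & 0x0FFF) << 16) | seq;
    uint8_t i;

    for (i = 0; i < r->count; i++)
        if (r->keys[i] == key)
            return 1;

    r->keys[r->head] = key;
    r->head = (uint8_t)((r->head + 1) % DEDUP_RING_SIZE);
    if (r->count < DEDUP_RING_SIZE)
        r->count++;
    return 0;
}
```

A linear scan over 64 words is cheap on an ESP32-class device; a larger ring might justify a hash index, which is beyond this sketch.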
A relay selects its parent using the following priority:
- Lowest cost (fewest relays to gateway)
- If equal cost, highest RSSI (strongest signal)
- If equal cost and RSSI, prefer existing parent (stability)
The backup parent is the second-best candidate by the same criteria.
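The selection priority can be expressed as a single comparison function. The struct and function names (parent_candidate_t, candidate_is_better) are hypothetical. Returning 0 on a full tie is how this sketch implements the preference for the existing parent.

```c
#include <stdint.h>

typedef struct {
    uint16_t station_id;
    uint8_t  cost;       /* advertised relay count to the gateway */
    int16_t  rssi_dbm;   /* signal strength of the last beacon heard */
} parent_candidate_t;

/* Returns 1 if cand should replace cur as parent, 0 to keep cur. */
static int candidate_is_better(const parent_candidate_t *cand,
                               const parent_candidate_t *cur)
{
    if (cand->cost != cur->cost)
        return cand->cost < cur->cost;          /* 1: lowest cost */
    if (cand->rssi_dbm != cur->rssi_dbm)
        return cand->rssi_dbm > cur->rssi_dbm;  /* 2: highest RSSI */
    return 0;                                   /* 3: tie, keep existing */
}
```

Running the same comparison over the remaining neighbour table entries yields the backup parent.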
Failover triggers:
- FORWARD ACK timeout after max retries — parent is unreachable.
- ROUTE_ERROR received from parent — parent has lost its own uplink.
- Beacon timeout — no beacon from parent's tree for 3 consecutive rounds.
On failover, the node promotes its backup parent, recalculates cost (new parent's cost + 1), and continues forwarding. If no backup is available, the node broadcasts a ROUTE_ERROR with reason=parent_lost and enters orphaned state, listening for beacons from any tree.
A relay rebroadcasts a received beacon only if:
- The beacon's generation is newer than the last processed generation for this gateway_id (modular comparison: newer if difference mod 4096 is in range 1–2047), OR
- The beacon has the same generation but offers a strictly lower cost than the current best seen for this generation.
If the beacon does not meet either condition, it is suppressed. This prevents beacon storms in dense deployments where many nodes hear the same beacon simultaneously. The random rebroadcast jitter (1–5 seconds) further reduces collision probability.
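The two rebroadcast conditions can be sketched as follows, with one state record per tracked gateway_id. The names (beacon_state_t, gen_is_newer, beacon_should_rebroadcast) are assumptions for this example, as is the choice to treat the first beacon ever heard from a gateway as new.

```c
#include <stdint.h>

typedef struct {
    uint8_t  valid;       /* 0 until the first beacon has been processed */
    uint16_t generation;  /* last processed generation (12 bits) */
    uint8_t  best_cost;   /* lowest cost seen for that generation */
} beacon_state_t;

/* Modular comparison: a is newer than b if (a - b) mod 4096 is 1..2047. */
static int gen_is_newer(uint16_t a, uint16_t b)
{
    uint16_t d = (uint16_t)((uint16_t)(a - b) & 0x0FFF);
    return d >= 1 && d <= 2047;
}

/* Returns 1 if the beacon should be rebroadcast (after jitter), else 0. */
static int beacon_should_rebroadcast(beacon_state_t *st,
                                     uint16_t generation, uint8_t cost)
{
    generation &= 0x0FFF;
    if (!st->valid || gen_is_newer(generation, st->generation)) {
        st->valid = 1;
        st->generation = generation;
        st->best_cost = cost;
        return 1;
    }
    if (generation == st->generation && cost < st->best_cost) {
        st->best_cost = cost;
        return 1;
    }
    return 0;  /* older generation, or same generation at equal/higher cost */
}
```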
When a relay hears a raw sensor packet that it intends to forward, it waits a random backoff period (200–1000ms) before transmitting the FORWARD. During this backoff, if the node hears another relay transmit a FORWARD containing the same inner packet (identified by matching origin station and sequence in the inner header), it cancels its own forward.
This Trickle-style suppression (inspired by RFC 6206) significantly reduces redundant airtime in areas where multiple relay nodes have overlapping coverage. In the worst case (no other relay forwards), it adds 200–1000ms latency to the first relay. In dense areas, it eliminates duplicate transmissions entirely.
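A minimal sketch of the backoff-and-cancel bookkeeping for a single pending forward is shown below. The names (pending_fwd_t and the pending_fwd_* functions) are hypothetical. The caller is assumed to supply a millisecond clock and a random number, since neither source is specified by the protocol.

```c
#include <stdint.h>

#define FWD_BACKOFF_MIN_MS 200u
#define FWD_BACKOFF_MAX_MS 1000u

typedef struct {
    uint8_t  active;          /* 1 while a forward is waiting to be sent */
    uint16_t origin_station;  /* from the inner iotdata header */
    uint16_t origin_seq;
    uint32_t send_at_ms;      /* time at which the backoff expires */
} pending_fwd_t;

/* Schedule a forward with a random backoff in [200, 1000] ms. */
static void pending_fwd_schedule(pending_fwd_t *p, uint16_t station,
                                 uint16_t seq, uint32_t now_ms, uint32_t rnd)
{
    uint32_t span = FWD_BACKOFF_MAX_MS - FWD_BACKOFF_MIN_MS + 1u;
    p->active = 1;
    p->origin_station = station;
    p->origin_seq = seq;
    p->send_at_ms = now_ms + FWD_BACKOFF_MIN_MS + (rnd % span);
}

/* Another relay's FORWARD was overheard: cancel if it carries our packet. */
static void pending_fwd_on_overheard(pending_fwd_t *p, uint16_t station,
                                     uint16_t seq)
{
    if (p->active && p->origin_station == station && p->origin_seq == seq)
        p->active = 0;
}

/* Returns 1 once the backoff has expired and the forward was not cancelled.
 * The signed difference keeps the comparison correct across clock wrap. */
static int pending_fwd_due(const pending_fwd_t *p, uint32_t now_ms)
{
    return p->active && (int32_t)(now_ms - p->send_at_ms) >= 0;
}
```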
The mesh protocol is designed for off-the-shelf LoRa modules (e.g. Semtech SX1262-based modules like Ebyte E22-900T22D) connected to ESP32-class microcontrollers. No specialised radio hardware is required.
Recommended relay hardware:
- ESP32-C3 or ESP32-S3 (low cost, low power, sufficient RAM and compute)
- LoRa module with RSSI reporting (for neighbour quality assessment)
- Reliable power: mains, solar + battery, or PoE where available
- Weatherproof enclosure with external antenna for outdoor deployment
Relays should have reliable power supplies. Unlike sensors which can sleep between transmissions, relays must listen continuously. Solar + battery is viable in most climates with an appropriately sized panel (5–10W) and battery (5–10Ah).
Typical LoRa range per relay in different environments at commonly used settings (SF7–SF9, 125kHz bandwidth, 14–22 dBm transmit power):
| Environment | Typical range per relay | Notes |
|---|---|---|
| Open farmland | 2–8 km | Line-of-sight, minimal obstructions |
| Rolling hills | 1–4 km | Terrain shadowing, partial LOS |
| Forest / dense vegetation | 0.5–2 km | Significant attenuation, seasonal variation |
| Urban / buildings | 0.3–1.5 km | Multipath, reflection, penetration loss |
| Indoor-to-outdoor | 50–500 m | Wall penetration, highly variable |
With a maximum TTL of 255 relays, the theoretical network span is enormous (hundreds of kilometres). In practice, latency and airtime constraints limit useful depth to 5–10 relays in most deployments. Beyond 10 relays, per-packet latency (including backoff, transmission time, and ACK round-trips) accumulates significantly.
Per-relay latency for a forwarded packet:
| Component | Typical | Notes |
|---|---|---|
| Trickle backoff | 200–1000 ms | Random, first-relay only for sensor→relay |
| LoRa TX time (30 bytes, SF7) | ~50 ms | Roughly doubles with each SF step |
| ACK wait | 0–500 ms | Timeout before retry |
| Processing overhead | < 1 ms | Negligible on ESP32 |
Estimated end-to-end latency by relay count:
| Relays | Best case | Typical |
|---|---|---|
| 1 | ~300 ms | ~700 ms |
| 2 | ~400 ms | ~1.2 s |
| 3 | ~500 ms | ~1.8 s |
| 5 | ~700 ms | ~3.0 s |
| 10 | ~1.2 s | ~6.0 s |
For environmental sensor data with transmission intervals of 5–15 seconds, these latencies are entirely acceptable. The data is not time-critical — a few seconds of additional delay has no impact on the value of temperature, moisture, or depth readings.
Every relay consumes airtime. A packet forwarded across 3 relays uses 3× the airtime of a direct transmission plus ACK overhead. In regions with regulatory duty cycle limits (e.g. 1% in EU 868MHz sub-band h1.4), this constrains the aggregate throughput.
Example: 16 sensors transmitting every 10 seconds, average packet 25 bytes.
| Scenario | Packets/sec | Airtime/sec (SF7) | Effective duty cycle |
|---|---|---|---|
| All direct to gateway | 1.6 | ~80 ms | ~8% |
| All via 1 relay | 1.6 × 2 (fwd+ack) | ~210 ms | ~21% |
| All via 2 relays | 1.6 × 4 (2×fwd + 2×ack) | ~420 ms | ~42% |
| Mixed (50% direct, 50% 1-relay) | ~2.4 | ~145 ms | ~14.5% |
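In practice, most sensors will be direct to gateway; only sensors with poor direct links use mesh relays. Because the regulatory limit applies to each transmitter separately, the figure that matters is each relay's own airtime: a relay forwarding for 3–5 sensors at ~50 ms per FORWARD stays within a 1% duty cycle provided each sensor transmits no more often than roughly every 15–25 seconds.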
For deployments requiring higher throughput, use the 915MHz ISM band (Americas, Australia) which has more relaxed duty cycle requirements, or use LoRa spreading factor 7 (fastest airtime) with forward error correction.
| Parameter | Recommended limit | Hard limit |
|---|---|---|
| Sensors per gateway | 50–100 | 4095 (station_id space) |
| Relays per gateway | 10–20 | No hard limit, bounded by airtime |
| Maximum useful relays | 5–10 | 255 (TTL field) |
| Network span | 10–50 km | Limited by relay count × range |
| Gateways per deployment | 2–5 | No hard limit (each runs independent tree) |
| Neighbour table size per relay | 8–16 typical | 63 (protocol limit) |
| Total nodes (sensors + relays) | 100–200 | 4095 (station_id space) |
A 500-hectare farm with a central farmhouse, outlying barns, fields extending 2–3 km in each direction, and a river valley at the property boundary.
Challenge: Sensors in low-lying fields near the river are 3 km from the farmhouse with a ridge blocking line-of-sight. A weather station on a north-facing hillside has intermittent connectivity.
Deployment:
                             [WS-4] weather station (hilltop)
                                |
                             direct
                                |
    [SM-1] soil ──direct──> [GATEWAY] farmhouse
    [SM-2] soil ──direct──/     |
    [WL-1] water ─direct──/     |
                                |
                             direct
                                |
                             [HOP-A] barn roof (solar powered)
                                |
                             1 relay
                               / \
                  [SM-3] soil     [SM-4] soil (low field, behind ridge)
                  [WS-5] weather station (river valley)
                  [WL-2] water level (river gauge)
Configuration: 1 gateway, 1 relay, 8 sensors. Relay A (HOP-A in the diagram) sits on a barn roof at mid-elevation with clear line-of-sight to both the gateway and the river valley. Sensors SM-3, SM-4, WS-5, and WL-2 transmit normally. Relay A hears them (they cannot reach the gateway directly), wraps their data in FORWARD messages, and sends upstream to the gateway. Total relay load on Relay A: 4 sensors × ~0.1 packets/sec = ~0.4 FORWARD packets/sec, which is well within capacity.
Seasonal variation: In summer, tree canopy growth may degrade Relay A's link to WS-5. If WS-5's data becomes intermittent, deploy Relay B near the river to provide a 2-relay path: WS-5 → Relay B → Relay A → Gateway. Relay B automatically integrates — it hears Relay A's rebroadcast beacon (cost=1), sets itself as cost=2, and begins forwarding.
A 2000-hectare managed forest with environmental sensors distributed along trails and watercourses. Dense canopy limits per-relay range to 500m–1.5km. A research cabin at the forest edge serves as the gateway location, with a second gateway at a fire lookout tower 4 km away.
Challenge: Sensors deep in the forest are 3–5 km from either gateway with no line-of-sight. Canopy attenuation is severe.
Deployment:
    [GATEWAY-1] research cabin          [GATEWAY-2] fire tower
         |                                   |
       direct                              direct
         |                                   |
    [HOP-1] ridge clearing              [HOP-4] trail junction
         |                                   |
    1 relay from GW-1                   1 relay from GW-2
         |                                   |
    [HOP-2] stream crossing             [HOP-5] canopy gap
         |                                   |
    2 relays from GW-1                  2 relays from GW-2
         |                                   |
    [SM-1..4] soil sensors              [ENV-1..3] environment
    [WL-1] water level                  [WS-1] weather station
         |
    [HOP-3] deep forest
         |
    3 relays from GW-1
         |
    [SM-5..8] deep soil sensors
    [AQ-1] air quality
Configuration: 2 gateways, 5 relays, ~16 sensors. Maximum depth is 3 relays. Relays are placed at natural clearings, ridge lines, trail junctions, and stream crossings where canopy gaps improve radio propagation.
Redundancy: HOP-2 can hear beacons from both GW-1 (via HOP-1, cost=2) and GW-2 (via HOP-4 and HOP-5, cost=3). It normally routes via GW-1 (lower cost). If HOP-1 fails, HOP-2 receives no beacons from GW-1's tree, times out after 3 rounds, and adopts GW-2's tree via HOP-5 at cost=3. Sensors SM-1..4 and WL-1 continue to operate without interruption — they are unaware of the topology change.
If a sensor is mounted on a vehicle (e.g. a tractor, livestock tracker, or patrol vehicle) that moves between coverage areas, the protocol handles this naturally because sensors are not mesh-aware.
How it works: The vehicle-mounted sensor transmits periodically as always. As it moves, different relays (or the gateway directly) hear its transmissions. Whichever relay hears the packet forwards it. If multiple relays hear it, Trickle suppression normally leaves only one of them to forward, and gateway dedup discards any duplicate that slips through. As the vehicle moves out of one relay's range and into another's, forwarding transitions without any handover step.
What works well: Slow-moving vehicles (tractors, livestock) that spend minutes to hours within each relay's coverage area. The sensor's transmission interval (5–15 seconds) means multiple packets are sent during each coverage window.
What works less well: Fast-moving vehicles passing through a relay's range in seconds. If the sensor's transmission interval is longer than the transit time through coverage, packets may be missed during the transition between nodes. This is inherent to any non-continuous transmission scheme and is not specific to the mesh protocol.
The sensor firmware needs no changes. The mesh adapts to the sensor's location in real time. The gateway sees the same station_id and sequence numbers regardless of which relay forwarded the data. Duplicate suppression handles cases where the sensor is within range of multiple relays simultaneously.
| Version | Description |
|---|---|
| v1 | Initial mesh protocol. BEACON, FORWARD, ACK, ROUTE_ERROR, NEIGHBOUR_REPORT. Gradient-based routing with single parent selection and relay-by-relay acknowledgement. |
| v2 (planned) | Adds PING/PONG for gateway-initiated reachability testing. Requires downstream routing capability at relays. |
| Identifier | Value | Meaning |
|---|---|---|
| variant_id | 0x0F (15) | Mesh control packet |
| ctrl_type | 0x0–0x6 | Defined mesh packet types |
| ctrl_type | 0x7–0xF | Reserved for future use |
| parent_id | 0xFFF | Orphaned (no parent) |
| station_id | 0x000 | Reserved (do not assign to nodes) |
| reason codes | 0x3–0xF | Reserved for future use |
In multi-gateway deployments, a sensor positioned between two gateways (or a relay node with paths to both) may deliver the same packet to multiple gateways. Each gateway performs local dedup correctly, but the upstream system (database, MQTT broker, dashboard) receives the same measurement data twice from different gateway sources.
Lightweight solution: UDP dedup broadcast. Each gateway UDP-broadcasts a compact dedup notification on the local network whenever it processes a sensor packet. The notification contains only the dedup key — no measurement data:
    [gateway_id]    2 bytes  (12-bit station_id of the broadcasting gateway)
    [num_entries]   1 byte   (number of dedup tuples in this batch, 1–32)
    [entries...]    4 bytes each:
      [station_id]  2 bytes  (12-bit origin sensor, zero-padded to 16 bits)
      [sequence]    2 bytes  (16-bit origin sequence)
Maximum batch: 32 entries × 4 bytes + 3 byte header = 131 bytes per UDP datagram.
On receipt, other gateways add these tuples to their local dedup ring. If a subsequent FORWARD or direct sensor packet arrives with a station_id and sequence already in the ring (whether from local receive or cross-gateway notification), it is suppressed.
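As an illustration of the notification layout above, the following C sketch serialises and parses a dedup batch. It is not taken from the reference implementation: the function and type names are hypothetical, and the byte order (big-endian, with the 12-bit identifiers occupying the low 12 bits of their 16-bit slots) is an assumption, since the layout above does not fix it.

```c
#include <stddef.h>
#include <stdint.h>

#define DEDUP_MAX_ENTRIES 32u
#define DEDUP_HEADER_LEN  3u /* gateway_id (2) + num_entries (1) */
#define DEDUP_ENTRY_LEN   4u /* station_id (2) + sequence (2) */

typedef struct {
    uint16_t station_id; /* 12-bit origin sensor */
    uint16_t sequence;   /* 16-bit origin sequence */
} dedup_entry_t;

/* Serialise a batch. Returns bytes written, or 0 if the count is outside
 * 1..32 or the buffer is too small. Big-endian is an assumption. */
size_t dedup_notify_pack(uint8_t *buf, size_t buf_len, uint16_t gateway_id,
                         const dedup_entry_t *entries, size_t count) {
    if (count < 1 || count > DEDUP_MAX_ENTRIES) return 0;
    size_t need = DEDUP_HEADER_LEN + count * DEDUP_ENTRY_LEN;
    if (buf_len < need) return 0;
    size_t o = 0;
    buf[o++] = (uint8_t)((gateway_id >> 8) & 0x0F);
    buf[o++] = (uint8_t)(gateway_id & 0xFF);
    buf[o++] = (uint8_t)count;
    for (size_t i = 0; i < count; i++) {
        buf[o++] = (uint8_t)((entries[i].station_id >> 8) & 0x0F);
        buf[o++] = (uint8_t)(entries[i].station_id & 0xFF);
        buf[o++] = (uint8_t)(entries[i].sequence >> 8);
        buf[o++] = (uint8_t)(entries[i].sequence & 0xFF);
    }
    return o;
}

/* Parse a received datagram. Returns the number of entries decoded,
 * or 0 if the datagram is malformed or does not fit in `entries`. */
size_t dedup_notify_unpack(const uint8_t *buf, size_t len, uint16_t *gateway_id,
                           dedup_entry_t *entries, size_t max_entries) {
    if (len < DEDUP_HEADER_LEN) return 0;
    size_t count = buf[2];
    if (count < 1 || count > DEDUP_MAX_ENTRIES || count > max_entries) return 0;
    if (len != DEDUP_HEADER_LEN + count * DEDUP_ENTRY_LEN) return 0;
    *gateway_id = (uint16_t)(((buf[0] & 0x0F) << 8) | buf[1]);
    size_t o = DEDUP_HEADER_LEN;
    for (size_t i = 0; i < count; i++) {
        entries[i].station_id = (uint16_t)(((buf[o] & 0x0F) << 8) | buf[o + 1]);
        entries[i].sequence   = (uint16_t)((buf[o + 2] << 8) | buf[o + 3]);
        o += DEDUP_ENTRY_LEN;
    }
    return count;
}
```

A full batch of 32 entries packs to 131 bytes, matching the maximum stated above. The strict length check in `dedup_notify_unpack()` is a design choice for the sketch: a datagram with trailing bytes is rejected rather than partially trusted.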
Timing: On a LAN, UDP broadcast latency is under 1ms. LoRa packet transmission plus relay backoff is typically 200–1000ms. This means the cross-gateway notification almost always arrives before the second copy of the sensor data, achieving reliable suppression. In the rare case where two gateways receive the same packet within 1ms of each other (both heard the sensor directly), both may process it. This is acceptable — the upstream system can perform its own dedup on {station_id, sequence} as a final safety net.
Implementation: This is entirely optional. A deployment with a single gateway has no need for it. A multi-gateway deployment works correctly without it — the upstream system simply sees occasional duplicates. The UDP broadcast layer can be added to gateways independently of the mesh protocol and requires no changes to relays or sensors.
Alternative approaches:
- Shared MQTT topic — gateways publish dedup tuples to a topic such as iotdata/mesh/dedup/{gateway_id}. Other gateways subscribe. Adds dependency on the MQTT broker being available but piggybacks on infrastructure that likely already exists for sensor data delivery.
- Upstream dedup only — skip gateway-to-gateway coordination entirely. The database or ingestion layer deduplicates on {station_id, sequence, time_window}. Simplest to implement, slightly higher upstream load from duplicate records.
- Station-id range assignment — the operator assigns non-overlapping station_id ranges to gateways. A gateway only processes packets from its assigned range. Simple but inflexible — a sensor that moves or a relay that reroutes to a different gateway may fall outside the assigned range. Not recommended for dynamic mesh deployments.
The ctrl_type field has 9 unassigned values (0x7–0xF); 0x5–0x6 are already reserved for the planned v2 PING/PONG. Future protocol revisions may define additional packet types. The following have been identified as candidates:
CONFIG_PUSH (candidate: 0x7) — Gateway pushes configuration parameters to a specific relay. Routed downstream like PING. Enables remote adjustment of beacon intervals, transmit power, parent selection thresholds, and reporting frequency without physical access to the node.
Possible payload: target_station(12), config_key(8), config_value(16). Config keys could include:
| Key | Meaning | Value range |
|---|---|---|
| 0x01 | Beacon rebroadcast interval (seconds) | 10–600 |
| 0x02 | Forward retry count | 0–15 |
| 0x03 | Forward ACK timeout (ms / 100) | 1–50 (100ms–5000ms) |
| 0x04 | Parent timeout (missed beacon rounds) | 1–15 |
| 0x05 | Neighbour report interval (minutes) | 1–60 |
| 0x06 | Transmit power level | Module-specific |
| 0x07 | Force rejoin (clear routing state) | 1 = trigger |
CONFIG_ACK (candidate: 0x8) — Target node acknowledges receipt of CONFIG_PUSH. Routes upstream. Confirms the configuration was applied.
PATH_TRACE (candidate: 0x9) — Diagnostic packet that records the station_id of every relay it traverses, building a full path trace from sensor to gateway. A relay appends its own station_id (2 bytes) to the payload before forwarding. The gateway receives a complete ordered list of the relay chain.
Possible payload: origin_station(12), ttl(8), relay_count(8), then relay_count × station_id(12) entries packed sequentially. Maximum path of 15 relays = 5 + 1 + 1 + 23 = 30 bytes. This would be triggered by the gateway wrapping a specific sensor's next FORWARD in a PATH_TRACE envelope, or by a relay when it detects a new sensor (first time seeing a station_id).
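The sequential packing of 12-bit station_id entries mentioned in the possible payload can be illustrated with a short C sketch. PATH_TRACE is only a candidate, so everything here is hypothetical: the function names are invented for the example, and MSB-first bit order is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes needed to hold n packed 12-bit entries (final byte zero-padded). */
size_t pack12_size(size_t n) {
    return (n * 12u + 7u) / 8u;
}

/* Pack n 12-bit identifiers back to back, most significant bit first
 * (bit order is an assumption). Returns bytes written, or 0 if the
 * output buffer is too small. */
size_t pack12(uint8_t *out, size_t out_len, const uint16_t *ids, size_t n) {
    size_t need = pack12_size(n);
    if (out_len < need) return 0;
    memset(out, 0, need);
    size_t bit = 0;
    for (size_t i = 0; i < n; i++) {
        uint16_t v = ids[i] & 0x0FFF;
        for (int b = 11; b >= 0; b--, bit++) {
            if (v & (1u << b))
                out[bit / 8] |= (uint8_t)(0x80u >> (bit % 8));
        }
    }
    return need;
}
```

For example, the identifiers 0xABC and 0x123 pack into the three bytes AB C1 23, and `pack12_size(15)` gives the 23 bytes used for the relay list in the size estimate above.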
NETWORK_RESET (candidate: 0xA) — Gateway broadcasts a command for all relays to flush routing state and re-discover the topology from scratch. Nuclear option for when the mesh becomes wedged in a suboptimal configuration. Payload: just a confirmation nonce to prevent accidental triggering.
TIME_SYNC (candidate: 0xB) — If a future deployment requires coordinated sleep windows or TDMA-style channel access, a dedicated time synchronisation packet could carry a high-resolution timestamp from the gateway. Relays would estimate propagation delay from relay count and adjust their local clocks. However, for the current protocol's CSMA-based channel access, this is unnecessary.
GROUP_FORWARD (candidate: 0xC) — Aggregation packet where a relay bundles multiple small sensor packets into a single transmission to reduce per-packet overhead and ACK traffic. Payload: num_packets(6), then concatenated inner packets with 1-byte length prefixes. Most useful in dense deployments where a single relay forwards for many sensors. Trade-off: increases single-packet airtime and failure blast radius (losing one aggregated transmission loses multiple sensor readings).
The current NEIGHBOUR_REPORT carries cost, RSSI, and station_id per neighbour. Future revisions may extend neighbour entries with additional quality metrics:
- Packet delivery ratio (PDR) — percentage of expected packets actually received from this neighbour over a window. 4 bits (16 levels, ~6% granularity) would suffice. Better parent selection metric than instantaneous RSSI.
- Asymmetric link detection — a flag indicating whether the neighbour has acknowledged hearing this node. A neighbour with good inbound RSSI but no evidence of hearing this node's transmissions is a poor parent candidate (asymmetric link, common with differing antenna heights or transmit powers).
- Neighbour role — 2 bits indicating whether the neighbour is a gateway, relay, or sensor. Currently inferred from behaviour (gateways originate beacons, relays rebroadcast, sensors don't participate), but an explicit role field simplifies topology visualisation.
These extensions would increase neighbour entry size from 3 to 4 bytes. The num_neighbours field (6 bits, max 63) and LoRa payload limits (222 bytes at SF7) would support up to 53 extended entries — still more than sufficient.
The v1 protocol includes no authentication or encryption. For agricultural and environmental monitoring deployments, the threat model is typically low — the data has no commercial sensitivity and the radio channel is shared ISM spectrum. However, for deployments where integrity matters:
Packet authentication — a shared 4-byte key (pre-configured on all mesh nodes and gateways) could be used to compute a 2-byte truncated HMAC appended to every mesh control packet. This prevents rogue nodes from injecting false beacons or FORWARD packets. The key would be distributed during provisioning and is not expected to change frequently. 2 bytes provides 65536 possible values — sufficient to prevent casual injection, though not secure against a determined attacker with radio access.
Replay protection — the existing sequence number provides partial replay protection. A replayed packet with a previously-seen sequence number is caught by dedup. However, a replayed BEACON with a valid-looking generation could disrupt routing. Binding the HMAC to the generation counter and gateway_id prevents this.
Encryption — encrypting the inner packet within FORWARD would prevent eavesdropping on sensor data. AES-128 in CTR mode adds zero overhead to the packet size (ciphertext is the same length as plaintext) and requires only a shared key and a nonce derivable from {station_id, sequence}. Relays need not take part: because a relay treats the inner packet's contents as opaque, decrypting and re-encrypting at every relay would add computational cost for no benefit. Instead, the inner packet can remain encrypted end-to-end. The sensor encrypts before transmission, relay nodes forward the encrypted blob unchanged, and the gateway decrypts after receipt. The origin station and sequence in the inner header would need to stay in the clear, since relays match on them for suppression and the gateway needs them to derive the nonce.
Relays must listen continuously, which prevents the aggressive sleep modes used by transmit-only sensors. Typical listen-mode current for an SX1262 LoRa module is 5–8 mA; combined with an ESP32-C3 in active mode (~25–50 mA average with WiFi disabled), total system draw is 30–60 mA.
Solar viability: At 12V with a 30mA average draw, daily consumption is ~8.6Wh. A 10W solar panel in temperate latitudes produces 20–40Wh/day (seasonal variation). A 10Ah 12V battery provides ~3–4 days of autonomy in complete cloud cover. This is viable for most deployments.
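The arithmetic behind these figures can be reproduced with a small C sketch. The function names are illustrative, and the `usable_fraction` parameter is an assumption not stated in the text: it models the share of nominal battery capacity that can be drawn in practice (for example, to avoid deep discharge).

```c
/* Energy consumed per day, in watt-hours, at a given supply voltage
 * and average current draw in milliamps. */
double daily_consumption_wh(double volts, double avg_ma) {
    return volts * (avg_ma / 1000.0) * 24.0;
}

/* Days of operation with no solar input. usable_fraction (0..1) is an
 * assumed derating of the nominal capacity. */
double autonomy_days(double volts, double capacity_ah, double usable_fraction,
                     double avg_ma) {
    double usable_wh = volts * capacity_ah * usable_fraction;
    return usable_wh / daily_consumption_wh(volts, avg_ma);
}
```

`daily_consumption_wh(12.0, 30.0)` gives about 8.6 Wh, matching the figure above. Under the assumptions of a 60 mA draw (the upper end of the range in the previous paragraph) and half the nominal capacity being usable, `autonomy_days(12.0, 10.0, 0.5, 60.0)` gives roughly 3.5 days; lower draw or a larger usable fraction extends this.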
Duty-cycled listening (future optimisation): If beacon intervals are synchronised, relays could sleep between expected beacon windows and wake only during scheduled listen slots. This requires the TIME_SYNC mechanism described in J.2 and adds complexity to the parent selection logic (must account for clock drift during sleep). Not recommended for v1 but noted as a path to lower power consumption if needed.
Low-power relay mode: A relay with no downstream children (leaf relay — only forwarding for sensors it directly hears) could adopt a semi-synchronised schedule: listen for a window after each expected sensor transmission, forward any received packets, then sleep until the next expected window. This requires the relay to learn sensor transmission intervals through observation, which is feasible since sensors typically transmit at regular (if slightly randomised) intervals.
The mesh protocol's capacity is fundamentally limited by shared airtime on the LoRa channel. All nodes — sensors, relays, and gateways — share a single frequency and spreading factor (assuming no frequency planning).
Single-channel capacity at SF7/125kHz:
A 30-byte LoRa packet at SF7/125kHz takes approximately 50ms of airtime. At 1% duty cycle (EU 868MHz regulatory limit), one device can transmit 20 packets per 100 seconds, or 0.2 packets/second sustained.
In a mesh deployment, the bottleneck is the gateway's immediate neighbourhood — all forwarded packets must pass through the last relay to the gateway. If 10 relay paths converge on a single relay-1 node, that node must forward 10× the traffic of any individual sensor. With 100 sensors at 10-second intervals producing 10 packets/second aggregate, the relay-1 node must forward all 10 plus transmit ACKs, consuming ~1 second of airtime per second — impossible under any duty cycle regulation.
Mitigation strategies:
- Multiple relay-1 nodes — deploy 2–4 relays within direct range of the gateway, each serving a different angular sector or downstream branch. Distributes the forwarding load.
- Multiple gateways — each gateway serves a subset of the network. Cross-gateway dedup (J.1) prevents duplicate processing.
- Frequency planning — assign different LoRa channels to different branches of the mesh. Requires relays to manage multiple frequencies, adding hardware or scheduling complexity.
- Adaptive transmission intervals — sensors or the CONFIG_PUSH mechanism could adjust transmission rates based on network load. Sensors deeper in the mesh (more relay hops) could transmit less frequently.
- Aggregation — the GROUP_FORWARD packet type (J.2) could reduce per-packet overhead and ACK count at the cost of increased single-transmission airtime.
Rule of thumb: For a single LoRa channel at SF7 with 1% duty cycle, plan for no more than 30–50 sensors per gateway, with no more than 20 forwarded through any single relay-1 node. Scale beyond this by adding gateways, not by deepening the mesh.
The protocol currently has no version negotiation mechanism. All mesh nodes are expected to run the same protocol version. For future-proofing:
- Reserved ctrl_types (0x7–0xF) should be silently discarded by nodes that do not recognise them. This allows new packet types to be deployed incrementally — gateways can be updated first, followed by relays, without causing errors on nodes still running older firmware.
- Reserved flag bits in the BEACON packet provide a forward-compatible extension point. A new capability can be signalled by setting a flag bit. Older nodes ignore unknown flags but continue to process the beacon normally.
- The FORWARD packet is inherently version-agnostic — relay nodes do not interpret the inner payload, so changes to iotdata sensor variants, field definitions, or encoding formats require no mesh firmware updates.
If a formal version field becomes necessary, it could be encoded in the BEACON's reserved flag bits (e.g. flags bits 2–3 as a 2-bit protocol version, supporting 4 versions). Alternatively, a VERSION_ANNOUNCE packet type could be defined using one of the reserved ctrl_type slots.
This appendix discusses system-level concerns that fall outside the wire protocol but are essential for reliable deployment. The protocol defines how data is encoded and transmitted; this appendix addresses how the broader system around it should be designed and operated.
The protocol does not define or constrain the sensor's transmission interval. In practice, the interval is a deployment parameter balancing data freshness against power consumption and airtime budget.
Typical intervals for environmental monitoring range from 5 seconds (high-rate weather stations during storm events) to 3600 seconds (daily check-in for dormant sensors). The most common operational range is 30–300 seconds.
The interval SHOULD be chosen with awareness of the regulatory duty cycle limit. At 1% duty cycle (EU 868MHz), a 30-byte packet at SF7 (~50ms airtime) can be transmitted every 5 seconds. At SF12 (~1.5s airtime), the minimum interval rises to 150 seconds. Implementations that transmit more frequently than their duty cycle permits risk regulatory non-compliance and may interfere with other users of the shared ISM band.
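The relationship between airtime, duty cycle, and minimum interval is simple enough to compute at build time or at boot. The following C sketch is illustrative only; the function name and the choice of per-mille units for the duty cycle are assumptions made for the example.

```c
#include <stdint.h>

/* Minimum spacing between transmissions, in milliseconds, such that a
 * packet of airtime_ms stays within the given duty cycle. The duty cycle
 * is expressed in per-mille (10 = 1%). The result is rounded up. */
uint32_t min_interval_ms(uint32_t airtime_ms, uint32_t duty_permille) {
    if (duty_permille == 0) return UINT32_MAX; /* no transmission allowed */
    uint64_t scaled = (uint64_t)airtime_ms * 1000u;
    return (uint32_t)((scaled + duty_permille - 1u) / duty_permille);
}
```

With the figures above, `min_interval_ms(50, 10)` returns 5000 (5 seconds at SF7) and `min_interval_ms(1500, 10)` returns 150000 (150 seconds at SF12).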
Sensors that transmit at a fixed interval risk persistent collisions if multiple sensors are powered on simultaneously (e.g. after a site-wide power restoration or batch deployment). Two sensors with identical 60-second intervals that happen to align will collide on every transmission indefinitely.
Sensors SHOULD add uniformly distributed random jitter to each transmission interval. A jitter of ±10% of the base interval is sufficient to decorrelate sensors within a few transmission cycles. For example, a 60-second base interval should use a random delay of 54–66 seconds per cycle.
The jitter SHOULD be re-randomised for each transmission, not fixed at boot. A fixed offset (e.g. "this sensor always transmits at base + 3 seconds") reduces collision probability at boot but does not eliminate persistent collisions between sensors that happen to draw similar offsets.
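A minimal sketch of per-transmission jitter in C, assuming the ±10% window suggested above. The function name is hypothetical and `rand()` is a placeholder for the platform's random source; the small modulo bias it introduces is harmless for this purpose.

```c
#include <stdint.h>
#include <stdlib.h>

/* Return the delay before the next transmission, drawn uniformly from
 * [base - 10%, base + 10%]. Call once per cycle so the jitter is
 * re-randomised each time rather than fixed at boot.
 * Note: on platforms where RAND_MAX is small (e.g. 32767), very long
 * base intervals will not cover the full window; a wider random source
 * would be needed there. */
uint32_t next_interval_ms(uint32_t base_ms) {
    uint32_t span = base_ms / 10u;              /* 10% of the base interval */
    uint32_t range = 2u * span + 1u;            /* number of candidate delays */
    uint32_t offset = (uint32_t)rand() % range; /* 0 .. 2*span */
    return base_ms - span + offset;
}
```

For a 60-second base interval, `next_interval_ms(60000)` returns a value between 54000 and 66000 on each call.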
Some deployments benefit from event-driven interval changes:
- Storm mode. A weather station that detects rapidly changing pressure or high wind speeds may temporarily reduce its interval (e.g. from 60s to 10s) to capture the event at higher resolution.
- Low battery mode. A sensor below a battery threshold may increase its interval to extend operational life.
- Quiet mode. A sensor that detects no change in its readings over several cycles may increase its interval. The presence bit mechanism ensures that unchanged fields can be omitted entirely, further reducing airtime.
These behaviours are deployment-specific and are not standardised by this protocol. The CONFIG TLV (Section 9.5.4) can report the current interval to the gateway for fleet monitoring.
A gateway receives iotdata packets (directly or via mesh relays), decodes them, and delivers the data to upstream systems. The gateway is the protocol's termination point — upstream of the gateway, data is typically represented as JSON, stored in a time-series database, or forwarded via MQTT or HTTP.
The gateway's receive path should be structured as a pipeline:
- Radio receive. The LoRa (or other) radio delivers a raw byte buffer and link metadata (RSSI, SNR, frequency, spreading factor).
- Duplicate suppression. Check {station_id, sequence} against a ring buffer of recently processed packets. Discard duplicates. See Section E.4 of Appendix G for implementation details; this mechanism applies equally to non-mesh deployments where a sensor may be heard by multiple gateways.
- Decode. Decode the binary packet to the internal representation or directly to JSON. Discard malformed packets per Section 11.6.
- Enrich. Attach gateway-side metadata: receive timestamp (from the gateway's clock, independent of any datetime field in the packet), gateway identity, link quality metrics, and any cached state for this station (last known position, firmware version, etc.).
- Deliver. Forward the enriched record to upstream systems via MQTT, HTTP POST, database insertion, or local storage.
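The duplicate suppression step of the pipeline can be sketched as a small ring buffer keyed on {station_id, sequence}. This is an illustrative C example rather than the reference implementation: the names, the ring size of 64, and the linear scan are assumptions chosen for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEDUP_RING_SIZE 64u /* illustrative; size to the expected packet rate */

typedef struct {
    uint32_t keys[DEDUP_RING_SIZE]; /* (station_id << 16) | sequence */
    bool     used[DEDUP_RING_SIZE];
    uint32_t next;                  /* slot to overwrite next */
} dedup_ring_t;

/* Returns true if this {station_id, sequence} was seen recently and the
 * packet should be discarded. Otherwise records it, evicting the oldest
 * entry once the ring is full, and returns false. */
bool dedup_check_and_add(dedup_ring_t *r, uint16_t station_id, uint16_t sequence) {
    uint32_t key = ((uint32_t)(station_id & 0x0FFF) << 16) | sequence;
    for (uint32_t i = 0; i < DEDUP_RING_SIZE; i++) {
        if (r->used[i] && r->keys[i] == key) return true;
    }
    r->keys[r->next] = key;
    r->used[r->next] = true;
    r->next = (r->next + 1u) % DEDUP_RING_SIZE;
    return false;
}
```

A zero-initialised `dedup_ring_t` is an empty ring. The ring only remembers the most recent 64 keys, so an entry eventually ages out and the same key would be accepted again; that is the intended behaviour once the sequence counter wraps.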
A gateway SHOULD maintain per-station state for:
- Last sequence number. For gap detection and duplicate suppression.
- Last known position. Cached from the most recent packet containing a position field. Associated with subsequent packets that omit position (see Section 11.2).
- Last known datetime offset. For stations that transmit datetime infrequently, the gateway can estimate the sensor's clock drift by comparing the sensor's datetime field against the gateway's receive timestamp.
- Firmware version and configuration. Cached from VERSION and CONFIG TLV entries. Used for fleet management and diagnostics.
- Cumulative statistics. Packet count, gap count, average RSSI, last heard timestamp. See Section H.3.
This state may be held in memory (sufficient for small deployments), in a local key-value store (e.g. SQLite, Redis), or in the upstream database.
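A minimal in-memory version of this per-station state, covering the sequence and statistics items, might look like the following C sketch. The structure and function names are hypothetical. The sketch assumes a 16-bit sequence number that increments by one per transmission and wraps modulo 65536, and it uses a simple exponential moving average (weight 1/8) for RSSI; neither choice is mandated by the protocol.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     seen;          /* false until the first packet arrives */
    uint16_t last_sequence;
    uint32_t packet_count;
    uint32_t gap_count;     /* total packets inferred as missing */
    double   rssi_avg_dbm;  /* exponential moving average */
    uint32_t last_heard_s;  /* gateway clock, seconds */
} station_state_t;

/* Update state for a newly received packet. Returns the number of packets
 * inferred missing since the previous one. Duplicates and out-of-order
 * arrivals (sequence at or behind the last one) leave the state untouched. */
uint16_t station_update(station_state_t *st, uint16_t sequence, int rssi_dbm,
                        uint32_t now_s) {
    uint16_t missed = 0;
    if (st->seen) {
        /* Modulo-65536 difference handles wraparound from 65535 to 0. */
        uint16_t delta = (uint16_t)(sequence - st->last_sequence);
        if (delta == 0 || delta >= 0x8000u) return 0;
        missed = (uint16_t)(delta - 1u);
        st->rssi_avg_dbm += ((double)rssi_dbm - st->rssi_avg_dbm) / 8.0;
    } else {
        st->seen = true;
        st->rssi_avg_dbm = (double)rssi_dbm;
    }
    st->last_sequence = sequence;
    st->packet_count++;
    st->gap_count += missed;
    st->last_heard_s = now_s;
    return missed;
}
```

Position, datetime offset, and TLV-derived fields would be added to the same structure in a fuller implementation.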
Operators SHOULD monitor the following metrics to maintain system health. These metrics are derived from the packet stream and gateway state, not from any specific protocol field.
| Metric | Derivation | Alerts on |
|---|---|---|
| Packets per interval | Count packets per station per time window | Station silent for >2× expected interval |
| Sequence gap rate | Count gaps in sequence number per station | Gap rate exceeding expected packet loss for the link |
| RSSI trend | Moving average of link RSSI per station | Sustained decline indicating antenna, obstruction, or hardware degradation |
| Battery trend | Track battery level over time | Level below deployment-specific threshold; unexpected discharge rate |
| Decode error rate | Count packets that fail decoding per station | Non-zero rate from a previously healthy station |
| TLV diagnostic frequency | Count DIAGNOSTIC TLV entries per station per window | Sudden increase indicating sensor fault |
| Restart frequency | Track restart count from STATUS TLV | Restarts with WATCHDOG or PANIC reason |
| Clock drift | Compare sensor datetime against gateway receive time | Drift exceeding 30 seconds (may indicate RTC failure) |

The following metrics are aggregated across the whole fleet rather than tracked per station:

| Metric | Derivation | Alerts on |
|---|---|---|
| Active stations | Count stations heard within the last N intervals | Station count drops below expected fleet size |
| Duplicate rate | Count packets suppressed by dedup as a fraction of total | High rate may indicate unnecessary mesh overlap |
| Gateway packet rate | Total packets processed per second across all stations | Approaching processing or duty cycle capacity |
| Airtime utilisation | Sum of received packet airtimes per time window | Approaching regulatory duty cycle limit |
| Decode failure rate | Aggregate decode errors across all stations | Spike indicating firmware rollout issues |
These metrics can be derived from the gateway's receive path with minimal overhead. The enrichment step (Section H.2) is the natural point to update counters and evaluate alert conditions.
Alerting thresholds are deployment-specific. A remote weather station in a mountain location with marginal link budget has different expectations than a soil sensor 50 metres from the gateway. Operators SHOULD configure per-station or per-station-class thresholds rather than global defaults.
The most critical alert is station silence — a station that stops transmitting entirely. This may indicate hardware failure, power exhaustion, theft, or catastrophic link degradation. The alert threshold should be set to 2–3× the expected transmission interval to avoid false positives from normal jitter and occasional packet loss.
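The silence check itself reduces to a single comparison against the gateway's last-heard timestamp. The C sketch below is illustrative; the function name and parameters are assumptions, with `multiplier` corresponding to the 2–3× factor suggested above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a station has been silent for longer than
 * multiplier x its expected transmission interval. Timestamps are
 * seconds on the gateway's clock; unsigned subtraction keeps the
 * elapsed time correct if the counter wraps. */
bool station_is_silent(uint32_t now_s, uint32_t last_heard_s,
                       uint32_t expected_interval_s, uint32_t multiplier) {
    uint32_t elapsed = now_s - last_heard_s;
    return elapsed > expected_interval_s * multiplier;
}
```

For a station with a 60-second interval and a multiplier of 2, the alert fires once more than 120 seconds have passed since the last packet, which tolerates one lost packet plus jitter.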
The protocol's datetime field (Section 8.8) encodes time relative to the start of the year with no absolute time reference. The receiver resolves the year (Section 11.1). This design assumes that the receiver has an accurate clock.
For gateways, this is typically satisfied by NTP synchronisation over an internet connection, or by a local GNSS receiver. Gateways that lack both SHOULD use the receive timestamp as the primary time reference and treat the sensor's datetime field as a secondary indicator, useful for detecting transmission delays or buffered transmissions but not as the authoritative timestamp.
For sensors, the time source determines the accuracy of the datetime field:
| Source | Typical accuracy | Drift |
|---|---|---|
| GNSS (GPS/Galileo) | < 1 second | None (re-acquired each fix) |
| NTP (via WiFi) | < 100 ms | None (re-synchronised) |
| RTC (crystal) | ±2 ppm (good crystal) | ~1 minute per year |
| RTC (internal RC) | ±1–5% (uncalibrated) | Minutes per day |
Sensors using a free-running RTC with no external synchronisation will accumulate drift. The 5-second resolution of the datetime field means that RTC drift below 5 seconds is invisible, but over days or weeks the drift becomes significant. The gateway can detect and report drift by comparing the sensor's datetime against its own receive timestamp (Section H.3).
The gateway's output — typically JSON records — feeds into a data pipeline for storage, analysis, and visualisation. The following considerations apply to the pipeline design:
Idempotent ingestion. The combination of {station_id, sequence} is unique per transmission. The ingestion layer SHOULD use this as a deduplication key, ensuring that duplicate deliveries (from multi-gateway deployments, message broker retries, or pipeline replays) do not create duplicate records.
Schema evolution. When new field types or variants are introduced, the upstream schema must accommodate new JSON keys. A schema-on-read approach (e.g. storing the full JSON document in a document store or a JSONB column) is more resilient to evolution than a rigid relational schema with a column per field.
Backfill and reprocessing. If the binary packets are stored alongside (or instead of) the decoded JSON, the pipeline can be reprocessed when decoder bugs are fixed or new field interpretations are added. Storing the raw hex alongside the decoded output is inexpensive (typically 16–32 bytes per record) and provides an authoritative source of truth.
Retention. Environmental monitoring data is typically retained for years or decades. At one packet per minute per station, a 16-station deployment produces approximately 8.4 million records per year — modest by time-series database standards.
Deployments with multiple gateways provide redundancy and extended coverage but introduce coordination requirements.
Duplicate suppression. A sensor within range of two gateways will be received by both. Each gateway independently decodes and delivers the data, producing duplicate records upstream. See Appendix G, Section J.1 for gateway-to-gateway dedup strategies. For non-mesh deployments, upstream dedup on {station_id, sequence} in the ingestion layer is the simplest and most robust approach.
Gateway identity. Each gateway SHOULD tag its output with its own identity (station_id, hostname, or other identifier) so that the upstream system can distinguish which gateway received each packet. This is essential for link quality analysis — a packet received by gateway A at -90 dBm and gateway B at -110 dBm indicates the sensor is closer to A.
Failover. If one gateway fails, sensors within range of both gateways continue to be received by the surviving gateway with no data loss. Sensors within range of only the failed gateway are lost until the gateway is restored or a mesh relay is deployed to bridge the gap.
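The upstream dedup on {station_id, sequence} described above can be sketched in C as a small ring of recently seen keys. This is a minimal illustration, not part of libiotdata: the names `dedup_window_t`, `dedup_init`, `dedup_accept`, and the window size of 64 are all assumptions. A production ingestion layer would more likely rely on a unique constraint in its datastore.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical window size: how many recent packets are remembered. */
#define DEDUP_SLOTS 64u

typedef struct {
    uint32_t keys[DEDUP_SLOTS]; /* (station_id << 16) | sequence */
    bool     used[DEDUP_SLOTS];
    size_t   next;              /* next ring slot to overwrite */
} dedup_window_t;

static void dedup_init(dedup_window_t *w)
{
    for (size_t i = 0; i < DEDUP_SLOTS; i++) {
        w->keys[i] = 0;
        w->used[i] = false;
    }
    w->next = 0;
}

/* Returns true if the packet is new (and records it), false if it is a
 * duplicate of a packet still held in the window. */
static bool dedup_accept(dedup_window_t *w, uint16_t station_id, uint16_t sequence)
{
    /* Station ID is 12 bits in the header, sequence is 16 bits. */
    uint32_t key = ((uint32_t)(station_id & 0x0FFFu) << 16) | sequence;

    for (size_t i = 0; i < DEDUP_SLOTS; i++)
        if (w->used[i] && w->keys[i] == key)
            return false;

    w->keys[w->next] = key;
    w->used[w->next] = true;
    w->next = (w->next + 1u) % DEDUP_SLOTS;
    return true;
}
```

Because the sequence number wraps, the window should stay short relative to the wrap period so that a legitimately reused sequence value is not rejected.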
This appendix compares the iotdata protocol along two dimensions: first, against alternative wire encodings for the same sensor payload; second, against established embedded C libraries that share iotdata's cross-platform, low-resource design philosophy even though they solve different problems.
The following sensor reading is used for all encoding comparisons:
| Field | Value |
|---|---|
| Battery level | 84% |
| Battery charging | false |
| Link RSSI | -96 dBm |
| Link SNR | 10 dB |
| Temperature | 21.5°C |
| Pressure | 1013 hPa |
| Humidity | 45% |
| Wind speed | 5.0 m/s |
| Wind direction | 180° |
| Wind gust | 8.5 m/s |
| Rain rate | 3 mm/hr |
| Rain drop size | 2.5 mm |
| Solar irradiance | 850 W/m² |
| UV index | 7 |
This is a Presence Byte 0 only packet (no position, datetime, radiation, or TLV data). It represents the most common transmission for a weather station.
Header (32 bits) + Presence Byte 0 (8 bits) + Battery (6) + Link (6) + Environment (24) + Wind (22) + Rain (12) + Solar (14) = 124 bits = 16 bytes (after zero-padding the final byte).
Hex: 00 2A C3 50 FF D5 EB 95 BA 2F 52 8A 35 28 70 00
[header ] [p][bat+lnk+environment ][wind ]...
{
"battery": { "level": 84, "charging": false },
"link": { "rssi": -96, "snr": 10.0 },
"environment": { "temperature": 21.5, "pressure": 1013, "humidity": 45 },
"wind": { "speed": 5.0, "direction": 180, "gust": 8.5 },
"rain": { "rate": 3, "size": 2.5 },
"solar": { "irradiance": 850, "ultraviolet": 7 }
}
261 bytes. With keys shortened to single characters: ~130 bytes. Even aggressively minified JSON is 8× larger than iotdata.
CBOR encodes the same structure as a map of maps with integer keys. Using single-byte integer keys for all fields:
- Map overhead: ~14 bytes (outer map + 6 inner maps)
- Values: ~40 bytes (integers and floats in their minimal CBOR representation)
- Keys: ~12 bytes (single-byte integer keys)
~66 bytes. CBOR's self-describing nature adds per-field type tags and lengths. It is approximately 4× larger than iotdata for this payload.
A Protobuf message with field numbers and varint/fixed encoding:
- Field tags: ~12 bytes (one per field, varint-encoded field number + wire type)
- Values: ~30 bytes (varints for integers, fixed32 for floats)
- No nested message overhead if flattened
~42 bytes (flattened). With nested messages matching the JSON structure: ~52 bytes. Protobuf is approximately 2.5–3× larger than iotdata. The overhead comes from per-field tags and byte-aligned varint encoding.
MessagePack with integer keys produces results comparable to CBOR:
~62 bytes. Slightly smaller than CBOR due to more compact map headers. Approximately 4× larger than iotdata.
struct __attribute__((packed)) weather_packet {
uint8_t battery_level; /* 1 byte (wastes 8 bits for a 0-100 value) */
uint8_t battery_charging; /* 1 byte (wastes 7 bits for a boolean) */
int8_t link_rssi; /* 1 byte */
int8_t link_snr; /* 1 byte */
int16_t temperature; /* 2 bytes (×100 fixed point) */
uint16_t pressure; /* 2 bytes */
uint8_t humidity; /* 1 byte */
uint16_t wind_speed; /* 2 bytes (×100) */
uint16_t wind_direction; /* 2 bytes */
uint16_t wind_gust; /* 2 bytes (×100) */
uint8_t rain_rate; /* 1 byte */
uint8_t rain_size; /* 1 byte (×10) */
uint16_t solar_irradiance; /* 2 bytes */
uint8_t uv_index; /* 1 byte */
};
20 bytes (plus typically 4 bytes of header for station ID and sequence, = 24 bytes total). The packed C struct is the closest generic competitor in size. However, it wastes bits on byte alignment (the boolean charging flag consumes 8 bits instead of 1, battery level uses 8 bits instead of 5, etc.) and has no presence flag mechanism: all fields are always transmitted regardless of whether they have changed or are relevant.
The generic formats above (JSON, CBOR, Protobuf, MessagePack) are general-purpose serialisation formats. The following libraries and formats were designed specifically for IoT sensor telemetry or for bit-efficient encoding on constrained devices.
CayenneLPP (Cayenne Low Power Payload) is the de facto standard payload format for LoRaWAN sensor devices, natively supported by The Things Network and myDevices Cayenne. It uses a byte-aligned TLV structure where each measurement is prefixed with a 1-byte channel identifier and a 1-byte IPSO-derived type code.
CayenneLPP defines standard types for temperature (2 bytes, 0.1°C), barometric pressure (2 bytes, 0.1 hPa), relative humidity (1 byte, 0.5%), luminosity (2 bytes, 1 lux), and GPS (9 bytes). Fields without a standard type must use the generic analog input (2 bytes, 0.01 resolution).
Encoding the test payload:
| Field | CayenneLPP type | Bytes (ch+type+data) |
|---|---|---|
| Battery level | Analog Input | 4 |
| Battery charging | Digital Input | 3 |
| Link RSSI | Analog Input | 4 |
| Link SNR | Analog Input | 4 |
| Temperature | Temperature | 4 |
| Pressure | Barometric Press. | 4 |
| Humidity | Rel. Humidity | 3 |
| Wind speed | Analog Input | 4 |
| Wind direction | Analog Input | 4 |
| Wind gust | Analog Input | 4 |
| Rain rate | Analog Input | 4 |
| Rain size | Analog Input | 4 |
| Solar irradiance | Luminosity | 4 |
| UV index | Analog Input | 4 |
~54 bytes. No header (station ID and sequencing are provided by LoRaWAN). Adding an equivalent 4-byte header for a fair comparison gives ~58 bytes.
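The total above follows from the per-field rule of one channel byte, one type byte, and the type's data bytes. The sketch below reproduces that arithmetic; the enum and function names are illustrative assumptions and do not come from the CayenneLPP library.

```c
#include <stddef.h>

/* Data sizes in bytes for the CayenneLPP types used in the table above. */
enum {
    LPP_DIGITAL_INPUT = 1,
    LPP_ANALOG_INPUT  = 2,
    LPP_LUMINOSITY    = 2,
    LPP_TEMPERATURE   = 2,
    LPP_HUMIDITY      = 1,
    LPP_BAROMETER     = 2
};

/* One entry per field of the test payload, in table order. */
static const unsigned char test_payload_data_sizes[] = {
    LPP_ANALOG_INPUT,  /* battery level    */
    LPP_DIGITAL_INPUT, /* battery charging */
    LPP_ANALOG_INPUT,  /* link RSSI        */
    LPP_ANALOG_INPUT,  /* link SNR         */
    LPP_TEMPERATURE,   /* temperature      */
    LPP_BAROMETER,     /* pressure         */
    LPP_HUMIDITY,      /* humidity         */
    LPP_ANALOG_INPUT,  /* wind speed       */
    LPP_ANALOG_INPUT,  /* wind direction   */
    LPP_ANALOG_INPUT,  /* wind gust        */
    LPP_ANALOG_INPUT,  /* rain rate        */
    LPP_ANALOG_INPUT,  /* rain size        */
    LPP_LUMINOSITY,    /* solar irradiance */
    LPP_ANALOG_INPUT   /* UV index         */
};

/* Each field costs 2 bytes of overhead (channel + type) plus its data. */
static size_t lpp_encoded_size(const unsigned char *data_sizes, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += 2u + data_sizes[i];
    return total;
}
```

Of the 54 bytes, 28 are channel and type overhead (14 fields at 2 bytes each) and 26 are measurement data.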
CayenneLPP's strengths are its native integration with the LoRaWAN ecosystem
(The Things Network decodes CayenneLPP payloads automatically, with no custom
code) and its simplicity (a C++ Arduino library with lpp.addTemperature()
calls). Its weaknesses are the 2-byte overhead per field (channel + type), byte
alignment that wastes bits, no sub-byte field packing, and reliance on the
generic analog input type for any measurement not in the IPSO standard set —
which loses semantic meaning at the gateway.
The CayenneLPP constructor (CayenneLPP lpp(51)) uses malloc for its internal
buffer. The library is Arduino/C++ and is not straightforwardly portable to
bare-metal C on Class 1/2 MCUs.
Nanopb is a widely-used Protocol Buffers implementation targeting embedded
systems, written in ANSI C. It supports static allocation (no malloc at
runtime when configured with .options files), compiles to 2–10 KB ROM and ~300
bytes–1 KB RAM, and runs on STM32, AVR, ARM Cortex-M, and Linux.
The wire format is standard Protobuf: varint-encoded field tags and byte-aligned varint/fixed-width values. For the test payload, the encoding size is identical to the generic Protobuf figure:
~42 bytes (flattened message), ~52 bytes (nested messages).
Nanopb's strengths are its maturity (widely deployed, well-tested), its
interoperability with the broader Protobuf ecosystem (the same .proto schema
can generate code for C, Python, Go, Java, etc.), and its static allocation
mode. Its weaknesses for this use case are the requirement for a code generator
(protoc plus a Python plugin), byte-aligned encoding (no sub-byte fields), and
the overhead of per-field tags — which exist to support schema evolution but are
redundant in a closed deployment where both sides know the schema.
Nanopb is the strongest alternative for deployments that require interoperability with cloud services or multi-language environments where the Protobuf ecosystem is already established. For closed LoRa deployments where every bit matters, the 2.5–3× size overhead relative to iotdata is significant.
Bitproto is a bit-level serialisation format with a Protobuf-like schema
language. It is the closest existing tool to iotdata's approach: fields are
specified at arbitrary bit widths (uint3, uint5, uint11, etc.), and the
generated C encoder/decoder uses zero dynamic allocation and copies bits between
structures and buffers without padding or gaps.
If the test payload were defined in Bitproto with the same bit widths as iotdata (5-bit battery, 1-bit charging, 4-bit RSSI, etc.), the data portion would occupy a similar number of bits. However, Bitproto adds a 2-byte message size header when extensibility is enabled, and does not provide a built-in header (variant, station ID, sequence) or presence flags.
Encoding the test payload with equivalent bit widths:
~14 bytes (data bits only, no extensibility header), ~16 bytes (with 2-byte size header). Adding the equivalent 4-byte iotdata header and a 1-byte presence byte gives ~21 bytes — comparable to iotdata's 16 bytes, with the difference coming from the lack of presence flags (all fields are always transmitted) and the size header.
Bitproto's strengths are its bit-level granularity (identical to iotdata in principle), zero dynamic allocation in C, and the availability of a code generator for C, Go, and Python. Its limitations for this use case are:
- Little-endian only. Bitproto encodes in little-endian byte order, requiring byte-swap logic on big-endian platforms (some PIC configurations, network-order protocols). iotdata is endian-agnostic via explicit bit-by-bit packing.
- No presence flags. Every field defined in the schema is always encoded. There is no mechanism equivalent to iotdata's presence bytes, so a battery-only packet is the same size as a full-telemetry packet.
- No variable-length fields. All types are fixed-width. Bitproto cannot express variable-length constructs such as iotdata's TLV section or the Air Quality PM/Gas fields with per-channel presence masks.
- No quantisation. Bitproto packs raw bit fields; the quantisation (mapping physical values to reduced-bit representations) must be implemented by the application. iotdata defines the quantisation as part of the protocol.
- Requires a code generator. The bitproto compiler (Python) must be run to generate C source files from .bitproto schema files. iotdata's reference implementation is self-contained C with no code generation step.
- No domain awareness. Bitproto is a generic bit-packing tool. It has no concept of sensor types, field bundles, or variant maps.
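The effect of presence flags on packet size, which is the main structural difference from Bitproto noted above, can be sketched as follows. The field widths are those of the worked example in this appendix; the assignment of presence bits 0 to 5 to particular fields is an illustrative assumption (the normative mapping is the variant table), and `packet_bits` / `packet_bytes` are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>

#define HEADER_BITS   32u
#define PRESENCE_BITS  8u
#define FIELD_COUNT    6u

/* Widths in bits: battery, link, environment, wind, rain, solar.
 * The bit position given to each field here is illustrative only. */
static const uint8_t field_bits[FIELD_COUNT] = { 6, 6, 24, 22, 12, 14 };

/* Total packet size in bits for a given Presence Byte 0 mask. */
static size_t packet_bits(uint8_t presence)
{
    size_t bits = HEADER_BITS + PRESENCE_BITS;
    for (unsigned i = 0; i < FIELD_COUNT; i++)
        if (presence & (1u << i))
            bits += field_bits[i];
    return bits;
}

/* Size on the wire after zero-padding the final byte. */
static size_t packet_bytes(uint8_t presence)
{
    return (packet_bits(presence) + 7u) / 8u;
}
```

Under these assumptions a battery-only packet is 46 bits (6 bytes) while the full test payload is 124 bits (16 bytes). A schema without presence flags would send the larger size every time.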
TinyCBOR (developed by Intel) and QCBOR (developed by Qualcomm/Laurence
Lundblade) are embedded-focused CBOR implementations. Both avoid malloc in
their core encode/decode paths (TinyCBOR operates on caller-provided buffers;
QCBOR uses a similar model with richer error handling). QCBOR is approximately
25% larger in code size than TinyCBOR but provides more complete CBOR support.
The wire format is standard CBOR (RFC 8949), so the encoding size for the test payload is the same as the generic CBOR figure: ~66 bytes. The advantage of these implementations over generic CBOR libraries is their suitability for embedded targets: no heap allocation, bounded stack usage, and small code footprint (~4–8 KB for TinyCBOR, ~10–15 KB for QCBOR).
CBOR's self-describing nature (every value carries its type and length) is the opposite of iotdata's approach. This makes CBOR ideal for schemaless systems where the receiver may not know the sender's data model, but it makes the structured, schema-known payloads that iotdata targets roughly 4× larger.
| Encoding | Bytes | Ratio vs iotdata | Presence flags | Self-describing | Byte-aligned | Code gen required |
|---|---|---|---|---|---|---|
| iotdata | 16 | 1.0× | ✓ | ✗ | ✗ | ✗ |
| Bitproto† | ~16 | ~1.0× | ✗ | ✗ | ✗ | ✓ |
| Raw C struct | 24 | 1.5× | ✗ | ✗ | ✓ | ✗ |
| Protobuf / Nanopb | 42 | 2.6× | ✓* | Partial | ✓ | ✓ |
| CayenneLPP | 54 | 3.4× | ✗ | ✓ | ✓ | ✗ |
| MessagePack | 62 | 3.9× | ✗ | ✓ | ✓ | ✗ |
| CBOR / TinyCBOR | 66 | 4.1× | ✗ | ✓ | ✓ | ✗ |
| JSON (compact) | 261 | 16.3× | ✗ | ✓ | ✓ | ✗ |
*Protobuf omits default/zero values, which functions as implicit presence for non-zero fields.
†Bitproto data fields only, with equivalent bit widths. With the addition of an iotdata-equivalent header (4 bytes) and extensibility header (2 bytes), the total rises to ~22 bytes. Bitproto does not support presence flags, so this figure applies to every packet regardless of which fields have changed.
The size advantage of iotdata comes from three sources:
- Sub-byte field packing. Fields are packed to the exact number of bits required. A boolean is 1 bit, a battery level is 5 bits, a wind direction is 8 bits. Only Bitproto matches this capability among the compared formats.
- No per-field metadata. There are no field tags, type indicators, length prefixes, or key strings. The field layout is determined entirely by the variant table, which both sides know at compile time. CayenneLPP's 2-byte per-field overhead (channel + type) and Protobuf's varint field tags exist to support self-description and schema evolution; these are valuable properties, but expensive when every bit counts.
- Quantisation to operational resolution. Temperature is quantised to 0.25°C steps, fitting in 9 bits. A Protobuf float or CBOR float uses 32 bits for the same value. The quantisation is chosen to be within or below the sensor's own measurement accuracy, so no operationally useful information is lost.
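As an illustration of the quantisation point, the sketch below maps a temperature to a 9-bit code in 0.25°C steps and back. The lower bound of -40°C is an assumption made for this example (Section 8.3 is normative for the actual range and rounding), and the function names are hypothetical.

```c
#include <stdint.h>

#define TEMP_MIN_C     (-40.0f) /* assumed lower bound, not from the spec */
#define TEMP_STEP_C    0.25f    /* resolution stated above */
#define TEMP_MAX_CODE  511u     /* largest 9-bit value */

/* Map degrees Celsius to a 9-bit code, clamping out-of-range inputs. */
static uint16_t temp_quantise(float celsius)
{
    float steps = (celsius - TEMP_MIN_C) / TEMP_STEP_C;
    if (steps <= 0.0f)
        return 0u;
    if (steps >= (float)TEMP_MAX_CODE)
        return TEMP_MAX_CODE;
    return (uint16_t)(steps + 0.5f); /* round to nearest step */
}

/* Recover the physical value represented by a code. */
static float temp_dequantise(uint16_t code)
{
    return TEMP_MIN_C + (float)code * TEMP_STEP_C;
}
```

With this assumed range, the test payload's 21.5°C becomes code 246 and decodes back to exactly 21.5°C, using 9 bits where an IEEE 754 single-precision float would use 32.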
The trade-off is the loss of self-description. An iotdata packet cannot be decoded without prior knowledge of the variant table. For the target use case — closed deployments where the operator controls all devices — this is acceptable. For open or interoperable systems, a self-describing format like CBOR or Protobuf would be more appropriate, at the cost of 3–4× larger payloads. The CayenneLPP format occupies an intermediate position: it is self-describing and has LoRaWAN ecosystem support, but at 3.4× the size of iotdata, the cost is significant for duty-cycle-constrained deployments.
Alternatively, the specific variant table could be determined from the device's VERSION or CONFIG information, as transmitted at startup.
The preceding sections compare iotdata against alternative encodings — they answer "what else could I use to pack sensor telemetry onto a wire?" This section addresses a different question: "as an embedded C library designed to work from Cortex-M0 to Linux, how does iotdata's architecture compare to the best-practice embedded libraries that solve other problems?"
The embedded C ecosystem contains several libraries that are widely regarded as exemplars of portable, resource-conscious design. Although they address entirely different domains — networking, filesystems, compression, cryptography, parsing — they share a set of architectural principles that iotdata also follows. This comparison positions iotdata's design choices within that tradition and identifies where iotdata conforms to, or deviates from, established practice.
lwIP (lightweight IP) is an open-source TCP/IP stack created by Adam
Dunkels, targeting embedded systems with tens of kilobytes of RAM. It is used by
Espressif (ESP32/ESP-IDF), STMicroelectronics, NXP, Xilinx, and many others.
Code size is approximately 40 KB ROM; RAM usage is 10–30 KB depending on
configuration. Configuration is via a user-provided lwipopts.h header that
selects protocols, buffer sizes, and API style at compile time.
littlefs is a fail-safe filesystem designed for microcontrollers and NOR/NAND flash. It provides power-loss resilience, dynamic wear levelling, and bad block detection while maintaining strictly bounded RAM and ROM usage. It avoids recursion, limits dynamic memory to configurable caller-provided buffers, and at no point stores an entire storage block in RAM.
heatshrink is an LZSS-based compression library for embedded and real-time systems. It operates with as little as 50 bytes of RAM, processes data incrementally in arbitrarily small chunks, supports both static and dynamic allocation, and separates the encoder and decoder into independently compilable units. iotdata already uses heatshrink for image compression (Section 8.27.2).
Mbed TLS (formerly PolarSSL) is a C implementation of TLS/DTLS and
cryptographic primitives. Its minimum TLS stack requires under 60 KB ROM and
under 64 KB RAM. It is highly modular: individual cryptographic algorithms can
be used independently of the TLS stack. Feature selection is via a compile-time
configuration header (mbedtls_config.h).
wolfSSL / wolfCrypt is an embedded SSL/TLS library written in ANSI C, targeting RTOS and resource-constrained environments. With the LeanPSK configuration, it compiles to as little as 20 KB. It supports extensive hardware cryptographic acceleration and compile-time algorithm selection.
minmea is a minimalistic GPS NMEA 0183 parser in pure ISO C99. It consists of a single source file and header, uses no dynamic memory allocation, performs no floating-point arithmetic in the core library (offering both fixed-point and float output), and runs on embedded ARM, Linux, macOS, and Windows.
The following table maps six architectural principles common to high-quality embedded C libraries against the reference libraries and iotdata.
| Principle | lwIP | littlefs | heatshrink | Mbed TLS | minmea | iotdata |
|---|---|---|---|---|---|---|
| Compile-time modularity | lwipopts.h | Config struct | heatshrink_config.h | mbedtls_config.h | Linker GC | IOTDATA_ENABLE_* |
| Zero malloc / caller buffers | Pool allocator | Caller-provided | Static or dynamic | Caller-provided | Stack only | Stack only, no malloc |
| Integer-only capability | N/A (networking) | Yes (all integer) | Yes (all integer) | Yes (bignum library) | Core is integer-only | IOTDATA_NO_FLOATING |
| Separable components | Raw/Netconn/Socket | Single unit | Encode ≠ Decode | Crypto ≠ TLS ≠ X.509 | Single unit | Encode ≠ Decode ≠ JSON |
| Platform abstraction | OS emulation layer | Block device API | None needed | Platform ALT layer | Compat headers | <stdint.h> only |
| Bounded resource usage | Configurable pools | No recursion, bounded RAM | Incremental, bounded CPU | Configurable buffers | Fixed struct sizes | Compile-time-known sizes |
Compile-time modularity. The ability to include only the features a
deployment needs, with the compiler eliminating unused code. lwIP pioneered this
with lwipopts.h, a user-created header that overrides defaults for every
tunable parameter. Mbed TLS uses the same pattern with mbedtls_config.h.
iotdata follows this tradition with IOTDATA_ENABLE_SELECTIVE and per-field
IOTDATA_ENABLE_* defines, plus functional subsetting (IOTDATA_NO_DECODE,
IOTDATA_NO_JSON, etc.). The result is that a minimal iotdata encoder
(battery + environment only) compiles to 768 bytes on ESP32-C3 — comparable to
heatshrink's decoder at ~1 KB on AVR.
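The general pattern of compile-time field selection can be sketched as below. The `DEMO_ENABLE_*` macros and the `demo_*` names are hypothetical stand-ins chosen for this example; they are not libiotdata's actual defines, and the field widths are taken from the worked example elsewhere in this document.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical build configuration: a battery + environment sensor node.
 * In practice these would come from the build system or a config header. */
#define DEMO_ENABLE_BATTERY     1
#define DEMO_ENABLE_ENVIRONMENT 1
/* #define DEMO_ENABLE_WIND     1 */

typedef struct {
    const char *name;
    uint8_t     bits;
} demo_field_t;

/* Only enabled fields are compiled into the table, so code and data for
 * disabled fields never reach the binary. At least one must be enabled. */
static const demo_field_t demo_fields[] = {
#ifdef DEMO_ENABLE_BATTERY
    { "battery", 6 },
#endif
#ifdef DEMO_ENABLE_ENVIRONMENT
    { "environment", 24 },
#endif
#ifdef DEMO_ENABLE_WIND
    { "wind", 22 },
#endif
};

#define DEMO_FIELD_COUNT (sizeof demo_fields / sizeof demo_fields[0])

/* Sum of the payload bits for the fields compiled into this build. */
static size_t demo_payload_bits(void)
{
    size_t bits = 0;
    for (size_t i = 0; i < DEMO_FIELD_COUNT; i++)
        bits += demo_fields[i].bits;
    return bits;
}
```

The preprocessor removes the wind entry entirely, so no linker garbage collection is needed to drop it.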
Zero malloc / caller-provided buffers. Dynamic memory allocation is avoided
or prohibited because it introduces fragmentation, non-deterministic timing, and
failure modes that are unacceptable in embedded systems (particularly those
governed by MISRA C or similar standards). littlefs achieves this by requiring
the caller to provide a configuration struct with buffer pointers. heatshrink
supports both modes — static allocation for embedded, dynamic for convenience on
hosted platforms. iotdata's encode and decode paths perform no malloc or
free calls; the iotdata_encoder_t context is allocated on the caller's stack
or as a static variable. The only heap allocation in the library is within the
JSON conversion functions (cJSON_CreateObject), which are gateway-only and
excluded from embedded builds via IOTDATA_NO_JSON.
Integer-only capability. Many Class 1 and Class 2 MCUs lack a hardware FPU.
Software floating-point emulation adds 2–5 KB of code and ~50–100 cycles per
operation. minmea addresses this by keeping its core parser integer-only, using
a struct minmea_float that stores values as a numerator/denominator pair
(int_least32_t) and offering an explicit minmea_tocoord() conversion for
callers that need floating-point output. iotdata's IOTDATA_NO_FLOATING mode
follows the same philosophy: all values are passed as scaled integers
(temperature as centidegrees, position as degrees×10⁷), eliminating all
floating-point dependencies.
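A scaled-integer version of temperature quantisation, in the spirit of the integer-only mode described above, might look like the following. It is a sketch only: the -40.00°C lower bound and the function names are assumptions for illustration, and the reference implementation remains normative for the real behaviour.

```c
#include <stdint.h>

#define TEMP_MIN_CENTI   (-4000) /* assumed -40.00 degC lower bound */
#define TEMP_STEP_CENTI  25      /* 0.25 degC per step */
#define TEMP_MAX_CODE    511     /* largest 9-bit value */
#define TEMP_MAX_CENTI   (TEMP_MIN_CENTI + TEMP_MAX_CODE * TEMP_STEP_CENTI)

/* Quantise a temperature given in centidegrees using integers only. */
static uint16_t temp_quantise_centi(int32_t centi)
{
    /* Clamp first so the subtraction below cannot overflow. */
    if (centi <= TEMP_MIN_CENTI)
        return 0u;
    if (centi >= TEMP_MAX_CENTI)
        return (uint16_t)TEMP_MAX_CODE;

    int32_t rel = centi - TEMP_MIN_CENTI;
    /* Adding half a step (truncated) rounds to the nearest code. */
    return (uint16_t)((rel + TEMP_STEP_CENTI / 2) / TEMP_STEP_CENTI);
}

/* Recover centidegrees from a code. */
static int32_t temp_dequantise_centi(uint16_t code)
{
    return TEMP_MIN_CENTI + (int32_t)code * TEMP_STEP_CENTI;
}
```

The whole path is one subtraction, one addition, and one division by a constant, with no dependency on a software floating-point library.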
Separable components. heatshrink's encoder and decoder are independently
compilable — an embedded device that only compresses data need not include the
decompressor. Mbed TLS separates into three libraries: libtfpsacrypto (raw
cryptographic primitives), libmbedx509 (certificate handling), and
libmbedtls (TLS/DTLS protocol). An application that needs only AES encryption
links against the crypto library alone. iotdata provides the same separation:
IOTDATA_NO_DECODE excludes the decoder, IOTDATA_NO_JSON excludes JSON, and
IOTDATA_NO_PRINT / IOTDATA_NO_DUMP exclude diagnostic output. An
encoder-only build for a sensor node includes none of the decoder, print, dump,
or JSON machinery.
Platform abstraction without OS dependency. lwIP defines an operating system
emulation layer (sys_arch) that provides semaphores, mailboxes, and threads,
with a bare-metal implementation that uses polling. littlefs abstracts storage
behind a block device API (lfs_config with read/prog/erase function pointers).
iotdata has the lightest abstraction of all: it depends only on <stdint.h>,
<stdbool.h>, <stddef.h>, and optionally <math.h> (for round()/floor()
in floating-point mode). No OS services, no file I/O, no timers, no threading.
The bit-packing core operates on a caller-provided uint8_t buffer and a bit
offset — it is portable to any byte-addressable architecture.
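A bit-packing primitive of the kind described above can be sketched in a few lines. `bits_write` is a hypothetical name and signature, not libiotdata's API, and MSB-first bit ordering is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Write the low `width` bits of `value`, most significant bit first, into
 * `buf` starting at *bit_offset, then advance the offset. Works one bit at
 * a time, so the result does not depend on host endianness.
 * Returns false if the write would not fit in the buffer. */
static bool bits_write(uint8_t *buf, size_t buf_len, size_t *bit_offset,
                       uint32_t value, unsigned width)
{
    if (width == 0 || width > 32 || *bit_offset + width > buf_len * 8u)
        return false;

    for (unsigned i = 0; i < width; i++) {
        size_t  pos  = *bit_offset + i;
        uint8_t mask = (uint8_t)(0x80u >> (pos % 8u));

        if ((value >> (width - 1u - i)) & 1u)
            buf[pos / 8u] |= mask;
        else
            buf[pos / 8u] &= (uint8_t)~mask;
    }
    *bit_offset += width;
    return true;
}
```

Assuming the header is laid out as variant (4 bits), station ID (12 bits), and sequence (16 bits) in that order, writing variant 0, station 42, and sequence 50000 yields the bytes 00 2A C3 50, which match the leading bytes of the example packet in the encoding comparison. Section 5 remains the normative description of the header.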
Bounded, predictable resource usage. littlefs guarantees that its RAM
consumption is bounded regardless of filesystem size — it never stores an entire
flash block in RAM, avoids recursion (which would produce data-dependent stack
growth), and limits dynamic memory to configurable buffers. heatshrink processes
data in incremental chunks with bounded CPU time per call, making it suitable
for hard real-time contexts. iotdata's encode_end() function performs a single
linear pass over the variant field table; its execution time is proportional to
the number of present fields (maximum ~300 bit operations for all 12 fields),
with no data-dependent loops except TLV string encoding.
iotdata sits at an intersection that is unusual in the embedded library
landscape. Most IoT encoding libraries (CayenneLPP, Nanopb) do not follow all of
the embedded design principles above — CayenneLPP uses malloc, Nanopb requires
a code generator and an external toolchain dependency. Conversely, most
libraries that rigorously follow these principles (lwIP, littlefs, heatshrink)
are not encoding libraries — they solve networking, storage, or compression
problems.
iotdata applies the architectural discipline of the best embedded C libraries to the specific problem of sensor telemetry encoding. The design choices (compile-time field selection, caller-provided buffers, integer-only mode, separable encode/decode, no OS dependency, bounded resource usage) are individually unremarkable: they are standard practice in the embedded ecosystem. Their combination in a sensor telemetry protocol is less common, because the IoT encoding space has historically been dominated by formats designed for flexibility and interoperability (Protobuf, CBOR, CayenneLPP) rather than for the architectural constraints that govern embedded library design.
This positioning has costs. iotdata lacks the ecosystem integration of CayenneLPP (no native TTN/Cayenne support), the schema evolution guarantees of Protobuf (no field tags, no wire-level versioning), the self-description of CBOR (packets cannot be decoded without the variant table), and the multi-language code generation of Bitproto and Nanopb (the reference implementation is C only). These trade-offs are deliberate: they are the price of a library that compiles to 768 bytes on a RISC-V MCU and produces 16-byte packets for a full weather station reading.
The size difference has direct operational consequences on LoRa:
| Encoding | Bytes | Airtime (SF7/125kHz) | Airtime (SF10/125kHz) | Fits SF12? |
|---|---|---|---|---|
| iotdata | 16 | ~36 ms | ~247 ms | ✓ |
| Bitproto† | ~22 | ~41 ms | ~289 ms | ✓ |
| C struct | 24 | ~46 ms | ~370 ms | ✓ |
| Protobuf | 42 | ~72 ms | ~617 ms | ✓* |
| CayenneLPP | 54 | ~92 ms | ~781 ms | ✗ |
| CBOR | 66 | ~107 ms | ~925 ms | ✗ |
| JSON | 261 | ~369 ms | ✗ | ✗ |
*Protobuf at 42 bytes fits the SF12/125kHz maximum payload of 51 bytes, but with minimal room for header overhead.
†Bitproto with iotdata-equivalent header and extensibility header.
At SF10 with a 1% duty cycle, the minimum transmission interval for iotdata is 25 seconds. For CayenneLPP, it is 78 seconds — over 3× longer between transmissions for the same regulatory budget. For battery-powered sensors where radio transmission dominates power consumption, this difference translates directly to battery life.
CayenneLPP's 54-byte payload exceeds the SF12/125kHz maximum of 51 bytes, meaning it cannot be used at the highest spreading factor for the test payload without dropping fields. iotdata's 16 bytes fit comfortably at all spreading factors, with headroom for TLV data even at SF12.
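The duty-cycle intervals quoted above follow from dividing airtime by the permitted duty cycle. A minimal sketch of that arithmetic, using a hypothetical helper name and expressing the duty cycle in parts per thousand to stay in integers:

```c
#include <stdint.h>

/* Minimum spacing between transmission starts, in milliseconds, such that
 * airtime / interval does not exceed the duty cycle. The duty cycle is
 * given in permille (10 = 1%). Returns UINT32_MAX for a zero duty cycle. */
static uint32_t min_interval_ms(uint32_t airtime_ms, uint32_t duty_permille)
{
    if (duty_permille == 0)
        return UINT32_MAX;

    /* Round up so the duty-cycle budget is never exceeded. */
    uint64_t scaled = (uint64_t)airtime_ms * 1000u;
    return (uint32_t)((scaled + duty_permille - 1u) / duty_permille);
}
```

With the SF10 airtimes from the table, a 1% duty cycle gives 24,700 ms for iotdata (247 ms airtime) and 78,100 ms for CayenneLPP (781 ms airtime), which round to the 25 and 78 second figures above.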
This appendix documents known limitations, unresolved design questions, and areas where the current specification reflects engineering judgement rather than systematic analysis. These items are recorded to inform implementers of known risks and to scope the work required before the protocol advances beyond its current pre-release status.
The protocol is currently at version 0.90 (alpha/beta). Breaking changes to field encodings, header layout, and quantisation parameters are expected before the protocol is finalised as version 1.0. Implementers SHOULD be aware that deploying the current specification may require firmware updates when these issues are resolved. The items in this appendix will be systematically addressed — resolved, accepted with justification, or deferred — before the protocol is submitted as an Internet-Draft.
Issue: The pressure field (Section 8.3, 8.10) encodes 850–1105 hPa in 8 bits at 1 hPa resolution. This range excludes altitudes above approximately 1,500 metres, where standard atmospheric pressure falls below 850 hPa.
Impact: Deployments at moderate altitude are affected. A weather station at 1,600 metres (e.g. Davos, Switzerland, ~840 hPa) cannot encode its pressure readings. Mountain agriculture, ski resort monitoring, and high-altitude research stations — all plausible use cases for this protocol — are excluded by the current range.
Context: The BME280 sensor, explicitly named in the specification, measures 300–1100 hPa. The protocol's range covers only the upper third of the sensor's capability.
Options under consideration:
| Option | Bits | Range | Resolution | Notes |
|---|---|---|---|---|
| Current | 8 | 850–1105 hPa | 1 hPa | Excludes altitudes above ~1,500m |
| Wider range | 10 | 300–1100 hPa | ~0.8 hPa | Covers full BME280 range; 2 extra bits |
| Wider, coarser | 8 | 300–1100 hPa | ~3.1 hPa | Same bit width; resolution exceeds BME280 ±1 hPa accuracy |
| Wider, compromise | 9 | 540–1066 hPa | 1 hPa | Covers altitudes to ~5,000m; 1 extra bit |
Recommendation: This is a likely breaking change before v1.0. The current range is too restrictive for the protocol's stated deployment scenarios.
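The resolution figures for the linear options, and the saturation behaviour of the current range, can be checked with a short sketch. The helper names are hypothetical, and clamping at the range limits is an assumption about encoder behaviour made for this example.

```c
#include <stdint.h>

/* Step size in milli-hPa for a linear quantiser that spans
 * [min_hpa, max_hpa] with 2^bits codes. Expects 1 <= bits <= 31. */
static uint32_t step_millihpa(uint32_t min_hpa, uint32_t max_hpa, unsigned bits)
{
    uint32_t steps = (1u << bits) - 1u;
    return ((max_hpa - min_hpa) * 1000u + steps / 2u) / steps;
}

/* Current 8-bit encoding: 850 to 1105 hPa at 1 hPa per code.
 * Readings outside the range are assumed to clamp to the nearest limit. */
static uint8_t pressure_encode_current(uint32_t hpa)
{
    if (hpa <= 850u)
        return 0u;
    if (hpa >= 1105u)
        return 255u;
    return (uint8_t)(hpa - 850u);
}
```

Spreading 300 to 1100 hPa over 10 bits gives about 0.78 hPa per step, and over 8 bits about 3.14 hPa. With the current range, a reading of 840 hPa encodes to the same code as 850 hPa, which is the loss described for stations above roughly 1,500 metres.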
Issue: Wind speed (Section 8.12, 8.13) is encoded as 0–63.5 m/s in 7 bits at 0.5 m/s resolution. The maximum is roughly twice the Beaufort 12 (hurricane force) threshold of 32.7 m/s, but it is still below the strongest observed winds: hurricane and cyclone wind speeds reach 85+ m/s, and tornado wind speeds exceed 100 m/s.
Impact: Weather stations that survive extreme wind events will saturate the field. The encoder clamps to 63.5 m/s, and the receiver cannot distinguish "63.5 m/s" from "90 m/s." For stations deployed specifically to monitor severe weather, this means the peak readings of greatest interest are lost.
Analysis needed: Whether the 0.5 m/s resolution is operationally justified. Most consumer and semi-professional anemometers (Davis Vantage Pro2, Inspeed Vortex) have ±1 m/s accuracy or ±5% of reading. At 10 m/s, ±5% is ±0.5 m/s, so the 0.5 m/s resolution is at the sensor's accuracy limit. At 30 m/s, ±5% is ±1.5 m/s, and the 0.5 m/s resolution is well below the noise floor.
Options under consideration:
| Option | Bits | Range | Resolution | Notes |
|---|---|---|---|---|
| Current | 7 | 0–63.5 m/s | 0.5 m/s | Saturates in the strongest hurricanes and in tornadoes |
| Extended range | 8 | 0–127.5 m/s | 0.5 m/s | Covers all terrestrial wind; 1 extra bit |
| Extended, coarser | 7 | 0–127 m/s | 1.0 m/s | Same bit width; resolution matches sensors |
| Non-linear | 7 | 0–127+ m/s | Variable | Finer below 30 m/s, coarser above; complex |
The same analysis applies to the wind gust field (Section 8.15), which shares the encoding.
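The saturation described above can be illustrated with a small integer-only sketch of the current 7-bit linear encoding. Speeds are passed in tenths of a metre per second; the function names and the round-to-nearest behaviour are assumptions for this example, with Section 8.12 being normative.

```c
#include <stdint.h>

#define WIND_STEP_DMPS  5u   /* 0.5 m/s expressed in tenths of m/s */
#define WIND_MAX_CODE   127u /* largest 7-bit value */

/* Encode a wind speed given in tenths of m/s, clamping at 63.5 m/s. */
static uint8_t wind_encode(uint32_t speed_dmps)
{
    if (speed_dmps >= WIND_MAX_CODE * WIND_STEP_DMPS)
        return (uint8_t)WIND_MAX_CODE;
    return (uint8_t)((speed_dmps + WIND_STEP_DMPS / 2u) / WIND_STEP_DMPS);
}

/* Decode back to tenths of m/s. */
static uint32_t wind_decode(uint8_t code)
{
    return (uint32_t)code * WIND_STEP_DMPS;
}
```

The test payload's 5.0 m/s and 8.5 m/s become codes 10 and 17. Inputs of 63.5 m/s and 90 m/s both become code 127, so the receiver sees 63.5 m/s in either case.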
Issue: All field encodings use linear quantisation, allocating equal resolution across the entire range. Many environmental measurements have heavily right-skewed distributions where most readings cluster at low values with rare high excursions.
Affected fields:
- Wind speed: Most readings are 0–10 m/s; extreme values above 30 m/s are rare but operationally significant.
- Rain rate: Most readings are 0–10 mm/hr; extreme events reach 100+ mm/hr.
- Radiation CPM: Background is 10–50 CPM; elevated readings are 100+; emergency-level readings are 1,000+.
- Radiation dose: Background is 0.05–0.20 µSv/h; the 0.01 µSv/h resolution provides only 5–20 distinguishable levels in the normal background range, while 16,363 of the 16,383 steps cover values that will never be observed in normal operation.
- Air quality PM: Background PM2.5 in clean environments is 0–10 µg/m³; the 5 µg/m³ resolution provides only 2 distinguishable levels (0 and 5) in this range.
Analysis needed: For each field, characterise the real-world distribution of values (from public meteorological and environmental datasets) and compute the information content per quantisation step. Compare linear quantisation against logarithmic, square-root, and piecewise-linear alternatives in terms of mean-squared quantisation error for typical measurement distributions.
Trade-off: Non-linear quantisation improves effective resolution in common value ranges but adds implementation complexity. Linear quantisation requires one multiply and one add; logarithmic quantisation requires a lookup table or a log/exp computation. On Class 1 MCUs (Section E.1), this is a meaningful cost. Additionally, non-linear quantisation violates the principle of least surprise — a user examining the raw quantised value cannot easily estimate the physical value without consulting the transfer function.
Current position: Linear quantisation is retained in v0.90 for simplicity. A systematic analysis of the quantisation error budget against real sensor data is planned before v1.0.
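To make the trade-off concrete, the following is one possible piecewise-linear scheme for a 7-bit wind speed field. It is purely hypothetical and is not a proposal from this specification: the 30 m/s breakpoint, the 0.5 m/s and 1.5 m/s step sizes, and all names are assumptions chosen for the example. Speeds are in tenths of a metre per second.

```c
#include <stdint.h>

#define PW_BREAK_DMPS   300u /* 30 m/s: boundary between the two segments */
#define PW_FINE_STEP    5u   /* 0.5 m/s steps below the breakpoint */
#define PW_COARSE_STEP  15u  /* 1.5 m/s steps above the breakpoint */
#define PW_BREAK_CODE   60u  /* code that represents exactly 30 m/s */
#define PW_MAX_CODE     127u
#define PW_MAX_DMPS     (PW_BREAK_DMPS + (PW_MAX_CODE - PW_BREAK_CODE) * PW_COARSE_STEP)

/* Encode tenths of m/s into 7 bits: fine below 30 m/s, coarse above. */
static uint8_t wind_encode_piecewise(uint32_t v)
{
    if (v >= PW_MAX_DMPS)
        return (uint8_t)PW_MAX_CODE;
    if (v <= PW_BREAK_DMPS)
        return (uint8_t)((v + PW_FINE_STEP / 2u) / PW_FINE_STEP);
    return (uint8_t)(PW_BREAK_CODE +
                     (v - PW_BREAK_DMPS + PW_COARSE_STEP / 2u) / PW_COARSE_STEP);
}

/* Decode a 7-bit code back into tenths of m/s. */
static uint32_t wind_decode_piecewise(uint8_t code)
{
    if (code <= PW_BREAK_CODE)
        return (uint32_t)code * PW_FINE_STEP;
    return PW_BREAK_DMPS + ((uint32_t)code - PW_BREAK_CODE) * PW_COARSE_STEP;
}
```

This keeps 0.5 m/s resolution in the common range and extends the ceiling to 130.5 m/s in the same 7 bits, at the cost of a branch in both directions and a raw code that no longer maps to a physical value by a single multiplication.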
Issue: The datetime field (Section 8.8) uses 24 bits at 5-second resolution, covering 971 days. The 5-second tick was chosen because 24 bits at 1-second resolution covers only 194 days — insufficient for a calendar year.
Observation: The protocol operates at arbitrary bit boundaries throughout. A 25-bit datetime field at 1-second resolution covers 388 days — sufficient for a full year with 23 days of margin. The cost is 1 additional bit per packet when the datetime field is present. For sensors with GNSS time sources (sub-second accurate), the current design degrades accuracy by up to 4 seconds to preserve a round bit count that has no structural significance in a bit-packed protocol.
Counter-argument: The 5-second resolution is already finer than the typical sensor observation cycle (tens of seconds to minutes). For a sensor that wakes, reads, encodes, and transmits once per minute, a ±4 second timestamp error is negligible. The 24-bit width also provides 971-day coverage, which is convenient for resolving year boundaries (Section 11.1): a 25-bit field at 1-second resolution would require the year-boundary algorithm to handle timestamps up to 23 days into the next year rather than 241 days.
Analysis needed: Survey of sensor timing requirements across target use cases. Determine whether any use case benefits materially from 1-second vs. 5-second resolution, given that the protocol does not provide sub-second timestamps in either case.
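The coverage figures in this discussion come from multiplying the number of codes by the tick length. The sketch below reproduces them in tenths of a day and shows the truncation error of a 5-second tick. Helper names are hypothetical, and truncation (rather than rounding) to the tick is an assumption made for the example.

```c
#include <stdint.h>

#define SECONDS_PER_DAY 86400u
#define TICK_SECONDS    5u

/* Coverage of a `bits`-wide counter, in tenths of a day (bits < 64). */
static uint32_t coverage_decidays(unsigned bits, uint32_t tick_seconds)
{
    uint64_t seconds = ((uint64_t)1 << bits) * tick_seconds;
    return (uint32_t)(seconds * 10u / SECONDS_PER_DAY);
}

/* Convert seconds since the epoch of the field to 5-second ticks. */
static uint32_t datetime_to_ticks(uint32_t seconds)
{
    return seconds / TICK_SECONDS;
}

/* Convert ticks back to seconds; up to 4 seconds are lost to truncation. */
static uint32_t ticks_to_datetime(uint32_t ticks)
{
    return ticks * TICK_SECONDS;
}
```

This gives 970.9 days for 24 bits at 5 seconds, 194.1 days for 24 bits at 1 second, and 388.3 days for 25 bits at 1 second. A time of 86,399 seconds decodes as 86,395, the worst-case 4-second error.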
Issue: The 32-bit header allocates 4 bits to variant (16 values), 12 bits to station ID (4,096 values), and 16 bits to sequence (65,536 values). The specification recommends 100–200 nodes per deployment, meaning approximately 95% of the station ID space is typically unused. The sequence field wraps every ~3.8 days at 5-second intervals (65,536 × 5 s = 327,680 s) or ~45 days at 60-second intervals.
Observation: A 24-bit header with 4 bits variant, 8 bits station ID (256 values), and 12 bits sequence (4,096 values before wrap) would save 8 bits per packet — a 17% reduction in minimum packet size (from 46 to 38 bits). The reduced station ID space (256) still exceeds the recommended deployment size. The reduced sequence space wraps every ~5.7 hours at 5-second intervals, which is adequate for dedup (where the relevant window is seconds, not hours) but reduces the gap detection window.
Counter-argument: The 32-bit header aligns to a 4-byte boundary, which simplifies implementation on platforms where the header is read as a single uint32_t. It also provides headroom for larger deployments and longer gap detection windows without protocol changes. The 8-bit saving per packet is meaningful for the shortest packets (battery-only heartbeats) but diminishes in significance for typical 16–32 byte weather station packets.
Analysis needed: Survey of real deployment sizes and transmission intervals to determine whether the current allocation is well-matched or over-provisioned. Consider whether a configurable header format (selected by variant or by a deployment-wide parameter) could serve both small and large deployments without a fixed compromise.
The specification already acknowledges this issue (Section 5, final paragraph) and notes that a reduction is "not contemplated in this version."
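The wrap periods and station ID utilisation discussed in this entry can be checked with a few lines of arithmetic. The sketch below uses hypothetical helper names and assumes a fixed transmission interval:

```c
#include <stdint.h>

/* Seconds until a `bits`-wide sequence counter wraps when one packet is
 * sent every `interval_seconds`. Hypothetical helper for illustration. */
uint64_t sequence_wrap_seconds(unsigned bits, unsigned interval_seconds)
{
    return ((uint64_t)1 << bits) * interval_seconds;
}

/* Percentage of a `bits`-wide station ID space left unused when `nodes`
 * stations are deployed. */
double station_id_unused_percent(unsigned bits, unsigned nodes)
{
    double space = (double)((uint64_t)1 << bits);
    return 100.0 * (1.0 - (double)nodes / space);
}
```

For the current header, `sequence_wrap_seconds(16, 5)` returns 327,680 s (about 3.8 days) and `sequence_wrap_seconds(16, 60)` returns 3,932,160 s (about 45.5 days). For the 12-bit alternative, `sequence_wrap_seconds(12, 5)` returns 20,480 s (about 5.7 hours).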
Issue: Bit 6 of Presence Byte 0 is permanently allocated to the TLV flag. Every packet pays this bit regardless of whether TLV data is present. In the common case (no TLV data), this bit is always 0 and conveys no information.
Impact: Presence Byte 0 has 6 data field slots (bits 0–5) rather than 7. A variant with 7 frequently-transmitted fields must use an extension byte for the seventh field, adding 8 bits to every packet. If the TLV bit were relocated, 7 fields could fit in a single presence byte.
Alternatives:
- TLV as an extension byte sentinel. TLV data could be signalled by a specific extension bit pattern (e.g. an extension byte where all data field bits are 0) rather than a dedicated bit.
- TLV as a field type. A field type IOTDATA_FIELD_TLV could be defined, occupying a field slot in the variant table. Variants that need TLV would allocate a slot; variants that don't would reclaim the slot.
- Accept the cost. The current design is simple, unambiguous, and costs at most 1 bit per packet. The number of deployments that need exactly 7 fields in pres0 may be small.
Analysis needed: Survey of planned and potential variant definitions to determine how many would benefit from a 7th slot in pres0.
Issue: The rain size field (Section 8.18) encodes a value in the range 0–6.0 mm at 0.25 mm resolution, but the specification does not define what physical quantity this represents.
Meteorological raindrop measurements come in several forms:
- Median volume diameter (D₀ or D₅₀): The drop diameter at which half the total volume is in smaller drops and half in larger. This is the standard descriptor for a raindrop size distribution.
- Mean diameter: Arithmetic mean of all detected drops.
- Maximum detected diameter: The largest single drop in the observation period.
- Modal diameter: The most common drop size.
These quantities differ significantly for the same rain event. A moderate stratiform rain might have D₅₀ = 1.5 mm, mean = 1.0 mm, and maximum = 3.5 mm.
Impact: Without a defined physical quantity, the field is ambiguous. Two different sensor implementations might encode different quantities using the same field, producing incomparable data.
Additionally: The 0.25 mm resolution is coarse for the scientifically interesting region below 2 mm, where the Marshall-Palmer distribution concentrates most drops. The field's utility for meteorological research is limited unless the resolution is improved or the target application (coarse operational classification rather than scientific measurement) is stated.
Recommendation: Define the field as encoding the median volume diameter (D₀), or explicitly state that the quantity is implementation-defined and intended for coarse classification rather than research-grade measurement.
Issue: The UV index field (Section 8.4) is encoded as 4 bits, range 0–15. The WHO UV Index scale is nominally 0–11+ with values above 11 classified as "extreme." Measured UV indices above 15 are documented at high altitude near the equator (Andes, Tibetan Plateau), with readings up to 20+ recorded by research stations.
Impact: Minimal for the majority of deployments. The ceiling of 15 accommodates all but the most extreme high-altitude equatorial conditions. Deployments targeting such environments would saturate the field.
Recommendation: Accept as a known limitation for v1.0, with a note that 5-bit encoding (range 0–31) would cover all documented terrestrial UV conditions at a cost of 1 additional bit.
Issue: The protocol claims bit efficiency as its primary design goal, and the comparison with alternative encodings (Appendix I) demonstrates that iotdata is significantly more compact than JSON, CBOR, Protobuf, and packed C structs. However, the specification does not establish how close the encoding is to the theoretical minimum.
Analysis needed: For a representative weather station payload (the variant 0 default), compute the Shannon entropy of each field given real-world measurement distributions. Sum the per-field entropies to obtain the theoretical minimum number of bits required to represent a typical reading. Compare this against the actual bit allocation.
For example, if the entropy of a typical reading is 90 bits, then the current 124-bit encoding has 38% overhead and there may be significant room for improvement. If the entropy is 115 bits, the encoding is within 8% of optimal and further compression would yield diminishing returns.
This analysis would also identify which fields contribute the most overhead relative to their information content, guiding any future optimisation of bit allocations.
Context: The protocol deliberately avoids variable-length or entropy-optimised encodings (Huffman, arithmetic coding) for implementation simplicity. A fixed bit-packed encoding can never reach the Shannon limit for variable-entropy data. The analysis would quantify the cost of this design choice and determine whether it is a few percent (acceptable) or tens of percent (worth revisiting).
Issue: Several fields encode physical quantities without specifying the measurement conditions, averaging periods, or sensor calibration assumptions that affect their interpretation.
Examples:
- Temperature: Dry-bulb? Wet-bulb? Aspirated or unaspirated sensor housing? The difference between an aspirated and unaspirated temperature reading in direct sunlight can be 5–10°C — well above the 0.25°C quantisation resolution.
- Wind speed and gust: What averaging period? WMO standard is 10-minute mean speed and 3-second gust. The National Weather Service uses 2-minute mean and 5-second gust. The protocol does not specify, so two stations with different averaging periods produce incomparable data.
- Humidity: Relative humidity depends on temperature. Is the temperature co-located and simultaneous? The Environment bundle (Section 8.3) implies yes, but standalone Humidity (Section 8.11) makes no such guarantee.
- Rain rate: Instantaneous rate? 1-minute average? 10-minute average? Tipping-bucket gauges report discrete tip events; the computed "rate" depends entirely on the averaging algorithm.
Impact: For closed deployments where the operator controls all sensors and understands their characteristics, this ambiguity is manageable. For any form of data sharing or comparison between deployments, it undermines data quality.
Relationship to Section 11.3 and Section 15: The specification acknowledges this gap and identifies sensor metadata TLVs (types 0x10–0x1F) as future work. This appendix entry records the specific fields where the ambiguity is most significant to guide the design of those TLVs.
Issue: The solar irradiance field (Section 8.4) encodes 0–1023 W/m² in 10 bits. The solar constant (top-of-atmosphere irradiance) is approximately 1,361 W/m². At the Earth's surface, clear-sky irradiance rarely exceeds 1,100 W/m², but localised reflections from clouds (the "cloud enhancement" or "lensing" effect) can produce transient readings of 1,200–1,400 W/m² as measured by high-quality pyranometers.
Impact: Stations equipped with research-grade pyranometers in environments prone to cloud enhancement (e.g. tropical cumulus, mountain environments) may occasionally saturate the field. For most operational deployments with standard silicon-cell sensors, the 1,023 W/m² ceiling is adequate.
Recommendation: Consider 11 bits (0–2,047 W/m², 1 W/m² resolution) for headroom, or accept the current range with a documented limitation. The additional bit is a minor cost.
Issue: The Image field (Section 8.27) is the most complex and largest field type in the protocol. It supports multiple pixel formats, compression methods, and sizes, with payload budgets up to 254 bytes. This sits uneasily in a protocol designed around 16–32 byte packets for resource-constrained sensors.
Concerns:
- Payload budget conflict. A 96-byte BILEVEL image at 32×24 plus a 16-byte weather station payload totals 112 bytes. This fits comfortably at SF7 (222-byte limit) but leaves only 3 bytes of headroom at SF9 (115 bytes), before any image field header overhead is counted, and does not fit at higher spreading factors. The image effectively precludes the higher spreading factors that are often needed precisely in the remote deployments where camera-equipped sensors might be deployed.
- Compression complexity. Heatshrink decompression requires ~256 bytes of RAM — feasible on Class 3 devices but prohibitive on Class 1 and marginal on Class 2. This creates a field type that cannot be decoded on the same device classes the protocol targets for encoding.
- Use case validation. The field was designed for motion-detection thumbnails (wildlife, livestock, security). It is not clear whether a 32×24 1-bit image transmitted over LoRa provides sufficient utility to justify the protocol complexity. Alternative approaches (transmitting a motion-detected flag bit and storing full images locally for periodic retrieval) may be more practical.
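
The payload budget concern can be made concrete with a small check. This is a sketch only: the helper names are hypothetical, and the image size excludes whatever header the Image field itself carries (Section 8.27).

```c
#include <stddef.h>

/* Bytes needed for an uncompressed image of width x height pixels at
 * `bits_per_pixel`, rounded up to a whole byte. Excludes any image
 * field header. Hypothetical helper. */
size_t image_payload_bytes(unsigned width, unsigned height, unsigned bits_per_pixel)
{
    size_t bits = (size_t)width * height * bits_per_pixel;
    return (bits + 7) / 8;
}

/* Returns 1 if the image plus the remaining payload fits within the
 * transport's payload limit, 0 otherwise. */
int fits_in_payload(size_t image_bytes, size_t other_bytes, size_t limit_bytes)
{
    return image_bytes + other_bytes <= limit_bytes;
}
```

`image_payload_bytes(32, 24, 1)` gives the 96 bytes quoted above. Passing that result with a 16-byte sensor payload to `fits_in_payload` shows how little margin remains against a 115-byte limit.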
Current position: The Image field is included in v0.90 as an experimental capability. Its inclusion in v1.0 will be reviewed based on implementation experience and demonstrated operational utility.
Issue: The specification does not include normative test vectors (known inputs with expected binary outputs). The reference implementation test suite provides de facto test cases, but these are not reproduced in the specification document.
Impact: An independent implementer working from the specification alone cannot verify conformance without access to the reference implementation. This is a significant gap for a protocol intended for independent implementation.
Plan: A set of normative test vectors covering the following cases will be added before v1.0:
- Minimum valid packet (header + empty presence byte).
- Battery-only heartbeat (minimum useful packet).
- Full weather station packet (variant 0, all pres0 fields).
- Full weather station packet with pres1 fields (position, datetime, flags).
- Packet with TLV data (VERSION + STATUS).
- Edge cases: all fields at minimum values, all fields at maximum values, all fields at quantisation boundary values where round-trip error is maximised.
Issue: The specification describes the packet structure through bit diagrams, encode/decode formulae, and narrative text. It does not include a formal or pseudocode description of the complete decode procedure as a single algorithm.
Impact: The decode path requires synthesising information from Sections 4, 5, 6, 7, and 8. A reader must mentally reconstruct the decode loop: read header, look up variant, read presence bytes, iterate field table, for each present field read the appropriate number of bits and apply the decode formula. This is straightforward but is not stated as an explicit algorithm anywhere in the document.
Plan: A pseudocode decode algorithm (comparable to the encoder example in Appendix C) will be added before v1.0, likely as an additional appendix.
Issue: Some sensor groupings have both bundle and standalone forms (Environment, Wind, Rain, Air Quality, Radiation), while Solar (irradiance + UV) exists only as a bundle with no standalone components. Section 8 acknowledges this: "some bundles have no standalone forms ... this may be addressed in future versions."
Impact: A variant that needs irradiance without UV, or UV without irradiance, must either waste bits on the unwanted component or leave the field absent entirely. This is inconsistent with the protocol's bit-efficiency principle.
Recommendation: Add standalone Irradiance (10 bits) and standalone UV Index (4 bits) field types for completeness before v1.0.
Issue: The clouds field (Section 8.26) uses 4 bits to encode 0–8 okta, leaving 7 of 16 possible values unused. The okta scale is inherently coarse (9 levels for the entire range of cloud cover), and the 4-bit encoding wastes nearly half its capacity.
Alternatives:
- 3 bits (0–7) with 8 mapped to 7. Saves 1 bit but conflates "overcast" (8 okta) with "nearly overcast" (7 okta), which is a meaningful distinction in meteorology.
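- 4 bits (0–15) with extended resolution. Represent cloud cover in sixteenths rather than eighths, providing ~6% resolution. Because a full 0–16 scale needs 17 values, full cover (16/16) would have to share code 15 with 15/16, or the field would have to grow to 5 bits. Sixteenths match some automated ceilometer outputs more closely than okta.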
- Accept as-is. The 4-bit allocation is the smallest whole number of bits that accommodates 9 values. Nine values carry log₂(9) ≈ 3.17 bits of information, so the 4-bit field wastes about 0.83 bits, which is less than 1 bit of overhead.
Current position: Accepted as-is. The overhead is negligible and the okta scale is the established meteorological standard.
The current field type inventory covers core meteorological and environmental measurements. The following field types have been identified as absent but relevant to the protocol's stated deployment scenarios (farming, forestry, outdoor commercial/industrial, water monitoring). None require new encoding techniques — they are straightforward linear ranges that fit in 7–10 bits and can be added as new field types without structural protocol changes.
The quantisation ranges listed below are preliminary and subject to the same systematic analysis discussed in Section J.3 before inclusion in v1.0.
Priority: High. Soil moisture is arguably the most widely deployed agricultural sensor type after temperature. Its absence is conspicuous for a protocol targeting farm deployments.
Soil moisture is typically reported as volumetric water content (VWC), a percentage of the soil volume occupied by water. The standalone Humidity field (Section 8.11) could be repurposed via variant labelling, as the 0–100% range at 1% resolution is a reasonable fit for VWC. However, some capacitive and TDR sensors (e.g. Meter EC-5, Teros 12) report raw dielectric permittivity rather than calibrated VWC, with a range of approximately 1–80. A dedicated field type would allow the variant to distinguish calibrated VWC from raw permittivity.
Soil electrical conductivity (EC) measures the ability of the soil solution to conduct electrical current, serving as a proxy for salinity and dissolved ion concentration. This is critical for irrigation management in arid and semi-arid agriculture, where soil salinisation is a primary crop yield limiter. Typical range: 0–5,000 µS/cm for most agricultural soils (some saline soils reach 20,000+ µS/cm). Resolution of 10–20 µS/cm is adequate for management decisions. A 10-bit field at 20 µS/cm resolution covers 0–20,460 µS/cm.
Soil temperature is already available via the standalone Temperature field
(Section 8.9) with a variant label such as "soil_temp". No new field type is
needed.
A soil bundle (moisture + conductivity + temperature) would be natural, mirroring the Environment bundle pattern, for sensors like the Teros 12 and Meter TEROS 21 that output all three simultaneously.
| Candidate field | Bits | Range | Resolution | Sensor examples |
|---|---|---|---|---|
| Soil moisture (VWC) | 7 | 0–100% | 1% | Teros 12, EC-5, SHT40 |
| Soil EC | 10 | 0–20,460 µS/cm | 20 µS/cm | Teros 12, Teros 21 |
| Soil bundle | 26 | VWC + EC + temp | As above | Combined probe outputs |
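
As an illustration of the candidate Soil EC row, a fixed-step linear quantiser might look like the sketch below. The function and macro names are hypothetical, and the 20 µS/cm step is the preliminary value from the table rather than a settled encoding.

```c
#include <stdint.h>

#define SOIL_EC_STEP_US_CM 20u    /* assumed step, from the candidate table */
#define SOIL_EC_MAX_CODE   1023u  /* largest 10-bit code */

/* Quantise conductivity in uS/cm to a 10-bit code, rounding to the
 * nearest step and saturating at the top of the range. */
uint16_t soil_ec_encode(uint32_t us_cm)
{
    if (us_cm >= SOIL_EC_MAX_CODE * SOIL_EC_STEP_US_CM)
        return (uint16_t)SOIL_EC_MAX_CODE;
    return (uint16_t)((us_cm + SOIL_EC_STEP_US_CM / 2u) / SOIL_EC_STEP_US_CM);
}

/* Recover conductivity in uS/cm from a 10-bit code. */
uint32_t soil_ec_decode(uint16_t code)
{
    return (uint32_t)code * SOIL_EC_STEP_US_CM;
}
```

Readings above 20,460 µS/cm saturate at code 1,023, so highly saline soils would need a coarser step or a wider field.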
Priority: High. pH is the most fundamental water quality measurement, relevant to aquaculture, river and lake monitoring, irrigation water assessment, and water treatment. The measurement is well-defined (hydrogen ion activity on a logarithmic scale), universally understood, and reported by a wide range of sensors.
Range: 0.00–14.00. Resolution: 0.1 pH units is standard for field instruments; 0.01 is available from laboratory-grade probes but rarely meaningful in continuous outdoor monitoring due to drift and fouling.
| Candidate field | Bits | Range | Resolution | Sensor examples |
|---|---|---|---|---|
| pH (×10) | 8 | 0–14.0 | ~0.055 | Atlas Scientific EZO-pH, Hanna |
| pH (×100) | 10 | 0–14.0 | ~0.014 | Laboratory probes |
The 8-bit encoding (0–255, mapped to 0–14.0 at 255 steps ≈ 0.055 resolution) is adequate for field deployment and matches the ±0.1 accuracy of most submersible pH probes.
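
A range-scaled mapping of this kind can be done in integer arithmetic. The sketch below assumes the pH value is held in hundredths (0–1400) on the sensor; the names are hypothetical and the mapping is one possible reading of the 8-bit row above.

```c
#include <stdint.h>

#define PH_MAX_CENTI 1400u  /* pH 14.00 expressed in hundredths */
#define PH_MAX_CODE  255u   /* largest 8-bit code */

/* Map pH in hundredths (0..1400) onto an 8-bit code (0..255), rounding
 * to the nearest code and clamping out-of-range input. */
uint8_t ph_encode(uint32_t ph_centi)
{
    if (ph_centi > PH_MAX_CENTI)
        ph_centi = PH_MAX_CENTI;
    return (uint8_t)((ph_centi * PH_MAX_CODE + PH_MAX_CENTI / 2u) / PH_MAX_CENTI);
}

/* Map an 8-bit code back to pH in hundredths, rounded to nearest. */
uint32_t ph_decode(uint8_t code)
{
    return ((uint32_t)code * PH_MAX_CENTI + PH_MAX_CODE / 2u) / PH_MAX_CODE;
}
```

With this mapping a neutral pH of 7.00 encodes to code 128 and decodes to 7.03. The round-trip error stays within about 0.03 pH units across the range, inside the ±0.1 probe accuracy mentioned above.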
Priority: High. Water EC measures dissolved ion concentration, serving as a proxy for total dissolved solids (TDS). It is the primary measurement for monitoring water quality in rivers, lakes, aquaculture ponds, and water treatment systems.
The range varies enormously by application: freshwater rivers are typically 50–1,500 µS/cm, drinking water up to 2,500 µS/cm, brackish water 2,500–30,000 µS/cm, and seawater ~50,000 µS/cm. This wide range is a candidate for non-linear quantisation (Section J.3) or a configurable range per variant.
| Candidate field | Bits | Range | Resolution | Notes |
|---|---|---|---|---|
| EC (freshwater) | 10 | 0–5,115 µS/cm | 5 µS/cm | Rivers, lakes, irrigation |
| EC (wide range) | 10 | 0–51,150 µS/cm | 50 µS/cm | Covers brackish and seawater |
| EC (log scale) | 8 | 1–100,000 µS/cm | ~4.6%/step | Single field for all use cases |
Atlas Scientific EZO-EC probes cover 0.07–500,000 µS/cm. A deployment-selected range (freshwater vs. wide) via variant label may be the most practical approach, using the same bit encoding with different scaling.
Note that soil EC (Section J.17.1) and water EC are the same physical measurement with different typical ranges. A single EC field type with variant-defined scaling could serve both.
Priority: Medium. Dissolved oxygen (DO) is critical for aquaculture (fish require >5 mg/L; below 3 mg/L is lethal for most species) and river ecology (regulatory thresholds for water quality classification). It is also relevant to wastewater treatment monitoring.
Typical range: 0–20 mg/L. Resolution: 0.1 mg/L is adequate for management decisions and matches the ±0.1–0.2 mg/L accuracy of optical DO probes (Atlas Scientific EZO-DO, In-Situ RDO).
Some sensors also report oxygen saturation as a percentage (0–100%+ ; supersaturation above 100% occurs in algae-rich water). This could be encoded as a standalone humidity-style field with a variant label.
| Candidate field | Bits | Range | Resolution | Sensor examples |
|---|---|---|---|---|
| DO (mg/L, ×10) | 8 | 0–25.5 | 0.1 mg/L | Atlas EZO-DO, RDO Pro |
| DO saturation (%) | 7 | 0–100% | 1% | Same sensors, % output |
Priority: Medium. Turbidity measures the optical clarity of water, reported in Nephelometric Turbidity Units (NTU). It is a proxy for suspended sediment, algal concentration, and general water quality. Relevant for river monitoring (sediment transport during flood events), water treatment (intake turbidity), and aquaculture (pond clarity).
The distribution is heavily right-skewed: clean water is 0–5 NTU, typical rivers 10–100 NTU, flood events 1,000+ NTU, and extremely turbid water can exceed 10,000 NTU. This is a strong candidate for non-linear quantisation.
| Candidate field | Bits | Range | Resolution | Notes |
|---|---|---|---|---|
| Turbidity (linear) | 10 | 0–1,023 NTU | 1 NTU | Adequate for clean water only |
| Turbidity (linear) | 10 | 0–10,230 NTU | 10 NTU | Covers flood events; coarse |
| Turbidity (log) | 8 | 0.1–10,000 | ~4.6%/step | Matches sensor dynamic range |
Priority: Low-Medium. ORP measures the tendency of a solution to oxidise or reduce, reported in millivolts. It is used in water treatment, aquaculture, and pool/spa monitoring as an indicator of disinfection effectiveness and water chemistry. Atlas Scientific produces an EZO-ORP module for this measurement.
Range: -1,000 to +1,000 mV (most natural water: +200 to +600 mV). Resolution: 1–5 mV is adequate.
| Candidate field | Bits | Range | Resolution | Sensor examples |
|---|---|---|---|---|
| ORP (mV) | 10 | -999 to +1,024 mV | ~2 mV | Atlas EZO-ORP |
Priority: Medium. Flow rate is relevant for river discharge monitoring, irrigation flow measurement, and water distribution systems. Unlike water level (which can be encoded as depth), flow rate is a derived quantity with a wide dynamic range: a small irrigation pipe might carry 0.1 L/s, while a river gauge might report 500 m³/s.
The wide dynamic range and the diversity of units (L/s, m³/s, gallons/min) suggest that this measurement may be better handled as a variant-specific custom encoding or via TLV, rather than as a fixed field type. Alternatively, a generic flow field with variant-defined units and a logarithmic encoding could cover the range.
Note: Many flow measurements are derived from water level via a stage-discharge curve (Manning's equation or a calibrated rating curve). In these cases, the sensor transmits level (depth field) and the gateway computes flow. A flow field type is primarily useful for sensors that output flow directly (ultrasonic transit-time meters, electromagnetic flow meters, weir gauges with integrated computation).
Priority: Low. Leaf wetness sensors detect the presence and quantity of surface moisture on vegetation. They are used in precision viticulture, orchard management, and crop disease prediction models (e.g. downy mildew in grapes, late blight in potatoes). The measurement is typically reported as a coarse categorical scale (dry / dew / wet / saturated) or as a percentage (0–100%) from resistive or capacitive sensors.
A 2-bit categorical field (4 levels) or a repurposed 7-bit humidity field with a variant label would suffice. The measurement is niche but falls within the farming use case.
Priority: Medium. The protocol encodes instantaneous values: a temperature reading is a snapshot at the moment of transmission. For fire detection and frost early warning, the rate of temperature change is often more informative than the absolute value.
Fire detection context: A wildfire approaching a sensor produces a characteristic thermal signature: ambient temperature rises rapidly (10–30°C over 1–5 minutes) before the absolute temperature reaches alarming levels. By the time the temperature reads 60°C, the fire is already at the sensor. Early detection depends on recognising the rate of change while the absolute temperature is still in the 25–40°C range.
Relevant measurements for fire/thermal anomaly detection:
Temperature rate of change (°C/min). A signed value indicating heating or cooling rate. Range: -10 to +30 °C/min covers both frost events (slow cooling, -1 to -5 °C/min) and fire approach (rapid heating, +5 to +30 °C/min). 8 bits at 0.2 °C/min resolution (-25.6 to +25.4 °C/min with signed encoding) would cover all but the fastest heating rates, saturating above +25.4 °C/min.
This measurement is computed by the sensor from consecutive temperature readings and the interval between them. It requires no additional hardware, only firmware logic and retention of the previous reading. The averaging period should be stated (e.g. "rate computed over the most recent two readings" or "60-second moving average") and could be communicated via a CONFIG TLV.
Smoke / particulate spike. The existing Air Quality PM fields (Section 8.21) encode absolute particulate concentration, which can indicate smoke. However, fire-relevant smoke detection is better characterised by rate of change in PM2.5 (a sudden spike from baseline) rather than absolute level, since baseline varies by location and season. A rate-of-change field for PM2.5 (similar to the thermal rate) could complement the absolute PM reading. Alternatively, the sensor firmware can set a flag bit (Section 8.6) when it detects a PM spike, leaving the algorithm sensor-side.
Carbon monoxide. Already available in the Air Quality Gas field (Section 8.22, slot 3). CO is an early indicator of smouldering combustion. No new field type is needed: the CO slot's 1 ppm resolution and 0–1,023 ppm range are well suited to fire detection (dangerous levels are 50+ ppm).
Summary of fire-relevant capabilities:
| Measurement | Current status | Gap |
|---|---|---|
| Absolute temperature | Available (8.3, 8.9) | None — but absolute alone is a late signal |
| Temperature rate | Not available | New field type needed |
| PM2.5 absolute | Available (8.21) | None |
| PM2.5 rate | Not available | New field type or flag-based approach |
| CO | Available (8.22) | None |
| Humidity drop | Available (8.3, 8.11) | Rapid humidity drop precedes fire front |
| IR flame detection | Not available | Specialised sensor; out of scope for v1.0 |
Design question: Should rate-of-change be a generic modifier applicable to
any field type, or a standalone field type? A generic approach (e.g. a "delta"
flag or a companion field type that encodes the first derivative of any
measurement) would be more flexible but adds protocol complexity. A standalone
TEMPERATURE_RATE field is simpler and covers the primary use case.
Recommendation: Add a standalone temperature rate-of-change field for v1.0 (simple, no new encoding concepts, high value for fire and frost detection). Defer generic rate-of-change mechanisms to a future version.
| Candidate field | Bits | Range | Resolution | Use case |
|---|---|---|---|---|
| Temp rate (°C/min) | 8 | -25.6 to +25.4 °C/min | 0.2 °C/min | Fire, frost warning |
| PM2.5 rate (µg/m³/min) | 8 | -128 to +127 µg/m³/min | 1 µg/m³/min | Smoke detection |
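
The temperature rate field could be computed and quantised on the sensor roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, the rate is taken over the two most recent readings, and the signed code is assumed to be a two's-complement 8-bit value at 0.2 °C/min per step.

```c
#include <stdint.h>

/* Rate of change in degrees C per minute from two readings taken
 * `dt_seconds` apart. Hypothetical helper; dt_seconds must be non-zero. */
double temp_rate_c_per_min(double prev_c, double curr_c, unsigned dt_seconds)
{
    return (curr_c - prev_c) * 60.0 / (double)dt_seconds;
}

/* Quantise to a signed 8-bit code at 0.2 C/min per step, saturating at
 * -128 (-25.6 C/min) and +127 (+25.4 C/min). */
int8_t temp_rate_encode(double rate_c_per_min)
{
    double steps = rate_c_per_min * 5.0;   /* 1 / 0.2 steps per C/min */
    if (steps >= 127.0)
        return 127;
    if (steps <= -128.0)
        return -128;
    /* Round half away from zero before truncating to an integer. */
    return (int8_t)(steps >= 0.0 ? steps + 0.5 : steps - 0.5);
}

/* Recover the rate in C/min from the signed code. */
double temp_rate_decode(int8_t code)
{
    return (double)code / 5.0;
}
```

A rise from 25 °C to 30 °C over 60 seconds gives 5 °C/min, which encodes to code 25. Anything faster than +25.4 °C/min saturates at code 127, so the receiver should treat the extreme codes as "at or beyond" the limit.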
Issue: The iotdata wire format contains no structural markers — no field tags, type indicators, length prefixes, or sentinel values — between data fields. The decoder determines field boundaries entirely from the variant table and presence bytes. If either the presence bytes or a data field are corrupted by a bit error that is not caught by the transport layer, the decoder will produce a structurally valid but semantically wrong result with no indication of error.
Consider a single bit-flip in Presence Byte 0 that sets the wind bit (S3) when wind data was not transmitted. The decoder now attempts to read 22 bits of wind data from bits that actually belong to the rain and solar fields and to the start of any extension byte or TLV data. Every field boundary after the corrupted presence byte is shifted, and every subsequent decoded value is wrong. The decoder reports success.
Comparison: Protobuf and CBOR include per-field type and length information that acts as structural redundancy. A corrupted Protobuf field tag will typically produce an invalid wire type or an impossibly large field number, causing the decoder to reject the packet. CayenneLPP's 2-byte channel+type prefix per field provides similar structural checkpoints — a corrupted type byte will fail to match any known sensor type. These formats pay a wire-size cost for this property, but gain detection of mid-payload corruption that survives the transport CRC.
Impact: In practice, this risk is low for deployments using LoRa CRC (which catches most bit errors) or LoRaWAN MIC (which provides cryptographic integrity). The risk is higher for deployments on transports without integrity checks, or where the LoRa CRC is disabled for range extension (a practice used by some long-range deployments at high spreading factors).
The most dangerous failure mode is a corrupted presence byte, because it shifts all subsequent field boundaries and can cause every decoded value to be plausible but wrong. A corrupted data field is less dangerous — it affects only that field and subsequent fields are still correctly aligned (because the corrupt field still occupies its expected bit width).
Mitigation options:
1. Transport-layer CRC (current approach). Rely on LoRa CRC, LoRaWAN MIC, or equivalent link-layer integrity. This is the protocol's stated design choice (Section 3.7). For most deployments it is sufficient.
2. Application-layer checksum. Reserve a TLV type for a packet checksum (e.g. CRC-8 over the header and data fields). The TLV section appears after all data fields, so a checksum TLV allows the receiver to verify the entire data section. Cost: 3 bytes (16-bit TLV header + 8-bit CRC). This is available today using a proprietary TLV type (0x20+) and requires no protocol changes.
3. Range validation. The decoder can validate each decoded value against the field's defined range (e.g. humidity must be 0–100, pressure must be 850–1105). Out-of-range values indicate corruption. This catches some corruptions but not all: a corrupted temperature of 22.5°C when the true value was 18.0°C passes range validation. The reference implementation does not currently perform post-decode range validation, though Section 11.6 identifies this as a non-fatal anomaly.
4. Statistical anomaly detection. The gateway can flag decoded values that are statistically inconsistent with recent history for the same station (e.g. a 30°C temperature jump between consecutive transmissions). This is a receiver-side strategy that requires no wire changes but cannot distinguish corruption from genuine rapid change.
Current position: The protocol's transport-delegated integrity model (option 1) is retained for v1.0. Deployments requiring stronger guarantees SHOULD use LoRaWAN MIC or add an application-layer checksum via a proprietary TLV. The specification should note this failure mode explicitly in Section 11.6 as a receiver consideration.
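An application-layer checksum of the kind described in the mitigation options could be computed with a bitwise CRC-8. The sketch below is illustrative only: the polynomial (0x07, initial value 0x00) is an assumption, the function names are hypothetical, and the specification does not define a checksum TLV layout.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-8 with polynomial 0x07, initial value 0x00, no bit reflection and
 * no final XOR. The polynomial choice is an assumption of this sketch. */
uint8_t crc8(const uint8_t *data, size_t len)
{
    uint8_t crc = 0x00;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x80) ? (uint8_t)((crc << 1) ^ 0x07)
                               : (uint8_t)(crc << 1);
    }
    return crc;
}

/* Returns 1 if the CRC carried in a (hypothetical) checksum TLV matches
 * the CRC recomputed over the header and data fields, 0 otherwise. */
int packet_checksum_ok(const uint8_t *packet, size_t len, uint8_t received_crc)
{
    return crc8(packet, len) == received_crc;
}
```

A sender would compute `crc8` over the header and data fields and append the result in a proprietary TLV. A receiver would recompute it and discard the packet on mismatch. An 8-bit CRC detects every single-bit error, including a flipped presence bit, but about 1 in 256 random multi-bit corruptions will still pass.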
Issue: Nanopb, Bitproto, and Protobuf all provide code generators that
produce encoder/decoder source code from a schema definition file (.proto,
.bitproto). This ensures that both sides of a communication link use an
identical, machine-generated interpretation of the data layout. Schema changes
are made in one place and propagated automatically.
iotdata's variant tables are hand-coded C arrays. The reference implementation
defines them as iotdata_variant_def_t structs with manually specified field
types, labels, and presence byte counts. Custom variants are created by writing
C code (Section 7, "Custom Variant Maps"). There is no schema definition
language, no code generator, and no tooling to verify that a transmitter's
variant table matches a receiver's.
Impact: In a small deployment (5–20 sensors, one operator), this is manageable — the operator compiles the same variant definition into both sensor and gateway firmware. In larger deployments, or deployments with separate teams responsible for sensor and gateway software, the risk of variant table mismatch increases. A mismatch produces silently wrong data (compounded by J.18 — there are no structural markers to detect the misalignment).
The absence of a schema language also means there is no machine-readable variant definition that could be used to auto-generate decoders in other languages (Python, JavaScript, Go), to validate variant definitions for consistency, or to produce documentation from the schema.
Comparison: Nanopb's workflow is: edit .proto file → run protoc with the
nanopb plugin → get .pb.c and .pb.h → compile into both sensor and gateway.
The schema file is the single source of truth. Bitproto follows an identical
pattern. iotdata's workflow is: edit C source on both sensor and gateway,
manually ensuring consistency.
Options under consideration:
1. Schema definition file. Define a simple text format for variant tables (field type, label, presence byte assignment) and write a generator that produces C source for the reference implementation. This could also generate Python/JavaScript decoders for gateway use. The schema file becomes the single source of truth for the variant definition.
2. Variant table in VERSION TLV. Encode a compact representation of the variant table in the VERSION TLV (Section 9.5.1), transmitted at boot. The gateway auto-discovers the field layout from the first packet. This adds wire overhead but eliminates the need for pre-shared variant definitions. See also J.22.
3. Accept the limitation. The protocol explicitly disclaims global interoperability (Section 3.8). For closed deployments where one build system compiles both sensor and gateway, the risk of mismatch is low. A shared C header included by both sides achieves consistency without a separate toolchain.
Recommendation: Option 1 (a lightweight schema tool) is the most practical improvement and would also enable option 3 of J.22 (schema file distribution). Option 3 (accept the limitation) is the appropriate baseline for v1.0, with a shared C header as a documented best practice for multi-target builds. A schema tool is deferred to post-v1.0 tooling.
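Until a schema tool exists, one inexpensive guard against variant table mismatch is to derive a fingerprint from the shared variant definition and compare it on both sides (at build time, or in a boot-time log line). The sketch below is illustrative only: `example_field_t` and `example_variant_t` are hypothetical simplified structures, not the layout of iotdata_variant_def_t, and the choice of 32-bit FNV-1a is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified variant description; NOT the real
 * iotdata_variant_def_t from the reference implementation. */
typedef struct {
    uint8_t     type;   /* field type code */
    const char *label;  /* NUL-terminated field label */
} example_field_t;

typedef struct {
    uint8_t                num_presence_bytes;
    uint8_t                num_fields;
    const example_field_t *fields;
} example_variant_t;

/* One step of 32-bit FNV-1a: xor the byte in, then multiply by the prime. */
static uint32_t fnv1a_byte(uint32_t h, uint8_t b)
{
    return (h ^ b) * 16777619u;
}

/* Fingerprint covering presence byte count, field order, types and labels. */
uint32_t example_variant_fingerprint(const example_variant_t *v)
{
    assert(v != NULL && (v->num_fields == 0 || v->fields != NULL));
    uint32_t h = 2166136261u;  /* FNV-1a 32-bit offset basis */
    h = fnv1a_byte(h, v->num_presence_bytes);
    h = fnv1a_byte(h, v->num_fields);
    for (uint8_t i = 0; i < v->num_fields; i++) {
        h = fnv1a_byte(h, v->fields[i].type);
        for (const char *p = v->fields[i].label; *p != '\0'; p++)
            h = fnv1a_byte(h, (uint8_t)*p);
        h = fnv1a_byte(h, 0);  /* label terminator keeps boundaries distinct */
    }
    return h;
}
```

If sensor and gateway firmware print or embed the same fingerprint, they were built from the same table; a differing value points to a mismatch before any data is misdecoded. A fingerprint detects divergence but does not describe the layout, so it complements rather than replaces the options above.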
Issue: The reference implementation is written in C11 and provides a static
library (libiotdata.a). There are no bindings, ports, or reference
implementations in other languages. A gateway or server written in Python, Go,
JavaScript, or Rust must reimplement the decoder from the specification document
(this README), using the C code as an informal reference.
Comparison: Nanopb generates C code, but the .proto schema it consumes is
shared with the wider Protobuf ecosystem — any Protobuf library in any language
can decode a Nanopb-encoded message. Bitproto generates C, Go, and Python from a
single schema. CayenneLPP has implementations in C++ (Arduino), Python, and
JavaScript, and is natively decoded by The Things Network and ChirpStack without
any user code.
Impact: For an all-C deployment (ESP32 sensor + Raspberry Pi gateway using
the same libiotdata.a), this is not a limitation. For deployments where the
gateway or backend is written in Python, Go, or JavaScript — which is the common
case for cloud-connected IoT platforms — the absence of a reference decoder in
those languages is a significant adoption barrier. Reimplementing the
bit-packing, quantisation, variant dispatch, and TLV parsing is non-trivial and
error-prone, particularly for the variable-length fields (Air Quality, Image)
and the 6-bit packed string format.
Options:
1. Python reference decoder. A Python implementation of the decoder would cover the most common gateway/server language and could also serve as a test oracle for the C implementation. The bit-packing logic is straightforward in Python; the primary work is replicating the variant table dispatch and quantisation formulae.
2. JavaScript/TypeScript decoder. For LoRaWAN deployments, a JavaScript decoder function is directly usable as a TTN/ChirpStack payload formatter, addressing J.21 simultaneously.
3. Generated decoders. If a schema tool is developed (J.19), it could generate decoders in multiple languages from the variant definition.
Recommendation: A Python reference decoder and a JavaScript payload formatter are identified as high-value post-v1.0 deliverables. The C implementation remains the normative reference.
Issue: CayenneLPP's primary competitive advantage is not its wire efficiency (it is 3.4× larger than iotdata for the test payload) but its zero-configuration integration with the LoRaWAN ecosystem. The Things Network, ChirpStack, and myDevices Cayenne all decode CayenneLPP payloads automatically: the operator selects "CayenneLPP" as the payload formatter and sensor data appears in the dashboard with correct field names, units, and types. No custom code is required.
iotdata has no equivalent integration. An operator deploying iotdata on a LoRaWAN network server must write a custom payload formatter (typically in JavaScript) that reimplements the decoder for their specific variant. This formatter must be maintained alongside the sensor firmware and updated whenever the variant definition changes.
Impact: For operators already invested in the LoRaWAN ecosystem and using TTN or ChirpStack with Cayenne dashboards, CayenneLPP's ecosystem integration may outweigh iotdata's wire efficiency advantage. The 3.4× payload size difference matters most at high spreading factors and under tight duty cycle budgets; at SF7 with modest duty cycle pressure, the operational impact of larger payloads is tolerable, and the operational simplicity of CayenneLPP becomes the dominant factor.
For operators using custom gateway software (direct LoRa, non-LoRaWAN), iotdata's wire efficiency advantage applies fully and CayenneLPP's ecosystem integration is irrelevant.
Options:
1. TTN/ChirpStack payload formatter. Provide a JavaScript decoder function that can be pasted into the TTN or ChirpStack payload formatter configuration. This would need to be parameterised by variant (either a generic decoder that reads the variant from the header and looks up a JavaScript variant table, or a generated per-variant formatter). See also J.20 option 2.
2. MQTT auto-decode. For gateways that forward raw packets via MQTT, provide a lightweight MQTT-to-JSON bridge (Python or Node.js) that subscribes to raw packet topics, decodes iotdata, and republishes as JSON. This is architecturally equivalent to CayenneLPP's network server integration but operates at the application layer.
3. Accept the limitation. iotdata's design philosophy prioritises wire efficiency for constrained links over ecosystem convenience. Operators choosing iotdata accept the cost of custom integration in exchange for smaller payloads and longer battery life.
Recommendation: A reference JavaScript payload formatter for TTN/ChirpStack is identified as a high-value deliverable that would substantially reduce the adoption barrier for LoRaWAN deployments, and could be produced as a companion artifact alongside a Python decoder (J.20). The protocol itself requires no changes.
Issue: An iotdata packet is not self-describing. The receiver must possess the transmitter's variant table before it can decode any data fields. If a receiver encounters a packet from a station whose variant definition it does not have, it cannot determine the field types, field widths, or field order. The packet is opaque.
This is a deliberate design choice (Section 3.8: "it is expressly not a goal to support interoperability between implementations"). However, the absence of any mechanism for a receiver to discover a transmitter's variant definition creates operational friction in several scenarios:
- New sensor deployment. When a new sensor is added to an existing deployment, the gateway must be reconfigured with the sensor's variant definition before it can decode the sensor's data.
- Multi-operator environments. If two operators share a gateway or mesh infrastructure, each must ensure the gateway has variant definitions for all sensors from both operators.
- Diagnostic and debugging. A technician with a generic LoRa receiver cannot inspect packets from an unknown deployment without obtaining the variant definition out-of-band.
- Mesh relay transparency. Mesh relays (Appendix G) forward sensor packets as opaque blobs, but gateways must decode them. If a relay forwards a packet from a sensor using an unknown variant, the gateway cannot decode it.
Comparison: CayenneLPP payloads are fully self-describing — every field
carries its channel and type. CBOR and Protobuf carry structural metadata that
enables generic tools to display the data structure even without a schema (e.g.
protoc --decode_raw). Nanopb-encoded data can be decoded by any Protobuf
library with the .proto schema — and the schema is a portable text file, not
compiled into a specific target.
Options under consideration:
1. Accept the limitation (current position). For closed deployments where one operator controls all devices, the variant table is a compile-time artefact shared between sensor and gateway firmware. No wire-level discovery is needed.
2. Variant definition in VERSION TLV. The existing VERSION TLV (type 0x01, string format) carries firmware and hardware identification. A compact encoding of the variant table could be added, either as additional key-value pairs in the VERSION TLV (e.g. V0 BAT LNK ENV WND RAN SOL) using short field type mnemonics, or as a new dedicated TLV type in the reserved 0x10–0x1F range. The variant definition is static per firmware build, so it would be transmitted once at boot alongside the VERSION TLV. The gateway caches it per station_id and uses it to decode subsequent packets. Cost: approximately 20–40 bytes once per boot cycle, which is negligible when amortised across thousands of subsequent data packets.
3. Schema file distribution. If a schema definition language is developed (J.19), variant definitions could be distributed as files (alongside firmware images, via OTA manifest, or published to a repository). The gateway loads schema files for the variants it expects to encounter. This is an out-of-band mechanism that requires no wire changes.
4. Well-known variant registry. Publish a set of standardised variant definitions (weather station, soil sensor, water quality, snow depth) with assigned variant IDs. Receivers that implement the registry can decode any sensor using a registered variant without per-deployment configuration. This conflicts with the current non-interoperability stance (Section 3.8) but could be offered as an optional extension for operators who want plug-and-play behaviour.
Recommendation: Option 1 (accept the limitation) is appropriate for v1.0, consistent with the protocol's design philosophy. Option 2 (variant definition in a TLV) is the most promising future mechanism because it requires no out-of-band coordination and leverages existing protocol features. The design of a compact variant table encoding is deferred to post-v1.0 but is noted as a prerequisite for any future interoperability work. Option 4 (registered variants) may be revisited if the protocol achieves adoption beyond single-operator deployments.
Issue: ASN.1 with Unaligned Packed Encoding Rules (UPER, ITU-T X.691) is a standardised bit-packing encoding that operates on the same principle as iotdata's core encoding: constrained integer ranges are mapped to minimum-bit-width representations, fields are not byte-aligned, and optional fields are indicated by a presence bitmap at the start of the SEQUENCE. UPER is deployed at scale in 3GPP signalling (LTE RRC, 5G NR), aviation (ADS-B uses a fixed-layout bit-packed format with the same philosophy), automotive V2X (CAM/DENM messages), and space telemetry (ESA's Packet Utilisation Standard via ASN1SCC). An informed reviewer of iotdata will immediately ask: "why not define an ASN.1 schema and use UPER?"
Comparison: An ASN.1 schema for the iotdata test payload would look approximately like:
    WeatherStation ::= SEQUENCE {
        battery      INTEGER (0..100)       OPTIONAL,  -- 7 bits
        linkQuality  INTEGER (0..100)       OPTIONAL,  -- 7 bits
        temperature  INTEGER (-400..850)    OPTIONAL,  -- 11 bits (range 1251)
        humidity     INTEGER (0..100)       OPTIONAL,  -- 7 bits
        pressure     INTEGER (8500..11050)  OPTIONAL,  -- 12 bits (range 2551)
        ...
    }

UPER would encode constrained integers in minimum bits (identical to iotdata), prefix the SEQUENCE with a presence bitmap (identical to iotdata's presence bytes), and pack fields without byte alignment. The wire encoding of the data fields would be nearly identical in size: with integer ranges chosen to match iotdata's field resolutions, UPER's encoding would produce a payload within 1–2 bits of iotdata's 16-byte test payload.
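The bit widths annotated in the schema follow from the size of each constrained range. As a minimal sketch (the function name `constrained_int_bits` is hypothetical, and only the simple fixed-range case is covered), the width is the number of bits needed to hold `hi - lo`:

```c
#include <assert.h>
#include <stdint.h>

/* Bits needed for a constrained integer lo..hi, i.e. ceil(log2(hi - lo + 1)).
 * The value is carried as an offset from lo; a single-valued range needs
 * zero bits. Sketch only: no extension markers or unconstrained types. */
unsigned constrained_int_bits(int32_t lo, int32_t hi)
{
    assert(lo <= hi);
    /* span = number of values - 1; widen first so hi - lo cannot overflow. */
    uint32_t span = (uint32_t)((int64_t)hi - (int64_t)lo);
    unsigned bits = 0;
    while (span != 0) {
        bits++;
        span >>= 1;
    }
    return bits;
}
```

For example, `constrained_int_bits(0, 100)` gives 7 and `constrained_int_bits(-400, 850)` gives 11, matching the battery and temperature annotations in the schema.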
Why iotdata does not use ASN.1 UPER:
- Toolchain weight. ASN.1 compilers are substantial tools. The open-source ASN1SCC (ESA) generates C and SPARK/Ada from ASN.1 grammars with zero-malloc guarantees and is suitable for embedded targets. However, ASN1SCC itself requires .NET 9 and Java JRE to run, and the generated code includes a runtime library (asn1crt.c, encoding helpers) that adds several KB of ROM. Commercial ASN.1 compilers (OSS Nokalva, Objective Systems) are expensive and typically licensed per-seat. The Python asn1tools package can generate UPER C source but supports only a subset of ASN.1. For a project targeting ESP32-C3 with 400 KB flash, the toolchain overhead and generated code size are non-trivial compared to iotdata's single .c/.h with no external dependencies.
- No domain-specific quantisation. UPER encodes constrained integers in minimum bits, but the constraint must be expressed as an integer range. To encode temperature as 0.1°C resolution over -40.0°C to +85.0°C, the ASN.1 schema must define the field as INTEGER (-400..850) and the application must perform the ×10 scaling on both sides. UPER provides the bit-packing but not the semantic quantisation: the schema does not express "this field is a temperature in °C with 0.1 resolution." iotdata's field type system encodes the physical meaning, resolution, and range as a single declaration, and the reference implementation performs quantisation and dequantisation automatically.
- No presence-byte-driven variable layout. UPER's OPTIONAL bitmap is fixed at schema definition time: every OPTIONAL field in the SEQUENCE gets a bit in the preamble, always. iotdata's variant system allows different deployments to define different field sets (variants) with different presence byte layouts, and the presence bytes serve double duty as both optional-field indicators and variant-specific field selectors. ASN.1 would require a separate schema (or CHOICE type) per variant, and the decoder would need to know which schema to apply, reintroducing the variant-selection problem at the ASN.1 level.
- No TLV extension mechanism. iotdata's TLV section (Section 9.5) allows arbitrary typed extensions (firmware version, GPS coordinates, text labels, image data) to be appended to any packet without schema changes. ASN.1 supports extensibility via the ... extension marker, but extending a UPER-encoded SEQUENCE requires the extension to be defined in the schema and recompiled. iotdata's TLV section is deliberately schema-free.
- Specification complexity. The ASN.1 standard spans ITU-T X.680–X.683 (notation) and X.690–X.696 (encoding rules). UPER alone (X.691) is a 107-page specification with complex rules for fragmentation, length determinants, and constraint visibility. iotdata's encoding rules fit in a single README section and can be implemented from scratch in an afternoon. For a single-purpose IoT sensor protocol, the full generality of ASN.1 is unnecessary overhead.
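The quantisation point can be made concrete. With UPER, the application itself would have to carry out a mapping like the one below on both sides; with iotdata, the field type implies it. This is an illustrative sketch only: `quant_t`, `quantise`, and `dequantise` are hypothetical names, not the reference implementation's API, and the range used in the usage note is the 0.1°C, -40.0°C to +85.0°C example from the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one quantised physical quantity. */
typedef struct {
    double min;   /* lowest representable physical value */
    double max;   /* highest representable physical value */
    double step;  /* resolution in physical units */
} quant_t;

/* Map a physical value to an unsigned code: clamp, offset by min,
 * divide by the step, and round to nearest. */
uint32_t quantise(const quant_t *q, double value)
{
    assert(q != NULL && q->step > 0.0 && q->max > q->min);
    if (value < q->min) value = q->min;
    if (value > q->max) value = q->max;
    return (uint32_t)((value - q->min) / q->step + 0.5);
}

/* Inverse mapping: recover the physical value represented by a code. */
double dequantise(const quant_t *q, uint32_t code)
{
    assert(q != NULL && q->step > 0.0);
    return q->min + (double)code * q->step;
}
```

With `quant_t temp = { -40.0, 85.0, 0.1 }`, a reading of 25.3°C quantises to code 653, which is the same offset value UPER would carry for INTEGER (-400..850) after the application's ×10 scaling.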
What ASN.1 UPER does better:
- Formal schema language with decades of tooling, validation, and interoperability testing.
- Automatic code generation for C, Ada, Python, Java, Go, and Rust.
- Proven at enormous scale (every LTE/5G device on earth uses UPER for RRC).
- Schema evolution via extension markers — forward and backward compatibility is a solved problem.
- Interface Control Document (ICD) generation from the schema.
Position: iotdata's encoding is philosophically identical to ASN.1 UPER but trades generality for simplicity, domain awareness, and minimal toolchain dependency. For deployments where ASN.1 tooling is already available (e.g. space systems, automotive V2X), UPER is the superior choice. For bare-metal IoT sensors where the entire firmware fits in 256 KB and the developer does not have access to (or budget for) an ASN.1 compiler, iotdata provides the same wire efficiency with a fraction of the toolchain complexity.
The existence of ASN1SCC (open-source, zero-malloc, ESA-funded) narrows this gap considerably. A future version of iotdata could offer an ASN.1 schema as an alternative interface to the same wire format, allowing ASN.1-equipped teams to use their preferred toolchain while remaining wire-compatible with the C reference implementation.
Issue: SenML (Sensor Measurement Lists, IETF RFC 8428) and LwM2M (Lightweight M2M, OMA SpecWorks) are the IETF/OMA standards for IoT sensor data representation and device management respectively. CayenneLPP's type codes are derived from LwM2M/IPSO Smart Object IDs. Any IoT data format should be positioned relative to these standards.
SenML overview: SenML defines a data model for sensor measurements as a list of records, each containing a name, value, unit, and optional timestamp. It supports JSON, CBOR, XML, and EXI representations. The CBOR representation uses integer keys for compactness (e.g. key -2 for Base Name, key 2 for Value). A minimal SenML+CBOR record for one temperature reading is approximately 15–20 bytes (CBOR map with name string, unit string, and double-precision value). A full weather station payload (6 sensor readings) would be approximately 90–120 bytes in SenML+CBOR — 6–7× larger than iotdata's 16-byte encoding.
LwM2M overview: LwM2M defines a device management and service enablement protocol built on CoAP. It uses an object/resource model where standardised Object IDs (e.g. 3303 = Temperature, 3304 = Humidity, 3323 = Pressure) identify sensor types. LwM2M operates over CoAP/UDP with DTLS security and requires a LwM2M server. It is designed for bidirectional device management (firmware update, configuration, observation) rather than unidirectional sensor data streaming.
Why iotdata does not use SenML or LwM2M:
- Wire overhead. SenML's self-describing records carry field names, units, and full-precision values per reading. Even in CBOR, this is 6–7× larger than iotdata. For LoRa at SF12 (51-byte maximum payload), a SenML+CBOR weather station payload would not fit in a single packet.
- Protocol weight. LwM2M requires CoAP, DTLS, and a server-side LwM2M implementation. This is a full application-layer stack unsuitable for bare-metal LoRa devices with no IP connectivity. SenML as a data format is lighter but still assumes a transport capable of carrying its CBOR/JSON payloads.
- Unidirectional design. iotdata is designed for fire-and-forget sensor telemetry on unidirectional or asymmetric links. LwM2M's observation model (where the server subscribes to resources and receives notifications) assumes bidirectional connectivity.
What SenML/LwM2M do better:
- Standardised sensor type identifiers (IPSO Object IDs) with IANA-registered units — a solved namespace problem.
- Self-describing payloads with no out-of-band schema requirement.
- Ecosystem integration with IoT platforms (AWS IoT, Azure IoT Hub, Thingsboard) that natively parse SenML.
- Formal extensibility through IANA registries.
Relevance to iotdata: If iotdata adopts global field type IDs (see J.27), aligning those IDs with IPSO/LwM2M Object IDs where possible would provide semantic interoperability without wire overhead. A gateway decoding iotdata could map field type 0x03 (IOTDATA_FIELD_ENVIRONMENT) to LwM2M Objects 3303 (Temperature) + 3304 (Humidity) + 3315 (Barometer), enabling integration with LwM2M-aware platforms at the application layer.
Issue: Appendix I.7 presents airtime comparisons across encoding formats and spreading factors, but does not translate these into the operational metric that matters most for battery-powered deployments: projected battery life.
Worked example: Consider a solar-powered weather station transmitting the test payload (Section I.1) every 60 seconds using an SX1262 transceiver at +14 dBm on EU868 (125 kHz bandwidth).
Key parameters from the SX1262 datasheet:
- TX current at +14 dBm (DC-DC): ~45 mA
- RX current: 4.2 mA
- Sleep current (warm start, RTC running): 1.2 µA
- MCU (ESP32-C3) deep sleep: ~5 µA
- Total sleep current: ~6.2 µA
For each transmission cycle, the energy cost is dominated by the TX duration. Using airtime values from Appendix I.7:
At SF7 (short range, high data rate):
| Format | Payload | Airtime | TX charge per cycle | TX charge/day |
|---|---|---|---|---|
| iotdata | 16 B | 46.3 ms | 0.579 µAh | 0.834 mAh |
| Bitproto | 20 B | 51.5 ms | 0.644 µAh | 0.927 mAh |
| CayenneLPP | 54 B | 97.5 ms | 1.219 µAh | 1.755 mAh |
| Protobuf | 42 B | 82.2 ms | 1.028 µAh | 1.480 mAh |
| CBOR | 66 B | 113.2 ms | 1.415 µAh | 2.037 mAh |
| JSON | 177 B | 256.0 ms | 3.200 µAh | 4.608 mAh |
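At SF7, the differences are small in absolute terms: all formats consume <5 mAh/day on TX alone. For the compact formats this is within an order of magnitude of the sleep floor (~0.149 mAh/day), and in practice MCU active time, which this table omits, adds a further cost that is largely independent of the encoding. Battery life at this spreading factor is measured in years for the compact binary formats.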
At SF12 (long range, low data rate):
| Format | Payload | Airtime | TX charge per cycle | TX charge/day |
|---|---|---|---|---|
| iotdata | 16 B | 1,482 ms | 18.5 µAh | 26.7 mAh |
| CayenneLPP | 54 B | 3,121 ms* | 39.0 µAh | 56.2 mAh |
| JSON | 177 B | —** | — | — |
* CayenneLPP's 54 bytes exceeds the SF12 maximum payload of 51 bytes. The value shown assumes the DR0 maximum is relaxed or the payload is split across two packets (doubling actual TX cost).
** JSON's 177 bytes far exceeds the SF12 maximum payload. Multiple packets required.
On a 3000 mAh battery (e.g. an 18650 Li-ion cell), assuming 80% usable capacity (2400 mAh), with sleep current of ~0.149 mAh/day:
| Format | SF7 battery life | SF12 battery life |
|---|---|---|
| iotdata | ~2,440 days (6.7 years) | ~89 days (2.9 months) |
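| CayenneLPP | ~1,260 days (3.5 years) | ~42 days* (1.4 months) |
* Computed from the single-packet airtime shown above, which exceeds the SF12 payload limit; an actual two-packet transmission would cost more.
Analysis: At SF7, both formats shown deliver multi-year battery life, so encoding efficiency is rarely the deciding factor there, even though in this TX-plus-sleep model (2400 mAh divided by daily TX charge plus daily sleep charge) iotdata lasts roughly 1.9× as long as CayenneLPP. At SF12, the regime where encoding efficiency matters most, battery life drops to months and iotdata's 16-byte payload delivers roughly 2× the battery life of CayenneLPP. For solar-powered deployments with marginal winter charging, this difference can be the margin between continuous operation and data gaps.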
The battery life advantage scales with transmission frequency. A sensor transmitting every 30 seconds at SF12 would halve all battery life figures, making the encoding efficiency difference more pronounced.
Limitation of this analysis: These figures account only for TX and sleep energy. In practice, MCU wake time (sensor reading, encoding, SPI transfer), RX windows (for LoRaWAN Class A), and DC-DC converter efficiency also contribute. TX energy is typically 60–80% of per-cycle energy at SF10+, making the airtime comparison a reasonable proxy for total energy at high spreading factors.
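The arithmetic behind the tables in this worked example can be reproduced with a few lines of C. This is a sketch of the simplified model only (TX charge plus a constant sleep floor); the function names are hypothetical, and the inputs are the airtime, current, interval, and capacity figures quoted above.

```c
#include <assert.h>

/* Charge drawn by one transmission, in microamp-hours.
 * mA * ms = microcoulombs, and 1 uAh = 3600 microcoulombs. */
double tx_charge_uah(double airtime_ms, double tx_current_ma)
{
    assert(airtime_ms >= 0.0 && tx_current_ma >= 0.0);
    return airtime_ms * tx_current_ma / 3600.0;
}

/* Daily charge in mAh: TX cycles plus a sleep current assumed to flow
 * for the whole day (MCU active time and RX windows are ignored). */
double daily_charge_mah(double airtime_ms, double tx_current_ma,
                        double interval_s, double sleep_current_ua)
{
    assert(interval_s > 0.0 && sleep_current_ua >= 0.0);
    double cycles_per_day = 86400.0 / interval_s;
    double tx_mah = tx_charge_uah(airtime_ms, tx_current_ma)
                    * cycles_per_day / 1000.0;
    double sleep_mah = sleep_current_ua * 24.0 / 1000.0;
    return tx_mah + sleep_mah;
}

/* Battery life in days for a given usable capacity. */
double battery_life_days(double usable_mah, double daily_mah)
{
    assert(usable_mah >= 0.0 && daily_mah > 0.0);
    return usable_mah / daily_mah;
}
```

For the iotdata row, `battery_life_days(2400.0, daily_charge_mah(46.3, 45.0, 60.0, 6.2))` gives roughly 2,440 days at SF7, and substituting the 1,482 ms SF12 airtime gives roughly 89 days.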
Issue: iotdata encodes every field as an absolute value on every transmission. For sensor data with high temporal correlation (temperature changes <0.5°C per minute, pressure changes <0.5 hPa per minute), this transmits substantial redundant information. A 9-bit absolute temperature (0.1°C over -40 to +85°C) could often be replaced by a 4-bit signed delta (±0.7°C), reducing the per-field cost from 9 bits to 4 bits for ~95% of consecutive readings.
Information-theoretic context: This observation connects to J.9 (information-theoretic encoding efficiency). iotdata's quantisation optimises the per-field encoding to the minimum bits required for the field's static range. Delta encoding would optimise for the dynamic range of consecutive readings — the temporal entropy rather than the static entropy. For slowly changing environmental data, temporal entropy is substantially lower than static entropy.
Comparison: No existing IoT payload format in the comparison set (CayenneLPP, Nanopb, Bitproto, TinyCBOR, SenML) implements delta encoding. This is not a gap relative to competitors but an opportunity for iotdata to extend its efficiency advantage.
Delta encoding is well-established in other domains: video codecs (I-frames vs P-frames), audio codecs (ADPCM), GPS track compression (delta-of-deltas), and time-series databases (Gorilla compression). The pattern is always the same: transmit a full keyframe periodically and deltas between keyframes.
Design sketch:
A delta-capable iotdata variant would operate as follows:
- Every N-th packet (e.g. N=10) is a keyframe, encoded identically to the current absolute format. The keyframe establishes the reference values for all fields.
- Intermediate packets are delta frames. Each present field is encoded as a signed delta from the previous keyframe value, using a smaller bit width defined per field type:

  | Field | Absolute bits | Delta bits | Delta range |
  |---|---|---|---|
  | Temperature | 9 | 5 | ±1.5°C |
  | Humidity | 7 | 4 | ±7% |
  | Pressure | 9 | 5 | ±1.5 hPa |
  | Wind speed | 8 | 5 | ±1.5 m/s |
  | Wind direction | 9 | 5 | ±15° |
  | Rain | 5 | 3 | ±0.3 mm |
  | Solar | 10 | 5 | ±15 W/m² |

- If a delta exceeds the representable range, the field falls back to absolute encoding for that packet (indicated by a flag bit, or by transmitting a keyframe).
Estimated savings: For the test payload at steady-state (6 sensor fields present), absolute encoding uses ~55 data bits. Delta encoding would use ~27 data bits — approximately 50% reduction in the data section, saving ~3.5 bytes per packet. Over a 10-packet keyframe cycle, 9 delta frames save ~31.5 bytes total, at the cost of added decoder complexity and keyframe synchronisation requirements.
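The per-field step of the design sketch (encode a signed delta against the keyframe value, or report that a fallback is needed) might look like the following. This is an illustrative sketch, not reference code: `delta_encode` and `delta_decode` are hypothetical names, values are assumed to be quantised codes below 65536, and deltas are assumed to be stored as two's-complement in `nbits` bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Try to express cur as an nbits-wide signed delta from the keyframe value.
 * Returns false when the delta does not fit, in which case the caller
 * falls back to absolute encoding (or sends a keyframe). */
bool delta_encode(uint32_t key, uint32_t cur, unsigned nbits, uint32_t *out)
{
    assert(out != NULL && nbits >= 2 && nbits <= 16);
    assert(key <= 0xFFFFu && cur <= 0xFFFFu);
    int32_t delta = (int32_t)cur - (int32_t)key;
    int32_t lo = -(1 << (nbits - 1));      /* e.g. -16 for 5 bits */
    int32_t hi = (1 << (nbits - 1)) - 1;   /* e.g. +15 for 5 bits */
    if (delta < lo || delta > hi)
        return false;
    *out = (uint32_t)delta & ((1u << nbits) - 1u);  /* keep low nbits */
    return true;
}

/* Reconstruct the absolute code from the keyframe value and a delta code.
 * The caller is expected to range-check the result against the field. */
uint32_t delta_decode(uint32_t key, uint32_t code, unsigned nbits)
{
    assert(nbits >= 2 && nbits <= 16);
    assert(key <= 0xFFFFu && code < (1u << nbits));
    int32_t delta = (int32_t)code;
    if (code & (1u << (nbits - 1)))        /* sign bit set: negative delta */
        delta -= (int32_t)(1 << nbits);
    return (uint32_t)((int32_t)key + delta);
}
```

With a keyframe code of 653 and a 5-bit delta, a current code of 660 encodes as 7 and 640 encodes as 19 (that is, -13), while 670 is out of range and triggers the fallback.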
Challenges:
- Keyframe synchronisation. A receiver that misses the keyframe cannot decode subsequent delta frames. This is the same problem as joining a video stream mid-GOP. Mitigations: periodic keyframes at a rate faster than the expected packet loss rate; transmitting the keyframe index in the header so the receiver knows when to expect the next one; or a "request keyframe" mechanism for bidirectional links.
- Compounded by J.18. Because deltas are taken against the keyframe, a corrupted keyframe value produces a wrong reference for every delta frame that follows, and the error persists until the next keyframe. This is strictly worse than absolute encoding's corruption behaviour (where each packet is independent).
- Complexity. Both encoder and decoder must maintain per-field state across packets. The encoder must track the last keyframe values; the decoder must reconstruct absolute values from deltas. This adds RAM (one iotdata_reading_t per tracked station) and code complexity.
- Variant table expansion. Delta bit widths would need to be defined per field type, adding another dimension to the variant table.
Recommendation: Delta encoding is deferred beyond v1.0. The complexity and
synchronisation challenges outweigh the 3–4 byte savings for most deployments.
However, for deployments transmitting at SF12 where every byte matters (see
J.25), delta encoding could reduce a 16-byte payload to ~12 bytes, potentially
allowing a lower spreading factor and substantially reducing airtime. The design
sketch above is preserved for future consideration. If implemented, it should be
a variant-level option (e.g. IOTDATA_VARIANT_FLAG_DELTA) rather than a
protocol-level change, keeping backward compatibility with absolute-only
decoders.
Issue: J.22 identifies the lack of variant table discovery as a limitation. J.19 identifies the absence of schema tooling. This item proposes a concrete mechanism that addresses both: transmitting the variant map from the device as a compact TLV, enabling any receiver to decode subsequent packets without pre-shared configuration.
The core analogy is dictionary-based compression: zstd can transmit a dictionary once and reference it for all subsequent frames. Similarly, iotdata could transmit the variant definition once at boot (or periodically) and every subsequent packet is decoded against the cached variant map.
Prerequisite — Global Field Type Identifiers:
For the variant map to be meaningful to any receiver, field types must have
globally unique, stable identifiers. Currently, field types
(IOTDATA_FIELD_BATTERY, IOTDATA_FIELD_ENVIRONMENT, etc.) are
implementation-internal enum values in the C reference implementation. Their
numeric values are not part of the specification and could change between
releases.
Promoting these to protocol-level identifiers means:
- Each field type is assigned a permanent numeric ID in the specification.
- The ID encodes the field's data layout (bit widths, quantisation, sub-field structure) — a decoder that knows ID 0x03 can decode an ENVIRONMENT field without any additional information.
- IDs are never reassigned or reused. New field types receive new IDs.
- The ID space is partitioned: 0x00–0x3F for specification-defined types, 0x40–0x7F for user-defined types (with locally-scoped semantics).
A suggested initial assignment (illustrative, subject to specification review):
| ID | Field Type | Sub-fields |
|---|---|---|
| 0x01 | BATTERY | voltage_pct (7 bits) |
| 0x02 | LINK_QUALITY | rssi_pct (7 bits) |
| 0x03 | ENVIRONMENT | temp (9) + humidity (7) + pressure (9) |
| 0x04 | WIND | speed (8) + direction (9) + gust (5) |
| 0x05 | RAIN | accumulation (5 bits) |
| 0x06 | SOLAR | irradiance (10 bits) |
| 0x07 | UV_INDEX | uv (4 bits) |
| 0x08 | SOIL | moisture (7) + temp (9) |
| 0x09 | AIR_QUALITY | pm2.5 (10) + pm10 (10) + aqi (8) |
| 0x0A | WATER_QUALITY | tds (10) + ph (7) + temp (9) |
| 0x0B | SNOW_DEPTH | depth (10 bits) |
| 0x0C | LEAF_WETNESS | wetness (7 bits) |
| ... | ... | ... |
| 0x40 | USER_DEFINED_0 | (layout defined by variant map) |
Variant Map TLV Design:
A new TLV type (proposed: 0x10, within the reserved 0x10–0x1F sensor metadata range) carries the variant definition:
    TLV Header: [0x10][length]          -- 2 bytes (standard 6+10 bit TLV header)
    Payload:    [num_presence_bytes:3]  -- 3 bits: number of presence bytes (1-7)
                [num_fields:5]          -- 5 bits: number of fields in variant (1-31)
                [field_0_id:7]          -- 7 bits per field: global field type ID
                [field_1_id:7]
                ...
                [field_N_id:7]
For the weather station test variant (2 presence bytes, 6 fields):
    Presence count: 2 → 010            (3 bits)
    Field count:    6 → 00110          (5 bits)
    Field IDs:      BATTERY → 0000001  (7 bits)
                    LINK    → 0000010  (7 bits)
                    ENV     → 0000011  (7 bits)
                    WIND    → 0000100  (7 bits)
                    RAIN    → 0000101  (7 bits)
                    SOLAR   → 0000110  (7 bits)
Total: 3 + 5 + (6 × 7) = 50 bits, padded to 7 bytes of payload, plus 2 bytes of TLV header = 9 bytes. Transmitted once at boot alongside the VERSION TLV, then cached by the gateway per station_id.
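The payload layout above can be packed with a small bit writer. The following is a minimal sketch under stated assumptions: bits are written MSB-first and the final byte is zero-padded (the proposal does not fix either choice), the TLV header is omitted, and `bitw_t`, `bitw_put`, and `variant_map_pack` are hypothetical names rather than libiotdata functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimal MSB-first bit writer over a caller-supplied, zeroed buffer. */
typedef struct {
    uint8_t *buf;
    size_t   cap;   /* capacity in bytes */
    size_t   bits;  /* bits written so far */
} bitw_t;

/* Append the low nbits of value, most significant bit first. */
static void bitw_put(bitw_t *w, uint32_t value, unsigned nbits)
{
    assert(nbits <= 32 && w->bits + nbits <= w->cap * 8);
    for (unsigned i = nbits; i-- > 0;) {
        if ((value >> i) & 1u)
            w->buf[w->bits / 8] |= (uint8_t)(0x80u >> (w->bits % 8));
        w->bits++;
    }
}

/* Pack [num_presence:3][num_fields:5][field_id:7]... into out.
 * Returns the payload length in bytes, rounded up. */
size_t variant_map_pack(uint8_t *out, size_t cap, unsigned num_presence,
                        const uint8_t *field_ids, unsigned num_fields)
{
    assert(out != NULL && field_ids != NULL);
    assert(num_presence >= 1 && num_presence <= 7);
    assert(num_fields >= 1 && num_fields <= 31);
    memset(out, 0, cap);
    bitw_t w = { out, cap, 0 };
    bitw_put(&w, num_presence, 3);
    bitw_put(&w, num_fields, 5);
    for (unsigned i = 0; i < num_fields; i++) {
        assert(field_ids[i] <= 0x7Fu);   /* IDs are 7 bits wide */
        bitw_put(&w, field_ids[i], 7);
    }
    return (w.bits + 7) / 8;
}
```

For the weather station example (2 presence bytes, field IDs 1 to 6), this produces the 7 payload bytes 46 02 08 18 40 A1 80 under the MSB-first assumption.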
Operational model:
- Boot: Sensor transmits a VERSION TLV (type 0x01) and a VARIANT MAP TLV (type 0x10) in the first packet after power-on or reset.
- Periodic refresh: The variant map TLV is retransmitted every N packets (e.g. N=100, or once per hour) to handle gateway restarts and new receivers joining the network.
- Gateway caching: The gateway maintains a map of station_id → variant_definition. When a variant map TLV is received, the gateway stores or updates the entry. Subsequent data packets from that station_id are decoded using the cached variant.
- Unknown station: If a data packet arrives from a station_id with no cached variant, the gateway buffers the raw packet and waits for the next variant map TLV. Alternatively, the gateway can request a retransmission on bidirectional links.
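The gateway side of this operational model reduces to a small keyed cache. The sketch below is one possible shape, not reference code: the 16-bit `station_id`, the fixed capacity of 32 stations, and the names `variant_cache_t`, `variant_cache_store`, and `variant_cache_lookup` are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS        32  /* assumed maximum number of tracked stations */
#define MAX_VARIANT_FIELDS 31  /* limit of the 5-bit field count */

typedef struct {
    uint8_t  used;
    uint16_t station_id;  /* width is an assumption of this sketch */
    uint8_t  num_presence_bytes;
    uint8_t  num_fields;
    uint8_t  field_ids[MAX_VARIANT_FIELDS];
} variant_entry_t;

typedef struct {
    variant_entry_t slots[CACHE_SLOTS];  /* zero-initialise before use */
} variant_cache_t;

/* Return the cached variant for a station, or NULL if none is known yet
 * (the caller then buffers the packet until a variant map TLV arrives). */
const variant_entry_t *variant_cache_lookup(const variant_cache_t *c,
                                            uint16_t station_id)
{
    assert(c != NULL);
    for (size_t i = 0; i < CACHE_SLOTS; i++)
        if (c->slots[i].used && c->slots[i].station_id == station_id)
            return &c->slots[i];
    return NULL;
}

/* Store or update the variant for a station. Returns 0 on success,
 * -1 if the field count is invalid or the cache is full. */
int variant_cache_store(variant_cache_t *c, uint16_t station_id,
                        uint8_t num_presence_bytes,
                        const uint8_t *field_ids, uint8_t num_fields)
{
    assert(c != NULL && field_ids != NULL);
    if (num_fields == 0 || num_fields > MAX_VARIANT_FIELDS)
        return -1;
    variant_entry_t *slot = NULL;
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        if (c->slots[i].used && c->slots[i].station_id == station_id) {
            slot = &c->slots[i];          /* existing entry: update it */
            break;
        }
        if (!c->slots[i].used && slot == NULL)
            slot = &c->slots[i];          /* remember the first free slot */
    }
    if (slot == NULL)
        return -1;
    slot->used = 1;
    slot->station_id = station_id;
    slot->num_presence_bytes = num_presence_bytes;
    slot->num_fields = num_fields;
    memcpy(slot->field_ids, field_ids, num_fields);
    return 0;
}
```

A production gateway would also need an eviction or expiry policy so that stations which stop transmitting do not occupy slots indefinitely; that is left out here.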
Relationship to IPSO/LwM2M: If global field type IDs are aligned with IPSO Smart Object IDs where possible (J.24), the variant map TLV provides enough information for a gateway to not only decode the packet but also map each field to a standardised semantic type — bridging iotdata's compact wire format to the broader IoT standards ecosystem.
Relationship to other J items:
- J.19 (schema tooling): The global field type ID registry IS the schema. A schema tool generates variant tables from a list of field type IDs.
- J.20 (multi-language decoders): A decoder that knows the global field type registry can decode any variant map TLV and then decode any subsequent packet — no per-variant code generation needed.
- J.21 (LoRaWAN integration): A generic JavaScript payload formatter that parses the variant map TLV can decode any iotdata variant on TTN/ChirpStack without per-deployment configuration.
- J.22 (variant discovery): This item IS the concrete mechanism for variant discovery.
Recommendation: Global field type identifiers should be defined in the v1.0 specification even if the variant map TLV is deferred to a future version. Locking the IDs now ensures forward compatibility — any variant tables created today will be expressible as variant map TLVs in the future. The variant map TLV itself is a low-risk addition (it uses the existing TLV mechanism, adds no overhead to data packets, and is entirely optional) and could be included in v1.0 as an OPTIONAL feature.
Issue: Appendix G defines a mesh relay protocol for multi-hop iotdata delivery. This protocol should be compared against established mesh routing approaches for low-power wireless networks to contextualise its design choices and identify trade-offs.
Comparison targets:
RPL (RFC 6550) — IPv6 Routing Protocol for Low-Power and Lossy Networks:
RPL is the IETF standard for mesh routing in constrained networks. It builds a Destination-Oriented Directed Acyclic Graph (DODAG) rooted at a border router, using periodic DIO (DODAG Information Object) messages to construct and maintain the topology. Key characteristics:
- Full IP stack required. RPL operates on 6LoWPAN/IPv6, requiring a 6LoWPAN adaptation layer, IPv6, ICMPv6, and the RPL control protocol. The Contiki-NG implementation uses approximately 30–50 KB ROM and 10–20 KB RAM.
- Proactive routing. Routes are maintained continuously via DIO/DIS/DAO control messages, even when no data is being sent. The Trickle timer reduces control traffic in stable topologies but still consumes airtime and energy.
- Bidirectional. Supports multipoint-to-point (sensor→gateway), point-to-multipoint (gateway→sensors), and point-to-point (sensor→sensor).
- Topology-aware. RPL maintains a routing table and selects routes based on an Objective Function (e.g. minimise hop count, maximise path ETX).
- Target environment. IEEE 802.15.4 networks (Zigbee-class), typically sub-100m range, hundreds to thousands of nodes, 250 kbps data rate.
Meshtastic — LoRa Mesh Protocol:
Meshtastic is an open-source LoRa mesh protocol designed for off-grid text messaging. It uses managed flood routing (since v2.6) with distinct strategies for broadcast and direct messages. Key characteristics:
- Managed flooding. Broadcast messages are rebroadcast by all receiving nodes (up to a configurable hop limit, default 3, max 7). Nodes use a rebroadcast scoring heuristic based on SNR, hop count, and role to decide whether to relay — nodes unlikely to improve coverage suppress their rebroadcast.
- No routing tables. Nodes do not maintain routes. Each packet carries a hop limit and nodes make independent forwarding decisions. This eliminates control-plane overhead entirely.
- Protocol Buffers payload. The packet header is raw bytes (for hardware filtering efficiency) but the payload is Protobuf-encoded. This adds encoding overhead but enables cross-vendor interoperability.
- High duty cycle. Nodes must listen continuously (or near-continuously) to participate in mesh relaying. This fundamentally conflicts with battery-powered sensor operation — Meshtastic nodes typically require USB power or frequent charging.
- Target environment. LoRa P2P at 868/915 MHz, 1–20 km range per hop, tens to hundreds of nodes, text messaging and telemetry.
- Scalability concerns. Flooding-based routing generates O(N) transmissions per message in an N-node network. Community experience suggests congestion issues beyond ~100 nodes in a single mesh, particularly on the default LONG_FAST preset with 10% EU duty cycle.
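The O(N) scaling can be made concrete with a rough upper bound. The sketch below assumes a single collision domain in which every node originates one message per interval and every other node rebroadcasts it once. Managed flooding and the hop limit suppress some of those rebroadcasts, so real load is lower. The function name and the parameter values in the note that follows are illustrative assumptions, not measurements.

```c
#include <stdint.h>

/* Worst-case fraction of channel time requested when every node originates
 * one message per interval and every other node rebroadcasts it once.
 * A result above 1.0 means more airtime is requested than the channel has. */
static double flood_channel_load(uint32_t nodes, double airtime_s, double interval_s) {
    double tx_per_message = (double)nodes;        /* 1 original + (N-1) relays */
    double messages_per_interval = (double)nodes; /* one message per node */
    return tx_per_message * messages_per_interval * airtime_s / interval_s;
}
```

With an assumed 0.5 s airtime per packet and one message per node every 600 s, 20 nodes request roughly a third of the channel, while 100 nodes request more than eight times the available airtime. Per-message cost grows with N and total load with N squared, which is consistent with congestion appearing as a mesh approaches the hundred-node range.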
Thread (IEEE 802.15.4 / 6LoWPAN):
Thread is a low-power mesh networking protocol for home automation. It uses 6LoWPAN over IEEE 802.15.4 with a distance-vector routing protocol and MLE (Mesh Link Establishment) for link and network management. Thread is a full networking stack (IP-based, with DTLS security, DNS-SD service discovery, and border router integration) whose routing nodes are expected to be always powered. Its resource requirements (64 KB ROM, 32 KB RAM minimum) and always-on radio make it unsuitable for battery-powered LoRa sensors.
Zigbee Mesh:
Zigbee uses a hybrid routing approach (AODV reactive routing + tree routing) over IEEE 802.15.4. Like Thread, it requires substantial stack resources and an always-on radio for routing nodes. Zigbee's mesh is designed for dense, short-range networks (10–100m) with mains-powered routers and battery-powered end devices that do not participate in routing.
iotdata Appendix G mesh — design positioning:
| Dimension | RPL | Meshtastic | Thread/Zigbee | iotdata G |
|---|---|---|---|---|
| Routing strategy | Proactive DODAG | Managed flood | Proactive/reactive | Simple relay |
| Control overhead | DIO/DAO periodic | None (flooding) | MLE / route discovery | None |
| Routing table | Yes (per-node) | No | Yes | No |
| IP stack required | Yes (6LoWPAN) | No | Yes | No |
| RAM for routing | 10–20 KB | ~1 KB | 32+ KB | ~100 bytes |
| Payload awareness | No (opaque) | Protobuf | No (opaque) | iotdata-native |
| Relay duty cycle | Always-on | Always-on | Always-on | Duty-cycled |
| Battery-powered relays | Impractical | Impractical | Impractical | Designed for |
| Max hops (practical) | 10+ | 3–7 | 10+ | 3–5 |
| Node scale | 1000+ | ~100 | 250+ | 10–50 |
| Bidirectional | Yes | Yes | Yes | No (uplink only) |
Key trade-offs in iotdata's approach:
- Simplicity over optimality. iotdata's mesh relay is a simple store-and-forward mechanism: a relay node receives a sensor packet, stores it, and retransmits it in a subsequent TX window. There is no route discovery, no topology management, and no routing table. This is viable because iotdata assumes a sparse, mostly-static topology with a small number of relay hops between sensor and gateway.
- Duty-cycled relays. Unlike RPL, Meshtastic, Thread, and Zigbee, all of which require routing nodes to listen continuously, iotdata relays can operate on duty-cycled schedules. A relay wakes for a brief RX window, buffers any received packets, and retransmits them in the next TX window. This enables solar-powered or battery-powered relay nodes in locations without mains power.
- Uplink-only. iotdata's mesh supports only sensor→gateway traffic. There is no downlink path (gateway→sensor) through the mesh. This eliminates the complexity of bidirectional routing but means that remote sensors cannot receive configuration updates, firmware, or acknowledgements via the mesh.
- Payload-native. Relay nodes can optionally inspect and filter iotdata packets (e.g. suppress duplicate readings, aggregate multiple sensors into a single relay packet). RPL and Thread treat payloads as opaque IP packets.
- No scalability beyond sparse topologies. The simple relay approach does not handle network congestion, route selection, or topology changes. For dense deployments (>50 nodes) or dynamic topologies (mobile sensors), RPL or a managed-flooding approach would be necessary.
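The store-and-forward and duty-cycling trade-offs above can be sketched as a small fixed-size relay buffer. This is an illustration under stated assumptions, not the Appendix G relay: all names and sizes are hypothetical, packets are forwarded unmodified, and duplicate suppression is limited to byte-identical packets within one wake cycle.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes: 4 slots of 24 bytes plus bookkeeping is 101 bytes. */
#define RELAY_SLOTS   4
#define RELAY_MAX_LEN 24

typedef struct {
    uint8_t len;
    uint8_t data[RELAY_MAX_LEN];
} relay_slot_t;

typedef struct {
    relay_slot_t slots[RELAY_SLOTS];
    uint8_t      count;
} relay_buffer_t;

/* Stand-in for a radio transmit function (hypothetical signature). */
typedef void (*relay_tx_fn)(const uint8_t *pkt, uint8_t len, void *ctx);

/* RX window: queue a received packet for the next TX window. Drops empty or
 * oversize packets, exact duplicates, and anything arriving when full. */
static bool relay_on_receive(relay_buffer_t *rb, const uint8_t *pkt, uint8_t len) {
    if (len == 0 || len > RELAY_MAX_LEN || rb->count == RELAY_SLOTS) return false;
    for (uint8_t i = 0; i < rb->count; i++) {
        if (rb->slots[i].len == len && memcmp(rb->slots[i].data, pkt, len) == 0)
            return false;
    }
    rb->slots[rb->count].len = len;
    memcpy(rb->slots[rb->count].data, pkt, len);
    rb->count++;
    return true;
}

/* TX window: retransmit buffered packets in arrival order, then empty the
 * buffer. Returns the number of packets sent. */
static uint8_t relay_flush(relay_buffer_t *rb, relay_tx_fn tx, void *ctx) {
    uint8_t sent = 0;
    for (uint8_t i = 0; i < rb->count; i++) {
        tx(rb->slots[i].data, rb->slots[i].len, ctx);
        sent++;
    }
    rb->count = 0;
    return sent;
}

/* Example callback: adds up transmitted bytes instead of driving a radio. */
static void relay_count_bytes(const uint8_t *pkt, uint8_t len, void *ctx) {
    (void)pkt;
    *(size_t *)ctx += len;
}
```

The relay keeps no state beyond the packets awaiting the next TX window, which is what allows it to sleep between windows. With the sizes chosen here the whole buffer occupies about 100 bytes.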
Recommendation: iotdata's mesh relay is appropriate for its target use case: sparse, static sensor networks with 3–5 hops where relay nodes must operate on limited power budgets. For deployments requiring larger scale, bidirectional communication, or dynamic topologies, an IP-based mesh (RPL over 6LoWPAN) or a flooding-based mesh (Meshtastic-style) should be used as the transport layer, with iotdata as the payload format within that transport. The two concerns (mesh routing and payload encoding) are orthogonal — iotdata packets can be carried over any transport, and Appendix G's relay protocol is an optional convenience for deployments that do not need or cannot afford a full mesh networking stack.
This document and the reference implementation are maintained at [https://libiotdata.org].