Oboron Protocol Specification

1.0 draft-rev2 (2026-03-04)

1. Formats
2. Algorithm
3. Key Management
4. Protocol API
5. Compatibility

1. Formats

An Oboron format represents the full transformation of the plaintext to the encrypted text (obtext), including:

Encryption: Plaintext UTF-8 string encrypted to ciphertext bytes using a cryptographic algorithm
Encoding: The binary payload is encoded to a string representation

1.1 Scheme + Encoding = Format

Formats combine a scheme (cryptographic algorithm) with an encoding (string representation):

Scheme: Cryptographic algorithm + mode + parameters (e.g., aasv)
Encoding: String representation method (e.g., c32)
Format: Scheme + encoding = complete transformation (e.g., aasv.c32)

Given an encryption key, the format thus uniquely specifies the complete transformation from a plaintext string to an encoded obtext string.

Formats are represented by identifiers:

ob:{scheme}.{encoding} (URI-like syntax, e.g., ob:aasv.c32)
{scheme}.{encoding}, when the context is clear

API Notes:

The ob: namespace prefix is NOT used in the oboron API. Formats like aasv.c32 MUST be used directly.
The public interface MUST use enc/dec names for methods and functions. The enc operation comprises the full process, including the encryption and encoding stages.
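As an illustration of the identifier syntax above, the following minimal sketch splits a format identifier into its scheme and encoding parts. The function name `parse_format` and the two lookup sets are hypothetical helpers, not part of the Oboron API; the sketch accepts the optional `ob:` prefix only because documentation and tooling may carry it, whereas the API itself takes the bare `{scheme}.{encoding}` form.

```python
# Scheme and encoding IDs taken from this specification.
KNOWN_SCHEMES = {"aasv", "aags", "apsv", "apgs", "upbc", "zrbcx"}
KNOWN_ENCODINGS = {"b32", "c32", "b64", "hex"}


def parse_format(identifier: str) -> tuple[str, str]:
    """Split 'ob:aasv.c32' or 'aasv.c32' into (scheme, encoding)."""
    # Drop the URI-like namespace prefix if present.
    body = identifier.removeprefix("ob:")
    scheme, sep, encoding = body.partition(".")
    if not sep or scheme not in KNOWN_SCHEMES or encoding not in KNOWN_ENCODINGS:
        raise ValueError(f"invalid Oboron format identifier: {identifier!r}")
    return scheme, encoding
```

For example, `parse_format("ob:aasv.c32")` and `parse_format("aasv.c32")` both yield the pair `("aasv", "c32")`.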

1.2 Encodings

All encodings MUST produce output with no padding characters.

b32 - standard base32 (RFC 4648 §6): uppercase, no padding. Alphabet: ABCDEFGHIJKLMNOPQRSTUVWXYZ234567
c32 - Crockford base32: lowercase, no padding. Alphabet: 0123456789abcdefghjkmnpqrstvwxyz (excludes i, l, o, u)
b64 - URL-safe base64 (RFC 4648 §5): Most compact, case-sensitive, no padding. Alphabet: A-Za-z0-9 plus - and _
hex - hexadecimal: lowercase, no padding. Alphabet: 0123456789abcdef. Slightly faster performance (~2-3%), longest output.
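The four encodings can be sketched with the Python standard library as follows. The function name `encode` is a hypothetical helper. One assumption is made loudly here: c32 is taken to use the same 5-bit grouping as RFC 4648 base32, with only the alphabet swapped for lowercase Crockford; this specification does not state the bit grouping explicitly.

```python
import base64

# Alphabets as listed in this section.
_B32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
_C32_ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz"
_B32_TO_C32 = str.maketrans(_B32_ALPHABET, _C32_ALPHABET)


def encode(payload: bytes, encoding: str) -> str:
    """Encode payload bytes to an unpadded string in the given encoding."""
    if encoding == "b32":
        return base64.b32encode(payload).decode("ascii").rstrip("=")
    if encoding == "c32":
        # Assumption: RFC 4648 bit grouping, Crockford alphabet substituted.
        b32 = base64.b32encode(payload).decode("ascii").rstrip("=")
        return b32.translate(_B32_TO_C32)
    if encoding == "b64":
        return base64.urlsafe_b64encode(payload).decode("ascii").rstrip("=")
    if encoding == "hex":
        return payload.hex()
    raise ValueError(f"unknown encoding: {encoding!r}")
```

Under these assumptions the two bytes `b"hi"` encode to `NBUQ` (b32), `d1mg` (c32), `aGk` (b64), and `6869` (hex), which illustrates the relative output lengths.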

1.3 Schemes

Schemes define the encryption algorithm and its properties, classified into tiers:

Scheme Tiers

a-tier - Authenticated
- MUST provide both confidentiality and integrity protection
- Examples: ob:aasv, ob:aags, ob:apsv, ob:apgs
- Implementations SHOULD use a-tier schemes for security-critical applications
u-tier - Unauthenticated
- Provides confidentiality only (no integrity protection)
- Example: ob:upbc
- Suitable when integrity is verified externally or not required
- Warning: Vulnerable to ciphertext tampering
z-tier - Obfuscation tier
- Not cryptographically secure - for non-security use only
- Example: ob:zrbcx - deterministic obfuscation with constant IV
- OPTIONAL in the default build; consult implementation documentation

Scheme Properties

The second letter of the scheme ID further describes the properties of the scheme:

.a.. - avalanche, deterministic
- deterministic => same plaintext always produces same obtext
- avalanche => entropy uniformly distributed; change in any byte of plaintext completely changes the entire obtext (hash-like property)
- Examples: ob:aasv, ob:aags
.p.. - probabilistic
- Different output each time
- Examples: ob:apsv, ob:apgs, ob:upbc
.r.. - referenceable prefix
- referenceable prefix => deterministic + high entropy in the prefix, allowing hash-like short references without full avalanche
- Example: ob:zrbcx

Prefix Restructuring

....x: Schemes with 5-letter identifiers have an x suffix, indicating that the underlying primitive's ciphertext has been modified in order to achieve the referenceable prefix property.

This does not apply to a- or u-tier schemes, which either have the avalanche property (high entropy throughout, including the prefix) or are probabilistic (and therefore not referenceable, since referenceability implies determinism).

Currently, the only scheme using prefix restructuring is ob:zrbcx: AES-CBC, being a block-chaining algorithm, has high entropy in the tail, but weak entropy in the head, warranting prefix restructuring, detailed in Section 2.1.4 below.

Scheme Cryptographic Algorithms

The remaining two letters in scheme IDs indicate the algorithm:

gs = AES-GCM-SIV
sv = AES-SIV
bc = AES-CBC

Summary Table

Scheme	Algorithm	Deterministic?	Authenticated?	Notes
`ob:aasv`	AES-SIV	Yes	Yes	General purpose, deterministic
`ob:aags`	AES-GCM-SIV	Yes	Yes	Deterministic alternative
`ob:apsv`	AES-SIV	No	Yes	Maximum privacy protection
`ob:apgs`	AES-GCM-SIV	No	Yes	Probabilistic alternative
`ob:upbc`	AES-CBC	No	No	Unauthenticated - use with caution

Key Concepts:

Deterministic: Same input (key + plaintext) always produces same output. Useful for idempotent operations, lookup keys, caching, or hash-like references.
Probabilistic: Incorporates a random nonce, producing different ciphertexts for identical plaintexts. Standard for most cryptographic use cases (non-cached, not used as hidden references).
Authenticated: Ciphertext is tamper-evident. Any modification (even a single flipped bit) causes decryption to fail.

Choosing a Scheme

ob:aasv: General-purpose secure encryption with deterministic output and compact size
ob:apsv: Maximum privacy with probabilistic output (larger size due to nonce)
ob:upbc: Only when integrity is handled externally

Note on encryption strength: All a-tier and u-tier schemes MUST use 256-bit AES encryption. The z-tier uses 128-bit AES for performance in non-security contexts.

2. Algorithm

Oboron combines encryption and encoding in a single operation, requiring specific terminology:

enc operation: Combines encryption and encoding stages
dec operation: Combines decoding and decryption stages
obtext: The output of the enc operation (encryption + encoding), distinct from cryptographic ciphertext

The cryptographic ciphertext (bytes, not string) is an internal implementation detail, NOT exposed in the public API.

The high-level process flow is:

enc operation:
    [plaintext] (string)
      -> encryption -> [ciphertext] (bytes)
      -> encoding -> [obtext] (string)

dec operation:
    [obtext] (string)
      -> decoding -> [ciphertext] (bytes)
      -> decryption -> [plaintext] (string)

The above diagram is conceptual; an actual implementation includes scheme-specific steps such as appending the scheme marker and (for z-tier schemes only) optional ciphertext prefix restructuring. With this intermediate step included, the diagram becomes:

enc operation:
    [plaintext]
      -> encryption -> [ciphertext]
      -> oboron pack -> [payload]
      -> encoding -> [obtext]

dec operation:
    [obtext]
      -> decoding -> [payload]
      -> oboron unpack -> [ciphertext]
      -> decryption -> [plaintext]

In a-tier and u-tier schemes, the payload differs from the ciphertext only by the 2-byte scheme marker appended to the ciphertext, which enables scheme autodetection during decoding.

2.1 Payload Construction

This section formally specifies the "oboron pack" and "oboron unpack" steps that transform a raw ciphertext into the encoded payload and back. Implementations MUST conform to this algorithm for cross-implementation interoperability.

2.1.1 Payload Structure

The payload is the byte sequence produced by the oboron pack step, which is then encoded to produce the obtext. The payload is defined as:

payload = ciphertext || xored_marker[0] || xored_marker[1]

Where:

ciphertext is the raw output of the scheme-specific encryption function
xored_marker is the 2-byte scheme marker after XOR mixing (see section 2.1.3)

2.1.2 Scheme Bytes

Each scheme is identified by a 2-byte marker. The marker is a structured 16-bit value:

Byte 1 (bits 7..0): [ext:1][version:4][tier:3]

ext (1 bit): Extension flag. 0 = no extension bytes follow.
version (4 bits): Format version. Current version is 0000.
tier (3 bits): Security tier:
- 001 = a-tier (authenticated)
- 010 = u-tier (unauthenticated)
- 110 = z-tier (obfuscation)

Byte 2 (bits 7..0): [properties:4][algorithm:4]

properties (4 bits): Scheme properties:
- 0000 = probabilistic
- 0001 = deterministic with avalanche
- 0010 = deterministic referenceable (prefix-restricted avalanche)
algorithm (4 bits): Encryption algorithm:
- 0001 = AES-CBC
- 0010 = AES-GCM-SIV
- 0011 = AES-SIV

Concrete marker values for each scheme:

Scheme	Tier	Properties	Algorithm	Byte 1	Byte 2	Marker
`aags`	`001`	`0001`	`0010`	`0x01`	`0x12`	`[0x01, 0x12]`
`apgs`	`001`	`0000`	`0010`	`0x01`	`0x02`	`[0x01, 0x02]`
`aasv`	`001`	`0001`	`0011`	`0x01`	`0x13`	`[0x01, 0x13]`
`apsv`	`001`	`0000`	`0011`	`0x01`	`0x03`	`[0x01, 0x03]`
`upbc`	`010`	`0000`	`0001`	`0x02`	`0x01`	`[0x02, 0x01]`
`zrbcx`	`110`	`0010`	`0001`	`0x06`	`0x21`	`[0x06, 0x21]`
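The bit layout above can be checked against the table with a short sketch. The function name `scheme_marker` and the three lookup dictionaries are hypothetical; the bit values are copied from this section. The sketch derives the tier from the first letter of the scheme ID, the properties from the second letter, and the algorithm from the third and fourth letters (a trailing `x` is ignored).

```python
# Bit values from section 2.1.2.
TIER_BITS = {"a": 0b001, "u": 0b010, "z": 0b110}
PROPERTY_BITS = {"p": 0b0000, "a": 0b0001, "r": 0b0010}
ALGORITHM_BITS = {"bc": 0b0001, "gs": 0b0010, "sv": 0b0011}


def scheme_marker(scheme: str, version: int = 0, ext: int = 0) -> bytes:
    """Build the raw 2-byte marker for a scheme ID such as 'aasv'."""
    tier, prop, algo = scheme[0], scheme[1], scheme[2:4]
    # Byte 1: [ext:1][version:4][tier:3]
    byte1 = (ext << 7) | (version << 3) | TIER_BITS[tier]
    # Byte 2: [properties:4][algorithm:4]
    byte2 = (PROPERTY_BITS[prop] << 4) | ALGORITHM_BITS[algo]
    return bytes([byte1, byte2])
```

For instance, `scheme_marker("aasv")` gives the bytes `0x01, 0x13` and `scheme_marker("zrbcx")` gives `0x06, 0x21`, matching the table.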

2.1.3 XOR Entropy Mixing

Before appending, each marker byte is XORed with the first byte of the ciphertext. This ensures the marker appears random in the encoded output, even for very short payloads.

Pack (enc direction):

marker    = scheme.marker()            // 2-byte raw marker
first_byte = ciphertext[0]
payload   = ciphertext || (marker[0] XOR first_byte) || (marker[1] XOR first_byte)

Unpack (dec direction):

first_byte = payload[0]
marker     = [ payload[n-2] XOR first_byte, payload[n-1] XOR first_byte ]
ciphertext = payload[0..n-2]

Where n is the length of the payload. Implementations MUST validate the recovered marker against the expected scheme marker and MUST return an error on mismatch.
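The pack and unpack steps above translate almost directly into Python. This is a minimal sketch; the function names `pack` and `unpack` and the minimum-length checks are assumptions added for illustration (the pseudocode implies a non-empty ciphertext but does not state the check).

```python
def pack(ciphertext: bytes, marker: bytes) -> bytes:
    """Append the 2-byte marker, XOR-mixed with the first ciphertext byte."""
    if not ciphertext or len(marker) != 2:
        raise ValueError("need a non-empty ciphertext and a 2-byte marker")
    first = ciphertext[0]
    return ciphertext + bytes([marker[0] ^ first, marker[1] ^ first])


def unpack(payload: bytes, expected_marker: bytes) -> bytes:
    """Recover the ciphertext and verify the marker; raise on mismatch."""
    if len(payload) < 3:
        raise ValueError("payload too short to contain ciphertext and marker")
    first = payload[0]
    marker = bytes([payload[-2] ^ first, payload[-1] ^ first])
    if marker != expected_marker:
        raise ValueError("scheme marker mismatch")
    return payload[:-2]
```

Because XOR is its own inverse and the first payload byte equals the first ciphertext byte, `unpack(pack(ct, m), m)` returns `ct` for any non-empty `ct`.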

2.1.4 Z-tier Prefix Restructuring

For z-tier schemes with prefix restructuring (e.g., zrbcx), the encryption function applies an additional XOR transformation before the scheme marker is appended:

if len(ciphertext) > AES_BLOCK_SIZE:
    ciphertext[0..16] = ciphertext[0..16] XOR ciphertext[last_16_bytes]

This concentrates entropy in the output prefix. This step is reversed during decryption before the standard CBC decryption.
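The XOR step above can be sketched as a single function. The name `restructure_prefix` is hypothetical, and the sketch assumes the CBC ciphertext length is a multiple of the 16-byte block size, so that for inputs longer than one block the final block never overlaps the first. Under that assumption the final block is left unchanged, which means applying the same function a second time restores the original bytes.

```python
AES_BLOCK_SIZE = 16


def restructure_prefix(ciphertext: bytes) -> bytes:
    """XOR the first block with the last block; single blocks pass through."""
    if len(ciphertext) % AES_BLOCK_SIZE != 0:
        # Assumption of this sketch: CBC output is block-aligned.
        raise ValueError("ciphertext length must be a multiple of 16")
    if len(ciphertext) <= AES_BLOCK_SIZE:
        return ciphertext
    head = ciphertext[:AES_BLOCK_SIZE]
    tail = ciphertext[-AES_BLOCK_SIZE:]
    mixed = bytes(h ^ t for h, t in zip(head, tail))
    return mixed + ciphertext[AES_BLOCK_SIZE:]
```

On the dec side, calling `restructure_prefix` on the unpacked ciphertext before CBC decryption undoes the transformation.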

2.2 Padding Design

Oboron's CBC schemes use a custom padding scheme optimized for UTF-8 strings:

Uses the 0x01 byte for padding (U+0001, a control character that does not occur in ordinary text)
No padding needed when plaintext ends at block boundary
5% performance improvement over PKCS#7
Smaller output size compared to PKCS#7

Rationale: Oboron is defined to operate exclusively on UTF-8 strings, not arbitrary binary data. This is a protocol-level requirement: all enc operations MUST accept a UTF-8 string input and all dec operations MUST return a UTF-8 string. The 0x01 padding byte encodes U+0001, which is well-formed UTF-8 but is a control character that textual input is not expected to contain, so trailing padding can be stripped unambiguously as long as the plaintext itself does not end in U+0001. Under that constraint, this padding is functionally equivalent to PKCS#7 and does not weaken security. Implementations MUST enforce the UTF-8 constraint at runtime.
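A minimal sketch of the padding rule follows. The function names `pad` and `unpad` are hypothetical, and the sketch assumes that padding fills the final block with repeated 0x01 bytes and that unpadding strips all trailing 0x01 bytes; it therefore also assumes the plaintext does not itself end in U+0001.

```python
BLOCK = 16
PAD = b"\x01"


def pad(plaintext: str) -> bytes:
    """Encode to UTF-8 and fill the last block with 0x01 bytes if needed."""
    data = plaintext.encode("utf-8")
    remainder = len(data) % BLOCK
    if remainder:
        data += PAD * (BLOCK - remainder)
    return data  # block-aligned input is returned without extra bytes


def unpad(data: bytes) -> str:
    """Strip trailing 0x01 bytes and decode; raises on invalid UTF-8."""
    return data.rstrip(PAD).decode("utf-8")
```

The size advantage over PKCS#7 shows up for block-aligned input: a 16-byte plaintext stays at 16 bytes here, whereas PKCS#7 would append a full extra block.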

2.3 Per-Scheme Ciphertext Layout

This section specifies the exact byte layout of the raw ciphertext produced by each scheme's encryption function (before the oboron pack step). Implementations MUST conform to these layouts for cross-implementation interoperability.

Deterministic schemes (fixed nonce, nonce not included in output)

aasv (AES-SIV): Uses the full 64-byte master key directly as the AES-SIV combined key (RFC 5297). Encrypts with empty associated data headers (headers = []). Ciphertext layout: [16-byte SIV tag || encrypted data]. No external nonce; the SIV tag serves as the synthetic IV.
aags (AES-GCM-SIV): Uses a fixed all-zero nonce of 12 bytes ([0x00; 12]). Ciphertext layout: [encrypted data || 16-byte authentication tag]. The zero nonce is NOT included in the output.

Probabilistic schemes (random nonce prepended to output)

apsv (AES-SIV): Generates a random 16-byte nonce before each encryption. The nonce is passed as an associated data header to AES-SIV (headers = [nonce]). Ciphertext layout: [16-byte random nonce || 16-byte SIV tag || encrypted data].
apgs (AES-GCM-SIV): Generates a random 12-byte nonce before each encryption. Ciphertext layout: [12-byte random nonce || encrypted data || 16-byte authentication tag].
upbc (AES-CBC): Generates a random 16-byte IV before each encryption. Uses the custom 0x01-byte padding (see §2.2) when the plaintext length is not a multiple of 16 bytes; no padding bytes are added when already aligned to a 16-byte boundary. Ciphertext layout: [16-byte random IV || padded encrypted data].

Summary table:

Scheme	Algorithm	Nonce/IV in output	Ciphertext layout (bytes)
`aasv`	AES-SIV	No	`16-byte SIV tag || encrypted data`
`aags`	AES-GCM-SIV	No (zero nonce fixed)	`encrypted data || 16-byte auth tag`
`apsv`	AES-SIV	16-byte nonce (prepended)	`16-byte nonce || 16-byte SIV tag || encrypted data`
`apgs`	AES-GCM-SIV	12-byte nonce (prepended)	`12-byte nonce || encrypted data || 16-byte auth tag`
`upbc`	AES-CBC	16-byte IV (prepended)	`16-byte IV || padded encrypted data`
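To make the layouts concrete, the following sketch slices a raw ciphertext (before the pack step) into its named fields. The function name `split_ciphertext` and the field names in the returned dictionary are hypothetical; the offsets come from the table above. It performs no cryptography and is meant only as a reading aid for the byte positions.

```python
def split_ciphertext(scheme: str, ct: bytes) -> dict[str, bytes]:
    """Split a raw ciphertext into the fields listed in section 2.3."""
    # Total size of the fixed-length fields for each scheme.
    minimum = {"aasv": 16, "aags": 16, "apsv": 32, "apgs": 28, "upbc": 16}
    if scheme not in minimum:
        raise ValueError(f"no layout known for scheme {scheme!r}")
    if len(ct) < minimum[scheme]:
        raise ValueError("ciphertext shorter than its fixed-size fields")
    if scheme == "aasv":
        return {"tag": ct[:16], "data": ct[16:]}
    if scheme == "aags":
        return {"data": ct[:-16], "tag": ct[-16:]}
    if scheme == "apsv":
        return {"nonce": ct[:16], "tag": ct[16:32], "data": ct[32:]}
    if scheme == "apgs":
        return {"nonce": ct[:12], "data": ct[12:-16], "tag": ct[-16:]}
    # upbc: random IV followed by the padded CBC output
    return {"iv": ct[:16], "data": ct[16:]}
```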

3. Key Management

3.1 Single Master Key Model

Oboron uses a single 512-bit master key partitioned into algorithm-specific subkeys using fixed byte offsets (no KDF):

Scheme	Key bytes used	Offset	Length
`ob:aasv`, `ob:apsv`	Full key (bytes 0–63)	0	64 bytes
`ob:aags`, `ob:apgs`	Bytes 32–63	32	32 bytes
`ob:upbc`	Bytes 8–39	8	32 bytes

AES-SIV key structure note: For ob:aasv and ob:apsv, the full 64-byte master key is passed directly to the AES-SIV primitive as the combined key per RFC 5297. AES-SIV internally splits this into two 256-bit subkeys: the first 32 bytes (bytes 0–31) serve as the S2V (CMAC) authentication key, and the second 32 bytes (bytes 32–63) serve as the CTR-mode encryption key. Implementations MUST pass all 64 bytes directly to the AES-SIV primitive and MUST NOT split them manually.

Design Rationale: This approach prioritizes low latency for short-string encryption. No hash-based KDF (e.g., HKDF) is used, as this would dominate runtime for intended workloads.

The master key MUST NOT leave the application. Algorithm-specific keys MUST be extracted on-the-fly and MUST NOT be cached or stored.
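The offset table in this section amounts to a plain slice of the master key. A minimal sketch, with the hypothetical names `SUBKEY_RANGES` and `subkey`, computes the slice on each call rather than storing it, in line with the rule that algorithm-specific keys are extracted on the fly:

```python
# (start, end) byte offsets into the 64-byte master key, from section 3.1.
SUBKEY_RANGES = {
    "aasv": (0, 64),
    "apsv": (0, 64),
    "aags": (32, 64),
    "apgs": (32, 64),
    "upbc": (8, 40),
}


def subkey(master_key: bytes, scheme: str) -> bytes:
    """Return the key bytes a scheme uses, sliced from the master key."""
    if len(master_key) != 64:
        raise ValueError("master key must be exactly 64 bytes (512 bits)")
    start, end = SUBKEY_RANGES[scheme]
    return master_key[start:end]
```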

3.2 Key Format

The default key input format is base64. This is consistent with Oboron's strings-first API design. As any production use will typically read the key from an environment variable, this allows the string format to be directly fed into the constructor.

The base64 format was chosen for its compactness, as an 86-character base64 key is easier to handle manually (in secrets or environment variables management UI) than a 128-character hex key.

While any 512-bit key is accepted by Oboron, the keys generated by Oboron's key generation utilities MUST NOT include any dashes or underscores, so that keys are double-click selectable and underscores cannot be visually misread.

3.3 Valid Base64 Keys

Important technical detail: Not every 86-character base64 string is a valid 512-bit key. Since 512 bits correspond to 85.3 base64 characters (512 / 6), the 86th character carries only 2 key bits and its remaining 4 bits must be zero, which restricts the final character to a small subset of the alphabet. When generating keys, implementations MUST use one of the following methods:

use Oboron's key generation utility
generate random 64 bytes, then encode as base64
generate random 128 hex characters, then convert hexadecimal to base64
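The constraint on the final character can be checked by decoding a key and re-encoding it: only a canonical encoding of exactly 64 bytes survives the round trip. The sketch below uses the Python standard library and assumes the URL-safe alphabet without padding; the function name `decode_key` is hypothetical.

```python
import base64
import binascii

KEY_BYTES = 64
KEY_CHARS = 86


def decode_key(key_b64: str) -> bytes:
    """Decode an 86-character unpadded base64 key, rejecting invalid ones."""
    if len(key_b64) != KEY_CHARS:
        raise ValueError("key must be 86 base64 characters")
    try:
        # Restore the two padding characters that the key format omits.
        raw = base64.urlsafe_b64decode(key_b64 + "==")
    except binascii.Error as exc:
        raise ValueError("key is not valid base64") from exc
    canonical = base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
    if len(raw) != KEY_BYTES or canonical != key_b64:
        raise ValueError("key is not a canonical encoding of 64 bytes")
    return raw
```

Since the last character holds 2 key bits followed by 4 zero bits, a canonical key can only end in `A`, `Q`, `g`, or `w`; a key whose final character is changed to any other value fails the round-trip check.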

3.4 Alternative Key Interfaces

Implementations SHOULD support the following key input formats in addition to the default base64 format:

Hexadecimal (128 characters): alternative format for hex-oriented workflows
Raw bytes (64 bytes): for programmatic construction of keys

4. Protocol API

All Oboron implementations MUST provide the following abstract interface.

4.1 Core Operations

enc(plaintext: string) → obtext: string — Encrypts and encodes the plaintext using the configured format; returns an obtext string
dec(obtext: string) → plaintext: string — Decodes and decrypts the obtext; returns the original plaintext
autodec(obtext: string) → plaintext: string — Decodes obtext in any supported format encrypted with the same key, without needing to know the format in advance

Empty plaintext: Implementations MUST reject empty string input. The enc operation MUST return an error when given an empty plaintext string.
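The scheme-detection part of autodec follows from the marker design in section 2.1: after decoding, the marker is recovered by XORing the last two payload bytes with the first byte and then looked up. The sketch below shows only that step; the names `MARKERS` and `detect_scheme` are hypothetical, and how the string encoding itself is detected is not covered here.

```python
# Raw marker values from the table in section 2.1.2.
MARKERS = {
    bytes([0x01, 0x12]): "aags",
    bytes([0x01, 0x02]): "apgs",
    bytes([0x01, 0x13]): "aasv",
    bytes([0x01, 0x03]): "apsv",
    bytes([0x02, 0x01]): "upbc",
    bytes([0x06, 0x21]): "zrbcx",
}


def detect_scheme(payload: bytes) -> str:
    """Return the scheme ID encoded in a decoded payload's trailing marker."""
    if len(payload) < 3:
        raise ValueError("payload too short to contain a scheme marker")
    first = payload[0]
    marker = bytes([payload[-2] ^ first, payload[-1] ^ first])
    try:
        return MARKERS[marker]
    except KeyError:
        raise ValueError("unknown scheme marker") from None
```

An autodec implementation could use the returned scheme ID to select the matching decryption routine before proceeding as in dec.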

4.2 Codec Construction

A codec MUST be constructible from a key and a format specifier. Implementations MUST support construction from a base64 key string (primary interface), as well as from a hex key string or raw key bytes.

Rust:

use std::env;

use oboron::AasvC32;

let ob = AasvC32::new(
    &env::var("OBORON_KEY")?
)?;

Python:

from oboron import AasvC32
import os

# os.environ[...] raises KeyError if the variable is unset instead of passing None
ob = AasvC32(os.environ["OBORON_KEY"])

4.3 Key Generation

All implementations MUST provide a key generation utility that produces a valid 512-bit key encoded as a base64 string (86 characters, URL-safe alphabet, no padding characters).
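One way to satisfy this requirement, together with the rule in section 3.2 that generated keys contain no dashes or underscores, is to draw 64 random bytes and retry until the encoded form is free of both characters. This is a sketch under that assumption, not the reference utility; the function name `generate_key` is hypothetical.

```python
import base64
import secrets


def generate_key() -> str:
    """Generate an 86-character base64 key with no '-' or '_' characters."""
    while True:
        raw = secrets.token_bytes(64)  # 512 bits from the OS CSPRNG
        key = base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
        # Retry until the key is free of the two URL-safe symbols.
        if "-" not in key and "_" not in key:
            return key
```

Roughly one candidate in fifteen passes the filter, so the loop terminates quickly. Encoding from 64 random bytes also guarantees that the final character satisfies the constraint described in section 3.3.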