Packed Encoding Rules

The Packed Encoding Rules (PER) are the most compact of the ASN.1 encoding rules. PER aims to produce smaller encodings than BER. Its key feature is that it examines the constraints on a type and leaves out anything the decoder can infer from them.

Example

Age ::= INTEGER (0..7)
firstGrade Age ::= 6
        -- encoded as C0 (bits 110, padded to a full octet)

In the following example, integer A has four possible states, and you only need two bits to represent all possible states. Since we need two bits, there's no need to encode a length; it will always be two bits:

A ::= INTEGER (1234567..1234570)
a A ::= 1234568 -- encoded in 2 bits

On the other hand, B in the example below is unbounded and therefore has an infinite number of states. Since we cannot know its length in advance, we must have a length field. The length field takes 8 bits and the value field takes 24 bits.

B ::= INTEGER
b B ::= 1234568 -- encoded in 32 bits
                -- 8 bit length + 24 bit value
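
The two cases above can be sketched in Python. This is a minimal illustration rather than a full PER encoder: the function names are hypothetical, the output is a string of '0' and '1' characters, and the unconstrained case assumes a value short enough for a one-octet length.

```python
def encode_constrained_int(value, lower, upper):
    """Bits for INTEGER (lower..upper): the offset from lower in a fixed-width field."""
    if not lower <= value <= upper:
        raise ValueError("value outside the constraint")
    width = (upper - lower).bit_length()  # bits needed for the largest offset
    if width == 0:
        return ""  # a single permitted value needs no bits
    return format(value - lower, "b").zfill(width)


def encode_unconstrained_int(value):
    """Bits for an unconstrained INTEGER: 8-bit length, then two's complement octets."""
    magnitude = value if value >= 0 else ~value
    n_octets = magnitude.bit_length() // 8 + 1  # leaves room for the sign bit
    if n_octets >= 128:
        raise NotImplementedError("only the one-octet length form is sketched")
    body = value.to_bytes(n_octets, "big", signed=True)
    return format(n_octets, "08b") + "".join(format(b, "08b") for b in body)


print(encode_constrained_int(1234568, 1234567, 1234570))  # 2 bits
print(len(encode_unconstrained_int(1234568)))             # 32 bits
```

The constrained encoder never looks at the magnitude of the value, only at its distance from the lower bound, which is why a seven-digit number fits in 2 bits.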

Packed Encoding Rules Overview

  • Tags are never encoded. PER differs from BER in that it doesn't send the Tag of the TLV since the order in which the components of a message occur is known.
  • The length of a value is encoded only if the value has a variable length.
  • A bit mask prefixes SET and SEQUENCE types that contain optional components.
  • The components of a SET are sorted in tag order, then encoded as SEQUENCE.
  • The components of a CHOICE are sorted in tag order, then an index is assigned to each. The index is prefixed to the encoded value.
  • PER also uses additional information from the description of an ASN.1 message to eliminate redundant information from the Value portion of the TLV, thus making PER messages compact and suitable for environments in which bandwidth conservation is important.

There are four variants of PER:

  • Aligned (PER)
  • Unaligned (UPER)
  • Canonical Aligned (CPER)
  • Canonical Unaligned (CUPER)

Note that a "canonical" encoding form ensures a consistent encoding for each message, so it is useful for comparing binary streams, for digital signatures, etc.

ASN.1 Transformations Restricted

Some transformations permitted with BER are not allowed with PER:

  • CHOICE cannot be replaced with one of its components.
  • A type cannot be replaced with a CHOICE that includes it.
  • An open type cannot be replaced by another type.
  • PER-visible subtype constraints cannot be semantically altered.

PER Visible Constraints

Constraints that affect PER encodings are called PER-visible constraints. The following subtype constraints are PER-visible:

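  • Size constraints applied to OCTET STRINGs, BIT STRINGs, and SEQUENCE OF types
  • Value range constraints applied to INTEGERs
  • Single value constraints applied to INTEGERs
  • Permitted alphabet constraints applied to character string types such as PrintableString and IA5String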

Other subtype constraints (e.g., inner type constraints, or single value constraints on an OCTET STRING type) are not PER-visible. Although these constraints do not affect a PER encoding, they do affect the set of abstract values (i.e., decoded values) that are considered valid.

Note: PER seeks to minimize the size of constrained types only for typical uses of the subtype constraint notation. Use of constraints that are normally PER-visible can be rendered not PER-visible by combining them with other constraints that are not PER-visible, or by the use of set operators. For example, D ::= OCTET STRING ('FE'H | SIZE (100..200)) does not have a PER-visible subtype constraint because the single value constraint on OCTET STRING is not PER-visible.

  • Size constraint:
    A ::= OCTET STRING (SIZE(100..120))
  • Value range constraint:
    B ::= INTEGER (25..30)
  • Single value constraint of INTEGERs:
    C ::= INTEGER (40 | 55)
  • Permitted alphabet constraint:
    A ::= PrintableString (FROM ("0".."9"))
  • Constraints that do not restrict all possible values are NOT PER visible:
    E ::= IA5String (SIZE(10) | FROM ("0".."9"))
    

Structure of a PER Encoding

  • Preamble - always prefixes a CHOICE, and sometimes a SET or SEQUENCE. Preambles are not always present, but when they are, they come first.
  • Length determinant - lengths are not always necessary. Sometimes they are implied by the ASN.1, for example, Z ::= INTEGER (0..3), which will be encoded using two bits. Sometimes the size is explicit, for example, S ::= OCTET STRING (SIZE(5)). In both cases, we know the length in advance, so there's no need to put a length into the encoding. In these cases, the length field is omitted.
  • Contents - are not always necessary (e.g., NULL).

Encoding a Length Determinant

Unlike BER, where lengths are always in octets, PER lengths can be in different units.

Length may stand for the number of:

  • iterations (SEQUENCE OF)
  • bits (BIT STRING)
  • characters (PER-visible character strings)
  • octets (other cases including PER-invisible strings)

No length is encoded if the size is known:

A ::= PrintableString (SIZE(5))
greeting A ::= "Hello"	
	-- encoded as "Hello" (Aligned PER)

Length is present if the size varies:

B ::= PrintableString (SIZE(1..5))
salutations B ::= "hi"
	-- encoded as 206869 (Aligned PER)
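
The following Python sketch reproduces both encodings. It is an illustration under stated assumptions, not a general encoder: the function name is hypothetical, the string is the outermost value, the size range is small enough for the length to be a plain bit field, and the string is long enough that aligned PER pads to an octet boundary before the characters.

```python
def encode_printable_aligned(text, size_min, size_max):
    """Hex encoding of a PrintableString (SIZE(size_min..size_max)) in aligned PER."""
    if not size_min <= len(text) <= size_max:
        raise ValueError("string length outside the size constraint")
    bits = ""
    if size_min != size_max:
        # Length is sent as an offset from the lower bound, in the minimum width.
        width = (size_max - size_min).bit_length()
        bits += format(len(text) - size_min, "b").zfill(width)
        bits += "0" * (-len(bits) % 8)  # pad bits up to the next octet boundary
    bits += "".join(format(ord(ch), "08b") for ch in text)  # 8 bits per character
    if not bits:
        return ""
    return int(bits, 2).to_bytes(len(bits) // 8, "big").hex().upper()


print(encode_printable_aligned("Hello", 5, 5))  # fixed size: no length field
print(encode_printable_aligned("hi", 1, 5))     # variable size: length field first
```

For "hi" the length field holds 2 - 1 = 1 in 3 bits (001). The five pad bits that follow make up the leading octet 20.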

With UNALIGNED PER, the length is encoded in the minimum number of bits if the range is known and the upper bound is less than 64K.

If the range is unbounded or the upper bound is greater than or equal to 64K, the length is encoded as:

0  - 127        0LLLLLLL
128 - 16K-1     10LLLLLL LLLLLLLL
>=16K           11000nnn fragmented 
                nnn (1-4) = # of 16K
                multiples in fragment

Note that the first bit specifies whether the (0) short form or (1) long form is used. When it's the long form, the second bit specifies whether it's (0) unfragmented or (1) fragmented.

The fragmented form is always made up of multiple fragments ending either with the short or long form. That is, if we call the short form S, the long form L, and the fragmentation indicators (C1, C2, C3, or C4) as C, lengths take the forms S, L, CS, or CL (C can repeat as needed), for example:

  • a length of 5 uses S 05
  • a length of 256 uses L 8100
  • a length of 64K uses CS C4...00
  • a length of 128K uses CCS C4...C4...00
  • a length of 129K uses CCL C4...C4...8400
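
The sequence of length headers for a given count can be sketched as follows. The function name is hypothetical, and the sketch returns only the headers (as hex strings) in the order they would appear, with the content fragments between them left out.

```python
K16 = 16 * 1024  # one 16K unit


def length_headers(n):
    """Return the length headers (hex strings) used to carry a length of n."""
    headers = []
    # Fragmented form: each header announces 1 to 4 multiples of 16K.
    while n >= K16:
        multiples = min(n // K16, 4)
        headers.append(format(0xC0 | multiples, "02X"))
        n -= multiples * K16
    # The final header is always a short or long form, possibly a zero length.
    if n < 128:
        headers.append(format(n, "02X"))           # short form: 0LLLLLLL
    else:
        headers.append(format(0x8000 | n, "04X"))  # long form: 10LLLLLL LLLLLLLL
    return headers


for length in (5, 256, 64 * 1024, 128 * 1024, 129 * 1024):
    print(length, length_headers(length))
```

A length that is an exact multiple of 16K still ends with a short-form header of 00, which is what tells the decoder that no further fragments follow.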

Encoding Types

BOOLEAN

BOOLEAN types are encoded in a single bit:

0 FALSE

1 TRUE

INTEGER

If constrained, INTEGERs are encoded in a field of minimum width. If they are not constrained, a length determinant is used:

Age ::= INTEGER (3..10)
a Age ::= 4
height INTEGER ::= 4
--	a		001
--	height	00000001 00000100

In the first example, a is based on a type, Age, which has a range constraint of (3..10). Only eight states are possible, and they can be represented in 3 bits. A length of 3 bits is implied.

In the second example, height is based on a type that has no constraint. That is, it can take any of an infinite set of states. There is no implied length so one must be explicitly encoded. The first 8 bits are the length and the next 8 bits form the value. Consider the following to decode height:

  1. There is no implied length, so we have to look for one.
  2. The first bit is a 0, so we know we are using the short form of the length.
  3. Since we are using the short form, the length is found in the next 7 bits (we find 0000001, which is a 1).
  4. Since there's no constraint, there is no offset to worry about. The 1 means 1, and since this is an INTEGER, it means one octet.
  5. Peeling off the next octet, we find 00000100, which is a 4. In this example the octet is on a byte boundary, but it doesn't have to be when we're using unaligned PER.
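
The decoding steps above can be sketched in Python. The function name is hypothetical, the input is a string of '0' and '1' characters, and only the short form of the length is handled.

```python
def decode_unconstrained_int(bits):
    """Decode an unconstrained INTEGER from a bit string (short length form only)."""
    # Step 2: the first bit selects the short (0) or long (1) length form.
    if bits[0] != "0":
        raise NotImplementedError("only the short length form is sketched")
    # Step 3: the next 7 bits hold the length.
    n_octets = int(bits[1:8], 2)
    if n_octets == 0:
        raise ValueError("an INTEGER needs at least one content octet")
    # Steps 4 and 5: the length counts octets; read them as two's complement.
    value_bits = bits[8:8 + 8 * n_octets]
    value = int(value_bits, 2)
    if value_bits[0] == "1":
        value -= 1 << len(value_bits)  # sign bit set: the value is negative
    return value


print(decode_unconstrained_int("0000000100000100"))
```

Because the decoder indexes bits rather than bytes, the same logic works when the field does not start on a byte boundary, as can happen in unaligned PER.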

ENUMERATED

An ENUMERATED type specifies a list of states, and an encoding must identify which state is meant. BER encodes the state value numbers given within the {} brackets. PER uses those numbers only to put the states in order. In the example below, the state value numbers are not contiguous, but there are only three possible states, and 2 bits are enough to encode three states.

We first sort the states by value and then assign each state an index. Although red comes first in the list, it has the highest number and so sorts last. Two bits have room for four states, and we assign them as:

  • 00 pink (2)
  • 01 blue (7)
  • 10 red (100)
  • 11 unused

In summary:

  • The enumeration is sorted into ascending order to assign an index.
  • Index is treated as a constrained INTEGER.

Color ::= ENUMERATED {red(100), pink(2), blue(7)}
hot Color ::= red

--	hot   10    index   0 pink
--                          1 blue
--                          2 red
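
A minimal Python sketch of the index assignment, assuming an enumeration without an extension marker. The function name and the dictionary representation of the type are hypothetical.

```python
def encode_enumerated(enumeration, name):
    """Bits for an ENUMERATED value: its index after sorting by value number."""
    ordered = sorted(enumeration, key=enumeration.get)  # ascending value numbers
    width = (len(ordered) - 1).bit_length()             # bits for the largest index
    if width == 0:
        return ""  # a single state needs no bits
    return format(ordered.index(name), "b").zfill(width)


color = {"red": 100, "pink": 2, "blue": 7}
print(encode_enumerated(color, "red"))
```

The value numbers 100, 2, and 7 never appear in the output. Only their relative order matters.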

BIT STRING

  • A BIT STRING with a named bit list is not encoded with trailing zeroes, unless pad bits are needed to satisfy a constraint.
  • Presence of length determinant depends upon the size constraint.

Color ::= BIT STRING {red(100), pink(2), blue(7)}
cold Color ::= {blue}
--	cold   		00001000 00000001

A BIT STRING without named bits is relatively straightforward. As in the general case, you have a length (if needed) followed by a value. The units of length are bits, not octets. A length is needed whenever the length is not implied by the ASN.1 syntax.

Encoding a BIT STRING with named bits is somewhat more complicated. Unlike ENUMERATED, where the numbers represent states, in BIT STRING the numbers represent bit positions. Bear in mind that whereas ENUMERATED lists the entire set of possible states, BIT STRING does not. Take the Color example above. Although only three bits have names, any bit can be set. You could easily have '11'B even though neither bit is named. You could also have a very long string even though the highest named bit is number 100 (the 101st bit). We therefore cannot use a transformation table to index the named bits. Playing decoder with the cold example above:

  • We know a length is needed since the length is not implied.
  • We check the first bit to find it is a 0, which means it's a short length.
  • The next seven bits (0001000) show the length is 8 bits.
  • The next eight bits (00000001) show that blue(7) is on and everything else is off.
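
The encoder and decoder sides of this example can be sketched in Python. The function names are hypothetical, and the sketch assumes an unconstrained BIT STRING with fewer than 128 bits, so the length always fits the short form.

```python
def encode_named_bits(named_bits, set_names):
    """Bits for a named-bit BIT STRING value: bit count, then the bits themselves."""
    positions = {named_bits[name] for name in set_names}
    n_bits = max(positions) + 1 if positions else 0  # trailing zero bits are dropped
    if n_bits >= 128:
        raise NotImplementedError("only the short length form is sketched")
    value = "".join("1" if i in positions else "0" for i in range(n_bits))
    return format(n_bits, "08b") + value


def decode_named_bits(named_bits, bits):
    """Return the names of the bits that are set in an encoded value."""
    if bits[0] != "0":
        raise NotImplementedError("only the short length form is sketched")
    n_bits = int(bits[1:8], 2)  # the length counts bits, not octets
    value = bits[8:8 + n_bits]
    return [name for name, pos in named_bits.items()
            if pos < n_bits and value[pos] == "1"]


color = {"red": 100, "pink": 2, "blue": 7}
encoded = encode_named_bits(color, ["blue"])
print(encoded, decode_named_bits(color, encoded))
```

Setting red instead would need 101 value bits, which still fits the short length form but shows how the encoding grows with the highest bit that is set rather than with the number of named bits.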

OCTET STRING

OCTET STRING types are encoded with an optional length-prefix according to the rules for determining how to encode the length:

NotBounded  ::= OCTET STRING
FixedLength ::= OCTET STRING (SIZE(3))

shortO NotBounded  ::= '112233'H
fixedO FixedLength ::= '112233'H
-- shortO   03112233
-- fixedO   112233

We have two examples, one unbounded (no length implied) and one of fixed length. The unbounded one, since no length is implied, needs a length determinant. The fixed length one, since the length is always 3, needs no length determinant.

In shortO, 03 is the length determinant.
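
A minimal Python sketch of the two cases. The function name is hypothetical, and the unbounded case only handles values shorter than 128 octets, where the length fits in a single octet.

```python
def encode_octet_string(data, fixed_size=None):
    """Hex encoding of an OCTET STRING, with a length prefix only when needed."""
    if fixed_size is not None:
        if len(data) != fixed_size:
            raise ValueError("value does not match SIZE constraint")
        return data.hex().upper()  # length is implied: contents only
    if len(data) >= 128:
        raise NotImplementedError("only the short length form is sketched")
    return (bytes([len(data)]) + data).hex().upper()  # length octet, then contents


value = bytes.fromhex("112233")
print(encode_octet_string(value))                # NotBounded
print(encode_octet_string(value, fixed_size=3))  # FixedLength
```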

NULL

NULL is never encoded. Since the value can only have one state, there is no point in encoding it. It therefore has neither a length nor a contents field.

Character Strings: NumericString, PrintableString, VisibleString, IA5String, BMPString, UniversalString

In the UNALIGNED variant, character strings are encoded in the fewest number of bits necessary:

A ::= IA5String (FROM("AMEX")^SIZE(3))
B ::= IA5String
a A ::= "AXE"  -- 00 11 01
b B ::= "AXE"   
  --  00000011 1000001 1011000 1000101

This example illustrates how permitted alphabet constraints work in PER. Note how A is constrained by alphabet and also has a fixed length; B, on the other hand is unconstrained. Since A has a fixed length, the length of a value is implied and is not encoded. Since B can have any length, its length must be encoded.

Since the only characters possible are A, E, M, and X, we can create a 2-bit transformation table, namely, 00=A, 01=E, 10=M, 11=X and so "AXE" is encoded as 001101.
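
The transformation table can be sketched in Python for the unaligned variant. The function names are hypothetical. The sketch also applies the PER rule, not spelled out above, that characters are only renumbered when their own values do not already fit in the reduced width, and the unconstrained encoder assumes a string shorter than 128 characters.

```python
def encode_alphabet_unaligned(text, alphabet):
    """Bits for a fixed-size string restricted by FROM(alphabet), unaligned PER."""
    table = sorted(set(alphabet))          # characters in ascending order
    width = (len(table) - 1).bit_length()  # bits per character
    if width == 0:
        return ""  # one permitted character carries no information
    if ord(table[-1]) < (1 << width):
        codes = {ch: ord(ch) for ch in table}  # values already fit: keep them
    else:
        codes = {ch: i for i, ch in enumerate(table)}  # renumber from 0
    return "".join(format(codes[ch], "b").zfill(width) for ch in text)


def encode_ia5_unaligned(text):
    """Bits for an unconstrained IA5String: 8-bit length, then 7 bits per character."""
    if len(text) >= 128:
        raise NotImplementedError("only the short length form is sketched")
    return format(len(text), "08b") + "".join(format(ord(ch), "07b") for ch in text)


print(encode_alphabet_unaligned("AXE", "AMEX"))  # type A
print(encode_ia5_unaligned("AXE"))               # type B
```

The constrained value takes 6 bits, while the unconstrained one takes 29 bits for the same three characters.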

In the ALIGNED variant, each character is encoded in the smallest number of bits that is both sufficient and a power of 2.

A ::= IA5String (FROM("AMEX")^SIZE(3))
B ::= IA5String
  a A ::= "AXE"  -- 00 11 01
  b B ::= "AXE"   
  --  00000011 01000001 01011000 01000101

Aligned PER character strings have characters whose width is always a power of two, 2**n bits (1, 2, 4, 8, 16, etc.). Even though IA5String uses a 7-bit character table, we must use 8 bits in aligned PER.
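
The width rule for both variants can be sketched as a small Python helper. The function name is hypothetical, and the argument is the number of characters the type permits.

```python
def char_width(alphabet_size, aligned):
    """Bits per character for an alphabet of the given size."""
    needed = max(alphabet_size - 1, 0).bit_length()  # minimum bits (unaligned PER)
    if not aligned or needed == 0:
        return needed
    width = 1
    while width < needed:  # round up to the next power of two (aligned PER)
        width *= 2
    return width


print(char_width(4, aligned=True))     # FROM("AMEX")
print(char_width(128, aligned=False))  # full IA5String, unaligned
print(char_width(128, aligned=True))   # full IA5String, aligned
```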

Other Character String Types

The value is encoded in the same way as an unconstrained OCTET STRING: a length determinant in octets, followed by the contents octets that BER would produce for the string.

A ::= TeletexString (FROM("AMEX")^SIZE(3))
B ::= TeletexString
a A ::= "AXE"  
  --  00000011 01000001 01011000 01000101
b B ::= "AXE"   
  --  00000011 01000001 01011000 01000101

Some older string types, such as TeletexString and VideotexString, are still supported in order to maintain backward compatibility, but they are not PER-visible: as the example shows, constraints on them do not change the encoding. The UTF8String type is also not PER-visible.

SEQUENCE

A preamble starts the SEQUENCE encoding if there are OPTIONAL or DEFAULT components, or if the ASN.1 type contains an extension marker.

A ::= SEQUENCE { a INTEGER,
                 b BOOLEAN,
                 c NULL OPTIONAL}
a A ::= {a 5, b TRUE}

--     0 00000001 00000101 1

We use a bitmap to specify whether optional elements are present. The bitmap is found in the preamble. If there are two optional elements, there are two bits in the bitmap; if there are 20, the bitmap is 20 bits.

In this example, we have only one optional element and so the bitmap is only one bit long. The encoding shows the preamble as 0, meaning the optional element is not present. Following the bitmap, we have the length determinant, and following the length, we have the INTEGER and BOOLEAN elements.
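
The encoding of this particular SEQUENCE can be sketched in Python. The function is hypothetical and hard-wired to type A above. It assumes unaligned PER and an INTEGER small enough for a one-octet length.

```python
def encode_sequence_a(a, b, c_present=False):
    """Bits for A ::= SEQUENCE { a INTEGER, b BOOLEAN, c NULL OPTIONAL }."""
    preamble = "1" if c_present else "0"  # one bitmap bit per optional component
    # a: unconstrained INTEGER, so a length octet precedes the two's complement value.
    magnitude = a if a >= 0 else ~a
    n_octets = magnitude.bit_length() // 8 + 1
    if n_octets >= 128:
        raise NotImplementedError("only the short length form is sketched")
    a_bits = format(n_octets, "08b") + "".join(
        format(octet, "08b") for octet in a.to_bytes(n_octets, "big", signed=True))
    b_bits = "1" if b else "0"  # BOOLEAN is a single bit
    c_bits = ""                 # NULL contributes no bits, even when present
    return preamble + a_bits + b_bits + c_bits


print(encode_sequence_a(5, True))
```

When c is present, only the preamble bit changes, because the NULL value itself has no contents to encode.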

CHOICE

A preamble containing an index always starts the CHOICE encoding to identify which of the components is encoded. The components of a CHOICE are sorted by tag before the index values are assigned.

A ::= CHOICE { a INTEGER(4..9),
                 b BOOLEAN,
                 c NULL}
chosen A ::= a:5       --  01 001

Since no tags are found in PER, we need a way to specify within a CHOICE which possibility is taken. We do this by means of a choice index, whose size depends upon the number of possibilities within the choice. In this example there are three possibilities. One can accommodate three possibilities with 2 bits, so the choice index would be 2 bits long.

	00	b	[UNIVERSAL 1]
	01	a	[UNIVERSAL 2]
	10	c	[UNIVERSAL 5]
	11	unused

Note in the encoding the choice index followed by the value chosen.
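
A minimal Python sketch of the index assignment. The function name and the dictionary of tag numbers are hypothetical. The sketch assumes all alternatives carry UNIVERSAL class tags, so sorting by tag number is enough, and it takes the already encoded value bits as input.

```python
def encode_choice(alternatives, chosen, value_bits):
    """Bits for a CHOICE value: index of the chosen alternative, then its value."""
    ordered = sorted(alternatives, key=alternatives.get)  # ascending tag order
    width = (len(ordered) - 1).bit_length()               # bits for the largest index
    index = format(ordered.index(chosen), "b").zfill(width) if width else ""
    return index + value_bits


# UNIVERSAL tag numbers: INTEGER = 2, BOOLEAN = 1, NULL = 5
tags = {"a": 2, "b": 1, "c": 5}
print(encode_choice(tags, "a", "001"))  # a:5 in INTEGER (4..9) is offset 1 in 3 bits
```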

SET OF and SEQUENCE OF

SEQUENCE OF is encoded with a count preamble that is present unless size is fixed. SET OF is encoded like SEQUENCE OF (in non-canonical PER). The count preamble is basically a length, except that the units are not octets, bits, or characters, but iterations of the elements of a SEQUENCE OF or SET OF.

A ::= SET OF CHOICE { a INTEGER(4..9),
                        b BOOLEAN,
                        c NULL}
chosen A ::= { a:5, b:TRUE, c:NULL}       
  --  00000011   01 001   00 1   10
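
The example can be reproduced with a short Python sketch. The names are hypothetical and the element encoder is hard-wired to the CHOICE above. It assumes non-canonical unaligned PER and fewer than 128 elements, so the count fits in one octet.

```python
# UNIVERSAL tag numbers of the alternatives: INTEGER = 2, BOOLEAN = 1, NULL = 5
TAGS = {"a": 2, "b": 1, "c": 5}


def encode_alternative(name, value):
    """Bits for one CHOICE element: 2-bit index in tag order, then the value."""
    ordered = sorted(TAGS, key=TAGS.get)
    index = format(ordered.index(name), "02b")
    if name == "a":
        return index + format(value - 4, "03b")  # INTEGER (4..9): offset in 3 bits
    if name == "b":
        return index + ("1" if value else "0")   # BOOLEAN: one bit
    return index                                 # NULL: no contents


def encode_set_of(items):
    """Bits for the SET OF: element count, then each element in the order given."""
    if len(items) >= 128:
        raise NotImplementedError("only the short length form is sketched")
    return format(len(items), "08b") + "".join(
        encode_alternative(name, value) for name, value in items)


print(encode_set_of([("a", 5), ("b", True), ("c", None)]))
```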
