Parolin 0.7.9 6796
Console (soon DLLs) to do a tar-like job
range_common.h File Reference

Common things for range encoder and decoder. More...

#include "common.h"


Macros

#define RC_SHIFT_BITS   8
 
#define RC_TOP_BITS   24
 
#define RC_TOP_VALUE   (UINT32_C(1) << RC_TOP_BITS)
 
#define RC_BIT_MODEL_TOTAL_BITS   11
 
#define RC_BIT_MODEL_TOTAL   (UINT32_C(1) << RC_BIT_MODEL_TOTAL_BITS)
 
#define RC_MOVE_BITS   5
 
#define bit_reset(prob)
 
#define bittree_reset(probs, bit_levels)
 

Typedefs

typedef uint16_t probability
 Type of probabilities used with range coder.
 

Detailed Description

Common things for range encoder and decoder.

Macro Definition Documentation

◆ bit_reset

#define bit_reset ( prob )
Value:
prob = RC_BIT_MODEL_TOTAL >> 1

◆ bittree_reset

#define bittree_reset ( probs, bit_levels )
Value:
for (uint32_t bt_i = 0; bt_i < (1 << (bit_levels)); ++bt_i) \
    bit_reset((probs)[bt_i])

◆ RC_BIT_MODEL_TOTAL

#define RC_BIT_MODEL_TOTAL   (UINT32_C(1) << RC_BIT_MODEL_TOTAL_BITS)

◆ RC_BIT_MODEL_TOTAL_BITS

#define RC_BIT_MODEL_TOTAL_BITS   11

◆ RC_MOVE_BITS

#define RC_MOVE_BITS   5

◆ RC_SHIFT_BITS

#define RC_SHIFT_BITS   8

◆ RC_TOP_BITS

#define RC_TOP_BITS   24

◆ RC_TOP_VALUE

#define RC_TOP_VALUE   (UINT32_C(1) << RC_TOP_BITS)

Typedef Documentation

◆ probability

typedef uint16_t probability

Type of probabilities used with range coder.

This needs to be at least a 12-bit integer, so uint16_t is a logical choice. However, on some architecture and compiler combinations, a bigger type may give better speed, because the probability variables are accessed a lot. On the other hand, a bigger probability type increases the cache footprint, since there are 2 to 14 thousand probability variables in LZMA (assuming the limit of lc + lp <= 4; with lc + lp <= 12 there would be about 1.5 million variables).

With malicious files, the initialization speed of the LZMA decoder can become important. In that case, smaller probability variables mean that there are fewer bytes to write to RAM, which makes initialization faster. With a big probability type, initialization can become so slow that it becomes a problem, e.g. for email servers doing virus scanning.

I will be sticking to uint16_t unless some specific architectures are much faster (20-50 %) with uint32_t.


Update in 2024: The branchless C and x86-64 assembly code was written so that probability is assumed to be uint16_t. (In contrast, the LZMA SDK 23.01 assembly supports both types.)