Erik de Jong & Willem Bouma
Arithmetic Coding
Octree
Compression
Surface Approximation
Child Cells Configurations
Single Child cell configurations
Results
Questions
Assign to every symbol a range from the
interval [0-1]. The size of the range represents
the probability of the symbol occurring.
Example:
A:
B:
C:
D:
60%
20%
10%
10%
[ 0.0,
[ 0.6,
[ 0.8,
[ 0.9,
0.6 )
0.8 )
0.9 )
1.0 )
(end of data symbol)
A
B
C
D
Ranges: A – [0,0.6), B – [0.6,0.8), C – [0.8,0.9), D – [0.9,1)
The example shows the decoding of 0.538.
Huffman coding is a specialized case of arithmetic
coding
Every symbol is converted to a bit sequence of integer
length
Probabilities are rounded to negative powers of two
Advantage: Can decode parts of the input stream
Disadvantage: Arithmetic coding comes much closer
to the optimal entropy encoding
SQUEEL
Estimate/Approximate as much as possible
Store only the differences w.r.t. the estimate
A better estimate ->
smaller numbers
lower entropy
better compression
What is an Octree?
Per cell we store only the occupied child cells
Options:
Store a single byte, each bit representing a child cell
(for example 11001100)
Store the number of occupied cells e and a tupel T
with the indices of the occupied cells
(for example e=4, T={0,1,4,5})
We will approximate/estimate/compress:
The surface
Number of non-empty child cells
Child cell configuration
Index compression
Single child cell configuration
Every level of the octree will yield a preliminary
approximation Q of the complete point cloud P
For a cell that is to be subdivided:
Predict surface F based on Moving Least Squares
(MLS) on k nearest points in Q
Prediction of number of non-empty cells e
Based on estimation of the sampling density ρ
Local sampling density ρi at point pi in P:
k, nearest points
pi, point in the point cloud P
qi, point on the surface approximation Q
Sampling density:
Guess the number of child cells e based on:
The area of the plane F
The sampling density ρ
Quality of prediction
Graphs show the difference between te estimated value and the true value.
(a) The level 5 octree
(b) The level 7 octree
(c) The entire octree.
Given the number of non-empty child cells e
there are only a limited number of
configurations:
We have an array with all weighted possible
configurations, sorted in ascending order
Each configuration of the subdivision is
encoded as an index of the array.
Common configurations get lower weights,
means smaller indices, means lower entropy.
Cell centers tend to be close to F.
To find the weight of a configuration:
Sum up the (L1) distances from the cell centers to F
Index of the configuration in the sorted array
is encoded using arithmetic coding under two
contexts
First context: the octree level of the cell C
Second context: the expressiveness e(F)
e(F) reflects the angle of the plane to the
coordinate directions
In order to use e(F) as a context for arithmetic
coding it has to be quantized.
It has been found that five bins was sufficient
and delivered the best results.
We can exploit an observation for cells that
have only one occupied child cell.
Scanning devices often have a regular
sampling grid.
It is possible to predict samples on the
surface, rather than just close to the surface.
This is relevant for the finer levels in the
octree hierarchy.
For cells with only one child cell we can
predict T based on the nearest neighbours of
points
m, centroid of the k nearest neighbours
Quite suprisingly, the cell center projections
on F that are farthest away are most likely to
be occupied.
The area farther away can be seen as
undersampled.
So a sample in that area becomes more likely
since we expect the surface to be regular and
no undersampling should exist.
The weights for the eight possible
configurations are given as
c(T), cell center of T
prj(F,c(T)), projection of c(T) on F
Extra attributes can be encoded with an
octree
Color
Normals
Compressing color
Same two-step method as with coordinates
Different prediction functions are needed
Model
Number of
points
Raw size
Compressed
(bpp)
Compressed
size
Dragon
2.748.318
31,44 MB
5,06
1,66 MB
Venus
~134.000
1569 KB
11,27
184 KB
Rabbit
~67.000
768 KB
11,37
93 KB
MaleWB
~148.000
1734 KB
8,87
160 KB
(bpp) Bits Per Point
Raw size is assumed to use 3 times 4 Bytes per point.
The octree uses 12 levels. Except for the Dragon for which it is unknown.
(b) uses 1.89 bpp
?