Skip to content

Qualitative Estimator for Entropy and Correlation in String

License

Notifications You must be signed in to change notification settings

cocconat/Queecs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

21 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Queecs

Qualitative Estimator for Entropy and Correlation in String

#Brief therorical introduction to string compression Through Information theory it s demonstrable [1][2] that bzip is a optimal compressor whose compression ratio tend to the entropy rate defined by Shannon. the idea is then verify the structure of the string we re going to analyze: any sequence (except for bernoullian sequence) has an internal structure which can be modellized as a markovian chain, then inside it there are some sequence which can be predicted (in a probability meaning), for repetition or for any internal rule we re not (directly) interested to. Decomposing the sequence in to many blocks and mixing it we will destroy this rules and the compression will be harder, since a structured text will be much more forecastable, then compressable, then another with same symbols in a different position. Queecs recreate a qualitative curve which express how long and how strong theese correlations are.

#how to run Queecs


make
./queecs file.xxx samples token

-samples is how many statistical samples you want to run -token is a identifier for the data file you re creating

Your results are stored in results folder as curve_token.dat

#details the blocks mixer program is written in C++ the process is execute in bash with a foor loop over a list of block dimensions.

I presented this work as my bachelor thesis, and I m happy in make it reusable, if you re interested mail to me.

#Thank you

Cocconat

About

Qualitative Estimator for Entropy and Correlation in String

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published