Lempel-Ziv Complexity (LZ76)

Tldr

Determines the minimum number of unique substrings needed to decompose a string (Lempel1976, Ziv1978, Ziv1977).

Implementation

measures the difficulty of guessing the signal’s value at that point in time: the more difficult, the more information is gained by learning the signal’s actual value

  • if signal preferentially takes some values, it is easier to predict (low entropy)
  • if signal takes all available values with similar frequency (flat histogram), high uncertainty and observing it would be v informative (high entropy)
  • entropy takes into account the relative frequency of values, not the order in which they appear

The original algorithm Lempel1976 breaks up a signal into patterns and uses the number of distinct patterns to quantify the complexity of that signal. Regular sequences with repeating patterns have low LZ, and rich signals with many have high lZ.

Ziv1978 showed this number of patterns can efficiently estimate entropy rate of the process that generated the signal. Given a discrete stochastic process with complexity , entropy rate can be estimated as:

Limitations

  • application of LZ requires discretised data which loses information & introduces artificial nonlinearities Ibanez-Molina2015 Yeh2018 Mediano2023
  • LZ needs to be provided with relatively large windows of data (on order of seconds for EEG) which limits its temporal resolution impossible to do complexity analyses of non-stationary data, including fast time-locked neural events like ERPs. Mediano2023
  • Not possible to do a principled spectral decomposition of LZ, which limits our understanding of the relationship between complexity and power spectrum Mediano2023

Alternatives

complexity via state-space entropy rate


Lempel-Ziv
Psychedelic studies using Lempel-Ziv complexity

References