
Forum: help

RE: Some questions about MPTK
By: Rémi Gribonval on 2008-07-07 09:16
[forum:90325]
Dear Supratim,

I think you got it right in terms of relating our version of MP with the dyadic dictionary of Mallat and Zhang.

To improve the frequency resolution you can over-sample the Gabor dictionary using the fftSize parameter (e.g., for windowLen = 32 use fftSize = 512 if you wish).
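
The effect of a larger fftSize can be seen outside MPTK by zero-padding a short windowed segment before the FFT. The following is only a NumPy sketch, not MPTK code: the test frequency, the Hann window, and the helper name `spectrum_peak` are arbitrary choices made for the illustration.

```python
import numpy as np

def spectrum_peak(window_len, fft_size, freq):
    """Return the normalized frequency (cycles/sample) of the largest FFT bin."""
    n = np.arange(window_len)
    segment = np.hanning(window_len) * np.cos(2 * np.pi * freq * n)
    # np.fft.rfft zero-pads the segment up to fft_size samples,
    # which samples the same spectrum on a finer frequency grid.
    spectrum = np.abs(np.fft.rfft(segment, n=fft_size))
    return np.argmax(spectrum) / fft_size

true_freq = 0.2                             # cycles per sample
coarse = spectrum_peak(32, 32, true_freq)   # frequency grid step 1/32
fine = spectrum_peak(32, 512, true_freq)    # frequency grid step 1/512
print(coarse, fine)
```

The finer grid locates the peak more precisely, but the width of the main lobe is still set by the window length, so two close frequencies are not separated any better than before.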

To add Fourier atoms on a signal of length N, one way is simply to use Gabor atoms with
-windowtype = rectangular
-windowLen = fftSize = N
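
The reason this works is that a Gabor atom is a window multiplied by a sinusoid, so a rectangular window covering the whole signal with fftSize = N leaves only the N discrete Fourier atoms. A minimal NumPy check of that claim, independent of MPTK, is sketched below. Complex exponentials and unit-norm scaling are assumptions made here for simplicity; the sketch does not reproduce how MPTK parameterizes amplitude and phase.

```python
import numpy as np

N = 64  # signal length, chosen arbitrarily for the check

def gabor_atom(window, k, fft_size):
    """Unit-norm complex Gabor atom: window times exp(2*pi*i*k*n/fft_size)."""
    n = np.arange(len(window))
    atom = window * np.exp(2j * np.pi * k * n / fft_size)
    return atom / np.linalg.norm(atom)

rect = np.ones(N)  # rectangular window with windowLen = N
atoms = np.array([gabor_atom(rect, k, N) for k in range(N)])

# Gram matrix of the atoms; the identity means they form an orthonormal basis.
gram = atoms @ atoms.conj().T
is_fourier_basis = np.allclose(gram, np.eye(N))
print(is_fourier_basis)
```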

Loading your data in binary format directly from a file is feasible but may require minor modifications of the C code. The MP_Signal_c class has a method init(filename) which is used to load a sound file (not necessarily in wave format). According to the documentation of the libsndfile library that we are using, it can load some Matlab data formats, so you may experiment with this. If it does not work, there is also a method read_from_float_file to read binary format signal files. Using this method requires changing the code of the mpd command line utility and recompiling.

Using the mpf command line utility you can extract a book which only contains a specific set of atoms. Combining it with mpr you can reconstruct each waveform separately. Alternatively, you can play with the bookedit (beta) matlab script which should enable manipulations of this type with various visualisations of time-frequency plots.

There is a (beta) command mpview which can build a Wigner-Ville type TF representation as described in the Mallat and Zhang paper. It does not deal with boundary conditions, though.

Thank you for your interest in MPTK, I hope these answers will be of some help!

Best regards,

Remi.


Some questions about MPTK
By: Supratim Ray on 2008-06-22 23:29
[forum:78781]
Dear MPTK administrators,

I have read and installed your version of MP (mptk), and I find it very promising and useful. I have used an old implementation of MP based on Mallat and Zhang's original code, and I am now trying to upgrade to a faster and more flexible implementation. I have a few general questions about MP. Any feedback would be very helpful.

1. As a starting point, it would be nice to compare your version of MP with Mallat and Zhang's (MZ) original code based on a dyadic dictionary. I have written such a dictionary for a signal of size 1024 (attached below). Please let me know if the following details are correct:

a. For a dyadic dictionary we have {s, u, w} = {2^j, p*2^(j-1), 2*pi*k*2^-(j+1)}. Comparing this with your "MPTK: Matching pursuit made tractable" paper, this would correspond to choosing a Gaussian window with windowLen: 2^j, windowShift: 2^(j-1) and fftSize: 2^(j+1), for 0<j<log2(N). Is this correct?

b. If the previous point is correct, then each window in an MZ Gabor atom would be exp(-pi*i*i) with i between -1 and 1. Comparing this with the dsp_windows.c program (assuming a is the window opt parameter), the window is (almost) exp(-1/(2a)*i*i) for i = -1/2 to 1/2. Rescaling the MZ variable to the range -1/2 to 1/2 gives 1/(8a) = pi, or a = 1/(8*pi) ~ 0.04. The standard value for your implementation is half of this (0.02). Am I doing something wrong?
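
The arithmetic in (b) can be checked numerically. The sketch below assumes the two window forms exactly as written in this post, exp(-pi*t^2) on [-1, 1] for MZ and exp(-x^2/(2a)) on [-1/2, 1/2] for MPTK; neither form has been verified here against the MPTK source.

```python
import math
import numpy as np

# Matching exp(-pi * t**2), t in [-1, 1], to exp(-x**2 / (2*a)), x in [-1/2, 1/2],
# with t = 2*x gives 4*pi = 1/(2*a), i.e. a = 1/(8*pi).
a = 1.0 / (8.0 * math.pi)

# Sample both window shapes on the same grid to confirm they coincide.
x = np.linspace(-0.5, 0.5, 101)
mz_window = np.exp(-math.pi * (2 * x) ** 2)
mptk_like_window = np.exp(-x ** 2 / (2 * a))

print(round(a, 6), np.allclose(mz_window, mptk_like_window))
```

The rounded value is the windowopt used in the dictionary attached below.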

c. In the version of Mallat and Zhang's code that I have, to further improve the resolution they used a dyadic dictionary for approximate atom localization and then refined the decomposition with a Newton search on a finer grid depending on the scale (they had better temporal sampling of small scale atoms and better frequency sampling of long scale atoms). In your version, only the time axis can be over-sampled by using a smaller windowShift, but not the frequency axis because you are always using the FFT. Is there a way to over-sample the frequency domain (i.e., to have more than N points on the frequency axis for an FFT window of size N)?

d. I could not figure out how to add Fourier atoms. I suppose that would correspond to MDCT/MDST dictionaries with some parameters. Can you help me with this?

2. Is it possible to load binary input data directly from a file? At present I'm converting my data to .wav using wavwrite in MATLAB, but it would be nice to avoid this extra step.

3. Similarly, it would be nice to reconstruct the time-domain signal and time-frequency spectra of individual atoms. Do you have any code to do that?

4. How do you compute the time-frequency spectrum? The Wigner-Ville (WV) distribution is defined only for continuous signals, and I've seen a few different implementations of the WV distribution for discrete signals. In particular, how do you deal with the boundary conditions? For example, for the atomic decomposition, the signal is considered infinitely long with non-overlapping copies of itself, leading to edge effects near the boundaries (at least in the original MZ version). For the time-frequency map, we could consider a similar tiling, so that the energy would also have similar edge effects (an atom near t = 0 would cause some energy to appear near t = 1024 for a signal of size N = 1024). Do you do any "wrapping" in the time-frequency domain?

Thank you for the time and effort you have put into making this code available.

Sincerely,
Supratim Ray



<<<Mallat and Zhang Dictionary (without the Fourier atoms) for a signal of size 1024. >>>

<?xml version="1.0" encoding="ISO-8859-1"?>
<dict>
<libVersion>0.2</libVersion>
<blockproperties name="GAUSS-WINDOW">
<param name="windowtype" value="gauss"/>
<param name="windowopt" value="0.039789"/>
</blockproperties>

<!-- Gabor block for scale 1. -->
<block uses="GAUSS-WINDOW">
<param name="type" value="gabor"/>
<param name="windowLen" value="2"/>
<param name="windowShift" value="1"/>
<param name="fftSize" value="4"/>
</block>


<!-- Gabor block for scale 2. -->
<block uses="GAUSS-WINDOW">
<param name="type" value="gabor"/>
<param name="windowLen" value="4"/>
<param name="windowShift" value="2"/>
<param name="fftSize" value="8"/>
</block>


<!-- Gabor block for scale 3. -->
<block uses="GAUSS-WINDOW">
<param name="type" value="gabor"/>
<param name="windowLen" value="8"/>
<param name="windowShift" value="4"/>
<param name="fftSize" value="16"/>
</block>


<!-- Gabor block for scale 4. -->
<block uses="GAUSS-WINDOW">
<param name="type" value="gabor"/>
<param name="windowLen" value="16"/>
<param name="windowShift" value="8"/>
<param name="fftSize" value="32"/>
</block>


<!-- Gabor block for scale 5. -->
<block uses="GAUSS-WINDOW">
<param name="type" value="gabor"/>
<param name="windowLen" value="32"/>
<param name="windowShift" value="16"/>
<param name="fftSize" value="64"/>
</block>


<!-- Gabor block for scale 6. -->
<block uses="GAUSS-WINDOW">
<param name="type" value="gabor"/>
<param name="windowLen" value="64"/>
<param name="windowShift" value="32"/>
<param name="fftSize" value="128"/>
</block>


<!-- Gabor block for scale 7. -->
<block uses="GAUSS-WINDOW">
<param name="type" value="gabor"/>
<param name="windowLen" value="128"/>
<param name="windowShift" value="64"/>
<param name="fftSize" value="256"/>
</block>


<!-- Gabor block for scale 8. -->
<block uses="GAUSS-WINDOW">
<param name="type" value="gabor"/>
<param name="windowLen" value="256"/>
<param name="windowShift" value="128"/>
<param name="fftSize" value="512"/>
</block>


<!-- Gabor block for scale 9. -->
<block uses="GAUSS-WINDOW">
<param name="type" value="gabor"/>
<param name="windowLen" value="512"/>
<param name="windowShift" value="256"/>
<param name="fftSize" value="1024"/>
</block>


<!-- Dirac block -->
<block>
<param name="type" value="dirac"/>
</block>
</dict>
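
For other signal lengths, a dictionary with the same layout can be generated rather than typed by hand. The sketch below uses a hypothetical helper, `make_dyadic_dict`, which is not part of MPTK. It copies the element and parameter names from the file above and emits one Gabor block per scale j with windowLen = 2^j, windowShift = 2^(j-1) and fftSize = 2^(j+1) for 0 < j < log2(N), followed by a Dirac block. Whether a given MPTK version accepts the result unchanged has not been checked.

```python
def make_dyadic_dict(n, windowopt=0.039789, lib_version="0.2"):
    """Build an MZ-style dyadic dictionary as an XML string.

    Hypothetical helper; element and parameter names are copied from the
    dictionary attached to this post.
    """
    if n < 4 or n & (n - 1):
        raise ValueError("n must be a power of two, at least 4")
    log2_n = n.bit_length() - 1  # exact log2 for a power of two
    lines = [
        '<?xml version="1.0" encoding="ISO-8859-1"?>',
        "<dict>",
        f"<libVersion>{lib_version}</libVersion>",
        '<blockproperties name="GAUSS-WINDOW">',
        '<param name="windowtype" value="gauss"/>',
        f'<param name="windowopt" value="{windowopt}"/>',
        "</blockproperties>",
    ]
    for j in range(1, log2_n):  # scales 0 < j < log2(n)
        lines += [
            f"<!-- Gabor block for scale {j}. -->",
            '<block uses="GAUSS-WINDOW">',
            '<param name="type" value="gabor"/>',
            f'<param name="windowLen" value="{2 ** j}"/>',
            f'<param name="windowShift" value="{2 ** (j - 1)}"/>',
            f'<param name="fftSize" value="{2 ** (j + 1)}"/>',
            "</block>",
        ]
    lines += [
        "<!-- Dirac block -->",
        "<block>",
        '<param name="type" value="dirac"/>',
        "</block>",
        "</dict>",
    ]
    return "\n".join(lines) + "\n"

xml_text = make_dyadic_dict(1024)
print(xml_text)
```

Calling `make_dyadic_dict(1024)` yields the same nine Gabor blocks and the Dirac block as the hand-written file, without the blank lines between blocks.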