This codebook.txt file was generated on 20170507 by Deepashree Sengupta


-------------------
GENERAL INFORMATION
-------------------


1. Title of Dataset: Filtered Audio Clips from Approximate FIR Filters Designed Using the SABER Algorithm 


2. Author Information


  Principal Investigator Contact Information
           Name: Dr. Sachin S. Sapatnekar
           Institution: University of Minnesota
           Address: 200 Union Street SE, Minneapolis, MN 55455
           Email: sachin@umn.edu

  Associate or Co-investigator Contact Information
           Name: Dr. Jiang Hu
           Institution: Texas A&M University
           Address: MS 3128, Department of ECE, Texas A&M University, College Station, TX 77843
           Email: jianghu@tamu.edu

  Associate or Co-investigator Contact Information
           Name: Farhana S. Snigdha
           Institution: University of Minnesota
           Address: 200 Union Street SE, Minneapolis, MN 55455
           Email: sharm304@umn.edu

  Associate or Co-investigator and creator of the dataset Contact Information
           Name: Deepashree Sengupta
           Institution: University of Minnesota
           Address: 200 Union Street SE, Minneapolis, MN 55455
           Email: deepashreesengupta@gmail.com


3. Date of data collection (single date, range, approximate date): 2016-11-17


4. Geographic location of data collection (where was data collected?): University of Minnesota, Twin Cities 


5. Information about funding sources that supported the collection of the data: This work was supported in part by the NSF under awards CCF-1162267, CCF-1525925, and CCF-1525749, and by the 2016 Doctoral Dissertation Fellowship from the University of Minnesota.


--------------------------
SHARING/ACCESS INFORMATION
-------------------------- 


1. Licenses/restrictions placed on the data: CC0 1.0 Universal Public Domain Dedication


2. Links to publications that cite or use the data: Sengupta, Deepashree; Snigdha, Farhana S.; Hu, Jiang; Sapatnekar, Sachin S., "SABER: Selection of Approximate Bits for the Design of Error Tolerant Circuits", In Proceedings of the ACM/EDAC/IEEE Design Automation Conference, 2017. DOI:10.1145/3061639.3062314 http://dx.doi.org/10.1145/3061639.3062314 (Publication date: June 2017)


3. Was data derived from another source? Yes
           If yes, list source(s): http://marsyasweb.appspot.com/download/data_sets/, Last accessed May 2, 2017


4. Recommended citation for the data: Sengupta, Deepashree; Snigdha, Farhana S.; Hu, Jiang; Sapatnekar, Sachin S. (2017). Filtered Audio Clips from Approximate FIR Filters Designed Using the SABER Algorithm. Retrieved from the Data Repository for the University of Minnesota, https://doi.org/10.13020/D6BP4X.



---------------------
DATA & FILE OVERVIEW
---------------------

The results of using accurate and approximate finite impulse response (FIR) filters on noisy song clips are summarized here. 
 The names of the clips are listed along with their descriptions. 
 For each song/clip (e.g., Country genre), there are five versions:
 
 a. Country_1NOISY - Noisy version of the song downloaded from the GTZAN Genre Collection (marsyas.info/downloads/datasets.html).
 b. Country_2EXACTfilter - Filtered version using exact hardware of the FIR filter
 c. Country_3APPROXifilterBudget100K - Filtered version using approximate hardware of the FIR filter with error variance 100,000.
 d. Country_4APPROXifilterBudget200K - Filtered version using approximate hardware of the FIR filter with error variance 200,000.
 e. Country_5APPROXifilterBudget400K - Filtered version using approximate hardware of the FIR filter with error variance 400,000.
 
 The original songs were processed with a low-pass filter, and 6 seconds of audio were selected from each of the eight genres for our analysis. Colored (high-frequency) noise was added to each 6-second clip to generate the noisy signal. This signal was then passed through four versions of an order-33 FIR filter to obtain the filtered versions: the exact configuration and three approximate configurations with different error variance budgets, the latter designed using the SABER algorithm described in the next section.
 
 While the filtered signal still has some noise, it is within acceptable auditory range of human ears as compared to the noisy signal. The corresponding publication shows how using the approximate version of the FIR filter can lead to power savings compared to the exact version, with minimal compromise on the user experience in terms of the quality of the output.


--------------------------
METHODOLOGICAL INFORMATION
--------------------------


1. Description of methods used for collection/generation of data: 

We developed an algorithm, Selection of Approximate Bits for the Design of Error Tolerant Circuits (SABER), to generate an approximate circuit with the aim of maximizing the number of approximate bits in a circuit (which translates to power/area minimization) so that it uses minimal resources under a specified error budget. Our work demonstrates results on fixed-point integer arithmetic operations.

The key ingredient of any methodology based on approximate design is an accurate quantification of the error injected into a computation by the approximation scheme. We use the variance of this error as the error metric to be constrained within a user-specified budget. We use an analytical expression of this error variance as a function of the total approximation in a circuit.

Let us consider a circuit representing an arithmetic operation with two N-bit operands, producing an N-bit output. An approximate implementation of the hardware unit uses fewer resources than its exact counterpart. Typically, some of the least significant bits (LSBs) can be allowed to be erroneous, as this introduces only a limited level of approximation. Hence the hardware connected to y LSBs, for example, is approximate, while that connected to the (N-y) most significant bits (MSBs) is accurate. Clearly, the higher the value of y, the greater the power saving due to the imprecise hardware, although the error is also higher. We use the parameter y, referred to as the number of approximate LSBs, to quantify the amount of approximation.
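
To make the role of y concrete, the sketch below models a 20-bit adder whose y LSBs are approximated and estimates the resulting error variance by simulation. The approximation scheme (bitwise OR on the low bits, with no carry into the exact upper part) and the names approx_add and error_variance are assumptions chosen for illustration; this codebook does not specify which approximate adder is used, and the publication's own variance formulation may differ.

```python
import random


def approx_add(a: int, b: int, y: int, n_bits: int = 20) -> int:
    """Add two unsigned integers with the y least significant bits approximated.

    Assumed scheme (hypothetical, not taken from the dataset): the low y bits
    are the bitwise OR of the operands, and the upper (n_bits - y) bits are
    added exactly without any carry from the low part.
    """
    mask = (1 << n_bits) - 1
    low_mask = (1 << y) - 1
    low = (a | b) & low_mask
    high = ((a >> y) + (b >> y)) << y
    return (high | low) & mask


def error_variance(y: int, n_bits: int = 20, trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the variance of (approximate sum - exact sum)."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        # Operands use n_bits - 1 bits so the exact sum never overflows.
        a = rng.getrandbits(n_bits - 1)
        b = rng.getrandbits(n_bits - 1)
        errors.append(approx_add(a, b, y, n_bits) - (a + b))
    mean = sum(errors) / trials
    return sum((e - mean) ** 2 for e in errors) / trials


for y in (0, 2, 4, 6, 8):
    print(y, error_variance(y))
```

Under this assumed scheme the estimated variance is zero for y = 0 and grows quickly as y increases, which is the tradeoff between savings and error that the parameter y captures.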

The variance of the error at the output of an adder node is formulated empirically as a function of the number of approximate LSBs in it. Using this function, we can compute the error variance of an arithmetic circuit abstracted as a directed acyclic graph (DAG) whose nodes are approximate adders. The detailed derivations can be found in the corresponding publication. The results for the adder DAG can be generalized to DAGs whose nodes contain adders, subtractors, multipliers, and dividers since the fundamental element of these operations is an adder, with shifters being implemented by appropriately routing the outputs of one DAG node to the inputs of others. 

Let us consider a DAG with multiple adder nodes. The power savings increase with increasing levels of approximation in the DAG, and all components of the power savings (dynamic and leakage) are proportional to the number of approximate LSBs. Therefore, the total number of approximate LSBs in the DAG is a good surrogate objective function that captures the essential trend of power savings, which we aim to maximize. If the total error variance accumulated as a result of this approximation is constrained by a maximum error variance budget specified by the user, then we can formulate an integer non-linear optimization problem to find the number of approximate LSBs in each node of the DAG. However, this is an NP-hard problem; hence, we develop fast heuristics (the SABER algorithm) to solve it.

SABER begins by relaxing the integer constraint on the optimization problem for which an analytical solution is obtained. Next, special techniques are applied to convert the solution of the relaxed problem to that of the original problem.    

Please refer to the corresponding publication for the detailed solution of the optimization problem.
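
As a rough illustration of the relax-then-round idea only, the sketch below maximizes the total number of approximate LSBs under a variance budget. Everything specific in it is an assumption: the per-node variance model w_i * (4^y_i - 1), the weights, and the floor, repair, and greedy-increment steps are hypothetical stand-ins, not the variance expression or the conversion techniques used by SABER.

```python
import math


def total_variance(weights, ys):
    """Assumed model: node i contributes weights[i] * (4**ys[i] - 1)."""
    return sum(w * (4 ** y - 1) for w, y in zip(weights, ys))


def relax_and_round(weights, budget, max_bits=20):
    """Heuristic integer solution of: maximize sum(ys) s.t. variance <= budget.

    Expects positive weights and a non-negative budget.
    """
    n = len(weights)
    # Relaxed (real-valued) problem: the Lagrangian condition gives every
    # node an equal share, w_i * 4**y_i = (budget + sum(weights)) / n.
    share = (budget + sum(weights)) / n
    ys = [min(max_bits, max(0, math.floor(math.log(share / w, 4))))
          for w in weights]
    # Repair: clipping negative relaxed values up to 0 can overshoot the
    # budget, so shrink the node with the largest contribution until feasible.
    while total_variance(weights, ys) > budget:
        i = max(range(n), key=lambda k: weights[k] * (4 ** ys[k] - 1))
        ys[i] -= 1
    # Greedy fill: spend leftover budget on the cheapest single-bit increments.
    while True:
        candidates = [k for k in range(n) if ys[k] < max_bits]
        if not candidates:
            break
        i = min(candidates, key=lambda k: 3 * weights[k] * 4 ** ys[k])
        step_cost = 3 * weights[i] * 4 ** ys[i]  # variance added by one more bit
        if total_variance(weights, ys) + step_cost > budget:
            break
        ys[i] += 1
    return ys


weights = [1.0, 4.0, 16.0]  # hypothetical per-node sensitivities
print(relax_and_round(weights, 100000))
```

The design choice worth noting is that the relaxed solution is only a starting point: the integer answer is obtained by rounding down to stay feasible and then recovering some of the lost approximation greedily.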


2. Methods for processing the data: 

The data was generated to evaluate our algorithm on a real-world example, by checking the sound quality of filtered signals from an approximate finite impulse response (FIR) filter. The results of filtering by approximate filters designed through SABER are the audio files uploaded here. 

The signals under study comprise 150,000 samples of audio clips from eight different genres, sampled at their prespecified frequency of 22.05 kHz and mixed with high-frequency noise. We constrain the signal-to-noise ratio (SNR) degradation between an exact filter and an approximate filter to be 50 dB, to ensure comfortable loudness and clarity.

The normalized passband and stopband frequencies of the FIR filter are 0.50 and 0.65, respectively, and the minimum-order filter that MATLAB generated had order 33. The filter coefficients have been scaled by 1024 to facilitate integer arithmetic. All adders have a word length of 20 bits, and the multiplications are implemented by array multipliers with add and shift operations.
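
The following sketch shows what scaling coefficients by 1024 and filtering in integer arithmetic can look like. It is not the filter used for this dataset: the taps come from a Hamming-windowed sinc design with an assumed cutoff of 0.575 (midway between the stated band edges) rather than MATLAB's minimum-order design, the output is rescaled by an integer division that the source does not describe, and the function names are hypothetical.

```python
import math


def design_lowpass(num_taps=34, cutoff=0.575):
    """Hamming-windowed sinc low-pass taps; cutoff is normalized so 1.0 = Nyquist.

    An order-33 FIR filter has 34 taps.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = cutoff if x == 0 else math.sin(math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # normalize to unity gain at DC


def quantize(taps, scale=1024):
    """Scale real-valued taps and round them to integers."""
    return [round(t * scale) for t in taps]


def fir_filter_int(samples, int_taps, scale=1024):
    """Direct-form FIR filtering using integer multiplies and adds only."""
    out = []
    for i in range(len(samples)):
        acc = 0
        for k, c in enumerate(int_taps):
            if i - k >= 0:
                acc += c * samples[i - k]
        out.append(acc // scale)  # undo the coefficient scaling (floor division)
    return out


int_taps = quantize(design_lowpass())
print(int_taps)
print(fir_filter_int([100] * 40, int_taps)[-1])        # constant input passes through
print(fir_filter_int([100, -100] * 20, int_taps)[-1])  # Nyquist-rate input is suppressed
```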

To formulate the optimization problem, we need the error variance budget for each audio clip. Since the different genres of music are differently sensitive to approximation in the FIR filter, we select the respective error budgets from the tradeoff plot of SNR degradation versus error variance for each clip as depicted in Fig. 6. We obtain this plot by sweeping the error variance to first obtain different configurations of approximate filters using SABER. We then filter the noisy signal using each such filter and compute the SNR degradation from the accurately filtered signal. Using such a plot, we can select the target error variance for various target SNR degradation values.
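
The selection step can be sketched as follows, under a stated assumption about the metric: here the quality of an approximate output is measured as the signal-to-error ratio, in dB, between the exactly filtered signal and the approximately filtered one. This is one plausible reading of "SNR degradation"; the publication's definition may differ, and the function names and sweep values are hypothetical.

```python
import math


def signal_to_error_db(exact, approx):
    """Ratio (in dB) of exact-output power to the power of (exact - approx)."""
    signal_power = sum(e * e for e in exact)
    error_power = sum((e - a) ** 2 for e, a in zip(exact, approx))
    if error_power == 0:
        return math.inf  # identical outputs
    return 10 * math.log10(signal_power / error_power)


def pick_budget(sweep, min_db):
    """Return the largest error variance budget whose ratio stays >= min_db.

    sweep is a list of (budget, ratio_db) pairs from the error variance sweep.
    Returns None if no budget meets the target.
    """
    feasible = [budget for budget, ratio in sweep if ratio >= min_db]
    return max(feasible) if feasible else None


# Hypothetical sweep results, for illustration only.
sweep = [(100_000, 60.0), (200_000, 55.0), (400_000, 48.0)]
print(pick_budget(sweep, 50.0))
```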

The three error variance budgets are chosen to be 100,000, 200,000, and 400,000, corresponding to which three versions of approximate filters are constructed.

The resulting SNR degradation values for the eight audio clips while using the three filters are summarized in the corresponding publication. 

The audio clips along with their filtered counterparts are listed in this archive.


3. Instrument- or software-specific information needed to interpret the data:

The audio files are in the .wav format and can be played using any generic audio player software. The audio data can also be read as a vector using the wavread function in MATLAB (in recent MATLAB releases, wavread has been removed and audioread serves the same purpose).
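
Outside MATLAB, the clips can be read with Python's standard wave module. The sketch below assumes 16-bit PCM files, which has not been verified for this archive, and read_wav_16bit is a hypothetical helper name.

```python
import struct
import wave


def read_wav_16bit(path):
    """Return (sample_rate, samples) for a 16-bit PCM WAV file.

    Assumption: the file stores 16-bit PCM samples; other sample widths
    raise an error. For multi-channel files only the first channel is kept.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        n_channels = wf.getnchannels()
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return rate, list(values[::n_channels])
```

For example, `rate, samples = read_wav_16bit("Country_1NOISY.wav")` would load the noisy Country clip, assuming the file carries that name with a .wav extension in the working directory.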


4. People involved with sample collection, processing, analysis and/or submission: Data authors 


-----------------------------------------
DATA-SPECIFIC INFORMATION: 
-----------------------------------------

The name of each audio clip is listed along with its description. 
For each clip (e.g., Country genre), there are five versions:
 
 a. Country_1NOISY - Noisy version of the song downloaded from the GTZAN Genre Collection (marsyas.info/downloads/datasets.html).
 b. Country_2EXACTfilter - Filtered version using exact hardware of the FIR filter
 c. Country_3APPROXifilterBudget100K - Filtered version using approximate hardware of the FIR filter with error variance 100,000.
 d. Country_4APPROXifilterBudget200K - Filtered version using approximate hardware of the FIR filter with error variance 200,000.
 e. Country_5APPROXifilterBudget400K - Filtered version using approximate hardware of the FIR filter with error variance 400,000.