QCanvas: Implementation and Functions

Data clustering

QCanvas provides eight popular measures for generating the similarity matrix: Correlation uncenter, Correlation center, Absolute corr-uncenter, Absolute corr-center, Spearman rank, Kendall's tau, Euclidean distance, and City-block distance. All of these measures have typically been included among the data clustering methods of previous tools [4]. In QCanvas, the calculation of the similarity matrix can be applied selectively and independently to the data for the x-axis and the y-axis. Hierarchical clustering is simultaneously carried out based on the established similarity matrices. QCanvas provides several algorithms for hierarchical clustering, such as the average, centroid, single, and complete methods. QCanvas uses a standard window-based graphical user interface (GUI) with multiple windows, so that the patterns produced by different combinations of similarity matrices and hierarchical clustering methods can be compared visually. The program provides quantitative trees that display clustering patterns and similarity measures together.

Heatmap optimization for pattern recognition

QCanvas recognizes text-based data in a matrix format. For demonstration purposes, a small microarray gene expression dataset is included in the software package and can be downloaded from the website (http://compbio.sookmyung.ac.kr/~qcanvas). Once the input data are imported into the QCanvas window, a heatmap of the non-clustered data is displayed (Fig. 2A). The user can easily test various data-clustering and tree-building methods on the raw data and interactively select appropriate heatmaps with tree structures (Fig. 2B). The GUI provides various menu-based options to optimize the display of heatmaps, trees, and annotations. The colors, locations, and sizes of the trees and the annotations can be customized flexibly.
The scale and color scheme of the heatmaps can also be adjusted in an interactive window. The node colors can be customized for positive, negative, missing, or zero values. The color contrast between nodes can also be interactively adjusted. The overall vertical or horizontal size of each figure component can be customized, and the figure can be saved in PostScript format for high image quality.

Data filtering for the selection of major markers

Heatmaps that are based on data clustering display the overall profiles of the experimental values for the given samples. QCanvas provides a data-filtering option to selectively display the data nodes that satisfy a given threshold. In the example shown in Fig. 2C, data points with at least a 2-fold change (increase or decrease) in gene expression are selectively displayed. In many cases, a dataset includes both experimental values and statistical confidence levels. The data-filtering option in QCanvas is useful for analyzing the patterns in the experimental data that are statistically significant. One can filter the heatmap profiles using statistical confidence data that are included in a separate file. In the example shown in Fig. 2D, the gene expression data are filtered based on the p-values for the fold-change. QCanvas can import two separate files together for simultaneous data clustering and filtering. The GUI menu for data filtering allows the pattern analysis to be performed easily, without manual data processing or the use of scripting languages.
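
As a rough illustration of the two filtering modes described above, the following Python sketch masks matrix entries that fail a fold-change threshold, or whose p-value in a second matrix of the same shape exceeds a cutoff. It is not QCanvas code. The function names, the assumption that the values are log2 expression ratios, and the 0.05 p-value cutoff are all illustrative choices that do not come from the original description.

```python
import math


def filter_by_fold_change(values, min_fold=2.0):
    """Keep entries whose absolute log2 ratio reaches log2(min_fold).

    Assumption (not from QCanvas): `values` holds log2 expression ratios,
    so a 2-fold increase or decrease corresponds to |value| >= 1.
    Entries that fail the threshold, or are missing (None), become None.
    """
    cutoff = math.log2(min_fold)
    return [
        [v if v is not None and abs(v) >= cutoff else None for v in row]
        for row in values
    ]


def filter_by_p_value(values, p_values, max_p=0.05):
    """Keep entries whose matching p-value is at or below max_p.

    `p_values` stands in for the separate confidence file and must have
    the same shape as `values`. The 0.05 default is a hypothetical cutoff.
    """
    if len(values) != len(p_values):
        raise ValueError("value and p-value matrices must have the same shape")
    filtered = []
    for row, p_row in zip(values, p_values):
        if len(row) != len(p_row):
            raise ValueError("value and p-value rows must have the same length")
        filtered.append([
            v if v is not None and p is not None and p <= max_p else None
            for v, p in zip(row, p_row)
        ])
    return filtered


# Toy data: two genes (rows) measured in three samples (columns).
log_ratios = [
    [1.5, -0.2, None],
    [-1.0, 0.9, 2.3],
]
p_values = [
    [0.01, 0.50, 0.02],
    [0.20, 0.03, 0.001],
]

by_fold = filter_by_fold_change(log_ratios)    # fold-change filter only
by_p = filter_by_p_value(log_ratios, p_values)  # p-value filter only
print(by_fold)
print(by_p)
```

In this sketch, masked entries (None) stand for nodes that would not be displayed in the filtered heatmap. Applying one function to the output of the other would combine both criteria.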