This dissertation studies saliency and its applications in audio and visual signals. The saliency of a portion of a signal is the likelihood that it attracts bottom-up attention during perception. In computer vision, image saliency is described by local contrasts of features. Each local patch is divided into center and surround regions based on spatial distance to the center of the patch, and the difference in features between the two regions is used as the saliency estimate. Existing saliency frameworks use different methods to measure the center-surround difference and detect image saliency reasonably well. Although these frameworks are suitable for detecting image saliency, they may not be suitable for detecting saliency in signals containing temporal information, such as image sequences. In this dissertation, we propose a new saliency framework based on the outlierness of the center region compared to the surround region. Specifically, the surround region is divided into several subregions, and the feature distances between the center and the surround subregions are computed. The kth nearest distance is used as the outlierness to estimate the saliency of the center region. Based on this framework, we propose a novel image saliency detection algorithm and compare its performance with existing algorithms. The framework is also successfully applied to image sequences to detect foreground in dynamic scenes. Besides foreground detection, we propose two new applications of saliency detection, one on images and one on audio. First, we propose a license plate detection algorithm inspired by the observation that license plates are salient. Characters are first located using a segmentation of the intensity saliency map. Saliency-based features are then extracted from the neighborhood of the characters to detect license plates accurately. Second, we propose an algorithm that maximizes the saliency of audio spectrograms. 
The resulting audio visualization enables efficient audio-visual browsing for faster-than-real-time human acoustic event detection. Visual saliency has been used in the literature as a metric to evaluate different information visualizations. In our work, we not only formulate a new saliency-based metric for information visualization but also use this metric to automatically enhance the spectrogram.
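
The outlierness-based framework summarized above (center region, surround subregions, kth nearest feature distance) can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: the function names `kth_nearest_outlierness` and `saliency_map`, the use of Euclidean distance, the choice of block-mean intensity as the feature, and the use of the neighbouring blocks as surround subregions are all assumptions made for illustration only.

```python
import numpy as np

def kth_nearest_outlierness(center, surround, k):
    """k-th smallest Euclidean distance from the center feature vector
    to the feature vectors of the surround subregions (one per row)."""
    center = np.asarray(center, dtype=float)
    surround = np.asarray(surround, dtype=float)
    if not 1 <= k <= len(surround):
        raise ValueError("k must be between 1 and the number of surround subregions")
    dists = np.linalg.norm(surround - center, axis=1)
    # np.partition places the k-th smallest value at index k - 1.
    return float(np.partition(dists, k - 1)[k - 1])

def saliency_map(image, cell=4, k=2):
    """Toy saliency map (assumed setup): the feature of each cell x cell block
    is its mean intensity, and the surround subregions of a block are its
    neighbouring blocks (up to 8). Needs at least two blocks in the image."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape[0] // cell, image.shape[1] // cell
    # Mean intensity per block, shape (rows, cols).
    feats = (image[:rows * cell, :cols * cell]
             .reshape(rows, cell, cols, cell)
             .mean(axis=(1, 3)))
    out = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            neigh = [[feats[i, j]]
                     for i in range(max(r - 1, 0), min(r + 2, rows))
                     for j in range(max(c - 1, 0), min(c + 2, cols))
                     if (i, j) != (r, c)]
            # Border blocks have fewer neighbours, so k is capped there.
            out[r, c] = kth_nearest_outlierness([feats[r, c]], neigh,
                                                min(k, len(neigh)))
    return out

# Usage: a bright block on a dark background stands out from its surround.
img = np.zeros((12, 12))
img[4:8, 4:8] = 1.0
print(saliency_map(img, cell=4, k=2))
```

Under these assumptions, a center region scores high only when fewer than k surround subregions are close to it in feature space. A region that resembles k or more of its surround subregions scores low, which is one plausible reading of why a kth nearest distance, rather than a single center-surround difference, might suit dynamic scenes.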