Combining length restrictions and N-best techniques in multiple-pass search strategies Rodríguez Fonollosa, José Adrián
Blind beamforming for DS-CDMA systems Pérez Palomar, Daniel; Lagunas Hernandez, Miguel A. Code Division Multiple Access (CDMA) has been proposed as an efficient access method for cellular and personal communication systems, outperforming the classical FDMA and TDMA techniques. In contrast to them, however, CDMA systems are interference limited, with the interference coming mainly from the other users. A variety of methods have been proposed to combat this multi-access interference (MAI), such as the correlation detector, the maximum likelihood detector, linear detectors and subtractive interference cancellation detectors. Spatial diversity can be combined with these existing methods to reduce the interference further, and considerable effort has been devoted to this direction in recent years. In this paper, a novel blind beamforming algorithm based on the introduction of an explicit redundancy structure within the spreading codes is described.
A video object generation tool allowing friendly user interaction Marcotegui Iturmendi, Beatriz; Correia Fernandez-Pereira, Paulo; Marqués Acosta, Fernando; Mech, R.; Rosa, R; Wollborn, M; Zanoguera, Francisca In this paper we describe an interactive video object segmentation tool developed in the framework of the ACTS-AC098 MOMUSYS project. The Video Object Generator with User Environment (VOGUE) combines three different sets of automatic and semi-automatic tools (spatial segmentation, object tracking and temporal segmentation) with general purpose tools for user interaction. The result is an integrated environment allowing the user-assisted segmentation of any sort of video sequence in a friendly and efficient manner.
A proposal for dependent optimization in scalable region-based coding systems Morros, R; Marqués Acosta, Fernando We address in this paper the problem of optimal coding in the framework of region-based video coding systems, with special emphasis on content-based functionalities. We present a coding system that can provide scaled layers (using PSNR or temporal content-based scalability) such that each one has an optimal partition with optimal bit allocation among the resulting regions. This coding system is based on a dependent optimization algorithm that can provide joint optimality for a group of layers or a group of frames.
Representing and retrieving regions using binary partition trees Garrido Ostermann, Luis; Salembier Clairon, Philippe Jean; Casas Pla, Josep Ramon This paper discusses the interest of Binary Partition Trees for image and region representation in the context of indexing and similarity based retrieval. Binary Partition Trees concentrate in a compact and structured way the set of regions that compose an image. Since the tree is able to represent images in a multiresolution way, only simple descriptors need to be attached to the nodes. Moreover, this representation is used for similarity based region retrieval.
Speaker recognition using frequency filtered spectral energies Hernando Pericás, Francisco Javier The spectral parameters that result from filtering the frequency sequence of log mel-scaled filter-bank energies with a simple first or second order FIR filter have proved to be an efficient speech representation in terms of both speech recognition rate and computational load. Recently, the authors have shown that this frequency filtering can approximately equalize the cepstrum variance, enhancing the oscillations of the spectral envelope curve that are most effective for discrimination between speakers. Speaker identification results even better than those obtained with mel-cepstrum have been achieved on the TIMIT database, especially when white noise was added. In addition, the hybridization of linear prediction and filter-bank spectral analysis, using either the cepstral transformation or the alternative frequency filtering, has been explored for speaker verification. The combination of hybrid spectral analysis and frequency filtering, which had been shown to outperform the conventional techniques in clean and noisy word recognition, has yielded good text-dependent speaker verification results on the new speaker-oriented telephone-line POLYCOST database.
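The frequency filtering described in this abstract can be illustrated with a minimal Python sketch. The abstract only states that a "simple first or second order FIR filter" is applied along the frequency index; the specific second-order filter H(z) = z - z^-1, the zero padding at both band edges, and the function name frequency_filter are assumptions made here for illustration and are not taken from the paper.

```python
def frequency_filter(log_energies):
    """Filter one frame of log filter-bank energies along the band index.

    Assumed filter (not confirmed by the abstract): H(z) = z - z^-1, so the
    output for band k is S[k+1] - S[k-1], with zeros assumed outside the
    first and last bands.
    """
    n = len(log_energies)
    # Zero-pad one sample on each side so the edge bands have neighbours.
    padded = [0.0] + list(log_energies) + [0.0]
    # padded[k + 1] holds S[k], hence S[k+1] is padded[k + 2] and S[k-1] is padded[k].
    return [padded[k + 2] - padded[k] for k in range(n)]


if __name__ == "__main__":
    frame = [1.0, 2.0, 4.0]  # hypothetical log energies for three bands
    print(frequency_filter(frame))
```

The output has the same number of parameters as the input frame, and each parameter is a local difference along frequency rather than a global transform such as the cepstrum.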
A second opinion approach for speech recognition verification Hernández-Ábrego, G; Mariño Acebal, José Bernardo In order to improve the reliability of speech recognition results, a verification system that takes advantage of the information provided by an alternative recognition step is proposed. The alternative results are considered as a second opinion about the nature of the speech recognition process. Some features are extracted from both opinion sources and compiled, through a fuzzy inference system, into a more discriminant confidence measure able to verify correct results and disregard wrong ones. This approach is tested in a keyword spotting task taken from the Spanish SpeechDat database. Results show a considerable reduction of false rejections at a fixed false alarm rate compared to baseline systems.
Minimum confusibility training of context dependent demiphones Nogueiras Rodríguez, Albino; Mariño Acebal, José Bernardo In recent years two different approaches have been widely used to improve the acoustic modeling in continuous speech recognition systems: discriminative training algorithms and context dependent subword units. However, while the use of each of these techniques leads to much better results than standard maximum likelihood trained phone models, their combination, i.e. discriminative training of context dependent units, has proved to be a much more difficult task. In this paper we deal with minimum confusibility training of demiphones using the TIMIT database. By applying this approach, recently introduced by the authors, the string error rate in the recognition of TIDIGITS using demiphones is reduced by some 24% with respect to maximum likelihood training. This improvement is added to the 8% reduction already provided by demiphones with respect to minimum confusibility trained phones.
X-Type Interface for Management of Multidomain Multitechnology Networks Serrat Fernández, Juan The specification and implementation of Xcoop interfaces has received great attention in the last few years. In fact, the appropriate design of this system component is a key aspect for efficient and seamless co-operative management. In this context it is worth mentioning the EURESCOM P408 project and the standards of the European Telecommunication Standards Institute (ETSI) in Europe, and the related work of the ITU-T and the Telemanagement Forum worldwide. The Xcoop specification presented in this paper, produced as part of the results of the project MISA co-funded by the Commission of the European Union, is a step forward in the evolution of this system interface. Unlike preceding work, it allows interactions between management systems independently of the underlying network technology, whether ATM, SDH or hybrid. This is achieved by defining appropriate functionality and an information model in which the specific characteristics of ATM and SDH resources are abstracted and merged in common classes.
Fuzzy reasoning in confidence evaluation of speech recognition Hernández-Abrego, G; Mariño Acebal, José Bernardo Confidence measures represent a systematic way to express the reliability of speech recognition results. A common approach to confidence measuring is to take advantage of the information that several recognition-related features offer and to combine them, through a given compilation mechanism, into a more effective way to distinguish between correct and incorrect recognition results. We propose to use a fuzzy reasoning scheme to perform the information compilation step. Our approach differs from previously proposed ones in that it treats the uncertainty of recognition hypotheses in terms of
Automatic database acquisition software for ISDN PC cards and analogue boards Rodríguez Fonollosa, José Adrián; Moreno Bilbao, M. Asunción This paper describes an application for automatic speech database acquisition (ADA) developed by the authors in the framework of the EC Telematics Project SpeechDat II. The software is able to work with standard inexpensive PC cards for ISDN lines, as well as Dialogic boards for analogue telephone lines. Both program versions share a common file format and configuration. Other important characteristics of the recording software are its simple set-up, a fast and flexible configuration of the recording session, the real-time monitoring of calls and disk space, and its proven robustness.
Phoneme recognition with statistical modeling of the prediction of the error of neural networks Freitag, Fèlix; Monte Moreno, Enrique This paper presents a speech recognition system which incorporates predictive neural networks. The neural networks are used to predict observation vectors of speech. The prediction error vectors are modeled on the state level by Gaussian densities, which provide the local similarity measure for the Viterbi algorithm during recognition. The system is evaluated on a continuous speech phoneme recognition task. Compared with an HMM reference system, the proposed system obtained better results in the speech recognition experiments.
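As a rough illustration of the local similarity measure mentioned in this abstract, the following Python sketch scores a prediction error vector under a Gaussian density. It assumes a diagonal covariance and takes the predicted vector as given; the neural predictor itself is not modeled, and the function names and the diagonal-covariance choice are hypothetical, since the abstract does not specify them.

```python
import math


def prediction_error(observed, predicted):
    """Element-wise error between an observed vector and its prediction."""
    return [o - p for o, p in zip(observed, predicted)]


def log_gaussian_diag(error, mean, var):
    """Log density of `error` under a Gaussian with diagonal covariance.

    `mean` and `var` are per-dimension parameters of one state (assumed
    diagonal here; the abstract only says "Gaussian densities").
    """
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (e - m) ** 2 / v)
        for e, m, v in zip(error, mean, var)
    )


def local_score(observed, predicted, mean, var):
    """Score that a Viterbi search could use as a state-level similarity."""
    return log_gaussian_diag(prediction_error(observed, predicted), mean, var)


if __name__ == "__main__":
    # Hypothetical two-dimensional observation and prediction.
    print(local_score([1.0, 0.5], [0.9, 0.7], [0.0, 0.0], [1.0, 1.0]))
```

A state whose predictor tracks the observation closely produces a small error and therefore a higher log density, which is what lets the score act as a similarity measure during decoding.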
Fluid-structure interaction of a reed type valve González Acedo, Ignacio; Lehmkuhl Barba, Oriol; Naseri, Alireza; Rigola Serrano, Joaquim; Oliva Llena, Asensio This paper presents a complete numerical procedure to study the fluid-structure interaction problem of incompressible flow through reed valves, typically employed in hermetic reciprocating compressors. A partitioned semi-implicit coupling scheme is implemented, which strongly couples only the added-mass effect (pressure term) of the fluid to the structure, hence assuring numerical stability and avoiding excessive computational cost. The fluid is solved by a three-dimensional CFD solver using large eddy simulation closures to model the turbulent flow, while the reed valve is described with the classical plate theory and the normal mode summation method. To showcase the potential of the proposed methodology, a sensitivity analysis regarding valve thickness is carried out for a given velocity in the feeding channel. Considerable differences, mainly in valve lift and pressure drop, are observed between the considered configurations.
Design of single shaped reflector antennas with a single feed applying genetic algorithms and graphical processing techniques Vall-Llossera Ferran, Mercedes Magdalena; Rius Casals, Juan Manuel; García, M; Duffo Ubeda, Núria A genetic algorithm (GA) has been developed for designing single-shaped reflector antennas for the synthesis of shaped contour beams. The graphical processing technique is used in order to obtain the antenna radiation patterns very efficiently. Results comparing with the classical conjugate gradient are included to provide validation. © 2000 John Wiley & Sons, Inc. Microwave Opt Technol Lett 27: 358-361, 2000.
Spatial distribution analysis with capture effect of a mobile S-ALOHA network Covarrubias, David; Ruiz Boqué, Sílvia; Huguet, Joan; Olmos Bonafé, Juan José The throughput performance of a mobile S-ALOHA network can be improved by considering the capture phenomenon, which also depends on the spatial distribution of the mobiles within the cell. We have studied the capture probabilities that arise in a mobile radio scenario in the presence of fading and shadowing, considering both uniform and non-uniform spatial distribution models. In particular we were interested in the limit behaviour of these models, which has been shown to be directly related to the capture probability. This analysis allows a quantitative comparison of three spatial distribution models for mobile users under real mobile channels. The use of an exponential backoff retransmission algorithm is considered. With these assumptions the performance of the anarchic ALOHA is improved considerably, obtaining higher throughput values with stabilised behaviour and lower delay values.
Aneto: a tool for prosody analysis of speech Febrer, M; Febrer, A; Bonafonte Cávez, Antonio; Esquerra Llucià, Ignasi The developed tool provides utilities for prosody analysis and labeling of voice signals. It works under Windows 95 and Windows NT environments and uses the Microsoft Win32 application programming interface (API) for audio playing and recording. The application detects the prosody of the speech signal, and the original intonation can then be stylized in order to observe the pitch contour. In addition, the original intonation can be easily modified and the voice signal can be resynthesized according to the new intonation. By listening to the resynthesized signal, the user can evaluate the results of the prosodic modification. 1 Introduction This application aspires to be a helpful tool for prosody analysis and database labeling in the context of the Text-to-Speech (TTS) system that is being developed at the Universitat Politècnica de Catalunya (UPC). The UPC-TTS system is a bilingual system able to read text in Spanish and Catalan that works using concate...
Subset selection for multi-Gabor and non-orthogonal wavelets expansions Rebollo-Neira, L; Fernández Rubio, Juan Antonio; Janer, L Non-orthogonal wavelets and Gabor or multi-windows Gabor expansions involving well-localized synthesis/analysis functions are characterized by being redundant. This entails that the signal modeling is carried out through a rank deficient linear transformation and the expansion coefficients are not unique. In the finite dimensional case one solution for the coefficients (which provides the coefficients of minimum norm) is approached by the pseudo-inverse of the concomitant rank deficient transformation. In many applications this makes a great deal of sense. In other applications, however, the model-builder is not interested in a predictor that involves all the redundant factors. Instead, a predictor constructed out of the independent factors is sought. How to pick these factors is a problem of subset selection and we advance a new method for accomplishing such a goal.
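The minimum-norm coefficients mentioned in this abstract can be illustrated with a small Python sketch for a redundant expansion of a two-dimensional signal. It assumes the synthesis matrix A (whose columns are the atoms) has full row rank, in which case the pseudo-inverse equals A^T (A A^T)^-1; the restriction to two dimensions, the atoms and the function name are illustrative assumptions. The sketch shows only the pseudo-inverse baseline, not the subset selection method proposed by the authors.

```python
def min_norm_coefficients(atoms, signal):
    """Minimum-norm coefficients c with sum_i c[i] * atoms[i] == signal.

    `atoms` is a list of 2-D vectors (the columns of A) and `signal` is a
    2-D vector. Assumes A has full row rank, so c = A^T (A A^T)^-1 signal.
    """
    # G = A A^T is a symmetric 2x2 matrix.
    g00 = sum(a[0] * a[0] for a in atoms)
    g01 = sum(a[0] * a[1] for a in atoms)
    g11 = sum(a[1] * a[1] for a in atoms)
    det = g00 * g11 - g01 * g01
    if det == 0:
        raise ValueError("atoms do not span the plane (A A^T is singular)")
    # y = G^-1 signal, using the closed-form inverse of a 2x2 matrix.
    y0 = (g11 * signal[0] - g01 * signal[1]) / det
    y1 = (g00 * signal[1] - g01 * signal[0]) / det
    # c = A^T y
    return [a[0] * y0 + a[1] * y1 for a in atoms]


if __name__ == "__main__":
    atoms = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # three atoms in the plane
    coeffs = min_norm_coefficients(atoms, (1.0, 1.0))
    print(coeffs)
```

With these hypothetical atoms the signal (1, 1) could also be represented using only the first two atoms or only the third, which is the kind of non-uniqueness that motivates selecting a subset of independent factors.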
The effects of roughness on the boundary layer development of a circular cylinder Rodríguez Pérez, Ivette María; Lehmkuhl Barba, Oriol; Piomelli, Ugo; Chiva Segura, Jorge; Borrell, Ricard; Oliva Llena, Asensio This paper focuses on the effects of surface roughness in the flow past a circular cylinder at different Reynolds numbers. Large eddy simulations of the flow, from subcritical to transcritical Reynolds numbers and at a relatively high equivalent sand grain roughness of k_s/D = 0.02, are performed. In order to determine the effects of the surface roughness on the boundary layer transition and, as a consequence, on the wake topology, results are compared to data available in the literature for rough and smooth cylinders. Results show that surface roughness triggers the transition to turbulence in the boundary layer at all Reynolds numbers, thus leading to an early separation caused by the increased drag and momentum deficit. In fact, even at subcritical Reynolds numbers boundary layer instabilities are triggered in the roughness sublayer, which eventually lead to the transition to turbulence and to separation before the cylinder apex. For the transcritical Reynolds number (i.e. Re = 4.2 × 10^5), transition to turbulence is observed in the attached boundary layer. The largest changes in the flow topology are observed at Re = 4.2 × 10^5, as the wake is wider than that of the smooth cylinder at these Reynolds numbers, with larger Reynolds stresses along the boundary layer and the near wake.
Conditional maximum likelihood frequency estimation for staggered modulations Riba Sagarra, Jaume; Vázquez Grau, Gregorio The use of spectrally efficient continuous phase modulations for mobile communications may lead to a serious performance degradation of the classical frequency error detectors (FEDs) due to the presence of self-noise. This article presents a new statistically efficient frequency estimation algorithm for staggered modulations. The cancellation of the self-noise is accomplished by the use of the conditional ML principle, well known in the context of array processing, as an alternative to the unconditional ML typically applied in the communications field. The paper also provides a new Cramér-Rao bound (CRB) which is more accurate than the so-called modified CRB (MCRB) extensively applied to synchronization problems.