Gerhard Rigoll

Orcid: 0000-0003-1096-1596

Affiliations:
  • TU Munich, Germany


According to our database, Gerhard Rigoll authored at least 476 papers between 1986 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
CSANet: Cuboid-Wise Shape Augmentation 3D Object Detector for Occluded Targets.
IEEE Signal Process. Lett., 2024

Spatial-Temporal Multi-Cuts for Online Multiple-Camera Vehicle Tracking.
CoRR, 2024

Unleashing HyDRa: Hybrid Fusion, Depth Consistency and Radar for Unified 3D Perception.
CoRR, 2024

EarlyBird: Early-Fusion for Multi-View Tracking in the Bird's Eye View.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2024

Do We Still Need Non-Maximum Suppression? Accurate Confidence Estimates and Implicit Duplication Modeling with IoU-Aware Calibration.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Visual Complexity in VR: Implications for Cognitive Load.
Proceedings of the IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops, 2024

Visual Perception in VR Training: Impact of Information Transfer Methods.
Proceedings of the IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops, 2024

Multi-Modal Human-Machine Interaction: Joint Optimization of Single Modalities and Automatic Learning of Communication Channel Fusion.
Proceedings of the 19th International Joint Conference on Computer Vision, 2024

Lifting Multi-View Detection and Tracking to the Bird's Eye View.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Optimizing Medical Device Training: The Role of Multi-User VR and Expert Guidance.
Proceedings of the Companion Publication of the 2024 Conference on Computer-Supported Cooperative Work and Social Computing, 2024

2023
Wavelet regularization benefits adversarial training.
Inf. Sci., November, 2023

3-D Localization of Multiagent Systems Under Random Environments Based on Iterative Learning.
IEEE Trans. Control. Netw. Syst., September, 2023

Touching the future of training: investigating tangible interaction in virtual reality.
Frontiers Virtual Real., March, 2023

Explainable Model-Agnostic Similarity and Confidence in Face Verification.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2023

Synthehicle: Multi-Vehicle Multi-Camera Tracking in Virtual Cities.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2023

The Box Size Confidence Bias Harms Your Object Detector.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Collaborative VR: Conveying a Complex Disease and Its Treatment.
Proceedings of the IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops, 2023

Tackling Face Verification Edge Cases: In-Depth Analysis and Human-Machine Fusion Approach.
Proceedings of the 18th International Conference on Machine Vision and Applications, 2023

Enhancing VR Training: Impact of Information Transfer Methods.
Proceedings of the IEEE International Symposium on Mixed and Augmented Reality Adjunct, 2023

Introducing A Framework for Single-Human Tracking Using Event-Based Cameras.
Proceedings of the IEEE International Conference on Image Processing, 2023

Octuplet Loss: Make Face Recognition Robust to Image Resolution.
Proceedings of the 17th IEEE International Conference on Automatic Face and Gesture Recognition, 2023

2022
Susceptibility to Image Resolution in Face Recognition and Training Strategies to Enhance Robustness.
Leibniz Trans. Embed. Syst., 2022

Dissected 3D CNNs: Temporal skip connections for efficient online video processing.
Comput. Vis. Image Underst., 2022

Wavelet Regularization Benefits Adversarial Training.
CoRR, 2022

Face Morphing: Fooling a Face Recognition System Is Simple!
CoRR, 2022

Do You Notice Me? How Bystanders Affect the Cognitive Load in Virtual Reality.
Proceedings of the IEEE Conference on Virtual Reality and 3D User Interfaces, 2022

VR Training: The Unused Opportunity to Save Lives During a Pandemic.
Proceedings of the 2022 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops, 2022

Efficient Active Learning Strategies for Monocular 3D Object Detection.
Proceedings of the 2022 IEEE Intelligent Vehicles Symposium, 2022

Defuse the Training of Risky Tasks: Collaborative Training in XR.
Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, 2022

Towards a Deeper Understanding of Skeleton-based Gait Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

2021
Adversarial joint training with self-attention mechanism for robust end-to-end speech recognition.
EURASIP J. Audio Speech Music. Process., 2021

Image Resolution Susceptibility of Face Recognition Models.
CoRR, 2021

Towards Constructing HMM Structure for Speech Recognition With Deep Neural Fenonic Baseform Growing.
IEEE Access, 2021

Driver Anomaly Detection: A Dataset and Contrastive Learning Approach.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

U-Net based Zero-hour Defect Inspection of Electronic Components and Semiconductors.
Proceedings of the 16th International Joint Conference on Computer Vision, 2021

Induced Local Attention for Transformer Models in Speech Recognition.
Proceedings of the Speech and Computer - 23rd International Conference, 2021

Regularized Forward-Backward Decoder for Attention Models.
Proceedings of the Speech and Computer - 23rd International Conference, 2021

A Global Discriminant Joint Training Framework for Robust Speech Recognition.
Proceedings of the 33rd IEEE International Conference on Tools with Artificial Intelligence, 2021

Gaitgraph: Graph Convolutional Network for Skeleton-Based Gait Recognition.
Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

Attention-Based Partial Face Recognition.
Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

Face Aggregation Network For Video Face Recognition.
Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

Face Texture Generation And Identity-Preserving Rectification.
Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

Lightweight Multi-Branch Network For Person Re-Identification.
Proceedings of the 2021 IEEE International Conference on Image Processing, 2021

How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Cross-Quality LFW: A Database for Analyzing Cross-Resolution Image Face Recognition in Unconstrained Environments.
Proceedings of the 16th IEEE International Conference on Automatic Face and Gesture Recognition, 2021

A Coarse-to-Fine Dual Attention Network for Blind Face Completion.
Proceedings of the 16th IEEE International Conference on Automatic Face and Gesture Recognition, 2021

VR-based Equipment Training for Health Professionals.
Proceedings of the CHI '21: CHI Conference on Human Factors in Computing Systems, 2021

2020
Online Dynamic Hand Gesture Recognition Including Efficiency Analysis.
IEEE Trans. Biom. Behav. Identity Sci., 2020

Synchronized Forward-Backward Transformer for End-to-End Speech Recognition.
Proceedings of the Speech and Computer - 22nd International Conference, 2020

CTC-Segmentation of Large Corpora for German End-to-End Speech Recognition.
Proceedings of the Speech and Computer - 22nd International Conference, 2020

Audio Adversarial Examples for Robust Hybrid CTC/Attention Speech Recognition.
Proceedings of the Speech and Computer - 22nd International Conference, 2020

MP3 Compression to Diminish Adversarial Noise in End-to-End Speech Recognition.
Proceedings of the Speech and Computer - 22nd International Conference, 2020

Lightweight End-to-End Speech Recognition from Raw Audio Data Using Sinc-Convolutions.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Comparative Analysis of CNN-Based Spatiotemporal Reasoning in Videos.
Proceedings of the Pattern Recognition. ICPR International Workshops and Challenges, 2020

Deep Attention Based Semi-supervised 2D-Pose Estimation for Surgical Instruments.
Proceedings of the Pattern Recognition. ICPR International Workshops and Challenges, 2020

Small-Footprint Keyword Spotting on Raw Audio Data with Sinc-Convolutions.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

DriverMHG: A Multi-Modal Dataset for Dynamic Recognition of Driver Micro Hand Gestures and a Real-Time Recognition Framework.
Proceedings of the 15th IEEE International Conference on Automatic Face and Gesture Recognition, 2020

Attention Fusion for Audio-Visual Person Verification Using Multi-Scale Features.
Proceedings of the 15th IEEE International Conference on Automatic Face and Gesture Recognition, 2020

A Multi-Task Comparator Framework for Kinship Verification.
Proceedings of the 15th IEEE International Conference on Automatic Face and Gesture Recognition, 2020

2019
A dual CNN-RNN for multiple people tracking.
Neurocomputing, 2019

Person identification from partial gait cycle using fully convolutional neural networks.
Neurocomputing, 2019

You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization.
CoRR, 2019

On Flow Profile Image for Video Representation.
CoRR, 2019

A Simulation for Examining the Effects of Inaccurate Head Tracking on Drivers of Vehicles with Transparent Cockpit Projections.
Proceedings of the IEEE Conference on Virtual Reality and 3D User Interfaces, 2019

Acceptance and User Experience of Driving with a See-Through Cockpit in a Narrow-Space Overtaking Scenario.
Proceedings of the IEEE Conference on Virtual Reality and 3D User Interfaces, 2019

Deep Neural Network Quantizers Outperforming Continuous Speech Recognition Systems.
Proceedings of the Speech and Computer - 21st International Conference, 2019

Exploring Hybrid CTC/Attention End-to-End Speech Recognition with Gaussian Processes.
Proceedings of the Speech and Computer - 21st International Conference, 2019

Exploring the Use of Augmented Reality Interfaces for Driver Assistance in Short-Notice Takeovers.
Proceedings of the 2019 IEEE Intelligent Vehicles Symposium, 2019

Real-Time Driver State Monitoring Using a CNN Based Spatio-Temporal Approach.
Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference, 2019

VisMAP: Visual Mining of Attribute-Based Access Control Policies.
Proceedings of the Information Systems Security - 15th International Conference, 2019

Convolutional Neural Networks with Layer Reuse.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Outlier-Robust Neural Aggregation Network for Video Face Identification.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Gait Energy Image Restoration Using Generative Adversarial Networks.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Talking With Your Hands: Scaling Hand Gestures and Recognition With CNNs.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Resource Efficient 3D Convolutional Neural Networks.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Corneal-Reflection-Based Wide Range Gaze Tracking for a Car.
Proceedings of the Human Interface and the Management of Information. Information in Intelligent Systems, 2019

Real-time Hand Gesture Detection and Classification Using Convolutional Neural Networks.
Proceedings of the 14th IEEE International Conference on Automatic Face & Gesture Recognition, 2019

2018
A deep convolutional neural network for video sequence background subtraction.
Pattern Recognit., 2018

Catch My Drift: Elevating Situation Awareness for Highly Automated Driving with an Explanatory Windshield Display User Interface.
Multimodal Technol. Interact., 2018

Multiple People Tracking Using Hierarchical Deep Tracklet Re-identification.
CoRR, 2018

Person Identification from Partial Gait Cycle Using Fully Convolutional Neural Network.
CoRR, 2018

Supporting Driver Situation Awareness for Autonomous Urban Driving with an Augmented-Reality Windshield Display.
Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, 2018

An Explanatory Windshield Display Interface with Augmented Reality Elements for Urban Autonomous Driving.
Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, 2018

Analysis on Temporal Dimension of Inputs for 3D Convolutional Neural Networks.
Proceedings of the IEEE International Conference on Image Processing, 2018

Occlusion Handling in Tracking Multiple People Using RNN.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Gait Recognition from Incomplete Gait Cycle.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Gait Energy Image Reconstruction from Degraded Gait Cycle Using Deep Learning.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

Robust Facial Landmark Detection via a Fully-Convolutional Local-Global Context Network.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Motion Fused Frames: Data Level Fusion Strategy for Hand Gesture Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018

2017
Combined segmentation, reconstruction, and tracking of multiple targets in multi-view video sequences.
Comput. Vis. Image Underst., 2017

A Deep Convolutional Neural Network for Background Subtraction.
CoRR, 2017

A diminished reality simulation for driver-car interaction with transparent cockpits.
Proceedings of the 2017 IEEE Virtual Reality, 2017

Multi-view human activity recognition using motion frequency.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

Joint tracking and gait recognition of multiple people in video.
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017

View-Invariant Gait Representation Using Joint Bayesian Regularized Non-negative Matrix Factorization.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

Improving Facial Landmark Detection via a Super-Resolution Inception Network.
Proceedings of the Pattern Recognition - 39th German Conference, 2017

GazeEverywhere: Enabling Gaze-only User Interaction on an Unmodified Desktop PC in Everyday Scenarios.
Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 2017

Examining the Impact of See-Through Cockpits on Driving Performance in a Mixed Reality Prototype.
Adjunct Proceedings of the 9th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, 2017

2016
Immersive Interactive SAR Image Representation Using Non-negative Matrix Factorization.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2016

Toward semantic attributes in dictionary learning and non-negative matrix factorization.
Pattern Recognit. Lett., 2016

Immersive visualization of visual data using nonnegative matrix factorization.
Neurocomputing, 2016

Discriminative Nonnegative Matrix Factorization for dimensionality reduction.
Neurocomputing, 2016

Blending Entropy: A Term for Addressing Information Density in Mediated Reality.
CoRR, 2016

Mono camera multi-view diminished reality.
Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision, 2016

Capturing facial videos with Kinect 2.0: A multithreaded open source tool and database.
Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision, 2016

Exploring floating stereoscopic driver-car interfaces with wide field-of-view in a mixed reality simulation.
Proceedings of the 22nd ACM Conference on Virtual Reality Software and Technology, 2016

RayOnPlane: A translation technique minimizing gesture size.
Proceedings of the 2016 IEEE Virtual Reality, 2016

Comparison of mobile touch interfaces for object identification and troubleshooting tasks in augmented reality.
Proceedings of the 2016 IEEE Virtual Reality, 2016

Multi-view gait recognition using 3D convolutional neural networks.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Wavelet contrast-based image inpainting with sparsity-driven initialization.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Relative attribute guided dictionary learning.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Sensor Fusion for Sparse SLAM with Descriptor Pooling.
Proceedings of the Computer Vision - ECCV 2016 Workshops, 2016

Pixel Level Tracking of Multiple Targets in Crowded Environments.
Proceedings of the Computer Vision - ECCV 2016 Workshops, 2016

Biomechanics of Thumb Touch Gestures on Handheld Devices.
Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2016

SPOCK: A Smooth Pursuit Oculomotor Control Kit.
Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2016

2015
Visualization-Based Active Learning for the Annotation of SAR Images.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2015

Modelling, synthesis and characterisation of occlusion in videos.
IET Comput. Vis., 2015

Multimodal Human-Robot Interaction from the Perspective of a Speech Scientist.
Proceedings of the Speech and Computer - 17th International Conference, 2015

Interactive feature learning from SAR image patches.
Proceedings of the 2015 IEEE International Geoscience and Remote Sensing Symposium, 2015

Subjective and objective evaluation of image inpainting quality.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Attribute constrained subspace learning.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Photorealistic Face Transfer in 2D and 3D Video.
Proceedings of the Pattern Recognition - 37th German Conference, 2015

Off-the-shelf sensor integration for mono-SLAM on smart devices.
Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015

Cross-corpus acoustic emotion recognition: Variances and strategies (Extended abstract).
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

Impact of annotation dimensionality under variable task complexity in remote guidance.
Proceedings of the 2015 IEEE Symposium on 3D User Interfaces, 2015

2014
Memory-Enhanced Neural Networks and NMF for Robust ASR.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

The TUM Gait from Audio, Image and Depth (GAID) database: Multimodal recognition of subjects and traits.
J. Vis. Commun. Image Represent., 2014

Feature enhancement by deep LSTM networks for ASR in reverberant multisource environments.
Comput. Speech Lang., 2014

A Broadcast News Corpus for Evaluation and Tuning of German LVCSR Systems.
CoRR, 2014

Supporting remote guidance through 3D annotations.
Proceedings of the 2nd ACM Symposium on Spatial User Interaction, 2014

Simulator for developing gaze sensitive environment using corneal reflection-based remote gaze tracker.
Proceedings of the 2nd ACM Symposium on Spatial User Interaction, 2014

On the Influence of Alcohol Intoxication on Speaker Recognition.
Proceedings of the AES International Conference on Semantic Audio 2014, 2014

Touch gestures for improved 3D object manipulation in mobile augmented reality.
Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, 2014

Creating automatically aligned consensus realities for AR videoconferencing.
Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, 2014

Robust speech recognition using long short-term memory recurrent neural networks for hybrid acoustic modelling.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Investigating NMF speech enhancement for neural network based acoustic models.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Impact of Coordinate Systems on 3D Manipulations in Mobile Augmented Reality.
Proceedings of the 16th International Conference on Multimodal Interaction, 2014

Acoustic Gait-based Person Identification using Hidden Markov Models.
Proceedings of the 2014 Workshop on Mapping Personality Traits Challenge and Workshop, 2014

PID-based regulation of background dynamics for foreground segmentation.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Farness preserving Non-negative matrix factorization.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Augmented Reality Evaluation: A Concept Utilizing Virtual Reality.
Proceedings of the Virtual, Augmented and Mixed Reality. Designing and Developing Virtual and Augmented Environments, 2014

Evaluation of Industrial Touch Interfaces Using a Modular Software Architecture.
Proceedings of the Human-Computer Interaction. Theories, Methods, and Tools, 2014

Don't Walk into Walls: Creating and Visualizing Consensus Realities for Next Generation Videoconferencing.
Proceedings of the Virtual, Augmented and Mixed Reality. Designing and Developing Virtual and Augmented Environments, 2014

Comparing the information extracted by feature descriptors from EO images using Huffman coding.
Proceedings of the 12th International Workshop on Content-Based Multimedia Indexing, 2014

Locally Linear Salient Coding for image classification.
Proceedings of the 12th International Workshop on Content-Based Multimedia Indexing, 2014

2013
Keyword spotting exploiting Long Short-Term Memory.
Speech Commun., 2013

LSTM-Modeling of continuous emotions in an audiovisual affect recognition framework.
Image Vis. Comput., 2013

Towards using covariance matrix pyramids as salient point descriptors in 3D point clouds.
Neurocomputing, 2013

Using Segmented 3D Point Clouds for Accurate Likelihood Approximation in Human Pose Tracking.
Int. J. Comput. Vis., 2013

Noise robust ASR in reverberated multisource environments applying convolutive NMF and Long Short-Term Memory.
Comput. Speech Lang., 2013

Multimodale Interaktion auf einer sozialen Roboterplattform.
Autom., 2013

Large-scale audio feature extraction and SVM for acoustic scene classification.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2013

Programming concept for an industrial HRI packaging cell.
Proceedings of the IEEE International Symposium on Robot and Human Interactive Communication, 2013

Classification of images in fog and fog-free scenes for use in vehicles.
Proceedings of the 2013 IEEE Intelligent Vehicles Symposium (IV), 2013

A multi lane Car Following Model for cooperative ADAS.
Proceedings of the 16th International IEEE Conference on Intelligent Transportation Systems, 2013

Detecting overlapping speech with long short-term memory recurrent neural networks.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Using linguistic information to detect overlapping speech.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Exploiting gradient histograms for gait-based person identification.
Proceedings of the IEEE International Conference on Image Processing, 2013

Feature enhancement by bidirectional LSTM networks for conversational speech recognition in highly non-stationary noise.
Proceedings of the IEEE International Conference on Acoustics, 2013

Probabilistic ASR feature extraction applying context-sensitive connectionist temporal classification networks.
Proceedings of the IEEE International Conference on Acoustics, 2013

Gait-based person identification by spectral, cepstral and energy-related audio features.
Proceedings of the IEEE International Conference on Acoustics, 2013

iProgram: intuitive programming of an industrial HRI cell.
Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction, 2013

Hypergraphs for Joint Multi-view Reconstruction and Multi-object Tracking.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Assessment of dimensionality reduction based on communication channel model; application to immersive information visualization.
Proceedings of the 2013 IEEE International Conference on Big Data (IEEE BigData 2013), 2013

Immersive Interactive Information Mining with Application to Earth Observation Data Retrieval.
Proceedings of the Availability, Reliability, and Security in Information Systems and HCI, 2013

2012
Depth Inpainting with Tensor Voting using Local Geometry.
Proceedings of the VISAPP 2012, 2012

Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature Sets.
Working Notes Proceedings of the MediaEval 2012 Workshop, 2012

Image based fog detection in vehicles.
Proceedings of the 2012 IEEE Intelligent Vehicles Symposium, 2012

Interface design for an inexpensive hands-free collaborative videoconferencing system.
Proceedings of the 11th IEEE International Symposium on Mixed and Augmented Reality, 2012

Temporal and Situational Context Modeling for Improved Dominance Recognition in Meetings.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Convolutive Non-Negative Sparse Coding and New Features for Speech Overlap Handling in Speaker Diarization.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Improving generalisation and robustness of acoustic affect recognition.
Proceedings of the International Conference on Multimodal Interaction, 2012

Improved Gait Recognition using Gradient Histogram Energy Image.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

Combined face and gait recognition using alpha matte preprocessing.
Proceedings of the 5th IAPR International Conference on Biometrics, 2012

Non-negative matrix factorization for highly noise-robust ASR: To enhance or to recognize?
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Speech overlap detection and attribution using convolutive non-negative sparse coding.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Speech overlap detection using convolutive non-negative sparse coding: New improvements and insights.
Proceedings of the 20th European Signal Processing Conference, 2012

Background segmentation with feedback: The Pixel-Based Adaptive Segmenter.
Proceedings of the 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012

2.5D gait biometrics using the Depth Gradient Histogram Energy Image.
Proceedings of the IEEE Fifth International Conference on Biometrics: Theory, 2012

Fully Automatic Audiovisual Emotion Recognition: Voice, Words, and the Face.
Proceedings of the 10th ITG Conference on Speech Communication, 2012

2011
Occlusion detection and gait silhouette reconstruction from degraded scenes.
Signal Image Video Process., 2011

Dense point-to-point correspondences between 3D faces using parametric remeshing for constructing 3D Morphable Models.
Proceedings of the IEEE Workshop on Applications of Computer Vision (WACV 2011), 2011

Identification and Reconstruction of Complete Gait Cycles for Person Identification in Crowded Scenes.
Proceedings of the VISAPP 2011, 2011

Event Detection in a Smart Home Environment using Viterbi Filtering and Graph Cuts in a 3D Voxel Occupancy Grid.
Proceedings of the VISAPP 2011, 2011

Gaze-based interaction on multiple displays in an automotive environment.
Proceedings of the IEEE International Conference on Systems, 2011

A large-scale LED array to support anticipatory driving.
Proceedings of the IEEE International Conference on Systems, 2011

Semantic Speech Tagging: Towards Combined Analysis of Speaker Traits.
Proceedings of the AES International Conference Semantic Audio 2011, 2011

Classification and Quantification of Occlusion Using Hidden Markov Model.
Proceedings of the Pattern Recognition and Machine Intelligence, 2011

Feature Frame Stacking in RNN-Based Tandem ASR Systems - Learned vs. Predefined Context.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Using Multiple Databases for Training in Emotion Recognition: To Unite or to Vote?
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Learning New Acoustic Events in an HMM-Based System Using MAP Adaptation.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Dense point-to-point correspondences between 3D faces with large variations for constructing 3D Morphable Models.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

A multi-stream ASR framework for BLSTM modeling of conversational speech.
Proceedings of the IEEE International Conference on Acoustics, 2011

Localization of non-linguistic events in spontaneous speech by Non-Negative Matrix Factorization and Long Short-Term Memory.
Proceedings of the IEEE International Conference on Acoustics, 2011

Impact and Modeling of Driver Behavior Due to Cooperative Assistance Systems.
Proceedings of the Digital Human Modeling, 2011

Late fusion for person detection in camera networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2011

A novel bottleneck-BLSTM front-end for feature-level context modeling in conversational speech recognition.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010
Cross-Corpus Acoustic Emotion Recognition: Variances and Strategies.
IEEE Trans. Affect. Comput., 2010

Combining Long Short-Term Memory and Dynamic Bayesian Networks for Incremental Emotion-Sensitive Artificial Listening.
IEEE J. Sel. Top. Signal Process., 2010

Determination of Nonprototypical Valence and Arousal in Popular Music: Features and Performances.
EURASIP J. Audio Speech Music. Process., 2010

Bidirectional LSTM Networks for Context-Sensitive Keyword Detection in a Cognitive Virtual Agent Framework.
Cogn. Comput., 2010

Multimodal Interaction: Methods and Applications for Joint Cooperation between Humans and Cognitive Systems.
Proceedings of the Neural Nets WIRN10, 2010

Tracking of Facial Feature Points by Combining Singular Tracking Results with a 3D Active Shape Model.
Proceedings of the Fifth International Conference on Computer Vision Theory and Applications (VISAPP 2010), Angers, France, May 17-21, 2010

3D gesture recognition applying long short-term memory and contextual knowledge in a CAVE.
Proceedings of the 1st ACM international workshop on Multimodal pervasive video analysis, 2010

Vocalist Gender Recognition in Recorded Popular Music.
Proceedings of the 11th International Society for Music Information Retrieval Conference, 2010

Recognition of spontaneous conversational speech using long short-term memory phoneme predictions.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

GMM-UBM based open-set online speaker diarization.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Registration of 3D facial surfaces using covariance matrix pyramids.
Proceedings of the IEEE International Conference on Robotics and Automation, 2010

Graphical Models for real-time capable gesture recognition.
Proceedings of the International Conference on Image Processing, 2010

Tracking using Bayesian inference with a two-layer Graphical Model.
Proceedings of the International Conference on Image Processing, 2010

Cue-independent extending inverse kinematics for robust pose estimation in 3D point clouds.
Proceedings of the International Conference on Image Processing, 2010

Robust tracking of facial feature points with 3D Active Shape Models.
Proceedings of the International Conference on Image Processing, 2010

Selecting Features Using the SFS in Conjunction with Vector Quantization.
Proceedings of the International Conference on Frontiers in Handwriting Recognition, 2010

Optimizing the Number of States for HMM-Based On-line Handwritten Whiteboard Recognition.
Proceedings of the International Conference on Frontiers in Handwriting Recognition, 2010

Spoken term detection with Connectionist Temporal Classification: A novel hybrid CTC-DBN decoder.
Proceedings of the IEEE International Conference on Acoustics, 2010

Non-negative matrix factorization as noise-robust feature extractor for speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010

Depth gradient based segmentation of overlapping foreground objects in range images.
Proceedings of the 13th Conference on Information Fusion, 2010

A Graphical Model for unifying tracking and classification within a multimodal Human-Robot Interaction scenario.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2010

Automated pose estimation in 3D point clouds applying annealing particle filters and inverse kinematics on a GPU.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2010

Dense spatio-temporal motion segmentation for tracking multiple self-occluding people.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2010

Multiple Parallel Vision-Based Recognition in a Real-Time Framework for Human-Robot-Interaction Scenarios.
Proceedings of the ACHI 2010, 2010

2009
Novel script line identification method for script normalization and feature extraction in on-line handwritten whiteboard note recognition.
Pattern Recognit., 2009

Being bored? Recognising natural interest by extensive audiovisual integration for real-life application.
Image Vis. Comput., 2009

A multidimensional dynamic time warping algorithm for efficient multimodal fusion of asynchronous data streams.
Neurocomputing, 2009

Recognition of Noisy Speech: A Comparative Survey of Robust Model Architecture and Feature Enhancement.
EURASIP J. Audio Speech Music. Process., 2009

Non-rigid registration of 3D facial surfaces with robust outlier detection.
Proceedings of the IEEE Workshop on Applications of Computer Vision (WACV 2009), 2009

Applying Bayes Markov chains for the detection of ATM related scenarios.
Proceedings of the IEEE Workshop on Applications of Computer Vision (WACV 2009), 2009

Using Liquid Lenses to Extend the Operating Range of a Remote Gaze Tracking System.
Proceedings of the IEEE International Conference on Systems, 2009

Improving Keyword Spotting with a Tandem BLSTM-DBN Architecture.
Proceedings of the Advances in Nonlinear Speech Processing, 2009

Using graphical models for mixed-initiative dialog management systems with realtime Policies.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Recognising interest in conversational speech - comparing bag of frames and supra-segmental features.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Multimodal data communication for human-robot interactions.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Audio chord labeling by musiological modeling and beat-synchronization.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Novel VQ with constraints on the quantization error distribution.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Boosting multi-modal camera selection with semantic features.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Learning weighted similarity measurements for unconstrained face recognition.
Proceedings of the International Conference on Image Processing, 2009

Graphical models for multi-modal automatic video editing in meetings.
Proceedings of the 16th International Conference on Digital Signal Processing, 2009

Resolving partial occlusions in crowded environments utilizing range data and video cameras.
Proceedings of the 16th International Conference on Digital Signal Processing, 2009

A hierarchical approach for visual suspicious behavior detection in aircrafts.
Proceedings of the 16th International Conference on Digital Signal Processing, 2009

"The Godfather" vs. "Chaos": Comparing Linguistic Analysis Based on On-line Knowledge Sources and Bags-of-N-Grams for Movie Review Valence Estimation.
Proceedings of the 10th International Conference on Document Analysis and Recognition, 2009

Selecting Features in On-Line Handwritten Whiteboard Note Recognition: SFS or SFFS?
Proceedings of the 10th International Conference on Document Analysis and Recognition, 2009

GMs in On-Line Handwritten Whiteboard Note Recognition: The Influence of Implementation and Modeling.
Proceedings of the 10th International Conference on Document Analysis and Recognition, 2009

Robust discriminative keyword spotting for emotionally colored spontaneous speech using bidirectional LSTM networks.
Proceedings of the IEEE International Conference on Acoustics, 2009

Voronoi cell shaping for feature selection with discrete HMMs.
Proceedings of the IEEE International Conference on Acoustics, 2009

Graphical Models: Statistical inference vs. determination.
Proceedings of the IEEE International Conference on Acoustics, 2009

Multi-modal activity and dominance detection in smart meeting rooms.
Proceedings of the IEEE International Conference on Acoustics, 2009

Statistics-Based Cognitive Human-Robot Interfaces for Board Games - Let's Play!
Proceedings of the Human Interface and the Management of Information. Information and Interaction, 2009

Using Graphical Models for an Intelligent Mixed-Initiative Dialog Management System.
Proceedings of the Human Interface and the Management of Information. Information and Interaction, 2009

cfHMI: A Novel Contact-Free Human-Machine Interface.
Proceedings of the Human-Computer Interaction. Novel Interaction Methods and Techniques, 2009

Guiding a Driver's Visual Attention Using Graphical and Auditory Animations.
Proceedings of the Engineering Psychology and Cognitive Ergonomics, 2009

Agent-Based Driver Abnormality Estimation.
Proceedings of the Human-Computer Interaction. Ambient, 2009

Using 3D Touch Interaction for a Multimodal Zoomable User Interface.
Proceedings of the Human Interface and the Management of Information. Designing Information Environments, 2009

Did I Get It Right: Head Gestures Analysis for Human-Machine Interactions.
Proceedings of the Human-Computer Interaction. Novel Interaction Methods and Techniques, 2009

A Multimodal Human-Robot-Interaction Scenario: Working Together with an Industrial Robot.
Proceedings of the Human-Computer Interaction. Novel Interaction Methods and Techniques, 2009

Robust vocabulary independent keyword spotting with graphical models.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

Acoustic emotion recognition: A benchmark comparison of performances.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008
Tango or Waltz?: Putting Ballroom Dance Style into Tempo Detection.
EURASIP J. Audio Speech Music. Process., 2008

Low-Level Fusion of Audio, Video Feature for Multi-Modal Emotion Recognition.
Proceedings of the VISAPP 2008: Proceedings of the Third International Conference on Computer Vision Theory and Applications, Funchal, Madeira, Portugal, January 22-25, 2008, 2008

Translation and rotation of virtual objects in Augmented Reality: A comparison of interaction devices.
Proceedings of the IEEE International Conference on Systems, 2008

How infrared tracking increases the realism of multi-person videoconferencing in collaborative Augmented Reality.
Proceedings of the IEEE International Conference on Systems, 2008

Emotion sensitive speech control for human-robot interaction in minimal invasive surgery.
Proceedings of the 17th IEEE International Symposium on Robot and Human Interactive Communication, 2008

On the Influence of Phonetic Content Variation for Acoustic Emotion Recognition.
Proceedings of the Perception in Multimodal Dialogue Systems, 2008

Static and Dynamic Modelling for the Recognition of Non-verbal Vocalisations in Conversational Speech.
Proceedings of the Perception in Multimodal Dialogue Systems, 2008

Balancing spoken content adaptation and unit length in the recognition of emotion and interest.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Combining statistical and syntactical systems for spoken language understanding with graphical models.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Prosodic and spectral features within segment-based acoustic modeling.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Speech recognition in noisy environments using a switching linear dynamic model for feature enhancement.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Detection of security related affect and behaviour in passenger transport.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Edge-preserving unscented Kalman filter for speckle reduction.
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

Neural net vector quantizers for discrete HMM-based on-line handwritten whiteboard-note recognition.
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

Combining speech recognition and acoustic word emotion models for robust text-independent emotion recognition.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

A multi-step alignment scheme for face recognition in range images.
Proceedings of the International Conference on Image Processing, 2008

Omnidirectional tracking and recognition of persons in planar views.
Proceedings of the International Conference on Image Processing, 2008

Applying multi layer homography for multi camera person tracking.
Proceedings of the 2008 Second ACM/IEEE International Conference on Distributed Smart Cameras, 2008

Brute-forcing hierarchical functionals for paralinguistics: A waste of feature space?
Proceedings of the IEEE International Conference on Acoustics, 2008

Omni-directional multiperson tracking in meeting scenarios combining simulated annealing and particle filtering.
Proceedings of the 8th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2008), 2008

Contact-analog information representation in an automotive head-up display.
Proceedings of the Eye Tracking Research & Application Symposium, 2008

Switching Linear Dynamic Models for Noise Robust In-Car Speech Recognition.
Proceedings of the Pattern Recognition, 2008

Novel VQ Designs for Discrete HMM On-Line Handwritten Whiteboard Note Recognition.
Proceedings of the Pattern Recognition, 2008

Resolution Enhancement of PMD Range Maps.
Proceedings of the Pattern Recognition, 2008

In-car interaction using search-based user interfaces.
Proceedings of the 2008 Conference on Human Factors in Computing Systems, 2008

Music Thumbnailing Incorporating Harmony- and Rhythm Structure.
Proceedings of the Adaptive Multimedia Retrieval. Identifying, 2008

A novel sensor system for 3D face scanning based on infrared coded light.
Proceedings of the Conference on Three-Dimensional Image Capture and Applications 2008, 2008

2007
Context-aware kitchen utilities.
Proceedings of the 1st International Conference on Tangible and Embedded Interaction 2007, 2007

H-MMI - Interaktionskonzept für variable Daten und Funktionen.
Proceedings of the Mensch & Computer 2007 Workshopband, 2007

3D Face Scanning Systems Based on Invisible Infrared Coded Light.
Proceedings of the Advances in Visual Computing, Third International Symposium, 2007

Combining frame and turn-level information for robust recognition of emotions within speech.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Audiovisual recognition of spontaneous interest within conversations.
Proceedings of the 9th International Conference on Multimodal Interfaces, 2007

Surveillance and Activity Recognition with Depth Information.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Adaptive Human-Machine Interfaces in Cognitive Production Environments.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Hidden Conditional Random Fields for Meeting Segmentation.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Wearable Assistance for the Ballroom-Dance Hobbyist - Holistic Rhythm Analysis and Dance-Style Classification.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

A Framework for Modular Signal Processing Systems with High-Performance Requirements.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Suspicious Behavior Detection in Public Transport by Fusion of Low-Level Video Descriptors.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Automatic Multi-Modal Meeting Camera Selection for Video-Conferences and Meeting Browsers.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Eye Gaze Studies Comparing Head-Up and Head-Down Displays in Vehicles.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Improved Image Segmentation using Photonic Mixer Devices.
Proceedings of the International Conference on Image Processing, 2007

Robust Multi-Modal Group Action Recognition in Meetings from Disturbed Videos with the Asynchronous Hidden Markov Model.
Proceedings of the International Conference on Image Processing, 2007

Fast and Robust Meter and Tempo Recognition for the Automatic Discrimination of Ballroom Dance Styles.
Proceedings of the IEEE International Conference on Acoustics, 2007

Audiovisual Behavior Modeling by Combined Feature Spaces.
Proceedings of the IEEE International Conference on Acoustics, 2007

Static and Dynamic Hand-Gesture Recognition for Augmented Reality Applications.
Proceedings of the Human-Computer Interaction. HCI Intelligent Multimodal Interaction Environments, 2007

A Multifunctional VR-Simulator Platform for the Evaluation of Automotive User Interfaces.
Proceedings of the Human-Computer Interaction. HCI Applications and Services, 2007

Context-Aware Information Agents for the Automotive Domain Using Bayesian Networks.
Proceedings of the Human Interface and the Management of Information. Methods, 2007

Comparing one and two-stage acoustic modeling in the recognition of emotion in speech.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Frame vs. Turn-Level: Emotion Recognition from Speech Considering Static and Dynamic Processing.
Proceedings of the Affective Computing and Intelligent Interaction, 2007

On the Necessity and Feasibility of Detecting a Driver's Emotional State While Driving.
Proceedings of the Affective Computing and Intelligent Interaction, 2007

2006
Hybrid NN/HMM acoustic modeling techniques for distributed speech recognition.
Speech Commun., 2006

Multi-person Tracking in Meetings: A Comparative Study.
Proceedings of the Machine Learning for Multimodal Interaction, 2006

Using Audio, Visual, and Lexical Features in a Multi-modal Virtual Meeting Director.
Proceedings of the Machine Learning for Multimodal Interaction, 2006

Multimodal Face Detection, Head Orientation and Eye Gaze Tracking.
Proceedings of the 2006 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems, 2006

Timing levels in segment-based speech emotion recognition.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Recognition of interest in human conversational speech.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Efficient Recognition of Authentic Dynamic Facial Expressions on the FEEDTUM Database.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Musical Signal Type Discrimination based on Large Open Feature Sets.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Evolutionary Feature Generation in Speech Emotion Recognition.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Segmentation and Recognition of Meeting Events using a Two-Layered HMM and a Combined MLP-HMM Approach.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

A Two-Layer Graphical Model for Combined Video Shot and Scene Boundary Detection.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

A Hierarchical ASM/AAM Approach in a Stochastic Framework for Fully Automatic Tracking and Recognition.
Proceedings of the International Conference on Image Processing, 2006

Submotions for Hidden Markov Model Based Dynamic Facial Action Recognition.
Proceedings of the International Conference on Image Processing, 2006

A Combined LSTM-RNN - HMM - Approach for Meeting Event Segmentation and Recognition.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Reduced Complexity and Scaling for Asynchronous HMMs in a Bimodal Input Fusion Application.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2D Multi-person Tracking: A Comparative Study in AMI Meetings.
Proceedings of the Multimodal Technologies for Perception of Humans, 2006

A new approach of a context-adaptive search agent for automotive environments.
Proceedings of the Extended Abstracts Proceedings of the 2006 Conference on Human Factors in Computing Systems, 2006

2005
Multimodal Integration for Meeting Group Action Segmentation and Recognition.
Proceedings of the Machine Learning for Multimodal Interaction, 2005

Multi-task learning strategies for a recurrent neural net in a hybrid tied-posteriors acoustic model.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Speaker independent emotion recognition by early fusion of acoustic and linguistic features within ensembles.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Feature Selection and Stacking for Robust Discrimination of Speech, Monophonic Singing, and Polyphonic Music.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

Speaker Independent Speech Emotion Recognition by Ensemble Classification.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

A neural-field-like approach for modeling human group actions in meetings.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

Video Based Online Behavior Detection Using Probabilistic Multi Stream Fusion.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

A Multi-Modal Mixed-State Dynamic Bayesian Network for Robust Meeting Event Recognition from Disturbed Data.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

A multi-modal graphical model for robust recognition of group actions in meetings from disturbed videos.
Proceedings of the 2005 International Conference on Image Processing, 2005

Two-Stage Speaker Adaptation of Hybrid Tied-Posterior Acoustic Models.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Meta-Classifiers in Acoustic and Linguistic Feature Fusion-Based Affect Recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Multimodal Meeting Analysis by Segmentation and Classification of Meeting Events based on a Higher Level Semantic Approach.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Bimodal fusion of emotional data in an automotive environment.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Robust tracking of persons in real-world scenarios using a statistical computer vision approach.
Image Vis. Comput., 2004

Handwritten Address Recognition Using Hidden Markov Models.
Proceedings of the Reading and Learning, Adaptive Content Recognition, 2004

A hybrid SVM/HMM acoustic modeling approach to automatic speech recognition.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Face Tracking in Meeting Room Scenarios Using Omnidirectional Views.
Proceedings of the 17th International Conference on Pattern Recognition, 2004

Segmentation and Classification of Meeting Events using Multiple Classifier Fusion and Dynamic Programming.
Proceedings of the 17th International Conference on Pattern Recognition, 2004

Spoken Document Classification with SVMs Using Linguistic Unit Weighting and Probabilistic Couplers.
Proceedings of the 17th International Conference on Pattern Recognition, 2004

Discrimination of speech and monophonic singing in continuous audio streams applying multi-layer support vector machines.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Emotion recognition in the manual interaction with graphical user interfaces.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Multimodal music retrieval for large databases.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Applying Bayesian belief networks in approximate string matching for robust keyword-based retrieval.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Recognition of partly occluded person actions in meeting scenarios.
Proceedings of the 2004 International Conference on Image Processing, 2004

Action segmentation and recognition in meeting room scenarios.
Proceedings of the 2004 International Conference on Image Processing, 2004

Reconstruction-free matching for fingerprint sweep sensors.
Proceedings of the 2004 International Conference on Image Processing, 2004

SVC2004: First International Signature Verification Competition.
Proceedings of the Biometric Authentication, First International Conference, 2004

Speech emotion recognition combining acoustic features and linguistic information in a hybrid support vector machine-belief network architecture.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Synthesis and Recognition of Face Profiles.
Proceedings of the 8th International Fall Workshop on Vision, Modeling, and Visualization, 2003

Comparing an Innovative 3D and a Standard 2D User Interface for Automotive Infotainment Applications.
Proceedings of the Mensch & Computer 2003: Interaktion in Bewegung, 2003

Distributed speech recognition on the WSJ task.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

A real-time system for hand gesture controlled operation of in-car devices.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

A hybrid music retrieval system using belief networks to integrate multimodal queries and contextual knowledge.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

HMM-based music retrieval using stereophonic feature information and framelength adaptation.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

A flexible multimodal object tracking system.
Proceedings of the 2003 International Conference on Image Processing, 2003

Confidence Measures for an Address Reading System.
Proceedings of the 7th International Conference on Document Analysis and Recognition (ICDAR 2003), 2003

Flexible feature extraction and HMM design for a hybrid distributed speech recognition system in noisy environments.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Hidden Markov model-based speech emotion recognition.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Gesture Components for Natural Interaction with In-Car Devices.
Proceedings of the Gesture-Based Communication in Human-Computer Interaction, 2003

Robust Video-Based Recognition of Dynamic Head Gestures in Various Domains - Comparing a Rule-Based and a Stochastic Approach.
Proceedings of the Gesture-Based Communication in Human-Computer Interaction, 2003

Evaluating Multimodal Interaction Patterns in Various Application Scenarios.
Proceedings of the Gesture-Based Communication in Human-Computer Interaction, 2003

2002
Combining HMM-Based Two-Pass Classifiers for Off-Line Word Recognition.
Proceedings of the 16th International Conference on Pattern Recognition, 2002

Facial Expression Recognition Using Pseudo 3-D Hidden Markov Models.
Proceedings of the 16th International Conference on Pattern Recognition, 2002

Comparing Normalization and Adaptation Techniques for On-Line Handwriting Recognition.
Proceedings of the 16th International Conference on Pattern Recognition, 2002

Automatic topic identification in multimedia broadcast data.
Proceedings of the 2002 IEEE International Conference on Multimedia and Expo, 2002

Multimodal emotion recognition in audiovisual communication.
Proceedings of the 2002 IEEE International Conference on Multimedia and Expo, 2002

Combination of multiple classifiers for handwritten word recognition.
Proceedings of the Eighth International Workshop on Frontiers in Handwriting Recognition, 2002

Handwritten address recognition with open vocabulary using character n-grams.
Proceedings of the Eighth International Workshop on Frontiers in Handwriting Recognition, 2002

Evaluation of Confidence Measures for On-Line Handwriting Recognition.
Proceedings of the Pattern Recognition, 2002

2001
A continuous density interpretation of discrete HMM systems and MMI-neural networks.
IEEE Trans. Speech Audio Process., 2001

An Integrated Approach to Shape and Color-Based Image Retrieval of Rotated Objects Using Hidden Markov Models.
Int. J. Pattern Recognit. Artif. Intell., 2001

Scaled likelihood linear regression for hidden Markov model adaptation.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Distributed speech recognition using traditional and hybrid modeling techniques.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

A novel hybrid face profile recognition system using the FERET and MUGSHOT databases.
Proceedings of the 2001 International Conference on Image Processing, 2001

A comparison of discrete and continuous output modeling techniques for a pseudo-2D hidden Markov model face recognition system.
Proceedings of the 2001 International Conference on Image Processing, 2001

Retrieval of overlapping and touching objects using hidden Markov models.
Proceedings of the 2001 International Conference on Image Processing, 2001

Improved person tracking using a combined pseudo-2D-HMM and Kalman filter approach with automatic background state adaptation.
Proceedings of the 2001 International Conference on Image Processing, 2001

Multi-Branch and Two-Pass HMM Modeling Approaches for Off-Line Cursive Handwriting Recognition.
Proceedings of the 6th International Conference on Document Analysis and Recognition (ICDAR 2001), 2001

Adaptation of an Address Reading System to Local Mail Streams.
Proceedings of the 6th International Conference on Document Analysis and Recognition (ICDAR 2001), 2001

A Comparison of Character N-Grams and Dictionaries Used for Script Recognition.
Proceedings of the 6th International Conference on Document Analysis and Recognition (ICDAR 2001), 2001

Comparing Adaptation Techniques for On-Line Handwriting Recognition.
Proceedings of the 6th International Conference on Document Analysis and Recognition (ICDAR 2001), 2001

Recognition of face profiles from the mugshot database using a hybrid connectionist/HMM approach.
Proceedings of the IEEE International Conference on Acoustics, 2001

New approaches to audio-visual segmentation of TV news for automatic topic retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2001

Content based indexing of images and video using face detection and recognition methods.
Proceedings of the IEEE International Conference on Acoustics, 2001

Facial Expression Recognition with Pseudo-3D Hidden Markov Models.
Proceedings of the Pattern Recognition, 2001

Writer Adaptation for Online Handwriting Recognition.
Proceedings of the Pattern Recognition, 2001

2000
Recognition of JPEG compressed face images based on statistical methods.
Image Vis. Comput., 2000

Compound splitting and lexical unit recombination for improved performance of a speech recognition system for German parliamentary speeches.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

On-Line Handwritten Formula Recognition with Integrated Correction Recognition and Execution.
Proceedings of the 15th International Conference on Pattern Recognition, 2000

Improved Degraded Document Recognition with Hybrid Modeling Techniques and Character N-Grams.
Proceedings of the 15th International Conference on Pattern Recognition, 2000

An HMM Based Two-Pass Approach for Off-Line Cursive Handwriting Recognition.
Proceedings of the Advances in Multimodal Interfaces, 2000

DUcoder-the Duisburg University LVCSR stackdecoder.
Proceedings of the IEEE International Conference on Acoustics, 2000

Frame-discriminative and confidence-driven adaptation for LVCSR.
Proceedings of the IEEE International Conference on Acoustics, 2000

Tied posteriors: an approach for effective introduction of context dependency in hybrid NN/HMM LVCSR.
Proceedings of the IEEE International Conference on Acoustics, 2000

A novel error measure for the evaluation of video indexing systems.
Proceedings of the IEEE International Conference on Acoustics, 2000

Person Tracking in Real-World Scenarios Using Statistical Methods.
Proceedings of the 4th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2000), 2000

Crane Gesture Recognition Using Pseudo 3-D Hidden Markov Models.
Proceedings of the 4th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2000), 2000

Comparison of Confidence Measures for Face Recognition.
Proceedings of the 4th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2000), 2000

Gesture Recognition Using Pseudo 3D Hidden Markov Models.
Proceedings of the Mustererkennung 2000, 2000

Unlimited Vocabulary Script Recognition Using Character N-Grams.
Proceedings of the Mustererkennung 2000, 2000

1999
A discriminative training procedure based on language model and dictionary for LVCSR.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Speaker adaptation using regularization and network adaptation for hybrid MMI-NN/HMM speech recognition.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Robust Person Tracking with Non-Stationary Background Using a Combined Pseudo-2D-HMM and Kalman-Filter Approach.
Proceedings of the 1999 International Conference on Image Processing, 1999

Pseudo 3-D HMMs for Image Sequence Recognition.
Proceedings of the 1999 International Conference on Image Processing, 1999

High Quality Face Recognition in JPEG Compressed Images.
Proceedings of the 1999 International Conference on Image Processing, 1999

Searching an Engineering Drawing Database for User-specified Shapes.
Proceedings of the Fifth International Conference on Document Analysis and Recognition, 1999

Multimedia Database Retrieval using Hand-Drawn Sketches.
Proceedings of the Fifth International Conference on Document Analysis and Recognition, 1999

Advanced State Clustering for Very Large Vocabulary HMM-based On-Line Handwriting Recognition.
Proceedings of the Fifth International Conference on Document Analysis and Recognition, 1999

On-Line Handwritten Formula Recognition using Hidden Markov Models and Context Dependent Graph Grammars.
Proceedings of the Fifth International Conference on Document Analysis and Recognition, 1999

Performance Evaluation of a New Hybrid Modeling Technique for Handwriting Recognition using On-Line and Off-Line Data.
Proceedings of the Fifth International Conference on Document Analysis and Recognition, 1999

Refining tree-based state clustering by means of formal concept analysis, balanced decision trees and automatically generated model-sets.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

Experiments in topic indexing of broadcast news using neural networks.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

Graphics-Based Retrieval of Color Image Databases Using Hand-Drawn Query Sketches.
Proceedings of the Graphics Recognition, Recent Advances, Third International Workshop, 1999

Engineering Drawing Database Retrieval Using Statistical Pattern Spotting Techniques.
Proceedings of the Graphics Recognition, Recent Advances, Third International Workshop, 1999

High performance face recognition using pseudo 2-D hidden Markov models.
Proceedings of the 5th European Control Conference, 1999

Gesichtserkennung mit Hidden Markov Modellen.
Proceedings of the Mustererkennung 1999, 1999

Vergleich verschiedener statistischer Modellierungsverfahren für die On- und Off-line Handschriftenerkennung.
Proceedings of the Mustererkennung 1999, 1999

1998
Controlling the Complexity of HMM Systems by Regularization.
Proceedings of the Advances in Neural Information Processing Systems 11, NIPS Conference, Denver, Colorado, USA, November 30, 1998

Confidence measures for HMM-based speech recognition.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

A German dialogue system for scheduling dates and meetings by naturally spoken continuous speech.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Efficient computation of MMI neural networks for large vocabulary speech recognition systems.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Soft state-tying for HMM-based speech recognition.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

A new hybrid approach to large vocabulary cursive handwriting recognition.
Proceedings of the Fourteenth International Conference on Pattern Recognition, 1998

A systematic comparison between on-line and off-line methods for signature verification with hidden Markov models.
Proceedings of the Fourteenth International Conference on Pattern Recognition, 1998

Tree-based state clustering using self-organizing principles for large vocabulary on-line handwriting recognition.
Proceedings of the Fourteenth International Conference on Pattern Recognition, 1998

On-line handwritten formula recognition using statistical methods.
Proceedings of the Fourteenth International Conference on Pattern Recognition, 1998

Hidden Markov model based continuous online gesture recognition.
Proceedings of the Fourteenth International Conference on Pattern Recognition, 1998

Efficient search with posterior probability estimates in HMM-based speech recognition.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

Speaker adaptation for hybrid MMI/connectionist speech-recognition systems.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

A NN/HMM hybrid for continuous speech recognition with a discriminant nonlinear feature extraction.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

Echtzeitfähige Gestikerkennung mit stochastischen Mustererkennungsverfahren.
Proceedings of the Informatik '98, 1998

Speech recognition with a new hybrid architecture combining neural networks and continuous HMM.
Proceedings of the 6th European Symposium on Artificial Neural Networks, 1998

Invariante Erkennung handskizzierter Piktogramme mit Anwendungsmöglichkeiten in der inhaltsorientierten Bilddatenbankabfrage.
Proceedings of the Mustererkennung 1998, 20. DAGM-Symposium, Stuttgart, 29. September, 1998

Bildorientierte Videoindexierung mit Hidden Markov Modellen.
Proceedings of the Mustererkennung 1998, 20. DAGM-Symposium, Stuttgart, 29. September, 1998

1997
Hybrid NN/HMM-Based Speech Recognition with a Discriminant Neural Feature Extraction.
Proceedings of the Advances in Neural Information Processing Systems 10, 1997

A new approach to generalized mixture tying for continuous HMM-based speech recognition.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Large vocabulary speech recognition with context dependent MMI-connectionist / HMM systems using the WSJ database.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Reduced lexicon trees for decoding in a MMI-connectionist/HMM speech recognition system.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Improved On-Line Handwriting Recognition Using Context Dependent Hidden Markov Models.
Proceedings of the 4th International Conference Document Analysis and Recognition (ICDAR '97), 1997

New improved feature extraction methods for real-time high performance image sequence recognition.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

Advanced training methods and new network topologies for hybrid MMI-connectionist/HMM speech recognition systems.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

An investigation of the use of trigraphs for large vocabulary cursive handwriting recognition.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

High Performance Real-Time Gesture Recognition Using Hidden Markov Models.
Proceedings of the Gesture and Sign Language in Human-Computer Interaction, 1997

Large Vocabulary On-Line Handwriting Recognition with Context Dependent Hidden Markov Models.
Proceedings of the Mustererkennung 1997, 1997

Echtzeitfähige Videosequenzerkennung mit statistischen Verfahren.
Proceedings of the Mustererkennung 1997, 1997

1996
A New Approach to Hybrid HMM/ANN Speech Recognition using Mutual Information Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 9, 1996

A comparison between continuous and discrete density hidden Markov models for cursive handwriting recognition.
Proceedings of the 13th International Conference on Pattern Recognition, 1996

A new approach to video sequence recognition based on statistical methods.
Proceedings of the 1996 International Conference on Image Processing, 1996

Fast online video image sequence recognition with statistical methods.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

A new hybrid system based on MMI-neural networks for the RM speech recognition task.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

Optimal Combination of Neural Networks and Discrete Statistical Pattern Classifiers.
Proceedings of the Mustererkennung 1996, 1996

1995
Mutual Information Neural Networks: A New Connectionist Paradigm for Dynamic Pattern Recognition Tasks.
Proceedings of the Neural Networks: Artificial Intelligence and Industrial Applications, 1995

Large vocabulary speaker-independent continuous speech recognition with a new hybrid system based on MMI-neural networks.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

Speech recognition experiments with a new multilayer LVQ network (MLVQ).
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

Ein Hybrides System zur Erkennung von sprecherunabhängiger fließender Sprache mit großen Wortschätzen.
Proceedings of the Mustererkennung 1995, 1995

1994
Maximum mutual information neural networks for hybrid connectionist-HMM speech recognition systems.
IEEE Trans. Speech Audio Process., 1994

Mutual information neural networks: a new connectionist approach for dynamic speech recognition tasks.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1993
Joint optimization of multiple neural codebooks in a hybrid connectionist-HMM speech recognition system.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

Speaker adaptation using improved speaker Markov models.
Proceedings of the IEEE International Conference on Acoustics, 1993

1992
Unsupervised information theory-based training algorithms for multilayer neural networks.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

1991
Algorithmen der Sprachverarbeitung zur Entwicklung eines vollsynthetischen Sprachausgabesystems.
Springer, ISBN: 978-3-540-53870-7, 1991

Information theory-based supervised learning methods for self-organizing maps in combination with hidden Markov modeling.
Proceedings of the 1991 International Conference on Acoustics, 1991

1990
Large vocabulary hidden Markov model based speech recognition.
Eur. Trans. Telecommun., 1990

Information theory principles for the design of self-organizing maps in combination with hidden Markov modeling for continuous speech recognition.
Proceedings of the IJCNN 1990, 1990

Baseform adaptation for large vocabulary hidden Markov model based speech recognition systems.
Proceedings of the 1990 International Conference on Acoustics, 1990

Neural Network Based Continuous Speech Recognition by Combining Self Organizing Feature Maps and Hidden Markov Modeling.
Proceedings of the Neural Networks, 1990

1989
An information theory approach to speaker adaptation.
Proceedings of the First European Conference on Speech Communication and Technology, 1989

Speaker adaptation for large vocabulary speech recognition systems using speaker Markov models.
Proceedings of the IEEE International Conference on Acoustics, 1989

1988
Formant tracking with quasilinearization.
Proceedings of the IEEE International Conference on Acoustics, 1988

1987
The DECtalk system for German: A study of the modification of a text-to-speech converter for a foreign language.
Proceedings of the IEEE International Conference on Acoustics, 1987

1986
Maschinelle Spracherkennung zur Verbesserung der Mensch-Maschine-Schnittstelle.
PhD thesis, 1986

A new algorithm for estimation of formant trajectories directly from the speech signal based on an extended Kalman-filter.
Proceedings of the IEEE International Conference on Acoustics, 1986
