Publications

For full lists of publications, please refer to each faculty member's website: Y. Sato, Y. Sugano.

(2024). Single-to-Dual-View Adaptation for Egocentric 3D Hand Pose Estimation. Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024).

(2024). Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives. Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024).

(2024). Rotation-Constrained Cross-View Feature Fusion for Multi-View Appearance-based Gaze Estimation. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV).

(2023). Image Cropping under Design Constraints. Proceedings of the 5th ACM International Conference on Multimedia in Asia (MMAsia 2023).

(2023). Weakly Supervised Temporal Sentence Grounding With Uncertainty-Guided Self-Training. Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023).

(2023). Technical Report for EgoTracks in Ego4D Challenge 2023. Proceedings of the Joint International 3rd Ego4D and 11th EPIC Workshop (in conjunction with CVPR 2023, extended abstract).

(2023). Structural Multiplane Image: Bridging Neural View Synthesis and 3D Reconstruction. Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023).

(2023). FineBio: A Fine-Grained Video Dataset of Biological Experiments with Hierarchical Annotations. Proceedings of the Joint International 3rd Ego4D and 11th EPIC Workshop (in conjunction with CVPR 2023, extended abstract).

(2023). Fine-grained Affordance Annotation for Egocentric Hand-Object Interaction Videos. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2023).

(2022). Learning Video-independent Eye Contact Segmentation from In-the-Wild Videos. Proceedings of the 16th Asian Conference on Computer Vision (ACCV 2022).

(2022). Compound Prototype Matching for Few-shot Action Recognition. Proceedings of the European Conference on Computer Vision (ECCV 2022).

(2022). Background Mixup Data Augmentation for Hand and Object-in-Contact Detection. Proceedings of the International Workshop on Observing and Understanding Hands in Action (in conjunction with ECCV 2022).

(2022). Surgical Skill Assessment via Video Semantic Aggregation. Proceedings of the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2022).

(2022). Learning-by-Novel-View-Synthesis for Full-Face Appearance-Based 3D Gaze Estimation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops.

(2022). Interact before Align: Leveraging Cross-Modal Knowledge for Domain Adaptive Action Recognition. Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022).

(2022). Precise Affordance Annotation for Egocentric Action Video Datasets. Proceedings of the Tenth International Workshop on Egocentric Perception, Interaction and Computing (EPIC 2022 at CVPR 2022, extended abstract).

(2022). Object Instance Identification in Dynamic Environments. Proceedings of the Tenth International Workshop on Egocentric Perception, Interaction and Computing (EPIC 2022 at CVPR 2022, extended abstract).

(2022). Spatio-Temporal Perturbations for Video Attribution. IEEE Transactions on Circuits and Systems for Video Technology.

(2021). Neural Routing by Memory. Proceedings of The 35th Conference on Neural Information Processing Systems (NeurIPS 2021).

(2021). EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2021: Team M3EM Technical Report. The Eighth International Workshop on Egocentric Perception, Interaction and Computing (EPIC 2021).

(2021). Unsupervised Common Particular Object Discovery and Localization by Analyzing a Match Graph. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

(2021). Toward Visually Explaining Video Understanding Networks by Perturbation. Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV 2021).

(2020). Learning-based Region Selection for End-to-End Gaze Estimation. Proceedings of the 31st British Machine Vision Conference (BMVC 2020).

(2020). Generalizing Hand Segmentation in Egocentric Videos With Uncertainty-Guided Model Adaptation. Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020).

(2020). Improving Action Segmentation via Graph Based Temporal Reasoning. Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020).

(2020). Investigating audio data visualization for interactive sound recognition. Proceedings of the 25th International Conference on Intelligent User Interfaces.

(2018). Revisiting data normalization for appearance-based gaze estimation. Proceedings of the 2018 ACM Symposium on Eye Tracking Research & Applications.

(2018). Gaze-guided Image Classification for Reflecting Perceptual Class Ambiguity. Adjunct proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology.

(2018). Future Person Localization in First-Person Videos. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018).

(2018). Forecasting user attention during everyday mobile interactions using device-integrated and wearable sensors. Proceedings of the 20th International Conference on Human-Computer Interaction with Mobile Devices and Services.

(2018). Browsing Group First-Person Videos with 3D Visualization. Proc. ACM International Conference on Interactive Surfaces and Spaces (ISS 2018).

(2018). A Multimodal Corpus of Expert Gaze and Behavior during Phonetic Segmentation Tasks. Proceedings of the Eleventh International Conference on Language Resources and Evaluation.

(2017). Rapid Prototyping of Accessible Interfaces With Gaze-Contingent Tunnel Vision Simulation. Proc. ACM SIGACCESS International Conference on Computers and Accessibility (ASSETS 2017).

(2017). Noticeable or Distractive?: A Design Space for Gaze-Contingent User Interface Notifications. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (Extended Abstracts).

(2017). It's Written All Over Your Face: Full-Face Appearance-Based Gaze Estimation. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

(2017). Fast Multi-frame Stereo Scene Flow with Motion Segmentation. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017).

(2017). Everyday Eye Contact Detection Using Unsupervised Gaze Target Discovery. Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology.

(2017). Deep Photometric Stereo Network. Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops.

(2017). Cell tracking for cell image analysis. Proc. Biomedical Imaging and Sensing Conference.

(2016). Visual Motif Discovery via First-Person Vision. Proc. European Conference on Computer Vision (ECCV 2016).

(2016). Visual Guidance with Unnoticed Blur Effect. Proc. International Working Conference on Advanced Visual Interfaces (AVI 2016).

(2016). Spatio-Temporal Modeling and Prediction of Visual Attention in Graphical User Interfaces. Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems.

(2016). Recognizing Micro-Actions and Reactions from Paired Egocentric Videos. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016).

(2016). Joint Recovery of Dense Correspondence and Cosegmentation in Two Images. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016).

(2016). Hierarchical Gaussian Descriptor for Person Re-identification. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016).

(2016). Exploiting Spectral-Spatial Correlation for Coded Hyperspectral Image Restoration. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016).

(2016). AggreGaze: Collective Estimation of Audience Attention on Public Displays. Proceedings of the 29th Annual Symposium on User Interface Software and Technology.

(2016). 3D gaze estimation from 2D pupil positions on monocular head-mounted eye trackers. Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research & Applications.

(2015). Self-Calibrating Head-Mounted Eye Trackers Using Egocentric Visual Saliency. Proceedings of the 28th Annual ACM Symposium on User Interface Software & Technology.

(2015). Fast sparse edge-based intrinsic image decomposition guided by chromaticity gradients. Proc. IEEE International Conference on Image Processing (ICIP 2015).

(2015). Ego-surfing first person videos. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015).

(2015). Appearance-based gaze estimation in the wild. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

(2015). A scalable approach for understanding the visual structures of hand grasps. IEEE International Conference on Robotics and Automation (ICRA 2015).

(2014). Shape-Preserving Half-Projective Warps for Image Stitching. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2014).

(2014). Sensing, predicting, and utilizing human visual attention. Proc. 4th International Conference on Image Processing Theory, Tools and Applications (IPTA 2014).

(2014). Learning-by-Synthesis for Appearance-Based 3D Gaze Estimation. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition.

(2014). Interreflection Removal Using Fluorescence. Proc. European Conference on Computer Vision (ECCV 2014).

(2014). Influence of stimulus and viewing task types on a learning-based visual saliency model. Proceedings of the Eighth Biennial ACM Symposium on Eye Tracking Research & Applications.

(2013). Uncalibrated Photometric Stereo for Unknown Isotropic Reflectances. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013).

(2013). Spectral Imaging Using Basis Lights. Proc. British Machine Vision Conference (BMVC 2013).

(2013). Image Preference Estimation from Eye Movements with A Data-driven Approach. Proceedings of the 3rd International Workshop on Pervasive Eye Tracking and Mobile Eye-Based Interaction.

(2013). Early facial expression recognition using early RankBoost. Proc. IEEE Int. Conf. Automatic Face and Gesture Recognition (FG 2013).

(2012). Illumination normalization of face images with cast shadows. Proc. International Conference on Pattern Recognition (ICPR 2012).

(2012). Head pose-free appearance-based gaze sensing via eye image synthesis. Proceedings of the 21st International Conference on Pattern Recognition.

(2012). Denoising hyperspectral images using spectral domain statistics. Proc. International Conference on Pattern Recognition (ICPR 2012).

(2012). Coupling eye-motion and ego-motion features for first-person activity recognition. Proceedings of the 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

(2012). Bispectral photometric stereo based on fluorescence. Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR 2012).

(2011). Surface Reconstruction in Photometric Stereo with Calibration Error. Proc. International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT 2011).

(2011). Photometric stereo with auto-radiometric calibration. Proc. IEEE Workshop on Color and Photometry in Computer Vision (CPCV 2011).

(2011). Inferring human gaze from appearance via adaptive linear regression. Proceedings of the IEEE International Conference on Computer Vision.

(2011). Estimating change in head pose from low resolution video using LBP-based tracking. Proc. International Symposium on Intelligent Signal Processing and Communications Systems (ISPACS 2011).

(2011). Aesthetic quality classification of photographs based on color harmony. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2011).

(2010). Video Segmentation with Motion Smoothness. IEICE Transactions on Information and Systems.

(2010). Calibration-free gaze sensing using saliency maps. Proceedings of the Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition.

(2009). Visual localization of non-stationary sound sources. Proc. International Conference on Multimedia (MM 2009).

(2009). Video segmentation with motion smoothness. Poster Proceedings of International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH 2009).

(2009). Sensation-based photo cropping. Proc. ACM International Conference on Multimedia (MM 2009).

(2009). Detecting Video Forgeries Based on Noise Characteristics. Proc. Pacific-Rim Symposium on Image and Video Technology (PSIVT 2009).

(2008). Recovering audio-to-video synchronization by audiovisual correlation analysis. Proc. International Conference on Pattern Recognition (ICPR 2008).

(2008). Finding Speaker Face Region by Audiovisual Correlation. Proc. Workshop on Multi-camera and Multi-modal Sensor Fusion Algorithms and Applications (M2SFA2 2008).

(2008). 3-D Interaction with Wall-Sized Display and Information Transportation using Mobile Phones. Proc. Workshop on Designing Multi-touch Interaction Techniques for Coupled Public and Private Displays.

(2007). Person-Independent Monocular Tracking of Face and Facial Actions with Multilinear Models. Proceedings of the Third International Workshop on Analysis and Modeling of Faces and Gestures.

(2007). Information Layout and Interaction on Virtual and Real Rotary Tables. Proc. IEEE International Workshop on Horizontal Interactive Human-Computer Systems (Tabletop 2007).

(2006). Robust Content-Dependent Photometric Projector Compensation. Proc. IEEE International Workshop on Projector-Camera Systems.

(2006). Gaze Estimation from Low Resolution Images. Proc. IEEE Pacific-Rim Symposium on Image and Video Technology (PSIVT 2006).

(2006). An MDL Approach to Learning Activity Grammars. Proc. Korea-Japan Workshop on Pattern Recognition (KJPR 2006).

(2005). Steerable Projector Calibration. Proc. IEEE Workshop on Projector-Camera Systems (PROCAM 2005).

(2005). Real-Time Modeling of Face Deformation for 3D Head Pose Estimation. Proc. IEEE International Workshop on Analysis and Modeling of Faces and Gestures.

(2005). Deleted Interpolation Using a Hierarchical Bayesian Grammar Network for Recognizing Human Activity. Proc. International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance.

(2005). Combining head tracking and mouse input for a GUI on multiple monitors. Extended Abstracts of the 2005 Conference on Human Factors in Computing Systems (CHI 2005).

(2004). Video-Based Tracking of User's Motion for Augmented Desk Interface. Proc. International Conference on Automatic Face and Gesture Recognition (FG 2004).

(2004). Video Content Manipulation by Means of Content Annotation and Nonsymbolic Gestural Interfaces. Proc. International Conference on Knowledge-Based Intelligent Information and Engineering Systems (KES 2004).

(2004). Reflectance Estimation from Motion under Complex Illumination. Proc. International Conference on Pattern Recognition (ICPR’04).

(2003). Ubiquitous display for dynamically changing environment. Extended abstracts of ACM Conference on Human Factors in Computing Systems (CHI 2003).

(2003). Object Recognition Based on Photometric Alignment Using RANSAC. Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR 2003).

(2003). Illumination from Shadows. IEEE Trans. Pattern Anal. Mach. Intell.

(2002). Vision-Based Face Tracking System for Large Displays. Proc. International Conference on Ubiquitous Computing (Ubicomp 2002).

(2002). Two-handed drawing on augmented desk system. Proc. International Working Conference on Advanced Visual Interfaces (AVI 2002).

(2002). Two-handed drawing on augmented desk. Extended abstracts of ACM Conference on Human Factors in Computing Systems (CHI 2002).

(2002). Real-Time Tracking of Multiple Fingertips and Gesture Recognition for Augmented Desk Interface Systems. Proc. IEEE International Conference on Automatic Face and Gesture Recognition (FG 2002).

(2001). Stability Issues in Recovering Illumination Distribution from Brightness in Shadows. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2001).

(2001). Interactive object registration and recognition for augmented desk interface. Extended Abstracts on Human Factors in Computing Systems (CHI 2001).

(2000). Fast Tracking of Hands and Fingertips in Infrared Images for Augmented Desk Interface. Proc. IEEE International Conference on Automatic Face and Gesture Recognition (FG 2000).

(1999). Photometric modeling for mixed reality. Proc. International Symposium on Mixed Reality (ISMR ’99).

(1999). Object recognition using local EGI and 3D models with M-estimators. Proc. IEEE/SICE/RSJ International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI'99).

(1999). Illumination Distribution from Shadows. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR’99).

(1999). Eigen-Texture Method: Appearance Compression Based on 3D Model. Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR’99).

(1999). Appearance modeling for mixed reality: photometric aspects. Proc. IEEE International Conference on Systems, Man, and Cybernetics (SMC'99).

(1999). Appearance Compression and Synthesis based on 3D Model for Mixed Reality. Proc. IEEE International Conference on Computer Vision (ICCV'99).

(1997). Visual learning and object verification with illumination invariance. Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'97).

(1997). Object shape and reflectance modeling from observation. Proc. ACM Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH'97).

(1997). 3D shape and reflectance morphing. Proc. International Conference on Shape Modeling and Applications.

(1996). Recovering shape and reflectance properties from a sequence of range and color images. Proc. IEEE/SICE/RSJ International Conference on Multisensor Fusion and Integration for Intelligent Systems.

(1996). Photorealistic object model generation from observation for virtual reality applications. Proc. International Conference on Artificial Reality and Tele-Existence ’96.

(1995). Reflectance analysis under solar illumination. Proc. IEEE Workshop on Physics-Based Modeling in Computer Vision.

(1993). Temporal-color space analysis of reflection. Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR ’93).
