Corridor of EEIC, The University of Tokyo

Multimedia, computer vision, image processing: Kiyoharu Aizawa

Fundamentals of Recognition and Learning, Open World

Deep learning works accurately on closed datasets that contain a large amount of data per class. In reality, however, unknown classes and new classes with only a small amount of data appear frequently. We are investigating identification and recognition techniques for such open-world situations. Specific topics include effective methods for learning from noisy training data, out-of-distribution detection, positive-unlabeled learning, open-set learning, and uncertainty estimation. We also work on scene character recognition and show that state-of-the-art (SOTA) performance can be achieved with about 1/100th as many real images, in contrast to existing approaches that use huge amounts of synthetic characters.

360° Image Processing, 3D, Movie Map

We are investigating 360° image processing. Specifically, we are building a "movie map" that lets pedestrians explore a city. Using 360° street videos, we work on many research issues, such as 360° hyperlapse video, object detection in 360° images, accurate SLAM, intersection detection, RoI detection, real-time route view generation based on user input, and building a database of automatically segmented video sections. We have prototyped a first version of MovieMap, with which users can freely explore a certain area of a city.

Life Logging, Food Computing

We have been pioneering lifelogging technology. Starting from general-purpose lifelogging, we now pursue specific-purpose lifelogging. We focus on capturing and analyzing daily food logs (FoodLog). Using the app we developed, more than 10 million food records have been captured. We are investigating various kinds of processing of FoodLog data, such as personalized food recognition, multimodal analysis of recipes and food records, and prediction of a health index. We also built a new food logging tool, FoodLog Athl, which supports communication between users and dietitians, and made it publicly available.

Manga, Comic Computing

Manga, a unique part of our culture, is a research target that has rarely been discussed in the field of image processing. We have built one of the world's largest-scale manga datasets, Manga109, and investigate image processing techniques such as retrieval, segmentation, recognition, and colorization.

Design, Fonts

We investigate image technology for the creation and retrieval of designs for fonts, products, and similar items. We have built Emotype, a mobile messenger that expresses emotions through different typography, and have worked on social font search with multimodal inputs, font search across various languages, font generation from a small number of samples, and bag design.