Hello, I’m Nagisa, an intern at HACARUS. In this post I report on what was discussed at “Machine Learning Meetup Kansai #6 (MLM Kansai)”, a meetup for machine learning engineers.
This was the sixth meetup, and it was held at one of the offices of “Sakura Internet Co., Ltd.” Around 30 to 40 people attended, and three speakers gave insightful presentations.
#1 Park-san ( LINE Corporation )
-Models for Normalization of Texts with Machine Learning-
The first speaker was Mr. Park, a data scientist at LINE. He is currently in charge of developing a speech synthesis service. His talk was about normalizing input text, one of the preprocessing steps for the service.
Text normalization in this sense means removing unnecessary words or symbols, correcting misspellings, and so on, so that sentences become easier for a computer to analyze from the perspective of ASCII.
He introduced four models that serve four purposes: “unifying the way of speaking (writing), completing a sentence, converting slang to standard wording, and eliminating meaningless words or symbols”.
The first is a model based on a conditional random field (CRF), which can segment sentences into words. Next came a recurrent neural network (RNN) that outputs normalized text, followed by an autoencoder that uses a corpus to correct slang. The last is a model that can take into account both the context within a sentence and the meaning of each word.
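To make the four purposes of normalization more concrete, here is a minimal rule-based sketch in Python. It is not one of the learned models from the talk (CRF, RNN, autoencoder); it only illustrates the kind of input and output involved. The `SLANG` dictionary and the `normalize` function are hypothetical names invented for this example, and adding a final period is only a crude stand-in for "completing a sentence".

```python
import re
import unicodedata

# Hypothetical slang dictionary, invented for illustration only.
SLANG = {"u": "you", "gr8": "great", "thx": "thanks"}


def normalize(text: str) -> str:
    """Toy rule-based normalizer (an assumption, not the speaker's method)."""
    # 1. Unify the way of writing: NFKC folds full-width letters
    #    into their ASCII forms, then everything is lowercased.
    text = unicodedata.normalize("NFKC", text).lower()
    # 2. Eliminate meaningless symbols: keep only letters, digits,
    #    whitespace and basic punctuation.
    text = re.sub(r"[^a-z0-9\s.,?!']", " ", text)
    # 3. Convert slang to standard words, token by token.
    tokens = [SLANG.get(token, token) for token in text.split()]
    text = " ".join(tokens)
    # 4. "Complete" the sentence by adding a final period if none exists.
    if text and text[-1] not in ".?!":
        text += "."
    return text


print(normalize("Ｔｈｘ ★ u r gr8"))  # prints "thanks you r great."
```

A lookup table like this breaks down quickly (for example, "thx!" is not matched because the punctuation is attached to the token), which is one reason to consider learned models for this step.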
Although it looks like a small part of the work for the speech synthesis service, there is a lot to do. I think I have to reconsider the importance of preprocessing.
#2 Takahashi Tomohiro-san ( OMRON Corporation )
He explained his recently published article “Decentralized Learning of Generative Adversarial Networks from Multi-Client Non-iid Data”. As you know, GAN stands for generative adversarial network, which was invented by Ian Goodfellow and his colleagues. Almost all of the attendees seemed to know roughly what a GAN is, but only a few of them understood it mathematically.
His motivation for writing this paper is as follows. A GAN is useful when you want to generate images, but it can only generate images similar to the ones used for training. One way to overcome this is to train the GAN on plenty of images. However, training on such a large number of images becomes a bottleneck because of the high training cost of GANs.
Usually, the data and the two neural networks that make up a GAN, the discriminator and the generator, are placed in the same location, such as a data server. However, as more data accumulates, it becomes much harder to handle. He therefore proposed a method for decentralized learning of GANs. Each dataset stays where it is stored, at a “client” that has its own discriminator, while the generator sits on a server. The GAN learns through communication between the clients and the server. Obviously, this system produces massive data traffic as the number of datasets (clients) increases. He found an efficient way to avoid such heavy traffic by exploiting the algorithm. He argued that it is mathematically sound to select the maximum output among the clients and use only that output for learning. This reduces data traffic drastically.
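The selection step described above can be sketched in a few lines of Python. This is only my own toy illustration under several assumptions, not the implementation from the paper: the discriminators are simple hand-written functions instead of trained networks, everything runs in one process with no real communication, and the names `make_client_discriminator`, `max_client_scores`, and `generator_loss` are hypothetical.

```python
import math
from typing import Callable, List, Sequence

# A discriminator maps a sample to a score in (0, 1]: "how real does it look?"
Discriminator = Callable[[float], float]


def make_client_discriminator(center: float) -> Discriminator:
    """Toy stand-in for a trained client discriminator (hypothetical).

    It scores samples close to the client's own local data (`center`)
    as more realistic.
    """
    def score(x: float) -> float:
        return math.exp(-(x - center) ** 2)
    return score


def max_client_scores(samples: Sequence[float],
                      clients: Sequence[Discriminator]) -> List[float]:
    """For each generated sample, keep only the largest client score."""
    return [max(d(x) for d in clients) for x in samples]


def generator_loss(samples: Sequence[float],
                   clients: Sequence[Discriminator]) -> float:
    """Standard -log(D(G(z))) generator loss, using only the max score."""
    eps = 1e-12  # avoids log(0) for very small scores
    scores = max_client_scores(samples, clients)
    return -sum(math.log(s + eps) for s in scores) / len(scores)


# Two clients whose local data sit around 0.0 and 5.0 (non-iid data).
clients = [make_client_discriminator(0.0), make_client_discriminator(5.0)]
print(max_client_scores([0.0, 5.0, 2.5], clients))
print(generator_loss([0.0, 5.0], clients))
```

In this toy setup, a generated sample is rewarded as long as at least one client finds it realistic, so the generator is not penalized for producing data that only one client has seen.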
Although this has a security problem, namely that the data used by the GAN can be seen, it is remarkable progress.
#3 Arikata-san ( Panasonic Corporation )
-the design sense in AI development-
Mr. Arikata discussed the uncertainty caused by human beings. He started by explaining that projects often miss their due dates and get postponed because people do not always do their work as expected. In this sense, human beings are a source of uncertainty. His talk then moved on to what happens when people annotate image data: because of this uncertainty, they often make mistakes. He said it is important to build an annotation tool with a user-friendly interface, and that he and his colleagues actually built one.
It was not a talk about preprocessing such as data augmentation. I would say it was about the preprocessing before preprocessing. The correctness of the data can affect the evaluation of a model to some degree, so this work seems meaningful.
Inari-sushi (fried bean curd stuffed with vinegared rice) was served at the event.
As usual, the lightning talks followed.
The first speaker was Mr. Murayama. He talked about making a “hello world” of reinforcement learning with a simple model. The content was really cool, but what was awesome about his talk was his jokes. They were so funny that everyone there laughed. In the lightning talks, we also heard a talk about anomaly detection on servers and one about the experience of hosting a data competition.
It was my first time attending this kind of meetup for engineers. I had an opportunity to talk to people I would never meet in my everyday life and came across some unexpected and interesting things. Overall, it was nice. MLM Kansai is getting popular, and the organizer announced that he is going to hold the next meetup. Why not join us? If you are interested, you can find more information at (https://mlm-kansai.connpass.com/).
Event Hashtag : #mlm_kansai