Lexical-semantic relationships, Pictionary games, video upsampling

This is a roundup of some of the most interesting AI research papers published this year. It covers developments in artificial intelligence (AI) and data science, is organized chronologically, and includes links to longer articles.

Combining Visual and Linguistic Representations for Patch-Based Identification of Lexical-Semantic Relationships

Although multimodal natural language processing has found a variety of applications, few studies have focused on multimodal relational lexical semantics. This study proposes what the researchers describe as the first attempt to use visual cues to identify lexical-semantic relationships that reflect linguistic phenomena such as synonymy, co-hyponymy, and hypernymy. Conventional approaches rely on the paradigmatic approach and/or the distributional hypothesis. The researchers instead hypothesize that visual information can supplement textual information, drawing on the perceptual subcomponent of semiotic textology, a linguistic theory.

To test this, the researchers automatically add visual information to two gold-standard datasets and follow a patch-based approach to build several fusion techniques that combine the textual and visual modalities. Experimental results on the multimodal datasets show that visual information reliably improves performance by filling semantic gaps in the text encodings.
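To make the idea of fusing modalities concrete, here is a minimal sketch of two common fusion strategies: concatenation and a weighted blend. It is an illustration under stated assumptions, not the paper's method. The function names, the mean-pooling of image patches, and the toy vectors are all hypothetical.

```python
from statistics import fmean


def mean_pool(patch_vectors):
    """Average several patch embeddings into one visual vector (assumed pooling)."""
    return [fmean(dim) for dim in zip(*patch_vectors)]


def concat_fusion(text_vec, visual_vec):
    """Early fusion: place the two modality vectors side by side."""
    return list(text_vec) + list(visual_vec)


def weighted_fusion(text_vec, visual_vec, alpha=0.5):
    """Blend two equal-length vectors; alpha is the weight of the text modality."""
    if len(text_vec) != len(visual_vec):
        raise ValueError("weighted fusion needs equal-length vectors")
    return [alpha * t + (1 - alpha) * v for t, v in zip(text_vec, visual_vec)]


def pair_features(word_a, word_b, fuse=concat_fusion):
    """Build one feature vector for a word pair, ready for a relation classifier."""
    fused_a = fuse(word_a["text"], mean_pool(word_a["patches"]))
    fused_b = fuse(word_b["text"], mean_pool(word_b["patches"]))
    return fused_a + fused_b


# Toy, made-up embeddings: a 2-d text vector and two 2-d image patches per word.
dog = {"text": [0.2, 0.8], "patches": [[1.0, 0.0], [0.0, 1.0]]}
animal = {"text": [0.1, 0.9], "patches": [[0.5, 0.5], [0.5, 0.5]]}

features = pair_features(dog, animal)
print(len(features))
```

In a real pipeline the resulting pair vector would feed a classifier that predicts the relationship (for example, synonym or hypernym). Which fusion strategy works best is an empirical question that the sketch does not answer.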

DrawMon: A Distributed System for Detecting Atypical Sketch Content in Parallel Pictionary Games

The popular sketch-based guessing game Pictionary offers a chance to study cooperative gameplay with a shared goal when communication is restricted. Sometimes, however, players draw atypical content. Such content can be relevant to the game, but it can also break the rules and spoil the experience. To address this problem in a timely and scalable way, the researchers introduce DrawMon, a new distributed system for automatically detecting atypical sketch content across Pictionary games running at the same time.

The researchers built a dedicated online tool for collecting game session data and annotating atypical sketch content. This produced AtyPict, which they describe as the first dataset of atypical sketch content. They train CanvasNet, a deep neural network that detects atypical content, on AtyPict. CanvasNet is a key component of DrawMon. Analysis of post-deployment game session data shows that DrawMon is well suited to large-scale monitoring and detection of atypical sketch content. Beyond Pictionary, the work can also serve as a design guide for building custom atypical-content response systems around shared interactive whiteboards.

Extreme Scale Talking Face Video Upsampling Using Audio-Visual Prior

In this study, the researchers ask an interesting question: how much can be recovered from an 8×8 pixel video sequence? The answer turns out to be more than one might expect. They show that by processing these 8×8 videos with the right audio and image priors, it is possible to obtain a full-resolution 256×256 video. Their new audio-visual upsampling network achieves this 32× scaling from very low-resolution inputs. The audio prior helps recover basic facial features and the exact shape of the lips, and a single high-resolution image of the target identity provides rich information about what the person looks like.
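To put the scale of the task in numbers, the short sketch below computes the per-side factor and upsamples a frame with naive nearest-neighbor repetition. This is only a baseline for intuition, not the paper's audio-visual network, and the function names and the toy frame are hypothetical. Nearest-neighbor repetition adds no new detail, which is why learned priors are needed at this scale.

```python
def scale_factor(low_side, high_side):
    """Per-side upsampling factor between two square resolutions."""
    if high_side % low_side:
        raise ValueError("high_side must be a multiple of low_side")
    return high_side // low_side


def nearest_neighbor_upsample(frame, factor):
    """Repeat every pixel `factor` times horizontally and vertically."""
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in frame
        for _ in range(factor)
    ]


factor = scale_factor(8, 256)   # 32 per side
pixel_ratio = factor ** 2       # the output has 1024 times as many pixels

# Toy 8x8 grayscale frame with made-up intensity values.
low_frame = [[(r * 8 + c) % 256 for c in range(8)] for r in range(8)]
high_frame = nearest_neighbor_upsample(low_frame, factor)

print(factor, pixel_ratio, len(high_frame), len(high_frame[0]))
```

Each input pixel becomes a flat 32×32 block in the output, so every output frame still carries only 64 distinct values' worth of information.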

Their approach is an end-to-end, multi-stage framework. The first stage creates a coarse intermediate output video, which is then used to animate the single target identity image and produce realistic, accurate, and high-quality output. The method is simple and performs much better than previous super-resolution methods (an 8× improvement in FID score). The researchers also apply the model to compression of talking-face video and report a 3.5× improvement over the previous state of the art in terms of bits per pixel. The paper and its supplementary material analyze the network's results in detail through a number of ablation experiments.
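For readers unfamiliar with the compression metric, bits per pixel is the size of the encoded video divided by the number of pixels it represents. The sketch below shows the arithmetic with made-up numbers. The clip length, the bit counts, and the reading of "3.5× improvement" as 3.5 times fewer bits per pixel are all assumptions for illustration, not figures from the paper.

```python
def bits_per_pixel(total_bits, frames, width, height):
    """Encoded size in bits divided by the total number of pixels in the clip."""
    return total_bits / (frames * width * height)


# Hypothetical clip: 100 frames at 256x256 pixels.
frames, width, height = 100, 256, 256

# Hypothetical encoded sizes, chosen only to make the ratio easy to follow.
baseline_bits = 655_360
improved_bits = baseline_bits / 3.5

baseline_bpp = bits_per_pixel(baseline_bits, frames, width, height)
improved_bpp = bits_per_pixel(improved_bits, frames, width, height)

# Under the stated assumption, the improvement is the ratio of the two values.
improvement = baseline_bpp / improved_bpp
print(baseline_bpp, improved_bpp, improvement)
```

A lower bits-per-pixel value at comparable visual quality means the same video can be stored or transmitted with less data.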
