Facebook AI Research – Machine Intelligence Roadmap


The development of intelligent machines is one of the biggest unsolved challenges in computer science. In this paper, we propose some fundamental properties these machines should have, focusing in particular on communication and learning. We discuss a simple environment that could be used to incrementally teach a machine the basics of natural-language-based communication, as a prerequisite to more complex interaction with human users. We also present some conjectures on the sort of algorithms the machine should support in order to profitably learn from the environment.

Tomas Mikolov, Armand Joulin, Marco Baroni
Facebook AI Research

1 Introduction

A machine capable of performing complex tasks without requiring laborious programming would be tremendously useful in almost any human endeavour, from performing menial jobs for us to helping the advancement of basic and applied research. Given the current availability of powerful hardware and large amounts of machine-readable data, as well as the widespread interest in sophisticated machine learning methods, the time should be ripe for the development of intelligent machines.

We think that one fundamental reason such machines have not yet been developed is that, since “solving AI” seems too complex a task to be pursued all at once, the computational community has preferred to focus, in the last decades, on solving relatively narrow empirical problems that are important for specific applications but do not address the overarching goal of developing general-purpose intelligent machines.

In this article, we propose an alternative approach: we first define the general characteristics we think intelligent machines should possess, and then we present a concrete roadmap to develop them in realistic, small steps. These steps are incrementally structured so that, jointly, they should lead us close to the ultimate goal of implementing a powerful AI. We realise that our vision of artificial intelligence and how to create it is just one among many. We focus here on a plan that, we hope, will lead to genuine progress, without implying that there are no other valid approaches to the task.

The article is structured as follows. In Section 2 we indicate the two fundamental characteristics that we consider crucial for developing intelligence (at least the sort of intelligence we are interested in), namely communication and learning. Our goal is to build a machine that can learn new concepts through communication at a rate similar to that of a human with similar prior knowledge. That is, if one can easily learn how subtraction works after mastering addition, the intelligent machine, after grasping the concept of addition, should not find it difficult to learn subtraction as well.

Since, as we said, building an intelligent machine equipped with all the desired features in a single step seems too difficult, we need to define intermediate targets that can lead us in the right direction. We specify such targets in terms of simplified but self-contained versions of the final machine we want to develop. Our plan is to “educate” the target machine like a child: at any time in its development, the target machine should act like a stand-alone intelligent system, albeit one that will be initially very limited in what it can do. The bulk of our proposal (Section 3) thus consists of the plan for an interactive learning environment fostering the incremental development of progressively more intelligent behaviour.

Section 4 briefly discusses some of the algorithmic capabilities we think a machine should possess in order to profitably exploit the learning environment. Finally, Section 5 situates our proposal in the broader context of past and current attempts to develop intelligent machines.


Alternative structures for character-level RNNs


Recurrent neural networks are convenient and efficient models for language modeling. However, when applied at the level of characters instead of words, they suffer from several problems. In order to successfully model long-term dependencies, the hidden representation needs to be large. This in turn implies higher computational costs, which can become prohibitive in practice. We propose two alternative structural modifications to the classical RNN model. The first one consists in conditioning the character-level representation on the previous word representation. The other one uses the character history to condition the output probability. We evaluate the performance of the two proposed modifications on challenging, multilingual real-world data.

Piotr Bojanowski
Paris, France

Armand Joulin and Tomas Mikolov
Facebook AI Research
New York, NY, USA
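
The first modification mentioned in the abstract above, conditioning the character-level representation on the previous word representation, can be illustrated with a small sketch. This is not the formulation from the paper: the class name `WordConditionedCharRNN`, the weight names, the layer sizes, and the way the previous-word vector is obtained (a projection of the hidden state at each space character) are all assumptions made for illustration. The weights are random and untrained, so the output only shows how the information flows.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class WordConditionedCharRNN:
    """Elman-style character RNN with one extra input: a previous-word vector.

    Hypothetical illustration only; names and sizes are assumptions.
    """

    def __init__(self, vocab, hidden_size=8, word_size=4, seed=0):
        rng = random.Random(seed)
        self.index = {ch: i for i, ch in enumerate(vocab)}
        self.vocab_size = len(vocab)
        self.hidden_size = hidden_size
        self.word_size = word_size
        self.W_in = make_matrix(hidden_size, self.vocab_size, rng)   # char -> hidden
        self.W_rec = make_matrix(hidden_size, hidden_size, rng)      # hidden -> hidden
        self.W_word = make_matrix(hidden_size, word_size, rng)       # previous word -> hidden
        self.W_enc = make_matrix(word_size, hidden_size, rng)        # hidden -> word vector (assumed)
        self.W_out = make_matrix(self.vocab_size, hidden_size, rng)  # hidden -> next-char scores

    def step(self, char, hidden, prev_word):
        """One time step: returns (new hidden state, next-character distribution)."""
        one_hot = [0.0] * self.vocab_size
        one_hot[self.index[char]] = 1.0
        from_char = matvec(self.W_in, one_hot)
        from_hidden = matvec(self.W_rec, hidden)
        from_word = matvec(self.W_word, prev_word)  # the conditioning term
        new_hidden = [math.tanh(a + b + c)
                      for a, b, c in zip(from_char, from_hidden, from_word)]
        probs = softmax(matvec(self.W_out, new_hidden))
        return new_hidden, probs

    def run(self, text):
        """Feed a string through the model, refreshing the word vector at each space."""
        hidden = [0.0] * self.hidden_size
        prev_word = [0.0] * self.word_size
        outputs = []
        for ch in text:
            hidden, probs = self.step(ch, hidden, prev_word)
            outputs.append(probs)
            if ch == " ":
                # Assumption: summarise the word just finished from the hidden state.
                prev_word = [math.tanh(x) for x in matvec(self.W_enc, hidden)]
        return outputs


model = WordConditionedCharRNN(vocab="ab ")
distributions = model.run("ab ba")
print(len(distributions), "next-character distributions computed")
```

In this sketch the word vector enters every character step as a third additive input to the hidden layer, so word-level context reaches the character model without passing through the recurrent state. Training, and the second modification (conditioning the output probability on the character history), are left out.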
