Intelligent Machines and Foolish Humans

[This Blog Articles post was written & submitted by J.D.F.]

We will eventually build machines so intelligent that they will be self-aware. When that happens, it will highlight two outstanding human traits: brilliance and foolhardiness. Of course, the kinds of people responsible for creating such machines would be exceptionally clever. The future, however, may show that those geniuses had blinkered vision and didn’t realise quite what they were creating. Many respected scientists believe that nothing threatens human existence more definitively than conscious machines, and that when humanity eventually takes the threat seriously, it may well be too late.

Other experts counter that warning and argue that since we build the machines, we will always be able to control them. That argument seems reasonable, but it doesn’t stand up to close scrutiny. Conscious machines, those with self-awareness, could be a threat to humans for many reasons, but three in particular. First, we won’t be able to control them because we won’t know what they’re thinking. Second, machine intelligence will improve at a much faster rate than human intelligence. Scientists working in this area, and in artificial intelligence (AI) in general, suggest that computers will become conscious and as intelligent as humans sometime this century, maybe even in less than two or three decades. So machines will have achieved in about a century what took humans millions of years. Machine intelligence will continue to improve, and very quickly we will find ourselves sharing the Earth with a form of intelligence far superior to our own. Third, machines can leverage their brainpower hugely by linking together. Humans can’t directly link their brains and must communicate with others by tedious written, visual, or aural messaging.

Some world-famous visionaries have sounded strong warnings about AI. Elon Musk, the billionaire entrepreneur and co-founder of PayPal, Tesla Motors, and SpaceX, warned that by developing AI we could be “summoning the demon.” The risk is that as scientists relentlessly improve the capabilities of AI systems, at some indeterminate point they may set off an unstoppable chain reaction in which the machines wrest control from their creators. In April 2015, Stephen Hawking, the renowned theoretical physicist, cosmologist, and author, gave a stark warning: “The development of full artificial intelligence could spell the end of the human race.” Luke Muehlhauser, director of MIRI (the Machine Intelligence Research Institute), was quoted in the Financial Times as saying that by building AI “we’re toying with the intelligence of the gods and there is no off switch.” Yet we seem to be willing to take the risk.

Perhaps most people are not too concerned because consciousness is such a nebulous concept. Even scientists working with AI may be working in the dark. We all know humans have consciousness, but nobody, not even the brightest minds, understands what it is. So we can only speculate about how or when machines might get it, if ever. Some scientists believe that when machines acquire the level of thinking power similar to that of the human brain, machines will be conscious and self-aware. In other words, those scientists believe that our consciousness is purely a physical phenomenon – a function of our brain’s complexity.

For millions of years, human beings have dominated the Earth and all other species on it. That didn’t happen because we are the largest, or the strongest, but because we are the most intelligent by far. If machines become more intelligent, we could well end up as their slaves. Worse still, they might regard us as surplus to their needs and annihilate us. That doomsday scenario has been predicted by countless science fiction writers.

Should we heed their prophetic vision, given that most of today’s advanced technology was once science fiction?
Or do we have nothing to worry about?

For more on this subject, read Nick Bostrom’s highly recommended book, Superintelligence, listed in our books section.

Superintelligence by Nick Bostrom

The human brain has some capabilities that the brains of other animals lack. It is to these distinctive capabilities that our species owes its dominant position. Other animals have stronger muscles or sharper claws, but we have cleverer brains.

If machine brains one day come to surpass human brains in general intelligence, then this new superintelligence could become very powerful. As the fate of the gorillas now depends more on us humans than on the gorillas themselves, so the fate of our species then would come to depend on the actions of the machine superintelligence.

But we have one advantage: we get to make the first move. Will it be possible to construct a seed AI or otherwise to engineer initial conditions so as to make an intelligence explosion survivable? How could one achieve a controlled detonation?

To get closer to an answer to this question, we must make our way through a fascinating landscape of topics and considerations. Read the book and learn about oracles, genies, singletons; about boxing methods, tripwires, and mind crime; about humanity’s cosmic endowment and differential technological development; indirect normativity, instrumental convergence, whole brain emulation and technology couplings; Malthusian economics and dystopian evolution; artificial intelligence, and biological cognitive enhancement, and collective intelligence.

This profoundly ambitious and original book picks its way carefully through a vast tract of forbiddingly difficult intellectual terrain. Yet the writing is so lucid that it somehow makes it all seem easy. After an utterly engrossing journey that takes us to the frontiers of thinking about the human condition and the future of intelligent life, we find in Nick Bostrom’s work nothing less than a reconceptualisation of the essential task of our time.

Stanford’s Open Course on Natural Language Processing (NLP)

If you are interested in taking Stanford’s Open Course on Natural Language Processing (NLP), Coursera (coursera.org) has made the full course available on YouTube as 101 video lessons.

The full Stanford NLP Open Course can be found via the following YouTube playlist: https://www.youtube.com/playlist?list=PL4LJlvG_SDpxQAwZYtwfXcQr7kGnl9W93


Presented by Professor Dan Jurafsky & Chris Manning (nlp-class.org), the Natural Language Processing (NLP) course contains the following lessons:

1 – 1 – Course Introduction
2 – 1 – Regular Expressions
2 – 2 – Regular Expressions in Practical NLP
2 – 3 – Word Tokenization
2 – 4 – Word Normalization and Stemming
2 – 5 – Sentence Segmentation
3 – 1 – Defining Minimum Edit Distance
3 – 2 – Computing Minimum Edit Distance
3 – 3 – Backtrace for Computing Alignments
3 – 4 – Weighted Minimum Edit Distance
3 – 5 – Minimum Edit Distance in Computational Biology
4 – 1 – Introduction to N-grams
4 – 2 – Estimating N-gram Probabilities
4 – 3 – Evaluation and Perplexity
4 – 4 – Generalization and Zeros
4 – 5 – Smoothing: Add-One
4 – 6 – Interpolation
4 – 7 – Good-Turing Smoothing
4 – 8 – Kneser-Ney Smoothing
5 – 1 – The Spelling Correction Task
5 – 2 – The Noisy Channel Model of Spelling
5 – 3 – Real-Word Spelling Correction
5 – 4 – State of the Art Systems
6 – 1 – What is Text Classification?
6 – 2 – Naive Bayes
6 – 3 – Formalizing the Naive Bayes Classifier
6 – 4 – Naive Bayes: Learning
6 – 5 – Naive Bayes: Relationship to Language Modeling
6 – 6 – Multinomial Naive Bayes: A Worked Example
6 – 7 – Precision, Recall, and the F Measure
6 – 8 – Text Classification: Evaluation
6 – 9 – Practical Issues in Text Classification
7 – 1 – What is Sentiment Analysis?
7 – 2 – Sentiment Analysis: A Baseline Algorithm
7 – 3 – Sentiment Lexicons
7 – 4 – Learning Sentiment Lexicons
7 – 5 – Other Sentiment Tasks
8 – 1 – Generative vs. Discriminative Models
8 – 2 – Making Features from Text for Discriminative NLP Models
8 – 3 – Feature-Based Linear Classifiers
8 – 4 – Building a Maxent Model: The Nuts and Bolts
8 – 5 – Generative vs. Discriminative Models: The Problem of Overcounting Evidence
8 – 6 – Maximizing the Likelihood
9 – 1 – Introduction to Information Extraction
9 – 2 – Evaluation of Named Entity Recognition
9 – 3 – Sequence Models for Named Entity Recognition
9 – 4 – Maximum Entropy Sequence Models
10 – 1 – What is Relation Extraction?
10 – 2 – Using Patterns to Extract Relations
10 – 3 – Supervised Relation Extraction
10 – 4 – Semi-Supervised and Unsupervised Relation Extraction
11 – 1 – The Maximum Entropy Model Presentation
11 – 2 – Feature Overlap / Feature Interaction
11 – 3 – Conditional Maxent Models for Classification
11 – 4 – Smoothing / Regularization / Priors for Maxent Models
12 – 1 – An Intro to Parts of Speech and POS Tagging
12 – 2 – Some Methods and Results on Sequence Models for POS Tagging
13 – 1 – Syntactic Structure: Constituency vs. Dependency
13 – 2 – Empirical / Data-Driven Approach to Parsing
14 – 1 – Instructor Chat
15 – 1 – CFGs and PCFGs
15 – 2 – Grammar Transforms
15 – 3 – CKY Parsing
15 – 4 – CKY Example
15 – 5 – Constituency Parser Evaluation
16 – 1 – Lexicalization of PCFGs
16 – 2 – Charniak’s Model
16 – 3 – PCFG Independence Assumptions
16 – 4 – The Return of Unlexicalized PCFGs
16 – 5 – Latent Variable PCFGs
17 – 1 – Dependency Parsing Introduction
17 – 2 – Greedy Transition-Based Parsing
17 – 3 – Dependencies Encode Relational Structure
18 – 1 – Introduction to Information Retrieval
18 – 2 – Term-Document Incidence Matrices
18 – 3 – The Inverted Index
18 – 4 – Query Processing with the Inverted Index
18 – 5 – Phrase Queries and Positional Indexes
19 – 1 – Introducing Ranked Retrieval
19 – 2 – Scoring with the Jaccard Coefficient
19 – 3 – Term Frequency Weighting
19 – 4 – Inverse Document Frequency Weighting
19 – 5 – TF-IDF Weighting
19 – 6 – The Vector Space Model
19 – 7 – Calculating TF-IDF Cosine Scores
19 – 8 – Evaluating Search Engines
20 – 1 – Word Senses and Word Relations
20 – 2 – WordNet and Other Online Thesauri
20 – 3 – Word Similarity and Thesaurus Methods
20 – 4 – Word Similarity: Distributional Similarity I
20 – 5 – Word Similarity: Distributional Similarity II
21 – 1 – What is Question Answering?
21 – 2 – Answer Types and Query Formulation
21 – 3 – Passage Retrieval and Answer Extraction
21 – 4 – Using Knowledge in QA
21 – 5 – Advanced: Answering Complex Questions
22 – 1 – Introduction to Summarization
22 – 2 – Generating Snippets
22 – 3 – Evaluating Summaries: ROUGE
22 – 4 – Summarizing Multiple Documents
23 – 1 – Instructor Chat II

[Stanford NLP Open Course video playlist: https://www.youtube.com/playlist?list=PL4LJlvG_SDpxQAwZYtwfXcQr7kGnl9W93]

An introduction to TensorFlow: Open source machine learning

TensorFlow is an open source software library for numerical computation using data flow graphs. It was originally developed by researchers and engineers working on the Google Brain team within Google’s Machine Intelligence research organisation to conduct machine learning and deep neural network research.
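
To make the “data flow graph” idea concrete, here is a minimal sketch in the TensorFlow 1.x-style API (the values and names are illustrative only): you first describe a graph of operations, and only then execute it in a session.

```python
# Minimal sketch of TensorFlow's data-flow-graph style (TensorFlow 1.x API).
import tensorflow as tf

# Nodes in the graph: two constant tensors and a matrix-multiply op.
a = tf.constant([[1.0, 2.0]])       # 1x2 matrix
b = tf.constant([[3.0], [4.0]])     # 2x1 matrix
product = tf.matmul(a, b)           # op node; nothing is computed yet

# A Session executes the graph and returns concrete values.
with tf.Session() as sess:
    print(sess.run(product))        # [[11.]]
```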

To learn more about TensorFlow, visit tensorflow.org

Human Immortality Through AI

Will AI research some day lead to human immortality? A few groups and companies believe that one day the human race will merge with artificially intelligent machines.

One approach is the one Ray Kurzweil describes in his 2005 book The Singularity Is Near. Kurzweil envisions a world where humans transcend biology by implanting AI nanobots directly into the neural networks of the brain. The futurist and inventor also predicts that, as a result, humans will develop emotions and characteristics of higher complexity. Kurzweil’s prediction is that this will happen around the year 2030. That’s less than 15 years away!

Another path to human immortality could be uploading your mind’s data, a ‘mind-file’, into a database, to be downloaded later into an AI brain that continues life as you. This is what Humai, a company based in Los Angeles, claims to be working on (though I’m still not 100% convinced that the recent blanket media coverage of Humai isn’t part of an elaborate PR stunt for a new Hollywood movie!). Humai’s current website meta title reads: ‘Humai Life: Extended | Enhanced | Restored’. Its mission statement sounds very bold and ambitious for our times, and headlines like ‘Humai wants to resurrect the dead with artificial intelligence’ do not help, but the AI tech start-up does make a point of saying that the AI technology for the mind-restoration part of the process will not be ready for another 30 years.

But do we really want to live forever, and does anyone outside AI research even care? In the heart of Silicon Valley, hedge fund manager Joon Yun has been running calculations on US social security data. Yun says, “the probability of a 25-year-old dying before their 26th birthday is 0.1%”. If we could keep that risk constant throughout life instead of letting it rise with age-related disease, the average person would, statistically speaking, live 1,000 years. In December 2014, Yun announced a $1m prize fund to challenge scientists to “hack the code of life” and push the human lifespan past its apparent maximum of around 120 years.
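
Yun’s figure is easy to sanity-check. Under the simplifying assumption of a constant annual risk of death, remaining lifespan follows a geometric distribution whose mean is simply 1/p, which is where the 1,000-year estimate comes from; the tiny sketch below (constant-risk model, illustrative only) does the arithmetic.

```python
# Back-of-the-envelope check of the "1,000 years" claim, assuming the annual
# risk of death stayed fixed at 0.1% for life (a simplification, not a forecast).
p = 0.001                  # annual probability of dying (constant, assumed)
expected_years = 1 / p     # mean of a geometric distribution with parameter p
print(expected_years)      # 1000.0 additional years, statistically speaking
```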

Like it or not, in one way or another, human immortality is probably something that is going to happen, unless, that is, the invention of AI and the singularity renders us extinct first!

Facebook AI Research – Machine Intelligence Roadmap

Abstract

The development of intelligent machines is one of the biggest unsolved challenges in computer science. In this paper, we propose some fundamental properties these machines should have, focusing in particular on communication and learning. We discuss a simple environment that could be used to incrementally teach a machine the basics of natural-language-based communication, as a prerequisite to more complex interaction with human users. We also present some conjectures on the sort of algorithms the machine should support in order to profitably learn from the environment.

Tomas Mikolov, Armand Joulin, Marco Baroni
Facebook AI Research

1 Introduction

A machine capable of performing complex tasks without requiring laborious programming would be tremendously useful in almost any human endeavour, from performing menial jobs for us to helping the advancement of basic and applied research. Given the current availability of powerful hardware and large amounts of machine-readable data, as well as the widespread interest in sophisticated machine learning methods, the times should be ripe for the development of intelligent machines.

Yet genuinely intelligent machines have not materialised. We think that one fundamental reason for this is that, since “solving AI” seems too complex a task to be pursued all at once, the computational community has preferred to focus, in recent decades, on solving relatively narrow empirical problems that are important for specific applications but do not address the overarching goal of developing general-purpose intelligent machines.

In this article, we propose an alternative approach: we first define the general characteristics we think intelligent machines should possess, and then we present a concrete roadmap for developing them in realistic, small steps that are nonetheless structured so that, jointly, they should lead us close to the ultimate goal of implementing a powerful AI. We realise that our vision of artificial intelligence and how to create it is just one among many. We focus here on a plan that, we hope, will lead to genuine progress, without implying that there are no other valid approaches to the task.

The article is structured as follows. In Section 2 we indicate the two fundamental characteristics that we consider crucial for developing intelligence (at least the sort of intelligence we are interested in), namely communication and learning. Our goal is to build a machine that can learn new concepts through communication at a rate similar to that of a human with similar prior knowledge. That is, if one can easily learn how subtraction works after mastering addition, the intelligent machine, after grasping the concept of addition, should not find it difficult to learn subtraction as well.

Since, as we said, achieving the long-term goal of building an intelligent machine equipped with the desired features at once seems too difficult, we need to define intermediate targets that can lead us in the right direction. We specify such targets in terms of simplified but self-contained versions of the final machine we want to develop. Our plan is to “educate” the target machine like a child: At any time in its development, the target machine should act like a stand-alone intelligent system, albeit one that will be initially very limited in what it can do. The bulk of our proposal (Section 3) thus consists in the plan for an interactive learning environment fostering the incremental development of progressively more intelligent behaviour.

Section 4 briefly discusses some of the algorithmic capabilities we think a machine should possess in order to profitably exploit the learning environment. Finally, Section 5 situates our proposal in the broader context of past and current attempts to develop intelligent machines.

Download the full paper here

Deep Grammar will correct your grammatical errors using AI

Deep Grammar is a grammar checker built on top of deep learning. It uses deep learning to learn a model of language, then uses that model to check text for errors in three steps:

  1. Compute the likelihood that someone would have intended to write the text.
  2. Attempt to generate text that is close to the written text but is more likely.
  3. If such text is found, show it to the user as a possible correction.

To see Deep Grammar in action, consider the sentence “I will tell he the truth.” Deep Grammar calculates that this sentence is unlikely and tries to come up with a sentence that is close to it but more likely. It finds “I will tell him the truth.” Since this sentence is both likely and close to the original, it is suggested to the user as a correction.
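
The three steps above can be mimicked with any language model that assigns likelihoods to text. The sketch below is not Deep Grammar’s code; it stands in a toy bigram-count “model” and a hand-made confusion set (both assumptions) purely to show the generate-and-rescore loop.

```python
# Toy illustration of the likelihood / generate / compare loop described above.
# The bigram counts and confusion set are made up; a real system would use a
# deep language model and a much richer space of candidate edits.
from itertools import product

BIGRAMS = {("i", "will"): 90, ("will", "tell"): 60, ("tell", "him"): 50,
           ("tell", "he"): 1, ("him", "the"): 40, ("he", "the"): 2,
           ("the", "truth"): 70}

def score(words):
    """Step 1: a likelihood proxy - the sum of bigram counts (higher = more likely)."""
    return sum(BIGRAMS.get(pair, 0) for pair in zip(words, words[1:]))

CONFUSIONS = {"he": ["he", "him"], "him": ["him", "he"]}   # toy edit space

def correct(sentence):
    words = sentence.lower().split()   # case handling omitted for brevity
    # Step 2: generate nearby candidate sentences by swapping confusable words.
    options = [CONFUSIONS.get(w, [w]) for w in words]
    candidates = [" ".join(c) for c in product(*options)]
    # Step 3: suggest a candidate only if it scores better than the original.
    best = max(candidates, key=lambda c: score(c.split()))
    return best if score(best.split()) > score(words) else sentence

print(correct("I will tell he the truth"))   # -> "i will tell him the truth"
```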

Here are some other examples of sentences with the corrections found by Deep Grammar:

  • We know that our brains our not perfect. –> We know that our brains are not perfect.
  • Have your ever wondered about it? –> Have you ever wondered about it?
  • To bad the development has stopped. –> Too bad the development has stopped.

You can find a quantitative evaluation of Deep Grammar here.

The Lovelace 2.0 Test of Artificial Creativity and Intelligence

The Lovelace 2.0 Test asks whether a computer can create an artefact – a poem, story, painting or architectural design – that expert and unbiased observers would conclude was designed by a human.

Prof. Mark Riedl proposes the concept of artificial creativity, akin to artificial intelligence. This could be tested, he says, using an alternative to the Turing Test, the AI benchmark which asserts that if a computer system can fool a human being into thinking it is itself human, then it can be said to be truly intelligent.

You can read Prof. Riedl’s paper below:

Abstract

Observing that the creation of certain types of artistic artefacts necessitates intelligence, we present the Lovelace 2.0 Test of creativity as an alternative to the Turing Test as a means of determining whether an agent is intelligent.

The Lovelace 2.0 Test builds off prior tests of creativity and additionally provides a means of directly comparing the relative intelligence of different agents.

Mark O. Riedl
School of Interactive Computing; Georgia Institute of Technology
riedl@cc.gatech.edu

Download the full paper here

Facebook Bolsters AI Research Team

[Announcement from Facebook AI Research]

The Facebook AI Research team is excited to announce new additions joining from both academia and industry. The newest members of this quickly growing team include the award-winning Léon Bottou and Laurens van der Maaten. Their work will focus on several aspects of machine learning, with applications to image, speech, and natural language understanding.

Léon Bottou joins us from Microsoft Research. After his PhD in Paris, he held research positions at AT&T Bell Laboratories, AT&T Labs-Research and NEC Labs. He is best known for his pioneering work on machine learning, structured prediction, stochastic optimization, and image compression. More recently, he worked on causal inference in learning systems. He is rejoining some of his long-time collaborators: Jason Weston, Ronan Collobert, Antoine Bordes and Yann LeCun; it was with LeCun that he developed the widely used DjVu compression technology and the AT&T check-reading system. Léon is a laureate of the 2007 Blavatnik Award for Young Scientists.

Nicolas Usunier was most recently a professor at Université de Technologie de Compiègne and also held a chair position from the “CNRS-Higher Education Chairs” program. Nicolas earned his PhD in machine learning in 2006 with specific focus areas in theory, ranking, and learning with multiple objectives. At FAIR he will work on text understanding tasks, especially question answering, and on the design of composite objective functions that can define complex learning problems from simpler ones.

Anitha Kannan comes to us from Microsoft Research where she worked on various applications in computer vision, Web and e-Commerce search, linking structured and unstructured data sources and computational education. Anitha received her PhD from the University of Toronto and will continue her research in machine learning and computer vision.

Laurens van der Maaten comes to us with an extensive history working on machine learning and computer vision. Prior to joining Facebook, Laurens was an Assistant Professor at Delft University of Technology, a post-doctoral researcher at UC San Diego and a Ph.D. student at Tilburg University. He will continue his research on learning embeddings for visualization and deep learning, time series classification, regularization, and cost-sensitive learning.

Michael Auli joins FAIR after completing a postdoc at Microsoft Research where he worked on improving language translation quality using recurrent neural networks. He earned a Ph.D. at the University of Edinburgh for his work on syntactic parsing with approximate inference.

Gabriel Synnaève was most recently a postdoctoral fellow at École Normale Supérieure in Paris. Prior to that, he received his PhD from the University of Grenoble in 2012 for Bayesian modeling applied to real-time strategy game AI. Gabriel will initially be working on speech recognition and language understanding.

We have now hired more than 40 people across our Menlo Park and New York labs, including some of the top AI researchers and engineers in the world. These new hires underscore our commitment to advancing the field of machine intelligence and developing technologies that give people better ways to communicate.

[End of announcement]

Alternative structures for character-level RNNs

Abstract

Recurrent neural networks are convenient and efficient models for language modeling. However, when applied on the level of characters instead of words, they suffer from several problems. In order to successfully model long-term dependencies, the hidden representation needs to be large. This in turn implies higher computational costs, which can become prohibitive in practice. We propose two alternative structural modifications to the classical RNN model. The first one consists of conditioning the character-level representation on the previous word representation. The other one uses the character history to condition the output probability. We evaluate the performance of the two proposed modifications on challenging, multilingual real-world data.
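
As a rough illustration of the first modification (conditioning the character-level state on an embedding of the previous word), here is a small numpy sketch; the dimensions, random initialisation, and exact update rule are illustrative assumptions rather than the paper’s formulation.

```python
# Character-level RNN step whose hidden state is also conditioned on an
# embedding of the previous word (illustrative sketch, untrained weights).
import numpy as np

rng = np.random.default_rng(0)
n_chars, n_words, d_char, d_word, d_hid = 50, 1000, 16, 32, 64

E_char = rng.normal(0, 0.1, (n_chars, d_char))   # character embeddings
E_word = rng.normal(0, 0.1, (n_words, d_word))   # word embeddings
W_ch = rng.normal(0, 0.1, (d_hid, d_char))
W_wh = rng.normal(0, 0.1, (d_hid, d_word))
W_hh = rng.normal(0, 0.1, (d_hid, d_hid))
W_out = rng.normal(0, 0.1, (n_chars, d_hid))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def step(char_id, prev_word_id, h):
    """One step: mix the current character, the recurrent state, and the
    previous word's embedding (the conditioning described in the abstract)."""
    h = np.tanh(W_ch @ E_char[char_id] + W_hh @ h + W_wh @ E_word[prev_word_id])
    return h, softmax(W_out @ h)          # next-character distribution

h = np.zeros(d_hid)
for c in [3, 7, 1]:                       # toy character ids within one word
    h, p_next = step(c, prev_word_id=42, h=h)
print(p_next.shape)                       # (50,) probabilities over characters
```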

Piotr Bojanowski ∗
INRIA
Paris, France
piotr.bojanowski@inria.fr

Armand Joulin and Tomas Mikolov
Facebook AI Research
New York, NY, USA
tmikolov.ajoulin@fb.com

Download the full paper here