DeepText: Facebook’s text understanding engine
Purpose of developing DeepText
Understanding the various ways text is used on Facebook can help us improve people’s experiences with our products, whether we’re surfacing more of the content that people want to see or filtering out undesirable content like spam.
The community on Facebook is truly global, so it’s important for DeepText to understand as many languages as possible.
How do we make machines understand human language?
Using deep learning, we can reduce the reliance on language-dependent knowledge, since the system can learn from text with little or no preprocessing. This helps us cover multiple languages quickly, with minimal engineering effort.
To get closer to how humans understand text, we need to teach the computer to understand things like slang and word-sense disambiguation. As an example, if someone says, “I like blackberry,” does that mean the fruit or the device?
What happens inside DeepText?
In traditional NLP approaches, words are converted into a format that a computer algorithm can learn. The word “brother” might be assigned an integer ID such as 4598, while the word “bro” becomes another integer, like 986665. This representation requires each word to appear with its exact spelling in the training data in order to be understood.
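A minimal sketch of this traditional approach (not DeepText's actual code): a vocabulary table that assigns each distinct spelling its own integer ID, so related spellings end up with unrelated numbers.

```python
# Traditional bag-of-words style lookup: every distinct spelling
# gets the next free integer ID, with no notion of similarity.
vocab = {}

def word_id(word):
    """Return the integer ID for a spelling, assigning one if unseen."""
    if word not in vocab:
        vocab[word] = len(vocab)
    return vocab[word]

print(word_id("brother"))  # -> 0
print(word_id("bro"))      # -> 1
# The IDs carry no semantics: "brother" and "bro" are as unrelated
# as any two random words, and an unseen spelling is simply unknown.
```

The IDs here are arbitrary (the 4598 and 986665 in the text are equally arbitrary); the point is that nothing about the numbers encodes meaning.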
With deep learning, we can instead use “word embeddings,” a mathematical concept that preserves the semantic relationship among words. So, when calculated properly, we can see that the word embeddings of “brother” and “bro” are close in space. This type of representation allows us to capture the deeper semantic meaning of words.
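To make "close in space" concrete, here is a toy illustration using cosine similarity. The vectors below are made up for this example; real embeddings are learned from data and typically have hundreds of dimensions.

```python
import math

# Hypothetical 3-d embeddings, chosen only to illustrate the idea that
# semantically related words sit near each other in the vector space.
embeddings = {
    "brother": [0.90, 0.10, 0.30],
    "bro":     [0.85, 0.15, 0.25],
    "table":   [-0.20, 0.80, -0.50],
}

def cosine(u, v):
    """Cosine similarity: 1.0 means same direction, 0 means unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine(embeddings["brother"], embeddings["bro"]))    # close to 1
print(cosine(embeddings["brother"], embeddings["table"]))  # much lower
```

Because similarity is measured geometrically rather than by exact spelling, "bro" can be understood even if it is rare in training data, as long as its vector lands near "brother".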
Using word embeddings, we can also understand the same semantics across multiple languages, despite differences in the surface form. As an example, for English and Spanish, “happy birthday” and “feliz cumpleaños” should be very close to each other in the common embedding space. By mapping words and phrases into a common embedding space, DeepText is capable of building models that are language-agnostic.
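The cross-lingual idea can be sketched the same way: in a shared embedding space, a nearest-neighbor lookup pairs phrases with the same meaning across languages. The vectors below are invented for illustration; a real system would learn a common space from multilingual data.

```python
import math

# Toy shared embedding space (hypothetical vectors): phrases with the
# same meaning land near each other regardless of surface language.
shared_space = {
    "happy birthday":   [0.80, 0.60, 0.10],
    "feliz cumpleaños": [0.78, 0.62, 0.12],
    "good morning":     [0.10, 0.20, 0.90],
    "buenos días":      [0.12, 0.18, 0.88],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def nearest(phrase):
    """Return the closest other phrase in the shared space."""
    return max((p for p in shared_space if p != phrase),
               key=lambda p: cosine(shared_space[phrase], shared_space[p]))

print(nearest("happy birthday"))  # -> "feliz cumpleaños"
```

A model trained on vectors in such a space never sees the language label at all, which is what makes it language-agnostic.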
From here, we can see that Facebook intends to further develop natural language processing (enabling machines to communicate with humans in human language), combined with statistical machine translation (translating and processing multiple languages at the machine level), followed by neural-network classification of whether a user's post is a plain statement or a sales intent (that is, an offer to buy or sell something).
We shall see whether the following challenges in natural language processing can be tackled with solutions; good luck to DeepText.
- Ambiguity (for example, whether “blackberry” means the fruit or the device)
- Metonymy (for example, in “Play Mozart,” “Mozart” stands for the composer’s music, not the person)