Sentient or just really smart?

The original article was published by Tata Consultancy Services. You can find the article here.

How are language models structured?

Developments in AI language models have increased concerns about machine sentience, particularly after the Google incident in which an engineer was fired after raising concerns over LaMDA's sentience. LaMDA is Google's internal language model and, much like other proprietary models, uses smart pattern matching to generate output. Whether machine sentience is even viable in the near term is debatable, but assessing the claim calls for a quick primer on how language models are structured and what factors determine a machine's level of intelligence.


Natural language processing mimics human conversations

Language models are one of the most important building blocks of Natural Language Processing (NLP) applications and are used to generate text output. These predictive AI models use probabilities to deliver the most human-like output, one that mimics realistic conversations. The goal of any language model is to find patterns in human communication and use them to deliver a specific output. The level of accuracy depends on the core language model deployed, the algorithm's structure, the data sets, and the computational power used.

Several proprietary language models have been designed to achieve predetermined objectives such as speech recognition, machine translation, sentiment analysis, and text suggestions. However, these are built on core models, which can be categorized into n-gram-based and neural language models.


From keyword-, to phrase-, to sentiment-based

Stanford classifies language models into two types: unigram and bigram. The core distinction is how much of the word sequence is analyzed at a time. As the names imply, a unigram model analyzes text one word at a time, a bigram model analyzes two-word sequences, a trigram model three-word sequences, and so on. This gradual improvement has moved AI chatbot responses from keyword-based to phrase-based, and they are now being developed further to become sentiment-based.
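To make the unigram, bigram, and trigram distinction concrete, here is a minimal Python sketch that splits a sentence into n-word sequences. The function name `ngrams` and the example sentence are illustrative assumptions, not taken from the original article, and splitting on whitespace is a simplification of real tokenization.

```python
def ngrams(text, n):
    """Return every run of n consecutive words in text, as tuples."""
    # Simplifying assumption: words are separated by whitespace.
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


sentence = "the cat sat on the mat"
unigrams = ngrams(sentence, 1)  # one word at a time
bigrams = ngrams(sentence, 2)   # two-word sequences
trigrams = ngrams(sentence, 3)  # three-word sequences

print(unigrams)
print(bigrams)
print(trigrams)
```

A six-word sentence yields six unigrams, five bigrams, and four trigrams. The longer the sequence, the more context each unit carries, which is what moves a system from matching keywords to matching phrases.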

Language models can also be classified by how they operate: statistical and neural. Statistical language models are predictive AI models, such as the unigram, bigram, and exponential models, that use probabilities and, where applicable, the preceding words to deliver an output. Because these models rely on counts and probability calculations over short word sequences, they fail to capture the entire essence of a conversation. Therefore, to humanize responses, the neural language model was developed.
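As a rough illustration of how a statistical model uses the preceding word and probabilities, the sketch below builds a bigram model from a toy corpus by counting which words follow which. The corpus, the function names, and the simple count-based (maximum-likelihood) estimate are assumptions chosen for illustration; production systems add smoothing and far more data.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count how often each word follows each preceding word."""
    follow_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follow_counts[prev][nxt] += 1
    return follow_counts


def next_word_probabilities(model, prev):
    """Estimate P(next word | previous word) from the raw counts."""
    counts = model.get(prev, Counter())
    total = sum(counts.values())
    if total == 0:
        return {}  # the model has never seen this word followed by anything
    return {word: count / total for word, count in counts.items()}


# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
model = train_bigram_model(toy_corpus)
probs = next_word_probabilities(model, "the")
print(max(probs, key=probs.get), probs)
```

In this toy corpus "cat" follows "the" in two of six cases, so it is the most likely next word. A word the model has never seen returns an empty result, which is the kind of gap that motivates neural models.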

With a three-layered feedforward specialized network topology, the spiking neural network (SNN) is one of the most powerful neural networks that can process temporal data in real time. This high computational power and advanced topology make it suitable for robotics and computer vision applications that require real-time data processing.

SNN facilitates real-time sourcing and processing of the data and is a major improvement over other neural networks, which primarily rely on frequency rather than temporal data.



Getting closer to human emulation

Neural language models parameterize words, representing each one as a vector of learned values, to overcome the sparsity issue of count-based models, in which many valid word sequences never appear in the training data. This predictive AI model prioritizes a learned probability distribution over raw sequence counts and is far more accurate in delivering relevant output. It is therefore widely used for machine translation, language generation, and dialogue systems. However, neural language models require more time to train and can be complex to implement.
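The idea of parameterizing words can be sketched in a few lines of plain Python. In the hypothetical example below, each word in a tiny vocabulary is a vector of numbers, and a softmax turns scores into a probability distribution over the next word. The vocabulary, vector size, and random untrained weights are all assumptions for illustration; a real model such as GPT-2 learns its parameters from data and uses a far deeper architecture.

```python
import math
import random

# Hypothetical five-word vocabulary, for illustration only.
vocab = ["the", "cat", "sat", "on", "mat"]
word_to_id = {word: i for i, word in enumerate(vocab)}
embed_dim = 4

rng = random.Random(0)
# Each word is a dense vector of parameters (random here, i.e. untrained).
embeddings = [[rng.uniform(-1, 1) for _ in range(embed_dim)] for _ in vocab]
# Output layer: one weight vector per candidate next word.
output_weights = [[rng.uniform(-1, 1) for _ in range(embed_dim)] for _ in vocab]


def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_word_distribution(prev_word):
    """Score every vocabulary word against the previous word's vector."""
    vec = embeddings[word_to_id[prev_word]]
    scores = [sum(w * x for w, x in zip(row, vec)) for row in output_weights]
    return dict(zip(vocab, softmax(scores)))


dist = next_word_distribution("cat")
print(dist)
```

Every word receives a nonzero probability, even for pairs never observed together, which is how this approach sidesteps sparsity. Training would adjust the vectors so that likely continuations score higher.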

Two popular neural language models are OpenAI's GPT-2 and its successor, GPT-3. GPT-2 uses 1.5 billion parameters, and the more recent GPT-3 uses 175 billion. These models are getting closer to emulating human communication styles, but machine sentience is out of the question.

Below is an overview of the sentiment analysis model, a neural language model and the starting point for the concerns about AI sentience.


Deep learning for subjectivity

This language model derives its name from its pattern recognition technique; despite the similar-sounding name, sentiment must not be confused with sentience. Also referred to as opinion mining, the sentiment analysis model uses deep learning techniques to identify subjective opinions through smart pattern matching. This is how LaMDA comprehended queries and responded to them: by drawing and combining text from the large data set on which it was trained.
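To show what identifying opinions through pattern matching means in the simplest case, here is a deliberately basic Python sketch. It is a word-list lookup, not the deep learning technique described above, and the word lists and function name are assumptions made up for illustration. It shows only that such a system matches patterns in text rather than feeling anything.

```python
# Hypothetical word lists; a real system would learn such signals from data.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "sad"}


def sentiment(text):
    """Label text by counting positive words against negative words."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment("I love this great phone!"))
print(sentiment("Terrible battery, I hate it."))
print(sentiment("It arrived on Tuesday."))
```

A deep learning model replaces the fixed word lists with patterns learned from large volumes of labeled text, but the principle is the same: the label comes from statistical regularities in the input.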

Machine sentience is a very long way away

Machine sentience is a misconception and not an AI concern per se, because machines are incapable of sensing or responding to feelings. The output is derived from the data the model is fed, so the Garbage In, Garbage Out (GIGO) concept applies. Most conclusions about sentience are drawn from the Turing test, which is now outdated. Proposals have been made to replace it with more advanced alternatives.

As current language models train on large data sets and use sophisticated algorithms, passing the Turing test does not confirm sentience. This longstanding yardstick involves a human initiating a text conversation on one side, with multiple other humans and a machine on the other side. If the responses generated by the machine cannot be distinguished from those of the other humans, the machine is said to have passed the Turing test. This was the source of confusion in the LaMDA case.

Machine sentience is a distant dream because humans decide how a language model works and define which datasets it must train on. Being fed large datasets and powered by an intelligent algorithm does not make a machine sentient. It is still controlled by humans, who can turn it off anytime they want to.
