A large language model (LLM) is a computational model designed to perform natural language processing tasks, especially language generation, using statistical patterns learned from large text corpora. LLMs can generate, summarize, translate and parse text in many contexts, and are a foundational technology behind modern chatbots. LLMs can produce text that resembles natural language patterns because they are trained on collections of human-written text. For the same reason, biased or inaccurate training data can make an LLM's output less reliable.
As of 2024, the largest and most capable LLMs are all based on transformer architectures, which, according to the 2017 paper Attention Is All You Need, can be more efficient and parallelizable than earlier statistical and recurrent neural network models. Research into other architectures, such as state space models, is ongoing.
Benchmark evaluations for LLMs attempt to measure model reasoning, factual accuracy, alignment, and safety.
View More On Wikipedia.org