Safekipedia

Formal language

Adapted from Wikipedia Β· Discoverer experience

In logic, mathematics, computer science, and linguistics, a formal language is a special kind of language made up of strings of symbols. These symbols come from a set known as an "alphabet". When these symbols are put together in certain ways, they form strings, also called "words".

Formal languages are important in computer science because they help define how programming languages work. They are also used in studying how hard it is for computers to understand and process information. In mathematics and logic, formal languages help us understand the rules and structure of mathematical systems.

The study of formal languages looks mostly at their patterns and structures, without worrying too much about their meanings. This idea came from the study of natural languages, helping us see the rules that govern how words and sentences are built.

History

In the 1600s, a thinker named Gottfried Leibniz dreamed up a special way to write ideas using simple pictures. Later, a scientist named Carl Friedrich Gauss studied special codes called Gauss codes.

In the 1800s, George Boole showed how to use symbols to describe logical thinking, like using math to solve puzzles. Another thinker, Gottlob Frege, tried to create a clear system of symbols to express ideas.

In the early 1900s, many smart people worked on formal languages. Axel Thue wrote about words and patterns, and Leonardo Torres Quevedo made a system to describe machines. Later, Noam Chomsky created a way to organize different kinds of languages, and John Backus developed a method to describe computer programming languages.

Words over an alphabet

Main article: Alphabet (formal languages)

In the study of formal languages, an alphabet is a collection of symbols or letters. These symbols can be used to create words by putting them together in sequences. A word is simply a finite line-up of these symbols from the alphabet.

For any alphabet, there is one special word with no symbols at all, called the empty word. When we join two words together, we get a new word that has just as many symbols as the two original words combined.

Definition

A formal language is a collection of special strings made from a set of symbols called an "alphabet." These strings are formed by putting the symbols together in specific ways. For example, if the alphabet has the symbols A, B, and C, then "ABC" and "BAA" might be strings that belong to the language, while "CAD" might not.

In subjects like computer science and math, people often just say "language" instead of "formal language" when the set of symbols is clear. Some formal languages follow special rules, like regular languages or context-free languages, but the basic idea is simply a group of strings made from a given alphabet.

Examples

This section talks about a special kind of language used in math and computers. Imagine you have a set of symbols, like the numbers 0 to 9, plus the symbols "+" and "=". These symbols can be put together in certain ways to make strings, or words.

Some rules decide which strings belong to this special language. For example, the string "23+4=555" follows the rules, but "=234+" does not. These rules only describe how the strings look, not what they mean. For instance, the rule does not say that "0" means the number zero or that "+" means addition.

Constructions

For small sets of words, we can list them all. But for bigger sets, there are too many words to list. Instead, we use special ways to describe them. For example, we might say a language includes all words made from the symbol "a", like "a", "aa", "aaa", and so on. Other examples include the correct programs in a programming language or all possible inputs that a computer can understand.

Language-specification formalisms

Formal languages are important tools in many areas of study. They help us understand how to describe and work with sets of strings, which are sequences of symbols.

A formal language can be defined in several ways, such as through rules called grammars, patterns called regular expressions, or machines that can recognize certain strings. We often study how powerful these different ways of describing languages are, how easy it is to tell if a string belongs to a language, and how we can compare different languages. Sometimes, answering these questions can be very difficult or even impossible! This is why formal languages are closely linked to the study of what computers can and cannot do, and how complex these tasks can be.

Operations on languages

We can do different things with groups of words made from sets of symbols. Some common actions include putting words together, finding words that appear in more than one group, and looking at words that don’t appear in a certain group.

For example, if we have two groups of words, we can join them together to make new words. We can also find words that are in both groups, or look at all possible words that are not in a certain group. There are also special ways to repeat words many times or to reverse the order of symbols in a word. These actions help us understand how different groups of words relate to each other.

concatenation Kleene star String homomorphism string operations closure properties context-free languages regular languages trios abstract families of languages

Closure properties of language families ( L 1 {\displaystyle L_{1}} Op L 2 {\displaystyle L_{2}} where both L 1 {\displaystyle L_{1}} and L 2 {\displaystyle L_{2}} are in the language family given by the column). After Hopcroft and Ullman.
OperationRegularDCFLCFLINDCSLRecursiveRE
UnionL 1 βˆͺ L 2 = { w ∣ w ∈ L 1 ∨ w ∈ L 2 } {\displaystyle L_{1}\cup L_{2}=\{w\mid w\in L_{1}\lor w\in L_{2}\}} YesNoYesYesYesYesYes
IntersectionL 1 ∩ L 2 = { w ∣ w ∈ L 1 ∧ w ∈ L 2 } {\displaystyle L_{1}\cap L_{2}=\{w\mid w\in L_{1}\land w\in L_{2}\}} YesNoNoNoYesYesYes
ComplementΒ¬ L 1 = { w ∣ w βˆ‰ L 1 } {\displaystyle \neg L_{1}=\{w\mid w\not \in L_{1}\}} YesYesNoNoYesYesNo
ConcatenationL 1 β‹… L 2 = { w z ∣ w ∈ L 1 ∧ z ∈ L 2 } {\displaystyle L_{1}\cdot L_{2}=\{wz\mid w\in L_{1}\land z\in L_{2}\}} YesNoYesYesYesYesYes
Kleene starL 1 βˆ— = { Ξ΅ } βˆͺ { w z ∣ w ∈ L 1 ∧ z ∈ L 1 βˆ— } {\displaystyle L_{1}^{*}=\{\varepsilon \}\cup \{wz\mid w\in L_{1}\land z\in L_{1}^{*}\}} YesNoYesYesYesYesYes
(String) homomorphism h {\displaystyle h} h ( L 1 ) = { h ( w ) ∣ w ∈ L 1 } {\displaystyle h(L_{1})=\{h(w)\mid w\in L_{1}\}} YesNoYesYesNoNoYes
Ρ-free (string) homomorphism h {\displaystyle h} h ( L 1 ) = { h ( w ) ∣ w ∈ L 1 } {\displaystyle h(L_{1})=\{h(w)\mid w\in L_{1}\}} YesNoYesYesYesYesYes
Substitution Ο† {\displaystyle \varphi } Ο† ( L 1 ) = ⋃ Οƒ 1 β‹― Οƒ n ∈ L 1 Ο† ( Οƒ 1 ) β‹… … β‹… Ο† ( Οƒ n ) {\displaystyle \varphi (L_{1})=\bigcup _{\sigma _{1}\cdots \sigma _{n}\in L_{1}}\varphi (\sigma _{1})\cdot \ldots \cdot \varphi (\sigma _{n})} YesNoYesYesYesNoYes
Inverse homomorphism h βˆ’ 1 {\displaystyle h^{-1}} h βˆ’ 1 ( L 1 ) = ⋃ w ∈ L 1 h βˆ’ 1 ( w ) {\displaystyle h^{-1}(L_{1})=\bigcup _{w\in L_{1}}h^{-1}(w)} YesYesYesYesYesYesYes
ReverseL R = { w R ∣ w ∈ L } {\displaystyle L^{R}=\{w^{R}\mid w\in L\}} YesNoYesYesYesYesYes
Intersection with a regular language R {\displaystyle R} L ∩ R = { w ∣ w ∈ L ∧ w ∈ R } {\displaystyle L\cap R=\{w\mid w\in L\land w\in R\}} YesYesYesYesYesYesYes

Applications

Programming languages

Main articles: Syntax (programming languages) and Compiler-compiler

In programming, a special set of rules helps computers understand code. A tool called a lexical analyzer finds important parts like words and numbers in the code. Another part, called a parser, checks if the code follows the right order. If it does, the computer can turn the code into something it can run.

Formal theories, systems, and proofs

Main articles: Theory (mathematical logic) and Formal system

In math and logic, a formal theory is a group of sentences made using a special language. A formal system has this language plus rules for making new sentences from old ones. These systems help us see which statements are true based on the rules.

Interpretations and models

Main articles: Formal semantics (logic), Interpretation (logic), and Model theory

Even though formal languages are just sets of symbols, we can give them meaning. In math, we study how to give meaning to these symbols. This helps us understand what the symbols really stand for and when a statement is true.

Related articles

This article is a child-friendly adaptation of the Wikipedia article on Formal language, available under CC BY-SA 4.0.