Hi, and welcome to today's lecture. Before we look at how machine translation works, we want to look at what it is that we actually translate. So we want to look at language: what is the language we want to translate? This should explain why translation is difficult. So first, we want to give you an overview of different language fundamentals and also look at what people use to describe language.
When we are talking about language, there are two important things. The first one people often talk about is the syntax. This describes the structure of the language: it describes how a sentence is composed of different words. The second important part is the semantics, which deals with the meaning of language. Today we want to look at both of these things and see how sentences are built, what makes language difficult, and which models of language exist in machine translation.
The basic units we are dealing with are words. Most MT systems are built on words, and these words then compose sentences. And with words, the first problems already start. If you only look at English, it is often very easy to separate a sentence into words. But this is no longer the case if you go to other languages. If we, for example, look at Chinese: it does not separate words by spaces. So here the first challenge is already to separate the sentence into words. And, like many other parts of language, this segmentation is ambiguous, as we see here on the right side with our example. If we segment the sentence in this way, this Chinese sentence means "the ping pong paddle is sold out." A different segmentation would be the following one: if we use the blue segmentation, the sentence means "the table tennis auction is over." So the meaning of the sentence really depends on how you segment it.
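To make the segmentation ambiguity concrete, here is a minimal sketch that lists every way to split an unspaced sentence into words from a small lexicon. The Chinese sentence and the toy lexicon are assumptions: the slide itself is not part of this transcript, and the sentence used here is a commonly cited example that matches the two readings described above. The function name `segmentations` is purely illustrative.

```python
def segmentations(text, lexicon):
    """Return every way to split `text` into a sequence of lexicon words."""
    if not text:
        return [[]]  # one way to segment the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            # Keep this word and segment the remainder recursively.
            for rest in segmentations(text[end:], lexicon):
                results.append([word] + rest)
    return results


# Assumed toy lexicon (not from the lecture slide).
LEXICON = {
    "乒乓球",    # table tennis
    "乒乓球拍",  # ping pong paddle
    "拍卖",      # auction
    "卖",        # sell
    "完",        # finish
    "了",        # completion particle
}

# Assumed example sentence (not from the lecture slide).
SENTENCE = "乒乓球拍卖完了"

for words in segmentations(SENTENCE, LEXICON):
    print(" / ".join(words))
```

With this lexicon the sketch finds exactly two segmentations, one for each reading. A real segmenter would also have to score the candidates and pick the most plausible one, which this sketch does not attempt.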
So this is already the first challenge when we are dealing with this type of language.
When we are talking about words, one important thing is the morphology. The morphology describes how words are constructed out of different morphemes. Mostly, a word consists of the stem, which carries its main meaning, plus function morphemes. For example, if you have the word "houses", the stem morpheme is "house", and then you have the function morpheme, the plural "s". English is not a morphologically rich language: we have only very few different morphemes, like the plural "s". In other languages, this is much more complicated. Here on the right side we again have an example, this time from Finnish. We have here a Finnish word which means "in my house too". This Finnish word consists of five different morphemes: the first one means that it is a house; then we have the morpheme indicating that it is "in"; then we have the morpheme "ni", which indicates that it is my house; then we have the morpheme for "too"; and finally, there is the morpheme for a question. So we see that in other languages we have words which contain a lot of information. This makes machine translation a lot more difficult: in a morphologically rich language we have many more distinct words, because a lot of information is packed into a single word.
When we now talk about the structure of the language, the first important thing is part-of-speech tags. These are word classes which describe what the function of a word is in a sentence: a word can be a noun, a word can be a verb, or a word can be an adjective. These classes are very helpful if you want to describe the structure of a sentence. In addition to the part-of-speech tags, you have the grammatical categories. They describe the properties of items within the grammar of the language. For example, a verb has a tense, and a noun has a gender or a number.
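The idea that one word can bundle several morphemes can be sketched with a toy analyzer that peels known suffixes off a known stem. The Finnish word "talossanikinko" is an assumption: the slide is not included in this transcript, and this is a commonly cited word with the five morphemes described above. The stem and suffix tables and the function name `analyze` are purely illustrative and far from a real morphological analyzer.

```python
# Assumed toy inventories (not from the lecture slide).
STEMS = {"talo": "house"}
SUFFIXES = {"ssa": "in", "ni": "my", "kin": "too", "ko": "question"}


def analyze(word):
    """Split `word` into (morpheme, gloss) pairs, or return None if it fails."""
    for stem, stem_gloss in STEMS.items():
        if not word.startswith(stem):
            continue
        morphemes = [(stem, stem_gloss)]
        rest = word[len(stem):]
        while rest:
            for suffix, gloss in SUFFIXES.items():
                if rest.startswith(suffix):
                    morphemes.append((suffix, gloss))
                    rest = rest[len(suffix):]
                    break
            else:
                return None  # leftover material that no suffix explains
        return morphemes
    return None  # no known stem


for morpheme, gloss in analyze("talossanikinko"):
    print(f"{morpheme:>5}  {gloss}")
```

Because every combination of stem and suffixes is a different surface word, a word-based system sees far more distinct vocabulary items in such a language than in English.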
These categories are often expressed by the morphemes we looked at before. And in many languages, you have to ensure that there is agreement between the different words and phrases. For example, in many languages an adjective and its noun should have the same gender.
So after looking at individual words, let us now look at the structure of the whole sentence. If we want to describe the sentence structure, one very famous method to do this is using phrase structure grammar. This model shows the structure of the sentence using context-free grammars. A sentence, to be correct, has to have some specific structures. For example, in every sentence there should be a verb, and this is described by these grammars. You can then often describe a sentence by a parse tree, so you can describe how the sentence consists of the different components. This is done by the phrase structure grammar.
So let us look at an example sentence, which is "Jane buys a house." First we have here the noun phrase "a house", which consists of an article and a noun. Then the structure can be described by the verb phrase, which consists of a verb and a noun phrase; in our example, this would be "buys a house". And finally, the whole sentence consists of a noun phrase and a verb phrase, where the noun phrase is just the noun "Jane", and the verb phrase is the one we parsed before, "buys a house". So we see that sentences are constructed of different phrases, which are then combined by rules to form the final sentence.
In addition to the syntax of the language, which describes the structure, we also have the semantics, which describes the meaning of natural language constructs. So it can describe the meaning of words, of phrases, and of sentences. In general, we often have compositionality there. That means that the meaning of a sentence is composed of the meanings of its parts: the meaning of the whole sentence is constructed from the meanings of the different words in the sentence.
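Returning to the phrase structure example "Jane buys a house" from above, the following is a minimal sketch of a context-free grammar and a small top-down parser that produces the parse tree as nested tuples. The rule set, the tag names (`Art`, `N`, `V`), and the function names are assumptions chosen to mirror the spoken description; they are not taken from the slides.

```python
# Assumed toy grammar: S -> NP VP, NP -> Art N | N, VP -> V NP.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Art", "N"], ["N"]],
    "VP": [["V", "NP"]],
}
# Assumed part-of-speech lexicon for the example sentence.
LEXICON = {"Jane": "N", "buys": "V", "a": "Art", "house": "N"}


def parse(symbol, tokens, start):
    """Yield (tree, next_position) for each way `symbol` matches from `start`."""
    if symbol not in GRAMMAR:
        # Part-of-speech tag: it must match the tag of the next word.
        if start < len(tokens) and LEXICON.get(tokens[start]) == symbol:
            yield (symbol, tokens[start]), start + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from expand(rhs, tokens, start, [symbol])


def expand(rhs, tokens, pos, partial):
    """Match the remaining right-hand-side symbols one after another."""
    if not rhs:
        yield tuple(partial), pos
        return
    for child, nxt in parse(rhs[0], tokens, pos):
        yield from expand(rhs[1:], tokens, nxt, partial + [child])


def parse_sentence(sentence):
    """Return all parse trees that cover the whole sentence."""
    tokens = sentence.split()
    return [tree for tree, end in parse("S", tokens, 0) if end == len(tokens)]


print(parse_sentence("Jane buys a house"))
```

The single tree it returns has the sentence node at the top, with the noun phrase "Jane" and the verb phrase "buys a house" below it. A word sequence without a verb, such as "Jane a house", gets no parse, which reflects the point that a correct sentence must have certain structures. This simple parser only works because the toy grammar has no left-recursive rules.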
There are different formalisms which try to describe the semantics of a sentence. For example, there are approaches that describe the semantics of a sentence using first-order logic or using frame semantics. One part of semantics which is especially important for machine translation is lexical semantics. Lexical semantics is concerned with the meaning of single words.
Why does the meaning of single words cause difficulties? One problem here is that many words can have different meanings depending on the context they are used in. First, we have polysemy. Polysemy means that we have a word with the same surface form but which can have different meanings. For example, in English you have the word "interest", which can mean that you are interested in something, or it can be the interest rate in the finance industry. Another very famous example is the English word "bank", which can be either the river bank or the bank as a financial institution. A second, related phenomenon is homonymy. This means that the word can have two completely different meanings which are not related at all. For example, the English word "can" can either be a verb which describes that you are able to do something, or it can be the noun "can", for example a can of beets.
So to summarize today's lecture: what did we look at? First, we looked at the morphology, which describes the internal structure of words. We have seen that words are the basic units we are using in machine translation, but that, first, words also consist of different components, and secondly, it is often not that easy to separate the individual words in a sentence, because not all languages are like English, where words are separated by spaces. Furthermore, we looked at syntax, which describes the structure of the sentence. We have seen that words have different tasks in a sentence. We have nouns.
We have adjectives, and we can use parse trees to describe the structure of the sentence. And finally, there is the semantics, which describes the meaning of sentences, of words, and of phrases. We have already seen in some examples why this might lead to difficulties in machine translation: the same word can have different meanings depending on the context it is used in. For example, we looked at the examples "bank" and "interest".