
Looking under the Hood for Evidence of Normalization: Multivariate exploratory analysis of lexical bundles


This study investigated the normalization hypothesis and stylistic variation across translators, as manifested in the use of lexical bundles in translated and non-translated English literary texts. Normalization, a hypothesis originally proposed as ‘conservatism’ by Baker (1996), states that translators tend to conform to linguistic patterns and conventions typical of the target language, even to the point of exaggeration. Lexical bundles are sequences of three or four words that recur with high frequency in natural discourse. The study was carried out in two stages. The first stage replicated previous studies that relied on simple frequency tests to confirm the normalization hypothesis. Contrary to these earlier studies, the present study’s frequency tests on lexical bundles failed to provide clear support for the hypothesis. The second stage employed two types of multivariate exploratory analysis, principal component analysis (PCA) and hierarchical cluster analysis (HCA), to examine the underlying relationships among individual texts, lexical bundles, and the translated and non-translated group categories. Because the frequency tests were inconclusive, it was hypothesized that normalization might still be present in the translated corpus but restricted to certain types of lexical bundles. PCA confirmed this hypothesis by revealing that normalization occurred in the use of one particular functional type of lexical bundle, the discourse bundle, which is relatively independent of the thematic content of the text in which it occurs. This supports the traditional idea that statistical tests of translation hypotheses must deal with linguistic features unrelated to the thematic content of the corpus. Additionally, PCA revealed variation in the types of lexical bundles preferred by individual translators.
HCA further identified the presence of a subgroup of translated texts that cluster with non-translated texts, rather than with their fellow translated texts. This was taken as indicating that the use of lexical bundles varied among the translators and that the division between translated and non-translated texts is not clear-cut.
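
As an illustration of the unit of analysis, the following is a minimal sketch of how recurring four-word sequences might be extracted from a small corpus before any frequency test, PCA, or HCA is applied. It is not the study's procedure: the function name `extract_bundles`, the input format, the tokenizer, and the thresholds `min_count` and `min_texts` are all assumptions made for this example.

```python
from collections import Counter
import re

def extract_bundles(texts, n=4, min_count=2, min_texts=2):
    """Return n-word sequences that recur across a corpus.

    texts: mapping of text id -> raw string (hypothetical input format).
    min_count and min_texts are illustrative thresholds, not the study's values.
    Sentence boundaries are ignored for simplicity.
    """
    total = Counter()   # corpus-wide frequency of each n-gram
    spread = Counter()  # number of texts in which each n-gram occurs
    for text in texts.values():
        tokens = re.findall(r"[a-z']+", text.lower())
        grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total.update(grams)
        spread.update(set(grams))
    # Keep only sequences that are both frequent and attested in several texts.
    return {g: c for g, c in total.items()
            if c >= min_count and spread[g] >= min_texts}

corpus = {
    "t1": "At the end of the day he went home. At the end of the road she waited.",
    "t2": "She stood at the end of the pier.",
}
bundles = extract_bundles(corpus)
print(bundles)
```

In a fuller pipeline, the per-text counts of each retained bundle would form the rows of a text-by-bundle matrix, which is the kind of input that PCA and HCA operate on.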


corpus-based translation studies, normalization, principal component analysis, hierarchical cluster analysis, Korean-English literary translation


