
Using Web Search for Machine Translation from English to Swedish


Topic Description

Summary
This project uses the World Wide Web, accessed through web search engines, as a resource
for machine translation. Its aim is to design, implement and evaluate a tool that translates
adjective-noun phrases from English to Swedish. An online dictionary is queried to obtain
translations of each individual word in the phrase. These word translations are then combined
to form 'candidate translations' of the whole phrase. The final translation is chosen by using
the Web as a linguistic resource: the tool selects the candidate Swedish phrase that occurs
most frequently on the Web. Native Swedish speakers then evaluate the translations to judge
the effectiveness of the tool.
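
The pipeline described in this summary can be illustrated with a minimal sketch. It is not the project's implementation: the function names, the toy dictionary and the hit counts below are all hypothetical stand-ins for the online dictionary and the web search engine, and adjective-noun agreement is ignored.

```python
from itertools import product
from typing import Callable, Dict, List


def candidate_translations(phrase: str, dictionary: Dict[str, List[str]]) -> List[str]:
    """Combine every dictionary translation of each word into candidate phrases."""
    options = [dictionary[word] for word in phrase.lower().split()]
    return [" ".join(words) for words in product(*options)]


def select_translation(candidates: List[str], hit_count: Callable[[str], int]) -> str:
    """Return the candidate phrase with the highest reported frequency."""
    return max(candidates, key=hit_count)


# Hypothetical data: a toy dictionary standing in for the online dictionary.
TOY_DICTIONARY = {
    "fast": ["snabb", "hastig"],
    "car": ["bil", "vagn"],
}

# Hypothetical data: invented counts standing in for search engine hit counts.
TOY_HITS = {
    "snabb bil": 900,
    "snabb vagn": 40,
    "hastig bil": 5,
    "hastig vagn": 2,
}


def toy_hit_count(candidate: str) -> int:
    """Look up the invented count; unseen phrases count as zero hits."""
    return TOY_HITS.get(candidate, 0)


if __name__ == "__main__":
    candidates = candidate_translations("fast car", TOY_DICTIONARY)
    print(candidates)
    print(select_translation(candidates, toy_hit_count))
```

In a real tool, `toy_hit_count` would be replaced by a function that submits each candidate as an exact-phrase query to a search engine and returns the reported number of matching pages.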

Contents
i. Title Page
ii. Summary
iii. Acknowledgements
iv. Table of Contents
1 Introduction
1.1 Aims
1.2 Objectives
1.3 Minimum Requirements
1.4 Relevance to degree
2 Project Management
2.1 Schedule
3 Background Reading
3.1 Machine Translation
3.2 Evaluating translations
3.3 Collocations
3.3.1 Defined
3.3.2 Finding Collocations in Text Corpus
3.4 Using the Web for Natural Language Processing
3.4.1 Grefenstette
3.4.2 Noise on the Web
3.4.3 The Web as a Corpus
3.4.4 Conclusions
3.5 Language Resources and Tools
3.5.1 English-Swedish Dictionary
3.5.2 British National Corpus
3.5.3 Gsearch
3.5.4 Unix Tools
3.5.5 Perl Regular Expressions
4 Collocations
4.1 Extraction of phrases
4.2 Selection of phrases by frequency
5 Methodology
6 Implementation of Translation Tool
6.1 Obtaining the dictionary HTML page
6.2 Searching the dictionary HTML page
6.3 Generating candidate translations
6.3.1 Handling adjective-noun agreement
6.3.2 Creating candidate translation phrases
6.4 Using web search for translation selection
7 Evaluation
7.1 Evaluation Procedure
7.2 Design of the Evaluation
7.3 Evaluation Results and Analysis
7.3.1 Fluency of adjective noun phrases
7.3.2 Fidelity of adjective noun phrases
7.3.3 Fluency of adjective adjective noun phrases
7.3.4 Fidelity of adjective adjective noun phrases
7.3.5 Comparison of results
7.3.6 Evaluator Agreement/Disagreement
7.4 Conclusions
7.4.1 Summary
7.4.2 Comparisons with similar research
7.4.3 Possible Improvements
References
Appendices
A. Personal Reflection
B. Project Schedule v1.0 and v2.0
C. Translation Procedure Example
D. Example of dictionary entry
