An application for the automatic generation of presentation slides from text

Summary
The aim of this project is to produce a system that semi-automates the generation of slideshow content. Creating computerised slideshow presentations for conference talks currently requires a presenter to review their article and manually condense the content into bullet-point form. This system automates the process by converting an article into a set of slides, each containing a heading and a list of associated key terms. An export facility allows users to save the generated slide content in a popular presentation application (OpenOffice's Impress). Developing the solution required an in-depth understanding and adoption of the NLP technique of Term Extraction, as well as Information Retrieval. The system is currently trained on a corpus of scientific articles, though the use of domain-independent methods means that minimal alterations are required to extend it to articles in other domains. Precision and recall were used to measure the accuracy of the implemented term extraction method.
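To make the summary's two technical ideas concrete, the following is a minimal Python sketch of ranking a source document's candidate terms by TF.IDF against a corpus, and of scoring the result with precision and recall. It is an illustration only, not the project's implementation: the function names, the toy corpus, the smoothed IDF formula and the assumption that candidate terms (for example, noun phrases) have already been extracted are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def build_inverted_index(corpus):
    """Map each term to the set of document ids that contain it.

    `corpus` is assumed to be a dict of doc_id -> list of candidate terms.
    """
    index = {}
    for doc_id, terms in corpus.items():
        for term in set(terms):
            index.setdefault(term, set()).add(doc_id)
    return index


def rank_terms(source_terms, inverted_index, n_docs):
    """Return the source document's terms ordered by descending TF.IDF.

    The smoothed IDF below is one common variant, chosen here as an
    assumption; the report may use a different weighting.
    """
    tf = Counter(source_terms)
    scores = {}
    for term, freq in tf.items():
        df = len(inverted_index.get(term, ()))
        idf = math.log((n_docs + 1) / (df + 1)) + 1
        scores[term] = freq * idf
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(scores, key=lambda t: (-scores[t], t))


def precision_recall(extracted, gold):
    """Compare extracted terms with a reference ("gold") set of terms."""
    extracted, gold = set(extracted), set(gold)
    hits = len(extracted & gold)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


# Toy data, invented purely for illustration.
corpus = {
    "d1": ["term extraction", "corpus"],
    "d2": ["corpus", "slide"],
    "d3": ["corpus"],
}
index = build_inverted_index(corpus)
source = ["term extraction", "term extraction", "corpus", "corpus"]
ranked = rank_terms(source, index, n_docs=len(corpus))
print(ranked)
print(precision_recall(ranked[:1], {"term extraction", "slide"}))
```

In this toy run, "corpus" appears in every document, so its IDF is low and it ranks below "term extraction" even though both occur twice in the source. This is the behaviour that makes TF.IDF useful for picking out terms characteristic of one article.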

Contents
Summary
Acknowledgements
Table of Contents
Section 1: Introduction
1.1 Problem
1.2 Project aim and objectives
1.3 Minimum requirements and Extensions
1.4 Deliverables
1.5 Project Management
Section 2: Background Reading
2.1 Corpora
2.1.1 Overview
2.1.2 Corpus of Scientific Articles
2.2 Key-word Extraction
2.2.1 Collocations
2.2.1.1 Overview
2.2.1.2 Methods for extracting Collocations
2.2.1.2.1 Frequency
2.2.1.2.2 Mean and Variance
2.2.1.2.3 Hypothesis Testing
2.2.1.3 Evaluation
2.2.1.4 Conclusions
2.2.2 Term extraction
2.2.2.1 Overview
2.2.2.2 POS Tagging
2.2.2.3 Method for Term Acquisition/Extraction
2.2.2.4 Conclusions
2.2.3 Information retrieval
2.2.3.1 Overview
2.2.3.2 Inverted Index
2.2.3.3 Vector Space Model
2.2.3.4 Term Weighting
2.2.3.5 Evaluation
2.2.3.6 Conclusions
2.3 Topic/Text Segmentation
2.3.1 Overview
2.3.2 TextTiling
Section 3: Methodologies
3.1 Programming Languages
3.1.1 Criteria for Choice of Programming Languages
3.1.2 Evaluation of Programming Languages
3.1.3 Final Choice of Programming Languages
3.2 Choice of Software Methodologies
3.2.1 Types of Methodology
3.2.2 Modifications of SDLC Model
3.3 Choice of NLP Methods
3.4 User Interface Design
Section 4: Implementation
4.1 Corpus Collection
4.1.1 Tagged Corpora
4.1.2 Untagged Corpus of Scientific Articles
4.2 Key-Word and Term Extraction
4.2.1 STAGE1: Strip XML to obtain Article's Content
4.2.2 STAGE2: Create Tagged CorpusSciTR/Source Document
4.2.3 STAGE3: Generate Noun Phrase List
4.2.4 STAGE4: Create Inverted Index and Source Document Term Frequency Index
4.2.4.1 Inverted Index
4.2.4.2 Term Frequency Index
4.2.5 STAGE5: Rank Terms in Source Document using TF.IDF
4.3 Slide Formatting
4.4 User Interface
Section 5: Testing and Evaluation
5.1 Testing
5.1.1 Human judges
5.1.2 Title and Abstract
5.2 Evaluation
5.2.1 Quantitative
5.2.2 Qualitative
Section 6: Conclusions
6.1 Achievement of project aims
6.1.1 Minimum Requirements
6.1.2 Extensions
6.2 Scope for other domains
6.3 Future work
Bibliography
Appendix A – Personal Reflection
