Better Plagiarism Detection

Topic Description

Summary
The aim of this project was to expand and improve the existing plagiarism detection system in place within the University of Leeds’ School of Computing, which is used to check students’ source code submissions.
An evaluation of the existing system, identifying its benefits and disadvantages, was coupled with background reading on plagiarism detection as well as a summary of three other commonly used systems that perform this task.

Contents
1 – Introduction
1.1 – Aim
1.2 – Objectives, Minimum Requirements and Deliverables
1.2.1 – Objectives
1.2.2 – Minimum Requirements
1.2.3 – Deliverables
1.2.4 – Possible Extensions
1.2.5 – Evaluation Criteria
1.2.6 – Project Schedule
2 – Problem Background
2.1 – Plagiarism
2.1.1 – What is Plagiarism?
2.1.2 – Why do people Plagiarise?
2.1.3 – How do people Plagiarise source code?
2.1.4 – Ethics involved with plagiarism detection
2.2 – Plagiarism Detection
2.2.1 – Attribute Counting Techniques
2.2.2 – Structure Metrics
2.3 – Existing SoC Solution
2.3.1 – SoC Plagiarism System
2.3.2 – Conclusions of Current System
2.4 – Other Third Party Solutions
2.4.1 – JPlag
2.4.2 – MOSS
2.4.3 – YAP
2.4.4 – Third Party Conclusions
2.5 – Methods of String Comparison
2.5.1 – Longest Common Substring
2.5.2 – Edit Distance
2.5.3 – Hamming Distance
2.5.4 – Winnowing
2.6 – Choice of Programming Language
2.6.1 – Perl
2.6.2 – C++
2.6.3 – Java
2.6.4 – Conclusion on Language
2.7 – Methodology
3 – Design
3.1 – System Design
3.2 – Input Stage
3.3 – Optimisation Stage
3.4 – Comparison Stage
3.5 – Focused Code Analysis
3.6 – Output Stage
3.7 – User Interface
4 – Implementation
4.1 – System Overview
4.2 – Input Stage
4.3 – Optimisation Stage
4.4 – Comparison Stage
4.5 – Focused Code Analysis
4.6 – Output Stage
4.7 – User Interface
4.8 – Expanding the Program
5 – Testing
5.1 – Test Plan
5.2 – Test on Real Data
5.3 – Test with “engineered” data
5.4 – Evaluation of Testing
7 – Conclusion and Further Work
7.1 – Conclusion
7.2 – Further Work
Appendix A – Personal Experience Reflection
Appendix B – Implementation Stage UML Diagram
Appendix C – Example Program Output
Appendix D – Testing Results
Appendix E – Bibliography and References
