International Journal of Computer Engineering and Technology (IJCET)

Source ID:00000005
Volume 7, Issue 5, September – October 2016, Pages 65-76, Page Count - 12

FIOBODA - SEMANTIC ANNOTATION FRAMEWORK FOR WEB EXTRACTED DATA

C. Gnana Chithra (1), E. Ramaraj (2)

(1) Equity Research Consultant, Angeeras Securities, Chennai, India.
(2) Professor, Department of Computer Science and Engineering, Alagappa University, Karaikudi, India.

Manuscript ID:- 00000-65932
Access Type : Open Access

Cite this article: C. Gnana Chithra, E. Ramaraj, FIOBODA - Semantic Annotation Framework for Web Extracted Data, International Journal of Computer Engineering and Technology (IJCET), 2016, 7(5), pp. 65-76

Abstract

Semantic annotation of web pages is a state-of-the-art technique for realizing the Semantic Web, in which document content can be shared and reused across application boundaries. The web is a treasury of knowledge, and efficient tools are needed to explore its structured and unstructured data. Annotating millions of web pages manually is infeasible, so automatic annotation of documents is essential for high information retrieval rates. Metadata is added to web pages so that content-based intelligent applications can process them. This paper analyses the problems with current semantic annotation systems and proposes a new ontology-based automatic annotation framework. Ontology-based semantic annotation is one of the best methods for extracting data from a knowledge base.

The contributions of this paper are the integration of a modified Manning's sentence boundary detection algorithm, a noun phrase collocation algorithm, and classification using machine learning techniques in the Information Extraction module, together with a new data model and ontology for the Structured Ontology engineering model. The Annotation module annotates the output of the Information Extraction module with the aid of ontologies and dictionaries and stores the resulting annotated data as RDF triples in the Annotation database. The RDF repository interface performs reasoning over the annotated data. FIOBODA stands for Financial Instruments Ontology Based Open Document Annotation. Web pages extracted from the financial securities domain are mapped to the Finance ontology to extract the subject, predicate, and object. An SVM classifier separates correct annotations from incorrect ones. The correct annotations are stored in the Annotation database and the RDF repository for later use. Through its reusability and interoperability, the proposed framework goes some way toward solving the knowledge bottleneck problem.
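
To make the subject, predicate, and object mapping concrete, the following is a minimal Python sketch of how one extracted sentence might be matched against a small predicate table and serialized as an RDF triple in N-Triples syntax. It is an illustration under stated assumptions, not the FIOBODA implementation: the namespace `FIN`, the `PREDICATES` table, the helper names, and the sample sentence are all hypothetical, and the simple phrase matching stands in for the paper's noun phrase collocation and ontology lookup.

```python
from typing import NamedTuple, Optional

# Hypothetical namespace; the abstract does not give the ontology's IRIs.
FIN = "http://example.org/finance#"


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Toy stand-in for the Finance ontology: surface phrase -> property name.
PREDICATES = {
    "is listed on": "listedOn",
    "is issued by": "issuedBy",
}


def annotate(sentence: str) -> Optional[Triple]:
    """Split a sentence on the first known predicate phrase it contains."""
    for phrase, prop in PREDICATES.items():
        marker = f" {phrase} "
        if marker in sentence:
            left, right = sentence.split(marker, 1)
            return Triple(left.strip(), prop, right.strip().rstrip("."))
    return None  # no known predicate: leave the sentence unannotated


def to_iri(label: str) -> str:
    """Build an IRI in the hypothetical namespace from a plain label."""
    return "<" + FIN + label.replace(" ", "_") + ">"


def to_ntriples(triple: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    return f"{to_iri(triple.subject)} {to_iri(triple.predicate)} {to_iri(triple.obj)} ."


if __name__ == "__main__":
    result = annotate("ACME Bond is issued by ACME Corp.")
    if result is not None:
        print(to_ntriples(result))
```

In a pipeline like the one described above, candidate triples of this kind would be passed to the classifier, and only those judged correct would be written to the annotation store.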


Author Keywords
Dublin Core; FIOBODA; Financial Securities Ontology; Metadata; Semantic Annotation Framework

ISSN Print: 0976-6367
ISSN Online: 0976-6375
Source Type: Journals
Document Type: Journal Article
Publication Language: English
DOI:
Abbreviated Journal Title: IJCET
Access Type: Open Access
Publisher Name: IAEME Publication
Resource Licence: CC BY-NC
Major Subject: Physical Sciences
Subject Area classification: Computer Science
Subject area: Software Development
Source: SCOPEDATABASE