Text Mining Services

The Text Mining Services, part of the SWKM Knowledge MatchMaker module, are used within the KP-Lab system to assist users in creating or updating the semantic descriptions of KP-Lab knowledge artefacts (objects of activities).

Description of the services

Text Mining Services are a software package that provides functions for classification (Classification Service) and creation of conceptual maps (Clustering Service), based on the analysis of textual content and semantic annotations of KP-Lab knowledge artefacts. The Text Mining Services can be used to assist users when creating or updating the semantic descriptions of the knowledge artefacts.

  • The Classification Service, after a training period, classifies the artefacts under a predefined set of categories (e.g., ontology concepts), resulting in a semi-automatic generation of semantic descriptions.
  • The Clustering Service looks for clusters of similar artefacts and (semi-)automatically acquires conceptual maps from them. This can lead to the update of existing ontologies, or even the creation of new ones, stored in the SWKM repository.
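
The train-then-classify idea behind the Classification Service can be illustrated with a minimal sketch. It is not the KP-Lab or JBowl implementation: the `ToyClassifier` class, its `train` and `suggest` methods, the category names, and the sample texts are all hypothetical, and a simple nearest-centroid comparison over word counts stands in for whatever algorithm the real service uses. The sketch returns ranked suggestions rather than a final label, which mirrors the semi-automatic workflow in which a user confirms the proposed categories.

```python
from collections import Counter, defaultdict
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyClassifier:
    """Hypothetical nearest-centroid classifier (illustration only)."""

    def __init__(self):
        # One accumulated term-frequency vector per category.
        self.centroids = defaultdict(Counter)

    def train(self, labelled_artefacts):
        """Accumulate word counts from (text, category) pairs."""
        for text, category in labelled_artefacts:
            self.centroids[category].update(tokenize(text))

    def suggest(self, text, top_n=2):
        """Return up to top_n (category, score) pairs, best match first."""
        vector = Counter(tokenize(text))
        scored = [(c, cosine(vector, centroid))
                  for c, centroid in self.centroids.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_n]


# Hypothetical training data: artefact text paired with an ontology concept.
training = [
    ("lecture notes on ontology design and semantic annotation", "SemanticWeb"),
    ("ontology concepts and semantic descriptions of resources", "SemanticWeb"),
    ("group meeting minutes and project schedule", "ProjectManagement"),
    ("project plan with schedule and task deadlines", "ProjectManagement"),
]

classifier = ToyClassifier()
classifier.train(training)
suggestions = classifier.suggest("draft ontology with semantic concepts")
print(suggestions)
```

In a semi-automatic setting, the ranked list would be shown to the user, who accepts or rejects each suggested category before it is stored as part of the artefact's semantic description.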

Both the Clustering and Classification services provide a client API as well as a web service interface. The Classification Service exposes operations for creating and maintaining classification models and for their subsequent use in automatic classification. The Clustering Service exposes operations for text pre-processing, concept map creation, and clustering of knowledge artefacts. In addition, the Text Mining Console is provided as a web-based application for accessing some of the text mining functionality via a standard web interface.
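
The three kinds of Clustering Service operations listed above (pre-processing, clustering, and deriving concepts) can be sketched roughly as follows. This is an illustration under stated assumptions, not the service's API: the function names `preprocess`, `cluster`, and `common_terms`, the stopword list, the similarity threshold, and the sample artefacts are all hypothetical, and a greedy single-pass grouping by cosine similarity stands in for the actual clustering algorithm. The terms shared by all members of a cluster serve here as a crude stand-in for the nodes of a concept map.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"a", "an", "and", "the", "of", "on", "in", "for", "to", "with"}


def preprocess(text):
    """Lowercase, tokenize, and drop stopwords; return a term-frequency vector."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(artefacts, threshold=0.3):
    """Greedy single-pass clustering.

    Each artefact joins the first cluster whose seed (first member) is at
    least `threshold` similar to it; otherwise it starts a new cluster.
    """
    vectors = {name: preprocess(text) for name, text in artefacts.items()}
    clusters = []
    for name in artefacts:
        for members in clusters:
            if cosine(vectors[name], vectors[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def common_terms(members, artefacts):
    """Terms present in every member of a cluster, sorted alphabetically."""
    term_sets = [set(preprocess(artefacts[name])) for name in members]
    return sorted(set.intersection(*term_sets))


# Hypothetical artefacts: identifier mapped to textual content.
artefacts = {
    "a1": "Introduction to ontology engineering and semantic annotation",
    "a2": "Semantic annotation of learning resources with an ontology",
    "a3": "Minutes of the project meeting on schedule and deadlines",
    "a4": "Project schedule and deadlines for the spring term",
}

clusters = cluster(artefacts)
for members in clusters:
    print(members, common_terms(members, artefacts))
```

A single-pass scheme like this depends on the order of the artefacts and on the chosen threshold, so it is suitable only as an illustration of the idea.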

The services access the SWKM repository (using the Persistence API) to retrieve the properties and semantic annotations of knowledge artefacts. The content of the artefacts is retrieved from the Content Repository through its web service interface. Indexes and internal data of the Text Mining Services are stored in a local Mining Object Repository.

The implementation of the Classification Service is based on the JBowl Java library, which provides a platform for several classification algorithms, tools for processing natural language texts, and some clustering techniques.

Prospective users

  • Schools and universities can use the services, integrated within an e-Learning environment, to organize learning materials. For example, at the Technical University of Kosice, Faculty of Electrical Engineering and Informatics, the Text Mining Services are used within the Knowledge Management courses.
  • Developers of text-processing applications and research communities working in the field of text processing can use the services as building blocks for their solutions.

The Text Mining Services may be of interest to research and development communities working on text-processing systems, especially (but not exclusively) in the field of e-Learning. Currently, the Text Mining Services are tightly coupled to the KP-Lab repositories, such as the SWKM and Content repositories, so they are best offered as an extension of KP-Lab-like systems. The core of the Text Mining Services has its own API, however, so it can also be offered as a component for solutions where text mining and text processing functionality is needed.

The tool and knowledge creation

The Text Mining Services support knowledge creation processes by revealing hidden relations between the knowledge artefacts (Clustering Service) and by structuring the artefacts according to the meaning extracted from their textual content (Classification Service). The services provide assistance during the process of creating semantic descriptions of the artefacts and in this way help to identify qualitatively new semantic relationships between the artefacts.

Knowledge protection

Text Mining Services are free and open source software designed and developed by TUK (Technical University of Kosice, Slovakia) and UEP (University of Economics, Prague, Czech Republic) / BUT (Brno University of Technology, Brno, Czech Republic). They are licensed under the GNU Lesser General Public License 3.0.

Contact details

Karol Furdík (TUK) - kfurdik@stonline.sk

Downloads


Knowledge Practice Portal
Website created: 9 Feb 2006
Last major update: 25 Mar 2009
©2006 - 2011 KP-Lab