Information Extraction and Knowledge Discovery

The programme has attracted leading Bulgarian scientists from various areas of information retrieval (search engine technologies) and natural language processing. The Master's programme is run in cooperation with the European COST Action IC1002: Multilingual and Multifaceted Interactive Information Access (MUMIA).

Professional area: 
4.6. Informatics and Computer Science
Master of Science
MII392113 Informatics
Master's programme: 
Information Extraction and Knowledge Discovery
Form of education: 
Duration of full-time training (in semesters): 
Professional qualification: 
MSc in Informatics - Information Extraction and Knowledge Discovery
Language of Instruction: 
Master's programme director: 
Prof. Ivan Koichev, PhD

Focus, educational goals

Information retrieval (search engine technologies) and knowledge discovery in data (text mining) have grown rapidly over the last decade. A major driver is the explosive growth of the Web, digital libraries and other electronic stores of information and data, which creates a need for methods and algorithms that find useful information quickly and easily. Today, unstructured information (text) in electronic form far exceeds structured information (data), and systems for searching and retrieving useful information in this ocean of unstructured and semi-structured content now hold a significantly larger share of the market. At the same time, the need to analyze these large corpora of text and data automatically, in order to gain knowledge from them, is increasing. This is currently one of the fastest-growing markets in the IT sector, which in turn drives demand for specialists who know the relevant algorithms and software technologies well.

The programme is intended for students who have completed a Bachelor's programme in Computer Science or in Computer Science and Mathematics.

The Master's programme aims to familiarize students with the basic information retrieval methods and algorithms that go beyond keyword search by adding a degree of intelligence to the systems being developed. Students will learn basic approaches to natural language processing and to automatic analysis of data and text. They will develop the skills to solve specific tasks by combining the necessary sequence of processing steps and by assessing the role of the resources and technologies that each step requires.

Training (knowledge and skills)


  • The courses combine lectures, seminars, discussions and individual assignments. The main material is presented in lectures and reinforced through discussion. Students complete their assignments independently.
  • Seminar sessions include solving additional tasks on the subject and examining variants of the independent assignments to illustrate specific algorithms and methods.
  • The required training resources are a computer and multimedia presentation slides.


  • Knowledge of the basic algorithms and methods behind web search engines, e-commerce sites, recommender systems, email spam filtering, spelling correction, machine translation, automatic classification and clustering, etc., as well as methods for designing and developing software systems and services.
  • Skills gained by developing course projects based on real applications, such as: text processing; building search engines; smart tools that improve search accuracy; systems for spam filtering and for categorizing and clustering documents; and approaches to automatic analysis of data and text that aim to detect implicit dependencies and patterns.
  • Access to job opportunities in information and e-commerce companies.

Professional competence

Graduates of the programme will acquire the following basic competencies:

  • Knowledge and skills for implementing the algorithms and methods needed to build search and retrieval systems over collections of text documents and the web.
  • Skills to apply natural language processing approaches and to implement systems that help users analyze and process text.
  • Skills to use the methods needed to analyze large data sets and text in order to discover previously unknown knowledge in them, including "hot" applications such as automatic analysis of opinions on topics and business intelligence.
  • Knowledge of software technologies and skills to design and develop distributed software systems and services.

Professional realization

Graduates will be able to find jobs in information technology as developers of text analysis and information retrieval software; as creators of platforms for processing large data sets; as researchers and lecturers in academic institutions; and as IT experts at state or international institutions, as well as in specialized services, publishers, libraries and other organizations.

Graduates of this Master's programme will also be well qualified to continue to PhD studies at Sofia University or other universities.
