Dictionary of Croatian Idioms

A born-digital, corpus-driven, open-access dictionary of Croatian idioms


Project goal

The goal of the project is to create an open-access dictionary of Croatian idioms based on data from a large electronic corpus. The resulting dictionary will give a wide range of users and researchers access to current idiomatic usage in Croatian.


The dictionary is based on the Croatian web corpus hrWaC (1.2 billion words). Using a large electronic corpus to compile a dictionary is in line with one of the key principles of modern-day lexicography: we can obtain reliable linguistic data by observing language in use.

28 March

CLASSLA-Express workshops

The CLASSLA-Express workshop series focuses on leveraging the CLARIN.SI corpora for language research. The workshops will take place from April to September, in five countries and six cities: Zagreb, Rijeka, Belgrade, Skopje, Sofia and Ljubljana.
Find more information and register here.
The CLASSLA-Express team: Ivana Filipović Petrović, Jelena Parizoska, Taja Kuzman and Nikola Ljubešić

11 March

Podcast Close Encounters of the Language Kind (11 March 2024)

Jelena Parizoska appeared on the podcast Close Encounters of the Language Kind (episode 230) talking about idioms.

29 September

Kocijan, Kristina; Filipović Petrović, Ivana; Parizoska, Jelena. “Verbal idioms in Croatian: Preparing language data for automatic identification in a corpus”

International conference CLARC 2023: Language and Language Data, University of Rijeka, Croatia, 28 September 2023

05 September

Parizoska, Jelena. “Teaching idiomatic variation in EFL: How corpus data can enhance students’ metaphoric competence”

International conference Corpora in Language Learning, Translation and Research, University of Zadar, Croatia, 23 August 2023

30 June

Parizoska, Jelena; Filipović Petrović, Ivana; Kocijan, Kristina. “Establishing criteria and procedures to identify conventionalized similes in Croatian”

International conference Electronic lexicography in the 21st century (eLex 2023): Invisible lexicography, Brno, Czech Republic, 28 June 2023

19 June

Filipović Petrović, Ivana. 5th Summer Datathon on Linguistic Linked Open Data (June 11-16, 2023).

Ivana Filipović Petrović participated in the 5th Summer Datathon on Linguistic Linked Open Data. This edition was supported by the Nexus Linguarum COST Action. Ivana’s team received the Best Miniproject Award.