
Moderna arhivistika 2023, 6 (2), str./pp. 308-333

Dr. Miroslav NOVAK
arhivski svetnik, Pokrajinski arhiv Maribor, Slovenija / Archival Councillor, Regional Archives Maribor, Slovenia


Stanje in perspektive vzajemnega metapodatkovnega korpusa slovenske javne arhivske službe
Status and perspectives of the mutual metadata corpus of the Slovenian Public Archival Service

https://doi.org/10.54356/MA/2023/MJUO4040  


Izvleček:
Namen: Vzajemni arhivski metapodatkovni korpus Slovenske javne arhivske službe obsega več milijonov podatkovnih entitet, ki so bile zajete v obdobju 2009/10 do 2022. Osnovni namen raziskave je, da na podlagi kvantitativnih metod opredelimo stopnjo kakovosti zajetih arhivskih metapodatkov. Na tej osnovi je v nadaljevanju opravljena validacija njegove potencialne uporabnosti tako z vidika posrednih ali neposrednih uporabnikov kot tudi sistemov, ki temeljijo na umetni inteligenci.

Metoda/pristop: Za potrebe tega prispevka je bilo uporabljenih več metod. Osnovo predstavljajo statistične metode, ki so bile uporabljene za obdelavo podatkov iz letnih statističnih poročil o stanju in prirastu podatkov v vzajemni arhivski podatkovni bazi. Ob tem je bila uporabljena primerjalna metoda pridobljenih rezultatov iz začetka leta 2023 s podatki primerljivih analiz iz leta 2013 in nato 2016. Za predstavitev kompleksnosti obvladovanja celotnega problema pa so bile uporabljene še naslednje metode: metoda analize, opisna metoda, metoda povzemanja in izkustvena metoda. Na podlagi pridobljenih statističnih podatkov in s pomočjo analize SWOT so ob koncu prikazane prednosti, slabosti in priložnosti ter nevarnosti obstoječe arhivske strokovne prakse generiranja slovenskega arhivskega metapodatkovnega korpusa. Podrobne analize obdelav rezultatov po posameznih arhivih ali nižjih popisnih enot niso predmet te raziskave.

Rezultati: Na podlagi opravljenih analiz statističnih podatkov avtor ugotavlja, da število zapisov nominalno raste iz leta v leto. Njihova rast pa je opredeljena s povprečno pol milijona zapisi in njihovimi povezavami letno. Hkrati z nominalno rastjo zapisov pa podrobna analiza trendov kaže, da so ti različni skozi daljše obdobje po posameznih modulih. Rezultati tudi kažejo, da bo treba izvesti nekatera uravnoteženja zajemov med moduli, saj se bo le na tej osnovi celoten korpus razvijal enakomerno, hkrati pa bodo uporabnikom in drugim sistemom zagotovljene celovite informacije o ohranjenih arhivskih entitetah v slovenskih javnih arhivih.

Sklepi/ugotovitve: Rezultati statističnih obdelav podatkov o vzajemnem podatkovnem korpusu SJAS kažejo na relativno visoko stopnjo profesionalnosti zaposlenih v SJAS. Hkrati s tem nakazujejo, da bo treba posvetiti ustrezno pozornost standardizaciji opisovanja kontekstov, doslednejšemu povezovanju zapisov o enotah popisa z drugimi zapisi v sami podatkovni zbirki in drugimi zapisi v drugih relevantnih podatkovnih zbirkah. Prav tako bo treba dosledneje uporabljati statistične metode na nižjih ravneh popisnih enot, saj bo le tako mogoče sproti ugotavljati morebitne pomanjkljivosti v zapisih in izvajati potrebne popravke. Hkrati pa bo treba nekatere rešitve prilagoditi zahtevam standarda ISO 24083:2021. Razumevanje gibanja statističnih podatkov v podatkovnem korpusu SJAS ni pomembno le za upravljalce podatkovne zbirke SJAS, ampak tudi za različne uporabnike arhivskega gradiva in drugih storitev v javnih arhivih pa tudi za morebitne nadaljnje implementacije v sistemih, ki jih s skupnim imenom označujemo »umetna inteligenca«.

Ključne besede:
vzajemna podatkovna zbirka, arhivski metapodatki, statistične metode, standardizacija, slovenska javna arhivska služba.

 

Abstract:
Purpose: The mutual archival metadata corpus of the Slovenian Public Archival Service (hereinafter: SJAS) comprises several million data entities captured between 2009/10 and 2022. The main purpose of the research is to determine the quality of the captured archival metadata using quantitative methods. On this basis, the potential usefulness of the corpus is then validated from the point of view of indirect and direct users, as well as of systems based on artificial intelligence.

Method / approach: Several methods were used for this paper. The basis consists of statistical methods, which were used to process data from annual statistical reports on the state and growth of data in the mutual archival database. In addition, a comparative method was used to compare the results obtained at the beginning of 2023 with data from comparable analyses from 2013 and 2016. To present the complexity of the problem as a whole, the following methods were also used: analysis, description, summarization and the experiential method. Finally, on the basis of the statistical data obtained and with the help of a SWOT analysis, the strengths, weaknesses, opportunities and threats of the existing professional archival practice of generating the Slovenian archival metadata corpus are presented. Detailed analyses of the results by individual archives or by lower-level units of description are not the subject of this research.

Results: Based on the analysis of the statistical data, the author finds that the number of records grows nominally from year to year, by an average of half a million records and their links annually. Alongside this nominal growth, a detailed trend analysis shows that the trends differ between individual modules over the longer term. The results also show that data capture will need to be rebalanced between modules, since only on this basis will the entire corpus develop evenly, while users and other systems are provided with comprehensive information about the archival entities preserved in Slovenian public archives.

Conclusions / findings: The results of the statistical processing of data on the mutual data corpus of SJAS indicate a relatively high level of professionalism among SJAS employees. At the same time, they indicate that due attention will have to be paid to standardizing the description of contexts and to linking records about units of description more consistently with other records in the database itself and with records in other modules of the Slovenian archival metadata corpus. It will also be necessary to apply statistical methods more consistently at the lower levels of units of description, as this is the only way to identify potential deficiencies in the records as they arise and to make the necessary corrections. In addition, some solutions will have to be adapted to the requirements of the ISO 24083:2021 standard. Understanding the movement of statistical data in the SJAS data corpus is important not only for the managers of the SJAS database, but also for the various users of archival material and other services in public archives, as well as for possible further implementations in systems known collectively as "artificial intelligence".

Keywords:
Mutual database, archival metadata, statistical methods, standardization, Slovenian public archival service.