Dates: 3-5 April 2023
Venue: Poseidonia Beach Hotel, Limassol, Cyprus.
Room: Triton 3, mezzanine.
Speakers:
Karel Hron, Palacký University, Czech Republic.
Dan Vilenchik, Ben-Gurion University, Israel.
Peter Winker, University of Giessen, Germany.
David Suda, University of Malta, Malta.
Vladimir Batagelj, University of Ljubljana, Slovenia.
Monday, 3 April 2023
Tuesday, 4 April 2023
Wednesday, 5 April 2023
Karel Hron, Palacký University, Czech Republic.
The analysis of distributional data (probability density functions or histogram data) has recently gained increasing attention in applications. Distributional data are often observed directly, or arise as a result of aggregating large streams of data. The course will provide an introduction to the analysis of these data using a Functional Data Analysis (FDA) approach, grounded in the perspective of Bayes spaces. These are mathematical spaces whose points are densities (or, more generally, measures), and they generalize the Aitchison simplex for multivariate compositional data to the FDA setting. The course will give a brief overview of the theory of Bayes spaces, as well as of the statistical methods developed in this setting. All the methods will be illustrated through examples from real case studies.
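As a small illustration of the Bayes space idea, the sketch below works with a density discretised on equal-width bins (an assumption made here for simplicity, not something stated in the course description). It applies the centred log-ratio (clr) transformation, which maps a positive density to zero-sum coordinates, and checks that perturbation (the Bayes space analogue of addition) becomes ordinary addition after the transformation. The function names are hypothetical.

```python
import math

def closure(values, bin_width):
    """Rescale positive values so they integrate to 1 over equal-width bins."""
    total = sum(values) * bin_width
    return [v / total for v in values]

def clr(density):
    """Centred log-ratio: log-density minus its mean over the bins."""
    logs = [math.log(v) for v in density]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]

def clr_inverse(coords, bin_width):
    """Map zero-sum coordinates back to a density that integrates to 1."""
    return closure([math.exp(c) for c in coords], bin_width)

def perturb(f, g, bin_width):
    """Perturbation of two densities: pointwise product, renormalised."""
    return closure([a * b for a, b in zip(f, g)], bin_width)

# Two toy histogram densities on five bins of width 0.5 (made-up values).
width = 0.5
f = closure([1.0, 2.0, 4.0, 2.0, 1.0], width)
g = closure([3.0, 1.0, 1.0, 1.0, 3.0], width)

# Perturbation in the original space equals addition in clr coordinates.
lhs = clr(perturb(f, g, width))
rhs = [a + b for a, b in zip(clr(f), clr(g))]
print(max(abs(a - b) for a, b in zip(lhs, rhs)))
```

Because clr coordinates live in an ordinary linear space, standard FDA tools can be applied to them and the results mapped back with `clr_inverse`. Zero bin counts would need separate treatment, since the logarithm is undefined there.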
Topics:
Dan Vilenchik, Ben-Gurion University, Israel.
High-dimensional (HD) data is characterized by a large number of features compared to a much smaller number of samples. Such data is prevalent in biology, economics, psychology, etc. The “curse of dimensionality” refers to a host of phenomena that concern consistency issues when applying classical statistical tools to HD data. One way to reduce the dimensionality of the data is to perform feature selection. In this talk we focus on unsupervised feature selection methods. We present some of the main methods used by practitioners, and study key questions that arise, both rigorously and through hands-on, data-driven experiments. The first question is what happens when the data has a low signal-to-noise ratio (SNR): how does low SNR in the HD setting affect feature selection, and what remedies may be offered? We then ask how prevalent truly hard HD datasets are in real-world applications (we define what we mean by “truly hard”), or whether such problems are mainly a theoretical curiosity.
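To make "unsupervised feature selection" concrete, here is a minimal sketch of one simple baseline: rank features by sample variance and keep the top k, without using any labels. This is an illustrative choice and not necessarily one of the methods covered in the talk; the function name and the toy data are hypothetical.

```python
from statistics import variance

def select_by_variance(rows, k):
    """Return the indices of the k features with the largest sample variance.

    rows is a list of samples, each a list of feature values.
    No response variable is used, so the selection is unsupervised.
    """
    columns = list(zip(*rows))
    variances = [variance(col) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])

# Toy data: 3 samples, 4 features (made-up values).
rows = [
    [0.0, 1.0, 5.0, 2.0],
    [0.1, 1.0, -5.0, 2.5],
    [0.0, 1.0, 4.0, 1.5],
]
print(select_by_variance(rows, 2))  # [2, 3]
```

With few samples and many features, the variance estimates themselves are noisy, which is one way low SNR can mislead a selection rule of this kind.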
Peter Winker, University of Giessen, Germany.
There is a growing interest in and use of textual information in different fields of economics, ranging from financial markets (analysts’ statements, communication of central banks) and innovation activities (patent abstracts, firm websites) to the development of economic science (journal articles, conference abstracts). Using such textual information for quantitative analysis involves several steps including, e.g., 1) the selection of appropriate sources (corpora) and establishing access, 2) the preparation of the text data for further analysis, 3) the identification of themes within documents, 4) the quantification of the relevance of themes in different documents, 5) the aggregation of relevant information, e.g. across sectors or over time, 6) the application of the generated indicators. The tutorial will provide some first insights and recommendations concerning these steps of the analysis and address open issues regarding, e.g., computational complexity and statistical robustness of the methods. All steps will be illustrated with empirical examples.
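Steps 2 to 5 above can be sketched in a few lines. The example below assumes a theme is defined by a fixed keyword list, which is a deliberate simplification (the tutorial may well use other approaches, such as topic models). The stop-word list, the keywords, the documents, and the function names are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# A tiny illustrative stop-word list; real applications use larger ones.
STOPWORDS = {"the", "of", "and", "a", "to", "in", "is"}

def tokenize(text):
    """Step 2: lower-case, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def theme_share(text, keywords):
    """Steps 3-4: share of a document's tokens that belong to the theme."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(counts[k] for k in keywords) / len(tokens)

def mean_share_by_period(documents, keywords):
    """Step 5: average the theme share over all documents in each period."""
    shares = defaultdict(list)
    for period, text in documents:
        shares[period].append(theme_share(text, keywords))
    return {period: sum(v) / len(v) for period, v in shares.items()}

# Made-up corpus of (year, text) pairs.
documents = [
    (2021, "The central bank raised the interest rate"),
    (2021, "Patent abstracts describe new battery technology"),
    (2022, "Inflation and the interest rate dominate the statement"),
]
monetary_keywords = {"rate", "inflation"}
print(mean_share_by_period(documents, monetary_keywords))
```

The resulting per-period averages are the kind of indicator that step 6 would then feed into an econometric model.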
David Suda, University of Malta, Malta.
Regularisation and sparsity in statistics refer to techniques which are not solely applicable to the high-dimensionality problem, but can certainly be beneficial in solving such problems. In this course, we start by going through techniques in regularised regression (namely penalised regression and partial least squares) and then dimension reduction, and show how they can be applied in the high-dimensional context. When it comes to dimension reduction, we mainly consider the PCA class of techniques; however, we also look into techniques within this class that are more applicable to the time series context. Apart from regularisation and sparsity, this course also aims to look into metaheuristic algorithms which can be used in the variable selection problem, such as the genetic algorithm and the firefly algorithm, to name a few. We shall be going through a number of hands-on examples, and also a number of examples in the literature where the aforementioned techniques are used. Installation of R/RStudio on one’s device prior to the course is recommended.
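As an illustration of how penalised regression produces sparsity, the following sketch implements the lasso by cyclic coordinate descent for the objective (1/(2n))·‖y − Xβ‖² + λ·‖β‖₁. It is written in plain Python for readability rather than in R, which the course itself uses; the function names and the toy design are hypothetical, and no intercept or standardisation is included.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; return 0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for the lasso (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            if z == 0.0:
                beta[j] = 0.0
                continue
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta

# Toy orthogonal design: the second feature carries only a weak signal.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
print(lasso_coordinate_descent(X, y, lam=0.0))  # least-squares fit
print(lasso_coordinate_descent(X, y, lam=0.1))  # weak coefficient set to zero
```

With λ = 0.1 the coefficient of the weak feature is set exactly to zero while the strong one is only shrunk, which is the variable selection effect that makes the lasso attractive when features outnumber samples.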
Vladimir Batagelj, University of Ljubljana, Slovenia.
In the SNA literature, we can find some well-known 3-way networks such as CKM physicians' innovation (1957), Kapferer tailor shop (1972), Krackhardt office CSS (1987), Lazega law firm (2001), etc. Recently physicists working on complex networks, for example Manlio De Domenico (2015), became interested in multiplex (multi-relational) networks, a subclass of multiway networks. In 1992 Borgatti and Everett, following Baker (1986), extended blockmodeling to general k-way binary networks. In Genova (2022) a 4-way network about Italian student mobility, based on V = (provinces, universities, programs, years), was analyzed. The world trade network (exporters, importers, categories, years) is similar.
A weighted multiway network N = (V, L, w) is based on nodes from k finite sets (ways or dimensions) V = (V_1, V_2, ..., V_k), the set of links L ⊆ V_1 × V_2 × ... × V_k, and the weight w : L → ℝ. In a general multiway network, additional data (node properties, further link weights) may also be available.
The course starts with some examples of multiway networks and a format for their description. The participants will learn about the basic notions and different transformations of multiway networks such as slicing, reordering of ways, joining the ways, flattening of a way, projection to a selected way, aggregation by a way partition (blockmodeling), normalization, recoding (binarization), 3D visualization (based on X3D), connectivity, cores, and others. They will be illustrated by their application in the analysis of different multiway networks.
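The definition N = (V, L, w) maps naturally onto a dictionary from k-tuples to weights. The sketch below uses that representation to illustrate three of the transformations listed above, under one plausible reading of the terms: slicing fixes a value in one way, flattening sums the weights over a way, and reordering permutes the ways. This is Python written for illustration only; it is not the MWnets package, whose own definitions and function names may differ, and the data are made up.

```python
from collections import defaultdict

def slice_network(links, way, value):
    """Keep links whose coordinate in `way` equals `value`; drop that way."""
    return {
        key[:way] + key[way + 1:]: w
        for key, w in links.items()
        if key[way] == value
    }

def flatten_way(links, way):
    """Remove `way` by summing the weights over all of its values."""
    out = defaultdict(float)
    for key, w in links.items():
        out[key[:way] + key[way + 1:]] += w
    return dict(out)

def reorder_ways(links, order):
    """Permute the ways; `order` lists the old positions in their new order."""
    return {tuple(key[i] for i in order): w for key, w in links.items()}

# Hypothetical 3-way network with ways (province, university, year).
links = {
    ("A", "U1", 2020): 3.0,
    ("A", "U1", 2021): 2.0,
    ("B", "U1", 2020): 1.0,
    ("B", "U2", 2021): 4.0,
}

print(slice_network(links, way=2, value=2020))  # 2-way network for 2020
print(flatten_way(links, way=2))                # totals over all years
print(reorder_ways(links, order=(2, 0, 1)))     # (year, province, university)
```

Storing only the links that exist keeps the representation sparse, which matters because the full product V_1 × V_2 × ... × V_k grows quickly with k.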
To support the analysis of multiway networks in R, the R package MWnets is being developed. The current version of the package is available at https://github.com/bavla/ibm3m/tree/master/multiway