
Assistant Professor
University of Western Macedonia, Department of Communication and Digital Media
I am an Assistant Professor in the Department of Communication and Digital Media at the University of Western Macedonia, Kastoria, Greece. I received my Ph.D. from the Department of Computer Science & Engineering, University of Ioannina, Greece, and my M.Sc. and B.Sc. in Computer Science from the same institution. For the academic year 2018-2019, I was an Adjunct Lecturer at the Department of Computer Science & Engineering, University of Ioannina. I also spent two years at the University of Houston, TX, USA, where I worked as a Postdoctoral Fellow at the Computational Biomedicine Lab.
My research covers a wide range of topics such as Computer Vision, Image and Video Processing, Image Analysis, Augmented Reality, Machine Learning, and Pattern Recognition with applications also to Medical Image Analysis, Biometrics, and Cardiovascular Informatics.
University of Western Macedonia, Department of Communication and Digital Media
University of Ioannina, Department of Computer Science & Engineering
University of Ioannina, Department of Computer Science & Engineering
University of Houston, Computational Biomedicine Lab
Ph.D. in Computer Science
University of Ioannina
Department of Computer Science and Engineering
Master in Computer Science
University of Ioannina
Department of Computer Science
Bachelor in Computer Science
University of Ioannina
Department of Computer Science
© Copyright Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright and/or copyright holders. All statements of fact, opinion or conclusions contained herein are those of the authors and should not be construed as representing the official views or policies of the sponsors. In most cases, these works may not be reposted without the explicit permission of the copyright holders. Please contact authors for further details.
In this work, a supervised probabilistic approach is proposed that integrates the learning using privileged information (LUPI) paradigm into a hidden conditional random field (HCRF) model, called HCRF+, for human action recognition. The proposed model employs a self-training technique for automatic estimation of the regularization parameters of the objective function. Moreover, the method provides robustness to outliers by modeling the conditional distribution of the privileged information by a Student's t-density function, which is naturally integrated into the HCRF+ framework. The proposed method was evaluated using different forms of privileged information on four publicly available datasets. The experimental results demonstrate its effectiveness relative to the state of the art in the LUPI framework, using both hand-crafted and deep learning-based features extracted from a convolutional neural network.
@article{MVrigkas_etal_21, author = {Michalis Vrigkas and Evangelos Kazakos and Christophoros Nikou and Ioannis A. Kakadiaris}, title = {Human activity recognition using robust adaptive privileged probabilistic learning}, journal = {Pattern Analysis and Applications}, pages = {1--18}, month = {January}, year = {2021}, doi = {10.1007/s10044-020-00953-x} }
In this paper, we study the concepts, materials, tools, and applications that constitute what we call augmented reality (AR) for the wine industry. We give a comprehensive review of the basic multimedia content and the minimum algorithmic requirements needed to implement successful AR applications for wine products. To this end, we provide a detailed analysis of how AR technology is used to create augmented “live” wine labels, and how digital storytelling has revolutionized the marketing of wine products. We also describe the use of AR technology to promote winemaking companies and influence consumer preferences. Finally, we report the characteristics of future research directions and some open issues and challenges in using AR for wine product promotion.
@inproceedings{MVrigkas_etal_ETLTC_21, author = {Michalis Vrigkas and Georgios Lappas and Alexandros Kleftodimos and Amalia Triantafillidou}, title = {Augmented Reality for Wine Industry: Past, Present, and Future}, booktitle = {3rd ACM Chapter International Conference on Information and Communications Technology}, address = {Aizuwakamatsu, Japan}, pages = {}, month = {January}, year = {2021} }
In this paper, the task of gender and age recognition is performed on pedestrian still images, which are usually captured in the wild with no near-frontal face information. Moreover, another difficulty originates from the underlying class imbalance in real examples, especially for the age estimation problem. The scope of the paper is to examine how different loss functions in convolutional neural networks (CNN) perform under the class imbalance problem. For this purpose, we employ the Residual Network (ResNet) as a backbone. On top of that, we attempt to benefit from appearance-based attributes, which are inherently present in the available data. We incorporate this knowledge in an autoencoder, which we attach to our baseline CNN so that the combined model jointly learns the features and increases the classification accuracy. Finally, all of our experiments are evaluated on two publicly available datasets.
@inproceedings{GChatzitzisi_etal_20, author = {Georgia Chatzitzisi and Michalis Vrigkas and Christophoros Nikou}, title = {Gender and age estimation without facial information from still images}, booktitle = {International Symposium on Visual Computing}, publisher = {Springer International Publishing}, address = {San Diego, CA}, pages = {488--500}, month = {October}, year = {2020}, isbn = {978-3-030-64556-4} }
Advanced methodologies for transmitting compressed images, within acceptable ranges of transmission rate and loss of information, make it possible to transmit a medical image through a communication channel. Most prior works on 3D medical image compression consider volumetric images as a whole but fail to account for the spatial and temporal coherence of adjacent slices. In this paper, we set out to develop a 3D medical image compression method that extends the 3D wavelet difference reduction algorithm by computing the similarity of the pixels in adjacent slices and progressively compressing only the similar slices. The proposed method achieves high efficiency on publicly available datasets of MRI scans, reaching compression down to one bit per voxel with PSNR and SSIM up to 52.3 dB and 0.7578, respectively.
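For readers unfamiliar with the PSNR figure quoted in the abstract above, the following is a minimal sketch of the standard definition, PSNR = 10 log10(peak^2 / MSE). It is not the evaluation code of the paper: the 8-bit peak value of 255, the function name `psnr`, and the toy voxel values are assumptions made purely for illustration.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized sequences.

    `peak` is the maximum possible intensity (255 assumes 8-bit data).
    """
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the original and the decoded samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical flattened voxel intensities, for illustration only.
voxels = [52, 55, 61, 59, 79, 61, 76, 61]
decoded = [52, 54, 61, 60, 79, 61, 75, 61]
print(f"{psnr(voxels, decoded):.2f} dB")
```

Higher values indicate a reconstruction closer to the original; a lossless reconstruction yields infinity under this definition.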
@inproceedings{MZerva_etal_20, author = {Matina Ch. Zerva and Michalis Vrigkas and Lisimachos P. Kondi and Christophoros Nikou}, title = {Improving {3D} medical image compression efficiency using spatiotemporal coherence}, booktitle = {Proc. IS\&T International Symposium on Electronic Imaging, Image Processing: Algorithms and Systems XVII}, pages = {63-1--63-6}, address = {Burlingame, CA}, month = {January}, year = {2020} }
Many approaches for action recognition focus on general actions, such as “running” or “walking”. This work presents a method for recognizing carrying actions in single images, by utilizing privileged information, such as annotations, available only during training, following the learning using privileged information paradigm. In addition, we introduce a dataset for carrying actions, formed using images extracted from YouTube videos depicting several scenarios. We accompany the dataset with a variety of different annotation types that include human pose, object and scene attributes. The experimental results demonstrate that our method boosted sample-averaged F1 score performance by 15.4% and 4.15% in the validation and testing partitions of our dataset, respectively, when compared to an end-to-end CNN model trained only with observable information.
@inproceedings{CSmailis_ICIP19, author = {Christos Smailis and Michalis Vrigkas and Ioannis A. Kakadiaris}, title = {RECASPIA: Recognizing carrying actions in single images using privileged information}, booktitle = {Proc. 26th IEEE International Conference on Image Processing}, pages = {26--30}, address = {Taipei, Taiwan}, month = {September}, year = {2019} }
Hidden conditional random fields (HCRFs) are a powerful supervised classification model able to capture the intrinsic motion patterns of a human action. However, finding the optimal number of hidden states remains a severe limitation for this model. This paper addresses this limitation by proposing a new model, called robust incremental hidden conditional random field (RI-HCRF). A hidden Markov model (HMM) is created for each observation paired with an action label and its parameters are defined by the potentials of the original HCRF graph. Starting from an initial number of hidden states and increasing their number incrementally, the Viterbi path is computed for each HMM. The method seeks a sequence of hidden states in which each variable participates in a maximum number of optimal paths. Thereby, variables with low participation in optimal paths are rejected. In addition, a robust mixture of Student's t-distributions is imposed as a regularizer on the parameters of the model. The experimental results on human action recognition show that RI-HCRF successfully estimates the number of hidden states and outperforms all state-of-the-art models.
@inproceedings{MVrigkas_ISVC18, author = {Michalis Vrigkas and Ermioni Mastora and Christophoros Nikou and Ioannis A. Kakadiaris}, title = {Robust incremental hidden conditional random fields for human action recognition}, booktitle = {Proc. 13th International Symposium on Visual Computing}, address = {Las Vegas, NV}, month = {November}, pages = {126--136}, year = {2018} }
The 2013 ACC/AHA Pooled Cohort Equations risk calculator has been shown to be inaccurate in certain populations. Using the same risk variables, we developed a Machine Learning-based risk calculator in the MESA (Multi-Ethnic Study of Atherosclerosis) cohort and validated it in the Flemish Study on Environment, Genes and Health Outcomes (FLEMENGHO). The ML Risk Calculator outperformed the ACC/AHA Risk Calculator by recommending less drug therapy, yet missing fewer CVD events. These findings demonstrate the potential of Machine Learning to assist medical decision-making.
@article{Kakadiaris_JAHA18, author = {Ioannis A. Kakadiaris and Michalis Vrigkas and Albert A. Yen and Tatiana Kuznetsova and Matthew Budoff and Morteza Naghavi}, title = {Machine learning outperforms {ACC/AHA CVD} risk calculator in {MESA}}, journal = {Journal of the American Heart Association}, volume = {7}, number = {22}, pages = {e009476}, year = {2018}, month = {November}, doi = {10.1161/JAHA.118.009476} }
Introduction: Machine learning (ML) is poised to revolutionize healthcare. Current national guidelines for prediction and prevention of atherosclerotic cardiovascular disease (ASCVD) use the ACC/AHA Pooled Cohort Equations Risk Calculator, which relies on traditional risk factors and linear statistical models. Unfortunately, this approach yields a low level of sensitivity and specificity. The low sensitivity results in missing high-risk individuals who need intensive therapy, and the low specificity results in millions of people being unnecessarily recommended drugs such as statins. We aimed to utilize ML to create a more accurate predictor of ASCVD events and of who should be recommended statins.
Methods: We developed and validated an ML Risk Calculator based on Support Vector Machines (SVMs) using the latest 13-year follow-up dataset from MESA (Multi-Ethnic Study of Atherosclerosis) of 6,459 participants who were free of cardiovascular disease at baseline. We provided identical input to the ACC/AHA and ML risk calculators and compared their accuracy. We also validated the ML model in another longitudinal cohort: the Flemish Study on Environment, Genes and Health Outcomes (FLEMENGHO).
Results: According to the ACC/AHA Risk Calculator and a 7.5% 10-year risk threshold, 46.0% would be recommended statin. Despite this high proportion, 23.8% of the 480 “Hard CVD” events occurred in those not recommended statin, resulting in sensitivity (Sn) 0.76, specificity (Sp) 0.56, and AUC 0.71. In contrast, the ML Risk Calculator recommended statin to 11.4%, and only 14.4% of “Hard CVD” events occurred in those not recommended statin, resulting in Sn 0.86, Sp 0.95, and AUC 0.92. Similar results were seen in prediction of “All CVD” events.
Conclusions: The ML Risk Calculator outperformed the ACC/AHA Risk Calculator by recommending less drug therapy, yet missing fewer events. Additional studies are underway to validate the ML model in other cohorts and to explore its ability in predicting short-term (1-5 years) events with additional biomarkers including imaging. Machine learning is paving the way for early detection of asymptomatic high-risk individuals destined for a CVD event in the near future, the Vulnerable Patient.
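The sensitivity values in the Results above can be checked directly from the reported share of missed events, since sensitivity is the fraction of events that occurred among those recommended treatment. The short sketch below only reproduces that arithmetic from the percentages quoted in the abstract; the function name is hypothetical and no study data are involved.

```python
def sensitivity_from_missed(missed_fraction):
    """Sensitivity = 1 - fraction of events occurring in the untreated group."""
    if not 0.0 <= missed_fraction <= 1.0:
        raise ValueError("missed_fraction must lie in [0, 1]")
    return 1.0 - missed_fraction


# Missed-event fractions quoted in the abstract for "Hard CVD" events.
acc_aha_sn = sensitivity_from_missed(0.238)  # ACC/AHA Risk Calculator
ml_sn = sensitivity_from_missed(0.144)       # ML Risk Calculator

print(round(acc_aha_sn, 2), round(ml_sn, 2))
```

Rounded to two decimals, these agree with the reported Sn of 0.76 and 0.86. Specificity and AUC cannot be recovered from the abstract alone, because they depend on the non-event counts and the full score distribution.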
@inproceedings{Kakadiaris_etal18, author = {Ioannis A Kakadiaris and Michail Vrigkas and Albert Yen and Tatiana Kuznetsova and Matthew Budoff and Morteza Naghavi}, title = {Machine learning outperforms ACC/AHA CVD risk calculator in MESA offering new opportunities for short-term risk prediction and early detection of the vulnerable patient}, journal = {Circulation}, volume = {138}, number = {Suppl\ 1}, pages = {A17154--A17154}, year = {2018}, month = {November}, address = {Chicago, IL}, publisher = {American Heart Association, Inc.}, doi = {10.1161/circ.138.suppl\_1.17154} }
An algorithm for the localization and counting of cells in histopathological images is presented. The algorithm relies on the presegmentation of an image into a number of superpixels followed by two random forests for classification. The first random forest determines if there are any cells in the superpixels at its input and the second random forest provides the number of cells in the respective superpixel. The algorithm is evaluated on a bone marrow histopathological dataset. We argue that a single random forest is not sufficient to detect all the cells in the image while a cascade of classifiers achieves higher accuracy. The results compare favorably with the state of the art but with a lower computational cost.
@inproceedings{MOman_VISAPP18, author = {Oman Maga\~{n}a-Tellez and Michalis Vrigkas and Christophoros Nikou and Ioannis A. Kakadiaris}, title = {SPICE: Superpixel classification for cell detection and counting}, booktitle = {Proc. 13th International Conference on Computer Vision Theory and Applications}, address = {Funchal, Madeira, Portugal}, month = {January}, pages = {485--490}, year = {2018} }
Studies have shown that the status quo for atherosclerotic cardiovascular disease (ASCVD) prediction in the U.S., the ACC/AHA Pooled Cohort Equations Risk Calculator, is inaccurate and results in overtreatment of low-risk and undertreatment of high-risk individuals. Machine Learning (ML) is poised to revolutionize healthcare. We used ML to develop a new ASCVD risk calculator that tackles this problem.
@article{IKakadiaris_AHA17, author = {Ioannis Kakadiaris and Michalis Vrigkas and Matthew Budoff and Albert Yen and Morteza Naghavi}, title = {Machine learning outperformed {ACC/AHA} {P}ooled {C}ohort {E}quations {R}isk {C}alculator for detection of high-risk asymptomatic individuals and recommending treatment for prevention of cardiovascular events in the {M}ulti-{E}thnic {S}tudy of {A}therosclerosis {(MESA)}}, volume = {136}, number = {Suppl 1}, pages = {A23075--A23075}, year = {2017}, month = {November 11-15}, address = {Anaheim, CA}, publisher = {American Heart Association, Inc.}, issn = {0009-7322}, URL = {http://circ.ahajournals.org/content/136/Suppl_1/A23075}, eprint = {http://circ.ahajournals.org/content}, journal = {Circulation} }
Classification models may often suffer from “structure imbalance” between training and testing data that may occur due to a deficient data collection process. This imbalance can be represented by the learning using privileged information (LUPI) paradigm. In this paper, we present a supervised probabilistic classification approach that integrates LUPI into a hidden conditional random field (HCRF) model. The proposed model is called LUPI-HCRF and is able to cope with additional information that is only available during training. Moreover, the proposed method employs Student's
@inproceedings{MVrigkas_ICCVW17, author = {Michalis Vrigkas and Evangelos Kazakos and Christophoros Nikou and Ioannis A. Kakadiaris}, title = {Inferring human activities using robust privileged probabilistic learning}, booktitle = {Proc. IEEE International Conference on Computer Vision Workshops}, year = {2017}, month = {October}, pages = {2658--2665}, address = {Venice, Italy} }
Incorporating additional knowledge in the learning process can be beneficial for several computer vision and machine learning tasks. Whether privileged information originates from a source domain that is adapted to a target domain, or as additional features available at training time only, using such privileged (i.e., auxiliary) information is of high importance as it improves the recognition performance and generalization. However, both primary and privileged information are rarely derived from the same distribution, which poses an additional challenge to the recognition task. To address these challenges, we present a novel learning paradigm that leverages privileged information in a domain adaptation setup to perform visual recognition tasks. The proposed framework, named Adaptive SVM+, combines the advantages of both the learning using privileged information (LUPI) paradigm and the domain adaptation framework, which are naturally embedded in the objective function of a regular SVM. We demonstrate the effectiveness of our approach on the publicly available Animals with Attributes and INTERACT datasets and report state-of-the-art results in both of them.
@inproceedings{NSarafianos_ICCVW17, author = {Nikolaos Sarafianos and Michalis Vrigkas and Ioannis A. Kakadiaris}, title = {Adaptive SVM+: Learning with privileged information for domain adaptation}, booktitle = {Proc. IEEE International Conference on Computer Vision Workshops}, year = {2017}, month = {October}, pages = {2637--2644}, address = {Venice, Italy} }
In this paper, a human behavior recognition method using multimodal features is presented. We focus on modeling individual and social behaviors of a subject (e.g., friendly/aggressive or hugging/kissing behaviors) with a hidden conditional random field (HCRF) in a supervised framework. Each video is represented by a vector of spatio-temporal visual features (STIP, head orientation and proxemic features) along with audio features (MFCCs). We propose a feature pruning method for removing irrelevant and redundant features based on the spatio-temporal neighborhood of each feature in a video sequence. The proposed framework assumes that human movements are highly correlated with sound emissions. For this reason, canonical correlation analysis (CCA) is employed to find correlation between the audio and video features prior to fusion. The experimental results, obtained on two human behavior recognition datasets including political speeches and human interactions from TV shows, attest to the advantages of the proposed method compared with several baseline and alternative human behavior recognition methods.
@article{MVrigkas_TAffC15, author = {Michalis Vrigkas and Christophoros Nikou and Ioannis A. Kakadiaris}, title = {Identifying human behaviors using synchronized audio-visual cues}, journal = {IEEE Transactions on Affective Computing}, year = {2017}, volume = {8}, number = {1}, pages = {54--66}, doi = {10.1109/TAFFC.2015.2507168}, month = {January} }
This thesis addresses the problem of human activity recognition from video sequences. To model human activities, conditional random fields were applied using data from heterogeneous sources. Moreover, a novel classification scheme based on the learning using privileged information (LUPI) paradigm was also proposed, where privileged information is given as an additional input to the classification model and is available only during training but never during testing. Experimental results demonstrated that privileged information helps to build a stronger classifier than one could learn without it, and that it significantly increases the recognition accuracy of the model.
@phdthesis{phdthesisMVrigkas16, author = {Michalis Vrigkas}, title = {Human activity recognition using conditional random fields and privileged information}, school = {Department of Computer Science and Engineering, University of Ioannina}, year = {2018}, month = {May} }
In many human activity recognition systems, the amount of unlabeled training data may be very large because of the expensive human effort required for data annotation. Moreover, an insufficient data collection process from heterogeneous sources may cause dissimilarities between training and testing data. To address these limitations, a novel probabilistic approach that combines learning using privileged information (LUPI) and active learning is proposed. A pool-based privileged active learning approach is presented for semi-supervised learning of human activities from multimodal labeled and unlabeled data. Both uncertainty and distance from the decision boundary are used as query inference strategies for selecting an unlabeled observation and querying its label. Experimental results on four publicly available datasets demonstrate that the proposed method can identify complex human activities with high accuracy.
@inproceedings{MVrigkas_ICIP16, author = {Michalis Vrigkas and Christophoros Nikou and Ioannis A. Kakadiaris}, title = {Active privileged learning of human activities from weakly labeled samples}, booktitle = {Proc. 23rd IEEE International Conference on Image Processing}, year = {2016}, month = {September}, pages = {3036--3040}, address = {Phoenix, AZ} }
Most facial expression recognition methods assume that training and testing data are equally distributed. As facial image sequences may contain information from heterogeneous sources, facial data may be asymmetrically distributed between training and testing, as it may be difficult to maintain the same quality and quantity of information. In this work, we present a novel classification method based on the learning using privileged information (LUPI) paradigm to address the problem of facial expression recognition. We introduce a probabilistic classification approach based on conditional random fields (CRFs) to indirectly propagate knowledge from the privileged to the regular feature space. Each feature space has its own parameter settings, which are combined through a Gaussian prior to train the proposed t-CRF+ model, allowing the different tasks to share parameters and improve classification performance. The proposed method is validated on two challenging and publicly available benchmarks on facial expression recognition and improves on the state-of-the-art methods in the LUPI framework.
@inproceedings{MVrigkas_ICB16, author = {Michalis Vrigkas and Christophoros Nikou and Ioannis A. Kakadiaris}, title = {Exploiting privileged information for facial expression recognition}, booktitle = {Proc. 9th IAPR/IEEE International Conference on Biometrics}, year = {2016}, month = {June}, pages = {1--8}, address = {Halmstad, Sweden}, doi = {10.1109/ICB.2016.7550048}, note = {Honorable Mention Paper Award} }
Recognizing human activities from video sequences or still images is a challenging task due to problems, such as background clutter, partial occlusion, changes in scale, viewpoint, lighting, and appearance. Many applications, including video surveillance systems, human-computer interaction, and robotics for human behavior characterization, require a multiple activity recognition system. In this work, we provide a detailed review of recent and state-of-the-art research advances in the field of human activity classification. We propose a categorization of human activity methodologies and discuss their advantages and limitations. In particular, we divide human activity classification methods into two large categories according to whether they use data from different modalities or not. Then, each of these categories is further analyzed into sub-categories, which reflect how they model human activities and what type of activities they are interested in. Moreover, we provide a comprehensive analysis of the existing, publicly available human activity classification datasets and examine the requirements for an ideal human activity recognition dataset. Finally, we report the characteristics of future research directions and present some open issues on human activity recognition.
@article{MVrigkas_FRONTIERS2015, author = {Michalis Vrigkas and Christophoros Nikou and Ioannis A. Kakadiaris}, title = {A review of human activity recognition methods}, journal = {Frontiers in Robotics and Artificial Intelligence}, volume = {2}, number = {28}, pages = {1--26}, year = {2015}, url = {http://www.frontiersin.org/vision_systems_theory,_tools_and_applications/10.3389/frobt.2015.00028/abstract}, doi = {10.3389/frobt.2015.00028}, issn = {2296-9144} }
The automated interpretation of Pap smear images is a challenging issue with several aspects. The accurate segmentation of the structuring elements of each cell is a crucial procedure that leads to the correct identification of pathological situations. However, the extensive cell overlapping in Pap smear slides complicates the automated analysis of these cytological images. In this work, we propose an efficient algorithm for the separation of the cytoplasm area of overlapping cells. The proposed method is based on the fact that in isolated cells the pixels of the cytoplasm exhibit similar features and the cytoplasm area is homogeneous. Thus, the observation of intensity changes in extended subareas of the cytoplasm indicates the existence of overlapping cells. In the first step of the proposed method, the image is tessellated into perceptually meaningful individual regions using a superpixel algorithm. In a second step, these areas are merged into regions exhibiting the same characteristics, resulting in the identification of each cytoplasm area and the corresponding nuclei. The area of overlap is then detected using an algorithm that identifies faint changes in the intensity of the cytoplasm of each cell. The method has been evaluated on cytological images of conventional Pap smears, and the results are very promising.
@inproceedings{MPlissiti_IWSSIP15, author = {Marina E. Plissiti and Michalis Vrigkas and Christophoros Nikou}, title = {Segmentation of cell clusters in Pap smear images using intensity variation between superpixels}, booktitle = {Proc. 22nd International Conference on Systems, Signals and Image Processing}, year = {2015}, month = {September}, pages = {184--187}, address = {London, UK} }
A global robust M-estimation scheme for maximum a posteriori (MAP) image super-resolution which efficiently addresses the presence of outliers in the low-resolution images is proposed. In iterative MAP image super-resolution, the objective function to be minimized involves the highly resolved image, a parameter controlling the step size of the iterative algorithm, and a parameter weighing the data fidelity term with respect to the smoothness term. Apart from the robust estimation of the high-resolution image, the contribution of the proposed method is twofold: (1) the robust computation of the regularization parameters controlling the relative strength of the prior with respect to the data fidelity term and (2) the robust estimation of the optimal step size in the update of the high-resolution image. Experimental results demonstrate that integrating these estimations into a robust framework leads to significant improvement in the accuracy of the high-resolution image.
@article{MVrigkas_JEI14, author = {Michalis Vrigkas and Christophoros Nikou and Lisimachos P. Kondi}, title = {Robust maximum a posteriori image super-resolution}, journal = {Journal of Electronic Imaging}, volume = {23}, number = {4}, pages = {043016}, year = {2014}, issn = {1017-9909}, doi = {10.1117/1.JEI.23.4.043016}, URL = {http://dx.doi.org/10.1117/1.JEI.23.4.043016} }
A human behavior recognition method with an application to political speech videos is presented. We focus on modeling the behavior of a subject with a conditional random field (CRF). The unary terms of the CRF employ spatiotemporal features (i.e., HOG3D, STIP and LBP). The pairwise terms are based on kinematic features such as the velocity and the acceleration of the subject. As an exact solution to the maximization of the posterior probability of the labels is generally intractable, loopy belief propagation was employed as an approximate inference method. To evaluate the performance of the model, we also introduce a novel behavior dataset, which includes low resolution video sequences depicting different people speaking in the Greek parliament. The subjects of the Parliament dataset are labeled as friendly, aggressive or neutral depending on the intensity of their political speech. The discrimination between friendly and aggressive labels is not straightforward in political speeches as the subjects perform similar movements in both cases. Experimental results show that the model can reach high accuracy in this relatively difficult dataset.
@inproceedings{MVrigkas_SETN14, author = {Michalis Vrigkas and Christophoros Nikou and Ioannis A. Kakadiaris}, title = {Classifying behavioral attributes using conditional random fields}, booktitle = {Proc. 8th Hellenic Conference on Artificial Intelligence}, year = {2014}, month = {May}, pages = {95--104}, volume = {8445}, series = {Lecture Notes in Computer Science}, address = {Ioannina, Greece} }
A learning-based framework for action representation and recognition relying on the description of an action by time series of optical flow motion features is presented. In the learning step, the motion curves representing each action are clustered using Gaussian mixture modeling (GMM). In the recognition step, the optical flow curves of a probe sequence are also clustered using a GMM, then each probe sequence is projected onto the training space and the probe curves are matched to the learned curves using a non-metric similarity function based on the longest common subsequence, which is robust to noise and provides an intuitive notion of similarity between curves. Alignment between the mean curves is performed using canonical time warping. Finally, the probe sequence is categorized to the learned action with the maximum similarity using a nearest neighbor classification scheme. We also present a variant of the method where the length of the time series is reduced by dimensionality reduction in both training and test phases, in order to smooth out the outliers, which are common in this type of sequence. Experimental results on KTH, UCF Sports and UCF YouTube action databases demonstrate the effectiveness of the proposed method.
@article{MVrigkas_CVIU14, author = {Michalis Vrigkas and Vasileios Karavasilis and Christophoros Nikou and Ioannis A. Kakadiaris}, title = {Matching mixtures of curves for human action recognition}, journal = {Computer Vision and Image Understanding}, volume = {119}, pages = {27--40}, year = {2014}, issn = {1077-3142}, doi = {10.1016/j.cviu.2013.11.007} }
The accuracy of image registration plays a dominant role in image super-resolution methods, and in the related literature landmark-based registration methods have gained increasing acceptance in this framework. In this work, we take advantage of a maximum a posteriori (MAP) scheme for image super-resolution in conjunction with the maximization of mutual information to improve image registration for super-resolution imaging. Local as well as global motion in the low-resolution images is considered. The overall scheme consists of two steps. First, the low-resolution images are registered by establishing correspondences between image features. Second, the registration parameters are fine-tuned along with the high-resolution image estimation, using the maximization of mutual information criterion. Quantitative and qualitative results are reported indicating the effectiveness of the proposed scheme, which is evaluated with different image features and MAP image super-resolution computation methods.
@article{MVrigkas_SPIC13, author = {Michalis Vrigkas and Christophoros Nikou and Lisimachos P. Kondi}, title = {Accurate image registration for {MAP} image super-resolution}, journal = {Signal Processing: Image Communication}, volume = {28}, number = {5}, pages = {494--508}, year = {2013}, issn = {0923-5965}, doi = {10.1016/j.image.2012.12.008} }
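The mutual information criterion used to fine-tune the registration in the abstract above can be estimated from a joint intensity histogram. The following is a minimal sketch under stated assumptions, not the method from the paper: images are taken to be flat sequences of grayscale values in `[0, max_value]`, and the function name, the `bins` parameter, and the quantization scheme are hypothetical choices for illustration.

```python
import math
from collections import Counter
from typing import Sequence


def mutual_information(img_a: Sequence[float], img_b: Sequence[float],
                       bins: int = 8, max_value: float = 255.0) -> float:
    """Mutual information (in nats) of two equally sized grayscale images.

    Intensities are quantized into `bins` levels and the marginal and
    joint distributions are estimated by simple counting.
    """
    if len(img_a) != len(img_b) or len(img_a) == 0:
        raise ValueError("images must be non-empty and of equal size")

    def quantize(value: float) -> int:
        # Map [0, max_value] onto the bin indices 0 .. bins - 1.
        return min(int(value * bins / (max_value + 1)), bins - 1)

    qa = [quantize(v) for v in img_a]
    qb = [quantize(v) for v in img_b]
    n = len(qa)
    joint = Counter(zip(qa, qb))
    marginal_a = Counter(qa)
    marginal_b = Counter(qb)

    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts.
        mi += (count / n) * math.log(count * n / (marginal_a[x] * marginal_b[y]))
    return mi


reference = [0, 0, 255, 255]
aligned = [0, 0, 255, 255]      # identical content: maximal dependence
unrelated = [0, 255, 0, 255]    # statistically independent of `reference`
print(mutual_information(reference, aligned, bins=2))
print(mutual_information(reference, unrelated, bins=2))
```

In a registration loop, one would evaluate this score for candidate transformations of one image and keep the parameters that maximize it; that search is deliberately left out of the sketch.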
A framework for action representation and recognition based on the description of an action by time series of optical flow motion features is presented. In the learning step, the motion curves representing each action are clustered using Gaussian mixture modeling (GMM). In the recognition step, the optical flow curves of a probe sequence are also clustered using a GMM and the probe curves are matched to the learned curves using a non-metric similarity function based on the longest common subsequence which is robust to noise and provides an intuitive notion of similarity between trajectories. Finally, the probe sequence is categorized to the learned action with the maximum similarity using a nearest neighbor classification scheme. Experimental results on common action databases demonstrate the effectiveness of the proposed method.
@inproceedings{MVrigkas_VISAPP13, author = {Michalis Vrigkas and Vasileios Karavasilis and Christophoros Nikou and Ioannis Kakadiaris}, title = {Action recognition by matching clustered trajectories of motion vectors}, booktitle = {Proc. 8th International Conference on Computer Vision Theory and Applications}, year = {2013}, pages = {112--117}, address = {Barcelona, Spain}, month = {February} }
In this work, we propose an adaptive M-estimation scheme for robust image super-resolution. The proposed algorithm relies on a maximum a posteriori (MAP) framework and addresses the presence of outliers in the low resolution images. Moreover, apart from the robust estimation of the high resolution image, the contribution of the method is twofold: (i) the robust computation of the regularization parameters controlling the relative strength of the prior with respect to the data fidelity term and (ii) the robust estimation of the optimal step size in the update of the high resolution image. Experimental results demonstrate that integrating these estimations into a robust framework leads to significant improvement in the accuracy of the high resolution image.
@inproceedings{MVrigkas_ICIP12, author = {Michalis Vrigkas and Christophoros Nikou and Lisimachos P. Kondi}, title = {A fully robust framework for MAP image Super-Resolution}, booktitle = {Proc. IEEE International Conference on Image Processing}, year = {2012}, pages = {2225--2228}, address = {Orlando, FL}, month = {September} }
Accurate image registration plays a preponderant role in image super-resolution methods, and in the related literature landmark-based registration methods have gained increasing acceptance in this framework. However, their solution relies on point correspondences and on least squares estimation of the registration parameters, necessitating further improvement. In this work, a maximum a posteriori scheme for image super-resolution is presented where the image registration part is accomplished in two steps. First, the low-resolution images are registered by establishing correspondences between robust SIFT features. In the second step, the estimation of the registration parameters is fine-tuned along with the estimation of the high-resolution image, in an iterative scheme, using the maximization of the mutual information criterion. Numerical results showed that the reconstructed image is consistently of higher quality than in standard MAP-based methods employing only landmarks.
@inproceedings{MVrigkas_ICASSP11, author = {Michalis Vrigkas and Christophoros Nikou and Lisimachos P. Kondi}, title = {On the improvement of image registration for high accuracy super-resolution}, booktitle = {Proc. IEEE International Conference on Acoustics, Speech and Signal Processing}, year = {2011}, pages = {981--984}, address = {Prague, Czech Republic}, month = {May} }
I am actively involved in learning efficient and discriminative image representations and in providing solutions to challenging real-world problems. My research is focused on Computer Vision, Image and Video Processing, Machine Learning, and Augmented Reality. Special areas of research such as Biometrics, Medical Image Analysis, and Predictive Analytics for heart attack prediction have also attracted my interest. As a Postdoctoral Fellow at the University of Houston, my research focused on challenging tasks with a significant societal impact and involved developing machine learning algorithms for medical and biometric applications.
Each of the application areas described above employs a range of computer vision tasks: more or less well-defined measurement or processing problems that can be solved using a variety of methods. I approach these problems with methods from signal processing and applied mathematics. I have worked on several research projects, and C/C++, MATLAB, and Python are the languages I prefer for code development.
The goal of this project is to empower cancer patients to digitize, securely manage and selectively disseminate their own medical records to doctors and healthcare providers, exploiting modern mobile devices and the ubiquity of wireless broadband internet access. Pursuit of the project’s goal is through the implementation of an information system for the creation, management and sharing of cancer patients’ medical records.
The figures for cancer incidence are staggering: in Greece alone, more than 40,000 new cancer patients are added every year. Internationally, the number of new cancer cases exceeds 14,000,000 a year. This project shall enable members of the global cancer patient population to easily and securely manage their own medical records and share them with doctors and healthcare providers all over the globe. This will make remote consultation with doctors who are thousands of miles away much easier and more affordable. For the same reason, this project shall have a serious impact on the ways that doctors and medical professionals in the fields of cancer diagnosis and treatment manage information about existing and new patients, allowing them to provide consultation to larger numbers of patients than is currently possible using conventional means for managing and sharing medical records.
The Parliament dataset is a collection of 228 video sequences, depicting political speeches in the Greek parliament, at a resolution of 320 × 240 pixels at 25 fps. All behaviors were recorded for 20 different subjects. The videos were acquired with a static camera and contain uncluttered backgrounds. The length of the video sequences is 250 frames. The video sequences were manually labeled with one of three behavioral labels: friendly (90 videos), aggressive (73 videos), or neutral (65 videos). The subjects express their opinion on a specific law proposal and they adjust their body movements and voice intensity level according to whether they agree with it or not.
The dataset was annotated by two observers of Greek origin, who watched the videos independently and recorded their labels separately. Disagreement was resolved by a third observer. The observers were asked to categorize the videos with respect to the notions of kindness and aggressiveness according to a general perception of a political speech by a citizen with a Greek mentality, as follows. (i) Subjects with large and abrupt body, head and hand movements and high speech signal amplitude are to be labeled as aggressive. This corresponds to statesmen who strongly express their disagreement with the topic discussed or with a previous speech given by a political opponent. (ii) Subjects with very small variations in their motion and speech signal amplitude are to be labeled as neutral. This class includes standard political speeches only expressing a point of view without any strong indication (body motion or voice tone) of agreement or disagreement with the topic discussed. (iii) Subjects with large but smooth variations in the pose of their body and hands, speaking with a normal speech signal amplitude, are to be labeled as friendly.
If you use this dataset, I would be grateful if you cite one of the following related publications:
Education is an admirable thing. But it is well to remember from time to time that nothing that is worth knowing can be taught.
Instructor For more information see the course web page here.
I would be happy to talk to you if you need my assistance in your research or you have any suggestions or comments about my work.
Coordinates: 40.504973, 21.248093
University of Western Macedonia,
Department of Communication and Digital Media,
Fourka Area,
GR 52100, Kastoria,
Greece.