Journal of machine learning research impact factor 2022

Journal Description

Machine Learning and Knowledge Extraction is an international, scientific, peer-reviewed, open access journal. It publishes original research articles, reviews, tutorials, research ideas, short notes and Special Issues that focus on machine learning and applications. Please see our video on YouTube explaining the MAKE journal concept. The journal is published quarterly online by MDPI.

  • Open Access— free for readers, with article processing charges (APC) paid by authors or their institutions.
  • High Visibility: indexed within Scopus, ESCI (Web of Science), dblp, and other databases.
  • Rapid Publication: manuscripts are peer-reviewed and a first decision provided to authors approximately 19.5 days after submission; acceptance to publication is undertaken in 3.6 days (median values for papers published in this journal in the first half of 2022).
  • Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
  • MAKE is a companion journal of Entropy.

Latest Articles

Open AccessArticle

FeaSel-Net: A Recursive Feature Selection Callback in Neural Networks

Abstract

Selecting only the relevant subsets from all gathered data has never been as challenging as it is in these times of big data and sensor fusion. Multiple complementary methods have emerged for the observation of similar phenomena; oftentimes, many of these techniques are [...] Read more.

Selecting only the relevant subsets from all gathered data has never been as challenging as it is in these times of big data and sensor fusion. Multiple complementary methods have emerged for the observation of similar phenomena; oftentimes, many of these techniques are superimposed in order to make the best possible decisions. A pathologist, for example, uses microscopic and spectroscopic techniques to discriminate between healthy and cancerous tissue. Especially in the field of spectroscopy in medicine, an immense number of frequencies are recorded and appropriately sized datasets are rarely acquired due to the time-intensive measurements and the lack of patients. In order to cope with the curse of dimensionality in machine learning, it is necessary to reduce the overhead from irrelevant or redundant features. In this article, we propose a feature selection callback algorithm (FeaSel-Net) that can be embedded in deep neural networks. It recursively prunes the input nodes after the optimizer in the neural network achieves satisfying results. We demonstrate the performance of the feature selection algorithm on different publicly available datasets and compare it to existing feature selection methods. Our algorithm combines the advantages of neural networks’ nonlinear learning ability and the embedding of the feature selection algorithm into the actual classifier optimization. Full article

Show Figures

Open AccessArticle

Lottery Ticket Structured Node Pruning for Tabular Datasets

Abstract

This paper experiments with well known pruning approaches, iterative and one-shot, and presents a new approach to lottery ticket pruning applied to tabular neural networks based on iterative pruning. Our contribution is a standard model for comparison in terms of speed and performance [...] Read more.

This paper experiments with well known pruning approaches, iterative and one-shot, and presents a new approach to lottery ticket pruning applied to tabular neural networks based on iterative pruning. Our contribution is a standard model for comparison in terms of speed and performance for tabular datasets that often do not get optimized through research. We show leading results in several tabular datasets that can compete with ensemble approaches. We tested on a wide range of datasets with a general improvement over the original (already leading) model in 6 of 8 datasets tested in terms of F1/RMSE. This includes a total reduction of over 85% of nodes with the additional ability to prune over 98% of nodes with minimal affect to accuracy. The new iterative approach we present will first optimize for lottery ticket quality by selecting an optimal architecture size and weights, then apply the iterative pruning strategy. The new iterative approach shows minimal degradation in accuracy compared to the original iterative approach, but it is capable of pruning models much smaller due to optimal weight pre-selection. Training and inference time improved over 50% and 10%, respectively, and up to 90% and 35%, respectively, for large datasets. Full article

(This article belongs to the Section Learning)

Show Figures

Open AccessFeature PaperArticle

Actionable Explainable AI (AxAI): A Practical Example with Aggregation Functions for Adaptive Classification and Textual Explanations for Interpretable Machine Learning

Abstract

In many domains of our daily life (e.g., agriculture, forestry, health, etc.), both laymen and experts need to classify entities into two binary classes (yes/no, good/bad, sufficient/insufficient, benign/malign, etc.). For many entities, this decision is difficult and we need another class called “maybe”, [...] Read more.

In many domains of our daily life (e.g., agriculture, forestry, health, etc.), both laymen and experts need to classify entities into two binary classes (yes/no, good/bad, sufficient/insufficient, benign/malign, etc.). For many entities, this decision is difficult and we need another class called “maybe”, which contains a corresponding quantifiable tendency toward one of these two opposites. Human domain experts are often able to mark any entity, place it in a different class and adjust the position of the slope in the class. Moreover, they can often explain the classification space linguistically—depending on their individual domain experience and previous knowledge. We consider this human-in-the-loop extremely important and call our approach actionable explainable AI. Consequently, the parameters of the functions are adapted to these requirements and the solution is explained to the domain experts accordingly. Specifically, this paper contains three novelties going beyond the state-of-the-art: (1) A novel method for detecting the appropriate parameter range for the averaging function to treat the slope in the “maybe” class, along with a proposal for a better generalisation than the existing solution. (2) the insight that for a given problem, the family of t-norms and t-conorms covering the whole range of nilpotency is suitable because we need a clear “no” or “yes” not only for the borderline cases. Consequently, we adopted the Schweizer–Sklar family of t-norms or t-conorms in ordinal sums. (3) A new fuzzy quasi-dissimilarity function for classification into three classes: Main difference, irrelevant difference and partial difference. We conducted all of our experiments with real-world datasets. Full article

Show Figures

Open AccessArticle

Prospective Neural Network Model for Seismic Precursory Signal Detection in Geomagnetic Field Records

Abstract

We designed a convolutional neural network application to detect seismic precursors in geomagnetic field records. Earthquakes are among the most destructive natural hazards on Earth, yet their short-term forecasting has not been achieved. Stress loading in dry rocks can generate electric currents that [...] Read more.

We designed a convolutional neural network application to detect seismic precursors in geomagnetic field records. Earthquakes are among the most destructive natural hazards on Earth, yet their short-term forecasting has not been achieved. Stress loading in dry rocks can generate electric currents that cause short-term changes to the geomagnetic field, yielding theoretically detectable pre-earthquake electromagnetic emissions. We propose a CNN model that scans windows of geomagnetic data streams and self-updates using nearby earthquakes as labels, under strict detectability criteria. We show how this model can be applied in three key seismotectonic settings, where geomagnetic observatories are optimally located in high-seismicity-rate epicentral areas. CNNs require large datasets to be able to accurately label seismic precursors, so we expect the model to improve as more data become available with time. At present, there is no synthetic data generator for this kind of application, so artificial data augmentation is not yet possible. However, this deep learning model serves to illustrate its potential usage in earthquake forecasting in a systematic and unbiased way. Our method can be prospectively applied to any kind of three-component dataset that may be physically connected to seismogenic processes at a given depth. Full article

Show Figures

Open AccessArticle

How Do Deep-Learning Framework Versions Affect the Reproducibility of Neural Network Models?

Abstract

In the last decade, industry’s demand for deep learning (DL) has increased due to its high performance in complex scenarios. Due to the DL method’s complexity, experts and non-experts rely on blackbox software packages such as Tensorflow and Pytorch. The frameworks are constantly [...] Read more.

In the last decade, industry’s demand for deep learning (DL) has increased due to its high performance in complex scenarios. Due to the DL method’s complexity, experts and non-experts rely on blackbox software packages such as Tensorflow and Pytorch. The frameworks are constantly improving, and new versions are released frequently. As a natural process in software development, the released versions contain improvements/changes in the methods and their implementation. Moreover, versions may be bug-polluted, leading to the model performance decreasing or stopping the model from working. The aforementioned changes in implementation can lead to variance in obtained results. This work investigates the effect of implementation changes in different major releases of these frameworks on the model performance. We perform our study using a variety of standard datasets. Our study shows that users should consider that changing the framework version can affect the model performance. Moreover, they should consider the possibility of a bug-polluted version before starting to debug source code that had an excellent performance before a version change. This also shows the importance of using virtual environments, such as Docker, when delivering a software product to clients. Full article

Show Figures

Open AccessFeature PaperReview

Entropic Statistics: Concept, Estimation, and Application in Machine Learning and Knowledge Extraction

Abstract

The demands for machine learning and knowledge extraction methods have been booming due to the unprecedented surge in data volume and data quality. Nevertheless, challenges arise amid the emerging data complexity as significant chunks of information and knowledge lie within the non-ordinal realm [...] Read more.

The demands for machine learning and knowledge extraction methods have been booming due to the unprecedented surge in data volume and data quality. Nevertheless, challenges arise amid the emerging data complexity as significant chunks of information and knowledge lie within the non-ordinal realm of data. To address the challenges, researchers developed considerable machine learning and knowledge extraction methods regarding various domain-specific challenges. To characterize and extract information from non-ordinal data, all the developed methods pointed to the subject of Information Theory, established following Shannon’s landmark paper in 1948. This article reviews recent developments in entropic statistics, including estimation of Shannon’s entropy and its functionals (such as mutual information and Kullback–Leibler divergence), concepts of entropic basis, generalized Shannon’s entropy (and its functionals), and their estimations and potential applications in machine learning and knowledge extraction. With the knowledge of recent development in entropic statistics, researchers can customize existing machine learning and knowledge extraction methods for better performance or develop new approaches to address emerging domain-specific challenges. Full article

Open AccessArticle

Automatic Extraction of Medication Information from Cylindrically Distorted Pill Bottle Labels

Abstract

Patient compliance with prescribed medication regimens is critical for maintaining health and managing disease and illness. To encourage patient compliance, multiple aids, like automatic pill dispensers, pill organizers, and various reminder applications, have been developed to help people adhere to their medication regimens. [...] Read more.

Patient compliance with prescribed medication regimens is critical for maintaining health and managing disease and illness. To encourage patient compliance, multiple aids, like automatic pill dispensers, pill organizers, and various reminder applications, have been developed to help people adhere to their medication regimens. However, when utilizing these aids, the user or patient must manually enter their medication information and schedule. This process is time-consuming and often prone to error. For example, elderly patients may have difficulty reading medication information on the bottle due to decreased eyesight, leading them to enter medication information incorrectly. This study explored methods for extracting pertinent information from cylindrically distorted prescription drug labels using Machine Learning and Computer Vision techniques. This study found that Deep Convolutional Neural Networks (DCNN) performed better than other techniques in identifying label key points under different lighting conditions and various backgrounds. This method achieved a percentage of Correct Key points PCK @ 0.03 of 97%. These key points were then used to correct the cylindrical distortion. Next, the multiple dewarped label images were stitched together and processed by an Optical Character Recognition (OCR) engine. Pertinent information, such as patient name, drug name, drug strength, and directions of use, were extracted from the recognized text using Natural Language Processing (NLP) techniques. The system created in this study can be used to improve patient health and compliance by creating an accurate medication schedule. Full article

Show Figures

Open AccessArticle

On the Application of Artificial Neural Network for Classification of Incipient Faults in Dissolved Gas Analysis of Power Transformers

Abstract

Oil-submerged transformer is one of the inherent instruments in the South African power system. Transformer malfunction or impairment may interpose the operation of the electric power distribution and transmission system, coupled with liability for high overhaul costs. Hence, recognition of inchoate faults in [...] Read more.

Oil-submerged transformer is one of the inherent instruments in the South African power system. Transformer malfunction or impairment may interpose the operation of the electric power distribution and transmission system, coupled with liability for high overhaul costs. Hence, recognition of inchoate faults in an oil-submerged transformer is indispensable and it has turned into an intriguing subject of interest by utility owners and transformer manufacturers. This work proposes a hybrid implementation of a multi-layer artificial neural network (MLANN) and IEC 60599:2022 gas ratio method in identifying inchoate faults in mineral oil-based submerged transformers by employing the dissolved gas analysis (DGA) method. DGA is a staunch practice to discover inchoate faults as it furnishes comprehensive information in examining the transformer state. In current work, MLANN was established to pigeonhole seven fault types of transformer states predicated on the three International Electrotechnical Commission (IEC) combustible gas ratios. The designs enmesh the development of numerous MLANN algorithms and picking networks with the optimum performance. The gas ratios are in accordance with the IEC 60599:2022 standard whilst an empirical databank comprised of 100 datasets was used in the training and testing activities. The designated MLANN design produces an overall correlation coefficient of 0.998 in the categorization of transformer state with reference to the combustible gas produced. Full article

Show Figures

Open AccessArticle

Comparison of Imputation Methods for Missing Rate of Perceived Exertion Data in Rugby

Abstract

Rate of perceived exertion (RPE) is used to calculate athlete load. Incomplete load data, due to missing athlete-reported RPE, can increase injury risk. The current standard for missing RPE imputation is daily team mean substitution. However, RPE reflects an individual’s effort; group mean [...] Read more.

Rate of perceived exertion (RPE) is used to calculate athlete load. Incomplete load data, due to missing athlete-reported RPE, can increase injury risk. The current standard for missing RPE imputation is daily team mean substitution. However, RPE reflects an individual’s effort; group mean substitution may be suboptimal. This investigation assessed an ideal method for imputing RPE. A total of 987 datasets were collected from women’s rugby sevens competitions. Daily team mean substitution, k-nearest neighbours, random forest, support vector machine, neural network, linear, stepwise, lasso, ridge, and elastic net regression models were assessed at different missingness levels. Statistical equivalence of true and imputed scores by model were evaluated. An ANOVA of accuracy by model and missingness was completed. While all models were equivalent to the true RPE, differences by model existed. Daily team mean substitution was the poorest performing model, and random forest, the best. Accuracy was low in all models, affirming RPE as multifaceted and requiring quantification of potentially overlapping factors. While group mean substitution is discouraged, practitioners are recommended to scrutinize any imputation method relating to athlete load. Full article

(This article belongs to the Section Data)

Show Figures

Open AccessReview

Artificial Intelligence Methods for Identifying and Localizing Abnormal Parathyroid Glands: A Review Study

Abstract

Background: Recent advances in Artificial Intelligence (AI) algorithms, and specifically Deep Learning (DL) methods, demonstrate substantial performance in detecting and classifying medical images. Recent clinical studies have reported novel optical technologies which enhance the localization or assess the viability of Parathyroid Glands (PG) [...] Read more.

Background: Recent advances in Artificial Intelligence (AI) algorithms, and specifically Deep Learning (DL) methods, demonstrate substantial performance in detecting and classifying medical images. Recent clinical studies have reported novel optical technologies which enhance the localization or assess the viability of Parathyroid Glands (PG) during surgery, or preoperatively. These technologies could become complementary to the surgeon’s eyes and may improve surgical outcomes in thyroidectomy and parathyroidectomy. Methods: The study explores and reports the use of AI methods for identifying and localizing PGs, Primary Hyperparathyroidism (PHPT), Parathyroid Adenoma (PTA), and Multiglandular Disease (MGD). Results: The review identified 13 publications that employ Machine Learning and DL methods for preoperative and operative implementations. Conclusions: AI can aid in PG, PHPT, PTA, and MGD detection, as well as PG abnormality discrimination, both during surgery and non-invasively. Full article

Show Figures

Open AccessArticle

Sensor Fusion for Occupancy Estimation: A Study Using Multiple Lecture Rooms in a Complex Building

Abstract

This paper uses various machine learning methods which explore the combination of multiple sensors for quality improvement. It is known that a reliable occupancy estimation can help in many different cases and applications. For the containment of the SARS-CoV-2 virus, in particular, room [...] Read more.

This paper uses various machine learning methods which explore the combination of multiple sensors for quality improvement. It is known that a reliable occupancy estimation can help in many different cases and applications. For the containment of the SARS-CoV-2 virus, in particular, room occupancy is a major factor. The estimation can benefit visitor management systems in real time, but can also be predictive of room reservation strategies. By using different terminal and non-terminal sensors in different premises of varying sizes, this paper aims to estimate room occupancy. In the process, the proposed models are trained with different combinations of rooms in training and testing datasets to examine distinctions in the infrastructure of the considered building. The results indicate that the estimation benefits from a combination of different sensors. Additionally, it is found that a model should be trained with data from every room in a building and cannot be transferred to other rooms. Full article

Show Figures

Open AccessArticle

Factorizable Joint Shift in Multinomial Classification

Abstract

Factorizable joint shift (FJS) was recently proposed as a type of dataset shift for which the complete characteristics can be estimated from feature data observations on the test dataset by a method called Joint Importance Aligning. For the multinomial (multiclass) classification setting, we [...] Read more.

Factorizable joint shift (FJS) was recently proposed as a type of dataset shift for which the complete characteristics can be estimated from feature data observations on the test dataset by a method called Joint Importance Aligning. For the multinomial (multiclass) classification setting, we derive a representation of factorizable joint shift in terms of the source (training) distribution, the target (test) prior class probabilities and the target marginal distribution of the features. On the basis of this result, we propose alternatives to joint importance aligning and, at the same time, point out that factorizable joint shift is not fully identifiable if no class label information on the test dataset is available and no additional assumptions are made. Other results of the paper include correction formulae for the posterior class probabilities both under general dataset shift and factorizable joint shift. In addition, we investigate the consequences of assuming factorizable joint shift for the bias caused by sample selection. Full article

(This article belongs to the Section Learning)

Open AccessArticle

Investigating Machine Learning Applications in the Prediction of Occupational Injuries in South African National Parks

Abstract

There is a need to predict occupational injuries in South African National Parks for the purpose of implementing targeted interventions or preventive measures. Machine-learning models have the capability of predicting injuries such that the employees that are at risk of experiencing occupational injuries [...] Read more.

There is a need to predict occupational injuries in South African National Parks for the purpose of implementing targeted interventions or preventive measures. Machine-learning models have the capability of predicting injuries such that the employees that are at risk of experiencing occupational injuries can be identified. Support Vector Machines (SVMs), k Nearest Neighbours (k-NN), XGB classifier and Deep Neural Networks were applied and overall performance was compared to the accuracy of baseline models that always predict low extremity injuries. Data extracted from the Department of Employment and Labour’s Compensation Fund was used for training the models. SVMs had the best performance in predicting between low extremity injuries and injuries in the torso and hands regions. However, the overall accuracy was 56%, which was slightly above the baseline and below findings from similar previous research that reported a minimum of 62%. Gender was the only feature with an importance score significantly greater than zero. There is a need to use more features related to work conditions and which acknowledge the importance of environment in order to improve the accuracy of the predictions of the models. Furthermore, more types of injuries, and employees that have not experienced any injuries, should be included in future studies. Full article

Show Figures

Open AccessArticle

Live Fish Species Classification in Underwater Images by Using Convolutional Neural Networks Based on Incremental Learning with Knowledge Distillation Loss

Abstract

Nowadays, underwater video systems are largely used by marine ecologists to study the biodiversity in underwater environments. These systems are non-destructive, do not perturb the environment and generate a large amount of visual data usable at any time. However, automatic video analysis requires [...] Read more.

Nowadays, underwater video systems are largely used by marine ecologists to study the biodiversity in underwater environments. These systems are non-destructive, do not perturb the environment and generate a large amount of visual data usable at any time. However, automatic video analysis requires efficient techniques of image processing due to the poor quality of underwater images and the challenging underwater environment. In this paper, we address live reef fish species classification in an unconstrained underwater environment. We propose using a deep Convolutional Neural Network (CNN) and training this network by using a new strategy based on incremental learning. This training strategy consists of training the CNN progressively by focusing at first on learning the difficult species well and then gradually learning the new species incrementally using knowledge distillation loss while keeping the high performances of the old species already learned. The proposed approach yields an accuracy of 81.83% on the LifeClef 2015 Fish benchmark dataset. Full article

Show Figures

Open AccessArticle

Deep Leaning Based Frequency-Aware Single Image Deraining by Extracting Knowledge from Rain and Background

Cited by 1

Abstract

Due to the requirement of video surveillance, machine learning-based single image deraining has become a research hotspot in recent years. In order to efficiently obtain rain removal images that contain more detailed information, this paper proposed a novel frequency-aware single image deraining network [...] Read more.

Due to the requirement of video surveillance, machine learning-based single image deraining has become a research hotspot in recent years. In order to efficiently obtain rain removal images that contain more detailed information, this paper proposed a novel frequency-aware single image deraining network via the separation of rain and background. For the rainy images, most of the background key information belongs to the low-frequency components, while the high-frequency components are mixed by background image details and rain streaks. This paper attempted to decouple background image details from high frequency components under the guidance of the restored low frequency components. Compared with existing approaches, the proposed network has three major contributions. (1) A residual dense network based on Discrete Wavelet Transform (DWT) was proposed to study the rainy image background information. (2) The frequency channel attention module was introduced into the adaptive decoupling of high-frequency image detail signals. (3) A fusion module was introduced that contains the attention mechanism to make full use of the multi receptive fields information using a two-branch structure, using the context information in a large area. The proposed approach was evaluated using several representative datasets. Experimental results shows this proposed approach outperforms other state-of-the-art deraining algorithms. Full article

Show Figures

Open AccessArticle

VLA-SMILES: Variable-Length-Array SMILES Descriptors in Neural Network-Based QSAR Modeling

Abstract

Machine learning represents a milestone in data-driven research, including material informatics, robotics, and computer-aided drug discovery. With the continuously growing virtual and synthetically available chemical space, efficient and robust quantitative structure–activity relationship (QSAR) methods are required to uncover molecules with desired properties. Herein, [...] Read more.

Machine learning represents a milestone in data-driven research, including material informatics, robotics, and computer-aided drug discovery. With the continuously growing virtual and synthetically available chemical space, efficient and robust quantitative structure–activity relationship (QSAR) methods are required to uncover molecules with desired properties. Herein, we propose variable-length-array SMILES-based (VLA-SMILES) structural descriptors that expand conventional SMILES descriptors widely used in machine learning. This structural representation extends the family of numerically coded SMILES, particularly binary SMILES, to expedite the discovery of new deep learning QSAR models with high predictive ability. VLA-SMILES descriptors were shown to speed up the training of QSAR models based on multilayer perceptron (MLP) with optimized backpropagation (ATransformedBP), resilient propagation (iRPROP‒), and Adam optimization learning algorithms featuring rational train–test splitting, while improving the predictive ability toward the more compute-intensive binary SMILES representation format. All the tested MLPs under the same length-array-based SMILES descriptors showed similar predictive ability and convergence rate of training in combination with the considered learning procedures. Validation with the Kennard–Stone train–test splitting based on the structural descriptor similarity metrics was found more effective than the partitioning with the ranking by activity based on biological activity values metrics for the entire set of VLA-SMILES featured QSAR. Robustness and the predictive ability of MLP models based on VLA-SMILES were assessed via the method of QSAR parametric model validation. In addition, the method of the statistical H0 hypothesis testing of the linear regression between real and observed activities based on the F2,n−2 -criteria was used for predictability estimation among VLA-SMILES featured QSAR-MLPs (with n being the volume of the testing set). Both approaches of QSAR parametric model validation and statistical hypothesis testing were found to correlate when used for the quantitative evaluation of predictabilities of the designed QSAR models with VLA-SMILES descriptors. Full article

(This article belongs to the Section Learning)

Show Figures

Open AccessArticle

Data Mining Algorithms for Operating Pressure Forecasting of Crude Oil Distribution Pipelines to Identify Potential Blockages

Abstract

The implementation of data mining has become very popular in many fields recently, including in the petroleum industry. It is widely used to help in decision-making processes in order to minimize oil losses during operations. One of the major causes of loss is [...] Read more.

The implementation of data mining has become very popular in many fields recently, including in the petroleum industry. It is widely used to help in decision-making processes in order to minimize oil losses during operations. One of the major causes of loss is oil flow blockages during transport to the gathering facility, known as the congeal phenomenon. To overcome this situation, real-time surveillance is used to monitor the oil flow condition inside pipes. However, this system is not able to forecast the pipeline pressure on the next several days. The objective of this study is to forecast the pressure several days in advance using real-time pressure data, as well as external factor data recorded by nearby weather stations, such as ambient temperature and precipitation. Three machine learning algorithms—multi-layer perceptron (MLP), long short-term memory (LSTM), and nonlinear autoregressive exogenous model (NARX)—are evaluated and compared with each other using standard regression evaluation metrics, including a steady-state model. As a result, with proper hyperparameters, in the proposed method of NARX with MLP as a regressor, the NARX algorithm showed the best performance among the evaluated algorithms, indicated by the highest values of R2 and lowest values of RMSE. This algorithm is capable of forecasting the pressure with high correlation to actual field data. By forecasting the pressure several days ahead, system owners may take pre-emptive actions to prevent congealing. Full article

(This article belongs to the Section Learning)

Show Figures

Open AccessArticle

Input/Output Variables Selection in Data Envelopment Analysis: A Shannon Entropy Approach

Abstract

The purpose of this study is to provide an efficient method for the selection of input–output indicators in the data envelopment analysis (DEA) approach, in order to improve the discriminatory power of the DEA method in the evaluation process and performance analysis of [...] Read more.

The purpose of this study is to provide an efficient method for the selection of input–output indicators in the data envelopment analysis (DEA) approach, in order to improve the discriminatory power of the DEA method in the evaluation process and performance analysis of homogeneous decision-making units (DMUs) in the presence of negative values and data. For this purpose, the Shannon entropy technique is used as one of the most important methods for determining the weight of indicators. Moreover, due to the presence of negative data in some indicators, the range directional measure (RDM) model is used as the basic model of the research. Finally, to demonstrate the applicability of the proposed approach, the food and beverage industry has been selected from the Tehran stock exchange (TSE) as a case study, and data related to 15 stocks have been extracted from this industry. The numerical and experimental results indicate the efficacy of the hybrid data envelopment analysis–Shannon entropy (DEASE) approach to evaluate stocks under negative data. Furthermore, the discriminatory power of the proposed DEASE approach is greater than that of a classical DEA model. Full article

Show Figures

Open AccessArticle

Improving Deep Learning for Maritime Remote Sensing through Data Augmentation and Latent Space

Abstract

Training deep learning models requires having the right data for the problem and understanding both your data and the models’ performance on that data. Training deep learning models is difficult when data are limited, so in this paper, we seek to answer the [...] Read more.

Training deep learning models requires having the right data for the problem and understanding both your data and the models’ performance on that data. Training deep learning models is difficult when data are limited, so in this paper, we seek to answer the following question: how can we train a deep learning model to increase its performance on a targeted area with limited data? We do this by applying rotation data augmentations to a simulated synthetic aperture radar (SAR) image dataset. We use the Uniform Manifold Approximation and Projection (UMAP) dimensionality reduction technique to understand the effects of augmentations on the data in latent space. Using this latent space representation, we can understand the data and choose specific training samples aimed at boosting model performance in targeted under-performing regions without the need to increase training set sizes. Results show that using latent space to choose training data significantly improves model performance in some cases; however, there are other cases where no improvements are made. We show that linking patterns in latent space is a possible predictor of model performance, but results require some experimentation and domain knowledge to determine the best options. Full article

Show Figures

Open AccessArticle

Do We Need a Specific Corpus and Multiple High-Performance GPUs for Training the BERT Model? An Experiment on COVID-19 Dataset

Abstract

The COVID-19 pandemic has impacted daily lives around the globe. Since 2019, the amount of literature focusing on COVID-19 has risen exponentially. However, it is almost impossible for humans to read all of the studies and classify them. This article proposes a method [...] Read more.

The COVID-19 pandemic has impacted daily lives around the globe. Since 2019, the amount of literature focusing on COVID-19 has risen exponentially. However, it is almost impossible for humans to read all of the studies and classify them. This article proposes a method of making an unsupervised model called a zero-shot classification model, based on the pre-trained BERT model. We used the CORD-19 dataset in conjunction with the LitCovid database to construct new vocabulary and prepare the test dataset. For NLI downstream task, we used three corpora: SNLI, MultiNLI, and MedNLI. We significantly reduced the training time by 98.2639% to build a task-specific machine learning model, using only one Nvidia Tesla V100. The final model can run faster and use fewer resources than its comparators. It has an accuracy of 27.84%, which is lower than the best-achieved accuracy by 6.73%, but it is comparable. Finally, we identified that the tokenizer and vocabulary more specific to COVID-19 could not outperform the generalized ones. Additionally, it was found that BART architecture affects the classification results. Full article

(This article belongs to the Section Learning)

Show Figures

Highly Accessed Articles

Latest Books

E-Mail Alert

News

Topics

Topic in Entropy, Algorithms, Computation, MAKE, Energies, Materials

Artificial Intelligence and Computational Methods: Modeling, Simulations and Optimization of Complex Systems Topic Editors: Jaroslaw Krzywanski, Yunfei Gao, Marcin Sosnowski, Karolina Grabowska, Dorian Skrobek, Ghulam Moeen Uddin, Anna Kulakowska, Anna Zylka, Bachil El Fil
Deadline: 30 December 2022

Conferences

Special Issues

What is the Impact factor of Journal of Machine Learning Research?

4.091Journal of Machine Learning Research / Impact Factornull

Is Journal of Machine Learning Research good?

The overall rank of Journal of Machine Learning Research is 913. According to SCImago Journal Rank (SJR), this journal is ranked 2.393. SCImago Journal Rank is an indicator, which measures the scientific influence of journals.

Is PMLR a journal?

The Proceedings of Machine Learning Research is a series that publishes machine learning research papers presented at workshops and conferences. Each volume is separately titled and associated with a particular workshop or conference and will be published online on the PMLR web site. Authors retain copyright.

Is JMLR double blind?

A: You can still submit, even if a preprint of your work already is available online somewhere (as long as it's not in official proceedings for some other publishing venue). We follow the same approach as double blind conferences, asking that reviewers don't actively search online to figure out who the authors are.