Introduction
The pipeline offers a robust solution for handling image data of arbitrary sizes by splitting images into fixed-size patches. This not only enables efficient data augmentation but also makes it possible to train models on very high-resolution data that would otherwise exceed GPU memory limits.
Training begins with the train.py script, which creates the necessary paths and invokes essential utilities from utils.py. Data loading and preparation are handled in dataset.py, where the dataset is generated, split, and saved as train, valid, and test CSV files. The image patching step, including saving the patch coordinates to train, valid, and test JSON files, is executed next, followed by data augmentation and transformation through the Augment class. The dataset is then prepared using the MyDataset class, which handles data fetching and transformation.
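The patching step can be sketched as follows. This is a minimal illustration, not the pipeline's actual API: the function names (`patch_image`, `save_patch_index`) and the JSON layout are assumptions.

```python
import json

def patch_image(height, width, patch_size, stride):
    """Compute top-left coordinates of fixed-size patches covering an image.

    Patches near the bottom/right edge are shifted back so every patch
    stays fully inside the image bounds.
    """
    coords = []
    for y in range(0, height, stride):
        for x in range(0, width, stride):
            y0 = min(y, height - patch_size)
            x0 = min(x, width - patch_size)
            coords.append((y0, x0))
    return sorted(set(coords))

def save_patch_index(coords, patch_size, path):
    """Save patch coordinates to a JSON file, analogous to the train/valid
    JSON files the pipeline writes."""
    with open(path, "w") as f:
        json.dump({"patch_size": patch_size, "coords": coords}, f)

# A 512x512 image split into non-overlapping 256x256 patches -> 4 patches.
coords = patch_image(512, 512, patch_size=256, stride=256)
```

At prediction time the same coordinate index can be reused to stitch patch-level outputs back into a full-resolution mask.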
Metrics calculation, focal loss computation, and model initialization are handled in separate modules (metrics.py, loss.py, and model.py, respectively), ensuring modularity and ease of maintenance. Callback selection, including learning rate scheduling and validation visualization, is orchestrated through the SelectCallbacks class.
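The focal loss computed in loss.py can be sketched in plain NumPy. The `alpha`/`gamma` defaults and the mean reduction here are illustrative assumptions, not necessarily the values loss.py uses.

```python
import numpy as np

def focal_loss(y_true, y_pred, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss (Lin et al.): the modulating factor (1 - p_t)^gamma
    down-weights well-classified examples so training focuses on hard ones."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    p_t = np.where(y_true == 1, y_pred, 1.0 - y_pred)       # prob of true class
    alpha_t = np.where(y_true == 1, alpha, 1.0 - alpha)     # class balancing
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))

y_true = np.array([1, 1, 0, 0])
y_pred = np.array([0.9, 0.6, 0.1, 0.4])
loss = focal_loss(y_true, y_pred)
```

Setting `gamma=0` recovers plain class-weighted cross-entropy, which is why the focal variant reports a smaller value on mostly easy examples.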
For evaluation, two distinct scenarios are provided in test.py. In the first (evaluation = False), the evaluation dataset is prepared in the same way as the training dataset, and predictions are made and displayed for analysis. In the second (evaluation = True), an additional step, eval_csv_gen, is called to generate the evaluation CSVs.
Throughout the pipeline, emphasis is placed on modularity, allowing for easy integration of new components or modifications. Understanding the intricacies of this pipeline is paramount for leveraging its capabilities effectively. For a deeper dive into its workings, referencing the FAPNET paper is recommended, providing insights into the architecture and rationale behind its design.
Introduction
Image recognition plays a vital role in medical image analysis, which depends on the choice of algorithm, input data, features, parameters, and type of learning. Three morphological features on Hematoxylin and Eosin (H&E) stained slides are crucial for classifying breast cancer: mitosis count, tubule formation, and nuclear pleomorphism. Among these, mitosis count plays an essential role and is an important diagnostic factor for breast cancer grading. Mitosis detection remains a challenging problem because the cells pass through different stages of the cell cycle while generating new nuclei. We implemented a residual learning algorithm for easier optimization and training; our model is a pre-trained ResNet18 for classification with localization, built on the TensorFlow framework (TF-DFCNN). To avoid the degradation problem, it incorporates a normalization function, data augmentation, and a sampling method to achieve high detection accuracy. Our deep fully convolutional network (DFCNN) consists of two stages. The first stage classifies the MITOS-ATYPIA 2014 dataset and achieves 85% accuracy. In the second stage, we add a new layer to localize mitoses based on the weakly-supervised object localization concept, using the class activation map (CAM) technique to identify discriminative regions; retraining our CNN model without the fully connected layer and combining it with the localization layer makes the model more precise, reaching about 93% accuracy.
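The CAM step can be illustrated with NumPy: a class activation map is the class-specific weighted sum of the final convolutional feature maps (Zhou et al., 2016). The shapes and names below are illustrative, not taken from the actual model.

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """feature_maps: (C, H, W) activations from the last conv layer.
    class_weights: (C,) weights of the global-average-pooling classifier
    for one class. Returns an (H, W) map highlighting discriminative regions."""
    # Weighted sum over the channel axis: sum_c w_c * F_c(h, w).
    cam = np.tensordot(class_weights, feature_maps, axes=([0], [0]))
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()  # normalize to [0, 1] for visualization
    return cam

rng = np.random.default_rng(0)
fmaps = rng.random((8, 16, 16))   # 8 channels of 16x16 activations
weights = rng.random(8)           # classifier weights for one class
cam = class_activation_map(fmaps, weights)
```

Thresholding the normalized map then yields a bounding region for the detected mitosis without any box-level supervision.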
Dataset
There are two main types of digital slide scanners for automated tissue scanning, the Aperio and the Hamamatsu. We used the MITOS-ATYPIA 14 dataset, pre-processed into mitosis images and non-mitosis (normal) images, each with dimensions of 256×256. The network is trained on 5000 normal images and 5000 mitosis images, and tested on 1000 normal images and 1000 mitosis images. The MITOS-ATYPIA 14 dataset contains microscopic images of both mitotic and non-mitotic cells.
Introduction
The image segmentation model can be used to extract real-world objects from images, blur backgrounds, support self-driving automobiles, and perform other image processing tasks. The goal of this research is to create a mask that shows floodwater in a given location based on Sentinel-1 (a dual-polarization synthetic-aperture radar (SAR) system) images or features.
Dataset
The dataset was collected from the competition Map Floodwater from Radar Imagery, hosted by Microsoft AI for Earth. It consists of Sentinel-1 images and masks, as well as a CSV file with metadata such as city and year. The Sentinel-1 images and masks were acquired from various parts of the world between 2016 and 2020. In total, the dataset consists of 542 chips (1084 images) and corresponding masks. Based on how the radar microwave signal is transmitted and received, a single chip comprises two bands, or images. VV images represent vertical transmit and vertical receive; VH images stand for vertical transmit and horizontal receive. Each image is saved as a GeoTIFF file with dimensions of 512 × 512. The masks consist of three categories:
Water: 1
Non-water: 0
Unlabeled: 255
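A mask with this encoding is typically handled by excluding the 255 "unlabeled" pixels before computing losses or statistics. A minimal NumPy sketch (the function name is illustrative):

```python
import numpy as np

WATER, NON_WATER, UNLABELED = 1, 0, 255

def class_fractions(mask):
    """Fraction of water vs. non-water pixels, ignoring unlabeled (255)."""
    valid = mask != UNLABELED
    n_valid = valid.sum()
    if n_valid == 0:
        return 0.0, 0.0
    water = float((mask[valid] == WATER).sum()) / n_valid
    return water, 1.0 - water

mask = np.array([[1, 1, 255],
                 [0, 1, 255]])
water_frac, dry_frac = class_fractions(mask)
```

The same valid-pixel mask would be applied when scoring predictions, so unlabeled regions neither help nor hurt the model.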
Introduction
The image segmentation model can be used to extract real-world objects from images, blur backgrounds, support self-driving automobiles, and perform other image processing tasks. The goal of this research is to create a mask that shows floodwater in a given location based on Sentinel-1 (a dual-polarization synthetic-aperture radar (SAR) system) images or features.
Dataset
Our dataset comprises satellite images taken from the Sentinel-2 satellite. It has 15 stacks of folders with waterbody labels; each folder contains waterbody labels and a fuzzy mask. Note that these masks are derived from several sources (amplitude, coherence, Sentinel-2, Landsat-8, and OpenStreetMap). The folders also contain rslc/*.rslc.notopo files, which are RSLCs with the topography pre-removed. Each image is about 10820 × 11361 pixels. Below is an example of our dataset:
Water: 1
Non-water: 0
Unlabeled: 255
Introduction
From credit ratings to housing allocation, machine learning models are increasingly used to automate everyday decision-making processes. With the growing impact on society, more and more concerns are being voiced about the loss of transparency, accountability, and fairness of the algorithms making the decisions. We, as data scientists, need to step up our game and look for ways to mitigate emergent discrimination in our models. We need to make sure that our predictions do not disproportionately hurt people with certain sensitive characteristics (e.g., gender, ethnicity).
Dataset
For our experiment, we used the Adult UCI dataset, which can be downloaded here. It is also referred to as the “Census Income” dataset. Here, we will predict whether or not a person’s income is greater than $50,000 a year. It is not hard to imagine that financial institutions train models on similar data sets and use them to decide whether or not someone is eligible for a loan, or to set the amount of an insurance premium.
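One common way to quantify the disproportionate harm described above is demographic parity: the difference in positive-prediction rates between sensitive groups. A minimal NumPy sketch; the toy data below is made up for illustration and is not drawn from the Adult dataset.

```python
import numpy as np

def demographic_parity_difference(y_pred, sensitive):
    """Absolute difference in the rate of positive predictions
    (e.g. 'income > $50K') between the two groups in `sensitive`."""
    groups = np.unique(sensitive)
    rates = [y_pred[sensitive == g].mean() for g in groups]
    return abs(rates[0] - rates[1])

# Toy predictions: 1 = predicted income > $50K; sensitive = group label.
y_pred = np.array([1, 1, 0, 1, 0, 0, 0, 0])
sensitive = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])
gap = demographic_parity_difference(y_pred, sensitive)
```

A gap near zero means both groups receive positive predictions at similar rates; a large gap is a first warning sign worth investigating.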
Introduction
Background subtraction (BS) is a common and widely used technique for creating a foreground mask, a binary image of the pixels that belong to moving elements in a scene captured by a still camera. In this study, we present a background subtraction pipeline that makes use of both computer vision and deep learning techniques.
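The classic still-camera formulation of BS can be sketched with a running-average background model. This is an illustrative baseline, not the pipeline's deep learning component; the threshold and learning rate are arbitrary choices.

```python
import numpy as np

def background_subtract(frames, alpha=0.05, threshold=30.0):
    """Maintain a running-average background and flag pixels whose absolute
    difference from it exceeds `threshold` as foreground."""
    background = frames[0].astype(np.float64)
    masks = []
    for frame in frames[1:]:
        frame = frame.astype(np.float64)
        mask = np.abs(frame - background) > threshold
        masks.append(mask)
        # Update the background only where the scene is considered static.
        background = np.where(mask, background,
                              (1 - alpha) * background + alpha * frame)
    return masks

# Static 4x4 scene; a bright "object" enters one pixel in the last frame.
frames = [np.zeros((4, 4)) for _ in range(3)]
frames[2][1, 1] = 255.0
masks = background_subtract(frames)
```

Selective updating keeps moving objects from being absorbed into the background model, at the cost of never recovering if a foreground object parks permanently.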
Dataset
The dataset, CDNET, can be downloaded from here. This dataset contains 11 video categories with 4–6 video sequences in each category. However, we only used eight video sequences: busStation, canoe, fountain02, highway, office, park, peopleInShade, and sidewalk. Each individual video file (.zip or .7z) can be downloaded separately; alternatively, all video files within one category can be downloaded as a single .zip or .7z file. Each video file, when uncompressed, becomes a directory that contains the following:
a sub-directory named “input” containing a separate JPEG file for each frame of the input video
a sub-directory named “groundtruth” containing a separate BMP file for each frame of the groundtruth
an empty folder named “results” for binary results (1 binary image per frame per video you have processed)
files named “ROI.bmp” and “ROI.jpg” showing the spatial region of interest
a file named “temporalROI.txt” containing two frame numbers. Only the frames in this range will be used to calculate your score
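Reading the scoring range described above might look like this; the parsing is a sketch under the assumption that temporalROI.txt holds two whitespace-separated frame numbers, and the sample values are hypothetical.

```python
def parse_temporal_roi(text):
    """Parse a temporalROI.txt payload: two frame numbers delimiting the
    inclusive range of frames used for scoring."""
    start, end = (int(tok) for tok in text.split())
    if start > end:
        raise ValueError("temporalROI start frame exceeds end frame")
    return start, end

def scored_frames(text):
    """Frame numbers that contribute to the score for one video."""
    start, end = parse_temporal_roi(text)
    return list(range(start, end + 1))

# Hypothetical file contents: score frames 570 through 2050 inclusive.
start, end = parse_temporal_roi("570 2050\n")
```

Frames outside this range still exist under "input", but any results written for them are ignored by the evaluation.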
Introduction
The image segmentation model can be used to extract real-world objects from images, blur backgrounds, support self-driving automobiles, and perform other image processing tasks. The goal of this research is to create a mask that shows floodwater in a given location based on Sentinel-1 (a dual-polarization synthetic-aperture radar (SAR) system) images or features.
Dataset
MBRSC satellites obtained aerial imagery of Dubai, which was annotated with pixel-wise semantic segmentation across six classes. The dataset comprises 72 images grouped into six larger tiles. The classes are:
Building: #3C1098
Land (unpaved area): #8429F6
Road: #6EC1E4
Vegetation: #FEDD3A
Water: #E2A929
Unlabeled: #9B9B9B
Masks are RGB, and the class information is encoded as HEX color codes.
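Converting such an RGB mask into integer class labels can be sketched as follows; the class ordering and function names are assumptions made for illustration.

```python
import numpy as np

# HEX color code -> class index, in the order listed above.
CLASS_COLORS = {
    "#3C1098": 0,  # Building
    "#8429F6": 1,  # Land (unpaved area)
    "#6EC1E4": 2,  # Road
    "#FEDD3A": 3,  # Vegetation
    "#E2A929": 4,  # Water
    "#9B9B9B": 5,  # Unlabeled
}

def hex_to_rgb(code):
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

def rgb_mask_to_labels(mask):
    """mask: (H, W, 3) uint8 RGB mask -> (H, W) integer label map."""
    labels = np.full(mask.shape[:2], -1, dtype=np.int64)
    for code, idx in CLASS_COLORS.items():
        match = np.all(mask == hex_to_rgb(code), axis=-1)
        labels[match] = idx
    return labels

water = hex_to_rgb("#E2A929")
mask = np.tile(np.array(water, dtype=np.uint8), (2, 2, 1))
labels = rgb_mask_to_labels(mask)
```

Any pixel whose color matches none of the six codes is left as -1, which makes annotation errors easy to spot.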
Introduction
Medical services in Bangladesh are in short supply nowadays, and people struggle to get correct treatment from hospitals. Given the low proportion of doctors and the low per capita income in Bangladesh, patients must spend more money to receive appropriate treatment. It is therefore necessary to apply modern information technologies to narrow the gap between patients and specialists, so that patients can obtain proper treatment at a lower cost. Fortunately, we can address this critical problem by utilizing interaction among electronic devices. With the big data collected from these devices, machine learning is a powerful tool for data analytics because of its high accuracy, low computational cost, and low power consumption. This research presents a case study incorporating a database, a mobile application, and a web application, and develops a novel platform through which patients and doctors can interact. In addition, the platform stores patients’ health data so that machine learning methods can make final predictions, helping patients receive proper healthcare treatment with the support of both machines and doctors. The experimental results show high accuracy, over 95%, in disease detection using machine learning methods, at a cost 90% lower than local hospitals in Bangladesh, which strongly supports deploying our platform in remote areas of the country.
Dataset
Introduction
Cardiovascular disease (CVD) is one of the leading causes of death worldwide, with approximately 23.6 million people expected to be affected by CVD by 2030. The healthcare industry is therefore trying to gather a large amount of CVD information, which can help doctors detect and identify the potential risk factors of CVD. Deep learning can dig out the hidden patterns of the disease and its symptoms from this structured and unstructured medical information. In this paper, we propose an algorithm to predict the risk factors of CVD using an attention-module-based Long Short-Term Memory (LSTM) network, which achieves almost 95% accuracy and a 0.90 Matthews Correlation Coefficient (MCC) score, better than any previously proposed method. Moreover, we propose a novel intelligent healthcare platform for continuous data collection and patient monitoring. The proposed platform is first used for data collection, and we then select the most suitable features from the dataset for applying various machine learning algorithms. The experimental results show that the attention-module-based LSTM outperforms the other statistical machine learning algorithms for prediction and indicates significant risk factors of CVD, which can support CVD patients in changing their lifestyle.
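The MCC score reported above can be computed directly from a binary confusion matrix; a minimal NumPy sketch with made-up labels:

```python
import numpy as np

def matthews_corrcoef(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels: +1 is perfect
    agreement, 0 is chance level, -1 is total disagreement."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    if denom == 0:
        return 0.0
    return float((tp * tn - fp * fn) / denom)

mcc = matthews_corrcoef([1, 1, 0, 0, 1], [1, 1, 0, 0, 0])
```

Unlike plain accuracy, MCC uses all four confusion-matrix cells, which is why it is the more informative metric for imbalanced medical data like CVD records.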
Dataset
The KITTI Vision Benchmark Suite is a dataset developed specifically for benchmarking optical flow, odometry, object detection, and road/lane detection. The dataset can be downloaded from here. The road dataset has a dimension of 375 × 1242 pixels and contains 600 individual frames. It is the primary benchmark dataset for road and lane segmentation. This benchmark was developed in partnership with Jannik Fritsch and Tobias Kuehnl of Honda Research Institute Europe GmbH. The road and lane estimation benchmark includes 290 test and 289 training images, covering three distinct types of road scenes, which are given below. The figure shows some example data (UU, UM, and UMM) plotted by Matplotlib in RGB format.