A Wearable Stethoscope for Accurate Real-Time Lung Sound Monitoring and Automatic Wheezing Detection Based on an AI Algorithm
Kyoung-Ryul Lee, Taewi Kim, Sunghoon Im, Yi Jae Lee, Seongeun Jeong, Hanho Shin, Hana Cho, Sang-Heon Park, Minho Kim, Jin Goo Lee, Dohyeong Kim, Gil-Soon Choi, Daeshik Kang, SungChul Seo, Soo Hyun Lee
The various bioacoustic signals obtained with auscultation contain complex clinical information that has traditionally been used as biomarkers; however, they are not extensively used in clinical studies owing to their spatiotemporal limitations. In this study, we developed a wearable stethoscope for wireless, skin-attachable, low-power, continuous, real-time auscultation using a lung-sound-monitoring-patch (LSMP). The LSMP can monitor respiratory function through a mobile app and classify normal and adventitious breathing by comparing their unique acoustic characteristics. Human heart and breathing sounds can be distinguished from complex sound signals consisting of a mixture of bioacoustic signals and external noise. The performance of the LSMP sensor was further demonstrated in pediatric patients with asthma and elderly patients with chronic obstructive pulmonary disease (COPD), in whom wheezing sounds were classified at specific frequencies. In addition, we developed a novel method for counting wheezing events based on a two-dimensional convolutional neural network deep-learning model constructed de novo and trained with our augmented fundamental lung-sound data set. We implemented a counting algorithm to identify wheezing events in real-time regardless of the respiratory cycle. The artificial intelligence-based adventitious breathing event counter distinguished > 80% of the events (especially wheezing) in long-term clinical applications in patients with COPD.
Kyoung-Ryul Lee, Taewi Kim, Sunghoon Im, Yi Jae Lee, Seongeun Jeong, Hanho Shin, Hana Cho, Sang-Heon Park, Minho Kim, Jin Goo Lee, Dohyeong Kim, Gil-Soon Choi, Daeshik Kang, SungChul Seo, Soo Hyun Lee. A Wearable Stethoscope for Accurate Real-Time Lung Sound Monitoring and Automatic Wheezing Detection Based on an AI Algorithm. Engineering, 2025, 53(10): 116-129. DOI: 10.1016/j.eng.2024.12.031
After René Laennec invented the stethoscope in 1816, the modern analog binaural stethoscope was developed in 1854 and continues to be extensively used while maintaining the same shape [1]. A stethoscope performs auscultation for precise diagnosis [2], [3], making it a crucial, noninvasive, and inexpensive traditional method for identifying internal organ states. Pulmonary auscultation helps diagnose various respiratory symptoms [4], [5]. Continuous auscultation is more sensitive than intermittent auscultation in determining respiratory symptoms in diseases such as asthma [6] and chronic obstructive pulmonary disease (COPD) [7], [8], especially because many respiratory diseases worsen at night and at dawn.
Asthma and COPD commonly produce wheezing sounds that can be diagnosed using clinical auscultation [9]. However, the accuracy of lung auscultation is questionable, with reported values of 60.3% for medical students and 80.1% for fellows. Factors affecting auscultation accuracy include experience, signal strength, and signal mixing from other organs [10], [11]. Digital stethoscopes have improved diagnostic accuracy through recording, noise reduction, amplification, and data processing [12], [13], making them useful in telehealth [14], [15] and clinical education [16], [17]. Despite these advantages, digital stethoscopes are not extensively adopted because of their inconvenience.
Wearable electronics have driven the development of flexible electronics that can be integrated into the human body to replace larger, heavier medical devices [18], [19]. Flexible wearable electronics that are light and adaptable do not restrict human activities and can be worn to acquire and analyze high-precision biometric signals for health monitoring and human-computer interaction [20]. Conformal contact is necessary to acquire high-precision biosignals during long-term monitoring [21]. Studies are ongoing on materials [22], [23], their structures [24] and thicknesses [25], and on adhesive layers [26] to enhance flexible electronics [27], [28].
Recent advancements include wearable devices developed for health care [29], [30], [31], [32], especially wearable stethoscopes that use accelerometers [33], [34], [35] and microphones [36], [37], [38], [39] for signal acquisition. Piezoelectric microphones [40] are advantageous for wearability and real-time data acquisition, but face challenges in signal attenuation owing to impedance mismatch and manufacturing complexity. Accelerometer-based devices measure vibration-based signals such as speech, coughing, swallowing, and snoring [41], [42], [43], but they struggle with acoustic signal detection [44]. Research on wearable sensors using microelectromechanical systems (MEMS) microphones is progressing because of their small size [45], high signal-to-noise ratio (SNR), low power consumption [46], and flat frequency responses [47], [48].
Stethoscopes with electret microphones [44], [45] retain their traditional shapes while allowing breath sound analyses. The co-location of the accelerometer distinguishes actual breathing from external sounds; however, sensor wearability remains an issue for long-term monitoring. Recent studies demonstrated MEMS microphone-based wearable stethoscopes for the detection of wheezing sound using machine learning [39].
Wearable auscultation devices require extensive time and labor for data analysis. Deep neural network models, such as convolutional neural networks (CNNs) [8] and recurrent neural networks (RNNs) [49], predict abnormal pulmonary sounds with high accuracy (approximately 99.5%). However, these models mainly classify diseases without monitoring symptom changes or severity over time, which requires monitoring for an extended period. Various respiratory diseases are treated by evaluating lung sound changes over 3 to 4 days, and the observed improvement influences treatment decisions [50].
Table 1 [35], [36], [37], [38], [39], [40], [43], [45], [51], [52] summarizes the advantages and disadvantages of wearable stethoscopes with two different detection systems (accelerometers and microphones). The objective of this study was to develop a system to monitor respiratory patients continuously for symptom changes in real-time. This can be achieved through continuous auscultation without direct contact between medical staff and patients over an extended period. In addition, the system was designed to detect symptoms using artificial intelligence (AI). Studies on auscultation monitoring using digital stethoscopes have been conducted for several purposes. However, owing to spatiotemporal and technical limitations, long-term clinical studies have not yet been conducted.
This study describes the development of a wireless, skin-attached, and low-powered wearable stethoscope for continuous lung-sound monitoring to assess respiratory function and diagnose disease states. The composition is presented here through the design of lung sound monitoring patch (LSMP) systems optimized for continuous auscultation and the characterization of the device, clinical assessment of healthy subjects and several respiratory patients (asthma and COPD), and a new machine-learning-based data analysis algorithm that counts the number of adventitious sound events (specifically, forced breathing and various wheezing sounds) over long periods.
2. Experimental section/methods
In this section, we present ➀ the fabrication of the LSMP, a wearable device capable of continuous long-term auscultation, including the configuration of the sensor; ➁ a customized iPhone operating system (iOS) application (app) for data collection and management; ➂ clinical trials on normal subjects and patients with respiratory diseases; ➃ acoustic analysis to detect abnormal patient breath sounds; ➄ a CNN-based AI algorithm using acoustic analysis; and ➅ a method to validate the clinical data. In addition, sensor characterization (received signal strength indicator (RSSI), microphone directionality, etc.) and the acquisition of simulated reference breath sounds using a Child Sim are also presented.
2.1. Fabrication of LSMP sensor
The flexible printed circuit board (fPCB) for the LSMP was designed using OrCAD (version 17.2, Cadence, USA) and PADS (version VX2.7, Siemens, Germany). The production, assembly, and inspection were performed by KS Electronics (Republic of Korea). A medical-grade adhesive tape was bonded to the bottom layer of the fPCB using a hydrocolloid dressing film (Easyderm, CGBio, Republic of Korea). An acoustic path with a 2 mm hole was designed to enhance the acquired sound signals. The main components include uni- and omni-directional MEMS microphones, a microcontroller unit (MCU), and a lithium polymer battery, as detailed in Table S1 in Appendix A. The encapsulating enclosure was modeled in three dimensions (3D) using SolidWorks 2018 (Dassault Systems Corp., France) and printed using a 3D printer with a biocompatible resin (elastic 50A, Formlabs, USA).
2.2. Customized app for data acquisition
An iPhone SE2 (Apple Inc., USA) and a customized iOS app were used for all measurements. The iOS app implements the following functions: ➀ collects identification information and manages stored files for each subject; ➁ collects device status information (battery level, Bluetooth (BT) connection); ➂ performs real-time streaming of bioacoustic signals from the LSMP sensor; ➃ conducts real-time visualization on three screens; ➄ plays back and records with real-time equalization; and ➅ shares and replays saved files.
2.3. Experimental setup for Child Sim
A child heart lung sound trainer simulator (SB48061U, Simulaids Inc., UK) was used for the basic performance tests. Child Sim generated the reference sound of an actual 4-year-old, including normal, wheezing, and crackle sounds. The experiment was conducted inside an acrylic box with sound insulation and absorption material to avoid unintended environmental noise. A BT speaker (SRS-XB22, Sony, Japan) generated the same reference noise at the sensor location (Fig. S1 in Appendix A).
2.4. Experimental setup for RSSI
The RSSI values were measured to evaluate the wireless connection between the LSMP sensor and the mobile device. Tests were conducted with both the Child Sim and human subjects at distances from 0 to 5 m, with measurements acquired five times at 50 cm increments. The standard deviations of the measurements at each distance are presented as error bars.
2.5. Experimental setup for the effect of microphone directionality
To evaluate the effect of external noise on microphone directionality, we recorded bioacoustic signals from the Child Sim using uni-directional and omni-directional MEMS microphones. Noise signals centered at different frequencies (60 dBA) were generated using a noise generator.
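The RSSI protocol above (five repeats at each 50 cm step from 0 to 5 m, with the standard deviation drawn as an error bar) can be summarized as in the following minimal sketch. The readings are hypothetical placeholders rather than data from this study, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

# Measurement distances: 0 to 5 m in 0.5 m increments (11 positions).
distances_m = [0.5 * i for i in range(11)]

# Hypothetical RSSI readings in dBm for three of the positions
# (placeholder values for illustration, five repeats each).
readings_dbm = {
    0.0: [-48, -50, -49, -51, -47],
    2.5: [-63, -66, -64, -65, -62],
    5.0: [-71, -69, -72, -70, -68],
}


def summarize_rssi(readings):
    """Return {distance_m: (mean_dBm, sample_std_dB)} for repeated readings."""
    return {d: (mean(v), stdev(v)) for d, v in sorted(readings.items())}


summary = summarize_rssi(readings_dbm)
for distance, (avg, sd) in summary.items():
    print(f"{distance:.1f} m: {avg:.1f} dBm, error bar +/- {sd:.2f} dB")
```

Each tuple gives the point to plot and the half-height of its error bar.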
2.6. Study with human subjects
The clinical study involved several human subjects: two healthy adults, two pediatric patients, and five elderly patients (see Table S2 in Appendix A for detailed information). The study was approved by the institutional review board (IRB) of Nowon Eulji Medical Center, Eulji University (IRB No. EMCS 2021-07-003). All participants provided signed consent forms, with legal guardians consenting to participation in the study where required. The LSMPs were attached directly to the skin at body sites where breath sounds could be auscultated, as instructed by the clinicians.
2.7. Constructing deep learning model and wheeze counting algorithm
We utilized Log-Mel spectrograms converted from audio data as input for the deep-learning model. The Python library Librosa was used to preprocess the raw lung-sound data. To construct the deep-learning model, we used the Python frameworks TensorFlow and Keras. Two-dimensional CNN layers were employed to extract features from the input, and max pooling layers were used to maintain the dominant features while reducing the computational burden. A dropout layer was added to prevent overfitting. The model used binary cross-entropy as the loss function and adaptive moment estimation (Adam) as the optimizer. Finally, softmax was used as the activation function in the output layer because the sum of the output values (= 1) could be effectively utilized in the event-counting algorithm. After training, the values predicted by the trained model were used for event counting. The raw data of the patient’s lung sounds were segmented using a fixed 0.6 s window with a step size of 0.06 s; these segments were then used as inputs to the trained model. The predicted values ranged between 0 and 1.0, indicating how closely the input data matched a wheezing sound; a value of 1.0 indicates a wheeze, and a value of 0 indicates normal breathing. We set the thresholds for wheezing and normal breathing to 0.9 and 0.1, respectively, as these values were the appropriate hyperparameters for achieving high prediction accuracy. Consequently, when the predicted value dropped from 0.9 (wheezing) to 0.1 (normal breathing), it was considered a single wheezing event.
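The counting rule described above can be read as a two-threshold (hysteresis) state machine over the per-window model outputs. The following is a minimal sketch under that reading, not the authors' implementation: the function name, the handling of values between the two thresholds (the state is left unchanged), and the demo predictions are all assumptions.

```python
WHEEZE_T = 0.9   # at or above: treated as wheezing (threshold from the text)
NORMAL_T = 0.1   # at or below: treated as normal breathing (threshold from the text)
STEP_S = 0.06    # step between consecutive 0.6 s windows, in seconds


def count_wheeze_events(predictions, step_s=STEP_S):
    """Count wheeze events in a sequence of per-window model outputs.

    An event is counted when the output, having reached WHEEZE_T, later
    falls to NORMAL_T. Values between the thresholds leave the state
    unchanged (an assumption of this sketch). Returns the event count and
    the start time (s) of the window at which each event ended.
    """
    in_wheeze = False
    end_times = []
    for i, p in enumerate(predictions):
        if not in_wheeze and p >= WHEEZE_T:
            in_wheeze = True
        elif in_wheeze and p <= NORMAL_T:
            in_wheeze = False
            end_times.append(i * step_s)
    return len(end_times), end_times


# Hypothetical model outputs, for illustration only.
demo = [0.05, 0.50, 0.95, 0.97, 0.50, 0.92, 0.30, 0.05, 0.02, 0.95, 0.05]
count, ends = count_wheeze_events(demo)
print(count, [round(t, 2) for t in ends])
```

Using two thresholds rather than one means that brief dips in the prediction during a single wheeze do not split it into several events, and a wheeze still in progress at the end of the recording is not counted.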
3. Results and discussion
3.1. Wearable skin attached real-time LSMP sensor
We developed a thin, flexible device that can be attached to the human body and is controlled by wireless communication for the continuous monitoring and long-term analysis of lung sounds. The key developmental factors included high data reliability and sensitivity for distinguishing lung sounds and minimal skin irritation over several days. Fig. 1(a) shows a schematic of the LSMP sensor, and Fig. 1(b) shows a photograph of the fPCB (left) and biocompatible silicone cover. The cover allows the operation of a slide-type power switch and a charger port to charge the battery. A blue light-emitting diode (LED) indicates the operation status by blinking (waiting) or staying on (pairing). All the other components were sealed for circuit protection and noise reduction.
Fig. 1(c) shows a block diagram of the major parts driving the LSMP sensor, whose embedded software was programmed in C. The power source was a 3.7 V battery, which supplied 3.3 V to the circuit via the power management unit (red hatched area). The MEMS microphone received clock signals and power from the MCU, and the bioacoustic signals, output in pulse density modulation format, were transmitted to a mobile device via BT low energy (BLE). The blue LED indicated the operational status.
The basic characteristics of the developed LSMP were tested. Fig. 1(d) shows a schematic of the experimental setup used to evaluate the acoustic response using the LSMP and a mobile device. The LSMP was attached to the Child Sim and collected bioacoustic signals through the acoustic path formed in the fPCB. The bioacoustic signal was wirelessly transmitted to a mobile device via BLE. Fig. S1 shows a photograph of the setup. The LSMP could reliably measure bioacoustic signals for > 24 h, as shown in Fig. S2 in Appendix A.
Streaming data were received through a customized iOS app, visualized, and analyzed in real-time (Fig. 1(e)). Owing to memory limitations, AI-based data analysis is not included in the app. The app functions are described in detail in the experimental section and Movie S1 in Appendix A. High-performance MEMS microphones can mix unexpected external noise with bioacoustic signals. Fig. 1(f) shows the opposite directional acoustic response ratio of the noise measured by the two MEMS microphones for the different frequency bands, confirming the effect of the microphone directionality on the external sound influence (Fig. S3 in Appendix A). We compared the microphone sensitivities based on SNR calculations, as expressed by Eq. (1).
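$$\mathrm{SNR}_{\mathrm{dB}} = 10\log_{10}\left(\frac{P_{\mathrm{signal}}}{P_{\mathrm{noise}}}\right) \qquad (1)$$
where $\mathrm{SNR}_{\mathrm{dB}}$ is the signal-to-noise ratio on the decibel scale, $P_{\mathrm{signal}}$ is the measured intensity of the signal, and $P_{\mathrm{noise}}$ is the measured intensity of the noise.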
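As a minimal sketch of how such an SNR comparison might be computed, the snippet below assumes the usual power-ratio definition of SNR in decibels and estimates each intensity as the mean squared amplitude of a segment. The function names and sample values are hypothetical and are not taken from the study.

```python
import math


def mean_power(samples):
    """Mean squared amplitude of a segment (proportional to signal power)."""
    return sum(x * x for x in samples) / len(samples)


def snr_db(signal_segment, noise_segment):
    """SNR in dB as 10*log10(P_signal / P_noise)."""
    return 10.0 * math.log10(mean_power(signal_segment) / mean_power(noise_segment))


# Hypothetical segments: the signal amplitude is ten times the noise amplitude,
# so the power ratio is 100.
signal = [1.0, -1.0, 1.0, -1.0]
noise = [0.1, -0.1, 0.1, -0.1]
print(round(snr_db(signal, noise), 2))
```

In practice the noise segment would be taken from a pause in which no breath sound is present.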
The wheezing sounds captured by two different microphones had SNRs of 27.04 dB (uni-directional) and 19.59 dB (omni-directional). The results demonstrated a difference of up to 15% between the uni-directional and omni-directional microphones, with lower noise levels detected by the uni-directional microphone. Therefore, we selected a uni-directional microphone as the LSMP sensor to reduce the influence of external noise. We also assessed the effects of different BT module antennas attached to the Child Sim (Fig. S1). Fig. 1(g) shows RSSI values for the different distances between the antenna and the receiver, with an average signal strength of −70 dBm up to 5 m for the external antenna. Smooth communication is achieved without delay when the RSSI value is above −80 dBm [53]. Fig. S4 in Appendix A shows the LSMP for the different antenna types.
For continuous monitoring with the LSMP, we used medical-grade adhesives to attach the sensor to the skin and conducted a simple in-vivo test to assess its sustainability and compatibility. Four samples were attached to the posterior (auscultation position between the scapula and vertebral line) side of the human subject and removed after 1, 3, 5, and 7 d. Fig. 1(h) shows the skin surface after the test. After 7 d, minor skin redness was observed; however, the sensor did not fall off, and there was no skin irritation. The LSMP can be attached for more than 5 d for long-term pulmonary monitoring. Additional adhesive breathability assessments were conducted using water-filled conical tube tests [54] to gauge the transmission of gaseous H2O (Fig. S5 in Appendix A). The Easyderm permeability was improved by adding 1 mm diameter holes (spacing = 3 mm), enhancing the LSMP's long-term monitoring capability. Fig. S6 in Appendix A presents the SNR calculations for wheezing over 7 d, with values ranging from 19.09 to 21.1 dB. Depending on external noise from heating, ventilation, and air conditioning (HVAC), the standard deviation reached 1.15 dB but averaged 0.80 dB, resulting in an error rate of approximately 4%. This procedure demonstrates the long-term reliability of the sensor.
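One reading that is consistent with these figures, offered here as an assumption rather than as the authors' stated calculation, is that the error rate is the average standard deviation relative to the mean SNR. Taking the midpoint of the reported range as a stand-in for the mean gives:

```latex
\frac{\sigma_{\mathrm{avg}}}{\overline{\mathrm{SNR}}}
  \approx \frac{0.80\ \mathrm{dB}}{(19.09 + 21.1)/2\ \mathrm{dB}}
  = \frac{0.80}{20.095}
  \approx 0.040 \approx 4\%
```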
3.2. Characterization of the LSMP sensor with a normal subject
We demonstrated the performance of the LSMP with an artificial simulator that generates sound signals (Figs. S7 and S8, and Section S1 in Appendix A). These examples demonstrate that the developed LSMP sensor extracted cleaner signals than the e-stethoscope for various auscultation sounds. The feasibility of classifying respiratory symptoms was also confirmed by assessing the differences in the acoustic properties of the wheezing and crackling sounds for both adventitious breathing types. However, human auscultation includes sounds from other organs and requires additional analyses. We therefore evaluated the performance of the LSMP on human participants (Fig. S9 in Appendix A).
Fig. 2 shows the complete data analysis process and algorithm development for classifying the original bioacoustic signals recorded by the LSMP, in which heart rate (HR) and respiratory rate (RR) are processed simultaneously for interpretation (Figs. 2(a) and (b)). These processing steps were conducted using the application after acquiring the entire auscultation signal. Fig. 2(c) shows the original 12 s bioacoustic signal acquired using the LSMP from the posterior left lung field of a healthy volunteer (36 years old, male). Before processing, the signal consisted of an indiscriminate mixture of information from the heart (blue line) and lungs (red line) with an unclear boundary between inhalation and exhalation, making it challenging to classify the systolic (S1) and diastolic (S2) heart sounds. Figs. 2(d) and (e) show the processed data that separated the heart and respiration sounds for further classification. The estimated HR and RR were calculated by counting cycles for 10 s and are presented as beats per minute and breaths per minute, respectively. Fig. 2(f) shows an expansion of the red-dotted box shown in Fig. 2(d), highlighting the S1 and S2 signals within the cardiac cycle. The HR and HR variability can be calculated using cardiac S1 and S2 analyses, providing important information to cardiologists. Fig. 2(g) shows a spectrogram of the entire auscultation signal with normal breathing and heartbeat. Normal respiration appeared as a soft, broad signal in the 100–1000 Hz band. The LSMP must maintain communication without interruption for continuous lung-sound monitoring via wireless communication, even when the sensor and cell phone are separated by distance or clothing. Fig. 2(h) shows the RSSI values obtained when the participant was and was not wearing clothes as a function of the distance from the mobile device.
The antenna embedded within the LSMP communicates directly with the mobile device, and the distance varies depending on the conditions. Obstacles (clothes) resulted in an average difference of 10 dBm within 5 m and an average signal strength of −78 dBm without any other communication problems. These findings indicate that the LSMP can maintain a steady communication strength within a distance of 5 m from the receiver without interruption when attached to the participant's skin.
In summary, the proposed system’s data-processing algorithm can distinguish the HR and RR of a healthy participant from LSMP-acquired bioacoustic data by extracting soft, breezy, and broadband breath sounds. Characterization of the LSMP in a healthy participant revealed excellent auscultation performance compared with that of an e-stethoscope, indicating its suitability for clinical assessment. The LSMP can classify heart and lung sounds even when they are mixed, and can classify the S1 and S2 periods, extending its use to the monitoring of cardiovascular diseases.
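The cycle-counting estimate of HR and RR used in this section (cycles counted over 10 s, reported per minute) amounts to a simple scaling. The sketch below illustrates it with hypothetical detection times; the function names and event times are assumptions, and the detection of the individual cycles is taken as given.

```python
def rate_per_minute(cycle_count, window_s=10.0):
    """Scale a cycle count observed over `window_s` seconds to a per-minute rate."""
    return cycle_count * 60.0 / window_s


def count_in_window(event_times_s, start_s=0.0, window_s=10.0):
    """Number of detected cycle onsets that fall inside the counting window."""
    return sum(start_s <= t < start_s + window_s for t in event_times_s)


# Hypothetical detection times in seconds, for illustration only.
s1_onsets = [0.4 + 0.85 * k for k in range(14)]   # one S1 onset per cardiac cycle
breath_onsets = [0.5, 4.1, 7.8, 11.3]             # one onset per breath

hr_bpm = rate_per_minute(count_in_window(s1_onsets))
rr_bpm = rate_per_minute(count_in_window(breath_onsets))
print(hr_bpm, rr_bpm)
```

With a 10 s window the estimate moves in steps of 6 per minute, so a longer window or averaging over several windows would give a finer resolution.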
3.3. Clinical study of pediatric patients with asthma
Pulmonary function tests (PFTs) are the gold standard for diagnosing respiratory diseases, including asthma and COPD. Given the findings of our assessment of the proposed device, we expect that long-term monitoring using the LSMP will help evaluate the degree of worsening in pediatric asthma patients under 6 years of age, as PFTs cannot be performed on them [55]. To confirm the acoustic characteristics of pediatric asthma patients, we conducted additional LSMP measurements and analyses in patients from this population. The clinical study was conducted on two pediatric patients with asthma: a 15-month-old boy (Fig. 3) and a 6-month-old boy (Fig. S10 in Appendix A), who were hospitalized with an acute respiratory illness.
Fig. 3(a) shows a photograph of the pediatric asthma patient recruited for this experiment. The dashed red box shows a magnified view of the attached LSMP. The LSMP was attached to the patient's back according to the clinician’s instructions, and bioacoustic data were recorded for 15 min.
Fig. 3(b) shows 12 s of representative time-series data, in which normal and abnormal breathing were recorded. The plot shows both the inhalation and exhalation cycles and a distinct difference in signal intensity between normal and abnormal breathing. Even during normal breathing, exhalation can be distinguished in the time-series plot, as shown in Fig. 3(b), whereas inhalation is not well distinguished. Physiologically, breathing follows an inhale–exhale–inhale–exhale order; therefore, there is an inhalation cycle between consecutive exhalation signals. Power spectral density (PSD) analysis was applied to detect the intensity of the frequency components of the exhaled and inhaled sections. The region between two exhaled signals in which the PSD had a higher intensity than the ambient noise range was set as the inhalation region. In Fig. S11 in Appendix A, the inhale, exhale, and reference noise data are represented by yellow, blue, and gray bars, respectively. As the overlapping PSD window was moved across the signal between successive exhalations, the point at which the intensity in the 100–500 Hz band exceeded that of the reference (external noise) was taken as the starting point of the inhalation. In particular, the 1st, 2nd, 3rd, and 6th abnormal breaths yielded high intensities, which can be used as an indication of an abnormal breathing sound and follow a trend similar to that of typical wheezing, as observed in the time-intensity plot from the Child Sim shown in Figs. S7(b) and (h).
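The sliding-window comparison described above can be sketched as follows: compute the power in the 100–500 Hz band for each overlapping frame and report the first frame that exceeds the reference-noise level. This is a simplified illustration, not the authors' code. The sampling rate, frame length, hop, margin factor, and the synthetic test signal are all assumptions, and a plain DFT periodogram stands in for whichever PSD estimator was actually used.

```python
import cmath
import math


def band_power(frame, fs, f_lo=100.0, f_hi=500.0):
    """Mean periodogram power of `frame` over the DFT bins inside [f_lo, f_hi]."""
    n = len(frame)
    k_lo = math.ceil(f_lo * n / fs)
    k_hi = math.floor(f_hi * n / fs)
    total = 0.0
    for k in range(k_lo, k_hi + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        total += abs(coeff) ** 2 / n
    return total / (k_hi - k_lo + 1)


def first_frame_above_noise(signal, noise_ref, fs, frame_len=256, hop=128, margin=2.0):
    """Start sample of the first frame whose band power exceeds margin x noise."""
    noise_level = band_power(noise_ref[:frame_len], fs)
    for start in range(0, len(signal) - frame_len + 1, hop):
        if band_power(signal[start:start + frame_len], fs) > margin * noise_level:
            return start
    return None


# Synthetic example: a quiet 300 Hz background followed by a louder 300 Hz burst.
FS = 4000  # assumed sampling rate in Hz
quiet = [0.01 * math.sin(2 * math.pi * 300 * i / FS) for i in range(1024)]
loud = [0.5 * math.sin(2 * math.pi * 300 * i / FS) for i in range(1024, 2048)]
onset = first_frame_above_noise(quiet + loud, quiet, FS)
print(onset, onset / FS)
```

The reported onset is quantized to the hop size, so a shorter hop gives a finer estimate of where the inhalation begins at the cost of more computation.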
Fig. 3(c) shows spectrograms of the normal breathing period (blue-dotted box), and Fig. 3(d) shows spectrograms of the abnormal breathing period (red-dotted box), and the quantitative details of individual physiologic events during normal and abnormal breathing. An analysis of the spectrograms revealed that during normal breathing, no specific frequency peak developed, and exhalations were slightly stronger than inhalation. Wheezing signatures (duration > 200 ms) were confirmed during the abnormal breathing periods. A distinct wheezing signature was observed four times during the exhalation phase, as indicated by the black-dotted box.
Figs. 3(e) and (f) show the fast Fourier transforms (FFTs) of the signal during exhalation in the normal and abnormal breathing periods, respectively. During normal breathing, no signal characteristics other than the background sound component were observed, whereas the FFTs during abnormal breathing revealed a characteristic wheezing peak. In summary, we used the LSMP in a pediatric patient with asthma to analyze the inhalation/exhalation phase of normal breathing and identify wheezing during abnormal respiration. In addition, the average RR for the pediatric asthma patient was 57 breaths per minute, which is considerably higher than the average RR (in the range of 28–46 breaths per minute) in normal pediatric age groups. These results are probably not conclusive because of the limited number of pediatric asthma patients (N = 2) included in the study. However, a considerable number of data sets (at least 20–30) were used to analyze a single patient, and we consider that the results represent acceptable clinical values.
The results were based on long-term recordings and offered insights into the acoustic characteristics of pediatric asthma. This continuous monitoring capability ensures a more detailed assessment of respiratory sounds over time, thereby facilitating the development of better diagnostic and monitoring tools for pediatric asthma. Our final goal was to conduct further clinical trials with a larger cohort of pediatric asthma patients to obtain more definitive results and identify new audio-based biomarkers for asthma diagnosis.
3.4. Clinical study of elderly patients with COPD
We also measured and analyzed the acoustic characteristics of patients with COPD using the LSMP. The clinical study was conducted on five elderly patients with COPD: a 72-year-old male (Fig. 4 and Fig. S12(a) in Appendix A), a 71-year-old female (Fig. 5 and Fig. S13 in Appendix A), a 68-year-old male (Fig. S12(b)), a 75-year-old male (Fig. S12(c)), and a 69-year-old male (Fig. S12(d)), who were hospitalized with acute respiratory illnesses.
Fig. 4(a) shows a photograph of an elderly patient with COPD. The LSMP was attached to the patient's back, and bioacoustic data were recorded for 15 min. Fig. 4(b) shows 12 s of representative time-series data, in which normal and abnormal breathing sounds were successively recorded among the continuously measured data. Although the inhalation/exhalation cycle can be observed during normal breathing, the assessment of any abnormal breathing intensity differences in the time-intensity plot is challenging owing to the noise. The main objective of a previous study [51] was to acquire HR/RR based on dynamic motion in real-time using two IMUs as the main sensors. In particular, that approach has the major advantage that the motion noise generated by body movements can be canceled using the two sensors. However, it cannot acquire bioacoustic signals because of the characteristics of the sensor, which differ from those of auscultation. As shown in Fig. 4(b), even when it is not easy to distinguish between inhalation and exhalation owing to external noise, it is possible to distinguish them through the discrete wavelet transform (DWT) and continuous wavelet transform (CWT) [56].
Decomposition was performed using DWT (low-pass and high-pass filters), and a threshold was applied to the wavelet coefficients to minimize noise and extract breath sounds from the noisy signal [57]. A distinct wheezing exhalation phase was observed at the beginning of the abnormal period, and the PSD of the exhalation at approximately 4 s yielded the distribution of the spectrum. Between the two exhalations, there was an inhalation that had higher intensity than the noise region; therefore, we continued to analyze the data to identify a PSD region that had higher intensity than the noise region and lasted longer than 0.5 s, which we designated as the inhalation period. Using the same logic, we distinguished the inhaled signal (between 2 and 3 s) and the exhaled signal (between 1 and 2 s) as shown in Fig. S14 in Appendix A. We calculated the SNR of inhalation and exhalation using the noise at pauses between breaths. The differences in the SNR values of the inhale and exhale breaths were 13.90 and 16.39 dB, respectively. PSD and SNR analyses after wavelet denoising can be used to distinguish between inhalation and exhalation, even when ambient noise is present and even when the recorded breath sounds are small. However, the three breaths captured from 6 to 12 s had long duration and high intensities, and although there was extensive noise, these can be interpreted as signs of abnormal respiration.
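The DWT denoising step described above (decompose with low-pass and high-pass filters, threshold the coefficients, reconstruct) can be illustrated with a minimal sketch. The wavelet family, number of levels, and threshold rule are not stated in the text, so the Haar wavelet, three levels, and soft thresholding used here are assumptions chosen for brevity.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(x):
    """One-level Haar DWT: low-pass (approximation) and high-pass (detail) halves."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / SQRT2, (a - d) / SQRT2))
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; values within +/- t become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def wavelet_denoise(x, threshold, levels=3):
    """Multi-level Haar denoising; len(x) must be divisible by 2**levels."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(soft_threshold(d, threshold))
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx


# Hypothetical example: a constant level with small alternating noise.
noisy = [1.0 + 0.1 * (-1) ** i for i in range(8)]
print([round(v, 6) for v in wavelet_denoise(noisy, threshold=0.2)])
```

With a threshold of zero the function returns the input unchanged, which is a convenient check that the decomposition and reconstruction are consistent.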
Fig. 4(c) shows the spectrogram for the normal breathing period, demonstrating the presence of an external HVAC system, a very short duration, and relatively broadband noise between respirations. Fig. 4(d) shows the spectrogram for the abnormal respiration period in which different wheezing signals of varying duration and frequency bands were observed during the three exhalations. This shows a wheezing signature consisting of a strong intensity at a specific frequency in the spectrogram for a certain duration, which can be easily distinguished as the inflection line (red box) from the background sound for each adventitious exhalation. In addition, a comparison of Figs. 3(d) and 4(d) shows that the baseline of the wheezing sounds exhibited a notable difference. The wheezing frequency is known to change with age, resulting in a difference from baseline [9], [55], [58].
Figs. 4(e) and (f) show the FFT during one exhalation cycle for normal and abnormal breathing, respectively. During the 400 ms expiratory phase of normal breathing (red box in Fig. 4(c)), no characteristic signal other than the background sound can be observed. As shown in Fig. 4(f), FFT analysis was performed to quantify the three types of wheezing signals observed during the expiratory phases of the abnormal breathing period. The analysis revealed a 400 ms monophonic wheezing sound comprising a high-intensity, single 600 Hz peak, and two polyphonic wheezing sounds, demonstrating a high-intensity 400 Hz peak and a 580 Hz peak for 1 s. Polyphonic wheezes are well-known symptoms in patients with extensive airflow obstruction (asthma, COPD, chronic bronchitis, etc.) and are manifested in the form of high-pitched wheezing during breathing when the airway is narrow or stiff [59].
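The distinction drawn above between a single spectral peak (monophonic) and several simultaneous peaks (polyphonic) can be sketched with a small peak finder over a DFT magnitude spectrum. This is an illustration on synthetic tones rather than the analysis used in the study: the sampling rate, the relative peak-height cutoff, and the rule of labeling by peak count are assumptions.

```python
import cmath
import math

FS = 2000         # assumed sampling rate in Hz
DURATION_S = 0.4  # 400 ms analysis segment


def tone(freqs_hz, fs=FS, duration_s=DURATION_S):
    """Synthetic test signal: sum of unit-amplitude sinusoids."""
    n = int(round(fs * duration_s))
    return [sum(math.sin(2 * math.pi * f * i / fs) for f in freqs_hz)
            for i in range(n)]


def spectral_peaks(x, fs, rel_height=0.5):
    """Frequencies (Hz) of local maxima at or above rel_height times the largest bin."""
    n = len(x)
    twiddle = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    mags = [abs(sum(v * twiddle[(k * i) % n] for i, v in enumerate(x)))
            for k in range(n // 2 + 1)]
    floor = rel_height * max(mags)
    return [k * fs / n for k in range(1, len(mags) - 1)
            if mags[k] >= floor and mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]


def wheeze_type(peaks):
    """Label by peak count (a simplification assumed for this sketch)."""
    if not peaks:
        return "none"
    return "monophonic" if len(peaks) == 1 else "polyphonic"


mono = spectral_peaks(tone([600]), FS)
poly = spectral_peaks(tone([400, 580]), FS)
print(mono, wheeze_type(mono))
print(poly, wheeze_type(poly))
```

Real recordings would need windowing and a noise-aware cutoff, since background sound produces many small local maxima that a fixed relative height may not reject.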
We showed that the LSMP can be used to distinguish between normal and abnormal breathing through short, continuous monitoring in a noisy environment. Given our findings, we expect that long-term monitoring using the LSMP will be useful for classifying the characteristics of abnormal breathing in elderly patients with respiratory diseases. The device configuration employed a single microphone as the primary sensor which demonstrated reliable performance in quiet environments. However, in noisy environments (where multiple input signals with overlapping frequency components are received simultaneously), selectively isolating the desired signal (bioacoustic signal) is challenging.
3.5. AI algorithm-based lung sound analysis
We conducted a simple demonstration to validate the application of machine learning to the LSMP for classifying breathing sounds, as shown in Fig. 5. The Child Sim, R.A.L.E.® Repository [60], and Littmann’s lung sounds [61] were used as reference database entries for the learning data, with an example shown in Fig. 5(a). We performed data augmentation to generate sufficient data to train the deep-learning model. The output shape of each layer and the number of parameters are shown in Fig. S15(a) in Appendix A. Starting with a reference database that included 18 samples of normal respiration and 11 samples of wheezing respiration, we modified the length of the extracted sound, representing changes from 1.0 × speed to 0.5 × speed in 0.1 × speed decrements, as shown in Fig. 5(b). Each modified data set was then sliced into 0.6 s windows with 0.06 s overlapping; for each window, the data were converted into Log-Mel spectrograms, which resemble human hearing more closely than Mel spectrograms do (Fig. 5(c), and Fig. S16 in Appendix A). Through this process, we augmented the original 29 samples into 1637 samples (839 normal breathing and 798 wheezing sound samples). Fig. 5(d) shows the data-processing flow within the deep-learning architecture. The training and validation data were segmented from the overall data sets at an 8:2 ratio, as shown in Fig. S15(b). After training, the validation accuracy of the model was approximately 0.99 (Fig. S15(c)), and the receiver operating characteristic curve of the trained model (depicted in Fig. 5(e) and Fig. S17 in Appendix A) indicated that the model had excellent training efficiency.
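The speed-based augmentation and windowing steps can be sketched as follows. This is an illustrative approximation rather than the pipeline used in this study: the 4 kHz sampling rate and the function names are assumptions, the speed change is implemented by simple linear-interpolation resampling (which also shifts pitch, whereas the study may have used a different time-stretching method), and the Log-Mel conversion is omitted because it would require an audio library.

```python
import numpy as np

def change_speed(clip, speed):
    """Resample a clip so it plays at `speed` x; speed < 1 lengthens the clip."""
    n_out = int(round(len(clip) / speed))
    src_pos = np.linspace(0, len(clip) - 1, n_out)   # positions in the original clip
    return np.interp(src_pos, np.arange(len(clip)), clip)

def slice_windows(clip, fs, win_s=0.6, overlap_s=0.06):
    """Cut a clip into fixed-length windows that overlap by `overlap_s` seconds."""
    win = int(round(win_s * fs))
    hop = win - int(round(overlap_s * fs))
    if len(clip) < win:
        return np.empty((0, win))
    starts = range(0, len(clip) - win + 1, hop)
    return np.stack([clip[s:s + win] for s in starts])

def augment(clip, fs, speeds=(1.0, 0.9, 0.8, 0.7, 0.6, 0.5)):
    """Apply every speed factor, then window each stretched clip."""
    return np.concatenate([slice_windows(change_speed(clip, s), fs) for s in speeds])

fs = 4000                                   # assumed sampling rate (Hz)
clip = np.sin(2 * np.pi * 400 * np.arange(0, 3.0, 1.0 / fs))   # 3 s synthetic clip
windows = augment(clip, fs)
print(windows.shape)                        # (number of windows, samples per window)
```

Because slower playback produces longer clips, each speed factor contributes more windows than the previous one, which is how a small reference set can grow into a much larger training set.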
3.6. AI-based wheezing counting algorithm for long-term analysis
The use of the LSMP to continuously monitor respiratory patients and quantify the extracted breathing sounds represents a novel clinical diagnostic application that overcomes the limitations of intermittent stethoscope use.
The LSMP was attached to the patient’s anterior right lung field by a clinician, and bioacoustic data were recorded continuously for 79 min while the patient lay on an air mattress and received oxygen therapy (Fig. 6(a)). Fig. 6(b) shows a time-series plot of the entire continuously measured waveform; three 12 s segments are highlighted as examples of the patient’s breathing characteristics. During forced breathing (Fig. 6(c)), a high-intensity signal was observed compared with normal respiration in the pediatric (Fig. 3(b)) and COPD patient data (Fig. 4(b)). The regular, high-intensity pattern and breathing duration were caused by the oxygen therapy device (Fig. S18 in Appendix A). In a simple time-series analysis, this signal can be confused with abnormal respiration; however, it results from artificial ventilation, which produces high-intensity, normal respiration.
The abnormal respiration shown in Fig. 6(d) depicts a low-pitch wheeze during both the inhalations and exhalations. Forced breathing produces a high-intensity signal, but the wheezing signature due to the deformation is distinguishable despite the strong background sound component (Fig. S18(b)). Fig. 6(e) shows the wheezing during exhalation, which is typical of asthma and COPD, with polyphonic wheezing rather than single-component wheezing.
Pulmonologists typically auscultate for 10–15 s per patient, whereas the LSMP enables all breathing patterns to be recorded for playback, although reviewing the recordings requires clinician labor. To reduce this effort and the misdiagnosis rate, we developed an AI-based event-counting algorithm to monitor the time-varying symptoms from COPD patient data using the LSMP. Despite the substantial clinical data length (79 min), AI analysis minimized the clinician’s effort for long-term lung function evaluations.
Fig. 6(f) shows a 30 s segment of clinical data covering 12 inhalation and exhalation cycles. The blue line represents the bioacoustic data, and the predictions of the trained model are marked with yellow dots. The model outputs a value between 0 and 1 for each label; we used the value predicted for the “wheeze” label. Clinical data were sliced with a fixed 0.6 s window, acquired every 0.06 s, and input into the model to calculate the predicted value for each segment. This prediction resolution was sufficiently high to detect normal and wheezing sounds within a single breath cycle. Because the model produced a prediction for the incoming signal every 0.06 s, we included an algorithm to count the number of wheezing events. The predicted values ranged between 0 and 1.0: a value close to 1.0 indicated that the input data were very close to a wheezing sound, and a value close to 0 indicated normal breathing. To achieve high prediction accuracy, we set the thresholds for wheezing and normal breathing to 0.9 and 0.1, respectively. When the predicted value dropped from 0.9 (wheezing) to 0.1 (normal breathing), the AI counted a wheezing event (Fig. 6(f), magnified in Fig. 6(g)). The yellow rectangles represent predicted wheezing sounds. The event counts and timing precisely matched the 12 wheezing events.
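The counting rule can be read as a two-threshold (hysteresis) state machine over the stream of “wheeze” predictions. The sketch below is one possible reading of that rule, not the implementation used in this study; the function name `count_wheeze_events` and the example prediction values are hypothetical.

```python
def count_wheeze_events(preds, high=0.9, low=0.1):
    """Count wheezing events in a sequence of 'wheeze' prediction values.

    An event starts when a prediction reaches `high` and is counted once the
    prediction later falls to `low`; values in between do not change the state.
    Returns the event count and the indices at which events were counted.
    """
    count = 0
    in_wheeze = False
    event_ends = []
    for i, p in enumerate(preds):
        if not in_wheeze and p >= high:
            in_wheeze = True               # wheeze detected, wait for it to end
        elif in_wheeze and p <= low:
            in_wheeze = False              # back to normal breathing: count the event
            count += 1
            event_ends.append(i)
    return count, event_ends

# Hypothetical predictions, one value per 0.06 s step.
preds = [0.0, 0.05, 0.95, 0.97, 0.5, 0.92, 0.3, 0.05, 0.02, 0.91, 0.08]
n_events, ends = count_wheeze_events(preds)
end_times_s = [i * 0.06 for i in ends]     # convert step index to seconds
print(n_events, ends)
```

Using two separated thresholds instead of a single cut-off keeps brief dips in the prediction (such as the 0.5 value in the example) from splitting one wheeze into several counted events.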
Fig. 6(h) compares the events counted over time using the “AI count” and a “Clinician count.” Over a total of 1630 breaths, the AI and the clinician counted the number of wheezing events every 5 min in the 79 min COPD lung sound data. The total count was 1450 for the clinician and 1430 for the AI, with an average match rate of 80.5% (Fig. S13(a)). The results showed that the AI algorithm can classify normal and wheezing sounds with high accuracy, particularly in patients with asthma or COPD, indicating that the LSMP can monitor lung sounds to determine symptom severity and changes over time. Despite the small sample size, this study demonstrated an acceptable clinical value for long-term data reliability. AI classification over long-term (79 min) data also demonstrated reliability levels higher than that of the “pulmonologist fellow.” Clinical studies were conducted in a hospital with some noise, which can disrupt the AI count accuracy.
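The total counts differ by only about 1.4% (1430/1450 ≈ 0.986), whereas the average match rate is 80.5%; this is consistent with interval-level over- and under-counts that partly cancel in the totals. The exact definition of the match rate is not given in this section, so the sketch below assumes one plausible definition (the smaller count divided by the larger count in each interval, averaged over intervals). The function name and the interval counts are hypothetical and are not data from this study.

```python
def interval_match_rates(ai_counts, clinician_counts):
    """Per-interval agreement (%), ASSUMED here to be min(count) / max(count)."""
    rates = []
    for ai, cl in zip(ai_counts, clinician_counts):
        if max(ai, cl) == 0:
            rates.append(100.0)            # both counted nothing: full agreement
        else:
            rates.append(100.0 * min(ai, cl) / max(ai, cl))
    return rates

# Hypothetical counts for three 5 min intervals (not data from this study).
ai_counts = [80, 100, 120]
clinician_counts = [100, 100, 100]

rates = interval_match_rates(ai_counts, clinician_counts)
mean_rate = sum(rates) / len(rates)
total_ratio = 100.0 * sum(ai_counts) / sum(clinician_counts)
print(rates, mean_rate, total_ratio)
```

In this toy case the totals agree exactly while the mean interval-level match rate is lower, which mirrors the gap between the total counts and the reported average match rate.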
Lee et al. [39] also developed a wearable stethoscope integrated with an AI algorithm to detect lung and heart sounds. Previous research has demonstrated that AI systems can achieve a classification accuracy of up to 95% using breath sounds. However, these studies employed comparatively limited data sets, ranging from 30 s to 2 min. In the best-case scenario, we achieved a higher accuracy (98.8%) when the AI classification was performed on a considerably longer clinical data set (13 min) obtained from actual patients with COPD (Fig. S12(b)). Moreover, we focused more on continuous monitoring and clinical trials. We demonstrated the reliability of our approach for clinical use by counting the number of wheezing episodes in the long-term clinical lung sound data (75 min). We counted the number of wheezing events per minute, rather than only detecting whether the patient experienced wheezing. By combining our wearable stethoscope with an AI model for classification, we demonstrated how it can be used for adventitious sound monitoring of patients.
Table 2 [[39], [62], [63], [64], [65]] summarizes the recently reported AI-based abnormal breath sound classification performances. We obtained not only the highest training validation performance but also the best classification accuracy on long-duration real clinical data. Compared with the other sensor platforms presented in Table 1, the LSMP, as a flexible patch-type sensor, is suitable for long-term continuous auscultation. It also shows a high accuracy (≥ 80.5%) on actual clinical data after AI training on open databases. Whereas the temporal span of the clinical data used for validation in most studies is between 30 s and 1 min, this study validated the sensor and AI model for at least 10 min and up to 78 min per data set, showing the high reliability of the sensor and AI model. As a supplement, we compared the count trajectories for all clinical cases (two healthy adults, two pediatric patients with asthma, and five elderly patients with COPD) used in this study every minute (Figs. S9, S10, S12, and S13(b)) and, to verify the sensing reliability, plotted the predictions for three regions extracted from the Fig. 6 data (Fig. S19 in Appendix A), each approximately 18 s long. Detailed information on all clinical patients (N = 9) and AI classification reliability is listed in Table S2.
There were noticeable differences between the counts obtained by clinicians and those obtained by the AI model in some cases. These differences were primarily attributed to factors such as the frequency overlap caused by surrounding environmental noise and limitations in the algorithm owing to the amount of AI training data.
4. Conclusions
Conventional auscultation devices, especially analog and e-stethoscopes, are not suitable for continuous auscultation because of their structural limitations; however, it is important to evaluate changes in lung function based on continuous auscultation in respiratory patients. For a patient with acute or intermittent respiratory disease, changes in symptoms should be accurately evaluated through continuous auscultation so that adequate treatment (oxygen, rescue inhaler) can be provided [66]. If information such as lung sounds and the number of wheezes, crackles, and coughs can be delivered to the clinician or the user, preemptive measures can be taken to prevent dangerous situations.
In this study, we developed a wearable stethoscope that can overcome the spatiotemporal limitations of the conventional stethoscope for continuous lung sound monitoring, which uses wireless communication and is controlled by a mobile device. Based on various evaluations of the LSMP sensor characteristics, the optimal combination for wireless auscultation was identified, leading to the construction of an ideal sensor. Based on these results, the LSMP was miniaturized and made lightweight so that it could be attached to the skin for a long time without skin irritation. Adventitious breathing sounds with different characteristics were evaluated following the clinical assessments of pediatric patients with asthma and elderly patients with COPD. In particular, the LSMP can be used as a novel continuous auscultation device that can evaluate lung function in cases requiring symptom control for pediatric patients under 6 years old, for whom PFTs cannot be performed. Furthermore, the classification of abnormal breathing sounds achieved a high accuracy (80.1%), at the level of a pulmonology fellow. The efforts of clinicians in long-term lung function evaluations were minimized by using AI analyses of the extracted signals.
Future work will include the integration of novel, active-noise cancellation technology, which will facilitate the auscultation of physiologic signals during daily activities and enable long-term monitoring over a 24 h period; this will allow the understanding of the relationship between lung sounds and environmental changes or drug administration and will provide clinical data for medical decision-making. In particular, continuous monitoring of patients with chronic respiratory diseases can provide useful information for evaluating the status of the disease, because cough, average RR, dyspnea, and wheezing, which occur in these diseases, are related to worsening symptoms.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the Korea Environment Industry & Technology Institute (KEITI) through Digital Infrastructure Building Project for Monitoring, Surveying and Evaluating the Environmental Health program, funded by the Korea Ministry of Environment (MOE) (2021003330008). This work was supported by the KIST Internal program (2E32851). This work was supported by the Korea Health Technology Research and Development (R&D) Project through the Korea Health Industry Development Institute (KHIDI) and Korea Dementia Research Center (KDRC), funded by the Ministry of Health & Welfare and Ministry of Science and ICT, Republic of Korea (HU20C0164); the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2022R1A6A3A01087298).
Geddes LA. Birth of the stethoscope. IEEE Eng Med Biol Mag 2005; 24(1):84-86.
[2]
Sarkar M, Madabhavi I, Niranjan N, Dogra M. Auscultation of the respiratory system. Ann Thorac Med 2015; 10(3):158-168.
[3]
Gavriely N, Nissan M, Rubin AH, Cugell DW. Spectral characteristics of chest wall breath sounds in normal subjects. Thorax 1995; 50(12):1292-1300.
[4]
Emmanouilidou D, Patil K, West J, Elhilali M. A multiresolution analysis for detection of abnormal lung sounds. In: Proceedings of the 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society; 2012 Aug 28–Sep 1; San Diego, CA, USA. Piscataway: IEEE; 2012. p. 3139–42.
[5]
Haider NS, Joseph J, Periyasamy R. An investigation on the statistical significance of spectral signatures of lung sounds. Biomed Res 2017; 28(6):2801-2810.
[6]
Sutherland ER. Nocturnal asthma: underlying mechanisms and treatment. Curr Allergy Asthma Rep 2005; 5:161-167.
Rietveld S, Oud M, Dooijes EH. Classification of asthmatic breath sounds: preliminary results of the classifying capacity of human examiners versus artificial neural networks. Comput Biomed Res 1999; 32(5):440-448.
[9]
Aviles-Solis JC, Jacome C, Davidsen A, Einarsen R, Vanbelle S, Pasterkamp H, et al. Prevalence and clinical associations of wheezes and crackles in the general population: the Tromsø study. BMC Pulm Med 2019; 19:173.
Kim Y, Hyon Y, Jung SS, Lee S, Yoo G, Chung C, et al. Respiratory sound classification for crackles, wheezes, and rhonchi in the clinical field using deep learning. Sci Rep 2021; 11:17186.
[12]
Shah MA, Shah IA, Lee DG, Hur S. Design approaches of MEMS microphones for enhanced performance. J Sens 2019; 2019:9294528.
[13]
Algamili AS, Khir MHM, Dennis JO, Ahmed AY, Alabsi SS, Ba Hashwan SS, et al. A review of actuation and sensing mechanisms in MEMS-based sensor devices. Nanoscale Res Lett 2021; 16(1):16.
[14]
Lakhe A, Sodhi I, Warrier J, Sinha V. Development of digital stethoscope for telemedicine. J Med Eng Technol 2016; 40(1):20-24.
[15]
Vilendrer S, Sackeyfio S, Akinbami E, Ghosh R, Luu JH, Pathak D, et al. Patient perspectives of inpatient telemedicine during the COVID-19 pandemic: qualitative assessment. JMIR Form Res 2022; 6(3):e32933.
[16]
Mesquita CT, Reis JC, Simoes LS, Moura EC, Rodrigues GA, Athayde CC, et al. Digital stethoscope as an innovative tool on the teaching of auscultatory skills. Arq Bras Cardiol 2013; 100(2):187-189.
[17]
Legget ME, Toh M, Meintjes A, Fitzsimons S, Gamble G, Doughty RN. Digital devices for teaching cardiac auscultation - a randomized pilot study. Med Educ Online 2018; 23(1):1524688.
[18]
Jin Y, Chen G, Lao K, Li S, Lu Y, Gan Y, et al. Identifying human body states by using a flexible integrated sensor. npj Flex Electron 2020; 4:28.
[19]
Gao Y, Yu L, Yeo JC, Lim CT. Flexible hybrid sensors for health monitoring: materials and mechanisms to render wearability. Adv Mater 2020; 32(15):1902133.
[20]
Linghu C, Zhang S, Wang C, Song J. Transfer printing techniques for flexible and stretchable inorganic electronics. npj Flex Electron 2018; 2:26.
Jang H, Sel K, Kim E, Kim S, Yang X, Kang S, et al. Graphene e-tattoos for unobstructive ambulatory electrodermal activity sensing on the palm enabled by heterogeneous serpentine ribbons. Nat Commun 2022; 13(1):6604.
Hu Y, Xu Y. An ultra-sensitive wearable accelerometer for continuous heart and lung sound monitoring. In: Proceedings of the 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society; 2012 Aug 28–Sep 1; San Diego, CA, USA. Piscataway: IEEE; 2012. p. 694–7.
[34]
Gupta P, Moghimi MJ, Jeong Y, Gupta D, Inan OT, Ayazi F. Precision wearable accelerometer contact microphones for longitudinal monitoring of mechano-acoustic cardiopulmonary signals. npj Digit Med 2020; 3:19.
[35]
Lee K, Ni X, Lee JY, Arafa H, Pe DJ, Xu S, et al. Mechano-acoustic sensing of physiological processes and body motions via a soft wireless device placed at the suprasternal notch. Nat Biomed Eng 2020; 4(2):148-158.
[36]
Prasad M, Sahula V, Khanna VK. Design and fabrication of Si-diaphragm, ZnO piezoelectric film-based MEMS acoustic sensor using SOI wafers. IEEE Trans Semicond Manuf 2013; 26(2):233-241.
[37]
Hayber SE, Tabaru TE, Keser S, Saracoglu OG. A simple, high sensitive fiber optic microphone based on cellulose triacetate diaphragm. J Lightwave Technol 2018; 36(23):5650-5655.
Lee SH, Kim YS, Yeo MK, Mahmood M, Zavanelli N, Chung C, et al. Fully portable continuous real-time auscultation with a soft wearable stethoscope designed for automated disease diagnosis. Sci Adv 2022; 8(21):eabo5867.
[40]
Yilmaz G, Rapin M, Pessoa D, Rocha BM, de Sousa AM, Rusconi R, et al. A wearable stethoscope for long-term ambulatory respiratory health monitoring. Sensors 2020; 20(18):5124.
[41]
Chung HU, Kim BH, Lee JY, Lee J, Xie ZQ, Ibler EM, et al. Binodal, wireless epidermal electronic systems with in-sensor analytics for neonatal intensive care. Science 2019; 363(6430):eaau0780.
[42]
Chung HU, Rwei AY, Hourlier-Fargette A, Xu S, Lee KY, Dunne EC, et al. Skin-interfaced biosensors for advanced wireless physiological monitoring in neonatal and pediatric intensive-care units. Nat Med 2020; 26(3):418-429.
[43]
Liu Y, Norton JJS, Qazi R, Zou ZN, Ammann KR, Liu H, et al. Epidermal mechano-acoustic sensing electronics for cardiovascular diagnostics and human-machine interfaces. Sci Adv 2016; 2(11):e1601185.
[44]
Kraman SS, Wodicka GR, Pressler GA, Pasterkamp H. Comparison of lung sound transducers using a bioacoustic transducer testing system. J Appl Physiol 2006; 101(2):469-476.
[45]
Kraman SS, Pressler GA, Pasterkamp H, Wodicka GR. Design, construction, and evaluation of a bioacoustic transducer testing (BATT) system for respiratory sounds. IEEE Trans Biomed Eng 2006; 53(8):1711-1715.
[46]
Shkel AA, Kim ES. Wearable low-power wireless lung sound detection enhanced by resonant transducer array for pre-filtered signal acquisition. In: Proceedings of the 2017 19th International Conference on Solid-State Sensors, Actuators and Microsystems (Transducers); 2017 Jun 18–22; Kaohsiung, China. Piscataway: IEEE; 2017. p. 842–5.
[47]
Shkel AA, Kim ES. Continuous health monitoring with resonant-microphone-array-based wearable stethoscope. IEEE Sens J 2019; 19(12):4629-4638.
[48]
Lee SH, Kim YS, Yeo WH. Advances in microsensors and wearable bioelectronics for digital stethoscopes in health monitoring and disease diagnosis. Adv Healthc Mater 2021; 10:2101400.
[49]
Serato JHL, Reyes R. Automated lung auscultation identification for mobile health systems using machine learning. In: Proceedings of the 2018 IEEE International Conference on Applied System Invention (ICASI); 2018 Apr 13–17; Chiba, Japan. Piscataway: IEEE; 2018. p. 287–90.
[50]
Bohadana A, Izbicki G, Kraman SS. Fundamentals of lung auscultation. N Engl J Med 2014; 370:744-751.
[51]
Jeong H, Lee JY, Lee K, Kang YJ, Kim JT, Avila R, et al. Differential cardiopulmonary monitoring system for artifact-canceled physiological tracking of athletes, workers, and COVID-19 patients. Sci Adv 2021; 7(20):eabg3092.
[52]
Bhaskar A. A simple electronic stethoscope for recording and playback of heart sounds. Adv Physiol Educ 2012; 36:360-362.
[53]
Vallejo M, Recas J, del Valle PG, Ayala JL. Accurate human tissue characterization for energy-efficient wireless on-body communications. Sensors 2013; 13(6):7546-7569.
[54]
Wei Y, Shi X, Yao Z, Zhi J, Hu L, Yan R, et al. Fully paper-integrated hydrophobic and air permeable piezoresistive sensors for high-humidity and underwater wearable motion monitoring. npj Flex Electron 2023; 7:13.
[55]
Reddel HK, Bacharier LB, Bateman ED, Brightling CE, Brusselle GG, Buhl R, et al. Global initiative for asthma strategy 2021: executive summary and rationale for key changes. Eur Respir J 2022; 59(1):2102730.
[56]
Pouyani MF, Vali M, Ghasemi MA. Lung sound signal denoising using discrete wavelet transform and artificial neural network. Biomed Signal Process Control 2022; 78(Part B):103329.
Mendes L, Vogiatzis IM, Perantoni E, Kaimakamis E, Chouvarda I, Maglaveras N, et al. Detection of wheezes using their signature in the spectrogram space and musical features. In: Proceedings of the 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC); 2015 Aug 25–29; Milan, Italy. Piscataway: IEEE; 2015. p. 5581–4.
[63]
Torre-Cruz J, Canadas-Quesada F, Carabias-Orti J, Vera-Candeas P, Ruiz-Reyes N. A novel wheezing detection approach based on constrained non-negative matrix factorization. Appl Acoust 2019; 148:276-288.
[64]
Perna D, Tagarelli A. Deep auscultation: predicting respiratory anomalies and diseases via recurrent neural networks. In: Proceedings of the 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS); 2019 Jun 5–7; Cordoba, Spain. Piscataway: IEEE; 2019. p. 50–5.
[65]
Im S, Kim T, Min C, Kang S, Roh Y, Kim C, et al. Real-time counting of wheezing events from lung sounds using deep learning algorithms: implications for disease prediction and early intervention. PLOS ONE 2023; 18(11):e0294447.
[66]
DiMango E, Rogers L, Reibman J, Gerald LB, Brown M, Sugar EA, et al. Risk factors for asthma exacerbation and treatment failure in adults and adolescents with well-controlled asthma during continuation and step-down therapy. Ann Am Thorac Soc 2018; 15(8):955-961.