1 Introduction

The population of older adults is increasing globally. The World Health Organization defines an older adult as a person above the age of 65 [1]. According to the United Nations, the proportion of older adults will increase from 2020 to 2050 in all parts of the world, and the share of the world’s population aged 65 and above is expected to increase from 9.3% to 16.0% [2]. The proportion of older adults in Japan is relatively high; according to the Cabinet Office, the aging rate was 28.4% (35.8 million people) in 2019, and it is expected to reach 37.7% in 2050 [3].

To address the challenges of an aging society, various healthcare devices and life support robots for older adults have been developed for home medical care, nursing care, and general support. These devices utilize information and communication technology. In particular, several studies have monitored the health conditions of older adults living alone by using monitoring systems based on physical health data to prevent lonely deaths [4,5,6,7]. Although advances in information equipment make it easy to build a system that provides physical support, little has been done to apply such equipment to reducing loneliness and anxiety and improving the mental health of older adults. Nevertheless, the few existing studies suggest that communication robots can be potential partners for improving the mental health of older adults. In a previous study on the mental effect of communication robots on older adults living in a nursing home, experiments were conducted for five weeks using seal robots [8].

Libin et al. reported that humans experience positive emotions, such as joy and interest, through their involvement with interactive robots that have the appearance of living things [9]. Exploratory pilot tests using a conversational agent-based system also showed that agents that actively engage older people in interactions are more effective against loneliness than agents that converse passively [10]. These results indicate that a communication robot can reduce the feeling of loneliness among older adults. While several studies have investigated the relationship between the quality of life of older adults and the use of support robots, research has not been conducted on a system that simultaneously provides physical and mental care for older adults. Therefore, we aimed to develop a system that supports the physical and mental care of older adults aged 65 and above. We developed a support system that uses a communication robot named Unibo, and we evaluated it subjectively, through the participants’ impressions, and objectively, through the speech recognition rate. In this system, the communication robot regularly measures the body temperature, blood pressure, and pulse of the older adult. The results are then sent to medical personnel to support physical care, and the robot provides conversations and games to alleviate loneliness and anxiety.

2 Methods

2.1 System Description

The communication robot is the core of the proposed system, and Unibo (Unirobot Corp.) was used because of its functionality and familiar appearance. As shown in Fig. 1, Unibo has a stylized, childlike appearance. The robot is 32 cm in height, 26 cm in width, 16 cm in depth, and 2.5 kg in weight. Unibo picks up the participant’s speech through the microphone on its head, and the speech is analyzed by a cloud-based conversation application programming interface (API) via the Internet. The API returns an appropriate response according to the semantic content of the analyzed speech. The robot’s face is a liquid crystal display (LCD) touch screen that reads the touch operations of the participant and displays facial expressions or images. A camera is attached above the face, and it can take pictures and record videos. Both arms of the robot move 90° back and forth, driven by a shaft motor, and touch sensors are mounted on the legs. The robot has a built-in communication module that enables wireless communication via Wi-Fi and Bluetooth.

Fig. 1. The communication robot, Unibo.

Fig. 2. A participant using Unibo.

Our earlier research on an oral function improvement system for older adults [11] showed that, in many cases, the tablet terminal does not respond to the touch operations of older adults. Therefore, in this study, the system communicates mainly by voice, and touch operation is used only when necessary (a hybrid specification). The physical support system that measures vital signs does not require any touch operation, and the mental health support system mainly uses verbal conversations and instructions, except for occasional quizzes in which answers are selected by voice or tap. In this study, the distance between Unibo and the participant was set at approximately 1 m. Figure 2 shows an example of a participant using Unibo.

In this study, the physical support system for measuring vital signs aims to support the management of the daily physical condition of older adults. Unibo gives verbal instructions so that the older adult can measure vital signs using a thermometer and a sphygmomanometer. The system is designed to enable older adults to measure body temperature and blood pressure naturally by listening to Unibo’s verbal instructions and watching the images displayed on Unibo’s screen. The activity diagram of this system is shown in Fig. 3. Unibo automatically collects the measurement results from each device and completes the daily report by sending an e-mail summarizing the measurement data to the medical personnel.
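The reporting step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation running on Unibo: the `VitalSigns` fields, the `build_daily_report` function, the report wording, and the example addresses are all hypothetical, and actually sending the message (for instance with `smtplib`) is left out.

```python
from dataclasses import dataclass
from datetime import date
from email.message import EmailMessage
from typing import Optional


@dataclass
class VitalSigns:
    """One day's readings. Field names are illustrative, not Unibo's."""
    body_temperature_c: Optional[float] = None
    systolic_mmhg: Optional[int] = None
    diastolic_mmhg: Optional[int] = None
    pulse_bpm: Optional[int] = None

    def is_complete(self) -> bool:
        # The report is only built once every device has delivered a value.
        values = (self.body_temperature_c, self.systolic_mmhg,
                  self.diastolic_mmhg, self.pulse_bpm)
        return None not in values


def build_daily_report(vitals: VitalSigns, day: date,
                       sender: str, recipient: str) -> EmailMessage:
    """Summarize the collected measurements in a plain-text e-mail."""
    if not vitals.is_complete():
        raise ValueError("all measurements must be collected before reporting")
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Daily vital signs report {day.isoformat()}"
    msg.set_content(
        f"Body temperature: {vitals.body_temperature_c:.1f} C\n"
        f"Blood pressure: {vitals.systolic_mmhg}/{vitals.diastolic_mmhg} mmHg\n"
        f"Pulse: {vitals.pulse_bpm} bpm\n"
    )
    return msg


if __name__ == "__main__":
    # Hypothetical readings and addresses, for illustration only.
    report = build_daily_report(
        VitalSigns(36.5, 120, 80, 70), date(2020, 1, 1),
        "unibo@example.com", "nurse@example.com",
    )
    print(report)
```

Refusing to build the report until every value is present mirrors the idea that the daily report is completed only after the results from each device have been collected.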

Fig. 3. Flow chart of the vital signs measurement system.

Fig. 4. System configuration diagram.

The vital signs measured by the physical support system are blood pressure, pulse rate, and body temperature, because abnormalities in these vital signs can be easily detected. The measurement devices were selected based on compatibility with Unibo: an oscillometric electronic sphygmomanometer worn on the upper arm (UA-651BLE, A&D Co., Ltd.) and a thermistor-type electronic thermometer used in the axilla (estimation type, 30 s; UT-201BLE, A&D Co., Ltd.). The results measured by each device are sent to Unibo wirelessly using Bluetooth Low Energy. Figure 4 illustrates the configuration of the vital sign support system for evaluating the physical conditions.

To increase Unibo’s speech recognition rate, we used two speech recognition modes depending on the situation. The first is the trigger mode, which starts speech recognition when the older adult says “Unibo” (hereafter “wake-up word”). The flow chart of this mode is shown in Fig. 5. In this mode, Unibo responds only when it detects the wake-up word. Therefore, even if Unibo is installed in the living space of an older adult, it does not disturb them with unnecessary reactions. Speaking the wake-up word is also expected to make it easier for the older adult to identify when Unibo’s speech recognition begins. The trigger mode is therefore expected to increase the success rate of speech recognition when calling the various contents that are activated by speaking to Unibo, such as starting a conversation with Unibo, games, and Internet searches. Hence, the trigger mode was the default setting when Unibo was installed in the experimental environment, and it was used in the experiment in which participants called various contents and evaluated their impression of them.

The second is the non-trigger mode. Unibo can engage in continuous and flexible daily conversation (“free conversation”) with the participant. In these situations, however, speaking the wake-up word before every utterance is cumbersome. Therefore, to realize continuous and flexible free conversations, the trigger mode is disabled.

Fig. 5. Flow chart of the trigger mode.

Fig. 6. Flow chart of the non-trigger mode.

Figure 6 shows the activity diagram for when the trigger mode is turned off. During a free conversation, for example, if an older adult says “Good morning” to Unibo, the robot responds in kind by saying “Good morning. Did you sleep well yesterday?” The older adult then answers the question, and Unibo again responds with appropriate reactions and expressions to the older adult’s utterance. The conversation continues in this way, establishing a free conversation. This flexibility is a major conversational feature of the developed system.
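The difference between the two modes can be illustrated with a small Python sketch. It is an assumption-laden simplification rather than Unibo’s actual logic: the `SpeechGate` class and its `accept` method are hypothetical names, utterances are taken as already transcribed text, and the sketch assumes that each wake-up word opens recognition for exactly one following utterance.

```python
from typing import Optional

WAKE_WORD = "unibo"  # the wake-up word described in the text


class SpeechGate:
    """Decides which utterances are passed on to speech recognition.

    Hypothetical helper: in trigger mode an utterance is forwarded only
    if the wake-up word was spoken just before it; with trigger mode
    disabled (free conversation) every utterance is forwarded.
    """

    def __init__(self, trigger_mode: bool = True) -> None:
        self.trigger_mode = trigger_mode
        self._awake = False

    def accept(self, utterance: str) -> Optional[str]:
        """Return the text to forward, or None if the utterance is ignored."""
        text = utterance.strip()
        if not self.trigger_mode:
            return text or None
        if self._awake:
            if not text:
                return None  # nothing was said yet, keep listening
            self._awake = False  # assumption: one utterance per wake-up
            return text
        if text.lower().rstrip("!.?,") == WAKE_WORD:
            self._awake = True  # recognition starts with the next utterance
        return None


if __name__ == "__main__":
    gate = SpeechGate(trigger_mode=True)
    for said in ["Good morning", "Unibo", "Play a song", "And a quiz"]:
        print(repr(said), "->", repr(gate.accept(said)))
```

In this sketch only “Play a song” is forwarded in trigger mode, whereas a gate created with `trigger_mode=False` forwards every utterance, which corresponds to the free conversation setting.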

2.2 Survey Method

The participants in this study were older adults. We aimed to evaluate the system through subjective evaluation by the participants and objective evaluation based on the speech recognition rate. Hence, the survey was divided into three sections to verify the applicability of the system. Additionally, the experimental sessions were recorded with a video camera.

In the first section, participants evaluated the physical support system subjectively by testing the system that measures vital signs at the verbal instruction of Unibo. Here, the participant operated the thermometer and sphygmomanometer while following Unibo’s instructions for measuring vital signs. The experiment was completed when the measurement results were sent by e-mail to the designated terminal.

In the second section, the participant activated Unibo’s contents using the trigger mode and experienced the activated content. The participant’s impression was used for the subjective evaluation, and the effectiveness of the trigger mode was evaluated objectively by reviewing the recorded videos of this section. Here, the investigator first verbally explained how to use the trigger mode. The participant then activated two types of content using the trigger mode: songs and quizzes. This section ended when both types of content had been executed.

In the third section, the participant and Unibo engaged in a free conversation with the trigger mode turned off. The participant’s impression of the free conversation was used for the subjective evaluation, and the recorded videos of this section were used for the objective evaluation of its effectiveness. During the experiment, the investigator encouraged the participant to first greet Unibo or ask it a question and afterward to talk with it freely. This section ended after three minutes.

After completing the three sections, the participant subjectively evaluated the system by filling out a questionnaire containing prepared questions and free-form fields. The questionnaire had two parts: one collected the attributes of the participant, and the other collected the subjective evaluation of the system. The participants were also instructed to complete a Visual Analogue Scale (VAS) questionnaire before and after the experiment to assess changes in mental state. The scale ranged from 0 to 10 (“0 = dark feeling” to “10 = bright feeling”), and participants marked their current feeling on the line (see Fig. 7).

Fig. 7. Example of VAS.

The experimental sessions were recorded with a video camera placed behind the participants to protect their privacy. The speech recognition performance in the trigger and non-trigger modes was assessed by reviewing the recorded videos: for each mode, the number of times communication was not established between Unibo and the participant was counted and divided by the total number of conversations in that mode (“error rate”).
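The error rate defined above can be written as a short Python sketch. The function names and the example counts are hypothetical and are not taken from the experiment; the use of the sample standard deviation is also an assumption, since the text does not state which form of SD was reported.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def error_rate(failed: int, total: int) -> float:
    """Error rate in percent: failed exchanges divided by all exchanges."""
    if total <= 0:
        raise ValueError("total number of conversations must be positive")
    if not 0 <= failed <= total:
        raise ValueError("failed count must lie between 0 and total")
    return 100.0 * failed / total


def summarize(rates: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of per-participant error rates."""
    return mean(rates), stdev(rates)


if __name__ == "__main__":
    # Hypothetical (failed, total) counts for three participants in one mode.
    counts = [(1, 10), (3, 12), (4, 8)]
    rates = [error_rate(failed, total) for failed, total in counts]
    avg, sd = summarize(rates)
    print(f"error rate: {avg:.1f}% +/- {sd:.1f}")
```

Computing the rate per participant first and then averaging keeps participants with many conversations from dominating the summary, which is one plausible reading of the reported mean ± SD values.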

This study was conducted with the approval of the University of Nagasaki General Research Ethics Committee with approval number 440.

3 Results

3.1 Participants’ Characteristics

The attributes of the participants are shown in Table 1. The participants included 7 males and 4 females. Their ages ranged from 65 to 88 years, with a mean ± SD of 71.1 ± 6.7 years. No participant had a disability in the upper limbs. All participants had experience with mobile devices, including tablets, but none had experience with communication robots.

Table 1. Characteristics of participants.

3.2 Evaluation of the System

The results of evaluating the system operation are shown in Table 2. The impression of the system was extremely good for 4 participants (36.4%) and moderately good for 7 (63.6%). Operation during the body temperature measurement was extremely easy for 7 participants (63.6%), moderately easy for 2 (18.2%), and moderately difficult for 2 (18.2%). Operation during the blood pressure measurement was extremely easy for 9 participants (81.8%) and moderately easy for 2 (18.2%). All 11 participants (100%) had no problem with the touch function of Unibo’s screen. The conversation with Unibo was extremely fun for 5 participants (45.5%) and moderately fun for 6 (54.5%). In response to the question “Do you think this system can reduce loneliness and anxiety?”, 5 participants (45.5%) answered “able” and 6 (54.5%) answered “moderately able”. In response to the question “Do you think this system can reduce loneliness and anxiety by contacting medical institutions?”, 5 participants (45.5%) answered “able” and 6 (54.5%) answered “moderately able”. When asked whether they would like to use the proposed system in the future, 10 participants (90.9%) said yes, whereas 1 (9.1%) said no.

The mental states before and after the experiment are shown in Table 3. The mean ± SD of the VAS values was 6.1 ± 1.9 cm before the experiment and 8.3 ± 1.0 cm after it; the mean ± SD rate of change was 28.1% ± 17.0%.

Table 4 presents the error rate in each mode. The mean ± SD error rate was 26.5% ± 26.8% in the trigger mode, 28.0% ± 14.9% in the non-trigger mode, and 27.4% ± 21.2% for both modes combined.

Table 2. The results of evaluating the system operation.
Table 3. The results of mental state by VAS.
Table 4. Confirmation of error rate.

4 Discussion

In this research, we developed and verified a system that supports the physical and mental health of older adults using a communication robot. The developed system can be considered friendly to older adults because all participants rated their impression of the system as moderately good or better.

The mean ± SD change in mental state before and after the experiment, measured by VAS, was 28.1% ± 17.0%. This indicates that the participants became more cheerful after experiencing the developed system. Hence, the proposed system, which uses Unibo’s verbal communication interface, appears to be well suited to older adults. Additionally, the questionnaire responses suggest that the system can alleviate the anxiety and loneliness of older adults by sending the measured vital signs to the medical institution. Another remarkable result was that all participants reported no problem when operating Unibo’s LCD by touch and found it easy to operate.

In the measurement of vital signs, 2 participants rated the operation as moderately difficult, which indicates room for improvement. This was because the timing of the thermometer’s Bluetooth communication did not match that of Unibo, so some participants had to repeat the measurement. In addition, it was sometimes difficult for participants to wrap the sphygmomanometer cuff around their upper arm by themselves. Hence, the measuring device and Unibo should be adjusted to ensure that the measurement is completed in one attempt, and the sphygmomanometer should be changed to a model that users can easily put on by themselves.

With the trigger mode, a clear improvement in speech recognition was observed for participants who understood the trigger mode specifications. However, no improvement in the speech recognition rate was observed for participants who did not understand the specifications of the trigger mode and the timing at which Unibo starts speech recognition. This is reflected in the large standard deviation of the error rate in the trigger mode (26.5% ± 26.8%). Further, the average error rate in the trigger mode was slightly lower than that in the free conversation. The error rate in the free conversation was 28.0% ± 14.9%, and its smaller SD suggests that the differences among participants were smaller than in the trigger mode. Hence, it can be said that the trigger mode improved the speech recognition rate depending on how well the participant understood the specifications. This considerable dependence on the participant’s understanding suggests that the trigger mode needs to be improved to make its specifications easier to understand.

Several comments were written in the comment section, both positive and negative (see Table 5). The positive comments were about the robot’s cuteness and high level of perfection. Our research group has conducted research for older adults using tablets [11]. Through that research, we learned the lesson that “devices must be cute”. If a device is not cute, it will be difficult to get it accepted by older adults. The negative comments were related to Unibo’s conversational ability. As noted above, the participants’ comments also indicate that Unibo’s conversational skills need to be improved.

Table 5. Comments from participants.

5 Conclusions and Future Works

In this study, we developed and verified the usability of a system that supports the physical condition and mental health of older adults aged 65 and above by using a communication robot named Unibo. The verification results demonstrated that the system has a human interface that older adults consider to be friendly, and it was found that older adults can speak with Unibo in the same way that young people use the system. In addition, the questionnaire responses suggested that the system has the same effect of reducing anxiety and loneliness as the support provided by medical institutions. Regarding the speech recognition rate, the results differed considerably depending on how well the participant understood the system specifications, making it necessary to improve the usability of the interface.