Presentation

Investigating Semantic Integration of Gestures With Nouns in Neurodiverse Individuals via Event Related Potentials
Description
INTRODUCTION
Language interventions improve communication outcomes in children with Autism Spectrum Disorder (ASD, ~3% of children) and Developmental Language Disorder (DLD, ~7%), enabling better academic and social outcomes. In addition, Attention Deficit (Hyperactivity) Disorder (ADHD, ~11%) overlaps with ASD (10-70%, “AuDHD”) and DLD (~50%), compounding these challenges.

However, access to crucial language services is limited in rural settings, where high rates of underdiagnosis lead to undertreatment. Thus, rural-inclusive delivery of language supports is critical.

One viable and accessible support framework is multimodal communication, that is, using complementary hand gestures with speech to enhance learning and support verbal communication. Caregiver and child hand gestures improve comprehension of content words, like nouns and verbs, in Typically Developing (TD) children and those with DLD/ASD, so gestures are apt for at-home intervention by caregivers.

A key question is whether children’s brains process hand gestures as communicative material. For individuals whose brains do process hand gestures this way, a gesture-based intervention may support communication, but for individuals whose brains do not respond to hand gestures, such an intervention may be unwarranted. Thus, fundamental research recording brain activity during communication is needed before a possible intervention is developed and implemented.

Electroencephalography (EEG), the noninvasive recording of electrical brain activity over time, can be used to measure event-related potentials (ERPs), which are brain responses time-locked to the onset of a stimulus. The N400 waveform (a negative voltage about 400 milliseconds after the onset of the stimulus) indicates semantic processing and integration of matching vs. mismatching pictures and speech. ERP studies show that TD children and adults successfully process iconic gestures (gestures that look like the concept that is verbally expressed) semantically with speech. So far, no studies have used ERPs to examine gesture-speech comprehension in individuals with ASD or DLD.

In this paper, we leverage EEG and ERPs to investigate the neural processing of multimodal comprehension of speech with and without gestures during a passive-viewing task in a neurodiverse population (i.e., including both typical and neurodivergent individuals).

METHODS
Participants
As part of the RAISE study (Rural Autistic Individuals – Supporting Expression), 13 hearing native speakers of English participated: three TD 4-year-olds (“FOURs”); three older children (10-14 years old) with ASD and/or ADHD (the neurodivergent, or ND, group); four TD older children (9-12 years old); and two adults (32-44 years old), one TD and one with ASD. All participants had average or above-average cognitive and language skills according to standardized tests of nonverbal IQ (KBIT) and vocabulary comprehension (PPVT).

Passive viewing vocabulary-gesture task and procedure
We used 40 nouns from the Peabody Picture Vocabulary Test (PPVT), ranging from simpler nouns (“car”) to more complex nouns (“amphibian”). In 20 matching scenarios, a noun is paired with a matching cartoon animation and/or a matching iconic gesture (one that looks like the concept of the noun). In 20 mismatching scenarios, different nouns are paired with animations and gestures that do not match them. We adapted signs for these nouns from American Sign Language, both to standardize the gestures and to make the findings more generalizable.

We employed a pretest–test approach to measure the effect of gestures on the comprehension of nouns, presenting the same 40 nouns and 31 animations and/or gestures twice. For consistency, each Time includes a gesture, an animation, and a noun, and the word always comes last.

Time 0 (Animation): a neutral grooming gesture followed by a matching or mismatching animation for the noun, presented with the spoken word (the standard setup in ERP research).
Time 1 (Gesture): a neutral animation followed by a matching or mismatching communicative gesture, presented with the spoken word.

We created the passive-viewing task with a custom-developed video processing suite that used computer vision techniques to segment videos at precise events, such as the onset of gestures or the onset of lip movements in speech, with consistent time buffers added for each occurrence. This ensured millisecond-level alignment of video stimuli, addressing critical timing challenges and enhancing the reliability of ERPs.

During testing, participants sat in front of a computer screen while we recorded their EEG/ERP.

EEG procedure
We recorded continuous EEG, with ERPs time-locked to the onset of auditory words, visual gestures, and animations. Lexical and semantic processing and integration is indicated by the N400, which is evident when a mismatch in the stimuli leads to semantic incongruency, e.g., an animation or gesture of “bouquet” with the word “cat”. We used a Neuroelectrics Enobio gel-electrode EEG system with 20 channels, mounted according to the international 10-20 system, sampled at 500 Hz, and integrated into iMotions software. We preserved each participant’s continuous EEG timeline while cleaning the data through bandpass filtering, automatic detection and interpolation of bad channels, ICA removal of artifact components, and reconstruction of a full, time-aligned channel dataset.

EEG analysis
Data were epoched from –200 to 800 ms around speech onset, baseline corrected (–200 to 0 ms), assigned to one of four conditions (T0_match, T0_mismatch, T1_match, T1_mismatch), and averaged by word, then by condition, and then by participant. We computed these averages from six centro-parietal electrodes (C3, Cz, C4, P3, Pz, P4) for the N400 window.
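
The epoching and baseline-correction steps can be illustrated with a minimal Python sketch. It is not the study's pipeline: the function names and the toy single-channel signal are hypothetical, and only the 500 Hz sampling rate and the –200 to 800 ms and –200 to 0 ms windows come from the text.

```python
from statistics import fmean

SFREQ_HZ = 500                 # sampling rate stated in the text
TMIN_MS, TMAX_MS = -200, 800   # epoch window stated in the text


def ms_to_samples(ms, sfreq=SFREQ_HZ):
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(ms * sfreq / 1000))


def extract_epoch(signal, onset_sample):
    """Cut one epoch around a stimulus onset and subtract the
    mean of the pre-stimulus interval (-200 to 0 ms)."""
    start = onset_sample + ms_to_samples(TMIN_MS)
    stop = onset_sample + ms_to_samples(TMAX_MS)
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recording")
    epoch = signal[start:stop]
    n_baseline = ms_to_samples(-TMIN_MS)   # samples before onset
    baseline = fmean(epoch[:n_baseline])
    return [v - baseline for v in epoch]


def average_epochs(epochs):
    """Average equal-length epochs sample by sample."""
    return [fmean(samples) for samples in zip(*epochs)]


# Toy single-channel signal (hypothetical): 2 µV before the word, 5 µV after.
toy_signal = [2.0] * 400 + [5.0] * 600
toy_epoch = extract_epoch(toy_signal, onset_sample=400)
toy_average = average_epochs([toy_epoch, toy_epoch])
```

In this sketch the epoch holds 500 samples (100 before onset and 400 after), and the same averaging helper could be applied in turn across words, conditions, and participants.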

RESULTS
Global Field Power (GFP) measures the spatial standard deviation across all channels at each moment, capturing how strongly the brain’s response is synchronized across the scalp, with higher values indicating more pronounced neural activity.
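
As a rough illustration of this definition, the sketch below computes GFP as the standard deviation across channels at each time point. The function names and the toy values are hypothetical, and the use of the population standard deviation (dividing by the number of channels) is an assumption, since the text does not specify the exact formula or the bounds of the N400 window.

```python
from statistics import fmean, pstdev


def global_field_power(data):
    """Return GFP over time.

    data: list of channels, each a list of voltages (µV) over time.
    GFP at each time point is the standard deviation across channels
    (population form, an assumption of this sketch).
    """
    return [pstdev(sample) for sample in zip(*data)]


def mean_in_window(values, start, stop):
    """Average a time series over the sample range [start, stop)."""
    return fmean(values[start:stop])


# Toy data (hypothetical): three channels, two time points.
toy_data = [
    [1.0, 0.0],
    [2.0, 0.0],
    [3.0, 0.0],
]
toy_gfp = global_field_power(toy_data)
toy_mean = mean_in_window(toy_gfp, 0, 2)
```

When every channel has the same voltage, as at the second time point here, GFP is zero; it grows as voltages diverge across the scalp.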

In the N400 time window, GFP was generally highest for FOURs and older TD, intermediate for Adults, and lowest for ND. For animation trials (Time 0), FOURs and Adults show clear increases from match to mismatch (FOURs ≈ 5.0 to 8.0 µV, Adults ≈ 3.1 to 7.8 µV). ND also increase from about 3.1 to 5.0 µV, but remain lower overall. Older TD are the outlier here: their GFP is already high on T0 match (≈ 6.4 µV) and is slightly lower on mismatch (≈ 5.8 µV), so they do not show the same T0 GFP increase in the mismatch condition seen in the other groups.

For gesture trials (Time 1), FOURs and Adults again show higher GFP on mismatch than match (FOURs ≈ 6.7 to 7.4 µV, Adults ≈ 5.1 to 6.1 µV), and older TD show a smaller but similar increase (≈ 6.2 to 6.6 µV). In contrast, ND stay flat in this window, with match and mismatch both around 3.2 µV. These results support the idea that FOURs, older TD, and Adults all show strong, sustained scalp level engagement for both animation and gesture mismatches, while ND participants show a modest GFP increase only for animation mismatches and little change for gesture mismatches.
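
The mismatch effects described above can be summarized as mismatch minus match differences. The short Python sketch below uses only the approximate GFP values reported in the text; the variable and function names are hypothetical, and the differences are descriptive, not a statistical test.

```python
# Approximate GFP values (µV) as reported in the text: (match, mismatch).
REPORTED_GFP = {
    ("T0", "FOURs"): (5.0, 8.0),
    ("T0", "Adults"): (3.1, 7.8),
    ("T0", "ND"): (3.1, 5.0),
    ("T0", "Older TD"): (6.4, 5.8),
    ("T1", "FOURs"): (6.7, 7.4),
    ("T1", "Adults"): (5.1, 6.1),
    ("T1", "Older TD"): (6.2, 6.6),
    ("T1", "ND"): (3.2, 3.2),
}


def mismatch_effects(gfp):
    """Return mismatch minus match per (time, group), rounded to 0.1 µV."""
    return {key: round(mismatch - match, 1)
            for key, (match, mismatch) in gfp.items()}


effects = mismatch_effects(REPORTED_GFP)
for (time, group), delta in sorted(effects.items()):
    print(f"{time} {group}: {delta:+.1f} µV")
```

A positive difference means higher GFP on mismatch trials. With these reported values, the only non-positive differences are older TD at Time 0 and ND at Time 1.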

Thus, it seems that younger and older TD children and adults process and semantically integrate both animations and gestures with spoken words, while ND participants process animations with words well but have reduced sensitivity to multimodal word-gesture mismatches. Importantly, while increased GFP is not inherently indicative of “better” processing, in the context of this paradigm, where mismatch trials are expected to evoke increased semantic effort, higher GFP during the N400 window supports the interpretation of more robust semantic integration mechanisms for multimodal input.

It is possible that ND participants semantically process gestures with words, but do so in a different way from TD participants, perhaps with some compensatory strategies.

In future analyses, we will work towards classifying participants into TD or ND based on brain EEG and ERP signatures.

CONCLUSION
This research informs initial steps towards a new evidence-based and practical gesture-based intervention, accessible without physical clinician presence, to support multimodal communication in neurodiverse individuals of all ages, thereby reducing rural access disparity, and promoting rural equity and public health.
Event Type
Poster Presentation
Time
Tuesday, March 24, 4:45pm - 6:15pm EDT
Location
Rhinelander Gallery
Tracks
Patient Safety Research and Initiatives