This script documents the analysis of ENGS production data and MALD corpus data with measures of Linear Discriminative Learning (LDL; e.g. Baayen et al., 2018). This is a work in progress.

The idea of this LDL implementation is the following:
Networks trained on real word data and their transformation matrices are used to create semantic matrices for pseudowords, following the implementation by Chuang et al. (2020). The underlying assumption is that speakers who face pseudowords do so with their knowledge on previously encountered words, i.e. their lexicon. Thus, pseudowords are not semantically empty shells but contain some sort of meaning.

The following script includes a reproduction of most steps found in the two previous scripts, here and here. Newly added is the updated pseudoword S matrix creation as well as a more thorough modelling procedure. [to be continued]

1 Data

1.1 MALD data

Data on real words is taken from the Massive Auditory Lexical Decision (MALD) Database (Tucker et al., 2018). It is a subset of the complete MALD database, containing 8362 words of which 2120 words contain one of 36 different affixes.

## load real words (MALD)
mald.data <- read.csv("data666/mald.subset.dfr.csv")[2:5]

dim(mald.data)
[1] 8362    4

1.2 ENGS data

Pseudoword data is taken from the ENGS production experiment (Schmitz et al., 2020). In total, 48 pseudowords are contained within this data set, with half of them representing monomorphemic words (e.g. singular bloufs) and half of them representing plural wordforms (e.g. plural bloufs). As a number of pseudowords showed more than one consistent pronunciation, these are contained more than twice, e.g. prups.

## load pseudowords (ENGS)
engs.data <- read.csv("data666/final_S_data_LDL_types.csv")

dim(engs.data)
[1] 78  4

1.3 COMB data

The combined data set contains both data on real and on nonce words as introduced in 1.1 and 1.2. In total, 8440 words and 36 different affixes are part of this data set.

## create a combined data set
comb.data <- rbind(mald.data, engs.data)

## sort combined data set alphabetically
comb.data <- comb.data[order(comb.data$Word),]

dim(comb.data)
[1] 8440    4

2 A matrix

We can create an A matrix for real words using the semantic vectors constructed of the TASA corpus. The A matrix consists of semantic vectors (columns) for all words in orthographic form (rows).

X meat mechanic mechanical mechanism medal
eat 0.0118417 -0.0001872 0.0002885 -0.0002331 1.98e-05
echo 0.0005983 -0.0000384 -0.0000922 -0.0000899 -2.05e-05
eclipse 0.0000350 -0.0003962 -0.0000451 -0.0000278 -9.70e-06
## load TASA vectors
load("data_simon/wL2Ld.rda")

## find all unique MALD (= real word) lexomes
mald.subset.allLexomes = unique(c(mald.data$Word, mald.data$Affix))
mald.subset.allLexomes = mald.subset.allLexomes[!is.na(mald.subset.allLexomes)]

## find vectors for all unique lexomes > this then is the A matrix
mald.Amat = wL2Ld[mald.subset.allLexomes, ]

3 C matrices

C matrices contain all triphones of a data set (columns) which are coded binary for all words in phonological form (rows).

X X..b X.b. b.n X.nd nd. d.n X.nm nm. m.n X.nt nt. b.S X.S. X..b.1 X.bI
@b{ 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
@b{S 1 1 0 0 0 0 0 0 0 0 0 1 1 0 0
{bI 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
{ 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
{bdIk1t 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0

3.1 real words

The C matrix for real words is computed based on the MALD data set and \(Word+Affix\), i.e. word forms and affixes are considered.

mald.cues = make_cue_matrix(data = mald.data, formula = ~ Word + Affix, grams = 3, wordform = "DISC")

dim(mald.cues$matrices$C)
# 8328 7587

3.2 pseudowords

The C matrix for nonce words is computed based on the ENGS data set and \(Word+Affix\), i.e. word forms and affixes are considered.

engs.cues = make_cue_matrix(data = engs.data, formula = ~ Word + Affix, grams = 3, wordform = "DISC")

dim(engs.cues$matrices$C)
# 48 63

3.3 combined

The C matrix for real and nonce words is computed based on the combined data set and \(Word+Affix\), i.e. word forms and affixes are considered.

comb.cues = make_cue_matrix(data = comb.data, formula = ~ Word + Affix, grams = 3, wordform = "DISC")

dim(comb.cues$matrices$C)
# 8376 7609

4 S matrices

S matrices contain all semantic vectors of a data set (columns) and their pertinent values for all words in phonological form (rows).

X meat mechanic mechanical mechanism medal
it 0.0118417 -0.0001872 0.0002885 -0.0002331 1.98e-05
Ek5 0.0005983 -0.0000384 -0.0000922 -0.0000899 -2.05e-05
IklIps 0.0000350 -0.0003962 -0.0000451 -0.0000278 -9.70e-06

4.1 real words

The S matrix for real words is computed based on the MALD data set, the MALD A matrix, and \(Word+Affix\), i.e. word forms and affixes are considered.

mald.Smat = make_S_matrix(data = mald.data, formula = ~ Word + Affix, inputA = mald.Amat, grams = 3, wordform = "DISC")

dim(mald.Smat)
# 8328 5487

4.2 pseudowords

We can create a semantic matrix for pseudoword based on the semantics of real words solving the following equation

\[engs.Smat = engs.C * F\] with engs.Smat being the semantic matrix for pseudowords, engs.C being the C matrix contained within the cue matrix of pseudowords, and F being the transformation matrix for mapping real word cues onto real word semantics.

To obtain this real word transformation matrix F (Hp in Chuang et al., 2020) we first need to learn comprehension of real words.

mald.comp = learn_comprehension(cue_obj = mald.cues, S = mald.Smat)

Then, we can extract F from mald.comp.

F = mald.comp$F

dim(F) 
# 7587 5487

The following equation illustrates the use of the F transformation matrix in solving \(mald.C * F = estimated.mald.Smat\):

\[ \left(\begin{array}{cc} A & B & C\\ D & E & F\\ G & H & I \end{array}\right) \left(\begin{array}{cc} K & L & M\\ N & O & P\\ Q & R & S \end{array}\right) = \left(\begin{array}{cc} AK+BN+CQ & AL+BO+CR & AM+BP+CS\\ DK+EN+FQ & DL+EO+FR & DM+EP+FS\\ GK+HN+IQ & GL+HN+IR & GM+HP+IS \end{array}\right) \] That is, the F matrix corresponds to the order of columns and rows found in its original computation (i.e. in learn_comprehension). Thus, one must consider this order when using the F matrix in further calculations. Otherwise, wrong rows will be multiplied with wrong columns, as is shown in the following examples.

\[ \left(\begin{array}{cc} A & \alpha & B\\ D & \beta & E\\ G & \gamma & H \end{array}\right) \left(\begin{array}{cc} K & L & M\\ N & O & P\\ Q & R & S \end{array}\right) = \left(\begin{array}{cc} AK+\alpha N+BQ & AL+\alpha O+BR & AM+\alpha P+BS\\ DK+\beta N+EQ & DL+\beta O+ER & DM+\beta P+ES\\ GK+\gamma N+HQ & GL+\gamma N+HR & GM+\gamma P+HS \end{array}\right) \] Let’s assume the \((\alpha, \beta, \gamma)\) column represents the triphone pru and the \((B, E, H)\) column represents the triphone pli. In the above example, \((\alpha, \beta, \gamma)\) is multiplied with the \((N, O, P)\) row of the F matrix. However, in the real word C matrix, the second column is \((B, E, H)\). Thus, \((B, E, H)\) should be multiplied with \((N, O, P)\) instead. This is shown below:

\[ \left(\begin{array}{cc} A & B & \alpha \\ D & E & \beta \\ G & H & \gamma \end{array}\right) \left(\begin{array}{cc} K & L & M\\ N & O & P\\ Q & R & S \end{array}\right) = \left(\begin{array}{cc} AK+BN+\alpha Q & AL+BO+\alpha R & AM+BP+\alpha S\\ DK+EN+\beta Q & DL+EO+\beta R & DM+EP+\beta S\\ GK+HN+\gamma Q & GL+HN+\gamma R & GM+HP+\gamma S \end{array}\right) \]

Addtionally, for matrix multiplication, the number of columns in the first matrix must be equal to the number of rows in the second matrix (e.g. Nykamp, 2020).

Thus, we must make sure to use a pseudoword C matrix with the same order of columns as the real word C matrix. This, in turn, also expands the pseudoword C matrix, i.e. to entertain all column names of the real word C matrix we must have an identical number of columns.

Using two functions of the LDLConvFunctions package, this is easily done. First, find which column in the real word C matrix corresponds to which column in the pseudoword C matrix:

found_triphones <- find_triphones(pseudo_C_matrix = engs.cues, real_C_matrix = mald.cues)

Then, use this information to create a new pseudoword C matrix which a) has the same number of columns as the real word C matrix, and b) the same order of columns (i.e. triphones) as the real word C matrix:

pseudo.C.new <- reorder_pseudo_C_matrix(pseudo_C_matrix = engs.cues, real_C_matrix = mald.cues, found_triphones = found_triphones)

Now, we can solve the aforementioned equation \(engs.Smat = engs.C * F\).

engs.Smat = pseudo.C.new %*% F

dim(engs.Smat)
# 78 5487

# change rownames to pseudoword rownames (instead of numbers)
rownames(engs.Smat) <- engs.data$DISC

In this S matrix, 48 pseudowords (i.e. 78 pronunciations) and their semantic vectors are contained. However, all of our pseudoword pairs are homophones, thus, they show identical semantic vectors even though half of them contain a suffix, e.g. the semantic vectors for bloufs (non-morphemic S) and bloufs (plural S) are identical.

X fierce fifteen fifteenth fifth fifty
bl6fs -0.0001453 0.0002496 0.0002782 0.0006957 -0.0001303
bl6fs -0.0001453 0.0002496 0.0002782 0.0006957 -0.0001303
bl6ks 0.0000021 -0.0006015 0.0000090 -0.0012144 0.0003275
bl6ks 0.0000021 -0.0006015 0.0000090 -0.0012144 0.0003275

We “fix” this by adding the “PL” semantic vector values to the semantic vectors of plural pseudowords (idea by Yu-Ying).

Thus, we first need to extract the “PL” semantic vector contained in the general A matrix.

PL.vec <- mald.Amat[rownames(mald.Amat)=="PL"]

length(PL.vec)
# 5487

Taking a closer look at the pseudoword S matrix, we find that every second row corresponds to a plural pseudoword, i.e. this is where we add the PL vector.

engs.pl = engs.Smat[seq(2, nrow(engs.Smat), 2), ]

Taking this plural subset, we add the “PL” semantic vector to all rows.

engs.pl.added <- engs.pl + rep(PL.vec, each = nrow(engs.pl))

We then recombine the original monomorphemic S matrix and the modified plural S matrix, and recreate the original row order.

engs.nm = engs.Smat[seq(1, nrow(engs.Smat), 2), ]

engs.Smat <- rbind(engs.nm, engs.pl.added)

sort.rows <- row.names(engs.Smat.old)
engs.Smat <- engs.Smat[order(match(rownames(engs.Smat), sort.rows)), , drop = FALSE]

Now, homophonous pseudowords have differing semantic vectors.

X fierce fifteen fifteenth fifth fifty
bl6fs -0.0001453 0.0002496 0.0002782 0.0006957 -0.0001303
bl6fs -0.0000796 0.0001372 0.0003686 0.0003187 -0.0002868
bl6ks 0.0000021 -0.0006015 0.0000090 -0.0012144 0.0003275
bl6ks 0.0000678 -0.0007139 0.0000994 -0.0015913 0.0001709

4.3 combined

Finally, we can create one combined S matrix for real and nonce words.

comb.Smat <- rbind(mald.Smat, engs.Smat)

However, LDL commands in R are sensitive to row order. Thus, we need to recreate the original row order as found in the combined C matrix.

all(rownames(comb.cues$matrices$C) == rownames(comb.Smat))
# FALSE

comb.Smat2 <- comb.Smat[rownames(comb.cues$matrices$C),,drop=FALSE]

all(rownames(comb.cues$matrices$C) == rownames(comb.Smat2))
# TRUE

5 Comprehension

Using the combined C and S matrices, we can now learn and evaluate comprehension of real and nonce words in one go, i.e. as if they are encountered by a person with their pertinent lexicon.

First, comprehension is learned.

comb.comp = learn_comprehension(cue_obj = comb.cues, S = comb.Smat2)

Second, the accuracy of this comprehension is evaluated. That is, how good can form (C matrix) be mapped onto meaning (S matrix).

comb.comp_acc = accuracy_comprehension(m = comb.comp, data = comb.data, wordform = "DISC", show_rank = TRUE, neighborhood_density = TRUE)

comb.comp_acc$acc # 0.7440758

We find that comprehension accuracy is about \(74\%\).

6 Production

Using the combined C and S matrices as well as the comprehension results, we now are able to learn production and evaluate its accuracy.

First, production is learned.

comb.prod = learn_production(cue_obj = comb.cues, S = comb.Smat2, comp = comb.comp)

Second, the accuracy of this production is evaluated. That is, how good can meaning (S matrix) be mapped onto form (C matrix).

comb.prod_acc = accuracy_production(m = comb.prod, data = comb.data, wordform = "DISC", full_results = TRUE, return_triphone_supports = TRUE)

comb.prod_acc$acc # 0.9729858

We find that production accuracy is about \(97\%\).

7 Measures

7.1 Comprehension

Extracting and saving comprehension measures is straightforward.

comb.comp_measures = comprehension_measures(comp_acc = comb.comp_acc, Shat = comb.comp$Shat, S = comb.Smat2, A = mald.Amat, affixFunctions = comb.data$Affix)
write.csv(comb.comp_measures, "comb.comp_measures.csv")
  • recognition:
    whether the correct form was recognized
  • cor_max:
    the correlation between the pertinent vector in Shat and the semantic vector of its predicted form in S
  • cor_target:
    the correlation between the pertinent vector in Shat and the semantic vector of its targeted form in S
  • l1norm:
    city block distance / Manhattan distance of Shat
  • l2norm:
    euclidian (direct) distance / length of Shat
  • rank:
    the rank of correlation between s-hat and the semantic vector of its targeted form
  • density:
    the mean correlation of s-hat with the semantic vectors of its top 8 neighbors
  • cor_affix:
    the correlation of s-hat and the semantic vector of its affix function; more or less a measure of transparency

7.2 Production

Extracting and saving production measures runs into problems. It appears that there is 1 word for which comb.prod_acc does not contain all necessary information. Why? I have no clue.

I worked around this problem by removing the word from full, i.e. the list in which all necessary info is contained within comb.prod_acc.

full <- comb.prod_acc[["full"]]

Using the summary() function, the case with missing info is identified. I call this newly created data frame nineortwelve as complete entries consist of 12 entries per word, while incomplete entries consist of only 9 entries per word.

nineortwelve <- as.data.frame(summary(full))

Checking this data frame, the following entry can be found with only 9 individual entries instead of 12. Thus, it is removed from full2.

full2 <- full

full2[[3725]] <- NULL

Fun fact: entry 3725 belongs to the word inauguration.

This reduced full2 then replaces the original full in a copy of comb.prod_acc.

comb.prod_acc2 <- comb.prod_acc

comb.prod_acc2$full <- full2

Only then production measures can be obtained.

comb.prod_measures = production_measures(prod_acc = comb.prod_acc2, S = comb.Smat2, prod = comb.prod)
write.csv(comb.prod_measures, "comb.prod_measures.csv")
  • correlations:
    the correlation of the predicted path with the targeted semantic vector
  • path_counts:
    the number of paths detected
  • path_sum:
    the summed support for the predicted path, could be read as the uncertainty about how to pronounce the word due to competing activation of articulatory motor plans
  • path_entropies:
    Shannon entropy calculated over the path supports
  • lwlr:
    Simon: “The length-weakest link ratios of the predicted path (also sometimes equal to word length). When the length of a word’s path, as measured in number of nodes, becomes higher, while the weakest link between nodes stays the same, uncertainty becomes greater. If triphone support is high, i.e. when all the strengths of the links are almost equal to one, the value will be almost equal to the number of segments in the word. This may not be the case in other models, where the triphone support may be lower.”

8 Duration Data

The data set used in the remainder of this script contains the following parts:

  • ENGS production data (Schmitz et al., 2020)
  • LDL measures as extracted by the WpmWithLdl package
  • LDL measures as extracted by the LDLConvFunctions package

The data set contains 52 columns and 653 rows, i.e. 653 observations of 52 variables:

9 WpmWithLdl measures

WpmWithLdl includes the following measures via the comprehension_measures and production_measures functions: click here.

Without discussing the underlying implications of the individual measures, let’s see whether some of them are correlated. If so, we need to take some measures before including them in any models.

9.1 Correlations

Correlations of all WpmWithLdl measures:

pairscor.fnc(data[,c("cor_max", "cor_target", "l1norm", "l2norm", "density", "cor_affix",
                     "correlations", "path_counts", "path_sum", "path_entropies", "lwlr")])

Note: rank is not considered as a predictive variable as its value is \(1\) for all observations.

We find the following highly correlated (\(\rho\ge0.5\)) variable combinations:

  • cor_max vs. cor_target, \(\rho=1\)
  • cor_max vs. correlations, \(\rho=1\)
  • cor_target vs. correlations, \(\rho=1\)
  • l1norm vs. l2norm, \(\rho=0.95\)
  • density vs. cor_affix, \(\rho=0.84\)
  • path_counts vs. path_entropies, \(\rho=0.95\)
  • path_sum vs. lwlr, \(\rho=-0.76\)

Thus, we need to take some measures before we run into problems of colinearity later on.

9.2 PCA

To avoid colinearity issues when modelling, we combine all highly correlated variables in a principal component analysis.

PCA Computation

First, we extract the highly correlated variables and merge them to form a new data frame.

pca.data1 <- data[colnames(data)=="cor_max"]
pca.data2 <- data[colnames(data)=="cor_target"]
pca.data3 <- data[colnames(data)=="correlations"]
pca.data4 <- data[colnames(data)=="l1norm"]
pca.data5 <- data[colnames(data)=="l2norm"]
pca.data6 <- data[colnames(data)=="density"]
pca.data7 <- data[colnames(data)=="cor_affix"] # this variable is dropped as it only contains data on PL cases
pca.data8 <- data[colnames(data)=="path_counts"]
pca.data9 <- data[colnames(data)=="path_entropies"]
pca.data10 <- data[colnames(data)=="path_sum"]
pca.data11 <- data[colnames(data)=="lwlr"]

pca.data <- cbind(pca.data1,
                  pca.data2,
                  pca.data3,
                  pca.data4,
                  pca.data5,
                  pca.data6,
                  #pca.data7,
                  pca.data8,
                  pca.data9,
                  pca.data10,
                  pca.data11)

Then, we compute a principal component analysis using the princomp function:

pc <- princomp(pca.data, cor=TRUE, score=TRUE)

In order to decide which components to keep, we follow three criteria:

  1. The Eigenvalue-criterion: Any component that displays an eigenvalue greater than 1.00 is accounting for a greater amount of variance than had been contributed by one variable. Such a component is therefore accounting for a meaningful amount of variance, and is worthy of being retained.
  2. The scree-criterion: Plot the eigenvalues associated with each component and look for a “break” between the components with relatively large eigenvalues and those with small eigenvalues. The components that appear before the break are assumed to be meaningful and are retained (Cattell, 1966).
  3. The proportion-of-variance-criterion: Retain enough components so that the cumulative percent of variance accounted for is equal to some minimal value, i.e. 80 percent.
# load package to extract Eigenvalues
library("factoextra")

# extract Eigenvalues
get_eigenvalue(pc)
         eigenvalue variance.percent cumulative.variance.percent
Dim.1  3.155143e+00     3.155143e+01                    31.55143
Dim.2  2.782508e+00     2.782508e+01                    59.37650
Dim.3  2.002288e+00     2.002288e+01                    79.39938
Dim.4  1.111486e+00     1.111486e+01                    90.51424
Dim.5  6.368246e-01     6.368246e+00                    96.88248
Dim.6  2.192438e-01     2.192438e+00                    99.07492
Dim.7  6.284442e-02     6.284442e-01                    99.70337
Dim.8  2.966341e-02     2.966341e-01                   100.00000
Dim.9  7.170136e-17     7.170136e-16                   100.00000
Dim.10 0.000000e+00     0.000000e+00                   100.00000

Taking all three criteria into consideration, we find that Components 1, 2, and 3 show an Eigenvalue of \(v\ge1\), are followed by a break in Eigenvalues with the Eigenvalue of Component 3 being \(v=2\) and the Eigenvalue of Component 4 being \(v=1.1\), and that the cumulative percentage of variance accounted for by all three components is \(σ^2=79.4\%\), i.e. close to our target level of \(σ^2=80\%\).

Thus, we add components 1, 2, and 3 to our data frame.

pcframe <- as.data.frame(pc$scores)
pcframe <- pcframe[1:3]

names(pcframe)[names(pcframe) == "Comp.1"] <- "WWL_Comp.1"
names(pcframe)[names(pcframe) == "Comp.2"] <- "WWL_Comp.2"
names(pcframe)[names(pcframe) == "Comp.3"] <- "WWL_Comp.3"

data <- cbind(data, pcframe)

PCA Loadings

Now we can take a closer look at the individual components and their loadings.

pc$loadings

Loadings:
               Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9
cor_max         0.462  0.339                                            0.349
cor_target      0.462  0.339                                            0.465
correlations    0.462  0.339                                           -0.814
l1norm          0.211 -0.179 -0.579  0.179 -0.247        -0.509  0.484       
l2norm          0.213 -0.170 -0.592        -0.242 -0.158  0.501 -0.483       
density               -0.246         0.682  0.665  0.120                     
path_counts     0.296 -0.413        -0.396  0.283        -0.505 -0.499       
path_entropies  0.279 -0.408        -0.465  0.190  0.131  0.474  0.510       
path_sum       -0.208  0.297 -0.377 -0.272  0.516 -0.609         0.119       
lwlr            0.229 -0.328  0.403  0.219 -0.224 -0.755                     
               Comp.10
cor_max         0.738 
cor_target     -0.671 
correlations          
l1norm                
l2norm                
density               
path_counts           
path_entropies        
path_sum              
lwlr                  

               Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8 Comp.9
SS loadings       1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0    1.0
Proportion Var    0.1    0.1    0.1    0.1    0.1    0.1    0.1    0.1    0.1
Cumulative Var    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
               Comp.10
SS loadings        1.0
Proportion Var     0.1
Cumulative Var     1.0

But what do these loadings conceptually mean?

Component 1

Component 1 is rather highly correlated with cor_max, cor_target, and correlations.

For cor_max, high values indicate a high correlation between the estimated semantic vector and the semantic vector of another word.

For cor_target, high values indicate a high correlation between the predicted and targeted semantic vector.

For correlations, high values indicate a high correlation of the predicted path [and its predicted semantics] with the targeted semantic vector.

Note: Where does correlations come from? All candidate forms of a word have their own predicted semantic vectors; the value of correlations is the correlation value of the chosen candidate’s semantic vector and the semantic vector that is to be realized in production.

Thus, for Component 1, high values indicate higher semantic certainty, while lower values indicate higher semantic uncertainty.

Component 2

Component 1 is rather highly negatively correlated with path_counts and path_entropies.

For path_counts, high values indicate a high number of paths detected.

For path_entropies, high values indicate a high Shannon entropy calculated over the path supports.

In sum, Component 2 reflects the certainty in production, i.e. high values indicate low certainty.

Component 3

Component 3 is rather highly negatively correlated with l1norm and l2norm.

For l1norm, higher values imply more strong links to many other lexomes.

For l2norm, higher values imply more strong links to many other lexomes.

In sum, Component 3 reflects semantic certainty, i.e. higher values indicate higher certainty.

9.3 GAMMs

Let’s see whether the PCA components are significant predictors for word-final /s/ duration in singular and plural pseudowords. We will use Generalized Additive Mixed Models (GAMMs) to model /s/ durations.

Modelling

Step 0

First, create a model with all relevant variables, including control and PCA variables, and speaker as random effect:

gam.WpmWithLdl.1 <- bam(sDurLog ~
                          Affix +
                          s(WWL_Comp.1) +
                          s(WWL_Comp.2) +
                          s(WWL_Comp.3) +
                          SylPerMin+
                          pauseBin+
                          folType+
                          preC+
                          monoMultilingual+
                          s(speaker, bs="re"),
                        data=data, method = "REML")

Checking the anova() function’s output, we now exclude non-significant variables in a stepwise fashion. After each potential exclusion, compareML() is used to check whether the exclusion is justified in terms of model fit.

anova(gam.WpmWithLdl.1)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + s(WWL_Comp.1) + s(WWL_Comp.2) + s(WWL_Comp.3) + 
    SylPerMin + pauseBin + folType + preC + monoMultilingual + 
    s(speaker, bs = "re")

Parametric Terms:
                 df       F  p-value
Affix             1  30.020 6.29e-08
SylPerMin         1 817.699  < 2e-16
pauseBin          1  39.214 7.22e-10
folType           4   4.915 0.000662
preC              3   1.658 0.174939
monoMultilingual  1   0.452 0.501848

Approximate significance of smooth terms:
                 edf Ref.df     F  p-value
s(WWL_Comp.1)  1.000  1.001 0.012 0.914079
s(WWL_Comp.2)  3.105  3.943 1.424 0.222711
s(WWL_Comp.3)  3.182  3.926 5.524 0.000316
s(speaker)    31.041 38.000 4.992  < 2e-16

Step 1

Exclusion of WWL_Comp.1.

gam.WpmWithLdl.2 <- bam(sDurLog ~
                          Affix +
                          #s(WWL_Comp.1) +
                          s(WWL_Comp.2) +
                          s(WWL_Comp.3) +
                          SylPerMin+
                          pauseBin+
                          folType+
                          preC+
                          monoMultilingual+
                          s(speaker, bs="re"),
                        data=data, method = "REML")

compareML(gam.WpmWithLdl.1, gam.WpmWithLdl.2)
gam.WpmWithLdl.1: sDurLog ~ Affix + s(WWL_Comp.1) + s(WWL_Comp.2) + s(WWL_Comp.3) + 
    SylPerMin + pauseBin + folType + preC + monoMultilingual + 
    s(speaker, bs = "re")

gam.WpmWithLdl.2: sDurLog ~ Affix + s(WWL_Comp.2) + s(WWL_Comp.3) + SylPerMin + 
    pauseBin + folType + preC + monoMultilingual + s(speaker, 
    bs = "re")

Model gam.WpmWithLdl.2 preferred: lower REML score (3.083), and lower df (2.000).
-----
             Model     Score Edf Difference    Df
1 gam.WpmWithLdl.1 -148.2247  19                 
2 gam.WpmWithLdl.2 -151.3075  17     -3.083 2.000

AIC difference: 4.61, model gam.WpmWithLdl.2 has lower AIC.
anova(gam.WpmWithLdl.2)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + s(WWL_Comp.2) + s(WWL_Comp.3) + SylPerMin + 
    pauseBin + folType + preC + monoMultilingual + s(speaker, 
    bs = "re")

Parametric Terms:
                 df       F  p-value
Affix             1  28.318 1.45e-07
SylPerMin         1 819.008  < 2e-16
pauseBin          1  38.521 1.00e-09
folType           4   4.645  0.00106
preC              3   1.439  0.23027
monoMultilingual  1   0.430  0.51207

Approximate significance of smooth terms:
                 edf Ref.df     F  p-value
s(WWL_Comp.2)  1.000  1.001 2.644 0.104453
s(WWL_Comp.3)  3.093  3.891 4.927 0.000684
s(speaker)    31.068 38.000 5.007  < 2e-16

Step 2

Exclusion of WWL_Comp.2.

gam.WpmWithLdl.3 <- bam(sDurLog ~
                          Affix +
                          #s(WWL_Comp.1) +
                          #s(WWL_Comp.2) +
                          s(WWL_Comp.3) +
                          SylPerMin+
                          pauseBin+
                          folType+
                          preC+
                          monoMultilingual+
                          s(speaker, bs="re"),
                        data=data, method = "REML")

compareML(gam.WpmWithLdl.2, gam.WpmWithLdl.3)
gam.WpmWithLdl.2: sDurLog ~ Affix + s(WWL_Comp.2) + s(WWL_Comp.3) + SylPerMin + 
    pauseBin + folType + preC + monoMultilingual + s(speaker, 
    bs = "re")

gam.WpmWithLdl.3: sDurLog ~ Affix + s(WWL_Comp.3) + SylPerMin + pauseBin + folType + 
    preC + monoMultilingual + s(speaker, bs = "re")

Model gam.WpmWithLdl.3 preferred: lower REML score (2.707), and lower df (2.000).
-----
             Model     Score Edf Difference    Df
1 gam.WpmWithLdl.2 -151.3075  17                 
2 gam.WpmWithLdl.3 -154.0144  15     -2.707 2.000

AIC difference: -1.38, model gam.WpmWithLdl.2 has lower AIC.
anova(gam.WpmWithLdl.3)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + s(WWL_Comp.3) + SylPerMin + pauseBin + folType + 
    preC + monoMultilingual + s(speaker, bs = "re")

Parametric Terms:
                 df       F  p-value
Affix             1  26.856 2.99e-07
SylPerMin         1 816.208  < 2e-16
pauseBin          1  38.939 8.21e-10
folType           4   4.850 0.000742
preC              3   1.368 0.251565
monoMultilingual  1   0.445 0.505057

Approximate significance of smooth terms:
                 edf Ref.df     F  p-value
s(WWL_Comp.3)  3.185  4.002 4.819 0.000775
s(speaker)    30.943 38.000 4.915  < 2e-16

Step 3

Exclusion of monoMultilingual.

gam.WpmWithLdl.4 <- bam(sDurLog ~
                          Affix +
                          #s(WWL_Comp.1) +
                          #s(WWL_Comp.2) +
                          s(WWL_Comp.3) +
                          SylPerMin+
                          pauseBin+
                          folType+
                          preC+
                          #monoMultilingual+
                          s(speaker, bs="re"),
                        data=data, method = "REML")

compareML(gam.WpmWithLdl.3, gam.WpmWithLdl.4)
gam.WpmWithLdl.3: sDurLog ~ Affix + s(WWL_Comp.3) + SylPerMin + pauseBin + folType + 
    preC + monoMultilingual + s(speaker, bs = "re")

gam.WpmWithLdl.4: sDurLog ~ Affix + s(WWL_Comp.3) + SylPerMin + pauseBin + folType + 
    preC + s(speaker, bs = "re")

Model gam.WpmWithLdl.4 preferred: lower REML score (2.044), and lower df (1.000).
-----
             Model     Score Edf Difference    Df
1 gam.WpmWithLdl.3 -154.0144  15                 
2 gam.WpmWithLdl.4 -156.0587  14     -2.044 1.000

AIC difference: 0.33, model gam.WpmWithLdl.4 has lower AIC.
anova(gam.WpmWithLdl.4)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + s(WWL_Comp.3) + SylPerMin + pauseBin + folType + 
    preC + s(speaker, bs = "re")

Parametric Terms:
          df       F  p-value
Affix      1  26.825 3.03e-07
SylPerMin  1 831.268  < 2e-16
pauseBin   1  38.776 8.88e-10
folType    4   4.838 0.000757
preC       3   1.353 0.256172

Approximate significance of smooth terms:
                 edf Ref.df     F  p-value
s(WWL_Comp.3)  3.191  4.009 4.829 0.000752
s(speaker)    31.659 39.000 4.842  < 2e-16

Step 4

Exclusion of preC.

gam.WpmWithLdl.5 <- bam(sDurLog ~
                          Affix +
                          #s(WWL_Comp.1) +
                          #s(WWL_Comp.2) +
                          s(WWL_Comp.3) +
                          SylPerMin+
                          pauseBin+
                          folType+
                          #preC+
                          #monoMultilingual+
                          s(speaker, bs="re"),
                        data=data, method = "REML")

compareML(gam.WpmWithLdl.4, gam.WpmWithLdl.5)
gam.WpmWithLdl.4: sDurLog ~ Affix + s(WWL_Comp.3) + SylPerMin + pauseBin + folType + 
    preC + s(speaker, bs = "re")

gam.WpmWithLdl.5: sDurLog ~ Affix + s(WWL_Comp.3) + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

Model gam.WpmWithLdl.5 preferred: lower REML score (7.281), and lower df (3.000).
-----
             Model     Score Edf Difference    Df
1 gam.WpmWithLdl.4 -156.0587  14                 
2 gam.WpmWithLdl.5 -163.3395  11     -7.281 3.000

AIC difference: 1.40, model gam.WpmWithLdl.5 has lower AIC.
anova(gam.WpmWithLdl.5)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + s(WWL_Comp.3) + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

Parametric Terms:
          df       F  p-value
Affix      1  26.473 3.61e-07
SylPerMin  1 850.839  < 2e-16
pauseBin   1  39.885 5.19e-10
folType    4   5.143 0.000442

Approximate significance of smooth terms:
                 edf Ref.df     F p-value
s(WWL_Comp.3)  3.151  3.964 4.636 0.00106
s(speaker)    31.592 39.000 4.785 < 2e-16

Step 5

Is the smooth for WWL_Comp.3 necessary?

gam.WpmWithLdl.6 <- bam(sDurLog ~
                          Affix +
                          #s(WWL_Comp.1) +
                          #s(WWL_Comp.2) +
                          WWL_Comp.3 +
                          SylPerMin+
                          pauseBin+
                          folType+
                          #preC+
                          #monoMultilingual+
                          s(speaker, bs="re"),
                        data=data, method = "REML")

compareML(gam.WpmWithLdl.5, gam.WpmWithLdl.6)
gam.WpmWithLdl.5: sDurLog ~ Affix + s(WWL_Comp.3) + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

gam.WpmWithLdl.6: sDurLog ~ Affix + WWL_Comp.3 + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

Chi-square test of REML scores
-----
             Model     Score Edf Difference    Df p.value Sig.
1 gam.WpmWithLdl.6 -159.2081  10                              
2 gam.WpmWithLdl.5 -163.3395  11      4.131 1.000   0.004  ** 

AIC difference: -9.43, model gam.WpmWithLdl.5 has lower AIC.

Yes, let’s keep the smooth.

Summary

summary.gam(gam.WpmWithLdl.5)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + s(WWL_Comp.3) + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

Parametric coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)   -0.8722959  0.0473099 -18.438  < 2e-16 ***
AffixPL       -0.0754263  0.0146596  -5.145 3.61e-07 ***
SylPerMin     -0.5032766  0.0172537 -29.169  < 2e-16 ***
pauseBinpause  0.1122957  0.0177810   6.315 5.19e-10 ***
folTypeF      -0.0185203  0.0550083  -0.337 0.736472    
folTypeN       0.0002509  0.0211516   0.012 0.990540    
folTypeP       0.0095421  0.0185635   0.514 0.607421    
folTypeV      -0.0672956  0.0188033  -3.579 0.000372 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                 edf Ref.df     F p-value    
s(WWL_Comp.3)  3.151  3.964 4.636 0.00106 ** 
s(speaker)    31.592 39.000 4.785 < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.804   Deviance explained = 81.7%
-REML = -163.34  Scale est. = 0.029512  n = 653
Summary:
    * Affix : factor; set to the value(s): PL. 
    * SylPerMin : numeric predictor; set to the value(s): 2.336339423. 
    * pauseBin : factor; set to the value(s): no_pause. 
    * folType : factor; set to the value(s): APP. 
    * WWL_Comp.3 : numeric predictor; with 30 values ranging from -3.513786 to 3.834930. 
    * speaker : factor; set to the value(s): 2. (Might be canceled as random effect, check below.) 
    * NOTE : The following random effects columns are canceled: s(speaker)
 

However, as we only have a few values to the far left, but a clearly linear slope to the right for a multitude of data points, these results may indicate a rather linear relationship. That is, higher WWL_Comp.3 values lead to longer /s/ durations, i.e. a higher semantic certainty leads to longer /s/ durations.

9.4 LMER

Let’s see whether a linear mixed effects regression model does a similarly good job as compared to the previous GAMMs model.

Modelling

First, create a model with all relevant variables, including control and PCA variables, and speaker as random effect:

lmer.1 <- lmer(sDurLog ~
                 Affix +
                 WWL_Comp.1 +
                 WWL_Comp.2 +
                 WWL_Comp.3 +
                 SylPerMin +
                 pauseBin +
                 folType +
                 preC +
                 (1 | monoMultilingual) +
                 (1 | speaker),
               data=data, REML=T)

anova(lmer.1)
Type III Analysis of Variance Table with Satterthwaite's method
            Sum Sq Mean Sq NumDF  DenDF  F value    Pr(>F)    
Affix       0.8772  0.8772     1 627.56  29.3486 8.626e-08 ***
WWL_Comp.1  0.0094  0.0094     1 611.01   0.3131  0.575983    
WWL_Comp.2  0.0724  0.0724     1 610.91   2.4219  0.120166    
WWL_Comp.3  0.2908  0.2908     1 607.50   9.7290  0.001900 ** 
SylPerMin  24.3084 24.3084     1 524.79 813.2841 < 2.2e-16 ***
pauseBin    1.1917  1.1917     1 634.01  39.8705 5.105e-10 ***
folType     0.5508  0.1377     4 610.60   4.6074  0.001135 ** 
preC        0.1804  0.0601     3 607.23   2.0114  0.111149    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the step() function, we exclude all non-significant predictors:

step(lmer.1)
Backward reduced random-effect table:

                       Eliminated npar  logLik     AIC    LRT Df Pr(>Chisq)    
<none>                              17 144.931 -255.86                         
(1 | monoMultilingual)          1   16 144.931 -257.86  0.000  1          1    
(1 | speaker)                   0   15  96.682 -163.37 96.498  1     <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Backward reduced fixed-effect table:
Degrees of freedom method: Satterthwaite 

           Eliminated  Sum Sq Mean Sq NumDF  DenDF  F value    Pr(>F)    
WWL_Comp.1          1  0.0094  0.0094     1 611.01   0.3131 0.5759828    
preC                2  0.1711  0.0570     3 608.37   1.9105 0.1266333    
WWL_Comp.2          3  0.0708  0.0708     1 616.28   2.3597 0.1250230    
Affix               0  0.8281  0.8281     1 633.30  27.5107 2.133e-07 ***
WWL_Comp.3          0  0.2137  0.2137     1 612.02   7.0983 0.0079192 ** 
SylPerMin           0 24.9558 24.9558     1 532.92 829.1163 < 2.2e-16 ***
pauseBin            0  1.2666  1.2666     1 639.57  42.0816 1.754e-10 ***
folType             0  0.6187  0.1547     4 615.86   5.1387 0.0004454 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Model found:
sDurLog ~ Affix + WWL_Comp.3 + SylPerMin + pauseBin + folType + 
    (1 | speaker)

This leaves us with the following final model:

lmer.2 <- lmer(sDurLog ~
                 Affix +
                 WWL_Comp.3 +
                 SylPerMin +
                 pauseBin +
                 folType +
                 (1 | speaker),
               data=data, REML=T)

anova(lmer.2)
Type III Analysis of Variance Table with Satterthwaite's method
            Sum Sq Mean Sq NumDF  DenDF  F value    Pr(>F)    
Affix       0.8281  0.8281     1 633.30  27.5107 2.133e-07 ***
WWL_Comp.3  0.2137  0.2137     1 612.02   7.0983 0.0079192 ** 
SylPerMin  24.9558 24.9558     1 532.92 829.1163 < 2.2e-16 ***
pauseBin    1.2666  1.2666     1 639.57  42.0816 1.754e-10 ***
folType     0.6187  0.1547     4 615.86   5.1387 0.0004454 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

LMER vs. GAMM

Let’s compare the results of both, LMER and GAMM, models:

summary(lmer.2)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: sDurLog ~ Affix + WWL_Comp.3 + SylPerMin + pauseBin + folType +  
    (1 | speaker)
   Data: data

REML criterion at convergence: -318.4

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.8969 -0.5968 -0.0009  0.6709  3.3861 

Random effects:
 Groups   Name        Variance Std.Dev.
 speaker  (Intercept) 0.00846  0.09198 
 Residual             0.03010  0.17349 
Number of obs: 653, groups:  speaker, 40

Fixed effects:
                Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)    -0.886456   0.047377 391.229475 -18.711  < 2e-16 ***
AffixPL        -0.077320   0.014741 633.298586  -5.245 2.13e-07 ***
WWL_Comp.3      0.013042   0.004895 612.024712   2.664 0.007919 ** 
SylPerMin      -0.497273   0.017270 532.920182 -28.794  < 2e-16 ***
pauseBinpause   0.116108   0.017898 639.573724   6.487 1.75e-10 ***
folTypeF       -0.022973   0.055467 619.793165  -0.414 0.678891    
folTypeN       -0.001099   0.021344 610.959988  -0.051 0.958951    
folTypeP        0.008526   0.018726 612.168568   0.455 0.649052    
folTypeV       -0.068649   0.018978 619.630061  -3.617 0.000322 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) AffxPL WWL_C. SylPrM psBnps flTypF flTypN flTypP
AffixPL     -0.086                                                 
WWL_Comp.3   0.094  0.077                                          
SylPerMin   -0.893 -0.080 -0.108                                   
pauseBinpas -0.449 -0.124 -0.109  0.418                            
folTypeF    -0.087  0.059 -0.021  0.020  0.002                     
folTypeN    -0.136  0.091  0.026 -0.035 -0.072  0.135              
folTypeP    -0.075  0.029 -0.012 -0.117 -0.093  0.153  0.409       
folTypeV     0.045 -0.016  0.021 -0.230 -0.227  0.139  0.413  0.489
summary.gam(gam.WpmWithLdl.5)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + s(WWL_Comp.3) + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

Parametric coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)   -0.8722959  0.0473099 -18.438  < 2e-16 ***
AffixPL       -0.0754263  0.0146596  -5.145 3.61e-07 ***
SylPerMin     -0.5032766  0.0172537 -29.169  < 2e-16 ***
pauseBinpause  0.1122957  0.0177810   6.315 5.19e-10 ***
folTypeF      -0.0185203  0.0550083  -0.337 0.736472    
folTypeN       0.0002509  0.0211516   0.012 0.990540    
folTypeP       0.0095421  0.0185635   0.514 0.607421    
folTypeV      -0.0672956  0.0188033  -3.579 0.000372 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                 edf Ref.df     F p-value    
s(WWL_Comp.3)  3.151  3.964 4.636 0.00106 ** 
s(speaker)    31.592 39.000 4.785 < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.804   Deviance explained = 81.7%
-REML = -163.34  Scale est. = 0.029512  n = 653

10 LDLConvFunctions measures

LDLConvFunctions includes the following measures as implemented in Chuang et al. (2020): click here.

Without discussing the underlying implications of the individual measures, let’s see whether some of them are correlated. If so, we need to take some measures before including them in any models.

10.1 Correlations

Correlations of all LDLConvFunctions measures:

pairscor.fnc(data[,c("ALC", "ALDC", "DRC", "EDNN", "NNC", "SCPP")])

We find the following highly correlated (\(\rho\ge0.5\)) variable combination:

  • EDNN vs. SCPP, \(\rho=1\)

Thus, we need to take some measures before we run into problems of colinearity later on.

10.2 PCA

To avoid colinearity issues when modelling, we combine all highly correlated variables in a principal component analysis.

PCA Computation

First, we extract the highly correlated variables and merge them to form a new data frame.

pca.data1 <- data[colnames(data)=="EDNN"]
pca.data2 <- data[colnames(data)=="SCPP"]


pca.data <- cbind(pca.data1,
                  pca.data2)

Then, we compute a principal component analysis using the princomp function:

pc <- princomp(pca.data, cor=TRUE, score=TRUE)

In order to decide which components to keep, we follow three criteria:

  1. The Eigenvalue-criterion: Any component that displays an eigenvalue greater than 1.00 is accounting for a greater amount of variance than had been contributed by one variable. Such a component is therefore accounting for a meaningful amount of variance, and is worthy of being retained.
  2. The scree-criterion: Plot the eigenvalues associated with each component and look for a “break” between the components with relatively large eigenvalues and those with small eigenvalues. The components that appear before the break are assumed to be meaningful and are retained (Cattell, 1966).
  3. The proportion-of-variance-criterion: Retain enough components so that the cumulative percent of variance accounted for is equal to some minimal value, i.e. 80 percent.
# load package to extract Eigenvalues
library("factoextra")

# extract Eigenvalues
get_eigenvalue(pc)
      eigenvalue variance.percent cumulative.variance.percent
Dim.1  1.8592701        92.963503                     92.9635
Dim.2  0.1407299         7.036497                    100.0000

Taking all three criteria into consideration, we find that Component 1 shows an Eigenvalue of \(v\ge1\), is followed by a break in Eigenvalues with the Eigenvalue of Component 1 being \(v=1.9\) and the Eigenvalue of Component 2 being \(v=0.1\), and that the cumulative percentage of variance accounted for by Component 1 is \(σ^2=93\%\), i.e. above our target level of \(σ^2=80\%\).

Thus, we add component 1 to our data frame.

pcframe <- as.data.frame(pc$scores)
pcframe <- pcframe[1]


names(pcframe)[names(pcframe) == "Comp.1"] <- "LCF_Comp.1"


data <- cbind(data, pcframe)

PCA Loadings

Now we can take a closer look at the component’s loadings.

pc$loadings

Loadings:
     Comp.1 Comp.2
EDNN  0.707  0.707
SCPP -0.707  0.707

               Comp.1 Comp.2
SS loadings       1.0    1.0
Proportion Var    0.5    0.5
Cumulative Var    0.5    1.0

But what do these loadings conceptually mean?

Component 1

Component 1 is rather highly correlated with EDNN and rather highly negatively correlated with SCPP.

For EDNN, high values indicate that a word is more distant from its nearest word neighbour.

For SCPP, high values indicate a high correlation of the predicted path [and its predicted semantics] with the targeted semantic vector.

Note: SCPP is identical to the correlations measure of the WpmWithLdl package.

Thus, for Component 1, high values indicate less activation diversity, while lower values indicate more activation diversity.

10.3 GAMMs

Let’s see whether PCA component and LCF measures are significant predictors for word-final /s/ duration in singular and plural pseudowords. We will use Generalized Additive Mixed Models (GAMMs) to model /s/ durations.

Modelling

Step 0

First, create a model with all relevant variables, including control and PCA variables, and speaker as random effect:

gam.LCF.1 <- bam(sDurLog ~
                          Affix +
                          s(LCF_Comp.1) +
                          s(ALC) +
                          s(ALDC, k=3) +
                          s(DRC) +
                          s(NNC) +
                          SylPerMin+
                          pauseBin+
                          folType+
                          preC+
                          monoMultilingual+
                          s(speaker, bs="re"),
                        data=data, method = "REML")

Checking the anova() function’s output, we now exclude non-significant variables in a stepwise fashion. After each potential exclusion, compareML() is used to check whether the exclusion is justified in terms of model fit.

anova(gam.LCF.1)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + s(LCF_Comp.1) + s(ALC) + s(ALDC, k = 3) + s(DRC) + 
    s(NNC) + SylPerMin + pauseBin + folType + preC + monoMultilingual + 
    s(speaker, bs = "re")

Parametric Terms:
                 df       F  p-value
Affix             1  18.228 2.28e-05
SylPerMin         1 826.702  < 2e-16
pauseBin          1  40.703 3.53e-10
folType           4   4.794 0.000819
preC              3   2.527 0.056543
monoMultilingual  1   0.352 0.553418

Approximate significance of smooth terms:
                 edf Ref.df     F p-value
s(LCF_Comp.1)  1.000  1.000 0.339  0.5606
s(ALC)         2.334  2.917 3.856  0.0104
s(ALDC)        1.000  1.000 0.031  0.8602
s(DRC)         1.443  1.746 0.996  0.4691
s(NNC)         1.647  2.031 2.820  0.0585
s(speaker)    31.391 38.000 5.275  <2e-16

Step 1

Exclusion of ALDC.

gam.LCF.2 <- bam(sDurLog ~
                   Affix +
                   s(LCF_Comp.1) +
                   s(ALC) +
                   #s(ALDC, k=3) +
                   s(DRC) +
                   s(NNC) +
                   SylPerMin+
                   pauseBin+
                   folType+
                   preC+
                   monoMultilingual+
                   s(speaker, bs="re"),
                 data=data, method = "REML")

compareML(gam.LCF.1, gam.LCF.2)
gam.LCF.1: sDurLog ~ Affix + s(LCF_Comp.1) + s(ALC) + s(ALDC, k = 3) + s(DRC) + 
    s(NNC) + SylPerMin + pauseBin + folType + preC + monoMultilingual + 
    s(speaker, bs = "re")

gam.LCF.2: sDurLog ~ Affix + s(LCF_Comp.1) + s(ALC) + s(DRC) + s(NNC) + 
    SylPerMin + pauseBin + folType + preC + monoMultilingual + 
    s(speaker, bs = "re")

Model gam.LCF.2 preferred: lower REML score (3.869), and lower df (2.000).
-----
      Model     Score Edf Difference    Df
1 gam.LCF.1 -145.5339  23                 
2 gam.LCF.2 -149.4032  21     -3.869 2.000

AIC difference: 2.76, model gam.LCF.2 has lower AIC.
anova(gam.LCF.2)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + s(LCF_Comp.1) + s(ALC) + s(DRC) + s(NNC) + 
    SylPerMin + pauseBin + folType + preC + monoMultilingual + 
    s(speaker, bs = "re")

Parametric Terms:
                 df       F  p-value
Affix             1  18.548 1.93e-05
SylPerMin         1 830.084  < 2e-16
pauseBin          1  40.760 3.44e-10
folType           4   4.803 0.000806
preC              3   2.506 0.058130
monoMultilingual  1   0.352 0.553339

Approximate significance of smooth terms:
                 edf Ref.df     F p-value
s(LCF_Comp.1)  1.000  1.001 0.383 0.53643
s(ALC)         2.389  2.980 3.920 0.00896
s(DRC)         1.505  1.831 1.174 0.40004
s(NNC)         1.624  2.003 2.874 0.05689
s(speaker)    31.408 38.000 5.287 < 2e-16

Step 2

Exclusion of LCF_Comp.1.

gam.LCF.3 <- bam(sDurLog ~
                   Affix +
                   #s(LCF_Comp.1) +
                   s(ALC) +
                   #s(ALDC, k=3) +
                   s(DRC) +
                   s(NNC) +
                   SylPerMin+
                   pauseBin+
                   folType+
                   preC+
                   monoMultilingual+
                   s(speaker, bs="re"),
                 data=data, method = "REML")

compareML(gam.LCF.2, gam.LCF.3)
gam.LCF.2: sDurLog ~ Affix + s(LCF_Comp.1) + s(ALC) + s(DRC) + s(NNC) + 
    SylPerMin + pauseBin + folType + preC + monoMultilingual + 
    s(speaker, bs = "re")

gam.LCF.3: sDurLog ~ Affix + s(ALC) + s(DRC) + s(NNC) + SylPerMin + pauseBin + 
    folType + preC + monoMultilingual + s(speaker, bs = "re")

Model gam.LCF.3 preferred: lower REML score (3.834), and lower df (2.000).
-----
      Model     Score Edf Difference    Df
1 gam.LCF.2 -149.4032  21                 
2 gam.LCF.3 -153.2369  19     -3.834 2.000

AIC difference: 1.70, model gam.LCF.3 has lower AIC.
anova(gam.LCF.3)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + s(ALC) + s(DRC) + s(NNC) + SylPerMin + pauseBin + 
    folType + preC + monoMultilingual + s(speaker, bs = "re")

Parametric Terms:
                 df       F  p-value
Affix             1  18.619 1.86e-05
SylPerMin         1 831.416  < 2e-16
pauseBin          1  40.924 3.17e-10
folType           4   4.911 0.000667
preC              3   2.770 0.040903
monoMultilingual  1   0.347 0.555956

Approximate significance of smooth terms:
              edf Ref.df     F p-value
s(ALC)      2.383  2.973 4.280 0.00554
s(DRC)      1.562  1.910 1.116 0.39325
s(NNC)      1.633  2.015 3.027 0.04825
s(speaker) 31.383 38.000 5.281 < 2e-16

Step 3

Exclusion of DRC.

gam.LCF.4 <- bam(sDurLog ~
                   Affix +
                   #s(LCF_Comp.1) +
                   s(ALC) +
                   #s(ALDC, k=3) +
                   #s(DRC) +
                   s(NNC) +
                   SylPerMin+
                   pauseBin+
                   folType+
                   preC+
                   monoMultilingual+
                   s(speaker, bs="re"),
                 data=data, method = "REML")

compareML(gam.LCF.3, gam.LCF.4)
gam.LCF.3: sDurLog ~ Affix + s(ALC) + s(DRC) + s(NNC) + SylPerMin + pauseBin + 
    folType + preC + monoMultilingual + s(speaker, bs = "re")

gam.LCF.4: sDurLog ~ Affix + s(ALC) + s(NNC) + SylPerMin + pauseBin + folType + 
    preC + monoMultilingual + s(speaker, bs = "re")

Model gam.LCF.4 preferred: lower REML score (3.158), and lower df (2.000).
-----
      Model     Score Edf Difference    Df
1 gam.LCF.3 -153.2369  19                 
2 gam.LCF.4 -156.3945  17     -3.158 2.000

AIC difference: 2.49, model gam.LCF.4 has lower AIC.
anova(gam.LCF.4)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + s(ALC) + s(NNC) + SylPerMin + pauseBin + folType + 
    preC + monoMultilingual + s(speaker, bs = "re")

Parametric Terms:
                 df       F  p-value
Affix             1  21.522 4.29e-06
SylPerMin         1 829.083  < 2e-16
pauseBin          1  41.677 2.21e-10
folType           4   4.840 0.000756
preC              3   2.173 0.090067
monoMultilingual  1   0.378 0.538845

Approximate significance of smooth terms:
              edf Ref.df     F  p-value
s(ALC)      2.653  3.304 5.397 0.000872
s(NNC)      1.326  1.573 4.464 0.012851
s(speaker) 31.367 38.000 5.273  < 2e-16

Step 4

Exclusion of monoMultilingual.

gam.LCF.5 <- bam(sDurLog ~
                   Affix +
                   #s(LCF_Comp.1) +
                   s(ALC) +
                   #s(ALDC, k=3) +
                   #s(DRC) +
                   s(NNC) +
                   SylPerMin+
                   pauseBin+
                   folType+
                   preC+
                   #monoMultilingual+
                   s(speaker, bs="re"),
                 data=data, method = "REML")

compareML(gam.LCF.4, gam.LCF.5)
gam.LCF.4: sDurLog ~ Affix + s(ALC) + s(NNC) + SylPerMin + pauseBin + folType + 
    preC + monoMultilingual + s(speaker, bs = "re")

gam.LCF.5: sDurLog ~ Affix + s(ALC) + s(NNC) + SylPerMin + pauseBin + folType + 
    preC + s(speaker, bs = "re")

Model gam.LCF.5 preferred: lower REML score (2.056), and lower df (1.000).
-----
      Model     Score Edf Difference    Df
1 gam.LCF.4 -156.3945  17                 
2 gam.LCF.5 -158.4508  16     -2.056 1.000

AIC difference: 0.37, model gam.LCF.5 has lower AIC.
anova(gam.LCF.5)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + s(ALC) + s(NNC) + SylPerMin + pauseBin + folType + 
    preC + s(speaker, bs = "re")

Parametric Terms:
          df       F  p-value
Affix      1  21.459 4.43e-06
SylPerMin  1 844.014  < 2e-16
pauseBin   1  41.531 2.37e-10
folType    4   4.828 0.000771
preC       3   2.174 0.089918

Approximate significance of smooth terms:
              edf Ref.df     F  p-value
s(ALC)      2.657  3.309 5.392 0.000872
s(NNC)      1.338  1.593 4.423 0.012882
s(speaker) 32.088 39.000 5.190  < 2e-16

Step 5

Exclusion of preC.

gam.LCF.6 <- bam(sDurLog ~
                   Affix +
                   #s(LCF_Comp.1) +
                   s(ALC) +
                   #s(ALDC, k=3) +
                   #s(DRC) +
                   s(NNC) +
                   SylPerMin+
                   pauseBin+
                   folType+
                   #preC+
                   #monoMultilingual+
                   s(speaker, bs="re"),
                 data=data, method = "REML")

compareML(gam.LCF.5, gam.LCF.6)
gam.LCF.5: sDurLog ~ Affix + s(ALC) + s(NNC) + SylPerMin + pauseBin + folType + 
    preC + s(speaker, bs = "re")

gam.LCF.6: sDurLog ~ Affix + s(ALC) + s(NNC) + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

Model gam.LCF.6 preferred: lower REML score (6.212), and lower df (3.000).
-----
      Model     Score Edf Difference    Df
1 gam.LCF.5 -158.4508  16                 
2 gam.LCF.6 -164.6630  13     -6.212 3.000

AIC difference: 2.93, model gam.LCF.6 has lower AIC.
anova(gam.LCF.6)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + s(ALC) + s(NNC) + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

Parametric Terms:
          df       F  p-value
Affix      1  23.657 1.47e-06
SylPerMin  1 862.936  < 2e-16
pauseBin   1  41.921 1.96e-10
folType    4   5.055 0.000517

Approximate significance of smooth terms:
              edf Ref.df     F p-value
s(ALC)      2.762  3.436 4.627 0.00241
s(NNC)      1.000  1.000 7.142 0.00772
s(speaker) 32.011 39.000 5.111 < 2e-16

Step 6

Remove smooth of NNC.

gam.LCF.7 <- bam(sDurLog ~
                   Affix +
                   #s(LCF_Comp.1) +
                   s(ALC) +
                   #s(ALDC, k=3) +
                   #s(DRC) +
                   NNC +
                   SylPerMin+
                   pauseBin+
                   folType+
                   #preC+
                   #monoMultilingual+
                   s(speaker, bs="re"),
                 data=data, method = "REML")

compareML(gam.LCF.6, gam.LCF.7)
gam.LCF.6: sDurLog ~ Affix + s(ALC) + s(NNC) + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

gam.LCF.7: sDurLog ~ Affix + s(ALC) + NNC + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

Model gam.LCF.7 preferred: lower REML score (3.184), and lower df (1.000).
-----
      Model     Score Edf Difference    Df
1 gam.LCF.6 -164.6630  13                 
2 gam.LCF.7 -167.8472  12     -3.184 1.000

AIC difference: 0.02, model gam.LCF.7 has lower AIC.
anova(gam.LCF.7)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + s(ALC) + NNC + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

Parametric Terms:
          df       F  p-value
Affix      1  23.660 1.47e-06
NNC        1   7.145 0.007717
SylPerMin  1 862.936  < 2e-16
pauseBin   1  41.921 1.96e-10
folType    4   5.055 0.000517

Approximate significance of smooth terms:
              edf Ref.df     F p-value
s(ALC)      2.762  3.436 4.627 0.00241
s(speaker) 32.011 39.000 5.111 < 2e-16

Step 7

Remove smooth of ALC.

gam.LCF.8 <- bam(sDurLog ~
                   Affix +
                   #s(LCF_Comp.1) +
                   ALC +
                   #s(ALDC, k=3) +
                   #s(DRC) +
                   NNC +
                   SylPerMin+
                   pauseBin+
                   folType+
                   #preC+
                   #monoMultilingual+
                   s(speaker, bs="re"),
                 data=data, method = "REML")

compareML(gam.LCF.7, gam.LCF.8)
gam.LCF.7: sDurLog ~ Affix + s(ALC) + NNC + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

gam.LCF.8: sDurLog ~ Affix + ALC + NNC + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

Model gam.LCF.8 preferred: lower REML score (2.857), and lower df (1.000).
-----
      Model     Score Edf Difference    Df
1 gam.LCF.7 -167.8472  12                 
2 gam.LCF.8 -170.7037  11     -2.857 1.000

AIC difference: -4.35, model gam.LCF.7 has lower AIC.
anova(gam.LCF.8)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + ALC + NNC + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

Parametric Terms:
          df       F  p-value
Affix      1  24.728 8.59e-07
ALC        1   9.188 0.002539
NNC        1   5.866 0.015729
SylPerMin  1 852.004  < 2e-16
pauseBin   1  42.316 1.62e-10
folType    4   5.131 0.000452

Approximate significance of smooth terms:
             edf Ref.df     F p-value
s(speaker) 31.82  39.00 4.959  <2e-16

Summary

summary.gam(gam.LCF.8)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + ALC + NNC + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

Parametric coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)   -0.353527   0.215506  -1.640  0.10143    
AffixPL       -0.091180   0.018336  -4.973 8.59e-07 ***
ALC           -2.961153   0.976879  -3.031  0.00254 ** 
NNC           -0.502288   0.207393  -2.422  0.01573 *  
SylPerMin     -0.502371   0.017211 -29.189  < 2e-16 ***
pauseBinpause  0.114892   0.017662   6.505 1.62e-10 ***
folTypeF      -0.034296   0.054918  -0.624  0.53254    
folTypeN      -0.003295   0.021097  -0.156  0.87594    
folTypeP       0.008877   0.018509   0.480  0.63170    
folTypeV      -0.067817   0.018782  -3.611  0.00033 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
             edf Ref.df     F p-value    
s(speaker) 31.82     39 4.959  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.805   Deviance explained = 81.7%
-REML = -170.7  Scale est. = 0.029397  n = 653

However, as we only have linear predictors in our final model, doing GAMMs appears to be rather unnecessary. Thus, let’s see what LMER models can do.

10.4 LMER

Let’s see whether a linear mixed effects regression model does a similarly good job as compared to the previous GAMMs model.

Modelling

First, create a model with all relevant variables, including control and PCA variables, and speaker as random effect:

lmer.1 <- lmer(sDurLog ~
                 Affix +
                 LCF_Comp.1 +
                 ALC +
                 ALDC +
                 DRC +
                 NNC +
                 SylPerMin+
                 pauseBin+
                 folType+
                 preC+
                 (1 | monoMultilingual) +
                 (1 | speaker),
               data=data, REML=T)

anova(lmer.1)
Type III Analysis of Variance Table with Satterthwaite's method
            Sum Sq Mean Sq NumDF  DenDF  F value    Pr(>F)    
Affix       0.6109  0.6109     1 618.06  20.9409 5.727e-06 ***
LCF_Comp.1  0.0076  0.0076     1 609.27   0.2612 0.6094903    
ALC         0.2621  0.2621     1 609.61   8.9846 0.0028334 ** 
ALDC        0.0128  0.0128     1 610.09   0.4389 0.5079133    
DRC         0.0396  0.0396     1 603.60   1.3568 0.2445494    
NNC         0.0959  0.0959     1 604.22   3.2866 0.0703435 .  
SylPerMin  24.2848 24.2848     1 536.13 832.4275 < 2.2e-16 ***
pauseBin    1.1868  1.1868     1 631.01  40.6805 3.467e-10 ***
folType     0.5527  0.1382     4 608.14   4.7359 0.0009061 ***
preC        0.2271  0.0757     3 604.96   2.5952 0.0516470 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Using the step() function, we exclude all non-significant predictors:

step(lmer.1)
Backward reduced random-effect table:

                       Eliminated npar logLik     AIC   LRT Df Pr(>Chisq)    
<none>                              19 157.21 -276.41                        
(1 | monoMultilingual)          1   18 157.21 -278.41   0.0  1          1    
(1 | speaker)                   0   17 105.46 -176.92 103.5  1     <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Backward reduced fixed-effect table:
Degrees of freedom method: Satterthwaite 

           Eliminated  Sum Sq Mean Sq NumDF  DenDF  F value    Pr(>F)    
LCF_Comp.1          1  0.0076  0.0076     1 609.27   0.2612 0.6094904    
ALDC                2  0.0166  0.0166     1 612.23   0.5697 0.4506751    
DRC                 3  0.0627  0.0627     1 605.03   2.1512 0.1429746    
preC                4  0.2058  0.0686     3 608.03   2.3513 0.0712936 .  
Affix               0  0.7270  0.7270     1 625.06  24.7284 8.538e-07 ***
ALC                 0  0.2701  0.2701     1 615.70   9.1884 0.0025378 ** 
NNC                 0  0.1724  0.1724     1 611.25   5.8657 0.0157289 *  
SylPerMin           0 25.0468 25.0468     1 541.64 852.0041 < 2.2e-16 ***
pauseBin            0  1.2440  1.2440     1 637.49  42.3161 1.571e-10 ***
folType             0  0.6034  0.1508     4 614.43   5.1310 0.0004516 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Model found:
sDurLog ~ Affix + ALC + NNC + SylPerMin + pauseBin + folType + 
    (1 | speaker)

This leaves us with the following final model:

lmer.2 <- lmer(sDurLog ~ 
                 Affix + 
                 ALC + 
                 NNC + 
                 SylPerMin + 
                 pauseBin + 
                 folType + 
                 (1 | speaker),
               data=data, REML=T)

anova(lmer.2)
Type III Analysis of Variance Table with Satterthwaite's method
           Sum Sq Mean Sq NumDF  DenDF  F value    Pr(>F)    
Affix      0.7270  0.7270     1 625.06  24.7284 8.538e-07 ***
ALC        0.2701  0.2701     1 615.70   9.1884 0.0025378 ** 
NNC        0.1724  0.1724     1 611.25   5.8657 0.0157289 *  
SylPerMin 25.0468 25.0468     1 541.64 852.0041 < 2.2e-16 ***
pauseBin   1.2440  1.2440     1 637.49  42.3161 1.571e-10 ***
folType    0.6034  0.1508     4 614.43   5.1310 0.0004516 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

LMER vs. GAMM

Let’s compare the results of both, LMER and GAMM, models:

summary(lmer.2)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: sDurLog ~ Affix + ALC + NNC + SylPerMin + pauseBin + folType +  
    (1 | speaker)
   Data: data

REML criterion at convergence: -341.4

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-2.6783 -0.6057 -0.0213  0.7076  3.3649 

Random effects:
 Groups   Name        Variance Std.Dev.
 speaker  (Intercept) 0.00892  0.09445 
 Residual             0.02940  0.17146 
Number of obs: 653, groups:  speaker, 40

Fixed effects:
                Estimate Std. Error         df t value Pr(>|t|)    
(Intercept)    -0.353527   0.215506 627.518776  -1.640  0.10141    
AffixPL        -0.091180   0.018336 625.056760  -4.973 8.54e-07 ***
ALC            -2.961153   0.976879 615.698014  -3.031  0.00254 ** 
NNC            -0.502288   0.207393 611.245425  -2.422  0.01573 *  
SylPerMin      -0.502371   0.017211 541.640869 -29.189  < 2e-16 ***
pauseBinpause   0.114892   0.017662 637.486511   6.505 1.57e-10 ***
folTypeF       -0.034296   0.054918 618.211537  -0.624  0.53253    
folTypeN       -0.003295   0.021097 609.693974  -0.156  0.87594    
folTypeP        0.008877   0.018509 610.753479   0.480  0.63170    
folTypeV       -0.067817   0.018782 618.064091  -3.611  0.00033 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) AffxPL ALC    NNC    SylPrM psBnps flTypF flTypN flTypP
AffixPL     -0.568                                                        
ALC          0.246 -0.364                                                 
NNC         -0.975  0.574 -0.294                                          
SylPerMin   -0.297 -0.013  0.050  0.104                                   
pauseBinpas -0.153 -0.071  0.038  0.057  0.415                            
folTypeF    -0.042  0.050  0.045  0.022  0.024  0.004                     
folTypeN    -0.013  0.056  0.025 -0.018 -0.031 -0.069  0.136              
folTypeP    -0.026  0.031 -0.005  0.011 -0.117 -0.095  0.153  0.409       
folTypeV     0.058 -0.045  0.021 -0.050 -0.232 -0.228  0.138  0.413  0.489
summary.gam(gam.LCF.8)

Family: gaussian 
Link function: identity 

Formula:
sDurLog ~ Affix + ALC + NNC + SylPerMin + pauseBin + folType + 
    s(speaker, bs = "re")

Parametric coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)   -0.353527   0.215506  -1.640  0.10143    
AffixPL       -0.091180   0.018336  -4.973 8.59e-07 ***
ALC           -2.961153   0.976879  -3.031  0.00254 ** 
NNC           -0.502288   0.207393  -2.422  0.01573 *  
SylPerMin     -0.502371   0.017211 -29.189  < 2e-16 ***
pauseBinpause  0.114892   0.017662   6.505 1.62e-10 ***
folTypeF      -0.034296   0.054918  -0.624  0.53254    
folTypeN      -0.003295   0.021097  -0.156  0.87594    
folTypeP       0.008877   0.018509   0.480  0.63170    
folTypeV      -0.067817   0.018782  -3.611  0.00033 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
             edf Ref.df     F p-value    
s(speaker) 31.82     39 4.959  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.805   Deviance explained = 81.7%
-REML = -170.7  Scale est. = 0.029397  n = 653

Interpretation

For ALC, higher values indicate that a pseudoword’s semantic vector is part of a denser semantic neighbourhood.

For NNC, higher values indicate that a pseudoword’s semantics are close to a real word’s semantics.

References

Baayen, R. H., Chuang, Y. Y., and Blevins, J. P. (2018). Inflectional morphology with linear mappings. The Mental Lexicon, 13 (2), 232-270. doi.org/10.1075/ml.18010.baa

Baayen, R. H., Chuang, Y. Y., Shafaei-Bajestan, E., & Blevins, J. P. (2019). The discriminative lexicon: A unified computational model for the lexicon and lexical processing in comprehension and production grounded not in (de)composition but in linear discriminative learning. Complexity, 2019, 1-39. doi.org/10.1155/2019/4895891

Cattell, R. B. (1966). The scree test for the number of factors. Multivariate Behavioral Research, 1, 245-276.

Chuang, Y-Y., Vollmer, M-l., Shafaei-Bajestan, E., Gahl, S., Hendrix, P., & Baayen, R. H. (2020). The processing of pseudoword form and meaning in production and comprehension: A computational modeling approach using Linear Discriminative Learning. Behavior Research Methods, 1-51. doi.org/10.3758/s13428-020-01356-w

Nykamp, D. (2020, November 10). Multiplying matrices and vectors. Math Insight. mathinsight.org/matrix_vector_multiplication

Schmitz, D., Baer-Henney, D., & Plag, I. (submitted). The duration of word-final /s/ differs across morphological categories in English: Evidence from pseudowords.

Tucker, B. V., Brenner, D., Danielson, D. K., Kelley, M. C., Nenadic, F., & Sims, M. (2018). The massive auditory lexical decision (MALD) database. Behavior Research Methods, 1-18. doi.org/10.3758/s13428-018-1056-1