WEBVTT

00:00:02.000 --> 00:00:05.000
hi everyone here we are in the video

00:00:05.000 --> 00:00:08.000
instructions for homework 1 assignment

00:00:08.000 --> 00:00:10.000
one I've probably called it both of

00:00:10.000 --> 00:00:12.000
those things uh at various times in the

00:00:12.000 --> 00:00:15.000
semester but homeworks assignments it's

00:00:15.000 --> 00:00:17.000
all the same thing your first assignment

00:00:17.000 --> 00:00:20.000
has been posted and released albeit not

00:00:20.000 --> 00:00:21.000
completely because this video has not

00:00:21.000 --> 00:00:25.000
yet been made on your Canvas site um and

00:00:25.000 --> 00:00:26.000
I just want to walk you through some of

00:00:26.000 --> 00:00:29.000
the laboratory parts meaning some of the

00:00:29.000 --> 00:00:32.000
using Praat parts before we do this I

00:00:32.000 --> 00:00:34.000
just want to remind us that we're remind

00:00:34.000 --> 00:00:36.000
us remind you um that we are using the

00:00:36.000 --> 00:00:39.000
Palette of Voices uh wave files the

00:00:39.000 --> 00:00:41.000
Palette of Voices um digitized speech

00:00:41.000 --> 00:00:43.000
files for this homework assignment the

00:00:43.000 --> 00:00:45.000
Palette of Voices is a public access

00:00:45.000 --> 00:00:48.000
corpus of the speech of a variety of

00:00:48.000 --> 00:00:50.000
masculine identifying men including both

00:00:50.000 --> 00:00:54.000
cisgender men transgender men and um

00:00:54.000 --> 00:00:57.000
transgender transmasculine non-binary

00:00:57.000 --> 00:01:01.000
folks as well um people who contributed

00:01:01.000 --> 00:01:03.000
to the Palette of Voices agreed to have

00:01:03.000 --> 00:01:05.000
their speech released to a public corpus

00:01:05.000 --> 00:01:08.000
so there's certainly nothing um unusual

00:01:08.000 --> 00:01:11.000
about this use for it but I do want to

00:01:11.000 --> 00:01:13.000
have you keep in mind that the people

00:01:13.000 --> 00:01:16.000
who participated in this um you know

00:01:16.000 --> 00:01:19.000
deserve privacy and really deserve

00:01:19.000 --> 00:01:22.000
respect so you know as you're using this

00:01:22.000 --> 00:01:23.000
treat these data respectfully this was a

00:01:23.000 --> 00:01:25.000
great gift from the people who

00:01:25.000 --> 00:01:28.000
participated to um to the clinical world

00:01:28.000 --> 00:01:30.000
and to the speech science world such as

00:01:30.000 --> 00:01:33.000
it is so please treat these recordings

00:01:33.000 --> 00:01:36.000
respectfully um and carefully and when

00:01:36.000 --> 00:01:37.000
you're done with this homework

00:01:37.000 --> 00:01:39.000
assignment I ask you to delete them from

00:01:39.000 --> 00:01:40.000
whatever computer you're

00:01:40.000 --> 00:01:43.000
using first thing we're going to do is

00:01:43.000 --> 00:01:45.000
download that and so you can see now why

00:01:45.000 --> 00:01:47.000
I started out with all of those provisos

00:01:47.000 --> 00:01:49.000
all of those little warnings is because

00:01:49.000 --> 00:01:51.000
the first thing I'm going to have you do

00:01:51.000 --> 00:01:54.000
is to download the uh files onto

00:01:54.000 --> 00:01:55.000
whatever computer you're going to use

00:01:55.000 --> 00:01:56.000
for this homework assignment and the way

00:01:56.000 --> 00:01:59.000
that we're going to do this is just by

00:01:59.000 --> 00:02:01.000
clicking on this .Collection file

00:02:01.000 --> 00:02:03.000
that's going to take us to another

00:02:03.000 --> 00:02:06.000
window and in that window there'll be a

00:02:06.000 --> 00:02:08.000
little button that says download blahy

00:02:08.000 --> 00:02:12.000
blahy blah you want to download that now

00:02:12.000 --> 00:02:14.000
where is that going to go really depends

00:02:14.000 --> 00:02:17.000
on your operating system for me on a

00:02:17.000 --> 00:02:18.000
Windows operating system it goes into a

00:02:18.000 --> 00:02:23.000
folder called downloads um I am I am a a

00:02:23.000 --> 00:02:27.000
I am a hybrid Mac PC user and so I um I

00:02:27.000 --> 00:02:28.000
think I know where it would go if I were

00:02:28.000 --> 00:02:31.000
on my Mac I'm not on my Mac right now if

00:02:31.000 --> 00:02:32.000
you're on

00:02:32.000 --> 00:02:34.000
Linux if you're on Linux then you're

00:02:34.000 --> 00:02:35.000
probably the kind of user who doesn't

00:02:35.000 --> 00:02:37.000
need me to tell you where your downloads

00:02:37.000 --> 00:02:39.000
folder is because Linux users tend to be

00:02:39.000 --> 00:02:42.000
kind of power users don't they all right

00:02:42.000 --> 00:02:43.000
so that's the first thing second thing

00:02:43.000 --> 00:02:45.000
and maybe that should have been the

00:02:45.000 --> 00:02:46.000
first thing is we want to make sure that

00:02:46.000 --> 00:02:50.000
you already have Praat so um go to

00:02:50.000 --> 00:02:55.000
www.praat.org and download the version of

00:02:55.000 --> 00:02:58.000
Praat for your operating system I have

00:02:58.000 --> 00:03:00.000
downloaded Praat and used it a jillion

00:03:00.000 --> 00:03:02.000
times and here it is already open on my

00:03:02.000 --> 00:03:05.000
computer so here we are in Praat and I

00:03:05.000 --> 00:03:07.000
certainly hope I've shared my screen in

00:03:07.000 --> 00:03:08.000
such a way that I'm not going to have to

00:03:08.000 --> 00:03:12.000
remake this video um so we want to first

00:03:12.000 --> 00:03:14.000
thing we want to do is read in that

00:03:14.000 --> 00:03:16.000
.Collection file which is really just an

00:03:16.000 --> 00:03:18.000
archive of 20 different wave files we're

00:03:18.000 --> 00:03:20.000
going to do that through the open menu

00:03:20.000 --> 00:03:24.000
open read from file and then we're going

00:03:24.000 --> 00:03:26.000
to navigate to the folder where this is

00:03:26.000 --> 00:03:28.000
for me um it's going to be in downloads

00:03:28.000 --> 00:03:31.000
you can see I've actually downloaded it

00:03:31.000 --> 00:03:33.000
multiple times

00:03:33.000 --> 00:03:37.000
here um and here we are so I've um read

00:03:37.000 --> 00:03:41.000
in all 20 of these um files so

00:03:41.000 --> 00:03:46.000
POV_001, 002, etc., etc. those are just the

00:03:46.000 --> 00:03:48.000
different participant IDs and in this

00:03:48.000 --> 00:03:50.000
exercise everybody is going to be saying

00:03:50.000 --> 00:03:51.000
sentence

00:03:51.000 --> 00:03:54.000
11 uh this exercise is going this

00:03:54.000 --> 00:03:55.000
homework this assignment is going to

00:03:55.000 --> 00:03:58.000
have you first listen to the files and

00:03:58.000 --> 00:04:01.000
make some qualitative ratings of them so

00:04:01.000 --> 00:04:02.000
the first thing I want you to do is just

00:04:02.000 --> 00:04:06.000
listen to all 20 files in a row um and

00:04:06.000 --> 00:04:08.000
just get a sense of what the range of

00:04:08.000 --> 00:04:11.000
variation is in these files um I will

00:04:11.000 --> 00:04:13.000
not do that here because it's just going

00:04:13.000 --> 00:04:14.000
to take up more time on this recording

00:04:14.000 --> 00:04:16.000
but if you want to listen to multiple

00:04:16.000 --> 00:04:18.000
files in Praat you can just select

00:04:18.000 --> 00:04:21.000
multiple files with your mouse let's say

00:04:21.000 --> 00:04:23.000
in this case let's just select the first

00:04:23.000 --> 00:04:25.000
three and then once you press play it's

00:04:25.000 --> 00:04:27.000
going to play everything that you've

00:04:27.000 --> 00:04:28.000
selected so if you select all 20 and

00:04:28.000 --> 00:04:30.000
press play you've got to listen to all 20

00:04:30.000 --> 00:04:33.000
I'm not sure there's an elegant way to

00:04:33.000 --> 00:04:34.000
interrupt it other than force quitting

00:04:34.000 --> 00:04:37.000
the application let's just play these an

00:04:37.000 --> 00:04:40.000
ample wall needs at least four layers of

00:04:40.000 --> 00:04:42.000
bricks an ample wall needs at least four

00:04:42.000 --> 00:04:45.000
layers of bricks an ample wall needs at

00:04:45.000 --> 00:04:48.000
least four layers of bricks right so if

00:04:48.000 --> 00:04:49.000
we went through and listened to all 20

00:04:49.000 --> 00:04:51.000
of those that's the first thing I want

00:04:51.000 --> 00:04:53.000
you to do is just listen and get a sense

00:04:53.000 --> 00:04:56.000
of what the range of these is once

00:04:56.000 --> 00:04:59.000
you've done that let's pivot over here

00:04:59.000 --> 00:05:01.000
to the assignment itself the first thing

00:05:01.000 --> 00:05:02.000
that I want you to well the first thing

00:05:02.000 --> 00:05:04.000
we've already done the first thing and

00:05:04.000 --> 00:05:05.000
the second thing and the third thing the

00:05:05.000 --> 00:05:08.000
next thing that I want you to do is to

00:05:08.000 --> 00:05:11.000
make some um perceptual ratings so your

00:05:11.000 --> 00:05:15.000
kind of gut instincts of how um fast or

00:05:15.000 --> 00:05:17.000
not fast the person speaks and how much

00:05:17.000 --> 00:05:19.000
pitch variation you think the person

00:05:19.000 --> 00:05:21.000
uses right and the reason that we're

00:05:21.000 --> 00:05:23.000
choosing these two parameters is once

00:05:23.000 --> 00:05:25.000
you've made these perceptual ratings

00:05:25.000 --> 00:05:27.000
you're going to go in and make some um

00:05:27.000 --> 00:05:29.000
objective ratings of rate of speech and

00:05:29.000 --> 00:05:31.000
pitch variation and part of the

00:05:31.000 --> 00:05:33.000
assignment is to see how your um rate of

00:05:33.000 --> 00:05:35.000
speech and pitch variation measures

00:05:35.000 --> 00:05:37.000
compare to your rate of speech

00:05:37.000 --> 00:05:40.000
and pitch variation ratings so for each

00:05:40.000 --> 00:05:42.000
of the sentences go back and listen to

00:05:42.000 --> 00:05:45.000
it so after you've listened to all 20 go

00:05:45.000 --> 00:05:46.000
back in and listen to them individually

00:05:46.000 --> 00:05:48.000
so this time just listening to number

00:05:48.000 --> 00:05:51.000
one an ample wall needs at least four

00:05:51.000 --> 00:05:54.000
layers of bricks and um rate its rate

00:05:54.000 --> 00:05:55.000
of speech and pitch variation on the

00:05:55.000 --> 00:05:58.000
10-point scales listed here so for rate

00:05:58.000 --> 00:06:00.000
of speech we're going to go from 10

00:06:00.000 --> 00:06:02.000
which would correspond to very fast all

00:06:02.000 --> 00:06:04.000
the way down to one which would mean

00:06:04.000 --> 00:06:06.000
not not very

00:06:06.000 --> 00:06:09.000
fast um you will also rate how much

00:06:09.000 --> 00:06:10.000
you think the person varies the pitch of

00:06:10.000 --> 00:06:12.000
their voice when they're speaking from

00:06:12.000 --> 00:06:15.000
one no pitch variation to 10 extremely

00:06:15.000 --> 00:06:17.000
variable pitch um you know if these if

00:06:17.000 --> 00:06:19.000
this was a real experiment I would think

00:06:19.000 --> 00:06:21.000
a little bit harder about these scales

00:06:21.000 --> 00:06:23.000
and the labels um you know you should

00:06:23.000 --> 00:06:25.000
never just sort of use labels casually

00:06:25.000 --> 00:06:27.000
in rating experiments I've never used

00:06:27.000 --> 00:06:28.000
these labels before and I'm not in love

00:06:28.000 --> 00:06:30.000
with them but you know we've got to get this

00:06:30.000 --> 00:06:32.000
homework assignment assigned so forgive

00:06:32.000 --> 00:06:35.000
me for having less elegant labels for

00:06:35.000 --> 00:06:37.000
these so then you're going to go through

00:06:37.000 --> 00:06:39.000
and make those perceptual ratings for

00:06:39.000 --> 00:06:41.000
all 20 of these sentences and you'll see

00:06:41.000 --> 00:06:42.000
if you look through the um nuts and

00:06:42.000 --> 00:06:44.000
bolts of the assignment that I think

00:06:44.000 --> 00:06:46.000
it's a good idea in fact why don't you

00:06:46.000 --> 00:06:48.000
just do it why don't I just say it um

00:06:48.000 --> 00:06:51.000
copy the tables from you know this uh

00:06:51.000 --> 00:06:54.000
Canvas assignment into a Word document

00:06:54.000 --> 00:06:56.000
or a Google document or an OpenOffice

00:06:56.000 --> 00:06:58.000
document um because I'd like you to be

00:06:58.000 --> 00:07:01.000
I'm asking you to hand in these ratings

00:07:01.000 --> 00:07:02.000
um as well as the objective measures

00:07:02.000 --> 00:07:05.000
we'll talk about in a second all right

00:07:05.000 --> 00:07:06.000
so now you've gone through and you've

00:07:06.000 --> 00:07:09.000
made your 20 ratings and now it's time

00:07:09.000 --> 00:07:12.000
to do some objective measurements of

00:07:12.000 --> 00:07:16.000
rate of speech and um pitch variation so

00:07:16.000 --> 00:07:17.000
the way that we're going to measure rate

00:07:17.000 --> 00:07:19.000
of speech is in syllables per second

00:07:19.000 --> 00:07:20.000
there are lots of ways that you can

00:07:20.000 --> 00:07:22.000
measure rate of speech syllables per

00:07:22.000 --> 00:07:24.000
second words per second um all of them

00:07:24.000 --> 00:07:26.000
have their pluses and their minuses um

00:07:26.000 --> 00:07:29.000
syllables per second is um is in my

00:07:29.000 --> 00:07:31.000
opinion sort of the well certainly the

00:07:31.000 --> 00:07:33.000
easiest to use one of the easiest to use

00:07:33.000 --> 00:07:35.000
you know if you're talking about words

00:07:35.000 --> 00:07:36.000
per second you have to worry about

00:07:36.000 --> 00:07:38.000
different texts having some words that

00:07:38.000 --> 00:07:40.000
are just inherently longer than others

00:07:40.000 --> 00:07:42.000
um although you could say the same about

00:07:42.000 --> 00:07:45.000
syllables but but I digress all right

00:07:45.000 --> 00:07:47.000
let me just scroll back up

00:07:47.000 --> 00:07:49.000
here yeah make sure I didn't miss

00:07:49.000 --> 00:07:52.000
anything there okay so um how are you going

00:07:52.000 --> 00:07:55.000
to make these well the syllables per

00:07:55.000 --> 00:07:57.000
second measure um because everybody is

00:07:57.000 --> 00:07:59.000
saying the same sentence and let's

00:07:59.000 --> 00:08:01.000
listen to it an ample wall needs at

00:08:01.000 --> 00:08:04.000
least four layers of bricks an ample

00:08:04.000 --> 00:08:07.000
wall needs at least four layers of

00:08:07.000 --> 00:08:10.000
bricks 12 syllables so everybody is

00:08:10.000 --> 00:08:12.000
going to have the same number of

00:08:12.000 --> 00:08:15.000
syllables but the duration um that they

00:08:15.000 --> 00:08:17.000
say those syllables in is going to vary

00:08:17.000 --> 00:08:19.000
so really your task here um in measuring

00:08:19.000 --> 00:08:21.000
rate of speech is to measure sentence

00:08:21.000 --> 00:08:24.000
duration how are you going to do that well

00:08:24.000 --> 00:08:26.000
you'll click on each one of these files

00:08:26.000 --> 00:08:28.000
you'll do this um one by one in the file

00:08:28.000 --> 00:08:30.000
you'll click on each one of these files

00:08:30.000 --> 00:08:33.000
and then um click on View and Edit

00:08:33.000 --> 00:08:35.000
so it's got to be highlighted click on

00:08:35.000 --> 00:08:38.000
View and Edit and whatever comes up in

00:08:38.000 --> 00:08:40.000
your Praat window it's probably going to

00:08:40.000 --> 00:08:42.000
look a little bit different um because I

00:08:42.000 --> 00:08:43.000
use Praat so often it doesn't

00:08:43.000 --> 00:08:46.000
automatically go to the defaults that

00:08:46.000 --> 00:08:47.000
would come up if you had just downloaded

00:08:47.000 --> 00:08:51.000
Praat for example um but one thing is you

00:08:51.000 --> 00:08:53.000
you probably will see by default um the

00:08:53.000 --> 00:08:56.000
waveform and the spectrogram uh but the

00:08:56.000 --> 00:08:57.000
spectrogram might look quite a bit

00:08:57.000 --> 00:09:00.000
different um your you know you can

00:09:00.000 --> 00:09:02.000
change some of the settings for example

00:09:02.000 --> 00:09:03.000
if you wanted to just look at a narrower

00:09:03.000 --> 00:09:06.000
frequency range um you could change this

00:09:06.000 --> 00:09:08.000
to say 5,000 and now we're just looking

00:09:08.000 --> 00:09:12.000
at the lowest 5,000 um you could go much

00:09:12.000 --> 00:09:14.000
higher than that if you want to see a

00:09:14.000 --> 00:09:15.000
bigger frequency range I believe these

00:09:15.000 --> 00:09:18.000
were sampled at 44.1 kHz as a sampling

00:09:18.000 --> 00:09:20.000
rate um so you probably aren't going to

00:09:20.000 --> 00:09:24.000
be able to see anything over 22.05 kHz

00:09:24.000 --> 00:09:26.000
um as per the Nyquist theorem that we've

00:09:26.000 --> 00:09:27.000
talked about in class and we'll continue

00:09:27.000 --> 00:09:31.000
to talk about ad nauseam all right so how
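
NOTE
Illustrative sketch, not part of the spoken lecture: the Nyquist limit mentioned in the cues above is half the sampling rate. The function name is hypothetical, and 44.1 kHz is the rate the instructor says she believes these files were sampled at.
```python
def nyquist_frequency(sampling_rate_hz: float) -> float:
    """Return the highest frequency (in Hz) that a given sampling rate can represent."""
    return sampling_rate_hz / 2.0
# Assumed sampling rate of 44.1 kHz, as stated tentatively in the video.
print(nyquist_frequency(44_100))  # 22050.0
```
Under that assumption, a spectrogram of these files would show nothing above 22.05 kHz.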

00:09:31.000 --> 00:09:32.000
are you going to make these duration

00:09:32.000 --> 00:09:34.000
measures well you might just look at

00:09:34.000 --> 00:09:35.000
this and and look at this uh you might

00:09:35.000 --> 00:09:37.000
look at this window and look at the part

00:09:37.000 --> 00:09:39.000
down here that says 3.11 seconds and say

00:09:39.000 --> 00:09:41.000
well that was easy all I had to do was

00:09:41.000 --> 00:09:43.000
look in the window but we have a little

00:09:43.000 --> 00:09:45.000
bit of padding on the beginning and the

00:09:45.000 --> 00:09:47.000
end here so uh at the beginning here

00:09:47.000 --> 00:09:50.000
there's some silence a little bit for

00:09:50.000 --> 00:09:51.000
those who haven't worked with Praat yet

00:09:51.000 --> 00:09:53.000
which is probably many of you so don't feel

00:09:53.000 --> 00:09:55.000
like you're behind if you haven't um

00:09:55.000 --> 00:09:57.000
take a little bit of time to get um

00:09:57.000 --> 00:09:59.000
familiar with Praat uh if you want to

00:09:59.000 --> 00:10:01.000
listen to Simply uh if you want to

00:10:01.000 --> 00:10:03.000
listen to subsets of the sentence you

00:10:03.000 --> 00:10:05.000
can highlight them in the waveform and

00:10:05.000 --> 00:10:07.000
then click this box here that's just

00:10:07.000 --> 00:10:10.000
that

00:10:10.000 --> 00:10:13.000
interval come on M

00:10:13.000 --> 00:10:15.000
M

00:10:15.000 --> 00:10:19.000
M um all right so how are we going to

00:10:19.000 --> 00:10:22.000
make these measurements um what I want

00:10:22.000 --> 00:10:25.000
you to do first is to find the beginning

00:10:25.000 --> 00:10:28.000
of uh of the first word so it's uh an

00:10:28.000 --> 00:10:33.000
ample an ample wall um an is the first

00:10:33.000 --> 00:10:36.000
word so to look at where that starts we

00:10:36.000 --> 00:10:37.000
can actually select this and then click

00:10:37.000 --> 00:10:41.000
sel which will zoom to the selection

00:10:41.000 --> 00:10:43.000
and try and find here and it should

00:10:43.000 --> 00:10:45.000
hopefully be not too terribly unclear

00:10:45.000 --> 00:10:48.000
for each of these sentences where um the

00:10:48.000 --> 00:10:51.000
first uh instance of vocalization of

00:10:51.000 --> 00:10:56.000
speech begins um you'll see in the uh

00:10:56.000 --> 00:10:59.000
rubric for this assignment I have um a

00:10:59.000 --> 00:11:01.000
some guideline or I have some criteria

00:11:01.000 --> 00:11:03.000
for accuracy of measurements but the

00:11:03.000 --> 00:11:06.000
criteria are like 40 milliseconds within

00:11:06.000 --> 00:11:08.000
plus or minus 40 milliseconds so I don't

00:11:08.000 --> 00:11:11.000
want anybody to to sort of sweat whether

00:11:11.000 --> 00:11:13.000
or not the beginning of speech is here

00:11:13.000 --> 00:11:16.000
or here or here or here um as I'm moving

00:11:16.000 --> 00:11:18.000
the cursor around because those are

00:11:18.000 --> 00:11:20.000
those are going to fall well within the

00:11:20.000 --> 00:11:22.000
acceptable range of error of the

00:11:22.000 --> 00:11:23.000
acceptable margin of error for this

00:11:23.000 --> 00:11:26.000
assignment I kind of like the first
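
NOTE
Illustrative sketch, not part of the spoken lecture: one way to express the plus or minus 40 ms accuracy criterion described above. The function name, the constant name, and the idea of comparing against a single reference time are assumptions; the actual rubric may be applied differently.
```python
TOLERANCE_S = 0.040  # plus or minus 40 ms, per the rubric described in the video
def within_tolerance(measured_s: float, reference_s: float, tolerance_s: float = TOLERANCE_S) -> bool:
    """Return True if a measured time lies within the allowed margin of a reference time."""
    return abs(measured_s - reference_s) <= tolerance_s
# Hypothetical reference onset of 0.140 s compared with a measured onset of 0.125 s.
print(within_tolerance(0.125, 0.140))
```
Clicks that land a few milliseconds apart, like the cursor positions tried in the video, would all pass a check of this kind.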

00:11:26.000 --> 00:11:27.000
click that I had because it was a nice

00:11:27.000 --> 00:11:30.000
round number it was about 0.125

00:11:30.000 --> 00:11:35.000
so if I were you um for sentence

00:11:35.000 --> 00:11:37.000
number one and that's actually one of

00:11:37.000 --> 00:11:38.000
your sentences so you might as well just

00:11:38.000 --> 00:11:41.000
do as I say is I'd say the start of that

00:11:41.000 --> 00:11:43.000
sentence is at 0.125 seconds and that

00:11:43.000 --> 00:11:47.000
just means there's 0.125 seconds of um

00:11:47.000 --> 00:11:50.000
silence before it or not sentence

00:11:50.000 --> 00:11:52.000
silence and background noise so we've

00:11:52.000 --> 00:11:55.000
zoomed in there and let's go into our

00:11:55.000 --> 00:11:57.000
table um that's one of the things that

00:11:57.000 --> 00:11:59.000
we're going to need to calculate the

00:11:59.000 --> 00:12:01.000
duration of the spoken sentence so it

00:12:01.000 --> 00:12:03.000
begins at

00:12:03.000 --> 00:12:06.000
0.125 where does it end well let's click

00:12:06.000 --> 00:12:08.000
on all here to zoom out to see the

00:12:08.000 --> 00:12:11.000
entire sentence and let's go to the end

00:12:11.000 --> 00:12:16.000
let's zoom into the selection here

00:12:16.000 --> 00:12:18.000
um finding the end of the sentence is

00:12:18.000 --> 00:12:20.000
going to be a little harder than finding

00:12:20.000 --> 00:12:21.000
the beginning of the sentence and part

00:12:21.000 --> 00:12:23.000
of that is because at the end of the

00:12:23.000 --> 00:12:25.000
sentences some people have an extra

00:12:25.000 --> 00:12:29.000
exhale so at the end of a sentence you

00:12:29.000 --> 00:12:34.000
might have [exhale] and that is not speech right so

00:12:34.000 --> 00:12:37.000
for this sentence

00:12:37.000 --> 00:12:39.000
um at least four layers of bricks so

00:12:39.000 --> 00:12:42.000
this last word is

00:12:42.000 --> 00:12:45.000
bricks

00:12:45.000 --> 00:12:49.000
bricks um I would look for the end of

00:12:49.000 --> 00:12:51.000
this frication interval here the end of

00:12:51.000 --> 00:12:55.000
this sort of blob of noise at the end

00:12:55.000 --> 00:12:56.000
that corresponds to the S the voiceless

00:12:56.000 --> 00:12:57.000
alveolar

00:12:57.000 --> 00:12:59.000
fricative um

00:12:59.000 --> 00:13:01.000
again you want to give yourself a little

00:13:01.000 --> 00:13:03.000
bit of Grace here we've got you know a

00:13:03.000 --> 00:13:05.000
little bit of allowable slop in these

00:13:05.000 --> 00:13:09.000
measurements um on this one here uh let's

00:13:09.000 --> 00:13:13.000
listen to it again there's a

00:13:13.000 --> 00:13:16.000
bricks there's a bricks so this little

00:13:16.000 --> 00:13:18.000
acoustic event here I don't think that's

00:13:18.000 --> 00:13:20.000
part of the S I think that's actually

00:13:20.000 --> 00:13:24.000
sort of a noise at the end um but just

00:13:24.000 --> 00:13:26.000
for the sake of it being a really clear

00:13:26.000 --> 00:13:28.000
acoustic landmark you might want to

00:13:28.000 --> 00:13:30.000
choose this one you could also go

00:13:30.000 --> 00:13:32.000
back and choose something that's closer

00:13:32.000 --> 00:13:34.000
to kind of the end of the fricative

00:13:34.000 --> 00:13:35.000
here and you might want to try it a

00:13:35.000 --> 00:13:37.000
little and just listen you know you can

00:13:37.000 --> 00:13:39.000
listen to the to the left of the cursor

00:13:39.000 --> 00:13:43.000
by clicking down on this uh window

00:13:43.000 --> 00:13:46.000
here there's a bricks click to the right

00:13:46.000 --> 00:13:52.000
of the cursor to listen to what's after

00:13:52.000 --> 00:13:54.000
it yeah that doesn't even sound like

00:13:54.000 --> 00:13:55.000
speech that sounds like somebody

00:13:55.000 --> 00:13:57.000
clicking a key on a a keyboard or

00:13:57.000 --> 00:13:59.000
something

00:13:59.000 --> 00:14:00.000
so I'm actually going to go with where

00:14:00.000 --> 00:14:04.000
the cursor is right now so that's 2.887

00:14:04.000 --> 00:14:06.000
seconds I'm just going to the third

00:14:06.000 --> 00:14:08.000
decimal place and not worrying about the

00:14:08.000 --> 00:14:12.000
rounding rule um so 2.887 is the end of

00:14:12.000 --> 00:14:16.000
that so if 2.887 is the end and 0.125

00:14:16.000 --> 00:14:18.000
is the beginning the duration of the

00:14:18.000 --> 00:14:20.000
sentence in seconds is just the

00:14:20.000 --> 00:14:22.000
difference between them I'm not going to

00:14:22.000 --> 00:14:25.000
show my bad mental math again um as I

00:14:25.000 --> 00:14:26.000
already have multiple times this

00:14:26.000 --> 00:14:28.000
semester so if you wanted to get the

00:14:28.000 --> 00:14:30.000
duration of this sentence it would be

00:14:30.000 --> 00:14:33.000
2.887 minus

00:14:33.000 --> 00:14:35.000
0.125 whatever that difference is that's

00:14:35.000 --> 00:14:39.000
the duration of the sentence in seconds
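
NOTE
Illustrative sketch, not part of the spoken lecture: the sentence duration is the end time minus the start time, both read off the Praat editor. The function name is hypothetical; the two times are the ones measured in the video for the first file.
```python
def sentence_duration(start_s: float, end_s: float) -> float:
    """Return the duration in seconds between the speech onset and offset times."""
    if end_s <= start_s:
        raise ValueError("end time must be later than start time")
    return end_s - start_s
# Onset 0.125 s and offset 2.887 s, as measured in the video.
duration_s = sentence_duration(0.125, 2.887)
print(round(duration_s, 3))
```
Rounding to three decimal places matches the precision used for the measurements in the video.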

00:14:39.000 --> 00:14:40.000
if you want to figure out the rate of

00:14:40.000 --> 00:14:44.000
speech in syllables per second um we'll

00:14:44.000 --> 00:14:46.000
do it let's do the um pitch variation

00:14:46.000 --> 00:14:49.000
calculations so let me so we will in a

00:14:49.000 --> 00:14:51.000
second I will um remind you how to do

00:14:51.000 --> 00:14:53.000
the uh syllables per second calculation

00:14:53.000 --> 00:14:56.000
but while we're in the um Praat window

00:14:56.000 --> 00:14:59.000
here let's look at pitch variations so

00:14:59.000 --> 00:15:02.000
if we want to see fundamental frequency

00:15:02.000 --> 00:15:04.000
we go under the pitch menu and select

00:15:04.000 --> 00:15:07.000
show pitch so what you're seeing down

00:15:07.000 --> 00:15:11.000
here in blue is the um Praat estimate of

00:15:11.000 --> 00:15:14.000
what the fundamental frequency is at

00:15:14.000 --> 00:15:16.000
every moment in time and this uses the

00:15:16.000 --> 00:15:18.000
autocorrelation algorithm that's in

00:15:18.000 --> 00:15:21.000
section 2.3 of the reading for module 2

00:15:21.000 --> 00:15:24.000
and that we talked about in class and

00:15:24.000 --> 00:15:27.000
this autocorrelation uh F0 is a

00:15:27.000 --> 00:15:30.000
perfectly fine estimate to use for this

00:15:30.000 --> 00:15:32.000
particular exercise there are going to

00:15:32.000 --> 00:15:36.000
be some cases where it gets it wrong and

00:15:36.000 --> 00:15:37.000
um we'll look at one in just a second

00:15:37.000 --> 00:15:40.000
but let's finish um number one first

00:15:40.000 --> 00:15:43.000
let's finish POV_001 first so this is

00:15:43.000 --> 00:15:45.000
showing you the estimate of the

00:15:45.000 --> 00:15:47.000
fundamental frequency and if you want to

00:15:47.000 --> 00:15:49.000
get the since we're looking at

00:15:49.000 --> 00:15:51.000
fundamental frequency range we want to

00:15:51.000 --> 00:15:53.000
get the Min and the max the minimum and

00:15:53.000 --> 00:15:54.000
the maximum because the fundamental

00:15:54.000 --> 00:15:56.000
frequency range is just going to be the

00:15:56.000 --> 00:15:58.000
difference between those the maximum

00:15:58.000 --> 00:16:00.000
subtracted by the Min the minimum

00:16:00.000 --> 00:16:03.000
subtracted from the maximum and Praat

00:16:03.000 --> 00:16:04.000
already makes that very easy for you

00:16:04.000 --> 00:16:07.000
because under pitch we can just select

00:16:07.000 --> 00:16:09.000
get maximum pitch oops we have to select

00:16:09.000 --> 00:16:11.000
something actually with our cursor so

00:16:11.000 --> 00:16:14.000
here we'll select the sentence now it

00:16:14.000 --> 00:16:17.000
should work get maximum pitch here it

00:16:17.000 --> 00:16:22.000
gave us a maximum of 165 hertz F0

00:16:22.000 --> 00:16:24.000
I keep Praat says pitch I should be

00:16:24.000 --> 00:16:26.000
correcting it and saying F0 instead of

00:16:26.000 --> 00:16:29.000
pitch fundamental frequency F0

00:16:29.000 --> 00:16:31.000
we perceive it as pitch but actual pitch

00:16:31.000 --> 00:16:32.000
is slightly different and outside the

00:16:32.000 --> 00:16:36.000
scope of this course so 165 Hertz is the

00:16:36.000 --> 00:16:39.000
maximum if we go into that same

00:16:39.000 --> 00:16:41.000
selection and get the

00:16:41.000 --> 00:16:44.000
minimum that's 80 so our pitch range is

00:16:44.000 --> 00:16:53.000
just going to be 165 minus

00:16:53.000 --> 00:16:55.000
80 okay sorry I had a little cough there

00:16:55.000 --> 00:16:57.000
so I had to mute it all right so we've

00:16:57.000 --> 00:17:00.000
got our pitch maximum and pitch minimum

00:17:00.000 --> 00:17:02.000
and from that we can get our pitch range

00:17:02.000 --> 00:17:06.000
in this case uh it was 165 minus 80 um
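
NOTE
Illustrative sketch, not part of the spoken lecture: the F0 range is the maximum F0 minus the minimum F0 over the selected sentence. The function name is hypothetical; the two values are the ones Praat reported in the video for the first file.
```python
def f0_range_hz(min_f0_hz: float, max_f0_hz: float) -> float:
    """Return the F0 range in Hz: the minimum subtracted from the maximum."""
    if max_f0_hz < min_f0_hz:
        raise ValueError("maximum F0 must not be below minimum F0")
    return max_f0_hz - min_f0_hz
# Minimum 80 Hz and maximum 165 Hz, as reported by Praat in the video.
print(f0_range_hz(min_f0_hz=80, max_f0_hz=165))
```
The result is only as trustworthy as the two inputs, so a pitch-tracking error in either value carries straight into the range.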

00:17:06.000 --> 00:17:09.000
to get our rate in syllables per second

00:17:09.000 --> 00:17:11.000
we have our entire sentence duration

00:17:11.000 --> 00:17:13.000
which um I forgot what that is but it's

00:17:13.000 --> 00:17:17.000
2 point something 2.7 something um and

00:17:17.000 --> 00:17:19.000
because we're doing this in syllables

00:17:19.000 --> 00:17:21.000
per second we and there are 12 syllables

00:17:21.000 --> 00:17:22.000
in the sentence we want to take those 12

00:17:22.000 --> 00:17:24.000
syllables and divide by that duration that's

00:17:24.000 --> 00:17:27.000
going to give us the average um

00:17:27.000 --> 00:17:29.000
syllables per second average rate of

00:17:29.000 --> 00:17:32.000
speech in syllables per second and I
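
NOTE
Illustrative sketch, not part of the spoken lecture: a rate in syllables per second is the number of syllables divided by the sentence duration in seconds. The function name is hypothetical, and the 3.0 s duration below is a made-up value chosen for easy arithmetic, not a measurement from any of the files.
```python
SYLLABLES_IN_SENTENCE = 12  # syllable count given in the video for this sentence
def syllables_per_second(n_syllables: int, duration_s: float) -> float:
    """Return the average speech rate as syllables divided by duration in seconds."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return n_syllables / duration_s
# Hypothetical duration of 3.0 s, used only to show the arithmetic.
print(syllables_per_second(SYLLABLES_IN_SENTENCE, 3.0))  # 4.0
```
Because every speaker says the same 12 syllables, a shorter measured duration always gives a higher rate.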

00:17:32.000 --> 00:17:33.000
don't want to do those calculations for

00:17:33.000 --> 00:17:34.000
you I mean they're kind of um

00:17:34.000 --> 00:17:37.000
straightforward calculations but um I

00:17:37.000 --> 00:17:38.000
want to make sure that you get as much

00:17:38.000 --> 00:17:40.000
practice as possible in this exercise so

00:17:40.000 --> 00:17:41.000
I'm not going to actually do those

00:17:41.000 --> 00:17:43.000
calculations for you plus if I try and

00:17:43.000 --> 00:17:44.000
do them in my head we'll end up with

00:17:44.000 --> 00:17:50.000
another 145 times 2 uh event all right so

00:17:50.000 --> 00:17:52.000
that's very straightforward one there

00:17:52.000 --> 00:17:53.000
was um

00:17:53.000 --> 00:17:57.000
nothing weird about the F0

00:17:57.000 --> 00:18:00.000
estimate let's look at 02 here

00:18:00.000 --> 00:18:03.000
because if I remember correctly the um

00:18:03.000 --> 00:18:05.000
pitch tracker the F0 estimator

00:18:05.000 --> 00:18:07.000
doesn't do as good a job for this person

00:18:07.000 --> 00:18:09.000
as for

00:18:09.000 --> 00:18:12.000
01 right so first of all this person has

00:18:12.000 --> 00:18:15.000
a much um lower F0 right and if we

00:18:15.000 --> 00:18:16.000
listen to this we can hear it as lower

00:18:16.000 --> 00:18:19.000
pitch "an ample wall needs at least four

00:18:19.000 --> 00:18:22.000
layers of bricks" what if we select this

00:18:22.000 --> 00:18:25.000
whole what if we do the very same

00:18:25.000 --> 00:18:27.000
thing that we did before um select the

00:18:27.000 --> 00:18:30.000
entire um sentence and go in and get minimum

00:18:30.000 --> 00:18:35.000
pitch here it's about 75 which is quite

00:18:35.000 --> 00:18:38.000
low um but if we get maximum pitch look

00:18:38.000 --> 00:18:42.000
it gives us 498 so 498 is right at the

00:18:42.000 --> 00:18:45.000
very top of what can be estimated by Praat

00:18:45.000 --> 00:18:47.000
and if we look that 498 is just a little

00:18:47.000 --> 00:18:50.000
jump here in uh the middle of the

00:18:50.000 --> 00:18:53.000
sentence and that's probably a well

00:18:53.000 --> 00:18:55.000
probably why am I being so coy um that

00:18:55.000 --> 00:18:57.000
is definitely a case where the pitch

00:18:57.000 --> 00:19:01.000
tracker has failed um POV2 did not

00:19:01.000 --> 00:19:04.000
suddenly have a jump in F0 up to 500

00:19:04.000 --> 00:19:07.000
Hz I mean 500 Hz is very high kind of

00:19:07.000 --> 00:19:09.000
toward the limits of what a human can

00:19:09.000 --> 00:19:11.000
produce as a fundamental frequency so if

00:19:11.000 --> 00:19:14.000
it really was a 500 Hz fundamental

00:19:14.000 --> 00:19:17.000
frequency it would sound very jarring

00:19:17.000 --> 00:19:19.000
instead this is just a case where the

00:19:19.000 --> 00:19:22.000
fundamental frequency estimate is wrong

00:19:22.000 --> 00:19:24.000
so what are you gonna do in cases like

00:19:24.000 --> 00:19:26.000
this well what I would like you to do is

00:19:26.000 --> 00:19:29.000
to go in in cases like this and find

00:19:29.000 --> 00:19:32.000
where you think the actual maximum the

00:19:32.000 --> 00:19:35.000
maximum correctly tracked pitch is I think

00:19:35.000 --> 00:19:38.000
in this case the lower bound the 75

00:19:38.000 --> 00:19:41.000
Hertz I trust that right um but what is

00:19:41.000 --> 00:19:44.000
the maximum well if we look at you know

00:19:44.000 --> 00:19:46.000
where there are not jumps these two

00:19:46.000 --> 00:19:48.000
things here are these big jumps that are

00:19:48.000 --> 00:19:50.000
not consistent with the rest you know

00:19:50.000 --> 00:19:53.000
probably the maximum non-jumped F0 is

00:19:53.000 --> 00:19:58.000
here around 134 Hertz so for this person

00:19:58.000 --> 00:20:00.000
you know it's a little bit tougher

00:20:00.000 --> 00:20:01.000
because there's going to be a judgment

00:20:01.000 --> 00:20:03.000
call that you have to make I would um

00:20:03.000 --> 00:20:08.000
count this person's maximum F0 as 134

00:20:08.000 --> 00:20:10.000
their minimum is 75 and I would

00:20:10.000 --> 00:20:13.000
calculate the difference of that as the

00:20:13.000 --> 00:20:15.000
um the pitch range for this person the

00:20:15.000 --> 00:20:18.000
F0 range um just to give you another
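
NOTE
A rough sketch of one way to cross-check the judgment call described above. The walkthrough
does this by eye, so this is not the instructor's method: it simply drops frames that sit far
from the median F0. The function name, the max_ratio cutoff of 1.6, and every track value
other than 75, 134, and 498 Hz are hypothetical choices made for illustration.
```python
from statistics import median
#
def plausible_f0_extremes(f0_hz, max_ratio=1.6):
    """Return (min, max) F0 after discarding frames far from the median.
    Unvoiced frames are expected as 0 or None and are ignored."""
    voiced = [f for f in f0_hz if f and f > 0]
    if not voiced:
        raise ValueError("no voiced frames in track")
    mid = median(voiced)
    # Keep only frames within a factor of max_ratio of the median (assumed cutoff).
    kept = [f for f in voiced if mid / max_ratio <= f <= mid * max_ratio]
    return min(kept), max(kept)
#
# Made-up track containing one tracker jump to 498 Hz.
track = [0, 75, 82, 95, 110, 498, 134, 120, 101, 90, 0]
f0_min, f0_max = plausible_f0_extremes(track)
f0_range = f0_max - f0_min
```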

00:20:18.000 --> 00:20:20.000
example of finding the beginning and the

00:20:20.000 --> 00:20:21.000
end of the

00:20:21.000 --> 00:20:23.000
sentence I'd put the beginning of the

00:20:23.000 --> 00:20:25.000
sentence here at around

00:20:25.000 --> 00:20:28.000
time1 seconds

00:20:28.000 --> 00:20:30.000
and the end which I promised you would

00:20:30.000 --> 00:20:32.000
be a little tougher or a little less

00:20:32.000 --> 00:20:38.000
straightforward I'd put that at about

00:20:38.000 --> 00:20:41.000
2.582 and remember once again the

00:20:41.000 --> 00:20:43.000
criteria that we're going to use in the

00:20:43.000 --> 00:20:45.000
grading gives you about 40 milliseconds

00:20:45.000 --> 00:20:47.000
of slop on either end so you don't have

00:20:47.000 --> 00:20:49.000
to worry about being perfect but be you

00:20:49.000 --> 00:20:50.000
know

00:20:50.000 --> 00:20:52.000
conscientious um so first time we've
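
NOTE
A small sketch of the 40 millisecond tolerance mentioned above. How the grading is actually
carried out is not stated in the video, so the function name, the reference time, and the
comparison itself are assumptions made for illustration only.
```python
def within_tolerance(marked_s: float, reference_s: float, tolerance_s: float = 0.040) -> bool:
    """True if a marked boundary is within the tolerance of a reference boundary."""
    return abs(marked_s - reference_s) <= tolerance_s
#
# Hypothetical reference boundary at 2.50 s.
close_enough = within_tolerance(2.52, 2.50)
too_far = within_tolerance(2.60, 2.50)
```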

00:20:52.000 --> 00:20:54.000
done this homework assignment uh it's

00:20:54.000 --> 00:20:56.000
you know there are 20 sentences that

00:20:56.000 --> 00:20:58.000
you're going to make two ratings of um

00:20:58.000 --> 00:21:01.000
perceptually and then two calculations

00:21:01.000 --> 00:21:03.000
of pitch range and um rate of

00:21:03.000 --> 00:21:05.000
speech in syllables per second um and

00:21:05.000 --> 00:21:07.000
you'll see in looking at the description

00:21:07.000 --> 00:21:08.000
of the assignment that um you're going

00:21:08.000 --> 00:21:10.000
to compare your ratings with the

00:21:10.000 --> 00:21:12.000
objective measures and answer a few

00:21:12.000 --> 00:21:15.000
interpretation questions as well and

00:21:15.000 --> 00:21:17.000
we'll talk about it more in class on

00:21:17.000 --> 00:21:19.000
September 26th just two days from now

00:21:19.000 --> 00:21:20.000
because I'm recording this Sunday

00:21:20.000 --> 00:21:22.000
afternoon all right so I'm going to stop

00:21:22.000 --> 00:21:26.000
sharing here and stop recording I hope

00:21:26.000 --> 00:21:27.000
this is

00:21:27.000 --> 00:21:28.000
useful bye