Quantitative Analysis - Elementary
Political Science 685
Lab 2
Due Tuesday, January 23
I. The following data set is an example of time-series data, as contrasted with cross-sectional data.  Enter the data into SPSS for Windows.  To do this, open SPSS for Windows.  On the screen, you will see the "Newdata" file.  To enter data, double-click on "var" and name the variable.  Then go to "Type..."  For y, m, and v, you will need to change the decimal setting to match the data.  Then enter the data.

t: time trend
y: year of presidential election
m: annual growth rate of real personal disposable income
p: cpi-based annual rate of inflation
u: annual unemployment rate
v: Democratic share of the two-party vote

 t   y      m      p    u    v   
=================================
 1 1932 -13.8936 -9.9 23.6 0.592 
 2 1936  11.2831  1.5 16.9 0.625 
 3 1940   5.3778  0.7 14.6 0.550 
 4 1944   2.6156  1.7  1.2 0.538 
 5 1948   3.7344  8.1  3.8 0.523 
 6 1952  1.33760  1.9  3.0 0.446 
 7 1956  2.92265  1.5  4.1 0.423 
 8 1960  0.14933  1.7  5.5 0.501 
 9 1964  5.47193  1.3  5.2 0.613 
10 1968  2.86171  4.2  3.6 0.496 
11 1972  2.88392  3.2  5.6 0.382 
12 1976  2.58274  5.8  7.7 0.511 
13 1980 -1.08862 13.5  7.1 0.447 
14 1984  4.92447  4.3  7.5 0.409 
15 1988  3.34549  4.0  5.5 0.461 
 
 

(1)  Provide descriptive statistics for m, p, u, and v.  To do this, select Statistics.  Then, select Summarize.  Then, select Descriptives.  Choose the descriptive statistics that are appropriate and print the output.
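
As a cross-check on the SPSS Descriptives output, the same summaries can be sketched outside SPSS. The Python snippet below is only an illustration and not part of the assignment. The values are copied from the table above; the choice of mean, sample standard deviation (n - 1 divisor), minimum, and maximum is an assumption about which statistics count as "appropriate," and the function name describe is hypothetical.

```python
import statistics

# Columns m, p, u, v copied from the data table above (1932-1988).
data = {
    "m": [-13.8936, 11.2831, 5.3778, 2.6156, 3.7344, 1.33760, 2.92265,
          0.14933, 5.47193, 2.86171, 2.88392, 2.58274, -1.08862, 4.92447,
          3.34549],
    "p": [-9.9, 1.5, 0.7, 1.7, 8.1, 1.9, 1.5, 1.7, 1.3, 4.2, 3.2, 5.8,
          13.5, 4.3, 4.0],
    "u": [23.6, 16.9, 14.6, 1.2, 3.8, 3.0, 4.1, 5.5, 5.2, 3.6, 5.6, 7.7,
          7.1, 7.5, 5.5],
    "v": [0.592, 0.625, 0.550, 0.538, 0.523, 0.446, 0.423, 0.501, 0.613,
          0.496, 0.382, 0.511, 0.447, 0.409, 0.461],
}


def describe(values):
    """Return n, mean, sample standard deviation (n - 1 divisor), min, max."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


summary = {name: describe(values) for name, values in data.items()}

for name, stats in summary.items():
    print(name, {key: round(val, 4) for key, val in stats.items()})
```

If the numbers printed here differ from the SPSS output, the most likely cause is a data-entry error in one of the two places.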

(2)  Graph the variables m, p, u, and v over time.  Select Graphs and then Sequence.  Make four different graphs, one for each variable, using y (year of presidential election) for the time axis.  Print these graphs.
 

II.  <chapter 4, especially 4-3, will be helpful in answering this question>
Three assumptions are made about American presidential elections:
<1> Only two parties, the Democrats and the Republicans, compete for control of the presidency.
<2> The probability that a change of party control occurs in any election is a constant, p (unrelated to the inflation variable p in Part I).
<3> The result of an election does not depend on the result of any previous election.

If we use '1' to represent the occurrence of a change of party control and '0' for no change, the "event data" from 1828 to 1988 is the following series:
  10011110100000111100010100100001010101100

(1) Based on the data provided, what would be a reasonable estimate of p?
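
Under assumptions <2> and <3>, each election is an independent trial with the same probability of a change, so one reasonable estimate of p is the relative frequency of 1's in the series. A minimal Python sketch of that count follows; the names series, n_elections, n_changes, and p_hat are illustrative only.

```python
# Event series from the text: '1' = change of party control, '0' = no change.
series = "10011110100000111100010100100001010101100"

n_elections = len(series)        # number of elections in the series
n_changes = series.count("1")    # number of changes of party control

# Relative frequency of changes, used here as the estimate of p.
p_hat = n_changes / n_elections

print(f"{n_changes} changes in {n_elections} elections, p_hat = {p_hat:.3f}")
```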

(2) A random variable, S, is defined as the number of changes of party control (i.e., the number of 1's) in a four-election period. What theoretical distribution does S follow? Write down the formula for the distribution.
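
Because the three assumptions describe four independent trials with a constant probability of a change, S can be treated as a binomial count. The sketch below evaluates that formula in Python. It assumes p is set to the relative frequency of 1's in the event series (18 of 41); any other estimate can be substituted, and the function name binomial_pmf is hypothetical.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Pr(S = k) for S ~ Binomial(n, p): C(n, k) * p**k * (1 - p)**(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 4          # elections in the period
p = 18 / 41    # assumed value of p: relative frequency of 1's in the series

pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
for k, prob in enumerate(pmf):
    print(f"Pr(S = {k}) = {prob:.4f}")
```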

(3) A second random variable, D, is defined as the "duration" of party control, i.e., the number of consecutive terms a party (whether it is the Democrats or the Republicans) is in control. Compute the theoretical probabilities Pr(D) for D=1, 2, 3, 4, 5, and 6+ (i.e., a duration of six terms or more).
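
One way to reason about D: a spell of control lasts exactly d terms when d - 1 elections with no change are followed by one election with a change, and it lasts six or more terms when the first five elections after it begins all show no change. The Python sketch below evaluates those probabilities. It assumes p = 18/41 (the relative frequency of 1's in the series), and the function name duration_probs is hypothetical.

```python
def duration_probs(p, max_d=6):
    """Pr(D = d) for d = 1..max_d - 1, plus the tail Pr(D >= max_d).

    A spell lasts exactly d terms when d - 1 elections with no change
    are followed by a change: (1 - p)**(d - 1) * p.
    """
    probs = {d: (1 - p) ** (d - 1) * p for d in range(1, max_d)}
    # Tail category: no change in the first max_d - 1 elections of the spell.
    probs[f"{max_d}+"] = (1 - p) ** (max_d - 1)
    return probs


p = 18 / 41   # assumed value of p: relative frequency of 1's in the series
probs = duration_probs(p)
for d, prob in probs.items():
    print(f"Pr(D = {d}) = {prob:.4f}")
```

The six categories are exhaustive, so the printed probabilities should sum to 1, which is a quick check on hand calculations.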

(4) As you can check from the data provided, there were 18 changes in party control of the presidency in the elections from 1828 to 1988. Write down the duration of each of the 18 observed periods of party control and present the durations as a frequency distribution using absolute frequencies. Also, compute the theoretical (or expected) absolute frequency distribution on the basis of your results from (3). Compare the empirical distribution with the theoretical distribution. Do you think the theoretical distribution fits the empirical distribution well?
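
The tabulation can be checked with a short script. The Python sketch below reads the durations off the event series and sets them beside expected frequencies of the form (number of spells) x (theoretical probability). Two assumptions are made here and are not stated in the assignment: the final spell, which is still in progress when the series ends, is counted at its observed length, and p is estimated by the relative frequency of 1's.

```python
from collections import Counter

series = "10011110100000111100010100100001010101100"

# Each '1' starts a new spell of party control; it runs until the next '1'.
starts = [i for i, ch in enumerate(series) if ch == "1"]
# Assumption: the last spell is cut off at the end of the series (1988).
ends = starts[1:] + [len(series)]
durations = [end - start for start, end in zip(starts, ends)]

# Observed absolute frequencies, pooling durations of six or more terms.
observed = Counter(min(d, 6) for d in durations)

# Expected absolute frequencies: number of spells times Pr(D = d).
p = len(starts) / len(series)
n_spells = len(durations)
expected = {d: n_spells * (1 - p) ** (d - 1) * p for d in range(1, 6)}
expected[6] = n_spells * (1 - p) ** 5   # the "6+" category

for d in range(1, 7):
    label = "6+" if d == 6 else str(d)
    print(f"D = {label:>2}: observed {observed[d]:2d}, expected {expected[d]:5.2f}")
```

With only 18 spells, several expected frequencies are small, so the comparison is better made informally than with a formal goodness-of-fit test.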
 

III. Sampling Distribution

(1) Consider the respondents of the 1984 Gallup Survey who voted for either Reagan or Mondale as a "population." Compute the proportion of the population who voted for Reagan. Note that you need to use the following SPSS command to select cases to be included in the "active file." As you should find out after running this job, the population has a size of N=1085.  In the data set, variable 4 (v004) is the Presidential vote choice variable, with 1=Reagan and 2=Mondale.

SYNTAX:
select if (v004=1 or v004= 2).
(Alternatively, you can use:  select if any (V004,1,2).)
(Both of these merely select which cases are to be used; do not run the line alone.)
frequencies variables = v004.

Running the select if line together with the frequencies line will provide the information needed for III (1).

(2) Draw 25 random samples of size n=12 from this population. For each sample, compute the number of respondents who voted for Reagan. In doing this, you need to repeat the following block of commands 25 times. (To avoid typing, you can select Edit, Copy, and then Paste the three lines of syntax 24 times.)  Remember that these three lines need to follow the syntax line that tells SPSS to select only the respondents who voted for either Reagan or Mondale.

SYNTAX:
temporary.
sample 12 from 1085.
frequencies variables=v004.
 

(3) Provide the frequency distribution of the 25 numbers you get in (2).  (This may be done by hand.)
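
For readers who want to see the shape of the exercise before running it in SPSS, the sampling and tallying steps can be imitated in Python. This is a sketch only and does not replace the SPSS runs. The count of Reagan voters (REAGAN = 640) is a hypothetical placeholder, not a value from the Gallup data, and should be replaced by the figure from your own frequencies output; the seed is arbitrary.

```python
import random
from collections import Counter

N = 1085          # population size stated in the text
REAGAN = 640      # HYPOTHETICAL count of Reagan voters; use your own result
n = 12            # sample size
n_samples = 25    # number of samples to draw

# 1 = Reagan, 2 = Mondale, matching the coding of v004.
population = [1] * REAGAN + [2] * (N - REAGAN)

rng = random.Random(685)   # arbitrary seed so the run is repeatable

# Number of Reagan voters in each sample; each sample is drawn without
# replacement from the full population.
counts = [rng.sample(population, n).count(1) for _ in range(n_samples)]

# Frequency distribution of the 25 counts.
freq = Counter(counts)
for s in sorted(freq):
    print(f"S = {s:2d}: {freq[s]} sample(s)")
```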

(4) What is the theoretical distribution of the random variable S = the number of respondents who voted for Reagan in a sample of 12?  Compare your empirical distribution in (3) with this theoretical distribution.
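
A hedged sketch of the comparison: if the 12 draws are treated as independent, S is binomial with n = 12 and success probability equal to the population proportion. Because each sample is drawn without replacement from a finite population, the exact distribution is hypergeometric, although with N = 1085 the two are very close. The Python snippet below computes both. REAGAN = 640 is again a hypothetical placeholder for the count found in III (1), and the function names are illustrative.

```python
from math import comb

N = 1085        # population size stated in the text
REAGAN = 640    # HYPOTHETICAL count of Reagan voters; use your own result
n = 12          # sample size
pi = REAGAN / N # population proportion voting for Reagan


def binomial_pmf(k):
    """Pr(S = k) if the 12 draws are treated as independent."""
    return comb(n, k) * pi**k * (1 - pi) ** (n - k)


def hypergeometric_pmf(k):
    """Pr(S = k) for draws without replacement from the finite population."""
    return comb(REAGAN, k) * comb(N - REAGAN, n - k) / comb(N, n)


binom = [binomial_pmf(k) for k in range(n + 1)]
hyper = [hypergeometric_pmf(k) for k in range(n + 1)]

for k in range(n + 1):
    print(f"S = {k:2d}: binomial {binom[k]:.4f}, "
          f"hypergeometric {hyper[k]:.4f}, "
          f"expected in 25 samples {25 * binom[k]:.2f}")
```

Multiplying each probability by 25 gives expected absolute frequencies that can be set beside the hand tally from (3).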