Classifying Obama/McCain Voters Using A Decision Tree

US National Election Voters

What data are we going to use?

The synthetic dataset contains the voting preferences of a number of voters in the 2008 US presidential election, together with some demographic information about each voter.

The data set consists of the following attributes:

Id. Unique ID of each row in the file.
Party. Political party affiliation of the voter.
1 Democratic
2 Republican
3 Independent

Ideology. Political ideology of the voter.
1 Liberal
2 Moderate
3 Conservative

Race. Race of the voter.
1 Black (African-American)
2 White (Caucasian)
3 Other

Gender. Gender of the voter.
1 Male
2 Female

Religion. Religion of the voter.
1 Protestant
2 Catholic
3 Other

Income. The income bracket (annual income) of the voter's family.
1 Less than $30,000
2 $30,000 - $49,999
3 $50,000 - $74,999
4 $75,000 - $99,999
5 $100,000 - $149,999
6 Over $150,000

Education. The highest level of education for the voter.
1 High school diploma or less
2 Undergraduate study/degree
3 Postgraduate study/degree

Age. The age group of the voter.
1 18 - 29
2 30 - 44
3 45 - 64
4 65 and over

Region. The geographic region where the voter lives.
1 Northeast: ME, NH, VT, MA, RI, CT, PA, NY, NJ, DE, MD, DC
2 South(east): VA, WV, KY, NC, SC, TN, GA, FL, AL, MS, LA, AR, TX, OK
3 Midwest: OH, IN, MI, IL, MO, IA, MN, WI, ND, SD, NE, KS
4 West: MT, ID, WA, AK, HI, WY, CO, UT, OR, NV, AZ, NM, CA

BushApproval. Indicates whether the voter approves of George W. Bush's performance as President of the US.
1 Approve
2 Disapprove
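
Because every attribute is stored as an integer code, it usually helps to turn the codes into labelled factors before modelling, so that a tree treats them as categories and its splits are readable. The following is a minimal sketch in base R. The helper name decode_voters and the three toy rows are made up for illustration; only the code-to-label mapping comes from the list above.

```r
# Code-to-label lookup, copied from the attribute list above.
codebook <- list(
  Party        = c("Democratic", "Republican", "Independent"),
  Ideology     = c("Liberal", "Moderate", "Conservative"),
  Race         = c("Black", "White", "Other"),
  Gender       = c("Male", "Female"),
  Religion     = c("Protestant", "Catholic", "Other"),
  Income       = c("Less than $30,000", "$30,000 - $49,999", "$50,000 - $74,999",
                   "$75,000 - $99,999", "$100,000 - $149,999", "Over $150,000"),
  Education    = c("High school diploma or less", "Undergraduate study/degree",
                   "Postgraduate study/degree"),
  Age          = c("18 - 29", "30 - 44", "45 - 64", "65 and over"),
  Region       = c("Northeast", "South(east)", "Midwest", "West"),
  BushApproval = c("Approve", "Disapprove")
)

# Hypothetical helper: replace integer codes with labelled factors
# for every column that appears in the codebook; other columns are left alone.
decode_voters <- function(df, codebook) {
  for (col in intersect(names(codebook), names(df))) {
    labels <- codebook[[col]]
    df[[col]] <- factor(df[[col]], levels = seq_along(labels), labels = labels)
  }
  df
}

# Three made-up rows, for illustration only (not taken from the dataset).
toy <- data.frame(
  Id           = 1:3,
  Party        = c(1, 3, 2),
  Gender       = c(2, 1, 2),
  BushApproval = c(2, 2, 1)
)

decoded <- decode_voters(toy, codebook)
str(decoded)
```

Income, Education and Age have a natural order, so passing ordered = TRUE to factor() for those columns is a reasonable alternative if the order should be preserved.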

Goal

Determine which candidate (Obama or McCain) each voter will vote for, using demographic data.

R Code
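
The original code for this section is not available, so what follows is only a sketch of how a classification tree could be fitted to data coded like the attributes above, using the rpart package. Several things here are assumptions: the target column name Vote does not appear in the attribute list, the data frame is randomly simulated so that the example runs on its own (it is not the real dataset), and the 70/30 train/test split is an arbitrary choice.

```r
library(rpart)  # recommended package shipped with standard R installations

set.seed(1)
n <- 500

# Simulated stand-in data: random codes in the ranges from the attribute list.
# With the real file you would load it instead (e.g. with read.csv)
# and drop the Id column before fitting.
voters <- data.frame(
  Party        = sample(1:3, n, replace = TRUE),
  Ideology     = sample(1:3, n, replace = TRUE),
  Race         = sample(1:3, n, replace = TRUE),
  Gender       = sample(1:2, n, replace = TRUE),
  Religion     = sample(1:3, n, replace = TRUE),
  Income       = sample(1:6, n, replace = TRUE),
  Education    = sample(1:3, n, replace = TRUE),
  Age          = sample(1:4, n, replace = TRUE),
  Region       = sample(1:4, n, replace = TRUE),
  BushApproval = sample(1:2, n, replace = TRUE)
)

# Invented toy rule so the tree has a pattern to find; "Vote" is an assumed name.
p_obama <- ifelse(voters$Party == 1, 0.9,
                  ifelse(voters$BushApproval == 2, 0.6, 0.1))
voters$Vote <- factor(ifelse(runif(n) < p_obama, "Obama", "McCain"))

# Treat the integer codes as categories rather than numbers.
predictors <- setdiff(names(voters), "Vote")
voters[predictors] <- lapply(voters[predictors], factor)

# Hold out 30% of the rows for testing.
test_idx <- sample(n, size = 0.3 * n)
train <- voters[-test_idx, ]
test  <- voters[test_idx, ]

# Fit a classification tree on all predictors and print its splits.
fit <- rpart(Vote ~ ., data = train, method = "class")
print(fit)

# Evaluate on the held-out rows.
pred <- predict(fit, newdata = test, type = "class")
confusion <- table(predicted = pred, actual = test$Vote)
print(confusion)
accuracy <- mean(pred == test$Vote)
```

Calling plot(fit) followed by text(fit) draws the fitted tree with base graphics, which is one way to produce a tree diagram from a model like this.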



Decision Tree Visualization 1

Decision Tree Visualization 2


Decision Tree Visualization 3
