May 22, 2010

SAS Project 4: Govt....SAS Technical Problem to solve....DIY

SAS Technical Screening Problem asked at CDC, Atlanta…
The following 3 datasets exist in a hypothetical system:
COUNTY
COUNTY_IDSTATE_NAMECOUNTY_NAME
1TexasCollin
2TexasDallas
3GeorgiaDeKalb
AGE_DISTRIBUTION
COUNTY_IDAGE_0_10AGE_10_20AGE_20_40AGE_40_PLUS
1100204060
210104050
3451005667
AGE_DISTRIBUTION_DESC
CATEGORY_NAMECATEGORY_DESCRIPTION
AGE_0_10< 10 years
AGE_10_20Between 10 and 20 years
AGE_20_40Between 20 and 40 years
AGE_40_PLUS> 40 years
Requirement 1: Write a SAS program (with ODS) to use the datasets given above to generate HTML and MS EXCEL file that will look like the following (the files must have the 3 tables given below):
County Name
Age Distribution
Category
Total Number
Collin
< 10 years
100
Between 10 and 20 years
20
Between 20 and 40 years
40
> 40 years
60
Dallas
< 10 years
10
Between 10 and 20 years
10
Between 20 and 40 years
40
> 40 years
50
DeKalb
< 10 years
45
Between 10 and 20 years
100
Between 20 and 40 years
56
> 40 years
67
Category
County
Collin
Dallas
DeKalb
< 10 years
100
10
45
Between 10 and 20 years
20
10
100
Between 20 and 40 years
40
40
56
> 40 years
60
50
67
Total
220
110
268
Category
State
Texas
Georgia
< 10 years
110
45
Between 10 and 20 years
30
100
Between 20 and 40 years
80
56
> 40 years
110
67
Total
330
268
Note: The color scheme and other style elements (bold, italics etc) MUST be incorporated in the output.
Requirement 2: With a different dataset structure as given below, produce the same HTML and MS EXCEL file result as given above:
COUNTY
COUNTY_IDSTATE_NAMECOUNTY_NAME
1TexasCollin
2TexasDallas
3GeorgiaDeKalb
AGE_DISTRIBUTION
COUNTY_IDAGE_DISTR_CAT_IDDISTRIBUTION_COUNT
11.1100
12.220
13.340
14.450
21.110
22.210
23.340
24.450
31.145
32.2100
33.356
34.467
AGE_DISTRIBUTION_DESC
AGE_DISTR_CAT_IDCATEGORY_DESCRIPTION
1.1< 10 years
2.2Between 10 and 20 years
3.3Between 20 and 40 years
4.4> 40 years
What to submit: Please provide two separate self-contained SAS Programs (datasets and data in these datasets must be declared in the ‘DATA’ step of the programs, an external data source should not be used) which can be executed in the SAS console. The programs must generate a HTML file called ‘results.html’ and an MS Excel file called ‘results.xls’, the look and feel of which has been depicted earlier.
The following aspects must be kept in mind when developing the programs:
•Presentation: The generated HTML/MS Excel must resemble the output table structure given above.
•Program efficiency: The program must be as short and efficient as possible.
•In each program a macro variable must be used to determine whether to produce HTML output or MS-Excel output. We will alter the variable value (the possible values must be documented in the program) during execution to generate the desired output.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.