SAS Technical Screening Problem asked at CDC, Atlanta…
The following 3 datasets exist in a hypothetical system:
COUNTY
COUNTY_ID | STATE_NAME | COUNTY_NAME |
1 | Texas | Collin |
2 | Texas | Dallas |
3 | Georgia | DeKalb |
AGE_DISTRIBUTION
COUNTY_ID | AGE_0_10 | AGE_10_20 | AGE_20_40 | AGE_40_PLUS |
1 | 100 | 20 | 40 | 60 |
2 | 10 | 10 | 40 | 50 |
3 | 45 | 100 | 56 | 67 |
AGE_DISTRIBUTION_DESC
CATEGORY_NAME | CATEGORY_DESCRIPTION |
AGE_0_10 | < 10 years |
AGE_10_20 | Between 10 and 20 years |
AGE_20_40 | Between 20 and 40 years |
AGE_40_PLUS | > 40 years |
Requirement 1: Write a SAS program (with ODS) to use the datasets given above to generate HTML and MS EXCEL file that will look like the following (the files must have the 3 tables given below):
County Name
|
Age Distribution
| |
Category
|
Total Number
| |
Collin
|
< 10 years
|
100
|
Between 10 and 20 years
|
20
| |
Between 20 and 40 years
|
40
| |
> 40 years
|
60
| |
Dallas
|
< 10 years
|
10
|
Between 10 and 20 years
|
10
| |
Between 20 and 40 years
|
40
| |
> 40 years
|
50
| |
DeKalb
|
< 10 years
|
45
|
Between 10 and 20 years
|
100
| |
Between 20 and 40 years
|
56
| |
> 40 years
|
67
|
Category
|
County
| ||
Collin
|
Dallas
|
DeKalb
| |
< 10 years
|
100
|
10
|
45
|
Between 10 and 20 years
|
20
|
10
|
100
|
Between 20 and 40 years
|
40
|
40
|
56
|
> 40 years
|
60
|
50
|
67
|
Total
|
220
|
110
|
268
|
Category
|
State
| |
Texas
|
Georgia
| |
< 10 years
|
110
|
45
|
Between 10 and 20 years
|
30
|
100
|
Between 20 and 40 years
|
80
|
56
|
> 40 years
|
110
|
67
|
Total
|
330
|
268
|
Note: The color scheme and other style elements (bold, italics etc) MUST be incorporated in the output.
Requirement 2: With a different dataset structure as given below, produce the same HTML and MS EXCEL file result as given above:
COUNTY
COUNTY_ID | STATE_NAME | COUNTY_NAME |
1 | Texas | Collin |
2 | Texas | Dallas |
3 | Georgia | DeKalb |
AGE_DISTRIBUTION
COUNTY_ID | AGE_DISTR_CAT_ID | DISTRIBUTION_COUNT |
1 | 1.1 | 100 |
1 | 2.2 | 20 |
1 | 3.3 | 40 |
1 | 4.4 | 50 |
2 | 1.1 | 10 |
2 | 2.2 | 10 |
2 | 3.3 | 40 |
2 | 4.4 | 50 |
3 | 1.1 | 45 |
3 | 2.2 | 100 |
3 | 3.3 | 56 |
3 | 4.4 | 67 |
AGE_DISTRIBUTION_DESC
AGE_DISTR_CAT_ID | CATEGORY_DESCRIPTION |
1.1 | < 10 years |
2.2 | Between 10 and 20 years |
3.3 | Between 20 and 40 years |
4.4 | > 40 years |
What to submit: Please provide two separate self-contained SAS Programs (datasets and data in these datasets must be declared in the ‘DATA’ step of the programs, an external data source should not be used) which can be executed in the SAS console. The programs must generate a HTML file called ‘results.html’ and an MS Excel file called ‘results.xls’, the look and feel of which has been depicted earlier.
The following aspects must be kept in mind when developing the programs:
•Presentation: The generated HTML/MS Excel must resemble the output table structure given above.
•Program efficiency: The program must be as short and efficient as possible.
•In each program a macro variable must be used to determine whether to produce HTML output or MS-Excel output. We will alter the variable value (the possible values must be documented in the program) during execution to generate the desired output.
•Presentation: The generated HTML/MS Excel must resemble the output table structure given above.
•Program efficiency: The program must be as short and efficient as possible.
•In each program a macro variable must be used to determine whether to produce HTML output or MS-Excel output. We will alter the variable value (the possible values must be documented in the program) during execution to generate the desired output.