Nov 9, 2009

SAS Interview Questions and Answers(1)

What SAS statements would you code to read an external raw data file to a DATA step?
We use these SAS statements:
FILENAME – assigns a fileref to the location of the external file (optional; INFILE can also name the file directly)
INFILE – identifies the external file to read with an INPUT statement
INPUT – describes how the values in each record are arranged and assigns them to variables.
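As a minimal sketch of how the three statements fit together, the step below first writes a tiny raw file and then reads it back. The fileref `rawin`, the data set name, and the variables are hypothetical, and the TEMP device is used only so the example does not depend on a real path.

```sas
/* Hypothetical fileref; TEMP asks SAS for a scratch file so the sketch is self-contained */
filename rawin temp;

/* Create a small raw file to read back */
data _null_;
   file rawin;
   put 'Alice 34';
   put 'Bob   27';
run;

/* Read the raw file into a SAS data set */
data work.people;
   infile rawin;        /* identify the external file by its fileref */
   input Name $ Age;    /* list input: one character and one numeric variable */
run;

proc print data=work.people;
run;
```

In real code the FILENAME statement would normally point at an existing file path instead of TEMP.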
How do you read in the variables that you need?
Use an INPUT statement with column and line pointer controls, informats, and length specifiers, naming only the variables you need.
Are you familiar with special input delimiters? How are they used?
DLM= (DELIMITER=) and DSD are INFILE statement options that control how LIST input separates values.
DELIMITER= delimiter(s)
specifies an alternate delimiter (other than a blank) to be used for LIST input
DSD (delimiter-sensitive data)
specifies that when data values are enclosed in quotation marks, delimiters within the value be treated as character data. The DSD option changes how SAS treats delimiters when you use LIST input and sets the default delimiter to a comma. When you specify DSD, SAS treats two consecutive delimiters as a missing value and removes quotation marks from character values
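The two options can be sketched with in-stream data as follows. The data set names, variables, and values are made up for illustration.

```sas
/* DSD: comma is the default delimiter, quotes are removed,
   and two consecutive commas mean a missing value */
data work.dsd_demo;
   infile datalines dsd;
   input Name :$20. City :$20. Amount;
   datalines;
"Smith, John",Boston,100
Lee,,250
;
run;

/* DLM=: use a vertical bar instead of a blank as the delimiter */
data work.dlm_demo;
   infile datalines dlm='|';
   input Name :$20. City :$20. Amount;
   datalines;
Garcia|Denver|75
;
run;

proc print data=work.dsd_demo; run;
proc print data=work.dlm_demo; run;
```

In the first step the comma inside the quoted name is kept as part of the value, and City is missing for the second record.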
If reading a variable length file with fixed input, how would you prevent SAS from reading the next record if the last variable didn’t have a value?
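Use the MISSOVER or TRUNCOVER option in the INFILE statement. For fixed (column or formatted) input, TRUNCOVER is usually the safer choice, because MISSOVER sets a value to missing when the record ends before the field's full width.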
MISSOVER
prevents an INPUT statement from reading a new input data record if it does not find values in the current input line for all the variables in the statement. When an INPUT statement reaches the end of the current input data record, variables without any values assigned are set to missing.
TRUNCOVER
overrides the default behavior of the INPUT statement when an input data record is shorter than the INPUT statement expects. By default, the INPUT statement automatically reads the next input data record. TRUNCOVER enables you to read variable-length records when some records are shorter than the INPUT statement expects. Variables without any values assigned are set to missing.
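A rough sketch of the difference, assuming a host where PUT writes variable-length records (the usual default on Windows and UNIX). The fileref `shortrec` and the data set names are hypothetical. In-stream DATALINES are not used here because those records are padded to a fixed length, which hides the effect.

```sas
/* Hypothetical fileref; TEMP requests a scratch file so the sketch is self-contained */
filename shortrec temp;

/* Write two records; the second ends before the Value field is full */
data _null_;
   file shortrec;
   put '1001 12345';
   put '1002 12';
run;

/* MISSOVER: the short field is set to missing; the next record is not read */
data work.with_missover;
   infile shortrec missover;
   input ID $4. +1 Value 5.;
run;

/* TRUNCOVER: the partial field is read as far as the record goes */
data work.with_truncover;
   infile shortrec truncover;
   input ID $4. +1 Value 5.;
run;

proc print data=work.with_missover; run;
proc print data=work.with_truncover; run;
```

Comparing the two listings shows whether Value for ID 1002 was kept or set to missing.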
What is the difference between an informat and a format? Name three informats or formats.
INFORMAT statement – associates informats with variables.
An informat tells SAS how to read a value. It is used in INPUT statements, the INPUT function, and PROC SQL CREATE TABLE column definitions to convert raw text, or data that is not in a SAS format, into SAS values.
http://support.sas.com/onlinedoc/913/getDoc/en/lrdict.hlp/a000178244.htm
e.g. COMMAw.d, DATEw., DOLLARw.d, $VARYINGw.
FORMAT statement – associates formats with variables.
A format tells SAS how to write a value. It is used in the DATA step FORMAT and PUT statements, PROC SQL SELECT, and procedure FORMAT statements to output SAS data to a file, report, etc.
Many formats share a name with an informat (DATEw., for example); what differs is the direction: informats read values, formats write them.
e.g. DATEw., WORDDATEw., MMDDYYw.
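To illustrate the read/write distinction, here is a small sketch with made-up values: informats appear in the INPUT statement, formats in the FORMAT statement.

```sas
data work.fmt_demo;
   /* Informats: how the raw text is read into numeric SAS values */
   input Amount : comma10. Hired : date9.;
   /* Formats: how the stored values are displayed */
   format Amount dollar12.2 Hired worddate18.;
   datalines;
1,234.50 09NOV2009
;
run;

proc print data=work.fmt_demo;
run;
```

The stored values are plain numbers (an amount and a SAS date); only their display changes with the format.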
Name and describe three SAS functions that you have used, if any?
The most commonly used functions are:
Conversion functions – Input / Put
Truncation functions – int / ceil / floor
Character functions – Scan / substr / index / Left / trim / compress / cat / catx / upcase / lowcase
Arithmetic functions – Sum / abs
Attribute info functions – Attrn / length
Dataset – open / close / exist
Directory – dexist / dopen / dclose / dcreate / dinfo
File functions – fexist / fopen/ filename / fileref
SQL functions – coalesce / count / sum/ mean
Date functions – date / today / datdif / datepart / datetime / intck / mdy
Array functions – dim
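A short sketch that exercises a few of these functions in one DATA step. The variable names and values are invented for illustration, and the results are written to the log.

```sas
data _null_;
   full   = 'Smith, John';
   last   = scan(full, 1, ',');                  /* first comma-separated word */
   first  = strip(scan(full, 2, ','));           /* second word, blanks removed */
   label  = catx(' ', upcase(first), last);      /* join with a single blank */
   num    = input('1,234', comma8.);             /* character to numeric */
   txt    = put(num, dollar8.);                  /* numeric to character */
   months = intck('month', mdy(1, 15, 2009), mdy(11, 9, 2009));  /* month boundaries crossed */
   put last= first= label= num= txt= months=;
run;
```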
How would you code the criteria to restrict the output to be produced?
The question is ambiguous, so the answer depends on what the interviewer means:
Global statement – options obs=n;
Dataset options – obs=
Proc SQL – NOPRINT option to suppress printed output / inobs= , outobs= to limit rows in an SQL select
Proc datasets – NOLIST option
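A few of these options can be sketched as follows. The example assumes the SASHELP.CLASS sample table is available, and it also shows a WHERE statement, which is another common way to restrict output and is added here as an assumption about what the interviewer may be after.

```sas
/* Data set option: read only the first 5 observations */
data work.first5;
   set sashelp.class (obs=5);
run;

/* PROC SQL: limit the number of rows returned by the query */
proc sql outobs=3;
   select name, age
   from sashelp.class;
quit;

/* WHERE statement: restrict output by a condition on the data */
proc print data=sashelp.class;
   where age > 13;
run;
```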
What is the purpose of the trailing @ and the @@? How would you use them?
Line-hold specifiers keep the pointer on the current input record when
  • a data record is read by more than one INPUT statement (trailing @)
  • one input line has values for more than one observation (double trailing @)
  • a record needs to be reread on the next iteration of the DATA step (double trailing @).
Use a single trailing @ to allow the next INPUT statement to read from the same record. Use a double trailing @ to hold a record for the next INPUT statement across iterations of the DATA step.
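As a minimal sketch of the double trailing @, the step below reads several observations from each input line. The data set and variable names are hypothetical.

```sas
data work.scores;
   /* @@ holds the record across iterations, so each iteration
      reads the next value from the same line until it is exhausted */
   input Score @@;
   datalines;
102 92 78 103
84 23 36 75
;
run;

proc print data=work.scores;
run;
```

Each value becomes its own observation, so the two lines of data produce eight observations.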
Normally, each INPUT statement in a DATA step reads a new data record into the input buffer. When you use a trailing @, the following occurs:
  • The pointer position does not change.
  • No new record is read into the input buffer.
  • The next INPUT statement for the same iteration of the DATA step continues to read the same record rather than a new one.
SAS releases a record held by a trailing @ when
  • a null INPUT statement executes: 
    input;
  • an INPUT statement without a trailing @ executes
  • the next iteration of the DATA step begins.
Normally, when you use a double trailing @ (@@), the INPUT statement for the next iteration of the DATA step continues to read the same record. SAS releases the record that is held by a double trailing @
  • immediately if the pointer moves past the end of the input record
  • immediately if a null INPUT statement executes: 
    input;
  • when the next iteration of the DATA step begins if an INPUT statement with a single trailing @ executes later in the DATA step:

    input @;
A record held by the double trailing at sign (@@) is not released until
  • the input pointer moves past the end of the record. Then the input pointer moves down to the next record.
>---+----10--V+-
102 92 78 103
84 23 36 75
  • an INPUT statement without a line-hold specifier executes.
    input ID $4. @@;
    .
    .
    input Department 5.;
The single trailing @
  • enables the next INPUT statement to read from the same record
  • releases the current record when a subsequent INPUT statement executes without a line-hold specifier.
Unlike the @@, the single @ also releases a record when control returns to the top of the DATA step for the next iteration.
data perm.sales97;
   infile data97 missover;
   input ID $4. @;
   do Quarter=1 to 4;
      input Sales : comma. @;
      output;
   end;
run;
Raw Data File Data97
>---V----10---+----20---+----30---+----40
0734 1,323.34 2,472.85 3,276.65 5,345.52
0943 1,908.34 2,560.38
1009 2,934.12 3,308.41 4,176.18 7,581.81
 
data perm.people (drop=type);
   infile census;
   retain Address;
   input type $1. @;
   if type='H' then input @3 Address $15.;
   if type='P';
   /* Gender is read from column 16 so that it does not overlap Age (columns 13-15) */
   input @3 Name $10. @13 Age 3. @16 Gender $1.;
run;
>V--+----10---+----
H 321 S. MAIN ST
P MARY E    21 F
P WILLIAM M 23 M
P SUSAN K    3 F
 
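data perm.residnts;
   infile census end=last;      /* added: end=last flags the final record */
   retain Address;
   input type $1. @;
   if type='H' then do;
      if _n_ > 1 then output;   /* write the previous household */
      Total=0;
      input Address $ 3-17;
   end;
   else if type='P' then total+1;
   if last then output;         /* added: write the final household */
run;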
>---+----10---+----20
H 321 S. MAIN ST
P MARY E    21 F
P WILLIAM M 23 M
P SUSAN K    3 F
H 324 S. MAIN ST
P THOMAS H  79 M
P WALTER S  46 M
P ALICE A   42 F
P MARYANN A 20 F
P JOHN S    16 M
H 325A S. MAIN ST
P JAMES L   34 M
P LIZA A    31 F
H 325B S. MAIN ST
P MARGO K   27 F
P WILLIAM R 27 M
 
P ROBERT W 1 M
 
Under what circumstances would you code a SELECT construct instead of IF statements?
The SELECT statement begins a SELECT group. SELECT groups contain WHEN statements that identify SAS statements that are executed when a particular condition is true. Use at least one WHEN statement in a SELECT group. An optional OTHERWISE statement specifies a statement to be executed if no WHEN condition is met. An END statement ends a SELECT group.
Null statements that are used in WHEN statements cause SAS to recognize a condition as true without taking further action. Null statements that are used in OTHERWISE statements prevent SAS from issuing an error message when all WHEN conditions are false.
Using SELECT-WHEN improves processing efficiency and readability in programs that need to check a series of conditions for the same variable.
Use IF-THEN/ELSE statements when there are only a few conditions to check.
Be careful with a subsetting IF statement (an IF without a THEN clause): it continues processing only those observations that meet the condition and discards the rest, which is not the same as conditional branching.
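A small sketch of a SELECT group that checks a series of values of one variable. It assumes the SASHELP.CLASS sample table; the output data set and the Group variable are hypothetical.

```sas
data work.grouped;
   set sashelp.class;
   length Group $8;
   select (sex);                        /* one variable, several values */
      when ('F') Group = 'Female';
      when ('M') Group = 'Male';
      otherwise  Group = 'Unknown';     /* avoids an error when no WHEN matches */
   end;
run;

proc print data=work.grouped;
   var name sex Group;
run;
```

The same logic written with IF-THEN/ELSE would repeat the variable name in every condition, which is where SELECT starts to read more clearly.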
What statements do you code to tell SAS that it is to write to an external file?
FILENAME / FILE / PUT
The FILENAME statement is an optional statement that specifies the location of the external file.
PUT Statement – Writes the variable values to the external file.
The FILE statement specifies the current output file for PUT statements in the DATA step.
When multiple FILE statements are present, the PUT statement builds and writes output lines to the file that was specified in the most recent FILE statement. If no FILE statement was specified, the PUT statement writes to the SAS log. The specified output file must be an external file, not a SAS data library, and it must be a valid access type.
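The three statements can be sketched together like this. The fileref `outrep` is hypothetical, TEMP is used only to keep the example self-contained, and SASHELP.CLASS is assumed to be available.

```sas
/* Hypothetical fileref; replace TEMP with a real path to keep the file */
filename outrep temp;

data _null_;
   set sashelp.class;
   file outrep;              /* PUT statements below now write to this file */
   put name $10. age 3.;     /* formatted output: one line per observation */
run;
```

Using `data _null_` avoids creating a data set when the only goal is to write the external file.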
If reading an external file to produce an external file, what is the shortcut to write that record without coding every single variable on the record?
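Use the _INFILE_ automatic variable in the PUT statement. A null INPUT statement loads each record into the input buffer without creating any variables, and PUT _INFILE_ writes the buffer to the output file unchanged.
filename some 'c:\cool.dat';
filename cool1 'c:\cool1.dat';
data _null_;
   infile some;
   input;           /* null INPUT: read the record into the buffer only */
   file cool1;
   put _infile_;    /* write the whole record as it was read */
run;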
