SAS row comparisons and queries? or other solution needed. I’m stumped

sas

I have a situation where for each unique observation of casenum I would like to run varies queries and arithmatic operations between various observations of 'code' for that 'casenum' (see below). For example for casenum 1234567 I would like subtract data for code 0200 – code 0234 or 531 – 53. Please keep in mind that there are thousands of observations in this dataset. Is there an easy way to do this or to do row comparisons with the particular.

Please note casenum and code are character variables and data is a numeric variable

Here is an example of how the dataset is structured:

casenum  code  data

1234567  0123  4597  
1234567  0234    53  
1234567  0100   789  
1234567  0200   531  
1234567  0300   354  
1111112  0123    79  
1111112  0234    78  
1111112  0100    77   
1111112  0200  7954  
1111112  0300    35

Here is the logic although likely syntactically incorrect of what I am trying to do.

For code observations where casenum is the same, within those casenums
I would like it to determine, if data for code 0234 + data for code 0100 – data for code 0123 ne data for code 0200 then newvariable = 'YES'

In other words I'd like it to test if 53 + 789 – 4597 ne 531. after that and other similar kinds of tests run within casenum 1234567, I'd like it to move onto the next casenum, and run those same tests for that casenum.

Keep in mind this dataset has hundreds of thousands of observations in it.

Best Solution

I'm unclear on what your logic is for the subtraction part of the code, but for the selection of a group of rows I can suggest. At first glance I would obtain a list of distinct values for casenum.

proc sql;
select distinct casenum 
into :casenum_list separated by ' '
from dataset;
quit;

Now that you have a list of all distinct casenum values, I would iterate through the rows following whatever logic you need.

Possibly using another proc sql like:

%MACRO DOIT;
%LET COUNT=1;
%DO %UNTIL (%SCAN(&casenum_list,&COUNT) EQ);

%LET CASENUM_VAR=%SCAN(&casenum_list,&COUNT);

PROC SQL;
SELECT
<INSERT SOME SQL LOGIC HERE>
FROM 
DATASET
WHERE CASENUM=&CASENUM_VAR;
QUIT;

%LET COUNT=%EVAL(&COUNT+1);        

%MEND DOIT;

%DOIT;

I hope this helps. If you can provide more insight into what you are trying to accomplish within the rows, I can be more specific.

Related Question