Creating an agegroup variable in SAS
I need help creating this age group variable. In my data, age is measured to 9 decimal places. I can choose the categories myself; I just picked the quartiles. But I keep getting these errors:
"ERROR 388-185: Expecting an arithmetic operator.
ERROR 200-322: The symbol is not recognized and will be ignored."
I have tried rounding and changing the le to <=, but it still gives the same error... :(
data sta310.hw4;
set sta310.gbcshort;
age_cat=.;
if age le 41.950498302 then age_cat = 1;
if age > 41.950498302 and le 49.764538386 then age_cat=2;
if age > 49.764538386 and le 56.696966378 then age_cat=3;
if age > 56.696966378 then age_cat=4;
run;
sas
asked Nov 20 '18 at 14:37
Anne Peters
4 Answers
These things are better done using PROC FORMAT. You are missing your variable name after the "and", before the comparison operator. Also, you do not need age_cat = . at the beginning. Add the age variable between "and" and the comparison operator, as shown below:
data sta310.hw4;
set sta310.gbcshort;
age_cat=.;
if age le 41.950498302 then age_cat = 1;
if age > 41.950498302 and age le 49.764538386 then age_cat=2;
if age > 49.764538386 and age le 56.696966378 then age_cat=3;
if age > 56.696966378 then age_cat=4;
run;
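The PROC FORMAT approach mentioned at the top of this answer is not spelled out in the thread. A minimal sketch of what it could look like follows, assuming a hypothetical format name agegrp and a small hypothetical test data set work.ages standing in for sta310.gbcshort; the cutpoints are the quartiles from the question.

```sas
/* Hypothetical test data standing in for sta310.gbcshort */
data work.ages;
  input age;
datalines;
35.5
45.25
52.75
60
;
run;

/* agegrp is an assumed name; "<-" excludes the lower endpoint of a range */
proc format;
  value agegrp
    low          -  41.950498302 = '1'
    41.950498302 <- 49.764538386 = '2'
    49.764538386 <- 56.696966378 = '3'
    56.696966378 <- high         = '4';
run;

data work.ages_cat;
  set work.ages;
  /* PUT applies the format and returns text; INPUT converts it to a number */
  if not missing(age) then age_cat = input(put(age, agegrp.), 8.);
run;

proc print data=work.ages_cat;
run;
```

One design point: in SAS a missing numeric value compares lower than every number, so a plain test such as age <= 41.950498302 is also true when age is missing. The guard in the sketch leaves age_cat missing in that case.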
answered Nov 20 '18 at 15:03
Kiran
The "and le" or "and <=" syntax is incorrect. Such a syntax might be something out of COBOL.
Try this form of a SAS expression:
value < variable <= value
Example:
data sta310.hw4;
set sta310.gbcshort;
age_cat=.;
if age <= 41.950498302 then age_cat = 1;
if 41.950498302 < age <= 49.764538386 then age_cat=2;
if 49.764538386 < age <= 56.696966378 then age_cat=3;
if 56.696966378 < age then age_cat=4;
run;
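To see that the chained form is just shorthand for two comparisons joined by AND, a small check can be run on made-up values. This is only a sketch: the test ages and the variable names chained and explicit are assumptions for illustration, not part of the original program.

```sas
/* Compare the chained form with the explicit AND form on made-up ages */
data _null_;
  input age;
  /* A comparison evaluates to 1 (true) or 0 (false) in SAS */
  chained  = (41.950498302 < age <= 49.764538386);
  explicit = (age > 41.950498302 and age <= 49.764538386);
  put age= chained= explicit=;
datalines;
40
45
50
;
run;
```

The two flags written to the log should agree on every row, since SAS evaluates the chained comparison as both conditions joined by AND.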
A similar and safer sieve of logic can be accomplished using a select statement:
select;
when (age <= 41.950498302) age_cat=1;
when (age <= 49.764538386) age_cat=2;
when (age <= 56.696966378) age_cat=3;
otherwise age_cat=4;
end;
The SAS select statement differs from the C switch statement in that, once a when condition is true, control flows past the select (no break is required, as is often seen in switch/case).
edited Nov 20 '18 at 15:19
answered Nov 20 '18 at 15:03
Richard
The problem was in your IF statements with multiple conditions. Also, because age_cat is not really a numeric variable (i.e., you would not want to sum it), I would make it a character variable of length 1, specifying it in a FORMAT statement up front (best practice in SAS data management).
Finally, I would also suggest reformulating your IF/ELSE construct to make it more efficient:
data sta310.hw4;
set sta310.gbcshort;
format age_cat $1.;
if age <= 41.950498302 then age_cat = "1";
else if 41.950498302 < age <= 49.764538386 then age_cat= "2";
else if 49.764538386 < age <= 56.696966378 then age_cat="3";
else age_cat="4";
run;
Hope this helps.
answered Nov 20 '18 at 15:24
Daniel Vieira
Why are you attaching the $1. format to the new variable? SAS already knows how to print character variables. To define the variable's type and length before using it in other statements, such as FORMAT or assignment statements, use a LENGTH or ATTRIB statement.
– Tom, Nov 20 '18 at 15:28
Yes, the purpose is to lock in the length. The FORMAT statement is more memory efficient than the LENGTH statement from a PDV construction perspective.
– Daniel Vieira, Nov 20 '18 at 15:30
Whose memory? Once you have had an analysis use the wrong number of groups because somehow too short a format was attached to a character variable, you will remember for a long time that attaching $xx formats is a dangerous thing.
– Tom, Nov 20 '18 at 15:39
I mean the total memory usage when building the Program Data Vector, or PDV for short. A LENGTH statement could also have been used, of course. The situation you describe could happen, but it is not relevant to the question at hand, so I do not understand your callout. The question concerns a simple class variable, which I am recommending be made a character variable because it would occupy only 1 byte versus the standard 8 bytes of a normal numeric variable. Both LENGTH and FORMAT statements are correct choices for this particular problem.
– Daniel Vieira, Nov 20 '18 at 15:46
A pet peeve of mine, because of the danger and the confusion it causes for novice SAS programmers who see that usage and think that the FORMAT statement is actually a way of defining the variable's length. Instead, the length is being set as a side effect of the variable's first appearance being in the FORMAT statement.
– Tom, Nov 20 '18 at 16:02
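The LENGTH alternative raised in the comments could be sketched as follows. This is an illustration rather than either commenter's code: work.ages is hypothetical test data standing in for sta310.gbcshort, and the only intended change from the answer is swapping the FORMAT statement for a LENGTH statement.

```sas
/* Hypothetical test data standing in for sta310.gbcshort */
data work.ages;
  input age;
datalines;
35.5
45.25
52.75
60
;
run;

data work.ages_cat;
  set work.ages;
  /* LENGTH declares age_cat as character, length 1, without attaching a format */
  length age_cat $1;
  if age <= 41.950498302 then age_cat = "1";
  else if 41.950498302 < age <= 49.764538386 then age_cat = "2";
  else if 49.764538386 < age <= 56.696966378 then age_cat = "3";
  else age_cat = "4";
run;

/* Inspect the stored type, length, and any attached format */
proc contents data=work.ages_cat;
run;
```

PROC CONTENTS is included so the variable's attributes can be checked directly; with LENGTH, age_cat should be listed as a character variable of length 1 with no format attached.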
If you're grouping by quartiles, avoid the hard-coding and use PROC RANK with GROUPS=4. The groups will be numbered 0 to 3, but it's the same idea.
proc rank data=sta310.gbcshort out=sta310.hw4 groups=4;
var age;
ranks age_cat;
run;
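If the categories should run from 1 to 4 as in the question, the 0 to 3 groups can be shifted in a follow-up step. The sketch below assumes hypothetical test data work.ages in place of sta310.gbcshort and a hypothetical intermediate variable age_grp0. Note that PROC RANK forms its groups from the ranks of the observations, so the boundaries may not match the hard-coded quartile values exactly.

```sas
/* Hypothetical test data standing in for sta310.gbcshort */
data work.ages;
  input age;
datalines;
35.5
45.25
52.75
60
;
run;

/* GROUPS=4 assigns group numbers 0 through 3 */
proc rank data=work.ages out=work.ranked groups=4;
  var age;
  ranks age_grp0;
run;

data work.ages_cat;
  set work.ranked;
  /* Shift 0..3 to 1..4; a missing rank is left as a missing category */
  if not missing(age_grp0) then age_cat = age_grp0 + 1;
run;

proc print data=work.ages_cat;
run;
```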
In your current program, this line/logic is your issue:
if age > 41.950498302 and le 49.764538386 then age_cat=2;
It should be:
if 41.950498302 < age <= 49.764538386 then age_cat=2;
You should also switch those to IF/ELSE IF rather than separate IF statements. Once SAS finds the matching category, it stops evaluating the remaining conditions instead of checking every IF, which makes the step slightly faster. This isn't something you'll notice in your homework, but if you ever work on larger data sets it is really important to know.
if age <= 41.950498302 then age_cat = 1;
else if 41.950498302 < age <= 49.764538386 then age_cat=2;
else if 49.764538386 < age <= 56.696966378 then age_cat=3;
else if 56.696966378 < age then age_cat=4;
answered Nov 20 '18 at 15:30
Reeza
Once you add the ELSE you can simplify the conditions: else if age <= 49.764538386
– Tom, Nov 20 '18 at 15:36
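Applied to the whole chain, the simplification from the comment might look like the sketch below. The test data set work.ages is hypothetical, and the leading missing(age) branch is an addition that does not appear in the thread: it is there because a missing numeric value compares lower than any number in SAS and would otherwise land in category 1.

```sas
/* Hypothetical test data standing in for sta310.gbcshort; "." is a missing age */
data work.ages;
  input age;
datalines;
35.5
45.25
52.75
60
.
;
run;

data work.ages_cat;
  set work.ages;
  /* Added guard (not in the thread): keep missing ages uncategorized */
  if missing(age) then age_cat = .;
  /* Each ELSE is reached only if the earlier tests failed, so no lower bound is needed */
  else if age <= 41.950498302 then age_cat = 1;
  else if age <= 49.764538386 then age_cat = 2;
  else if age <= 56.696966378 then age_cat = 3;
  else age_cat = 4;
run;

proc print data=work.ages_cat;
run;
```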