In SAS, formats and informats are pre-defined patterns for interpreting or displaying data values. In this tutorial, we'll focus on SAS's built-in formats, which mostly cover numeric, date, and character variables.

To understand the need for informats and formats, let's start with a simple example. Suppose I tell you that a person's birthday is on 12-01-99. How do you know if I mean "December 1, 1999" or "January 12, 1999"? If you're in the United States, you probably write your dates using MM-DD-YY order, and would therefore interpret the date as "December 1, 1999". But if you're in Europe or Canada, you probably write your dates using DD-MM-YY order, and would therefore interpret the date as "12 January 1999".

When reading data, SAS must make the same judgement to interpret the true meaning of the values. Informats are the way we give SAS an explicit rule to follow so that it makes the right judgements.

Formats, on the other hand, allow SAS to change the display value "after the fact": once SAS knows that 12-01-99 should be interpreted as MM-DD-YY, it knows that the date could also be displayed as "December 1, 1999" or "1 December 1999".

So informats and formats are a shared set of common patterns for reading and writing data values - the only difference is whether we apply them at the "interpretation" stage (informats) or at the "display" stage (formats).

Built-In Formats & Informats

Formats and informats in SAS use a common set of pattern codes. There are three main types of built-in informats in SAS: character, numeric, and date. Generically, the informat/format codes follow set patterns. For example, the numeric informat w.d reads in numeric data of length w with d decimal points.

In these codes, w denotes the width of the variable, and d denotes the number of decimal places. Notice that the format/informat names contain a period. This helps SAS recognize that it is an informat name rather than a variable name. SAS will not recognize the informat name without the dot.

In addition to the above generic informats, there are also many specific display formats. Here's a small selection of built-in SAS formats that can change the display of numeric variables:

- Comma formatting for large numbers (COMMAw.d format).
- Formatting for dollar amounts (DOLLARw.d format).
- Formatting for percentages (PERCENTw.d format).
- Writing numbers as roman numerals (ROMANw. format).

This may seem like a small matter, but it's incredibly powerful: it allows you to have variables in your dataset that function as numbers (i.e., you can add, subtract, multiply, and divide them), but arbitrarily change the formatted display of those numbers without sacrificing the "numeric-ness" of the variable.

For a full list of built-in formats, see the SAS documentation.

In addition to the built-in formats, it's possible to define your own formats in SAS. This is particularly useful when you have numerically coded categorical variables; for example, a variable representing a multiple-choice question. For more on defining your own formats, check out the User-Defined Formats tutorial.

Regardless of the informat, date values in SAS are stored as the number of days since January 1, 1960. This means that stored date values can be negative (if the date is before January 1, 1960) or positive (if the date is after January 1, 1960). For example, the date June 30, 1999 will be stored in SAS as the number 14425 because June 30, 1999 was 14,425 days after January 1, 1960.

If you supply an informat for a date variable but not a format, SAS will default to displaying the number of days before/since January 1, 1960. This display method makes perfect sense for doing date arithmetic, but is inconvenient for human readers. In order to view date variables "normally", you must apply a date format to the variable.

Because informats define how variables should be "read" or "interpreted", their use is generally limited to inside the data step. Specifically, they are relevant if you will be reading data from a file using an INFILE statement, or manually creating cases using the DATALINES statement. In both of these cases, we can include our informats as part of the INPUT statement, which spells out the name and order of the variables in the dataset being created. Its general syntax is:

    DATA dataset-name;
        INPUT variable-name-1 VARIABLE-1-INFORMAT
              variable-name-2 VARIABLE-2-INFORMAT;

In the first line, we declare a new dataset with the name dataset-name. On the second line, the INPUT statement tells SAS the names and order of the variables in the dataset, with each variable's informat written immediately after its name.
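To make the difference between informats and formats concrete, here is a minimal sketch of a data step that reads values with informats and then prints them with and without formats. The dataset name (sample), the variable names, and the data values are all hypothetical and chosen only for illustration. The sketch also assumes modified list input (the colon before each informat), so that values separated by blanks are read correctly, and it assumes the default YEARCUTOFF setting, so that the two-digit year 99 is read as 1999.

```sas
/* Hypothetical example: dataset name, variable names, and values are made up. */
DATA sample;
    /* The informat after each variable tells SAS how to READ the raw value.  */
    /* The colon requests modified list input (blank-separated values).       */
    INPUT name :$10. birthdate :MMDDYY8. salary :COMMA10.;
    DATALINES;
Alice 12-01-99 52,000
Bob 06-30-99 48,500
;
RUN;

/* No formats applied: birthdate prints as days since January 1, 1960 */
/* (06-30-99 appears as 14425) and salary prints as a plain number.   */
PROC PRINT DATA=sample;
RUN;

/* Formats change only the DISPLAY; the stored values stay numeric.   */
PROC PRINT DATA=sample;
    FORMAT birthdate DATE9. salary DOLLAR10.2;
RUN;
```

Because the FORMAT statement sits inside the second PROC PRINT, it affects only that printout. Placing a FORMAT statement inside the data step instead would attach the formats to the variables in the dataset itself.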