Avoid data truncation in SAS when importing CSV files

SAS’s Proc Import is great for importing CSV and other delimited files: things just “work” most of the time, and we don’t need to specify variable names, variable types, etc. However, data truncation or mismatched variable types can happen, because the procedure determines the data type and length of each variable from only the first few rows of the delimited file (20 rows by default).

As this post suggests, one could use the guessingrows=32767; statement in Proc Import so that SAS scans the first 32,767 rows to determine data type and length. Longer values that first appear after those rows can still be truncated.
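A minimal sketch of that approach is below. The file path and the output dataset name (work.my_data) are placeholders, not from the original post, and the example assumes a comma-delimited file with a header row.

```sas
/* Hypothetical path and dataset name; adjust to your own file. */
proc import datafile="/path/to/my_file.csv"
    out=work.my_data
    dbms=csv
    replace;
    getnames=yes;        /* take variable names from the first row */
    guessingrows=32767;  /* scan up to 32,767 rows before fixing type and length */
run;
```

Scanning more rows makes the import slower on large files, which is the trade-off for a better guess.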

Alternatively, a safer solution is to import the delimited file with a Data step and declare each character variable in a length statement with a generous length so that no truncation occurs (e.g., length my_var $100). The data type of each variable also needs to be specified in the input statement. Note: do not specify the variable length in the input statement (e.g., as an informat such as $100.), because SAS may then read characters that belong to other fields: it starts reading at the last delimiter and continues for the full specified width instead of stopping at the next delimiter.
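The Data step approach might look like the sketch below. The file path, dataset name, and the variables id and other_var are hypothetical, and the example assumes a comma-delimited file whose first row is a header. Only my_var and its $100 length come from the example above.

```sas
/* Hypothetical path, dataset, and variable names; adjust to your own file. */
data work.my_data;
    /* dsd: treat consecutive delimiters as missing values and honor quoted fields */
    /* firstobs=2: skip the header row; truncover: do not read past a short line   */
    infile "/path/to/my_file.csv" dsd dlm="," firstobs=2 truncover lrecl=32767;

    /* Declare lengths up front so character values are not truncated */
    length id 8 my_var $100 other_var $100;

    /* List input: $ marks character variables, with no width given here */
    input id my_var $ other_var $;
run;
```

Because the lengths are set in the length statement, the input statement only has to name each variable and its type, so each field ends at the next delimiter.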

About Vinh Nguyen

Statistician
